
Data synchronization

All mobile devices—handheld computers, mobile phones, pagers, and laptops—need to synchronize their data with the server where the information is stored. This ability to access and update information on the fly is the key to the pervasive nature of mobile computing. Yet, today almost every device uses a different technology for data synchronization.

Data synchronization is helpful for:

  • Propagating updates between a growing number of applications
  • Overcoming the limitations of mobile devices and wireless connections
  • Maximizing the user experience by minimizing data access latency
  • Keeping the infrastructure scalable as the number of devices (clients) and connections grows considerably
  • Meeting the requirements of mobile applications by providing a user experience that helps, rather than hinders, mobile tasks

Data synchronization is the process of making two sets of data look identical, as shown in the following figure:

[Figure: Funambol Mobile Open Source]

This involves many techniques, which will be discussed in the following sections. The most important are:

  • ID handling
  • Change detection
  • Modification exchange
  • Conflict detection
  • Conflict resolution
  • Full (slow) and fast synchronization

ID handling

At first glance, ID handling seems like a pretty straightforward process that requires little or no attention. However, ID handling is an important aspect of the synchronization process and is not trivial.

In some cases, a piece of data is identifiable by a subset of its content fields. For example, in the case of a contact entry, the concatenation of the first name and last name uniquely identifies an entry in the directory. In other cases, the ID is a field introduced specifically for that purpose. For example, in a Sales Force Automation mobile application, an order is identified by an order number or ID. The way in which an item ID is generated is not predetermined, and it may be application- or even device-specific.

In an enterprise system, data is stored in a centralized database that is shared by many users. The system recognizes each item by a unique global ID. In some cases, two sets of data (the order on a client and the order on a server) represent the same information (the order made by the customer) but carry different IDs. How can client and server IDs be reconciled so that the information stays consistent? Several approaches are possible:

  • Client and server agree on an ID scheme (a convention on how to generate IDs must be defined and used).
  • Each client generates globally unique IDs (GUIDs) and the server accepts client-generated IDs.
  • The server generates GUIDs and each client accepts those IDs.
  • Client and server generate their own IDs and a mapping is kept between the two. Client-side IDs are called Locally Unique Identifiers (LUIDs) and server-side IDs are called Globally Unique Identifiers (GUIDs). The mapping between local and global identifiers is referred to as LUID-GUID mapping. The SyncML specifications prescribe the use of the LUID-GUID mapping technique, which allows maximum freedom to client implementations.
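
The LUID-GUID mapping approach can be sketched as a small two-way lookup table kept per client. This is only an illustration in Python, not Funambol's implementation: the class name `LuidGuidMap`, its methods, and the sample IDs are hypothetical.

```python
import uuid


class LuidGuidMap:
    """Per-client table linking client-side LUIDs to server-side GUIDs.

    Hypothetical helper for illustration; not part of any Funambol API.
    """

    def __init__(self):
        self._luid_to_guid = {}
        self._guid_to_luid = {}

    def register(self, luid, guid=None):
        """Map a LUID to a GUID, generating a new GUID if none is given."""
        if luid in self._luid_to_guid:
            return self._luid_to_guid[luid]
        guid = guid or str(uuid.uuid4())
        self._luid_to_guid[luid] = guid
        self._guid_to_luid[guid] = luid
        return guid

    def guid_for(self, luid):
        return self._luid_to_guid.get(luid)

    def luid_for(self, guid):
        return self._guid_to_luid.get(guid)


# Two clients use unrelated local IDs for the same server item.
phone = LuidGuidMap()
laptop = LuidGuidMap()
guid = phone.register("7")              # item first created on the phone
laptop.register("contact-0042", guid)   # the laptop later receives the same item
```

Because each client keeps its own table, the phone and the laptop are free to number their items however they like, while the server still treats both entries as one item.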

Change detection

Change detection is the process of identifying the data that was modified after a particular point in time, that is, since the last synchronization. This is usually achieved by using additional information such as timestamps and state information. For example, a possible database enabled for efficient change detection is shown in the following table:

| ID | First name | Last name | Telephone | State | Last_update |
| --- | --- | --- | --- | --- | --- |
| 12 | John | Doe | +1 650 5050403 | N | 2008-04-02 13:22 |
| 13 | Mike | Smith | +1 469 4322045 | D | 2008-04-01 17:32 |
| 14 | Vincent | Brown | +1 329 2662203 | U | 2008-03-21 17:29 |

However, legacy databases sometimes do not provide the information needed for efficient change detection. In that case the matter becomes more complicated and alternative methods must be adopted (based on content comparison, for instance). This is one of the most important aspects to consider when writing a Funambol extension, because the synchronization engine needs to know what has changed since a given point in time.
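
When state and timestamp columns like those in the table above are available, change detection reduces to a filter on the last update time. The following Python sketch assumes that the state codes N, D, and U stand for New, Deleted, and Updated; the function name `changes_since` and the chosen last synchronization time are hypothetical.

```python
from datetime import datetime

# Rows mirror the table above; the state codes N/D/U are assumed to mean
# New, Deleted and Updated.
contacts = [
    {"id": 12, "first": "John", "last": "Doe", "state": "N",
     "last_update": datetime(2008, 4, 2, 13, 22)},
    {"id": 13, "first": "Mike", "last": "Smith", "state": "D",
     "last_update": datetime(2008, 4, 1, 17, 32)},
    {"id": 14, "first": "Vincent", "last": "Brown", "state": "U",
     "last_update": datetime(2008, 3, 21, 17, 29)},
]


def changes_since(rows, last_sync):
    """Group the IDs of rows modified after last_sync by state code."""
    changed = {"N": [], "U": [], "D": []}
    for row in rows:
        if row["last_update"] > last_sync:
            changed[row["state"]].append(row["id"])
    return changed


# Hypothetical time of the last synchronization.
last_sync = datetime(2008, 4, 1, 0, 0)
result = changes_since(contacts, last_sync)
```

With this last synchronization time, items 12 and 13 are reported as changed, while item 14 was last updated before the previous synchronization and is skipped.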

Modification exchange

A key component of a data synchronization infrastructure is the way modifications are exchanged between client and server. This involves the definition of a synchronization protocol that client and server use to initiate and execute a synchronization session. In addition to the modification exchange method, a synchronization protocol must also define a set of supported modification commands. The minimal set of modification commands is as follows:

  • Add
  • Replace
  • Delete
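
As a rough illustration of how the receiving side might apply these three commands, the following Python sketch keeps items in a dictionary keyed by ID. The function name `apply_commands`, the tuple layout, and the sample data are assumptions made for this example; a real protocol such as SyncML wraps the commands in its own message format.

```python
def apply_commands(store, commands):
    """Apply (command, item_id, data) tuples to a dict-based store in place."""
    for command, item_id, data in commands:
        if command == "Add":
            if item_id in store:
                raise KeyError(f"item {item_id} already exists")
            store[item_id] = data
        elif command == "Replace":
            if item_id not in store:
                raise KeyError(f"item {item_id} does not exist")
            store[item_id] = data
        elif command == "Delete":
            store.pop(item_id, None)  # deleting a missing item is a no-op here
        else:
            raise ValueError(f"unsupported command: {command}")
    return store


# Hypothetical sample data.
server = {12: {"name": "John Doe", "phone": "+1 650 5050403"}}
apply_commands(server, [
    ("Add", 15, {"name": "Ann Lee", "phone": "+1 555 0100"}),
    ("Replace", 12, {"name": "John Doe", "phone": "+1 650 5050404"}),
    ("Delete", 15, None),
])
```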

Conflict detection

Let’s assume that two users synchronize their local contacts database with a central server in the morning before going to the office. After synchronization, the contacts on their smartphones are exactly the same. Let’s now assume that they both update the telephone number for the “John Doe” entry, and one of them makes a mistake and enters a different number. What will happen the next morning when they both synchronize again? Which of the two new versions of the “John Doe” record should be taken and stored on the server? This condition is called a conflict, and the server has the duty of identifying and resolving it.

Funambol detects a conflict by means of the synchronization matrix shown in the following table:

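| ↓ Database B \ Database A → | New | Deleted | Updated | Synchronized/Unchanged | Not Existing |
| --- | --- | --- | --- | --- | --- |
| New | C | C | C | C | B |
| Deleted | C | X | C | D | X |
| Updated | C | C | C | B | B |
| Synchronized/Unchanged | C | D | A | = | B |
| Not Existing | A | X | A | A | X |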

As both users synchronize with the central database, we can consider what happens between the server database and one client database at a time. Let’s call the client database Database A and the server database Database B. The symbols in the synchronization matrix have the following meaning:

  • X: Do nothing
  • A: Item A replaces item B
  • B: Item B replaces item A
  • C: Conflict
  • D: Delete the item from the source(s) containing it
  • =: The items are already identical, so nothing needs to be done
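
The matrix lends itself to a direct table lookup. The Python sketch below transcribes the cells from the table above; the names `STATES`, `MATRIX`, and `sync_action` are hypothetical, "Synchronized" is used as shorthand for "Synchronized/Unchanged", and the "=" cell is taken to mean that the items are already identical.

```python
STATES = ["New", "Deleted", "Updated", "Synchronized", "Not Existing"]

# Rows are the state of the item in Database B (server); each list holds the
# result for the state in Database A (client), in the order given by STATES.
MATRIX = {
    "New":          ["C", "C", "C", "C", "B"],
    "Deleted":      ["C", "X", "C", "D", "X"],
    "Updated":      ["C", "C", "C", "B", "B"],
    "Synchronized": ["C", "D", "A", "=", "B"],
    "Not Existing": ["A", "X", "A", "A", "X"],
}


def sync_action(state_a, state_b):
    """Return the matrix symbol for an item given its state in A and in B."""
    return MATRIX[state_b][STATES.index(state_a)]


# The "John Doe" scenario: the first user's update meets an unchanged server
# item; the second user's update meets an item already updated on the server.
first_user = sync_action("Updated", "Synchronized")
second_user = sync_action("Updated", "Updated")
```

In the "John Doe" scenario, the first user's change is simply accepted (A replaces B), while the second user's change collides with an item that was updated on the server since that user's last synchronization, which yields a conflict (C).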

Conflict resolution

Once a conflict arises and is detected, proper action must be taken. Different policies can be applied. Let’s see some of them:

  • User decides: The user is notified of the conflict condition and decides what to do.
  • Client wins: The server silently replaces conflicting items with the ones sent by the client.
  • Server wins: The client has to replace conflicting items with the ones from the server.
  • Timestamp based: The last modified (in time) item wins.
  • Last/first in wins: The last/first arrived item wins.
  • Merge: Try to merge the changes, at least when there is no direct conflict. Consider the case of a vCard to which two concurrent modifications have been applied on two different fields. There is a conflict at the card level, but the two changes can be merged so that both clients end up with a valid version of the card. This is a typical case in which the changes do not directly conflict.
  • Do not resolve.
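
The simpler policies in this list amount to choosing one of the two versions. A minimal Python sketch, assuming each item carries a `modified` timestamp (the field name, the policy strings, and the sample values are hypothetical):

```python
from datetime import datetime


def resolve(client_item, server_item, policy):
    """Pick the winning version of a conflicting item under a simple policy."""
    if policy == "client_wins":
        return client_item
    if policy == "server_wins":
        return server_item
    if policy == "timestamp":
        # The most recently modified item wins; ties go to the server here.
        if client_item["modified"] > server_item["modified"]:
            return client_item
        return server_item
    raise ValueError(f"unknown policy: {policy}")


client = {"phone": "+1 650 5050404", "modified": datetime(2008, 4, 3, 9, 0)}
server = {"phone": "+1 650 5050405", "modified": datetime(2008, 4, 3, 8, 30)}
winner = resolve(client, server, "timestamp")
```

Here the client version wins under the timestamp policy because it was modified later. How ties are broken is a design choice that a real implementation would need to state explicitly.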

Note that Funambol adopts a special merging policy that guarantees that the user does not lose data. The server always tries to merge when possible. When a conflict cannot be resolved by merging (for example, when there are conflicting changes to the same field), the value from the most recent synchronization wins over values from earlier synchronizations, which matches the expectation of the user who is synchronizing. When the users who made the earlier changes receive the new updates, all devices will be in sync.
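
A policy of this kind can be approximated with a field-level three-way merge. The Python sketch below is an illustration under stated assumptions, not Funambol's actual code: it assumes the last commonly synchronized version (`base`) is available, it ignores field removals, and the function name `merge` and the sample contact are hypothetical.

```python
def merge(base, earlier, later):
    """Field-level three-way merge of two versions derived from base.

    `earlier` was synchronized first and `later` last. Changes to different
    fields are combined; when both changed the same field, the value from
    `later` wins. Fields removed by either side are treated as unchanged.
    """
    merged = dict(base)
    for field in set(base) | set(earlier) | set(later):
        old = base.get(field)
        first = earlier.get(field, old)
        last = later.get(field, old)
        if last != old:
            merged[field] = last     # changed in the latest sync: it wins
        elif first != old:
            merged[field] = first    # changed only in the earlier sync
    return merged


base = {"name": "John Doe", "phone": "+1 650 5050403", "email": "jd@example.com"}
user1 = dict(base, phone="+1 650 5050404", email="john@example.com")  # synced first
user2 = dict(base, phone="+1 650 5050405")                            # synced last
merged = merge(base, user1, user2)
```

The e-mail change made by the first user survives because the second user did not touch that field, while the telephone number, changed by both, takes the value from the later synchronization.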

Synchronization modes: Full or fast

The synchronization process can be carried out in several modes. The main distinction is between fast and full synchronization. Fast synchronization involves only the items changed since the last synchronization between two devices. This is an optimized process that relies on the devices having been fully synchronized at some point in the past, so that the state at the beginning of the sync operation is well known and sound. When this condition is not met (for instance, the mobile device has been reset and lost the timestamp of the last synchronization), a full synchronization must be performed. In a full synchronization, the client sends its entire database to the server, which compares it with its local database and returns the modifications that the client must apply to be up to date again.
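
The choice between the two modes, and the comparison done in a full synchronization, can be sketched as follows. This Python example makes several simplifying assumptions: both sides record the time of the last synchronization, databases are dictionaries keyed by a shared ID, and when both sides hold different content for the same ID the server copy is kept. The names `choose_mode` and `full_sync` are hypothetical.

```python
def choose_mode(client_last_sync, server_last_sync):
    """Fast sync is only safe when both sides agree on the last sync point."""
    if client_last_sync is not None and client_last_sync == server_last_sync:
        return "fast"
    return "full"


def full_sync(client_db, server_db):
    """Compare the client's whole database with the server's copy.

    Returns the commands the client must apply. Items that exist only on
    the client are added to server_db. When both sides hold different
    content for the same ID, the server copy is kept (an assumption made
    here to keep the sketch short).
    """
    for item_id, data in client_db.items():
        if item_id not in server_db:
            server_db[item_id] = data
    client_commands = []
    for item_id, data in server_db.items():
        if item_id not in client_db:
            client_commands.append(("Add", item_id, data))
        elif client_db[item_id] != data:
            client_commands.append(("Replace", item_id, data))
    return client_commands


client_db = {1: "Alice", 2: "Bob (old)"}
server_db = {2: "Bob", 3: "Carol"}
mode = choose_mode(None, "2008-04-02 13:22")   # client lost its timestamp
commands = full_sync(client_db, server_db)
```

A full synchronization costs a transfer of the whole client database, which is why it is reserved for cases where the last synchronization point cannot be trusted.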

Both fast and full synchronization modes can be performed in one of the following manners:

  • Client-to-server: The server updates its database with client modifications, but sends no server-side modifications
  • Server-to-client: The client updates its database with server modifications, but sends no client-side modifications
  • Two-way: The client and server exchange their modifications and both databases are updated accordingly
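
The three directions differ only in which side's modifications are transmitted. A small Python sketch (the `Direction` enum and the function name are hypothetical):

```python
from enum import Enum


class Direction(Enum):
    CLIENT_TO_SERVER = "client-to-server"
    SERVER_TO_CLIENT = "server-to-client"
    TWO_WAY = "two-way"


def modifications_to_exchange(direction, client_changes, server_changes):
    """Return (sent_to_server, sent_to_client) for the given direction."""
    if direction is Direction.CLIENT_TO_SERVER:
        return client_changes, []
    if direction is Direction.SERVER_TO_CLIENT:
        return [], server_changes
    return client_changes, server_changes
```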

Extending Funambol

The Funambol platform can be extended in many areas to integrate Funambol with existing systems and environments. The most common integration use cases and the Funambol modules involved are:

  • Officer: Integrating with an external authentication and authorization service
  • SyncSource: Integrating with an external datasource to address client-specific issues
  • Synclet: Adding pre- or postprocessing to a SyncML message
  • Admin WS: Integrating with an external management tool

These are illustrated in the following diagram:

[Figure: Funambol Mobile Open Source]

Funambol extensions are distributed and deployed as Funambol modules. This section describes the structure of a Funambol module, while the following sections describe each of the scenarios listed above.

A Funambol module represents the means by which developers can extend the Funambol server. A module is a packaged set of files containing Java classes, installation scripts, configuration files, initialization SQL scripts, components, and so on, used by the installation procedure to embed extensions into the server core.

For more information on how to install Funambol modules, see the Funambol Installation and Administration Guide.
