Data Management

12 min read

The basics of data operations

When we migrate an application on there is a very important process of migrating the data from the existing legacy system or spread sheets and loading it into the cloud application. To improve the data quality we also need to ensure that there are no duplicate entries and the data is clean.

Some basic data management operations are:

  • Data export:
    • Data can be exported for making periodic backups or downloading the entire data
    • It can also be used to get the template to insert data
  • Inserting data:
    • We can insert data into existing standard and custom objects
    • Also helpful in migrating users
  • Updating data:
    • Updating is needed to run a de-dupe check to remove duplicates
    • It is also used for enhancing or cleaning the data
  • Deleting data:
      The deleting operation is used to free up legacy data and to remove erroneous data

We will be seeing the above operations using the data loader, using a wizard as well as using the command line interface.

The importance of record IDs

Why do we have a separate section on record IDs? Once we master them, most of the data operations are a piece of cake. The record ID is a unique identifier of the record. The record ID is similar to the primary ID and foreign ID in a database table.

The record IDs are not just a bunch of random numbers, but have valuable information that can help us.

18-character and 15-character IDs uses two types of IDs, 18 character and 15 character. When we use the ID from the URL it is 15 characters, however, when we use the data loader we get 18 character IDs.

The 15-character ID is case sensitive while the 18 character ID is not case sensitive. The 18-character ID is used to migrate data from legacy systems or spread sheets, which do not recognize case-sensitive IDs. Both the IDs work with and point to the same record. However, if we are to migrate data from the spread sheets or export data to the spread sheets we have to use 18-character IDs.

The 15-character ID is case sensitive while the 18 character ID is not case sensitive. The 18-character ID is used to migrate data from legacy systems or spread sheets, which do not recognize case-sensitive IDs. Both the IDs work with and point to the same record. However, if we are to migrate data from the spread sheets or export data to the spread sheets we have to use 18-character IDs.

  1. URL of the record detail page.
  2. By running a report on the object.
  3. Data loader (API access).

URL and report will always return the 15-characters IDs while the data loader and the API return the 18-digit IDs. API, however, accepts both 15-character and 18-character IDs.

Let us now explore what we can gain from IDs in the URL.

Exploring the URL format

The easiest way to study the record ID is by understanding the standard Salesforce objects. These IDs are fixed and are of similar format across the Salesforce organization. Every custom object will have a fixed pattern, but will be unique to individual organizations only.

For the purpose of an example we will study the record ID of opportunity object. Open an opportunity record as shown in the following screenshot:


If you do not find it on display, please select the Standard Sales Application using the drop-down in the corner.

The record ID is seen in the address bar of the browser as shown in the following screenshot:

Let’s observe the format of the URL of the address bar carefully, the URL is made up of different things as shown in the following screenshot:

Let us now look at the different parts of the URL separated in the preceding screenshot.

  • Instance name: The first part of the URL is the instance name. Salesforce. com provides multiple instances of the server and the instance name varies depending on the time and location from where you create the ID. For the full list of instance name and their status please visit:
    Packt Publishing
  • Salesforce Server URL: This is common for all the record IDs.
  • Record ID: The third part of the URL as seen in the preceding screenshot is the Record ID. The format of the URL is common for all the detail page of the records across the organizations. We will be looking at the Record ID in greater detail in the next section.

Record ID

One important thing to note is that the record ID is case sensitive. So 00690000003zRfq is not equal to 00690000003ZRFQ. There will be a unique ID per organization, no two records will have same ID. When we migrate data from one organization to another, the IDs are changed. The record ID is further split into two parts as shown in the following screenshot:

The first three characters of the record ID are the object identifiers, which help us to identify the object of the record.

The remaining characters are the unique record ID to identify the record. All the objects thus have three-character encoded ID, which is a prefix to their record ID elsewhere.

The following table shows us the three character prefixes and the objects that they are associated with:


Object Name



























These prefixes for the standard objects are the same across all the Salesforce organization. Now let’s play with the URL a bit. If we wish to land on the list view of the standard object, all we need to do is append a /o against the object prefix example for the account object we need to add 001/o.

The final Salesforce URL in this case would be:

The instance name for every organization would be different. The name depends on the geography and the time when the Salesforce account is created, please check the instance name for your organization.

This will open the list view or the tab view of the object, similarly we can change the prefix to any standard object.

When we change the /o to /e it automatically opens the edit page for a new account:

This will open the edit page for the account directly. If we add the /e in front of a record ID, we get the update detail page for the record, for example, let’s say we want to open the opportunity edit page from our original example.

Now if we add the /e in front of the URL, we get:

This will automatically open this opportunity for editing; try adding a /e against any record URL and we will get the edit page for the record with all the existing data pre-filled with information.

Only when we create the full-copy sandbox of the production organization, do we get similar IDs in productions and sandbox only once.

Record ID in the field is not editable even in code. We cannot include it in DML for insert, but will be useful for DML of update and delete.

Summarizing record IDs

Let us recap some important points we learned about record IDs:

  • A record ID is similar to the primary or foreign key in a database
  • Record IDs are case sensitive on
  • 18-character IDs are case insensitive
  • Every record ID has a prefix of the first three characters
  • Every record in multiple and similar organizations have a different ID
  • IDs are similar only for the first time when we have a full-copy sandbox of a production organization
  • 18-character IDs are used for migrating data from a legacy system that is case insensitive

Relationships — dependents first

In the general library system we have the following relationships:

The object Customer is related to Media through a junction object CustomerMedia. Media object is a master for books and video objects as shown in the following diagram:

As shown in the following diagram, Fine has a Lookup on Cards:

Finally, as shown in the following diagram Card has a Master-Detail lookup on Customer:

When we load data we have to load in the order of dependencies, for example, in the library system the CustomerMedia object is dependent on the Customer and Media object.

Hence, to load the CustomerMedia object we use the following sequence:

  1. Load Customer object data.
  2. Load Media object data.
  3. Finally load the CustomerMedia object data.

The rule of thumb when uploading the data is that every dependent object data should be loaded first. As the CustomerMedia object had two dependencies, one on Customer object and another on Media object, we loaded both the data first.

Modifying system fields

When we are loading legacy data into the Salesforce system during migration it is sometimes essential to change the Created Date, Last Modified Date, Last Modified by, and Created by entries. Normally, when we load data into the system, the dates are not modifiable and are system time stamped.

However, provides us with the facility for modifying the auditing fields only once during the first insert. This is helpful in migrating legacy data and preserving history.

To enable this feature we need to contact Salesforce customer support.

Features of modifiable system fields

Some of the important features of modifiable system fields are as follows:

  • The fields can be modified only once in the lifetime of the initial insert
  • They are accessible through API, that is, the data loader
  • All custom objects can have modifiable system fields, however, not all standard objects can have modifiable fields
  • Account, opportunity, contact, lead, case, task, and event can have modifiable fields
  • The fields are read-only for existing records

Connecting to Salesforce server through API

When loading the data from a data loader, we make an API call for the Salesforce server. The login credentials are passed through the user name + password + security token.

We can bypass the use of a security token if we white list the IP address from where the data is loaded.

The CRUD commands

The full-form of CRUD is create, read, update, and delete. provides us with all the four operations using API. Records can be inserted, updated, deleted, and extracted from the server.

The inserting process requires data to be inserted without the ID because generates the record ID when the record is created. Data is uploaded in the form of a .CSV file created using a spread sheet.

We can map individual fields to, however, to save time we can initially extract a single record with the selected fields and use it as a template to quickly map other fields.

The update process is the same as insert, but since we are updating an existing record we need an ID or an External ID during update. The delete command only needs the ID of the record to be deleted.

Apart from the general CRUD commands provides a special upsert command. Upsert is the combination of insert and update in a single call.

Upsert uses ID or external ID to match records with existing records, if no match is found or the ID is missing, it inserts the record. The upsert command is helpful in avoiding duplicates based on ID or external ID. If more than one record is matched, an error is reported.

External IDs

When we are migrating data from other systems, it is useful to have a foreign key to link data between the two systems. The external ID helps to create that link.

A custom field can be marked as an external ID to identify the primary keys of the legacy system (outside Salesforce). Text, number, e-mail, and auto-number fields can be flagged as external IDs.

Fields marked as external IDs are indexed and are available on all the objects that can have custom fields. These fields are searchable and appear in search queries.

An object can have three external ID fields. The external ID fields become the cross reference fields to the legacy systems. For example, if the following data is in the legacy spread sheets for the library system we can migrate it to Salesforce by mapping the Media Number to the Media Number External ID field.

The original data from the legacy system are given as follows:

Media Number

Media Name

Media Type


Harry Potter and the chamber of secret






Mission Impossible


We can map this legacy data to the fields using the following mapping:


(External ID)




Harry Potter and the chamber of secret






Mission Impossible


When we are migrating data from the legacy system into the system, we can upsert the data using the external ID field. This way there is no need to know the Salesforce record ID. We can prevent duplication of data using the external ID.

We can also load data into related objects without using ID. Ideally, when loading data into related objects, we need to include the Salesforce ID as the field name in the data files, but we can also include an external ID and perform an upsert. Using the external ID while loading relationship is only permitted during the upsert operation.

Exercise – migrating data from legacy system

The general library wishes to migrate the media information from the spread sheets they are using. They have a field called Media Number in the spread sheet, which is a unique identifier (primary key) in the current system. Migrate the data into the media object and avoid duplicates.

To migrate the data from the spread sheet, we first need to create an external ID field in the media object.

Create a Media_number field on the media object in Salesforce, the field can be a text field. Check the flag for external ID to flag it. While loading the data use the Media_ number fleld and use upsert to load the data.


Please enter your comment!
Please enter your name here