Transform Data Types

Data Format

During a sync, data can appear in three different formats depending on the operation:

  1. Initial Sync -- A single binary blob representing the full document.

  2. Updates -- A list representing the ID fields, plus a binary blob of the updated data.

  3. Deletes -- A list representing only the ID fields.

A correct mapping must handle all three cases. In practice, you may need to provide a mapping for both the data (to cover case 1 and the data blob in case 2) and the ID (to cover cases 2 and 3).
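As a rough illustration of why both mappings are needed, the three shapes can be modeled in Python as follows. The `SyncEvent` class, its field names, and the `map_data` / `map_id` callbacks are hypothetical names invented for this sketch; they are not the connector's actual API or wire format.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class SyncEvent:
    """Hypothetical model of one sync message (not the real wire format)."""
    kind: str                               # "initial", "update", or "delete"
    id_values: Optional[List[Any]] = None   # ID fields; absent for the initial sync
    data: Optional[bytes] = None            # document blob; absent for deletes


def transform(event: SyncEvent,
              map_data: Callable[[bytes], bytes],
              map_id: Callable[[List[Any]], List[Any]]) -> SyncEvent:
    """Apply the data mapping and the ID mapping wherever each part is present."""
    new_id = map_id(event.id_values) if event.id_values is not None else None
    new_data = map_data(event.data) if event.data is not None else None
    return SyncEvent(event.kind, new_id, new_data)
```

A delete carries no blob, so only `map_id` runs for it; an initial sync carries no ID list, so only `map_data` runs. A config that defines just one of the two mappings would leave some events untransformed.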

Data Types

Connectors may support different data types. Currently we support JSON and BSON. When transferring data between connectors that use different types, you need to provide configuration to map between them.

ID Mapping

The ID often needs its own transformation because its data type, its field names, or both may differ between source and destination.

Defaults:

  • JSON: single field called id

  • BSON: single field called _id

IDs can also be composed of multiple fields. Use the idkeys (source) and finalidkeys (destination) properties in the config if the ID format does not match the defaults on either side.

General approach:

  • Specify new fields under the add property.

  • Specify old fields to remove under the delete property.

  • Define a mapid expression so that update IDs can be set correctly.

  • Add a cel expression for each new ID field showing how it is populated from the data.

The id Variable in CEL

In CEL expressions, id is a built-in variable representing the document's ID.

  • If the ID has one field, id is the value of that field directly.

  • If the ID has multiple fields, id is a list of values.

You can set idlist: true at the top level of the config to force id to always be a list, even when the ID contains only one field.
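The binding rules above can be summarized in a short Python sketch. `bind_id` is a hypothetical helper used only to illustrate what a CEL expression would see as `id`; it is not part of the tool.

```python
from typing import Any, List


def bind_id(id_values: List[Any], idlist: bool = False) -> Any:
    """Return the value exposed as `id` for the given ID fields (illustrative only)."""
    if idlist or len(id_values) != 1:
        # Multi-field IDs, or idlist: true, always yield a list.
        return list(id_values)
    # A single-field ID is exposed as the bare value.
    return id_values[0]


single = bind_id(["123"])                # the bare value "123"
multi = bind_id(["eu", 42])              # the list ["eu", 42]
forced = bind_id(["123"], idlist=True)   # the one-element list ["123"]
```

Setting `idlist: true` can simplify expressions that must work for both single-field and multi-field IDs, since `id[0]` is then always valid.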

JSON to BSON Examples

Rename a string id ("123") to a string _id ("123"):

Map a string id ("123") to an ObjectID _id ({"$oid": "202cb962ac59075b964b0715"}):
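The following is a Python sketch of the transformation itself, not connector configuration. It relies on one observation: the sample ObjectID equals the first 24 hex characters of the MD5 digest of "123". Whether the connector derives ObjectIDs this way in general is an assumption, and `string_id_to_oid` is a hypothetical helper name.

```python
import hashlib


def string_id_to_oid(value: str) -> dict:
    """Derive an Extended JSON ObjectID (24 hex chars) from a string ID.

    Assumption: the ID is hashed with MD5 and truncated to 24 hex characters,
    which reproduces the sample value in the text for "123".
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return {"$oid": digest[:24]}


oid = string_id_to_oid("123")  # {"$oid": "202cb962ac59075b964b0715"}
```

A hash-based mapping like this is deterministic, so updates and deletes for the same source ID land on the same destination document, but it is one-way: the original string cannot be recovered from the ObjectID.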

BSON to JSON Examples

Rename a string _id ("123") to a string id ("123"):

Convert an ObjectID _id ({"$oid": "202cb962ac59075b964b0715"}) to a string id ("202cb962ac59075b964b0715"):
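In Python terms (again a sketch of the transformation, not connector configuration), this conversion amounts to unwrapping the Extended JSON `$oid` value. `oid_to_string_id` is a hypothetical helper name.

```python
import string


def oid_to_string_id(doc_id: dict) -> str:
    """Unwrap {"$oid": "<24 hex chars>"} into the plain hex string."""
    hex_value = doc_id["$oid"]
    # An ObjectID is 12 bytes, i.e. exactly 24 hexadecimal characters.
    if len(hex_value) != 24 or any(c not in string.hexdigits for c in hex_value):
        raise ValueError(f"not a valid ObjectID hex string: {hex_value!r}")
    return hex_value


plain = oid_to_string_id({"$oid": "202cb962ac59075b964b0715"})
```

The result is the hex string "202cb962ac59075b964b0715", not the original "123" from the JSON to BSON example, because that direction is not reversible.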

Multi-Part ID Examples

When the source ID is composed of multiple fields, id becomes a list. Use idkeys to declare the source ID fields and finalidkeys for the destination. Individual parts are accessed with id[0], id[1], etc.

Map a two-part JSON ID (region and user_id) to BSON, renaming them to _region and _user_id:
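A Python sketch of what this mapping has to accomplish is given below; it is not connector configuration. The constants mirror the `idkeys` and `finalidkeys` properties, while the function names are hypothetical.

```python
from typing import Any, Dict, List

ID_KEYS = ["region", "user_id"]           # source ID fields (idkeys)
FINAL_ID_KEYS = ["_region", "_user_id"]   # destination ID fields (finalidkeys)


def rename_id_fields(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Full-document case: drop the old ID fields and add the renamed ones."""
    out = {k: v for k, v in doc.items() if k not in ID_KEYS}  # the "delete" step
    for src, dst in zip(ID_KEYS, FINAL_ID_KEYS):              # the "add" step
        out[dst] = doc[src]
    return out


def map_update_id(id_values: List[Any]) -> Dict[str, Any]:
    """Update/delete case: only the ID list is available (id[0], id[1])."""
    return dict(zip(FINAL_ID_KEYS, id_values))
```

Both functions must agree on the destination ID, otherwise an update or delete would target a different document than the one written during the initial sync.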

Collapse a two-part JSON ID (tenant and record_id) into a single BSON _id string by concatenating them:
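The concatenation can be sketched in Python as follows (an illustration, not connector configuration). The ":" delimiter and the `collapse_id` name are assumptions made for this example; pick a delimiter that cannot occur in `tenant` values.

```python
from typing import Any, List

DELIMITER = ":"  # assumption: a separator that never appears in tenant values


def collapse_id(id_values: List[Any]) -> str:
    """Join a two-part ID [tenant, record_id] into one string for _id."""
    tenant, record_id = id_values  # corresponds to id[0] and id[1]
    return f"{tenant}{DELIMITER}{record_id}"


combined = collapse_id(["acme", 1001])  # "acme:1001"
```

Because the result is a string, a numeric `record_id` loses its type; any reverse mapping has to convert it back explicitly.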

Expand a single BSON _id back into a two-part JSON ID by splitting on a delimiter:
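A Python sketch of the split is shown below (an illustration, not connector configuration). The ":" delimiter, the `tenant` / `record_id` field names taken from the previous example, and the `expand_id` helper are assumptions.

```python
from typing import Dict

DELIMITER = ":"  # assumption: must match the delimiter used when collapsing


def expand_id(bson_id: str) -> Dict[str, str]:
    """Split a combined _id string into the two JSON ID fields."""
    # partition splits at the first delimiter only, so record_id may itself
    # contain the delimiter, but tenant may not.
    tenant, sep, record_id = bson_id.partition(DELIMITER)
    if not sep:
        raise ValueError(f"delimiter {DELIMITER!r} not found in {bson_id!r}")
    return {"tenant": tenant, "record_id": record_id}


parts = expand_id("acme:1001")  # {"tenant": "acme", "record_id": "1001"}
```

Both parts come back as strings; if the JSON side expects a numeric `record_id`, the mapping needs an explicit conversion.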
