From DynamoDB to MongoDB

Near-zero downtime migration from DynamoDB to MongoDB with dsync

Prerequisites

1) DynamoDB table(s) with DynamoDB Streams enabled (if CDC is needed). Make sure the stream view type includes at least "New Image".

2) MongoDB cluster. Any sharded databases and collections need to be pre-created.

3) AWS credentials with proper permissions for the source

4) Installed aws-cli, or the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables

Additional Considerations

Indexes

For faster migration performance, we recommend creating secondary indexes on the destination MongoDB cluster after the initial sync is done. All modern MongoDB versions support building indexes in the background, so the indexes can be created during the CDC / catch-up phase.
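As a sketch, a secondary index can be created from the shell with mongosh once the initial sync completes. The database, collection, and field names below (appdb, users, email) are placeholders for illustration:

```shell
# Create a secondary index on the destination after the initial sync finishes.
# "appdb", "users", and "email" are placeholder names -- substitute your own.
mongosh "$MONGODB_URI" --eval 'db.getSiblingDB("appdb").users.createIndex({ email: 1 })'
```

MongoDB builds the index in the background by default, so writes arriving from the CDC phase are not blocked.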

Networking

The host running dsync should have network access to both DynamoDB and MongoDB instances.

Provisioning additional capacity on the source

The initial data copy stage of the migration is equivalent to a parallel table scan in DynamoDB, and it will consume additional read units. If your DynamoDB source is already serving live production traffic and the table(s) are configured with provisioned capacity, we recommend temporarily increasing read capacity to avoid impacting production traffic. The exact increase varies on a case-by-case basis, but as a rule of thumb we recommend adding at least 10,000 RCUs.

Step 1: Download dsync

Adiom on Azure Marketplace


Working on a large-scale production environment? Use our horizontally scalable Enterprise offering.

Use Docker (markadiom/dsync) or download the latest release from the GitHub Releases page. Note that on Mac devices you may need to configure a security exception to execute the binary by following these steps.

You can also build dsync directly from the source code using go build.

If you're using a Cloud Provider marketplace image (e.g. from Azure Marketplace), then the binaries have already been preinstalled, and you just need to ssh into your provisioned instance.

If you want to access the Web UI progress feature (served on port 8080 by default), you can port forward from your local machine.
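One way to do this, assuming you connect to the instance over SSH (the user and hostname below are placeholders), is an SSH local port forward:

```shell
# Forward local port 8080 to the dsync Web UI on the remote instance.
# "azureuser" and "your-vm-address" are placeholders for your instance's login and address.
ssh -L 8080:localhost:8080 azureuser@your-vm-address
# Then open http://localhost:8080 in your local browser.
```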

Step 2: Set up environment variables

Ensure your AWS credentials are set properly, for example by setting the AWS environment variables.
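For example (all values below are placeholders; AWS_REGION is an assumption you should adjust to wherever your tables live):

```shell
# Placeholder credentials -- substitute your own values.
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
# Assumed region -- set this to the region hosting your DynamoDB tables.
export AWS_REGION="us-east-1"
```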

Alternatively, you can use the aws configure sso (for first-time setup or on a marketplace VM) and aws sso login shell commands to securely log in to your AWS account for aws-cli. Verify access by listing your DynamoDB tables with aws dynamodb list-tables.

(Optional) Step 3: Start the transformer

When data transformations are required, you can connect a transformer to dsync via the gRPC extension interface. You can write your own in the language of your choice, or use our YAML-based declarative transformer (available only for Enterprise customers).

When running the transformer in Docker, make sure the container is on the same Docker network as the dsync container (use the --network option of docker run).
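A sketch of this setup, assuming a user-defined bridge network and a hypothetical transformer image (the names dsync-net, my-transformer, and transformer are placeholders):

```shell
# Create a shared network and attach both containers to it.
# "dsync-net", "my-transformer", and the container name "transformer" are placeholders.
docker network create dsync-net
docker run -d --network dsync-net --name transformer my-transformer:latest
# Note: on a shared Docker network, dsync reaches the transformer by its
# container name rather than localhost.
docker run --network dsync-net markadiom/dsync \
  --namespace <TABLENAME>:<DB>.<COL> dynamodb "$MONGODB_URI" \
  grpc://transformer:8085 --insecure
```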

Step 4: Start dsync

Run dsync --namespace <TABLENAME>:<DB>.<COL> dynamodb $MONGODB_URI. Replace <TABLENAME> with the DynamoDB table name, and <DB>.<COL> with the destination MongoDB database and collection. Replace $MONGODB_URI with the desired MongoDB connection URI.
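For example, to migrate a DynamoDB table named users into the appdb.users collection (the table, database, collection names, and URI below are placeholders):

```shell
# Placeholder names: "users" table, "appdb" database, "users" collection.
export MONGODB_URI="mongodb://localhost:27017"   # adjust to your cluster
dsync --namespace users:appdb.users dynamodb "$MONGODB_URI"
```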

When a transformer is required, use dsync --namespace <TABLENAME>:<DB>.<COL> dynamodb $MONGODB_URI grpc://localhost:8085 --insecure

We use --insecure because the connection to the transformer service does not use TLS, and we assume the service is running on the same host on port 8085.

You can migrate multiple tables at the same time by specifying multiple mappings in the --namespace parameter:

dsync --namespace "<TABLE1>:<DB>.<COL1>,<TABLE2>:<DB>.<COL2>" dynamodb $MONGODB_URI grpc://localhost:8085 --insecure

For Cloud Marketplace images and Docker:
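On a marketplace VM, dsync is preinstalled and the same commands apply directly. With Docker, a sketch using the markadiom/dsync image mentioned above (the namespace mapping is a placeholder):

```shell
# Run dsync from the Docker image; AWS credentials are passed through
# from the host environment as container env vars.
docker run --rm \
  -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
  markadiom/dsync --namespace <TABLENAME>:<DB>.<COL> dynamodb "$MONGODB_URI"
```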

Step 5: Post-migration configuration

Indexes

Create/validate the necessary indexes on MongoDB.
