Adiom | Documentation

Features

Here are some features of dsync that make our tools unique: safer, faster, and cheaper.


Last updated 3 months ago

Dsync is currently in beta and is undergoing active development and testing.

Namespace Filtering

Lets you selectively migrate data for specific databases or collections.

  • Can be enabled with --ns "db_name_1,db_name_2.col_name"

  • You can also rename namespaces during a sync with --ns "source_db.source_col:dst_db.dst_col".

    • In some cases you can also rename without a fully qualified namespace (e.g. just the database), but it may not work properly with --reverse or --mode CDC.
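To make the --ns syntax concrete, here is a minimal Python sketch of how such a value could be split into filter and rename rules. It is an illustration only, not dsync's parser: the names NamespaceRule, parse_ns, and matches are hypothetical, and the rule that a database-only entry covers every collection in that database is an assumption based on the example above.

```python
from typing import NamedTuple


class NamespaceRule(NamedTuple):
    source: str       # e.g. "db" or "db.col"
    destination: str  # same as source unless a rename was given


def parse_ns(value: str) -> list[NamespaceRule]:
    """Split a comma-separated --ns value into (source, destination) rules."""
    rules = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        # "src:dst" renames; a bare namespace maps to itself.
        source, sep, destination = item.partition(":")
        rules.append(NamespaceRule(source, destination if sep else source))
    return rules


def matches(rule: NamespaceRule, db: str, col: str) -> bool:
    """Assumption: a database-only rule covers every collection in it."""
    return rule.source in (db, f"{db}.{col}")
```

For example, parse_ns("source_db.source_col:dst_db.dst_col") yields a single rule whose destination differs from its source, while parse_ns("db_name_1,db_name_2.col_name") yields two rules that map each namespace to itself.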

Data Integrity Check

This check ensures that data remains unaltered and intact during the migration process. Dsync's data integrity check can be run separately from the data migration.

Currently, dsync supports two data integrity check mechanisms:

  • Fast namespace counts with --verify-quick-count

  • Complete hash-based verification with --verify

  • See dsync --help for usage

  • Read the docs for internal details

The quick count method is a meaningful heuristic for most applications, and in most cases it takes a trivial amount of time. The complete hash-based verification takes longer because it has to read every single document. Dsync parallelizes the work where possible, just like it does for the initial data copy. Additionally, if a mismatch is found, dsync interrupts further checks and fails early (although that shouldn't happen during normal operation).
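The difference between the two mechanisms can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not dsync's implementation: the hashing scheme (SHA-256 over canonical JSON) and the function names are hypothetical, and the sketch runs sequentially rather than in parallel.

```python
import hashlib
import json


def quick_count_ok(source_counts: dict, dest_counts: dict) -> bool:
    """Cheap heuristic: every namespace has the same document count."""
    return source_counts == dest_counts


def doc_hash(doc: dict) -> str:
    # Canonical JSON (sorted keys) so field order doesn't change the hash.
    payload = json.dumps(doc, sort_keys=True, default=str).encode()
    return hashlib.sha256(payload).hexdigest()


def verify_namespace(source_docs, dest_docs) -> None:
    """Full check: hash every document and raise on the first mismatch."""
    dest_hashes = {d["_id"]: doc_hash(d) for d in dest_docs}
    seen = 0
    for doc in source_docs:
        if dest_hashes.get(doc["_id"]) != doc_hash(doc):
            raise ValueError(f"mismatch for _id={doc['_id']!r}")  # fail early
        seen += 1
    if seen != len(dest_hashes):
        raise ValueError("destination has extra documents")
```

Two namespaces with equal counts but one altered document pass quick_count_ok and fail verify_namespace, which is why the count check is a heuristic and the hash check is the complete one.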

Resumability

When restarting the flow during the initial data copy or Change Data Capture (CDC), dsync can safely resume from the last saved state.

Deletes Emulation

Cosmos DB with the MongoDB API currently doesn't report delete operations in change streams, meaning that it doesn't emit change events for documents that are deleted. Dsync includes a workaround that captures and simulates delete operations from the source using a periodic index scan.

  • Can be turned on using --cosmos-deletes-cdc

The destination namespace(s) must be empty. Otherwise, pre-existing destination data that doesn't exist on the source database might get deleted.
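One way to picture the workaround, and the reason for the warning about pre-existing data, is a set difference over document IDs. The Python sketch below is an assumption about the general idea only: emulate_deletes is a hypothetical name, and how dsync actually performs and schedules the index scan is not described here.

```python
def emulate_deletes(source_ids: set, destination_ids: set) -> set:
    """Return the IDs to delete on the destination after one scan."""
    # Anything on the destination that the source scan did not return
    # is treated as having been deleted at the source.
    return destination_ids - source_ids


# A document removed from the source since the last scan is detected.
to_delete = emulate_deletes({"a", "b"}, {"a", "b", "c"})

# Pre-existing destination data that never existed on the source is
# indistinguishable from a source-side delete, so it is flagged too.
flagged = emulate_deletes({"a"}, {"a", "legacy-doc"})
```

Here to_delete contains only "c", and flagged contains "legacy-doc" even though nothing was deleted on the source.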

Progress Report

Dsync displays a detailed progress report of the sync process. This includes the time elapsed, number of namespaces synced, number of docs synced, number of tasks completed out of the total, percentage complete, and the throughput (docs/second).

Dsync supports both CLI and web-based progress reports.

  • The CLI progress report can be enabled with the --progress --logfile dsync.log command-line options. Note that writing logs to a logfile is required.
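As a rough illustration of how the reported figures relate to each other, the following Python sketch derives the percentage and throughput from raw counters. The function name, the output format, and the choice to base the percentage on completed tasks are assumptions, not dsync's actual output.

```python
def progress_line(elapsed_s: float, namespaces: int, docs: int,
                  tasks_done: int, tasks_total: int) -> str:
    """Format one progress line from raw counters."""
    # Assumption: percentage complete is derived from finished tasks.
    percent = 100.0 * tasks_done / tasks_total if tasks_total else 0.0
    # Throughput is documents synced per second of elapsed time.
    throughput = docs / elapsed_s if elapsed_s > 0 else 0.0
    return (f"elapsed={elapsed_s:.0f}s namespaces={namespaces} docs={docs} "
            f"tasks={tasks_done}/{tasks_total} ({percent:.1f}%) "
            f"throughput={throughput:.0f} docs/s")
```

For instance, 120000 documents synced in 60 seconds with 5 of 20 tasks finished gives 25.0% complete and a throughput of 2000 docs/s.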

Load Level

Dsync lets you set the load level to "Low", "Medium", "High", or "Beast". The load level controls the number of threads used for reading and writing data: a lower level results in a slower migration with less impact on the system, while a higher level uses more threads for a faster migration at the cost of higher resource consumption. When the load level is not specified, dsync defaults to connector-specific settings.

  • Can be set with --load-level LOAD_LEVEL

Reversing the flow

Reverse the original flow by restarting dsync with the same parameters plus --reverse.

This starts a new flow in CDC mode with the source and destination inverted, bypassing the initial data sync. When the metadata store isn't specified, it defaults to the original destination (the new source).

--reverse is effectively a convenient shortcut for swapping source and destination, setting "-m", and adding "--mode CDC".

Reversal will respect the provided --ns option, but otherwise will replicate changes from ALL of the namespaces on the original destination (the new source), regardless of what was originally replicated. Use the --ns option to limit which specific namespaces should be included in the reversal, if needed.
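The shortcut described above can be modelled as a small transformation on the flow's settings. This Python sketch is illustrative only: the Flow class and reverse function are hypothetical, and it assumes that "-m" is the option that sets the metadata store.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    metadata: Optional[str] = None  # value of -m, if given (assumption)
    mode: Optional[str] = None      # value of --mode, if given


def reverse(flow: Flow) -> Flow:
    """Expand --reverse into the equivalent explicit settings."""
    return replace(
        flow,
        source=flow.destination,   # the original destination is the new source
        destination=flow.source,
        # Default the metadata store to the original destination.
        metadata=flow.metadata or flow.destination,
        mode="CDC",                # skip the initial data sync
    )
```

Reversing Flow("src-uri", "dst-uri") gives a flow that reads from "dst-uri", writes to "src-uri", keeps its metadata on "dst-uri", and runs in CDC mode. Namespace filtering is left out of the sketch; as noted above, --ns still applies to the reversed flow.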

Other features

Run dsync with no arguments to see additional options, in particular the available connectors and the connector-specific parameters they accept.

Read the docs for granular resumability and partitioning.

See the docs for internal details.

Web progress report is enabled by default and can be accessed at http://localhost:8080/progress on the host where dsync is running.