Features
Here are some features of dsync that make our tools unique: safer, faster, and cheaper.
Dsync is currently in beta and is undergoing active development and testing.
Dsync can selectively migrate data for specific databases or collections.
Can be enabled with --ns "db_name_1,db_name_2.col_name"
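To make the filter format concrete, here is a minimal Python sketch of how a comma-separated value such as the one above could be interpreted. It is an illustration only, not dsync's implementation: the function names `parse_namespaces` and `is_included` are hypothetical, and the rule that a bare database name matches all of its collections is an assumption based on the example value.

```python
def parse_namespaces(ns_arg):
    """Split a comma-separated --ns value into (database, collection) pairs.

    A collection of None is assumed to mean "every collection in that database".
    """
    filters = []
    for item in ns_arg.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first dot only: the part before it is the database name.
        db, _, coll = item.partition(".")
        filters.append((db, coll or None))
    return filters


def is_included(filters, db, coll):
    """Return True if db.coll matches a filter; an empty filter list matches everything."""
    if not filters:
        return True
    return any(
        f_db == db and (f_coll is None or f_coll == coll)
        for f_db, f_coll in filters
    )


filters = parse_namespaces("db_name_1,db_name_2.col_name")
print(filters)
print(is_included(filters, "db_name_2", "other_col"))
```

With this reading, every collection in `db_name_1` is migrated, while only `col_name` is migrated from `db_name_2`.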
The data integrity check verifies that data remains unaltered and intact during the migration process. It can be run separately from the data migration.
Currently, dsync supports two data integrity check mechanisms:
Fast namespace counts with --verify-quick-count
Complete hash-based verification with --verify
See dsync --help for usage
Read docs for internal details
The quick count method is a meaningful heuristic for most applications, and in most cases it takes a trivial amount of time. The complete hash-based verification takes longer because it has to read every single document. Dsync parallelizes the work where possible, just as it does for the initial data copy. Additionally, if a mismatch is found, dsync interrupts further checks and fails early (although that, of course, shouldn't happen during the normal course of operation).
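The difference between the two mechanisms can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not dsync's actual algorithm: namespaces are modeled as in-memory lists of dicts, the names `quick_count_check`, `hash_check`, and `VerificationError` are hypothetical, the order-independent digest (sum of per-document SHA-256 values) is one possible choice, and the sketch runs sequentially rather than in parallel.

```python
import hashlib
import json

MODULUS = 2 ** 256


class VerificationError(Exception):
    """Raised at the first mismatch so that remaining checks are skipped."""


def quick_count_check(source, destination):
    """Compare per-namespace document counts only (cheap heuristic)."""
    for ns, docs in source.items():
        src_n, dst_n = len(docs), len(destination.get(ns, []))
        if src_n != dst_n:
            raise VerificationError(f"{ns}: count {src_n} != {dst_n}")


def namespace_digest(docs):
    """Order-independent digest: sum of per-document SHA-256 values mod 2**256."""
    total = 0
    for doc in docs:
        # Canonical JSON so that key order inside a document does not matter.
        canonical = json.dumps(doc, sort_keys=True).encode("utf-8")
        value = int.from_bytes(hashlib.sha256(canonical).digest(), "big")
        total = (total + value) % MODULUS
    return total


def hash_check(source, destination):
    """Read every document on both sides and compare digests; fail early."""
    for ns, docs in source.items():
        if namespace_digest(docs) != namespace_digest(destination.get(ns, [])):
            raise VerificationError(f"{ns}: content hash mismatch")


source = {"db.users": [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}]}
altered = {"db.users": [{"_id": 1, "name": "a"}, {"_id": 2, "name": "X"}]}

quick_count_check(source, altered)  # passes: counts are equal
try:
    hash_check(source, altered)
except VerificationError as err:
    print("hash check failed:", err)
```

The example shows why the count check is only a heuristic: a document whose content changed leaves the counts equal, and only the hash-based check detects it.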
When restarting the flow during the initial data copy or Change Data Capture (CDC), dsync can safely resume from the last saved state.
Cosmos DB with MongoDB API currently doesn't support delete operations in change streams, meaning that it doesn't emit events for deletes that occur on the source. Dsync includes a workaround that detects deletes on the source using a periodic index scan and simulates the corresponding delete operations.
Can be turned on using --cosmos-deletes-cdc
See docs for internal details
The destination namespace(s) must be empty; otherwise, pre-existing data that does not also exist on the source database might get deleted.
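One way to picture the workaround, and the reason for the warning about non-empty destinations, is a set difference between the document IDs seen on each side. The Python sketch below is an assumption about the general idea, not a description of dsync's internals: collections are modeled as dicts keyed by `_id`, and the function names are hypothetical.

```python
def find_deleted_ids(source_ids, destination_ids):
    """IDs present on the destination but absent from the source are treated as deleted."""
    return set(destination_ids) - set(source_ids)


def apply_simulated_deletes(source, destination):
    """Run one scan: remove from destination every _id that the source no longer has."""
    deleted = find_deleted_ids(source.keys(), destination.keys())
    for doc_id in deleted:
        del destination[doc_id]
    return sorted(deleted)


# Document 2 was deleted on the source after the initial copy.
source = {1: {"name": "a"}, 3: {"name": "c"}}
# Document 99 existed on the destination before the migration started.
destination = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}, 99: {"name": "old"}}

print(apply_simulated_deletes(source, destination))
```

In this model, the pre-existing document 99 is removed together with the genuinely deleted document 2, because a scan of this kind cannot tell the two cases apart.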
Dsync displays a detailed progress report of the sync process. This includes the time elapsed, number of namespaces synced, number of docs synced, number of tasks completed out of the total, percentage complete, and the throughput (docs/second).
Dsync supports both CLI and web-based progress reports.
Web progress report is enabled by default and can be accessed at http://localhost:8080/progress on the host where dsync is running.
CLI progress report can be enabled with the --progress --logfile dsync.log command-line options. Note that writing logs to a logfile is required in this mode.
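The two derived figures in the report, percentage complete and throughput, are simple ratios. The Python sketch below shows one plausible way to compute them; it assumes that the percentage is based on completed tasks out of the total, which the text above suggests but does not state, and the function name `progress_snapshot` is hypothetical.

```python
def progress_snapshot(tasks_done, tasks_total, docs_synced, elapsed_seconds):
    """Compute percentage complete (task-based, an assumption) and docs/second."""
    percent = 100.0 * tasks_done / tasks_total if tasks_total else 0.0
    throughput = docs_synced / elapsed_seconds if elapsed_seconds > 0 else 0.0
    return {
        "percent_complete": round(percent, 1),
        "throughput_docs_per_sec": round(throughput, 1),
    }


# 3 of 12 tasks done, 45,000 documents copied in 90 seconds.
print(progress_snapshot(3, 12, 45000, 90.0))
```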
Dsync allows the user to set the load level to "Low", "Medium", "High", or "Beast". The load level controls the number of threads employed for reading and writing data, with a higher load level corresponding to more threads. Generally, a lower level results in slower migration but less system impact, while a higher level uses more threads for faster migration at the cost of higher resource consumption. When the load level is not specified, it defaults to connector-specific settings.
Can be enabled with --load-level LOAD_LEVEL
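The relationship between load level and thread count can be sketched with a thread pool in Python. The numbers below are placeholders chosen for illustration; dsync's real thread counts are connector-specific and are not given here. The names `WORKERS_BY_LEVEL`, `workers_for`, and `copy_batches` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical thread counts, for illustration only.
WORKERS_BY_LEVEL = {"Low": 1, "Medium": 4, "High": 8, "Beast": 16}
DEFAULT_WORKERS = 4  # stand-in for a connector-specific default


def workers_for(load_level=None):
    """Map a load level name to a worker count; None falls back to the default."""
    if load_level is None:
        return DEFAULT_WORKERS
    if load_level not in WORKERS_BY_LEVEL:
        raise ValueError(f"unknown load level: {load_level!r}")
    return WORKERS_BY_LEVEL[load_level]


def copy_batches(batches, write_batch, load_level=None):
    """Write batches concurrently; more workers means more parallel writes."""
    with ThreadPoolExecutor(max_workers=workers_for(load_level)) as pool:
        return list(pool.map(write_batch, batches))


# Example: "write" each batch by counting its documents.
print(copy_batches([[1, 2], [3], [4, 5, 6]], len, load_level="High"))
```

A higher level lets more batches be in flight at once, which is where both the speedup and the extra load on the source and destination come from.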
Reverse the original flow by restarting dsync with the same parameters and an added --reverse.
This starts a new flow with the source and destination inverted, running in CDC mode and bypassing the initial data sync. When the metadata store isn't specified, it defaults to the original destination (the new source).
--reverse is effectively a convenient shortcut for swapping the "-s" and "-d" options, setting "-m", and adding "--mode CDC".
Reversal will respect the provided --ns option, but otherwise will replicate changes from ALL of the namespaces on the original destination (the new source), regardless of what was originally replicated. Use the --ns option to limit which specific namespaces should be included in the reversal, if needed.
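The option rewriting that --reverse stands for can be expressed as a small transformation. The Python sketch below models the options as a dict and follows the description above; it is not dsync's code, the function name `reverse_options` is hypothetical, and `SRC_URI` and `DST_URI` are placeholder strings rather than real connection strings.

```python
def reverse_options(opts):
    """Return the options for the reversed flow.

    opts uses the keys 's' (source), 'd' (destination), optional 'm'
    (metadata store), and optional 'ns' (namespace filter).
    """
    reversed_opts = dict(opts)
    reversed_opts["s"] = opts["d"]  # original destination becomes the source
    reversed_opts["d"] = opts["s"]  # original source becomes the destination
    # Metadata store defaults to the original destination (the new source).
    reversed_opts["m"] = opts.get("m") or opts["d"]
    reversed_opts["mode"] = "CDC"   # skip the initial data sync
    return reversed_opts


original = {"s": "SRC_URI", "d": "DST_URI", "ns": "db_name_1"}
print(reverse_options(original))
```

The `ns` value is carried over unchanged, matching the statement that the reversal respects the provided --ns option.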