Verification
Dsynct provides several commands for verifying that data matches between source and destination. All verification commands run in simple mode (DSYNCT_MODE=simple). These features are currently experimental.
verify
The verify command performs a full verification by reading all data from both the source and destination connectors in parallel and comparing them. It supports both initial sync verification and ongoing change stream verification.
docker run -e 'DSYNCT_MODE=simple' \
markadiom/dsynct verify \
--namespace <NAMESPACE> \
<SOURCE> <DESTINATION>
With transformations (source-side and/or destination-side):
docker run -e 'DSYNCT_MODE=simple' \
-v "./transform.yaml:/transform.yaml" \
markadiom/dsynct verify \
--namespace <NAMESPACE> \
--src-transform \
<SOURCE> <DESTINATION> dsync-transform://transform.yaml

| Flag | Required | Description |
| --- | --- | --- |
| --namespace | No | Source namespace(s). Can be specified multiple times. |
| --dst-namespace | No | Destination namespace(s). Defaults to source namespaces (or mapped namespaces). |
| --namespace-mapping | No | Namespace mapping from source to destination. |
| --parallelism | No | Number of parallel workers. Default: 1. |
| --src-transform | No | Set if a source-side transformer is provided after the two connectors. |
| --dst-transform | No | Set if a destination-side transformer is provided (after the source transformer if present). |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Destination data type. Inferred if not set. |
| --transform-data-type | No | Intermediate comparison data type. Default: DATA_TYPE_MONGO_BSON. |
| --skip-initial-sync | No | Skip initial sync verification. |
| --skip-change-stream | No | Skip change stream verification. |
| --latency | No | Only compare documents that have not been updated for this duration during change stream mode. Default: 20s. |
| --report-interval | No | How often to print progress reports. Default: 1s. |
| --report-limit | No | Maximum number of mismatches to report per interval. Default: 5. |
| --report-all | No | Report all mismatches instead of limiting. |
| --projection | No | JSON describing which fields to include in comparisons (e.g. {"field": {"inner_field": true}}). |
| --id-key | No | Field name(s) that make up the document ID. Can be specified multiple times for composite keys. Default: _id. |
| --partition | No | Partition number (0-indexed) for distributed verification. Default: 0. |
| --total-partitions | No | Total number of partitions for distributed verification. Default: 1. |
| --mapping-delimiter | No | Delimiter for namespace mappings. Default: :. |
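The --partition and --total-partitions flags suggest splitting one verification across several workers. The following is a minimal sketch, assuming each worker runs the same verify command with its own 0-indexed partition number. The TOTAL value, the placeholder variables, and the run dry-run helper are illustrative and not part of Dsynct.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with real connection strings and namespace.
SOURCE='<SOURCE>'
DESTINATION='<DESTINATION>'
NAMESPACE='<NAMESPACE>'
TOTAL=4   # hypothetical number of partitions

# Dry run by default: print each command; set DRY_RUN=0 to execute it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Emit one verify invocation per partition, numbered 0 .. TOTAL-1.
verify_partitions() {
  i=0
  while [ "$i" -lt "$TOTAL" ]; do
    run docker run -e DSYNCT_MODE=simple \
      markadiom/dsynct verify \
      --namespace "$NAMESPACE" \
      --partition "$i" \
      --total-partitions "$TOTAL" \
      "$SOURCE" "$DESTINATION"
    i=$((i + 1))
  done
}

verify_partitions
```

With DRY_RUN=0 this loop would run the partitions one after another. In a real distributed setup each printed command would more likely be started on its own worker.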
sample-ids
The sample-ids command samples document IDs from a source namespace using reservoir sampling. The output can be written to a file for later use with verify-ids --id-file or testsync --id-file.
To sample IDs after a transformation (so the IDs reflect the transformed data):
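A sketch of such an invocation, assuming the transformer URL is passed right after the source connector (as the --transform flag describes) and reusing the transform.yaml mount from the verify example. The placeholder variables and the run dry-run helper are illustrative only.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with a real connection string and namespace.
SOURCE='<SOURCE>'
NAMESPACE='<NAMESPACE>'

# Dry run by default: print the command; set DRY_RUN=0 to execute it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Sample IDs as they look after the transformation has been applied.
sample_transformed_ids() {
  run docker run -e DSYNCT_MODE=simple \
    -v "./transform.yaml:/transform.yaml" \
    markadiom/dsynct sample-ids \
    --namespace "$NAMESPACE" \
    --transform \
    "$SOURCE" dsync-transform://transform.yaml
}

sample_transformed_ids
```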
| Flag | Required | Description |
| --- | --- | --- |
| --namespace | Yes | The source namespace to sample from. |
| --count | No | Number of IDs to sample. Default: 100. |
| --output | No | Output file path. Defaults to stdout. |
| --max-iter-per-partition | No | Maximum number of ListData iterations per partition. 0 for unlimited. |
| --transform | No | Set if a transformer is provided after the source connector. |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Data type after transform. Inferred if not set. |
| --id-key | No | Field name(s) that make up the document ID. Can be specified multiple times for composite keys. Default: _id for BSON, id for JSON. |
The output format is one extended JSON ID per line, which can be fed directly into verify-ids --id-file or testsync --id-file.
verify-ids
The verify-ids command fetches specific documents by ID from both the source and destination, optionally transforms the source documents, and compares them. It reports whether each document matches. Both connectors must support GetByIds.
To verify with a transformation applied to the source data before comparison:
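A sketch of such an invocation, assuming the transformer URL follows the two connectors (as the --transform flag describes) and that the ID file is mounted into the container the same way transform.yaml is mounted in the verify example. The /ids.txt path, the placeholder variables, and the run dry-run helper are illustrative only.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with real connection strings and namespace.
SOURCE='<SOURCE>'
DESTINATION='<DESTINATION>'
NAMESPACE='<NAMESPACE>'

# Dry run by default: print the command; set DRY_RUN=0 to execute it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Compare the listed IDs, transforming source documents before comparison.
verify_ids_transformed() {
  run docker run -e DSYNCT_MODE=simple \
    -v "./transform.yaml:/transform.yaml" \
    -v "./ids.txt:/ids.txt" \
    markadiom/dsynct verify-ids \
    --namespace "$NAMESPACE" \
    --id-file /ids.txt \
    --transform \
    "$SOURCE" "$DESTINATION" dsync-transform://transform.yaml
}

verify_ids_transformed
```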
| Flag | Required | Description |
| --- | --- | --- |
| --namespace | Yes | The source namespace. |
| --dst-namespace | No | The destination namespace. Defaults to the source namespace or the mapped namespace. |
| --id | No | Document ID (string). Can be specified multiple times. For composite keys, use --id-size. |
| --jsonext-id | No | Document ID in extended JSON format. |
| --id-file | No | Path to a file containing extended JSON IDs, one per line. Compatible with sample-ids output. |
| --id-size | No | Number of --id entries that form a single composite key. Default: 1. |
| --transform | No | Set if a transformer is provided after the two connectors. |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Destination data type. Inferred if not set. |
| --namespace-mapping | No | Namespace mapping from source to destination. |
| --mapping-delimiter | No | Delimiter for namespace mappings. Default: :. |
At least one of --id, --jsonext-id, or --id-file must be provided.
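As an illustration of --id together with --id-size, the sketch below passes four --id entries with --id-size 2. Under the assumption that consecutive entries are grouped in the order given, this would describe two composite keys: (acme, 1001) and (acme, 1002). The ID values, the placeholder variables, and the run dry-run helper are hypothetical.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with real connection strings and namespace.
SOURCE='<SOURCE>'
DESTINATION='<DESTINATION>'
NAMESPACE='<NAMESPACE>'

# Dry run by default: print the command; set DRY_RUN=0 to execute it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Two composite keys of two parts each (hypothetical values).
verify_composite_ids() {
  run docker run -e DSYNCT_MODE=simple \
    markadiom/dsynct verify-ids \
    --namespace "$NAMESPACE" \
    --id-size 2 \
    --id acme --id 1001 \
    --id acme --id 1002 \
    "$SOURCE" "$DESTINATION"
}

verify_composite_ids
```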
Typical Workflow
Use sample-ids to collect IDs, then verify-ids to spot-check them:
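The two steps might look like the sketch below. It assumes the current directory is mounted at /work (a hypothetical mount point) so that the file written by --output is visible to the second command. The placeholder variables and the run dry-run helper are illustrative only.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with real connection strings and namespace.
SOURCE='<SOURCE>'
DESTINATION='<DESTINATION>'
NAMESPACE='<NAMESPACE>'

# Dry run by default: print each command; set DRY_RUN=0 to execute it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Step 1: sample 100 IDs from the source into ids.txt in the mounted directory.
sample_ids() {
  run docker run -e DSYNCT_MODE=simple \
    -v "$PWD:/work" \
    markadiom/dsynct sample-ids \
    --namespace "$NAMESPACE" \
    --count 100 \
    --output /work/ids.txt \
    "$SOURCE"
}

# Step 2: spot-check exactly those IDs on both source and destination.
verify_sampled_ids() {
  run docker run -e DSYNCT_MODE=simple \
    -v "$PWD:/work" \
    markadiom/dsynct verify-ids \
    --namespace "$NAMESPACE" \
    --id-file /work/ids.txt \
    "$SOURCE" "$DESTINATION"
}

sample_ids && verify_sampled_ids
```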
For a full verification of all data, use the verify command instead, as described in the verify section above.