Verification

Dsynct provides several commands for verifying that data matches between source and destination. All verification commands run in simple mode (DSYNCT_MODE=simple). These features are currently experimental.

verify

The verify command performs a full verification: it reads all data from both the source and destination connectors in parallel and compares the results. It covers both the initial sync and the ongoing change stream; either phase can be skipped with --skip-initial-sync or --skip-change-stream.

docker run -e 'DSYNCT_MODE=simple' \
markadiom/dsynct verify \
--namespace <NAMESPACE> \
<SOURCE> <DESTINATION>

With transformations (source-side and/or destination-side):

docker run -e 'DSYNCT_MODE=simple' \
-v "./transform.yaml:/transform.yaml" \
markadiom/dsynct verify \
--namespace <NAMESPACE> \
--src-transform \
<SOURCE> <DESTINATION> dsync-transform://transform.yaml

| Flag | Required | Description |
| --- | --- | --- |
| --namespace | No | Source namespace(s). Can be specified multiple times. |
| --dst-namespace | No | Destination namespace(s). Defaults to source namespaces (or mapped namespaces). |
| --namespace-mapping | No | Namespace mapping from source to destination. |
| --parallelism | No | Number of parallel workers. Default: 1. |
| --src-transform | No | Set if a source-side transformer is provided after the two connectors. |
| --dst-transform | No | Set if a destination-side transformer is provided (after the source transformer if present). |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Destination data type. Inferred if not set. |
| --transform-data-type | No | Intermediate comparison data type. Default: DATA_TYPE_MONGO_BSON. |
| --skip-initial-sync | No | Skip initial sync verification. |
| --skip-change-stream | No | Skip change stream verification. |
| --latency | No | Only compare documents that have not been updated for this duration during change stream mode. Default: 20s. |
| --report-interval | No | How often to print progress reports. Default: 1s. |
| --report-limit | No | Maximum number of mismatches to report per interval. Default: 5. |
| --report-all | No | Report all mismatches instead of limiting. |
| --projection | No | JSON describing which fields to include in comparisons (e.g. {"field": {"inner_field": true}}). |
| --id-key | No | Field name(s) that make up the document ID. Can be specified multiple times for composite keys. Default: _id. |
| --partition | No | Partition number (0-indexed) for distributed verification. Default: 0. |
| --total-partitions | No | Total number of partitions for distributed verification. Default: 1. |
| --mapping-delimiter | No | Delimiter for namespace mappings. Default: :. |
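
The --partition and --total-partitions flags suggest running one verify process per partition. The sketch below only prints the commands for review. It assumes (this is not stated above) that each process checks a disjoint slice of the data and that partitions 0 through N-1 together cover everything. The function name verify_partition_cmds is hypothetical.

```shell
# Print one verify command per partition (nothing is executed).
# $1 = total number of partitions; remaining arguments are passed through.
verify_partition_cmds() {
  total="$1"; shift
  p=0
  while [ "$p" -lt "$total" ]; do
    # Partition numbers are 0-indexed, so they run from 0 to total-1.
    echo "docker run -e DSYNCT_MODE=simple markadiom/dsynct verify" \
         "--partition $p --total-partitions $total $*"
    p=$((p + 1))
  done
}

verify_partition_cmds 4 --namespace '<NAMESPACE>' '<SOURCE>' '<DESTINATION>'
```

Each printed line could then be launched on a separate worker, with the placeholders replaced by real values.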

sample-ids

The sample-ids command samples document IDs from a source namespace using reservoir sampling. The output can be written to a file for later use with verify-ids --id-file or testsync --id-file.

To sample IDs after a transformation (so that the IDs reflect the transformed data), set --transform and pass the transformer after the source connector.
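
A minimal sketch of such an invocation, modeled on the verify example earlier on this page. The positional layout (source connector followed by the transformer) is inferred from the --transform flag description, not confirmed. The function name sample_ids_transformed and the DRY_RUN switch are hypothetical helpers added for illustration.

```shell
# Dry run by default: the command is printed, not executed.
# Run with DRY_RUN= (set but empty) to actually invoke Docker.
sample_ids_transformed() {
  # $1 = source namespace, $2 = source connector
  ${DRY_RUN-echo} docker run -e 'DSYNCT_MODE=simple' \
    -v "./transform.yaml:/transform.yaml" \
    markadiom/dsynct sample-ids \
    --namespace "$1" \
    --count 100 \
    --transform \
    "$2" dsync-transform://transform.yaml
}

sample_ids_transformed '<NAMESPACE>' '<SOURCE>'
```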

| Flag | Required | Description |
| --- | --- | --- |
| --namespace | Yes | The source namespace to sample from. |
| --count | No | Number of IDs to sample. Default: 100. |
| --output | No | Output file path. Defaults to stdout. |
| --max-iter-per-partition | No | Maximum number of ListData iterations per partition. 0 for unlimited. |
| --transform | No | Set if a transformer is provided after the source connector. |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Data type after transform. Inferred if not set. |
| --id-key | No | Field name(s) that make up the document ID. Can be specified multiple times for composite keys. Default: _id for BSON, id for JSON. |

The output format is one extended JSON ID per line, which can be fed directly into verify-ids --id-file or testsync --id-file.
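
sample-ids is described above as using reservoir sampling. For readers unfamiliar with the technique, here is a minimal sketch of the classic form (Algorithm R) in shell and awk. It illustrates the idea only and is not Dsynct's implementation. The function name reservoir_sample and its optional seed argument are hypothetical.

```shell
# reservoir_sample K [SEED]: read lines on stdin and print K of them,
# each input line being equally likely to be kept, in a single pass.
reservoir_sample() {
  awk -v k="$1" -v seed="${2:-$$}" '
    BEGIN { srand(seed) }
    # Fill the reservoir with the first k lines.
    NR <= k { r[NR] = $0; next }
    {
      # Pick a uniform integer j in 1..NR; the new line replaces
      # slot j only when j falls inside the reservoir (chance k/NR).
      j = int(rand() * NR) + 1
      if (j <= k) r[j] = $0
    }
    END {
      n = (NR < k) ? NR : k
      for (i = 1; i <= n; i++) print r[i]
    }
  '
}

# Sample 5 of 1000 candidate IDs.
seq 1 1000 | reservoir_sample 5
```

The appeal of this approach is that memory stays proportional to the sample size rather than to the size of the namespace, and the total number of documents does not need to be known in advance.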

verify-ids

The verify-ids command fetches specific documents by ID from both the source and destination, optionally transforms the source documents, and compares them. It reports whether each document matches. Both connectors must support GetByIds.

To verify with a transformation applied to the source data before comparison, set --transform and pass the transformer after the two connectors.
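
A minimal sketch of such an invocation, modeled on the verify example earlier on this page and on the flags listed for verify-ids. The exact argument order is an assumption based on the --transform flag description. The function name verify_ids_transformed and the DRY_RUN switch are hypothetical helpers added for illustration.

```shell
# Dry run by default: the command is printed, not executed.
# Run with DRY_RUN= (set but empty) to actually invoke Docker.
verify_ids_transformed() {
  # $1 = source namespace, $2 = document ID, $3 = source, $4 = destination
  ${DRY_RUN-echo} docker run -e 'DSYNCT_MODE=simple' \
    -v "./transform.yaml:/transform.yaml" \
    markadiom/dsynct verify-ids \
    --namespace "$1" \
    --id "$2" \
    --transform \
    "$3" "$4" dsync-transform://transform.yaml
}

verify_ids_transformed '<NAMESPACE>' '<ID>' '<SOURCE>' '<DESTINATION>'
```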

| Flag | Required | Description |
| --- | --- | --- |
| --namespace | Yes | The source namespace. |
| --dst-namespace | No | The destination namespace. Defaults to the source namespace or the mapped namespace. |
| --id | No | Document ID (string). Can be specified multiple times. For composite keys, use --id-size. |
| --jsonext-id | No | Document ID in extended JSON format. |
| --id-file | No | Path to a file containing extended JSON IDs, one per line. Compatible with sample-ids output. |
| --id-size | No | Number of --id entries that form a single composite key. Default: 1. |
| --transform | No | Set if a transformer is provided after the two connectors. |
| --src-data-type | No | Source data type. Inferred if not set. |
| --dst-data-type | No | Destination data type. Inferred if not set. |
| --namespace-mapping | No | Namespace mapping from source to destination. |
| --mapping-delimiter | No | Delimiter for namespace mappings. Default: :. |

At least one of --id, --jsonext-id, or --id-file must be provided.
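
As an illustration of --id-size, the sketch below passes four --id values with --id-size 2. Presumably these are read as two composite keys, (tenant-1, user-7) and (tenant-2, user-9), taken in the order given; that grouping is an assumption drawn from the flag description. The ID values, the function name verify_composite_ids, and the DRY_RUN switch are hypothetical.

```shell
# Dry run by default: the command is printed, not executed.
# Run with DRY_RUN= (set but empty) to actually invoke Docker.
verify_composite_ids() {
  # $1 = source namespace, $2 = source, $3 = destination
  ${DRY_RUN-echo} docker run -e 'DSYNCT_MODE=simple' \
    markadiom/dsynct verify-ids \
    --namespace "$1" \
    --id-size 2 \
    --id tenant-1 --id user-7 \
    --id tenant-2 --id user-9 \
    "$2" "$3"
}

verify_composite_ids '<NAMESPACE>' '<SOURCE>' '<DESTINATION>'
```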

Typical Workflow

Use sample-ids to collect IDs and write them to a file, then pass that file to verify-ids with --id-file to spot-check them.
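
The two steps can be sketched as a small script. It assumes the container can write to a bind-mounted directory; the /data mount point, the ids.txt file name, the function name spot_check, and the DRY_RUN switch are all hypothetical, and the positional connector arguments follow the pattern of the verify example.

```shell
# Dry run by default: the commands are printed, not executed.
# Run with DRY_RUN= (set but empty) to actually invoke Docker.
spot_check() {
  # $1 = source namespace, $2 = source, $3 = destination

  # Step 1: sample 100 IDs and write them to ids.txt in the mounted directory.
  ${DRY_RUN-echo} docker run -e 'DSYNCT_MODE=simple' \
    -v "$PWD:/data" \
    markadiom/dsynct sample-ids \
    --namespace "$1" --count 100 --output /data/ids.txt \
    "$2" || return 1

  # Step 2: fetch exactly those documents from both sides and compare them.
  ${DRY_RUN-echo} docker run -e 'DSYNCT_MODE=simple' \
    -v "$PWD:/data" \
    markadiom/dsynct verify-ids \
    --namespace "$1" --id-file /data/ids.txt \
    "$2" "$3"
}

spot_check '<NAMESPACE>' '<SOURCE>' '<DESTINATION>'
```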

For a full verification of all data, use the verify command instead, as described in the verify section above.
