# Observability

Dsynct exports logs and metrics via OpenTelemetry (OTel). This allows you to monitor migration progress, diagnose performance bottlenecks, and track change stream lag using tools like SigNoz, Grafana, or any OTel-compatible backend.

## Configuration

Enable OpenTelemetry by passing the `--otel` flag to the `app` subcommand and setting the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the address of your OTel gRPC collector:

```bash
docker run \
-e 'OTEL_EXPORTER_OTLP_ENDPOINT=http://<COLLECTOR_HOSTNAME>:4317' \
markadiom/dsynct app \
--otel \
<OTHER_COMMANDS_WITH_THEIR_OPTIONS>
```

In simple mode (no Temporal), the options go before `sync`:

```bash
docker run \
-e 'DSYNCT_MODE=simple' \
-e 'OTEL_EXPORTER_OTLP_ENDPOINT=http://<COLLECTOR_HOSTNAME>:4317' \
markadiom/dsynct \
--otel \
sync \
<OPTIONAL PARAMETERS> \
<SOURCE> <DESTINATION>
```

### OTel Flags

| Flag                     | Description                                             |
| ------------------------ | ------------------------------------------------------- |
| `--otel`                 | Enable exporting logs and metrics to an OTel collector. |
| `--otel-metric-interval` | Interval between metric pushes. Default: `10s`.         |
| `--otel-service-name`    | Service name reported to OTel. Default: `dsynct`.       |
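
As a sketch of how these flags might be combined in `app` mode, the command below sets a custom service name and a longer push interval. The values `dsynct-prod` and `30s` are illustrative assumptions, as is placing the extra flags directly after `--otel`; the duration format is inferred from the `10s` default.

```bash
# Illustrative values (assumptions): service name "dsynct-prod", interval "30s".
# Flag placement next to --otel is assumed, not confirmed by this page.
docker run \
-e 'OTEL_EXPORTER_OTLP_ENDPOINT=http://<COLLECTOR_HOSTNAME>:4317' \
markadiom/dsynct app \
--otel \
--otel-service-name dsynct-prod \
--otel-metric-interval 30s \
<OTHER_COMMANDS_WITH_THEIR_OPTIONS>
```

A distinct service name can make it easier to tell several Dsynct deployments apart in a shared backend.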

### Logs

When `--otel` is enabled, structured logs (JSON) are emitted both to stderr and to the OTel log collector. The log level can be controlled with `--log-level` (default: `INFO`).

## Metrics

All metrics are emitted under the `dsync-flow` OTel meter. Metrics are labeled with attributes such as `namespace`, `success`, and `worker` to allow filtering and grouping.

### Common Attributes

| Attribute   | Description                                                                                                 |
| ----------- | ----------------------------------------------------------------------------------------------------------- |
| `namespace` | The namespace (collection/table) being processed.                                                           |
| `success`   | `true` if the operation succeeded, `false` if it failed.                                                    |
| `worker`    | Identifies the worker type (e.g. `initial-sync`, `stream-changes`, `writer-0`, `transform-0`, `updates-0`). |
| `index`     | Stream partition index (for change stream gauges).                                                          |

### Initial Sync Metrics

| Metric                 | Type      | Unit      | Description                                                                                            |
| ---------------------- | --------- | --------- | ------------------------------------------------------------------------------------------------------ |
| `dsynct.read`          | Counter   | documents | Total number of documents read from the source.                                                        |
| `dsynct.written`       | Counter   | documents | Total number of documents written to the destination.                                                  |
| `dsynct.list_data`     | Histogram | ms        | Latency of each `ListData` call to the source connector.                                               |
| `dsynct.write_data`    | Histogram | ms        | Latency of each `WriteData` call to the destination connector.                                         |
| `dsynct.get_transform` | Histogram | ms        | Latency of each `GetTransform` call to the transformer. Only emitted when a transformer is configured. |

### Change Stream (CDC) Metrics

| Metric                         | Type      | Unit       | Description                                                                                                  |
| ------------------------------ | --------- | ---------- | ------------------------------------------------------------------------------------------------------------ |
| `dsynct.read`                  | Counter   | events     | Total number of change events read from the source. Shares the same counter as initial sync reads.           |
| `dsynct.written`               | Counter   | events     | Total number of change events written to the destination.                                                    |
| `dsynct.write_updates`         | Histogram | ms         | Latency of each `WriteUpdates` call to the destination connector.                                            |
| `dsynct.get_transform`         | Histogram | ms         | Latency of each `GetTransform` call during change stream processing.                                         |
| `dsynct.stream_read_gauge`     | Gauge     | events     | Running total of change events read for a given stream partition.                                            |
| `dsynct.stream_written_gauge`  | Gauge     | events     | Running total of change events written for a given stream partition.                                         |
| `dsynct.read_ahead_gauge`      | Gauge     |            | The LSN (log sequence number) value reported by the source. Useful for tracking how far ahead the source is. |
| `dsynct.last_event_time`       | Gauge     | ms (epoch) | Timestamp of the last change event processed, in milliseconds since epoch.                                   |
| `dsynct.since_last_event_time` | Gauge     | ms         | Time elapsed since the last change event was processed. Useful for detecting change stream lag.              |
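
The two time gauges appear to be related: `dsynct.since_last_event_time` should roughly equal the current time minus `dsynct.last_event_time`. The following minimal sketch shows that arithmetic, assuming this relationship holds; `ms_since_last_event` is a hypothetical helper, not part of Dsynct, and the timestamps are made-up sample values.

```bash
#!/usr/bin/env bash
# Hypothetical helper: milliseconds elapsed since a last-event timestamp.
# $1 = last event time in ms since epoch (the unit of dsynct.last_event_time)
# $2 = current time in ms since epoch (optional; defaults to the system clock)
ms_since_last_event() {
  local last_event_ms="$1"
  local now_ms="${2:-$(( $(date +%s) * 1000 ))}"
  echo $(( now_ms - last_event_ms ))
}

# Sample values (assumed): the last event was processed 4.5 seconds ago.
ms_since_last_event 1700000000000 1700000004500   # prints 4500
```

This can be handy for cross-checking a dashboard panel against a raw `dsynct.last_event_time` reading.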

## Dashboards

Pre-configured SigNoz dashboards are available in the [public repository](https://github.com/adiom-data/public/tree/main/kubernetes/system/signoz_dashboards). You can import them by following the [SigNoz import instructions](https://signoz.io/docs/dashboards/import-dashboard/).

### Key Things to Monitor

* **Throughput**: Track `dsynct.read` and `dsynct.written` counters to monitor documents/events per second.
* **Latency**: Use the `dsynct.list_data`, `dsynct.write_data`, and `dsynct.write_updates` histograms to identify slow operations.
* **Change stream lag**: Monitor `dsynct.since_last_event_time` to detect whether the destination is falling behind the source. A steadily growing value indicates the CDC pipeline is not keeping up.
* **Read-ahead**: The difference between `dsynct.stream_read_gauge` and `dsynct.stream_written_gauge` shows how many events have been read but not yet written, indicating backpressure.
* **Errors**: Filter by `success=false` to isolate failed operations.
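
The throughput and read-ahead checks above come down to simple arithmetic on sampled values. The sketch below illustrates it with two hypothetical helpers, `rate_per_sec` and `backlog`; it assumes you have already fetched the metric values from your backend, and the numbers are made-up samples. It does not handle counter resets.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; inputs are metric values already queried from your backend.

# Average items per second between two samples of a counter
# such as dsynct.read or dsynct.written.
# $1 = earlier value, $2 = later value, $3 = seconds between the samples
rate_per_sec() {
  awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (b - a) / s }'
}

# Events read but not yet written for one stream partition:
# dsynct.stream_read_gauge minus dsynct.stream_written_gauge.
# $1 = read gauge value, $2 = written gauge value
backlog() {
  echo $(( $1 - $2 ))
}

rate_per_sec 12000 18000 60   # prints 100.0
backlog 5300 5000             # prints 300
```

Most OTel backends can compute the same quantities directly in a dashboard query; the shell version is only meant to make the calculation explicit.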

