Tags: feldera/feldera
[adapters] Delta input: revamp error handling and retry logic.

The connector already had retry logic in some places, but mostly relied on delta-rs for retries. This wasn't always enough: timeouts and expired-token errors still bubbled up.

This commit adds retry loops around all object store accesses. The loops are controlled by the new `max_retries` setting, similar to the output connector. By default, the connector retries forever. While retrying, it sets its health status to UNHEALTHY. If the pipeline is stopped and restarted during a retry, the connector resumes from the last successfully ingested table version. After exhausting its retry attempts, the connector fails permanently with a fatal error, which eliminates the possibility of data loss.

There is an important caveat: because a retry may occur after partial progress (e.g., after partially processing a Delta log entry), the same data may be ingested more than once. This is consistent with the connector’s at-least-once delivery guarantee.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
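The retry behavior described in this commit message can be illustrated with a small sketch. This is not the connector's code: `with_retries`, `Health`, the `backoff` parameter, and the `set_health` callback are hypothetical names chosen for illustration, and the sketch assumes that `max_retries` counts retries after the initial attempt and that `None` stands for "retry forever".

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical health states, mirroring the UNHEALTHY status mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Health {
    Healthy,
    Unhealthy,
}

/// Hypothetical helper: runs `op` until it succeeds or the retry budget is spent.
/// `max_retries = None` means retry forever (the assumed default).
fn with_retries<T, E>(
    max_retries: Option<u32>,
    backoff: Duration,
    mut set_health: impl FnMut(Health),
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut retries: u32 = 0;
    loop {
        match op() {
            Ok(value) => {
                // The operation went through; report the connector as healthy again.
                set_health(Health::Healthy);
                return Ok(value);
            }
            Err(err) => {
                // Budget exhausted: surface the error so the caller can fail fatally
                // instead of skipping data.
                if let Some(max) = max_retries {
                    if retries >= max {
                        return Err(err);
                    }
                }
                retries = retries.saturating_add(1);
                set_health(Health::Unhealthy);
                sleep(backoff);
            }
        }
    }
}
```

Because `op` is re-run from the start after a failure, any side effects it produced before failing are repeated, which is the at-least-once caveat noted above.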
[adapters] Delta output: enable checkpoints.

The Delta output connector did not create periodic checkpoints. While this is problematic in itself, it also meant that the connector became slower over time, due to a delta-rs bug (delta-io/delta-kernel-rs#2103) that causes the `update_incremental` function to scan the entire transaction log on every commit.

This commit:
- Introduces the `checkpoint_interval` option, which tells the connector to configure the checkpoint interval when creating the table.
- Creates a CommitBuilder that is actually set up to create checkpoints.

Without this fix, the time to create a trivial Delta commit increases from 1.5s to 6s after ~1000 commits. With the fix, it remains constant at ~2s.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
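To see why periodic checkpoints keep commit time flat, here is a simplified model, not the connector's or delta-rs's actual logic. It assumes a checkpoint is written at every positive table version that is a multiple of the interval, and counts how many log entries must be replayed to reconstruct the latest table state. The function names `latest_checkpoint` and `commits_to_replay` are hypothetical.

```rust
/// Hypothetical model: the most recent checkpointed version at or below `version`,
/// assuming checkpoints are written at positive multiples of `interval`.
/// An `interval` of 0 stands for "checkpoints disabled".
fn latest_checkpoint(version: u64, interval: u64) -> Option<u64> {
    if interval == 0 {
        return None;
    }
    let cp = (version / interval) * interval;
    if cp == 0 {
        None
    } else {
        Some(cp)
    }
}

/// Number of log entries that must be read to reconstruct the state at `version`.
fn commits_to_replay(version: u64, interval: u64) -> u64 {
    match latest_checkpoint(version, interval) {
        // Only the entries written after the checkpoint need to be replayed.
        Some(cp) => version - cp,
        // No checkpoint: every entry from version 0 through `version` is read.
        None => version + 1,
    }
}
```

In this model, the work without checkpoints grows linearly with the table version, while with an interval of `n` it never exceeds `n - 1` entries, which matches the direction of the timings reported above.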