Remove timestamp optimization for full syncs #1907
Merged
Closes #1372
When connectors were first being built, there was an assumption that a distinction between "full" and "incremental" syncs would not be needed. We hoped to only have one sync type, and to have it be capable of doing full table scans on its first run, but only partial scans thereafter.
In reality, this has not worked, and we've begun to introduce new sync types, specifically the Incremental sync. At the same time, we've received more and more reports that this optimization is actually a hindrance to customers working to move into production. While it was intended to save time, it in fact costs more: every code, pipeline, or configuration change requires indexed data to be deleted or modified, or a new index to be started from scratch, before the change takes effect. This friction was not intended or anticipated, and is considered to be a bug.
In the unlikely event that a customer is relying on this optimization, the diff is quite simple and can be added back from a fork as a stop-gap until we are able to deliver Incremental Syncs to all our connectors.
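To illustrate the behavior change, here is a minimal sketch. It is not the actual diff, and it assumes the optimization skipped any document whose timestamp matched the copy already in the index. The function `plan_full_sync`, the `skip_unchanged` flag, and the `_id` / `_timestamp` field names are hypothetical and used only for this example.

```python
from typing import Dict, Iterable, List


def plan_full_sync(
    source_docs: Iterable[dict],
    indexed_timestamps: Dict[str, str],
    skip_unchanged: bool,
) -> List[dict]:
    """Return the documents a full sync would (re)index.

    indexed_timestamps maps document id -> timestamp stored in the index.
    skip_unchanged=True models the old optimization (assumed behavior);
    skip_unchanged=False models a full sync after this change.
    """
    to_index = []
    for doc in source_docs:
        doc_id = doc["_id"]
        unchanged = (
            doc_id in indexed_timestamps
            and indexed_timestamps[doc_id] == doc.get("_timestamp")
        )
        if skip_unchanged and unchanged:
            # Same timestamp as the indexed copy: the document is not sent
            # again, so pipeline or connector code changes never reach it.
            continue
        to_index.append(doc)
    return to_index


source = [
    {"_id": "1", "_timestamp": "2023-01-01T00:00:00Z"},
    {"_id": "2", "_timestamp": "2023-02-01T00:00:00Z"},
]
indexed = {"1": "2023-01-01T00:00:00Z"}

old_plan = plan_full_sync(source, indexed, skip_unchanged=True)
new_plan = plan_full_sync(source, indexed, skip_unchanged=False)
```

In this sketch the old behavior re-indexes only document `2`, while the new behavior re-indexes both documents. That is why full syncs may take longer after this change, and also why a modified ingest pipeline now applies to every document.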
Checklists
Pre-Review Checklist
v7.13.2, v7.14.0, v8.0.0
Related Pull Requests
Release Note
Addressed a bug where full syncs would not fully resync data as expected, which caused friction when upgrading, making custom code changes, or modifying ingest pipelines. The fix may increase full sync times, as more documents may now be downloaded and indexed.
(Note to the docs team: please link this PR in the release note, so that users can easily revert it on a fork if the performance impact for them outweighs the value of fixing the bug.)