-
Notifications
You must be signed in to change notification settings - Fork 452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
♻️ Remove usage of db.getCommittedOpVersion()
#657
Conversation
The `DB.getCommitedOpVersion()` function is only used [in one place][1]: in a corner case that differentiates between: - a create op that conflicts with another create op - a create op that has been re-submitted because the connection was closed before the ack was received The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed. At the moment, this differentiation is made using the special `DB.getCommittedOpVersion()` function, which - given an op with a `src` and `seq` combination - will return `null` if the op hasn't been committed, or its version number if it has. If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it. This approach is problematic, because: - it [needs a whole op index][2] just to handle this niche corner case - the default behaviour [fetches **all** ops][3] unless the driver overrides this behaviour This change proposes that we actually don't need this function **at all**, and implements an alternative approach to differentiate between the above cases. Instead of having to fetch all ops, we just fetch a *single* op, by version number, which [is already indexed][4] anyway. We then check to see if this retrieved op is our creation op. You can't do this in the general case, because an op's version number may be changed between submit and commit because of conflicts with remote ops, which cause it to be transformed up, and its version number incremented. **However**, the create op is a **special case**, which [**cannot** be transformed against **any** other op][5], which means that the version number it's submitted with **must** be the version number it's committed with. We can therefore use this special property of the create op to safely retrieve itself from the data store by version if it has been committed. [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L112 [2]: share/sharedb-mongo#94 [3]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/db/index.js#L69 [4]: https://github.com/share/sharedb-mongo/blob/7e88b0a23e968e3672ee151bf6abaf3bfdf62484/index.js#L401 [5]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/test/ot.js#L152-L162
@nateps I know you're busy, but could you please sense-check this change and the assumptions that underpin it? It would be good to know if I'm missing something here? |
NB: this actually increases coverage in |
- Fixes #94 - Follows on from share/sharedb#657 This change is to work alongside upstream work to remove ShareDB's usage of `getCommittedOpVersion()`. In `sharedb-mongo` this function requires an entire extra op index, just to handle a corner case where two `create` ops are submitted at the same time, which should happen relatively infrequently. This change allows consumers to opt out of individual indexes. For example, if we're not using `getCommittedOpVersion()`, consumers can opt out of the automatic `src`/`seq`/`v` index creation with: ```js new ShareDbMongo( mongoUrl, { disableIndexCreation: { src_seq_v: true, }, }, ); ``` Note that if this index already exists, consumers will need to manually remove it. Previous behaviour will still work, so setting: ```js new ShareDbMongo( mongoUrl, { disableIndexCreation: true, }, ); ``` disables **all** index creation.
Okay as discussed, I'm an idiot, and the Will instead try a new strategy: add |
We have [logic][1] that tries to differentiate between: - a create op that is incorrectly trying to create a `Doc` that already exists - a create op that has already been committed, and has been resubmitted (possibly because the connection closed before the ack was received) The latter case is [not currently covered by tests][2]. This change adds a test to cover this branch, and a second similar test which explains why #657 was not a viable change (and correctly fails for that case). [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L105-L121 [2]: https://coveralls.io/builds/67552613/source?filename=lib%2Fsubmit-request.js#L117
Supersedes #657 The `DB.getCommitedOpVersion()` function is only used [in one place][1]: in a corner case that differentiates between two "blind", versionless (created without first fetching) create op conflict cases: - a versionless create op that conflicts with another create op - a versionless create op that has been re-submitted because the connection was closed before the ack was received The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed. At the moment, this differentiation is made using the special `DB.getCommittedOpVersion()` function, which - given an op with a `src` and `seq` combination - will return `null` if the op hasn't been committed, or its version number if it has. If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it. This approach is problematic, because: - it [needs a whole op index][2] just to handle this niche corner case - the default behaviour [fetches **all** ops][3] unless the driver overrides this behaviour This change proposes that we actually don't need this function in most cases, and implements an alternative approach to differentiate between the above cases. When creating an snapshot, we store the `src`, `seq` and `v` of the op that was used to create it. If a conflicting `create` is received, we can then compare directly with this metadata, without needing to fetch anything extra from the database. This should work in the majority of cases, but won't work if the metadata is missing, which could happen if: - the snapshot is "old": it was created before this change - the driver doesn't support metadata (eg [Postgres][4]) In the case where metadata is unavailable, we fall back to the existing method, using `getCommittedOpVersion()`. However, if drivers support metadata, this should happen sufficiently infrequently that consumers could potentially remove the `src`/`seq`/`v` index with no noticeable performance penalty. [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L112 [2]: share/sharedb-mongo#94 [3]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/db/index.js#L69 [4]: https://github.com/share/sharedb-postgres/blob/499702acd478645bcc249fa50ba6fc066d257d04/index.js#L140
Supersedes #657 The `DB.getCommitedOpVersion()` function is only used [in one place][1]: in a corner case that differentiates between two "blind", versionless (created without first fetching) create op conflict cases: - a versionless create op that conflicts with another create op - a versionless create op that has been re-submitted because the connection was closed before the ack was received The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed. At the moment, this differentiation is made using the special `DB.getCommittedOpVersion()` function, which - given an op with a `src` and `seq` combination - will return `null` if the op hasn't been committed, or its version number if it has. If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it. This approach is problematic, because: - it [needs a whole op index][2] just to handle this niche corner case - the default behaviour [fetches **all** ops][3] unless the driver overrides this behaviour This change proposes that we actually don't need this function in most cases, and implements an alternative approach to differentiate between the above cases. When creating an snapshot, we store the `src`, `seq` and `v` of the op that was used to create it. If a conflicting `create` is received, we can then compare directly with this metadata, without needing to fetch anything extra from the database. This should work in the majority of cases, but won't work if the metadata is missing, which could happen if: - the snapshot is "old": it was created before this change - the driver doesn't support metadata (eg [Postgres][4]) In the case where metadata is unavailable, we fall back to the existing method, using `getCommittedOpVersion()`. However, if drivers support metadata, this should happen sufficiently infrequently that consumers could potentially remove the `src`/`seq`/`v` index with no noticeable performance penalty. [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L112 [2]: share/sharedb-mongo#94 [3]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/db/index.js#L69 [4]: https://github.com/share/sharedb-postgres/blob/499702acd478645bcc249fa50ba6fc066d257d04/index.js#L140
Supersedes #657 The `DB.getCommitedOpVersion()` function is only used [in one place][1]: in a corner case that differentiates between two "blind", versionless (created without first fetching) create op conflict cases: - a versionless create op that conflicts with another create op - a versionless create op that has been re-submitted because the connection was closed before the ack was received The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed. At the moment, this differentiation is made using the special `DB.getCommittedOpVersion()` function, which - given an op with a `src` and `seq` combination - will return `null` if the op hasn't been committed, or its version number if it has. If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it. This approach is problematic, because: - it [needs a whole op index][2] just to handle this niche corner case - the default behaviour [fetches **all** ops][3] unless the driver overrides this behaviour This change proposes that we actually don't need this function in most cases, and implements an alternative approach to differentiate between the above cases. When creating an snapshot, we store the `src`, `seq` and `v` of the op that was used to create it. If a conflicting `create` is received, we can then compare directly with this metadata, without needing to fetch anything extra from the database. This should work in the majority of cases, but won't work if the metadata is missing, which could happen if: - the snapshot is "old": it was created before this change - the driver doesn't support metadata (eg [Postgres][4]) In the case where metadata is unavailable, we fall back to the existing method, using `getCommittedOpVersion()`. However, if drivers support metadata, this should happen sufficiently infrequently that consumers could potentially remove the `src`/`seq`/`v` index with no noticeable performance penalty. [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L112 [2]: share/sharedb-mongo#94 [3]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/db/index.js#L69 [4]: https://github.com/share/sharedb-postgres/blob/499702acd478645bcc249fa50ba6fc066d257d04/index.js#L140
Supersedes #657 The `DB.getCommitedOpVersion()` function is only used [in one place][1]: in a corner case that differentiates between two "blind", versionless (created without first fetching) create op conflict cases: - a versionless create op that conflicts with another create op - a versionless create op that has been re-submitted because the connection was closed before the ack was received The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed. At the moment, this differentiation is made using the special `DB.getCommittedOpVersion()` function, which - given an op with a `src` and `seq` combination - will return `null` if the op hasn't been committed, or its version number if it has. If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it. This approach is problematic, because: - it [needs a whole op index][2] just to handle this niche corner case - the default behaviour [fetches **all** ops][3] unless the driver overrides this behaviour This change proposes that we actually don't need this function in most cases, and implements an alternative approach to differentiate between the above cases. When creating an snapshot, we store the `src`, `seq` and `v` of the op that was used to create it. If a conflicting `create` is received, we can then compare directly with this metadata, without needing to fetch anything extra from the database. This should work in the majority of cases, but won't work if the metadata is missing, which could happen if: - the snapshot is "old": it was created before this change - the driver doesn't support metadata (eg [Postgres][4]) In the case where metadata is unavailable, we fall back to the existing method, using `getCommittedOpVersion()`. However, if drivers support metadata, this should happen sufficiently infrequently that consumers could potentially remove the `src`/`seq`/`v` index with no noticeable performance penalty. [1]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/submit-request.js#L112 [2]: share/sharedb-mongo#94 [3]: https://github.com/share/sharedb/blob/7b20313ded4c302b9416cc6c7821694a7fa491b8/lib/db/index.js#L69 [4]: https://github.com/share/sharedb-postgres/blob/499702acd478645bcc249fa50ba6fc066d257d04/index.js#L140
The
DB.getCommitedOpVersion()
function is only used in one place: in a corner case that differentiates between:The first of these cases should result in an error, and the second is non-fatal and the error can be swallowed.
At the moment, this differentiation is made using the special
DB.getCommittedOpVersion()
function, which - given an op with asrc
andseq
combination - will returnnull
if the op hasn't been committed, or its version number if it has.If the op has a committed version number, we know that the submit is a duplicate, and we can safely ignore it.
This approach is problematic, because:
This change proposes that we actually don't need this function at all, and implements an alternative approach to differentiate between the above cases.
Instead of having to fetch all ops, we just fetch a single op, by version number, which is already indexed anyway. We then check to see if this retrieved op is our creation op.
You can't do this in the general case, because an op's version number may be changed between submit and commit because of conflicts with remote ops, which cause it to be transformed up, and its version number incremented.
However, the create op is a special case, which cannot be transformed against any other op, which means that the version number it's submitted with must be the version number it's committed with.
We can therefore use this special property of the create op to safely retrieve itself from the data store by version if it has been committed.