
[dvc][doc] Create MVP for DaVinciRecordTransformer #1087

Open · wants to merge 66 commits into base: main
Conversation

@kvargha kvargha commented Jul 30, 2024

Summary

  1. Added a storeRecordsInDaVinci boolean to the constructor to detect whether a user wants to store their transformed records in DaVinci or in another storage system of their choice.
  2. Created DaVinciRecordTransformerConfig, used by DaVinciConfig, so we have access to the output value class and schema without needing to create a dummy record transformer outside of StoreIngestionTask.
  3. Added a recovery method to bootstrap from the VT or from local state on startup. If blob transfer is enabled, an error is thrown, since it's not currently supported with the record transformer.
  4. Created AbstractStorageIterator and a RocksDB implementation to be used by the recovery method.
  5. Added logic to detect on startup whether the transformer logic has changed, using a hash of the class bytecode. If the transformer logic has changed, the local state is stale and we need to bootstrap directly from the VT.
  6. Added logic to create a DVRT-specific Avro deserializer.
  7. Split put into transform and processPut, so that when we bootstrap from local state we don't transform the data again, since it has already been transformed. Also created a wrapper method, transformAndProcessPut, for when both need to be called in StoreIngestionTask.
  8. Created DaVinciRecordTransformerResult to handle cases where the user doesn't transform the value or wants to skip it.
  9. Added delete logic in StoreIngestionTask.
  10. Throw an exception when a user tries to use Ingestion Isolation with DVRT. Ingestion Isolation can cause issues if a user's transformer is stateful, and it causes the storage engine to be deleted every time the client comes online.
  11. Updated the record transformer doc with the new logic.
  12. Made the Venice repo a link in the workspace setup doc.
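The transform/processPut split described in items 7 and 8 can be sketched roughly as below. This is a simplified, hypothetical model of the shape of the API (names like Result, TransformerResult, and UppercaseTransformer are illustrative only), not the actual Venice classes:

```java
// Hypothetical sketch of the transform/processPut split; the real Venice
// DaVinciRecordTransformer API differs in names, generics, and Lazy wrappers.
enum Result { TRANSFORMED, UNCHANGED, SKIP }

class TransformerResult<V> {
    final Result result;
    final V value;
    TransformerResult(Result result, V value) {
        this.result = result;
        this.value = value;
    }
}

class UppercaseTransformer {
    final boolean storeRecordsInDaVinci;

    UppercaseTransformer(boolean storeRecordsInDaVinci) {
        this.storeRecordsInDaVinci = storeRecordsInDaVinci;
    }

    // transform: pure function from the input value to a result wrapper.
    TransformerResult<String> transform(Integer key, String value) {
        return new TransformerResult<>(Result.TRANSFORMED, value.toUpperCase());
    }

    // processPut: side-effecting step (e.g. writing to local storage).
    // During bootstrap from local state only this is called, so already
    // transformed data is not transformed a second time.
    void processPut(Integer key, String transformedValue) {
        // persist transformedValue if storeRecordsInDaVinci is true ...
    }

    // Wrapper used on the normal ingestion path, where both steps apply.
    TransformerResult<String> transformAndProcessPut(Integer key, String value) {
        TransformerResult<String> r = transform(key, value);
        if (r.result != Result.SKIP) {
            processPut(key, r.value);
        }
        return r;
    }
}

public class RecordTransformerSketch {
    public static void main(String[] args) {
        UppercaseTransformer t = new UppercaseTransformer(true);
        TransformerResult<String> r = t.transformAndProcessPut(1, "hello");
        System.out.println(r.result + " " + r.value); // prints TRANSFORMED HELLO
    }
}
```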

How was this PR tested?

Added more unit and integration tests.

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@FelixGV commented Jul 30, 2024

The thing about doing a hash of the bytecode sounds pretty interesting, but I wonder how reliable it is... I suspect that it probably has extremely low chances of false negative (the code changed but we fail to detect it) but that it might have a somewhat large probability of false positives (the code did not change but we think it did). The latter could be due to differences in JDK versions or flags used for compilation... Maybe it's not a big deal, but IDK...

Also, coming back to false negatives, those may still exist if the scope of the checksumming is not large enough, e.g. the class we checksum did not change but it calls into another class which did change.

Overall, I am not sure how I feel about this mechanism, and I think we may want to have a config to opt out of this check. Possibly even: this should be opt in to begin with (i.e. off by default) until we can gain a better understanding of its limitations.

@kvargha commented Jul 30, 2024

> The thing about doing a hash of the bytecode sounds pretty interesting, but I wonder how reliable it is... I suspect that it probably has extremely low chances of false negative (the code changed but we fail to detect it) but that it might have a somewhat large probability of false positives (the code did not change but we think it did). The latter could be due to differences in JDK versions or flags used for compilation... Maybe it's not a big deal, but IDK...
>
> Also, coming back to false negatives, those may still exist if the scope of the checksumming is not large enough, e.g. the class we checksum did not change but it calls into another class which did change.
>
> Overall, I am not sure how I feel about this mechanism, and I think we may want to have a config to opt out of this check. Possibly even: this should be opt in to begin with (i.e. off by default) until we can gain a better understanding of its limitations.

From my testing, the hash would change under these circumstances:

  1. Change in JDK version.
  2. Any changes made within the user's implementation (I found this to be very reliable).

Circumstances where it didn't change:

  1. Changes in the version numbers of imported packages used by the user's implementation.
  2. Code changes in functions imported.
  3. Changes in the abstract class.

I think having it off by default and having the user enable it makes sense.

We could have a global variable that's initially set to false, and add another constructor where a user can pass in a boolean indicating whether to enable it. We could call the variable detectChangesInLogic. If it's set to false, then always bootstrap from the VT. Thoughts @ZacAttack?
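For reference, the kind of class-bytecode hashing being discussed can be sketched with the JDK alone: read the class's .class resource from its class loader and digest the bytes. This is an illustrative sketch, not the implementation in this PR, and it carries exactly the caveats raised above (it covers only the one class file, and recompiling under a different JDK can change the hash even when the source didn't change):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ClassHashSketch {
    // Reads a class's bytecode from the classpath and returns an MD5 hex digest.
    static String hashClassBytecode(Class<?> clazz) throws IOException, NoSuchAlgorithmException {
        String resource = clazz.getName().replace('.', '/') + ".class";
        try (InputStream in = clazz.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Bytecode not found for " + clazz.getName());
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            byte[] digest = MessageDigest.getInstance("MD5").digest(buf.toByteArray());
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stable within one build: hashing the same class twice gives the same digest.
        String h1 = hashClassBytecode(ClassHashSketch.class);
        String h2 = hashClassBytecode(ClassHashSketch.class);
        System.out.println(h1.equals(h2)); // prints true
    }
}
```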

kvargha and others added 10 commits October 18, 2024 23:17
* Create MVP for record transformer

* Fix docs and imports

* Don't make serializers and deserializers global

* Dont check for boolean val

* Delete classHash if it exists

* Assert file deletion is true

* Improve code coverage

* Add tests for blocking record transformer

* Created AvroGenericDaVinciClient for record transformer test

* Make sure getRecordTransformer is valid

* Make sure getRecordTransformer isn't null

* Reorganize testRecordTransformerClient and fix key schema

* Fix TestStringRecordTransformer to work with Avro objects and update doc

* Refactor onRecovery and add javadoc for DaVinciRecordTransformer's constructor

* Reset offset if we need to bootstrap from VT

* Delete classHash after running tests

* Throw an error if a user tries to use blob transfer with record transformer

* Make previous public methods private, remove subscribe call inside onRecovery

* Correctly pass DVRT functional interface to initBackend, add todo to store classHash in rocksDB

* Fix spotbugs

* Init DVRT inside SIT, and move DVRT recovery to SIT

* Modify checkout action

* Undo

* Fix compilation

* Cache deserializer/serializer

* Create utility class for record transformer

* Create AbstractStorageIterator and a RocksDB implementation

* Delete classHash file. Compare value classes and don't override value class in AvroGenericDaVinciClient

* Remove compareCacheConfig mock

* Wrap access modifier with doPrivileged

* Fix spotless error

* Remove unused variables

* Added a ToDo to make chunking with record transformer lazy, and make the return type of delete void. Proceed with deletion based off of storeRecordsInDaVinci

* Created a config for record transformer that's passed into the DaVinciConfig to remove the need of instantiating a dummy record transformer, updated docs, and simplified function names

* Add message envelope for DVRT

* Fix spotless issue

* Fix test

* update docs
@kvargha left a comment:

ToDo: Throw error if Ingestion Isolation is turned on with DVRT.

return new DaVinciRecordTransformerResult<>(DaVinciRecordTransformerResult.Result.TRANSFORMED, transformedValue);
}

public void processPut(Lazy<Integer> key, Lazy<String> value) {
A reviewer commented:
random: Shouldn't this have @Override?

}

// Mimic restart
clientWithRecordTransformer.close();
A reviewer commented:
Just to make sure I understand the implication of this test, does this run through on disk state on restart? Or.... does it actually reingest?

@kvargha replied:
The former. I used the debugger to verify that it's actually reading from disk and not re-ingesting.

}

@Test(timeOut = TEST_TIMEOUT, dataProvider = "dv-client-config-provider", dataProviderClass = DataProviderUtils.class)
public void testRecordsInDaVinciDisabledRecordTransformer(DaVinciConfig clientConfig) throws Exception {
A reviewer commented:
I don't understand this test. What exactly is this case??

@kvargha replied:
This tests the case where a user passes false into the constructor for storeRecordsInDaVinci.

That means nothing should be stored in DaVinci.
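In other words, the flag gates the local persistence step. A minimal sketch of that behavior, assuming hypothetical names (GatedStore, processPut) rather than the actual Venice ingestion path:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a storeRecordsInDaVinci flag gating local persistence.
// The real Venice ingestion path is considerably more involved.
class GatedStore {
    private final boolean storeRecordsInDaVinci;
    private final Map<Integer, String> localStorage = new HashMap<>();

    GatedStore(boolean storeRecordsInDaVinci) {
        this.storeRecordsInDaVinci = storeRecordsInDaVinci;
    }

    void processPut(int key, String transformedValue) {
        if (storeRecordsInDaVinci) {
            localStorage.put(key, transformedValue);
        }
        // else: the user's transformer is expected to persist the record
        // in an external store of their choice
    }

    int storedCount() {
        return localStorage.size();
    }
}

public class GatedStoreDemo {
    public static void main(String[] args) {
        GatedStore disabled = new GatedStore(false);
        disabled.processPut(1, "value");
        System.out.println(disabled.storedCount()); // prints 0: nothing stored in DaVinci
    }
}
```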

@@ -657,7 +667,7 @@ protected GenericRecordChunkingAdapter getGenericRecordChunkingAdapter() {
   return GenericRecordChunkingAdapter.INSTANCE;
 }

-  private D2ServiceDiscoveryResponse discoverService() {
+  D2ServiceDiscoveryResponse discoverService() {
A reviewer commented:
protected? Or /* package private */ ?


if (clientConfig.isSpecificClient()) {
if (daVinciConfig.isRecordTransformerEnabled()) {
if (recordTransformerConfig.getOutputValueClass() != clientConfig.getSpecificValueClass()) {
A reviewer commented:
Let's do a test case through these, I think looking back at the integration test cases the transformed type is always the same.
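The comparison under discussion can be sketched as a standalone check. This is a hypothetical simplification (the method name validate and the exception type are illustrative; the actual PR compares recordTransformerConfig.getOutputValueClass() against clientConfig.getSpecificValueClass()):

```java
// Hypothetical sketch of validating that the transformer's declared output
// value class matches the specific client's value class.
public class ValueClassCheckSketch {
    static void validate(Class<?> outputValueClass, Class<?> specificValueClass) {
        if (!outputValueClass.equals(specificValueClass)) {
            throw new IllegalArgumentException(
                "Record transformer output value class " + outputValueClass.getName()
                    + " does not match specific value class " + specificValueClass.getName());
        }
    }

    public static void main(String[] args) {
        validate(String.class, String.class); // same class: passes
        try {
            validate(String.class, Integer.class); // transformed type differs: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("mismatch detected");
        }
    }
}
```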

3 participants