[WIP] Modular Encryption Support When Reading Parquet Files #480

Open
wants to merge 14 commits into base: master
Conversation

mukunku (Contributor) commented Feb 25, 2024

Summary

I made significant progress on getting footer decryption working for encrypted Parquet files (#191).

I'm opening this work-in-progress pull request with hopes that some other folks can help get this across the finish line.

AES_GCM_V1

Thanks to a test file @pzatschl shared with me, I was able to implement the AES_GCM_V1 encryption algorithm.

link to code
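
For anyone reviewing, here's a simplified sketch of what decrypting a single AES_GCM_V1 module involves. The layout (4-byte little-endian length, 12-byte nonce, ciphertext, 16-byte tag) is my reading of the modular encryption spec, and the helper below is illustrative rather than the exact code in this PR:

using System;
using System.Security.Cryptography;

static class GcmModuleSketch
{
    // Assumed module layout per the modular encryption spec:
    // [4-byte LE length][12-byte nonce][ciphertext][16-byte tag],
    // where "length" covers nonce + ciphertext + tag.
    public static byte[] Decrypt(byte[] module, byte[] key, byte[] aad)
    {
        int length = BitConverter.ToInt32(module, 0);
        byte[] nonce = new byte[12];
        Buffer.BlockCopy(module, 4, nonce, 0, 12);
        int cipherLen = length - 12 - 16;
        byte[] ciphertext = new byte[cipherLen];
        Buffer.BlockCopy(module, 16, ciphertext, 0, cipherLen);
        byte[] tag = new byte[16];
        Buffer.BlockCopy(module, 16 + cipherLen, tag, 0, 16);

        byte[] plaintext = new byte[cipherLen];
        using var gcm = new AesGcm(key, tagSizeInBytes: 16); // .NET 8 overload
        gcm.Decrypt(nonce, ciphertext, tag, plaintext, aad); // throws if the tag doesn't verify
        return plaintext;
    }
}

The aad argument stands in for the spec's per-module AAD (file AAD prefix plus module/ordinal suffix); I'm treating it as an opaque input here.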

AES_GCM_CTR_V1

I also implemented the AES_GCM_CTR_V1 encryption algorithm; however, I don't have any test files to confirm it's working 🙃

link to code
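
Since .NET has no built-in AES-CTR, the keystream has to be generated manually, e.g. with AES-ECB. Below is a rough, unverified sketch of the idea; the 12-byte nonce plus a 4-byte big-endian counter starting at 1 is my reading of the spec, and without a test file I can't confirm any of this against real data:

using System;
using System.Security.Cryptography;

static class CtrModuleSketch
{
    // Assumed module layout for CTR-encrypted pages: [4-byte LE length][12-byte nonce][ciphertext], no tag.
    public static byte[] Decrypt(byte[] module, byte[] key)
    {
        int length = BitConverter.ToInt32(module, 0);
        byte[] nonce = new byte[12];
        Buffer.BlockCopy(module, 4, nonce, 0, 12);
        int cipherLen = length - 12;
        byte[] ciphertext = new byte[cipherLen];
        Buffer.BlockCopy(module, 16, ciphertext, 0, cipherLen);

        // CTR keystream via AES-ECB: counter block = nonce + 4-byte big-endian counter starting at 1.
        using Aes aes = Aes.Create();
        aes.Key = key;
        aes.Mode = CipherMode.ECB;
        aes.Padding = PaddingMode.None;
        using ICryptoTransform enc = aes.CreateEncryptor();

        byte[] counterBlock = new byte[16];
        Buffer.BlockCopy(nonce, 0, counterBlock, 0, 12);
        byte[] keystream = new byte[16];
        byte[] plaintext = new byte[cipherLen];
        uint counter = 1;

        for (int offset = 0; offset < cipherLen; offset += 16)
        {
            counterBlock[12] = (byte)(counter >> 24);
            counterBlock[13] = (byte)(counter >> 16);
            counterBlock[14] = (byte)(counter >> 8);
            counterBlock[15] = (byte)counter;
            enc.TransformBlock(counterBlock, 0, 16, keystream, 0);

            int n = Math.Min(16, cipherLen - offset);
            for (int i = 0; i < n; i++)
                plaintext[offset + i] = (byte)(ciphertext[offset + i] ^ keystream[i]);
            counter++;
        }
        return plaintext;
    }
}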

How to test

Check out the unit test I added, which reads the sample file I mentioned above:
link to code

However, even though I can decrypt the test file successfully, the data itself doesn't seem to be valid, so I had to add this try-catch as a temporary workaround.
link to code
We should remove this once we have a proper test file. (Unfortunately, I don't have any other test files.)

@mukunku mukunku changed the title Modular Encryption Support - AesGcmV1_192bit [#191] Modular Encryption Support When Reading Parquet Files Feb 25, 2024
@mukunku mukunku changed the title Modular Encryption Support When Reading Parquet Files [WIP] Modular Encryption Support When Reading Parquet Files Feb 25, 2024
mukunku (Contributor, Author) commented Mar 3, 2024

I was able to tidy up the PR. However, there is a bug when running dotnet test that is breaking the PR checks. I tracked it down to the following error, although I have no clue why it's happening:

The active test run was aborted. Reason: Test host process crashed : Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.MemoryExtensions.AsSpan[[System.Int32, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Int32[], Int32)
   at Parquet.File.PackedColumn.AllocateOrGetDictionaryIndexes(Int32)
   at Parquet.File.DataColumnReader.ReadColumn(System.Span`1<Byte>, Parquet.Meta.Encoding, Int64, Int32, Parquet.File.PackedColumn)
   at Parquet.File.DataColumnReader+<ReadDataPageV1Async>d__15.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Parquet.File.DataColumnReader+<ReadDataPageV1Async>d__15, Parquet, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]](<ReadDataPageV1Async>d__15 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Parquet.File.DataColumnReader+<ReadDataPageV1Async>d__15, Parquet, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]](<ReadDataPageV1Async>d__15 ByRef)
   at Parquet.File.DataColumnReader.ReadDataPageV1Async(Parquet.Meta.PageHeader, Parquet.File.PackedColumn)
   at Parquet.File.DataColumnReader+<ReadAsync>d__10.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Parquet.File.DataColumnReader+<ReadAsync>d__10, Parquet, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]](<ReadAsync>d__10 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Parquet.File.DataColumnReader+<ReadAsync>d__10, Parquet, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]](<ReadAsync>d__10 ByRef)
   at Parquet.File.DataColumnReader.ReadAsync(System.Threading.CancellationToken)
   at Parquet.ParquetRowGroupReader.ReadColumnAsync(Parquet.Schema.DataField, System.Threading.CancellationToken)
   at Parquet.Test.ParquetReaderOnTestFilesTest+<DecryptFile_UTF8_AesGcmV1_192bit>d__2.MoveNext()
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Parquet.Test.ParquetReaderOnTestFilesTest+<DecryptFile_UTF8_AesGcmV1_192bit>d__2, Parquet.Test, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]].ExecutionContextCallback(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Parquet.Test.ParquetReaderOnTestFilesTest+<DecryptFile_UTF8_AesGcmV1_192bit>d__2, Parquet.Test, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]].MoveNext(System.Threading.Thread)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Parquet.Test.ParquetReaderOnTestFilesTest+<DecryptFile_UTF8_AesGcmV1_192bit>d__2, Parquet.Test, Version=1.0.0.0, Culture=neutral, PublicKeyToken=d380b3dee6d01926]].MoveNext()
   at Xunit.Sdk.AsyncTestSyncContext+<>c__DisplayClass7_0.<Post>b__1(System.Object)
   at Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext(System.Threading.SendOrPostCallback, System.Object)
   at Xunit.Sdk.MaxConcurrencySyncContext+<>c__DisplayClass11_0.<WorkerThreadProc>b__0(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at Xunit.Sdk.ExecutionContextHelper.Run(System.Object, System.Action`1<System.Object>)
   at Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc()
   at Xunit.Sdk.XunitWorkerThread+<>c.<QueueUserWorkItem>b__5_0(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)


The active Test Run was aborted because the host process exited unexpectedly. Please inspect the call stack above, if available, to get more information about where the exception originated from.
The test running when the crash occurred:
Parquet.Test.ParquetReaderOnTestFilesTest.DecryptFile_UTF8_AesGcmV1_192bit

This test may, or may not be the source of the crash.

mukunku (Contributor, Author) commented Mar 3, 2024

Okay, some findings.

If any test runs after my new file-decryption test in the same xUnit collection, it crashes the CLR. I moved my test to its own test collection and disabled parallelization, which essentially means xUnit will run my test in isolation. See: 9e0bbbe

This way my test sometimes works, but it randomly fails with similar memory-mismanagement issues, so it's flaky at the moment. This is just a band-aid to get the PR green. I'm sure I'm doing something wrong somewhere that's causing this issue, but I haven't been able to find it so far.
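
For reference, the isolation setup looks roughly like this in xUnit (the collection name is illustrative; 9e0bbbe has the actual change):

using System.Threading.Tasks;
using Xunit;

// A collection definition with DisableParallelization makes xUnit run its tests
// separately from the rest of the suite.
[CollectionDefinition("Encrypted parquet files", DisableParallelization = true)]
public class EncryptedParquetFilesCollection { }

[Collection("Encrypted parquet files")]
public class ParquetReaderOnTestFilesTest
{
    [Fact]
    public async Task DecryptFile_UTF8_AesGcmV1_192bit()
    {
        // decrypt-and-read assertions go here
        await Task.CompletedTask;
    }
}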

EamonHetherton (Contributor) commented Jul 5, 2024

Not a solution, but here are some findings from investigating this that may help with debugging the issue:

I've found a reliable way to reproduce the exception that crashes the test host within a single test: the issue happens when allocating memory.

Add these two lines at the end of the "Z_DecryptFile_UTF8_AesGcmV1_192bit" test:

byte[] x1 = new byte[5_000_000];
byte[] x2 = new byte[5_000_000];

and it crashes every time on the second allocation with a System.ExecutionEngineException.

The precise minimum allocation size is unknown to me; a single larger allocation doesn't trigger the issue, but several smaller allocations will, e.g.

this does not provoke the crash:
byte[] x1 = new byte[50_000_000];

but this does:

byte[] x1 = new byte[500_000];
byte[] x2 = new byte[500_000];
byte[] x3 = new byte[500_000];
byte[] x4 = new byte[500_000];
byte[] x5 = new byte[500_000];
byte[] x6 = new byte[500_000];
byte[] x7 = new byte[500_000];
byte[] x8 = new byte[500_000];

That said, it only crashes if I've read the first data column: the datetime column that has invalid data requiring the catch statement in the AsUnixMillisecondsInDateTime extension method, so there's probably something in that. (I'm not sure the int and float column data is valid either, for that matter, but reading them doesn't seem to provoke this issue.)

@mukunku do you know whether the file is actually valid and what the unencrypted contents should be? (I couldn't figure out how to specify an encryption key with parquet-tools.) It would also be useful, I'd say, to have the unencrypted and encrypted files side by side so the test can verify the validity of the decrypted contents.

EamonHetherton (Contributor) commented
Spent a little more time looking into this and I think I've found the problem. The code decrypts the dictionary page header but not the dictionary page itself, so it is actually attempting to decompress encrypted data. As for the cause of the crash, this part is mostly educated speculation: because CompressedPageSize is actually the size of the encrypted payload (with tag, length and nonce), it is too large, and that may be causing problems in the decompressor's unsafe pointer code, leading to memory corruption. In the case of the first column, CompressedPageSize = 50 and UncompressedPageSize = 16; after actually decrypting the data first, the compressed size is 18 and it decompresses into correct data.
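
In rough terms the fix is to decrypt the page bytes before decompressing them, and to feed the decompressor the decrypted length instead of CompressedPageSize; the helper and variable names below are made up purely for illustration:

// Before: raw (still encrypted) page bytes went straight to the decompressor,
// sized by header.CompressedPageSize (50 for the first column).
// After: decrypt first, then decompress the smaller decrypted buffer (18 bytes here).
byte[] encryptedPage = ReadBytes(stream, header.CompressedPageSize);        // hypothetical helper
byte[] decryptedPage = decryptor.Decrypt(encryptedPage, key, pageAad);      // hypothetical decryptor
byte[] plainPage = Decompress(decryptedPage, header.UncompressedPageSize);  // hypothetical helper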

This also addresses the problem that required the catch in the AsUnixMillisecondsInDateTime method: a valid DateTime of "9/10/2023 2:26:07 PM" is decoded now. (Well, two DateTimes that are 2,001 microseconds apart, actually.)
