
Audit header support for AAL #7319

Open · wants to merge 10 commits into base: feature-HADOOP-19363-analytics-accelerator-s3
Conversation


@rajdchak commented Jan 23, 2025

Description of PR

Audit header support for AAL

How was this patch tested?

Ran the ITestS3AS3SeekableStream tests and verified that audit headers are created, e.g.:

2025-01-22 20:27:56,194 [JUnit-testConnectorFrameWorkIntegrationWithoutCrtClient] INFO  analyticsaccelerator.S3SdkObjectClient (S3SdkObjectClient.java:getObject(177)) - auditHeaders https://audit.example.org/hadoop/1/op_open/5660c308-4ee1-41e1-aa1d-92a467592056-00000016/?op=op_open&p1=raw/2023/017/ohfh/OHFH017d.23_.gz&pr=rajdchak&ps=8557f10b-e00d-4cbc-8718-f1b29b2e24d5&rg=bytes=5-65540&id=5660c308-4ee1-41e1-aa1d-92a467592056-00000016&t0=46&fs=5660c308-4ee1-41e1-aa1d-92a467592056&t1=46&ts=1737577673749
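The audit header above is a referrer-style URL whose query string carries the audit fields (`op`, `p1`, `rg`, `ts`, ...). As a minimal sketch of how such a header can be decoded for inspection in a test, the helper below splits the query string into a map; the class and method names are illustrative, not the actual hadoop-aws parsing code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: split the query string of an S3A audit/referrer
// URL into its key=value fields for inspection in tests.
public final class AuditHeaderParser {

    public static Map<String, String> parseQuery(String url) {
        Map<String, String> fields = new HashMap<>();
        int q = url.indexOf('?');
        if (q < 0) {
            return fields;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                // split on the FIRST '=' only: a value may itself
                // contain '=', e.g. rg=bytes=5-65540
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> f = parseQuery(
            "https://audit.example.org/hadoop/1/op_open/id-0001/"
                + "?op=op_open&rg=bytes=5-65540&ts=1737577673749");
        System.out.println(f.get("op"));   // op_open
        System.out.println(f.get("rg"));   // bytes=5-65540
    }
}
```

Note that the range field (`rg=bytes=5-65540`) contains a second `=`, which is why the parser splits only on the first one.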

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

steveloughran and others added 10 commits January 16, 2025 12:20
First iteration
* Factory interface with a parameter object creation method
* Base class AbstractS3AInputStream for all streams to create
* S3AInputStream subclasses that and has a factory
* Production and test code to use it

Not done
* Input stream callbacks pushed down to S3Store
* S3Store to dynamically choose factory at startup, stop in close()
* S3Store to implement the factory interface, completing final binding
  operations (callbacks, stats)
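The pattern this commit describes can be sketched as follows, assuming hypothetical names (this is not the actual hadoop-aws API): a factory interface whose creation method takes a single parameter object, so new fields can be added later without breaking every factory implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Parameter object collecting everything a stream needs at creation. */
final class StreamCreateParameters {
    final String key;       // object key to open
    final long fileLength;  // known length of the object

    StreamCreateParameters(String key, long fileLength) {
        this.key = key;
        this.fileLength = fileLength;
    }
}

/** Factory interface, implemented once per stream type. */
interface ObjectInputStreamFactory {
    InputStream create(StreamCreateParameters params);
}

/** Trivial stand-in for the classic-stream factory implementation. */
final class ClassicStreamFactory implements ObjectInputStreamFactory {
    @Override
    public InputStream create(StreamCreateParameters params) {
        // real code would issue a ranged GET for params.key;
        // here we just return an empty stream of the declared length
        return new ByteArrayInputStream(new byte[(int) params.fileLength]);
    }
}
```

The parameter object is what lets production and test code share one creation path: tests can populate only the fields they need.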

Change-Id: I8d0f86ca1f3463d4987a43924f155ce0c0644180
Revision

API: Make clear this is part of the fundamental store Model:

* abstract stream class is now ObjectInputStream
* interface is ObjectInputStreamFactory
* move to package org.apache.hadoop.fs.s3a.impl.model

Implementation: Prefetching stream is created this way too;
adds one extra parameter.

Maybe we should pass conf down too

Change-Id: I5bbb5dfe585528b047a649b6c82a9d0318c7e91e
Change-Id: If42bdd0b227c4da07c62a410a998e6d8c35581f6
Moves all prefetching stream related options into the prefetching stream
factory; the standard ReadOpContext removes them, so
a new PrefetchingOptions is passed around.

Stream factories can now declare how many extra shared threads they
want and whether or not to create a future pool around the bounded pool.
This is used in S3AFileSystem when creating its thread pools; this class
no longer reads in any of the prefetching options.

All tests which enable/disable prefetching, or probe for its state,
now use S3ATestUtils methods for this.
This avoids them having to explicitly unset two properties and set the
new input stream type, and shields test setup from further changes in
future.
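The thread-pool declaration described above can be sketched like this; all names here are illustrative stand-ins, not the real hadoop-aws classes. Each factory states its extra shared-thread needs and whether a future pool should wrap the bounded pool, and the filesystem sizes its pools from that declaration.

```java
/** What a stream factory asks of the filesystem's thread pools. */
final class StreamThreadOptions {
    final int sharedThreads;        // extra threads for the shared pool
    final boolean createFuturePool; // wrap bounded pool in a future pool?

    StreamThreadOptions(int sharedThreads, boolean createFuturePool) {
        this.sharedThreads = sharedThreads;
        this.createFuturePool = createFuturePool;
    }
}

final class ThreadPoolSizing {
    /** Configured base pool size plus whatever the active factory asks for. */
    static int totalThreads(int configuredMax, StreamThreadOptions opts) {
        return configuredMax + opts.sharedThreads;
    }
}
```

With this shape, the prefetching options stay inside the prefetching factory and the filesystem only ever sees the aggregate thread demand.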

Everything under S3AStore is a service, so the service lifecycle matches
everywhere; the store just adds to the list of managed services for
start/stop/close integration.

+ adjust assertions in ITestS3AInputStreamLeakage for prefetching
+ update the prefetching.md doc for factory changes
+ javadocs
+ add string values of type names to Constants

Once the analytics stream is in, a full doc on "stream performance"
will be needed.

package for this stuff is now impl.streams

Change-Id: Id6356d2ded2c477ba16cbb9027ac0cfbece2a542
Push factory construction into the enum itself

Store implements stream capabilities, which are then
relayed to the active factory. This avoids the FS having
to know what capabilities are available in the stream.

Abstract base class for stream factories.
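"Push factory construction into the enum itself" can be sketched as below: each enum constant carries a supplier for its own factory, so selecting a stream type needs no switch statement. The constant and method names are hypothetical; a plain `String` stands in for the real factory type to keep the sketch self-contained.

```java
import java.util.function.Supplier;

// Each stream type knows how to build its own factory.
enum InputStreamType {
    CLASSIC(() -> "classic-factory"),
    PREFETCH(() -> "prefetch-factory"),
    ANALYTICS(() -> "analytics-factory");

    // stands in for Supplier<ObjectInputStreamFactory> in the real design
    private final Supplier<String> factorySupplier;

    InputStreamType(Supplier<String> factorySupplier) {
        this.factorySupplier = factorySupplier;
    }

    String createFactory() {
        return factorySupplier.get();
    }
}
```

Because construction lives on the constant, configuration code can resolve a type by name (`InputStreamType.valueOf(...)`) and get the right factory without the filesystem knowing each stream's capabilities.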

Change-Id: Ib757e6696f29cc7e0e8edd1119e738c6adc6f98f
Change-Id: Id79f8aa019095c1601bb0b2a282c51bdb0b7b817
Conflicts:
  hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java

Change-Id: I1eddd195a9a3e3332bfaac2e225acf69774c3ce8
Renamed some files

Addressed comments
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 52 new or modified test files.
_ feature-HADOOP-19363-analytics-accelerator-s3 Compile Tests _
+0 🆗 mvndep 5m 19s Maven dependency ordering for branch
+1 💚 mvninstall 17m 53s feature-HADOOP-19363-analytics-accelerator-s3 passed
+1 💚 compile 8m 30s feature-HADOOP-19363-analytics-accelerator-s3 passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 compile 7m 47s feature-HADOOP-19363-analytics-accelerator-s3 passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 checkstyle 1m 58s feature-HADOOP-19363-analytics-accelerator-s3 passed
+1 💚 mvnsite 2m 6s feature-HADOOP-19363-analytics-accelerator-s3 passed
+1 💚 javadoc 1m 52s feature-HADOOP-19363-analytics-accelerator-s3 passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 37s feature-HADOOP-19363-analytics-accelerator-s3 passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
+1 💚 spotbugs 3m 7s feature-HADOOP-19363-analytics-accelerator-s3 passed
+1 💚 shadedclient 19m 37s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 19m 51s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 14s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 7m 52s /patch-compile-root-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.
-1 ❌ javac 7m 52s /patch-compile-root-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 7m 25s /patch-compile-root-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt root in the patch failed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.
-1 ❌ javac 7m 25s /patch-compile-root-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt root in the patch failed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 1m 56s /results-checkstyle-root.txt root: The patch generated 75 new + 34 unchanged - 7 fixed = 109 total (was 41)
-1 ❌ mvnsite 0m 28s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 41s /results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt hadoop-common-project_hadoop-common-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 ❌ javadoc 0m 26s /patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.
-1 ❌ javadoc 0m 32s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.
-1 ❌ spotbugs 0m 25s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 21m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 18s hadoop-common in the patch passed.
+1 💚 unit 26m 40s hadoop-hdfs-rbf in the patch passed.
-1 ❌ unit 0m 30s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
165m 2s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7319/1/artifact/out/Dockerfile
GITHUB PR #7319
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint
uname Linux 97a962472d79 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision feature-HADOOP-19363-analytics-accelerator-s3 / c4a64b4
Default Java Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7319/1/testReport/
Max. process+thread count 3659 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-rbf hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7319/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

4 participants