Move to upstream Dgraph4j shaded dependency (#270)
## Checklist before submitting

- [X] Did you read the [contributing guide](https://github.com/G-Research/spark-dgraph-connector/blob/contributing-guidelines/CONTRIBUTING.md)?
- [ ] Did you update the docs?
- [ ] Did you write any tests to validate this change?  

## Description

Moves to the official Dgraph4j shaded client dependency.
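
As a rough illustration of what the `shaded` classifier means in practice, the sketch below builds the path of a jar relative to the local Maven repository root using the standard layout (group directories, artifact, version, then `artifact-version[-classifier].jar`). The function name `maven_jar_path` and its argument order are hypothetical and not part of this project.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the project): path of a jar relative to
# the local Maven repository root, e.g. ~/.m2/repository.
maven_jar_path() {
  local group="$1" artifact="$2" version="$3" classifier="${4:-}"
  local suffix=""
  # a classifier, if present, is appended after the version
  if [[ -n "$classifier" ]]; then
    suffix="-${classifier}"
  fi
  # dots in the group id become directory separators
  echo "${group//.//}/${artifact}/${version}/${artifact}-${version}${suffix}.jar"
}

maven_jar_path io.dgraph dgraph4j 24.1.1 shaded
# prints io/dgraph/dgraph4j/24.1.1/dgraph4j-24.1.1-shaded.jar
```

This matches the file that the integration test action copies from `~/.m2/repository` for Spark 3.0.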

## Review process for approval

1. All tests and other checks must succeed.
2. At least one core contributor must review and approve.
3. If a core contributor requests changes, they must be addressed.
EnricoMi authored Dec 15, 2024
1 parent 5a7a21b commit 9d04e9a
Showing 7 changed files with 115 additions and 48 deletions.
10 changes: 10 additions & 0 deletions .github/actions/build/action.yml
@@ -43,9 +43,13 @@ runs:

- name: Build
run: |
mvn --batch-mode -Dspotless.check.skip --update-snapshots dependency:go-offline
mvn --batch-mode -Dspotless.check.skip --update-snapshots clean compile test-compile
mvn --batch-mode -Dspotless.check.skip -DskipTests -Dmaven.test.skip=true package
mvn --batch-mode -Dspotless.check.skip -DskipTests -Dmaven.test.skip=true -Dgpg.skip install
cd examples/scala
mvn --batch-mode -Dspotless.check.skip clean compile test-compile
mvn --batch-mode -Dspotless.check.skip -DskipTests -Dmaven.test.skip=true package
shell: bash

- name: Upload Binaries
@@ -58,6 +62,12 @@ runs:
!target/*-javadoc.jar
!target/site
- name: Upload Dependencies
uses: actions/upload-artifact@v4
with:
name: Dependencies-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: ~/.m2/repository

branding:
icon: 'check-circle'
color: 'green'
57 changes: 30 additions & 27 deletions .github/actions/test-integrate/action.yml
@@ -40,38 +40,28 @@ runs:
name: Binaries-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: .

- name: Cache Maven packages
uses: actions/cache@v4
- name: Fetch Dependencies Artifact
uses: actions/download-artifact@v4
with:
name: Dependencies-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: ~/.m2/repository
key: ${{ runner.os }}-mvn-integrate-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-${{ hashFiles('pom.xml') }}
restore-keys: |
${{ runner.os }}-mvn-integrate-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-
${{ runner.os }}-mvn-build-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-

- name: Cache Spark Binaries
uses: actions/cache@v4
- name: Fetch Spark Binaries Artifact
uses: actions/download-artifact@v4
with:
name: Spark-Binaries-${{ inputs.spark-version }}-${{ inputs.hadoop-version }}
path: ~/spark
key: ${{ runner.os }}-spark-binaries-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}

- name: Change file permissions
run: chmod u+x ~/spark/bin/* ~/spark/sbin/*
shell: bash

- name: Setup JDK
uses: actions/setup-java@v4
with:
java-version: ${{ inputs.java-version }}
distribution: 'zulu'

- name: Setup Spark Binaries
env:
SPARK_PACKAGE: spark-${{ inputs.spark-version }}/spark-${{ inputs.spark-version }}-bin-hadoop${{ inputs.hadoop-version }}.tgz
run: |
if [[ ! -e ~/spark ]]
then
wget --progress=dot:giga "https://www.apache.org/dyn/closer.lua/spark/${SPARK_PACKAGE}?action=download" -O - | tar -xzC "${{ runner.temp }}"
archive=$(basename "${SPARK_PACKAGE}") bash -c "mv -v "${{ runner.temp }}/\${archive/%.tgz/}" ~/spark"
fi
shell: bash

- name: Parametrize
id: params
run: |
@@ -84,14 +74,27 @@ runs:
- name: Prepare Integration Tests
run: |
mvn --batch-mode -Dspotless.check.skip -DskipTests install
cd examples/scala
mvn --batch-mode -Dspotless.check.skip package
(cd examples/scala && mvn --batch-mode -Dspotless.check.skip package)
# spark-submit is not capable of downloading these dependencies, fetching them through mvn
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId=com.google.errorprone -DartifactId=error_prone_annotations -Dversion=2.3.3
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId=com.google.code.findbugs -DartifactId=jsr305 -Dversion=3.0.2
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId=org.codehaus.mojo -DartifactId=animal-sniffer-annotations -Dversion=1.17
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId=com.google.code.gson -DartifactId=gson -Dversion=2.8.9
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId=org.slf4j -DartifactId=slf4j-api -Dversion=1.7.16
for dep in "org.slf4j#slf4j-api;2.0.16" \
"com.google.protobuf#protobuf-java;4.29.1" \
"io.netty#netty-all;4.1.110.Final" \
"com.google.guava#guava;33.3.1-jre" \
"com.google.guava#failureaccess;1.0.2" \
"com.google.guava#listenablefuture;9999.0-empty-to-avoid-conflict-with-guava" \
"org.checkerframework#checker-qual;3.43.0" \
"com.google.j2objc#j2objc-annotations;3.0.0"; do
IFS="#;" read group artifact version <<< "$dep"
mvn --batch-mode -Dspotless.check.skip dependency:get -DgroupId="$group" -DartifactId="$artifact" -Dversion="$version"
done
if [[ "${{ inputs.spark-compat-version }}" == "3.0" ]]
then
# spark-submit 3.0 cannot resolve the dgraph4j dependency that has classifier "shaded"
# copying it into .ivy2 cache without classifier
mkdir -p ~/.ivy2/jars/
dgraph4j_version="$(grep -A3 -B2 "<artifactId>dgraph4j</artifactId>" pom.xml | grep "<version>" | sed -e "s/[^>]*>//" -e "s/<.*//")"
cp -v ~/.m2/repository/io/dgraph/dgraph4j/${dgraph4j_version}/dgraph4j-${dgraph4j_version}-shaded.jar ~/.ivy2/jars/io.dgraph_dgraph4j-${dgraph4j_version}.jar
fi
shell: bash

- name: Start Dgraph cluster (Small)
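
The loop in the Prepare Integration Tests step above splits each `group#artifact;version` string into three fields by setting `IFS` to `#;` for a single `read`. A minimal standalone sketch of that parsing follows; the function name `parse_coordinate` is hypothetical, and it only prints the Maven properties instead of invoking `mvn`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "group#artifact;version" the same way the
# action's loop does, then print the properties handed to dependency:get.
parse_coordinate() {
  local group artifact version
  # IFS is changed only for this read; '#' and ';' act as field separators
  IFS="#;" read -r group artifact version <<< "$1"
  echo "-DgroupId=${group} -DartifactId=${artifact} -Dversion=${version}"
}

parse_coordinate "org.slf4j#slf4j-api;2.0.16"
# prints -DgroupId=org.slf4j -DartifactId=slf4j-api -Dversion=2.0.16
```

Because `IFS` is set as a prefix assignment to `read`, the shell's word splitting outside that command is unaffected.
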
10 changes: 3 additions & 7 deletions .github/actions/test-python/action.yml
@@ -37,15 +37,11 @@ runs:
name: Binaries-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: .

- name: Cache Maven packages
uses: actions/cache@v4
- name: Fetch Dependencies Artifact
uses: actions/download-artifact@v4
with:
name: Dependencies-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: ~/.m2/repository
key: ${{ runner.os }}-mvn-python-test-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-${{ hashFiles('pom.xml') }}
restore-keys: |
${{ runner.os }}-mvn-python-test-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-
${{ runner.os }}-mvn-test-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-
${{ runner.os }}-mvn-build-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-

- name: Setup JDK 11
uses: actions/setup-java@v4
9 changes: 3 additions & 6 deletions .github/actions/test-scala/action.yml
@@ -37,14 +37,11 @@ runs:
name: Binaries-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: .

- name: Cache Maven packages
uses: actions/cache@v4
- name: Fetch Dependencies Artifact
uses: actions/download-artifact@v4
with:
name: Dependencies-${{ inputs.spark-compat-version }}-${{ inputs.scala-compat-version }}
path: ~/.m2/repository
key: ${{ runner.os }}-mvn-test-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-${{ hashFiles('pom.xml') }}
restore-keys: |
${{ runner.os }}-mvn-test-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-
${{ runner.os }}-mvn-build-${{ inputs.spark-version }}-${{ inputs.scala-compat-version }}-

- name: Setup JDK
uses: actions/setup-java@v4
16 changes: 11 additions & 5 deletions .github/workflows/ci.yml
@@ -54,6 +54,10 @@ jobs:
git diff
shell: bash

download-spark:
name: "Spark"
uses: "./.github/workflows/download-spark.yml"

build:
name: "Build"
uses: "./.github/workflows/build.yml"
@@ -129,15 +133,17 @@ jobs:

test-integration:
name: "Test Integration"
needs: [test-dgraph, test-spark]
needs: [download-spark, test-dgraph, test-spark]
uses: "./.github/workflows/test-integration.yml"

delete_binaries:
name: "Delete Binaries"
delete_artifacts:
name: "Delete Artifacts"
runs-on: ubuntu-latest
needs: [test-dgraph, test-spark, test-scala, test-python, test-integration]
steps:
- name: Delete Binaries Artifact
- name: Delete Artifacts
uses: geekyeggo/delete-artifact@v5
with:
name: "Binaries-*"
name: |
Binaries-*
Dependencies-*
54 changes: 54 additions & 0 deletions .github/workflows/download-spark.yml
@@ -0,0 +1,54 @@
name: Build

on:
workflow_call:

jobs:
download:
name: Download (Spark ${{ matrix.spark-version }} Hadoop ${{ matrix.hadoop-version }})
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
# use spark versions from test-integration.yml workflow
include:
- spark-version: '3.0.3'
hadoop-version: '2.7'
- spark-version: '3.1.3'
hadoop-version: '2.7'
- spark-version: '3.2.4'
hadoop-version: '2.7'
- spark-version: '3.3.4'
hadoop-version: '3'
- spark-version: '3.4.3'
hadoop-version: '3'
- spark-version: '3.5.3'
hadoop-version: '3'
- spark-version: '4.0.0-preview2'
hadoop-version: '3'

steps:
- name: Cache Spark Binaries
uses: actions/cache@v4
with:
path: ~/spark
key: ${{ runner.os }}-spark-binaries-${{ matrix.spark-version }}-${{ matrix.hadoop-version }}

- name: Setup Spark Binaries
env:
SPARK_PACKAGE: spark-${{ matrix.spark-version }}/spark-${{ matrix.spark-version }}-bin-hadoop${{ matrix.hadoop-version }}.tgz
run: |
if [[ ! -e ~/spark ]]
then
wget --progress=dot:giga "https://www.apache.org/dyn/closer.lua/spark/${SPARK_PACKAGE}?action=download" -O - | tar -xzC "${{ runner.temp }}"
archive=$(basename "${SPARK_PACKAGE}") bash -c "mv -v "${{ runner.temp }}/\${archive/%.tgz/}" ~/spark"
fi
shell: bash

- name: Upload Spark Binaries
uses: actions/upload-artifact@v4
with:
name: Spark-Binaries-${{ matrix.spark-version }}-${{ matrix.hadoop-version }}
path: ~/spark
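
The Setup Spark Binaries step above derives the name of the extracted directory from `SPARK_PACKAGE` by taking the basename and stripping the `.tgz` suffix with bash pattern substitution. A small sketch of just that derivation, with the hypothetical function name `spark_dir_name`, might look like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper: directory name that the Spark archive unpacks to,
# derived the same way as in the workflow step.
spark_dir_name() {
  local package="$1" archive
  # drop the leading "spark-<version>/" path component
  archive="$(basename "$package")"
  # "%" anchors the pattern at the end of the string, removing ".tgz"
  echo "${archive/%.tgz/}"
}

spark_dir_name "spark-3.5.3/spark-3.5.3-bin-hadoop3.tgz"
# prints spark-3.5.3-bin-hadoop3
```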

7 changes: 4 additions & 3 deletions pom.xml
@@ -115,9 +115,10 @@
</dependency>

<dependency>
<groupId>uk.co.gresearch.dgraph</groupId>
<artifactId>dgraph4j-shaded</artifactId>
<version>21.12.0-0</version>
<groupId>io.dgraph</groupId>
<artifactId>dgraph4j</artifactId>
<version>24.1.1</version>
<classifier>shaded</classifier>
</dependency>

<dependency>
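
The Spark 3.0 workaround in the integration test action reads the dgraph4j version back out of `pom.xml` with a `grep` and `sed` pipeline. The sketch below runs that same pipeline against a small inline fragment shaped like the dependency block above; the function name `extract_dgraph4j_version` is hypothetical, and it reads from standard input rather than from `pom.xml` so it can be tried in isolation.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the pipeline used in the action; reads POM
# text from standard input.
extract_dgraph4j_version() {
  # keep a few lines around the artifactId, pick the <version> line,
  # then strip the opening tag (with indentation) and the closing tag
  grep -A3 -B2 "<artifactId>dgraph4j</artifactId>" \
    | grep "<version>" \
    | sed -e "s/[^>]*>//" -e "s/<.*//"
}

extract_dgraph4j_version <<'EOF'
    <dependency>
      <groupId>io.dgraph</groupId>
      <artifactId>dgraph4j</artifactId>
      <version>24.1.1</version>
      <classifier>shaded</classifier>
    </dependency>
EOF
# prints 24.1.1
```

The approach relies on the `<version>` element sitting within three lines after the `<artifactId>` line and on no other `<version>` element falling inside the context window, so it may need adjusting if the dependency block is reordered.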