Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][build] pip with --prefer-binary to reduce image build time #22613

Conversation

nodece
Copy link
Member

@nodece nodece commented Apr 29, 2024

Motivation

When building the ARM image, pip3 will take a long time to build the grpcio package, because the PyPI doesn't provide the grpcio aarch64 wheel file, this is unfriendly.

Build script:

mvn install -Pmain,docker -pl docker/pulsar -Ddocker.platforms=linux/arm64 -Ddocker.nocache=true

Log(Improved):

....
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  04:18 min
[INFO] Finished at: 2024-04-29T18:09:50+08:00
[INFO] ------------------------------------------------------------------------
[INFO] 0 goals, 0 executed

And then we can add a job to build the ARM image.

Modifications

  • Install py3-grpcio by the pulsar-client
  • Add --prefer-binary and wheel mirror to speed up installation

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@nodece nodece self-assigned this Apr 29, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 29, 2024
pulsar-client[all]==${PULSAR_CLIENT_PYTHON_VERSION}

pulsar-client[all]==${PULSAR_CLIENT_PYTHON_VERSION} \
--prefer-binary --find-links https://wheel-index.linuxserver.io/alpine-3.19/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to trust linuxserver.io for binaries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it provides the grpcio package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also download this wheel file to local.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also download this wheel file to local.

where and why? I couldn't find any existing references of linuxserver.io .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://wheels.linuxserver.io/alpine-3.19/grpcio-1.62.2-cp311-cp311-linux_aarch64.whl to the pulsar repo.

how does that get referenced? does the Alpine image already reference that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution comes from grpc/grpc#27512 (comment)

Non Alpine official solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution comes from grpc/grpc#27512 (comment)

Non Alpine official solution.

What makes this solution "official"?
I think a better solution is to lower the required grpc version in pulsar-client-python to 1.59.3 so that no external solution wouldn't be required at all.

The grpc version should match apache-bookkeeper-client's grpc version to ensure compatibility of generated stubs.

@@ -59,16 +59,16 @@ RUN apk add --no-cache \
musl-dev \
libffi-dev \
py3-pip \
py3-grpcio \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since py3-grpcio is already installed, why isn't that used? isn't there a way to prevent upgrading of the grpcio package?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not good at pip, and as far as I know there's no way. The pip will uninstall the installed, and then to install a new grpcio package by the pulsar-client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is causing the problem: apache/pulsar-client-python@162afd5 .

In Alpine, the grpcio version is 1.59.3 , https://pkgs.alpinelinux.org/package/v3.19/community/aarch64/py3-grpcio

@BewareMyPower Is there a way to allow using grpcio version 1.59.3 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operating system only requires the basic package, and we can use pip to install other dependencies, which are always the latest.

apk sometimes provides outdated packages.

The pip will uninstall the installed, and then to install a new grpcio package by the pulsar-client.

You can use the RUN pip3 install --break-system-packages pulsar-client[all]==${PULSAR_CLIENT_PYTHON_VERSION} --verbose --prefer-binary to reproduce this issue in the Dockerfile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apk sometimes provides outdated packages.

I explained the root cause in the previous comment: #22613 (comment) .

Alpine comes with grpc 1.59.3 package. I don't see a reason why that couldn't be supported.

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created apache/pulsar-client-python#211 as a way to mitigate the root cause.

@nodece nodece closed this May 1, 2024
@nodece nodece force-pushed the pip-with-prefer-binary-reduce-image-build-time branch from 02a351d to 93afd89 Compare May 1, 2024 17:29
@nodece
Copy link
Member Author

nodece commented May 1, 2024

Forcing push resulted in this PR being closed.

Once apache/pulsar-client-python#211 is merged, we can use the --only-binary :all: to install the pulsar-client, which can avoid the build time issue.

One issue is that PyPI doesn't provide the ratelimit wheel file, we can build this wheel, and upload it to the pulsar repo, this size is 6kb.

RUN pip3 install --break-system-packages \
    pulsar-client[all]==${PULSAR_CLIENT_PYTHON_VERSION} \
    --only-binary :all:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs ready-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants