
Possibility for caching mirrored repositories #304

Open
malinink opened this issue Nov 20, 2024 · 4 comments


malinink commented Nov 20, 2024

Is there any possibility to cache mirrored repositories' zipballs directly in S3?

As of now, I have configured the mirror:

packeton:
  mirrors:
    packagist:
      url: https://repo.packagist.org
      public_access: true
      sync_lazy: true

And the S3 provider:

      STORAGE_SOURCE: s3
      STORAGE_AWS_BUCKET: test
      STORAGE_AWS_PREFIX: package
      STORAGE_AWS_ARTIFACT_PREFIX: artifact
      STORAGE_AWS_ARGS: '{"endpoint": "http://minio:9000", "accessKeyId": "minioadmin", "accessKeySecret": "minioadmin", "region": "ru-central1"}'

It works great, but I found that the directory /data/composer contains the package data, while the package folder in S3 is empty.

I need to set up Packeton as a stateless service, and my question is:
is there any possibility to store the package data that passes through Packeton (the mirror) in S3 to cache it?

malinink (Author)

I have researched for a while and found that there are some cases not covered by the S3 provider, e.g.:
https://github.com/vtsykun/packeton/blob/master/src/Mirror/RemoteProxyRepository.php#L39
https://github.com/vtsykun/packeton/blob/master/src/Mirror/Service/ZipballDownloadManager.php#L23

@vtsykun Am I right?

vtsykun (Owner) commented Nov 25, 2024

Hi @malinink

S3 for the proxy is not supported due to performance problems.

ZipballDownloadManager uses Filesystem only for caching; I'm not sure we need to add a filesystem abstraction here, but that is not a problem.

Another problem is using Filesystem in RemoteProxyRepository when it is lazy. I don't know how we can effectively cache the $this->filesystem->exists() calls. If the count of packages is large, for example 1000, it may severely impact performance. I want to avoid the situation where the proxy works slower than Packagist.
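To illustrate the concern, here is a minimal, hypothetical sketch (not Packeton's actual code) of memoizing existence checks so that each package costs at most one S3 HEAD request instead of one per metadata lookup. `ExistsCache` and `fake_head` are invented names; in practice the dict would be Redis and the callable an S3 client.

```python
# Hypothetical sketch: cache existence checks so repeated lookups
# do not each trigger an S3 HEAD request.
class ExistsCache:
    def __init__(self, s3_head):
        self._s3_head = s3_head   # callable: key -> bool (one S3 request)
        self._cache = {}          # key -> bool; stands in for Redis

    def exists(self, key: str) -> bool:
        # Positive results are safe to cache long-term; a cached
        # "missing" entry must be invalidated when the object is uploaded.
        if key not in self._cache:
            self._cache[key] = self._s3_head(key)
        return self._cache[key]

    def invalidate(self, key: str) -> None:
        self._cache.pop(key, None)

# Demo with a fake S3 client that records every request it receives.
calls = []
def fake_head(key):
    calls.append(key)
    return key == "p/monolog.json"

cache = ExistsCache(fake_head)
cache.exists("p/monolog.json")
cache.exists("p/monolog.json")   # served from the cache, no second S3 call
assert calls == ["p/monolog.json"]
```

The trade-off this sketch glosses over is exactly the one raised above: invalidating cached "missing" entries correctly is what makes lazy mirroring hard.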

malinink (Author)

@vtsykun
As of now, there is a problem with stateless deployment of the service. We have 4 upstreams of the service. The problem occurs when I try to install packages and they are not found on disk (for example, after a restart of the service, where we lose all data except the DB, Redis, and S3, or after installing packages and requesting them on another upstream). We get a lot of 404s, and Composer falls back to the original URL for downloading packages.

I get the point about a slow proxy. But if we copy the cache from S3, it would be a single long call per package after a service restart. Once the local cache is populated, it would respond as usual. Do you think this behavior would still affect the speed of the proxy?
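The proposed behavior can be sketched as a read-through cache (a hypothetical illustration, not Packeton code; `get_zipball` and `fake_fetch` are invented names standing in for the zipball handler and an S3 client):

```python
# Hypothetical sketch: serve zipballs from local disk, falling back to
# S3 exactly once per package after a restart.
import os
import tempfile

def get_zipball(name, cache_dir, fetch_from_s3):
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):        # cold path: first request after restart
        data = fetch_from_s3(name)      # the single long call per package
        with open(path, "wb") as f:
            f.write(data)
    with open(path, "rb") as f:         # warm path: local disk only
        return f.read()

# Demo with a fake S3 client that records every fetch.
s3_calls = []
def fake_fetch(name):
    s3_calls.append(name)
    return b"zip-bytes"

with tempfile.TemporaryDirectory() as d:
    assert get_zipball("monolog.zip", d, fake_fetch) == b"zip-bytes"
    assert get_zipball("monolog.zip", d, fake_fetch) == b"zip-bytes"
assert s3_calls == ["monolog.zip"]      # S3 was hit only once
```

Under this scheme, only the first request per package after a restart pays the S3 round trip; subsequent requests are served from local disk as they are today.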

vtsykun (Owner) commented Nov 27, 2024

Yes, I understand the problem. I'll investigate how we can add S3 storage support for proxies while avoiding the performance issue of making a lot of S3 requests.

@vtsykun vtsykun added this to the 2.7 milestone Nov 27, 2024