Feature Request
The fetch command often downloads a large number of RPMs with a large combined size. When running CI automation on builds, most of these RPMs are unchanged between similar builds. The result is a lot of unnecessarily duplicated network traffic and disk storage among parallel or subsequent builds that are required to operate in otherwise isolated build workspaces.
It would be very useful if a separate folder, possibly located on a separately mounted file system, could optionally be used as an RPM cache shared among different build workspaces.
Desired Feature
Add (optional) support for an RPM cache that lives on a separately mounted file system and is (re-)used among multiple builds, including ones that may be running in parallel.
This may also require an option for configuring lockfile waiting limits to handle parallel builds.
It may also require adding a configuration option for setting the maximum size before pruning, or an option to eliminate the maximum size limit.
Example Usage
cosa fetch --rpm-cache=/mnt/nfs-mounts/fcos-rpm-cache --lock-timeout=2m --prune-limit=50GB
Other Information
While I'm having trouble tracking the specific internals of how some of the directories get created and used, it appears either cache/cache/ or cache/pkgbuild-cache/ would be good candidates for making into an RPM cache. I did some testing by running coreos-assembler containers on different workspaces with only slightly different CoreOS Configs, and bind-mounting a fixed folder from outside the current directory into both of them as either /srv/cache/cache/ or /srv/cache/pkgbuild-cache/.
There appear to be a few issues with these solutions though:
cache/cache/ checks for a hash of the entire set of RPMs, not individual RPMs
cache/cache/ doesn't always clear locks if terminated via interrupt
cache/cache/ doesn't wait or retry if locks are already held
cache/pkgbuild-cache/ is required to support hardlinking to another folder during build
cache/pkgbuild-cache/ has a hardcoded size limit before it's auto-pruned
cache/cache/ seems like the best candidate, but it creates completely independent ostree(?) commits in storage, keyed on the hash of the RPM list. Disk contents are de-duplicated across all of the downloaded RPMs, but only after each RPM has been downloaded, so the network bandwidth has already been wasted. This is not a critical problem, and I'm guessing it could be solved by creating the ostree(?) commits on top of one another instead of completely isolated from one another.
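To illustrate the difference between keying the cache on the whole RPM set and keying it on individual RPMs, here is a minimal sketch of a per-RPM lookup. It is not how coreos-assembler works today; `fetch_rpm`, the directory layout, and the `download` callback are all hypothetical names invented for this example, and it assumes the expected SHA-256 of each RPM is known up front (for instance from repository metadata).

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_rpm(name, checksum, cache_dir, download):
    """Return the cached path for one RPM, downloading only on a cache miss.

    `download(name, dest)` is a hypothetical callback that writes the RPM
    to `dest`; it stands in for whatever actually talks to the repository.
    """
    cached = Path(cache_dir) / checksum[:2] / f"{checksum}-{name}"
    if cached.exists():
        return cached  # cache hit: no network traffic at all

    cached.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary file in the same directory so the final
    # rename is atomic and parallel builds never see a partial RPM.
    fd, tmp = tempfile.mkstemp(dir=cached.parent, suffix=".part")
    os.close(fd)
    try:
        download(name, Path(tmp))
        if sha256_of(tmp) != checksum:
            raise ValueError(f"checksum mismatch for {name}")
        os.replace(tmp, cached)
    finally:
        # Clean up the temporary file if the download or check failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
    return cached
```

With this layout, two builds whose RPM lists differ by a single package would share every other cached file, and the second build would download only the package that changed.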
Additionally there is a lockfile used to ensure consistency of the cache ostree(?), but there are some issues with how it's managed.
Certain console interrupts don't seem to release the lock properly. This doesn't appear to be an issue on the next run if the folder is on the same mount point as the rest of the working directory, but if cache/cache/ is mounted separately, the second build fails with an error that requires the lock to be cleared manually.
If the lock is already held, the build fails immediately rather than waiting for a parallel task to release it. Failing immediately makes sense for the current implementation, which only supports a single process, but if parallel builds sharing the locked contents are to be supported, the build needs to wait on the lock for some (ideally configurable) time before failing.
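A rough sketch of the waiting behaviour described above, assuming an advisory `flock`-style lock rather than whatever mechanism the cache uses today. The `cache_lock` helper and its `timeout` and `poll` parameters are hypothetical. One property of this kind of lock is that the kernel drops it when the holding process exits, which would also avoid stale locks after an interrupt; how well `flock` behaves on a network file system such as NFS depends on the client and server setup, so that would need testing.

```python
import errno
import fcntl
import os
import time
from contextlib import contextmanager


@contextmanager
def cache_lock(path, timeout=120.0, poll=0.5):
    """Hold an exclusive flock on `path`, waiting up to `timeout` seconds."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                # Non-blocking attempt, so we control the waiting ourselves.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except OSError as e:
                if e.errno not in (errno.EAGAIN, errno.EACCES):
                    raise
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        f"could not lock {path} within {timeout}s"
                    )
                time.sleep(poll)
        yield
    finally:
        # Closing the descriptor releases the lock if it was acquired.
        os.close(fd)
```

A build would then wrap its cache access in `with cache_lock("/srv/cache/cache.lock", timeout=120):` (the lock path here is only an example), failing with `TimeoutError` if a parallel build holds the lock for too long.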
cache/pkgbuild-cache/ looks like another possible option for the shared RPM cache, but I'm less clear about its exact usage in the build. However, the code for the fetch command lists it as the FILE variable and automatically prunes it once its size exceeds a hardcoded limit. This hardcoded, non-configurable pruning limit would be an issue and would need to be made configurable at a minimum.
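As a sketch of what a configurable limit could look like, the following parses a value such as the `50GB` from the example usage and prunes the oldest files until the cache fits. `parse_size`, `prune`, the `none` keyword for disabling the limit, and the choice of 1024-based units are assumptions made for this example, and oldest-first eviction is just one possible policy, not a description of the current pruning logic.

```python
import re
from pathlib import Path

# Assumption: units are powers of 1024, so "50GB" means 50 * 1024**3 bytes.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Parse '50GB', '512M' or '2048' into bytes; 'none' means no limit."""
    if text.strip().lower() == "none":
        return None
    m = re.fullmatch(r"\s*(\d+)\s*([KMGT]?)(?:i?B)?\s*", text, re.IGNORECASE)
    if not m:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2).upper()]


def prune(cache_dir, limit):
    """Delete the oldest files until the total size is at most `limit`.

    Returns the list of removed paths; does nothing when `limit` is None.
    """
    if limit is None:
        return []
    files = []
    for p in Path(cache_dir).rglob("*"):
        if p.is_file():
            st = p.stat()
            files.append((st.st_mtime, st.st_size, p))
    total = sum(size for _, size, _ in files)
    removed = []
    # Oldest modification time first.
    for _, size, path in sorted(files, key=lambda t: t[0]):
        if total <= limit:
            break
        path.unlink()
        total -= size
        removed.append(path)
    return removed
```

Calling `prune(cache_dir, parse_size("50GB"))` would cap the cache at 50 GiB under these assumptions, while `parse_size("none")` disables pruning entirely.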
Additionally, if I host-mount the cache/pkgbuild-cache/ folder separately for a coreos-assembler container, the builds fail because the build unconditionally attempts to hardlink between files in it and files located in a separate folder somewhere. If the cache were located on a separate mount point, hardlinking would not be possible. I'm not sure of the impact of losing the ability to hardlink, but it does suggest duplication would result, possibly making cache/pkgbuild-cache/ a less useful option.
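One way the hardlink step could tolerate a separately mounted cache is to fall back to a copy when the link fails with a cross-device error. This is only a sketch; `link_or_copy` is a hypothetical helper, and the fallback trades the failure for exactly the duplication mentioned above.

```python
import errno
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink `src` to `dst`, copying instead if they are on different
    file systems. Returns "linked" or "copied" so callers can log it."""
    try:
        os.link(src, dst)
        return "linked"
    except OSError as e:
        # EXDEV means the two paths are on different mount points;
        # any other error is unexpected and re-raised.
        if e.errno != errno.EXDEV:
            raise
    shutil.copy2(src, dst)  # preserves timestamps and permission bits
    return "copied"
```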