Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various S3 improvements #16

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

klauspost
Copy link

@klauspost klauspost commented Dec 29, 2024

This breaks the placement on older versions. It will probably also break when using versions prior to 1.24 RC.

  • Add custom endpoint via GOCACHE_S3_URL.
  • Add auth with session token via GOCACHE_S3_SESSION_TOKEN
  • Use a default us-east-1 region.
  • Fix ObjectID -> OutputID field rename
  • Fix TimeNanos -> Time field rename
  • Replace GOCACHE_CACHE_KEY with GOCACHE_S3_PREFIX.
  • Add compression to >8KB files on S3.
  • Use first 3 bytes of hash for prefix.
  • Fix NaN in timekeeping (and just ignore time not spent doing requests)
  • Update all dependencies
  • Forks bytes.Buffer to make it seekable, since non-TLS requests will include checksum.
  • Remove arch/os from path.

If you want to test on our playground you can try...

GOCACHE_S3_BUCKET=testbucket
GOCACHE_S3_URL=https://play.min.io
GOCACHE_S3_ACCESS_KEY=minioadmin
GOCACHE_S3_SECRET_KEY=minioadmin

* Add custom endpoint via `GOCACHE_AWS_URL`.
* Add auth with session token via `GOCACHE_AWS_SESSION_TOKEN`
* Use a default `us-east-1` region.
* Fix ObjectID -> OutputID field rename
* Fix TimeNanos -> Time field rename
* Replace `GOCACHE_CACHE_KEY` with `GOCACHE_S3_PREFIX`.
* Add compression to >8KB files on S3.
* Use first 3 bytes of hash for prefix.
* Fix NaN in timekeeping (and just ignore time not spent doing requests)
* Update all dependencies
@klauspost
Copy link
Author

It is completely broken on Windows: golang/go#71059

RC1 also breaks if the experiment isn't enabled due to the renamed ObjectID field.

README.md Outdated Show resolved Hide resolved
@klauspost
Copy link
Author

@or-shachar I was thinking, if there is a reason to (forcefully) separate on arch/platform.

AFAICT the files are pretty much unique per platform anyway.

We can either set it as metadata or tags (which will make it available to lifecycle filters).

@or-shachar
Copy link
Owner

Thanks @klauspost ! Any chance you can pull changes from upstream, and add 1.24rc1 to the test matrix (github action files)?

@klauspost
Copy link
Author

Main merged and workflows updated, but I expect you need to approve them to run.

@or-shachar
Copy link
Owner

mm... I was sure that I got the pull request trigger right. Can you kindly pull changes / rebase again?

@klauspost
Copy link
Author

Merged + pushed. Seems like changes to workflows aren't being picked up before they are merged.

@or-shachar
Copy link
Owner

Not sure why... I'll need to read a little more to figure it out. Will continue tomorrow. If you have an idea - let me know.

Copy link
Owner

@or-shachar or-shachar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick comment about the workflows change

.github/workflows/go.yml Outdated Show resolved Hide resolved
.github/workflows/go.yml Outdated Show resolved Hide resolved
Copy link
Owner

@or-shachar or-shachar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to review the bytes operations but here is some questions I gathered.

envVarS3BucketName = "GOCACHE_S3_BUCKET"
envVarS3CacheKey = "GOCACHE_CACHE_KEY"
envVarS3Prefix = "GOCACHE_S3_PREFIX"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

breaking change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. See discussion above. Honestly I thought I was talking to you until I had already sent in the change.

Comment on lines +1 to +3
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this ? external source file? why is it checked in here? should we import it from somewhere else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forks bytes.Buffer to make it seekable, since non-TLS requests will include checksum.

The aws SDK needs an input that is seekable to work with HTTP endpoints. So I forked bytes.Buffer to include that.

Comment on lines +72 to +74
if outputID == "" {
return "", errors.New("empty outputID")
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backward compatible?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test with EXPERIMENT option. For 1.24 it is.

cachers/s3.go Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants