-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GPU accelerated encoder #895
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes lost all my comments while reviewing on github.dev.... friggin github. Some graphql internal error or something. So I'll just submit this first set of reviews in case. Don't really feel like reviewing again lol... another reason why small PRs are gud, big PRs bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just spent 2 hours reviewing and I barely even got to the important stuff. Suggest no more PRs like this, for the sake of our industry, and of your reviewers' mental health. :D
docker-bake.hcl
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a lint config setup for hcl files? We should if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what tool can lint it I just gave it to Claude.
benchmark_icicle: | ||
go run -tags=icicle main.go -cpuprofile cpu.prof -memprofile mem.prof |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, is there any reason you're not using go benchmarks https://pkg.go.dev/testing#hdr-Benchmarks? Is there some feature that didn't work there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just am not sure how to use it. Also the benchmarking code here was used to compare the default CPU implementation vs icicle GPU implementation in terms of speed-up so needed the raw timings.
encoding/icicle/device_setup.go
Outdated
type IcicleDevice struct { | ||
Device icicle_runtime.Device | ||
NttCfg core.NTTConfig[[icicle_bn254.SCALAR_LIMBS]uint32] | ||
MsmCfg core.MSMConfig | ||
FlatFFTPointsT []icicle_bn254.Affine | ||
SRSG1Icicle []icicle_bn254.Affine | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the point that we can have a different ntt/msm config and points loaded on different devices (gpus?)? Do we actually make use of this flexibility? If so, please add comment explaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No there's only going to be a single configuration loaded at startup time. Right now we don't support multiple GPUs for an encoder.
type IcicleDeviceConfig struct { | ||
EnableGPU bool | ||
NTTSize uint8 | ||
// MSM setup parameters (optional) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are they optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reed-solomon encoder only uses NTT so doesn't need the MSM configuration
encoding/icicle/device_setup.go
Outdated
setupErr = fmt.Errorf("could not setup NTT") | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just return the setupErr directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also prob want to wrap the icicleErr to help debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cannot return because I don't think that RunOnDevice func will let me. Looks like this:
func RunOnDevice(device *Device, funcToRun func(args ...any), args ...any) {
go func(deviceToRunOn *Device) {
defer runtime.UnlockOSThread()
runtime.LockOSThread()
originalDevice, _ := GetActiveDevice()
SetDevice(deviceToRunOn)
funcToRun(args...)
SetDevice(originalDevice)
}(device)
}
encoding/icicle/msm_setup.go
Outdated
"github.com/ingonyama-zk/icicle/v3/wrappers/golang/runtime" | ||
) | ||
|
||
func SetupMsm(rowsG1 [][]bn254.G1Affine, srsG1 []bn254.G1Affine) ([]icicle_bn254.Affine, []icicle_bn254.Affine, core.MSMConfig, core.MSMConfig, runtime.EIcicleError) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment explaining what this is doing. Why are we returning 2 MSMConfig for G1 and G2 if they are the same for eg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
G2 MSM is not used right now but if we wanted to accelerate the length proof commitment then we would need it. Going to remove it from the configuration for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a diff-neutral PR just for refactoring? It seems there are code here which doesn't need deep review as they are just existing code moved around?
|
||
func CreateIcicleBackendEncoder(p *Encoder, params encoding.EncodingParams, fs *fft.FFTSettings) (*ParametrizedEncoder, error) { | ||
// Not supported | ||
return nil, fmt.Errorf("icicle backend called without icicle build tag") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May use errors.New("...")
since this has no formatting needs
Yes, if we're happy with this new configuration method then I'll make another PR with that and rebase this one |
RUN go build -tags=icicle -o ./bin/server ./cmd/encoder | ||
|
||
# Start a new stage for the base image | ||
FROM nvidia/cuda:12.2.2-base-ubuntu22.04 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Image size comparison:
ghcr.io/layr-labs/eigenda/encoder latest 781c3866c5dd 6 hours ago 41MB
6129846e8150 6 hours ago 41MB
ghcr.io/layr-labs/eigenda/encoder-icicle latest f530fe9c250d 21 hours ago 760MB
disperser/cmd/encoder/main.go
Outdated
prover.WithGPU(config.ServerConfig.GPUEnable), | ||
prover.WithRSEncoder(rsEncoder), | ||
} | ||
prover, err := prover.NewProver(popts...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to follow options pattern with this change https://golang.cafe/blog/golang-functional-options-pattern.html
|
||
func CreateIcicleBackendProver(p *Prover, params encoding.EncodingParams, fs *fft.FFTSettings, ks *kzg.KZGSettings) (*ParametrizedProver, error) { | ||
// Not supported | ||
return nil, fmt.Errorf("icicle backend called without icicle build tag") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of noicicle.go file is that the icicle backend should only be functional when the icicle build tag is used. This is so we don't need to include the icicle backend files in the traditional encoder backend, otherwise it may break the existing build.
} | ||
|
||
// CommitmentDevice represents a backend capable of computing various KZG commitments. | ||
type KzgCommitmentsBackend interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved commitments to its own backend in case we would like to accelerate them in the future.
@@ -331,3 +457,99 @@ func toUint64Array(chunkIndices []encoding.ChunkNumber) []uint64 { | |||
} | |||
return res | |||
} | |||
|
|||
func (p *Prover) newProver(params encoding.EncodingParams) (*ParametrizedProver, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should be reviewed carefully
encoding/kzg/verifier/verifier.go
Outdated
) | ||
|
||
// VerifierOption defines a function that configures a Verifier | ||
type VerifierOption func(*Verifier) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can go in it's own PR, it's applying the same options pattern refactor to verifier so not super relevant to GPU encoder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks more verbose and is deviating from the convention, what're the main problems it solves here compared to passing in a config struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One argument I would give for this configuration pattern is that it provides a simpler constructor that is extensible and backwards compatible.
For example, with a config struct we would have to modify every client that calls the constructor when we add a new configuration value vs. this pattern you add a WithX
function in the library that sets the field and does whatever validation it needs to. Then clients can optionally update to use that new added configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with a config struct we would have to modify every client that calls the constructor when we add a new configuration value
Not sure about this, if that new field is optional, the caller doesn't need to do anything. If it's required, callers will have to make changes in either case.
My concerns are the boilplate and the scalabilty of keeping adding WithXX, we may have a long list of fields (like in some big structs).
Also it looks it still need to operate on config struct, if the validation runs across multiple fields (eg. invariants defined over multiple fields).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concerns are the boilplate and the scalabilty of keeping adding WithXX, we may have a long list of fields (like in some big structs).
These seem like tractable problems though, here is an example in grpc library:
https://github.com/grpc/grpc-go/blob/66385b28b3fe71a4895f00d581ede0a344743a3f/dialoptions.go#L248
"github.com/consensys/gnark-crypto/ecc/bn254/fr" | ||
) | ||
|
||
type ParametrizedEncoder struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor to make the encoder parametrized based on FFT settings (similar to Prover). Don't think it's too valuable since max time is ~600ms for loading largest fft setting. Am thinking of reverting
@@ -0,0 +1,103 @@ | |||
//go:build icicle | |||
|
|||
package icicle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This folder is icicle backend related setup
@@ -0,0 +1,46 @@ | |||
//go:build icicle | |||
|
|||
package icicle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This folder is icicle related computations for MultiProofs
"github.com/ingonyama-zk/icicle/v3/wrappers/golang/runtime" | ||
) | ||
|
||
type RsIcicleComputeDevice struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is icicle related computations for RS Encode
One thing I'm not too sure how to deal with is testing Icicle/ GPU encoder correctness. The only strategy I can think of is to compare output of default backend and icicle backend. Also if we wanted it to be part of CI, then we need to acquire GPU based github action runners. |
|
||
func CreateIcicleBackendEncoder(e *Encoder, params encoding.EncodingParams, fs *fft.FFTSettings) (*ParametrizedEncoder, error) { | ||
icicleDevice, err := icicle.NewIcicleDevice(icicle.IcicleDeviceConfig{ | ||
GPUEnable: e.Config.GPUEnable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks Icicle will support also CPU now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but it's not as performant as the gnark backend, am trying to look into why
|
||
var evals []fr.Element | ||
g.NttCfg.BatchSize = int32(1) | ||
runtime.RunOnDevice(&g.Device, func(args ...any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the GPU is enabled, then this func will be executed on GPU by the runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes because the device variable will be set to the GPU device here
eigenda/encoding/icicle/device_setup.go
Line 88 in dfe247d
Device: device, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For performance, it looks the main data movement to/from GPU will be the coefficients and the resulting evaluations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For RSEncode yes:
For multiproofs it is also the same (move coefficients, move out proofs) but in addition we need to also transfer SRS Table to compute the MSM. This can be done at SetupMSM at initialization but not sure how much speedup that is.
encoding/kzg/verifier/verifier.go
Outdated
) | ||
|
||
// VerifierOption defines a function that configures a Verifier | ||
type VerifierOption func(*Verifier) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks more verbose and is deviating from the convention, what're the main problems it solves here compared to passing in a config struct?
Fs: fs, | ||
verbose: verbose, | ||
NumRSWorker: runtime.GOMAXPROCS(0), | ||
return &ParametrizedEncoder{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when you say "backend", it looks a software/library that computes RS encoding? I.e. it decoupled device (CPU/GPU) and backend (the software that executes compute on device)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I consider backend as library that implements the cryptographic primitive like NTT, MSM. We have our own implementation of NTT borrowed from protolambda and MSM from gnark.
It could be plausible that there exists libraries that implement the higher level primitives like multiproofs, commitment, RS encode and we can try to make our code more of a frontend that calls those. But for now we focus on the core primitives that give the speedup factor.
Why are these changes needed?
This PR adds support for multiple backends for the encoder. Initially we only have the
default
andicicle
backends, see https://dev.ingonyama.com/icicle/overview for more information on the icicle backend.A backend may include GPU acceleration as an option, see https://dev.ingonyama.com/icicle/install_cuda_backend for icicle's GPU accelerated backend.
In order to use the icicle backend it must be compiled with the
icicle
build tag:This PR also adds a refactor for passing configuration to the encoding library, e.g.
Checks