
bloom blocks downloading queue #11201

Merged
merged 14 commits into from
Nov 24, 2023

Conversation

vlad-diachenko
Contributor

Implemented a bloom blocks downloading queue to control the concurrency of downloading blocks from storage.

@vlad-diachenko vlad-diachenko requested a review from a team as a code owner November 10, 2023 14:17
@vlad-diachenko vlad-diachenko enabled auto-merge (squash) November 10, 2023 14:21
Member

@owen-d owen-d left a comment


Took a brief look; letting @chaudum do the lion's share of the review on this one.

Contributor

github-actions bot commented Nov 21, 2023

Trivy scan found the following vulnerabilities:

Comment on lines +154 to +158
// errCh must be buffered with enough capacity for the maximum possible number of errors.
// Otherwise a queue worker could report an error to this channel before the current goroutine does,
// and the current goroutine would then deadlock on its own send, because nothing is reading
// from this channel at that moment.
errCh := make(chan error, len(references))
Contributor


Not sure if I understand that correctly. On line 168 you return after sending the first and only error to errCh. Could you just make it a buffered channel of size 1?

Contributor Author


This channel is also passed to the queue, and the queue workers will report their errors to it as well...
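
To illustrate the deadlock that the buffering guards against, here is a minimal, self-contained sketch (not the PR's code): both the enqueuing goroutine and every queue worker may send one error each, while the receiver stops reading after the first one, so the channel is sized to the number of potential senders.

package main

import "fmt"

// downloadAll mimics the pattern discussed above: errCh is sized to the
// number of potential senders, so no sender can ever block on a channel
// that nobody reads anymore once the first error has been returned.
func downloadAll(references []string) error {
	errCh := make(chan error, len(references))
	doneCh := make(chan struct{}, len(references))

	for _, ref := range references {
		go func(ref string) {
			// hypothetical worker: pretend every download fails
			errCh <- fmt.Errorf("downloading block %q failed", ref)
		}(ref)
	}

	for range references {
		select {
		case err := <-errCh:
			// Return on the first error. The remaining workers can still
			// complete their sends thanks to the buffer, so they do not leak.
			return err
		case <-doneCh:
		}
	}
	return nil
}

func main() {
	fmt.Println(downloadAll([]string{"block-a", "block-b", "block-c"}))
}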

Signed-off-by: Vladyslav Diachenko <[email protected]>
return Block{}, fmt.Errorf("error while period lookup: %w", err)
}
objectClient := b.periodicObjectClients[period]
readCloser, _, err := objectClient.GetObject(ctx, createBlockObjectKey(reference.Ref))
Contributor


I know you'll see this with the next rebase, but the result of this get call is now a compressed tar archive that needs to be downloaded and decompressed, which needs to be handled.

Contributor Author


it's already handled in

directory, err := d.extractBlock(&block, time.Now())
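
For context, the extraction amounts to gunzipping the object-store stream and unpacking the tar entries into the block's working directory. Below is a rough sketch of that step; extractTarGz is a hypothetical helper, not the PR's extractBlock, and the path handling is simplified.

import (
	"archive/tar"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// extractTarGz unpacks a gzip-compressed tar stream into workingDir.
// Note: entry names are only Clean-ed here; real code should also reject
// entries that escape workingDir.
func extractTarGz(r io.Reader, workingDir string) error {
	gzReader, err := gzip.NewReader(r)
	if err != nil {
		return fmt.Errorf("creating gzip reader: %w", err)
	}
	defer gzReader.Close()

	tarReader := tar.NewReader(gzReader)
	for {
		header, err := tarReader.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return fmt.Errorf("reading tar entry: %w", err)
		}

		target := filepath.Join(workingDir, filepath.Clean(header.Name))
		switch header.Typeflag {
		case tar.TypeDir:
			if err := os.MkdirAll(target, 0o755); err != nil {
				return fmt.Errorf("creating directory %s: %w", target, err)
			}
		case tar.TypeReg:
			f, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(header.Mode))
			if err != nil {
				return fmt.Errorf("creating file %s: %w", target, err)
			}
			if _, err := io.Copy(f, tarReader); err != nil {
				f.Close()
				return fmt.Errorf("writing file %s: %w", target, err)
			}
			f.Close()
		}
	}
}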

}()
return blocksChannel, errChannel
// GetBlock downloads a single block from object storage and returns the downloaded block
func (b *BloomClient) GetBlock(ctx context.Context, reference BlockRef) (Block, error) {
Contributor


Asking to understand: is there no use case for fetching multiple blocks at once? I can see the underlying call to the object store is the same; I'm asking from an API design perspective.

Contributor Author


We will download many blocks in parallel using the downloading queue. The compactor might also need to download many blocks in parallel, but I believe we can spin up goroutines in the compactor's code for that. wdyt?
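
For illustration, such a fan-out on the compactor side could look roughly like the sketch below. It reuses the single-block GetBlock API from this PR (BloomClient, BlockRef, and Block are the types introduced here); errgroup, the concurrency limit, and the surrounding function are assumptions, not part of this change.

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// downloadBlocks is a hypothetical caller that downloads many blocks in
// parallel with bounded concurrency, using BloomClient.GetBlock from this PR.
func downloadBlocks(ctx context.Context, client *BloomClient, refs []BlockRef) ([]Block, error) {
	blocks := make([]Block, len(refs))

	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(16) // hypothetical parallelism cap

	for i, ref := range refs {
		i, ref := i, ref
		g.Go(func() error {
			block, err := client.GetBlock(ctx, ref)
			if err != nil {
				return fmt.Errorf("downloading block: %w", err)
			}
			blocks[i] = block
			return nil
		})
	}

	if err := g.Wait(); err != nil {
		return nil, err
	}
	return blocks, nil
}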

@github-actions bot added the type/docs label Nov 24, 2023
Contributor

@chaudum chaudum left a comment


Left a few comments, but nothing that would block merging.

Comment on lines +256 to +257
//TODO implement me
panic("implement me")
Contributor


Suggested change
- //TODO implement me
- panic("implement me")
+ return 0

Comment on lines +261 to +262
//TODO implement me
panic("implement me")
Contributor


Suggested change
- //TODO implement me
- panic("implement me")
+ return true

Comment on lines +266 to +267
//TODO implement me
panic("implement me")
Contributor


Suggested change
- //TODO implement me
- panic("implement me")
+ return 1

}

func (d *blockDownloader) stop() {
_ = services.StopManagerAndAwaitStopped(d.ctx, d.manager)
Contributor


Maybe we should log the error?
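
A minimal sketch of that suggestion, assuming the downloader already carries a go-kit logger (d.logger is an assumption here, as is the exact log message):

import (
	"github.com/go-kit/log/level"
	"github.com/grafana/dskit/services"
)

// stop shuts down the queue's service manager and logs the shutdown error
// instead of silently discarding it.
func (d *blockDownloader) stop() {
	if err := services.StopManagerAndAwaitStopped(d.ctx, d.manager); err != nil {
		level.Error(d.logger).Log("msg", "error stopping block downloader workers", "err", err)
	}
}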

}

func (cfg *DownloadingQueueConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
f.IntVar(&cfg.WorkersCount, prefix+"workers-count", 100, "The count of parallel workers that download Bloom Blocks.")
Contributor


I think we should be more conservative regarding the default value. 10 should be more than sufficient?


func (cfg *DownloadingQueueConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
f.IntVar(&cfg.WorkersCount, prefix+"workers-count", 100, "The count of parallel workers that download Bloom Blocks.")
f.IntVar(&cfg.MaxTasksEnqueuedPerTenant, prefix+"max_tasks_enqueued_per_tenant", 10_000, "Maximum number of tasks in the queue per tenant per bloom-gateway. Enqueuing tasks above this limit will fail with an error.")
Contributor


Same here.
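
For illustration, the registration with more conservative defaults could look like the sketch below; the concrete numbers (10 workers, 1_000 queued tasks per tenant) only illustrate the review suggestion and are not necessarily the values that were merged.

import "flag"

type DownloadingQueueConfig struct {
	WorkersCount              int
	MaxTasksEnqueuedPerTenant int
}

func (cfg *DownloadingQueueConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
	// Conservative defaults per review feedback; illustrative only.
	f.IntVar(&cfg.WorkersCount, prefix+"workers-count", 10, "The count of parallel workers that download Bloom Blocks.")
	f.IntVar(&cfg.MaxTasksEnqueuedPerTenant, prefix+"max_tasks_enqueued_per_tenant", 1_000, "Maximum number of tasks in the queue per tenant per bloom-gateway. Enqueuing tasks above this limit will fail with an error.")
}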

@vlad-diachenko vlad-diachenko merged commit 75cfe59 into main Nov 24, 2023
6 checks passed
@vlad-diachenko vlad-diachenko deleted the vlad.diachenko/bloom-blocks-downloading-queue branch November 24, 2023 10:27
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
implemented bloom blocks downloading queue to control the concurrency of
downloading the blocks from the storage

Signed-off-by: Vladyslav Diachenko <[email protected]>