-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement Sliding Window Bloom Filter #715
Conversation
Hi @rueian, I was working on this and maybe is a good plus to the lib. Any feedback is more than welcome I had to borrow some of the internal utilities for this module, but I'm not sure if that's the best way to do it. I made some improvements for the indexes that could also be applied to existing filters, it won't improve that much the execution time because the slowest part is waiting for Redis, but it will reduce the GC pressure when the filter is on a "hot path". |
def7e26
to
04268ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever Idea!
0014c39
to
a976b89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I just wandering:
there are partitioned and discrete bloom filter idea for unbounded input. your choice is overlapping and sliding window bloom filter. do you think partitioned and discrete bloom filter is too expensive?
Hi @rueian @proost, Following the idea of moving the rotation to redis, i found that some times we get a weird behavior with the expirations. In the test case the PX is 500 but when running it on multiples test Redis says that the pttl is ~6s, see the video Cursor_tIzIte8ahI.mp4
Do you have any idea why this happens? After running it several's times I can't reproduce it now but from time to time I saw ~1s of pttl. |
@proost The first implementation that I did was a partitioned filter but partitioned filters can be slower if you need to query multiple partitions to get a result. Then i found this idea of overlapping that is simpler and you only have check the current filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that is very weird operation. I read code again, but i don't know how can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that code is changed. is weird pttl phenomenon gone? i guess same situation happen.
@proost, It continue from time to time, even try with redis-benchmark to discard any possible error from our side and still, I'll prepare a issue on Redis to get more info about it |
This feature enhances the existing Bloom Filter capabilities by allowing time-based item tracking, making it suitable for use cases requiring temporary membership checks. Signed-off-by: Ernesto Alejandro Santana Hidalgo <[email protected]>
|
Hi @nesty92, I can't reproduce the issue with valkey 8. These are the commands I used: watch -n0 ./valkey-cli PTTL a
watch -n0 ./valkey-cli SET a 1 PX 500 NX Which version of Redis are you using? But, anyway, I think we can merge this already. |
Hi @rueian , I was using 7.4 that is the same in the cluster used for testing. Not sure yet why this behavior but there's nothing to do from our side. I will try with valkey later |
if acquiredLock then | ||
redis.call('RENAME', nextFilterKey, filterKey) | ||
redis.call('RENAME', nextCounterKey, counterKey) | ||
redis.call('SET', nextFilterKey, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the script always rotate the filters once if it acquires the lock. But shouldn’t it rotate the filters twice if we access it after idling for a while (> 2*windowHalf)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely correct. If more than half of the window has passed since the last rotation, it means that both the current and next filters are no longer valid since they no longer contain relevant data. In this case, saving the last rotation and reseting the filter if to much time as passed will solve it
Sliding Window Bloom Filter
How It Works
Core Concept
this is done in Redis to warranty an atomic operation.
Redis implementation.
Key structure:
Scripts:
Rotation Coordination: