[Question]false positive rate #36

MainHanzo · 2018-08-29T04:08:26Z

Hello everyone, i am doing some research on the performance of the bloom filter, cuckoo filter and simd-block etc. And when i tried to compare the performance i found that the false positive rate of the cuckoo filter is not mentioned. So I would like to ask how much is the false positive rate of the cuckoo filter, how is the fp rate defined?

dnbaker · 2018-08-29T11:16:00Z

I recommend looking at the manuscript.

The key difference regarding false positive rates is that bloom filters' false positive rate approaches 1 as a function of the number of elements, while a cuckoo filter's error rate caps out at a relatively small rate (often 1-5%) regardless of how full the structure becomes.
This comes at the cost of insertion increasing as the structure becomes more full.

The other disadvantage is that sketches cannot be merged or compared as concise representations of sets the way a bloom filter can, which eliminates a major use case for them.

jbapple-cloudera · 2018-08-29T12:45:22Z

@MainHanzo , when you run bulk-insert-and-query.exe, the false positive rate is the column labeled "ε". When you run conext-table3.exe, the false positive rate is in the row labeled "false positive rate".

MainHanzo · 2018-08-29T15:29:16Z

@jbapple-cloudera @dnbaker Thank you for all your help.
@dnbaker However, I didn't get the meaning of"the sketches", do they refer to the cuckoo filters? and "concise representation of sets the way a bloom filter can". May i ask you explain more of that?
@jbapple-cloudera I have another question: If the fp rate are changing when the elements inserted into the filter, what is the man-set fp rate for? Is that just the fp rate for the very beginning？
I'm sorry, this is from a very beginner. Thank you very much for your patience.

dnbaker · 2018-08-31T13:09:51Z

@MainHanzo Correct. Both cuckoo and bloom filters are sketch data structures, and a filled sketch data structure is often referred to as a sketch.

Both bloom filters and cuckoo filters provide inexact set membership queries. However, bloom filters can approximate set operations by using bitwise & and |s, while merging cuckoo filters is much more expensive.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Question]false positive rate #36

[Question]false positive rate #36

MainHanzo commented Aug 29, 2018

dnbaker commented Aug 29, 2018

jbapple-cloudera commented Aug 29, 2018

MainHanzo commented Aug 29, 2018

dnbaker commented Aug 31, 2018

[Question]false positive rate #36

[Question]false positive rate #36

Comments

MainHanzo commented Aug 29, 2018

dnbaker commented Aug 29, 2018

jbapple-cloudera commented Aug 29, 2018

MainHanzo commented Aug 29, 2018

dnbaker commented Aug 31, 2018