Skip to main content

Bloom Filter

· One min read

A Bloom filter is a data structure that is used to determine whether an element is a member of a set with a much higher space and time efficiency than other general algorithms.

The results obtained using a Bloom filter may yield false positive matches, but cannot yield false negative matches. In other words, the query returns results that are "either possibly present or definitely not present." Elements can be added to the set, but cannot be removed (although this can be addressed with an additional "counting" Bloom filter); the more elements added to the set, the greater the likelihood of false positives.

Use Cases

  • Cassandra uses Bloom filters to determine if an SSTable contains data for a specific row.
  • HBase Bloom filters are an effective mechanism for testing whether a StoreFile contains a specific row or row-column cell.
  • With Bloom filters, a website's anti-cheat system can effectively deny access to banned users.
  • Google's Chrome browser once used Bloom filters to identify malicious links.
References: