Hyperloglog

Architect Within
building-systems
Published in
Feb 4, 2021

--

Facebook Algorithm to count distinct elements — Approximation Algorithm

Simplifies cardinality estimation, or count-distinct problem. Ex count number of views on a webpage. Takes input data and hashes it

A HyperLogLog is a probabilistic data structure. It counts the number of distinct elements in a list.

In order to count values with exact precision you need memory similar to size of unqiue values as the only way to detect if the value is already seen is by comparison with the previous value.

Since memory is a limited resource, doing this becomes problematic when working with large sets of values.

A HyperLogLog solves this problem by allowing to trade memory consumption for precision with a standard error of 2% using only 1.5 kilobytes of memory

--

--