HyperLogLog

Basic idea

A streaming cardinality estimator that uses O(loglogn)O(\log\log n) bits per register. Hash each element, take the position of the first 1-bit in the low bits as a proxy for log2\log_2 of the (sub-stream) cardinality, then average across mm registers with the harmonic mean.

Key formulas

Resources

Siblings