Tag Archives: hash function

Using sketch_set for reach estimation

A common problem I’ve seen in MapReduce for advertising analytics is calculating the number of unique values in a large data set. Usually each unique value represents a viewer, a cookie, or a user. From a business standpoint, it matters a …
