Place and Time: Monday, Nov 10, Maxwell Dworkin 319
Refreshments
at 2:30 pm, talk at 2:45pm
It is known that if a 2-universal hash function H is applied
toelements of a block source (X_1,...,X_T), where
each item X_i has enough min-entropy conditioned on
the previous items, then the output distribution (H,H(X_1),...,H(X_T)) will be
``close'' to the uniform distribution. We provide improved bounds on how much
min-entropy per item is required for this to hold, both when we ask that the
output be close to uniform in statistical distance and when we only ask that it
be statistically close to a distribution with small collision probability. In
both cases, we reduce the dependence of the min-entropy on the number T of
items from 2\log T in previous work to\log T, which we show to be optimal. This
leads to corresponding improvements to the recent results of Mitzenmacher and Vadhan (SODA`08) on the analysis of hashing-based
algorithms and data structures when the data items come from a block source.