Min-Wise Independent Permutations

We define and study the notion of min-wise independent families of permutations. We say that a family F a subset of Sn (the symmetric group) is min-wise independent if for any set X a subset of [1,n] and any x an element of X, when pi is chosen at random in F we have that that Pr(min(pi(X)) = pi(x)) = 1/|X|. In other words we require that all the elements of any fixed set X have an equal chance to become the minimum element of the image of X under pi. Our research was motivated by the fact that such a family (under some relaxations) is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents. However, in the course of our investigation we have discovered interesting and challenging theoretical questions related to this concept. We present the solutions to some of them and we list the rest as open problems.