Harvard Theory of Computation Seminar

Algorithms and Analysis of Depth Functions using Computational Geometry

Eynat Rafalin, Tufts University



Place and Time: Monday 4/3, Maxwell-Dworkin 319 Refreshments at 2:30, talk at 2:45.

ABSTRACT

The concept of data depth has been developed over the last decade as a method of multivariate data analysis. In today's information age, with the proliferation of large amounts of high dimensional data, it provides an analysis method that is an attractive alternative to classical statistics: while classical statistical analysis requires a preliminary assumption as to the underlying probability distribution of the data, that affects the analysis, depth based statistics requires no prior assumptions and is robust to outliers. Potential applications of this technique include bioinformatics, clinical data mining, statistical process control and financial risk analysis. Various data-depth-based analysis methods have existed for a few years, but most have not been sufficiently efficient to handle large data sets.

Computational Geometry focuses on the complexity analysis of geometric problems and the design of effective algorithmic solutions. We have used computational geometry techniques to develop more efficient tools for data-depth analysis and to suggest alternative approaches. Examples include a theoretical framework to evaluate any depth function; dynamic algorithms for computing depth contours; and the novel proximity graph depth, a new class of depth functions that, unlike most existing depth functions, can distinguish multimodal data sets.