Place
and Time: Monday, March 30th, 2009, Piece Hall 320, 29 Oxford Street
Refreshments
at 2:30pm, talk at 2:45
Link prediction is an example of
recent activity in the machine learning community on relational data sets, with
very visible applications such as collaborative recommendation systems (as used
by companies such as Netflix and Amazon). However, the problem differs from
traditional machine learning topics in some basic ways. As a result, it is
unclear how the problem should be formally posed, or if a guarantee of
performance theoretically exists akin to the standard learnability results that
are foundational to machine learning.
We show that under slightly
stronger assumptions than those usually posed, edge prediction can be
formulated in such a way that sample complexity bounds (using the concept of
VC-dimension) apply to the problem. We then examine how the theoretical result
applies to some models and algorithms currently used in the field. Interestingly,
results from extremal graph theory prove handy in analyzing the VC-dimension of
various graph families.