|
Network coordinate embeddings that use Euclidean distances make the assumption that the triangle inequality is not violated to a great extent by a large fraction of pairs of nodes. The triangle inequality states that for any triangle the length of a given side must be less than the sum of the other two sides but greater than the difference between the two sides, i.e., the sides must be able to form a triangle. When the latencies between node triples cannot form a triangle, they are said to violate the triangle inequality. Nodes with large and frequent violations tend to be the ones with the largest individual prediction error and their existence decreases overall accuracy (see [18] and Section 7.3).
We use a method from Tang and Crovella to examine the severity of
triangle inequality violations [31]. This method
normalizes the severity of each violation, permitting an all-pairs
comparison.
For each node pair, we find the shortest path between the two that
passes through a third node. Thus, for all pairs of nodes i and
j, we find the best alternative path through a node
and
normalize by the latency between i and j:
We examined the cause of the large fraction of pairs with very low
rpl (less than
) in Azureus.
We found that only a few nodes were members of many of these low
rpl pairs.
What distinguished these nodes -- and what was the cause of their
frequent participation in triangle inequality violations -- was that
their delay to non-PlanetLab nodes was atypically large, on the order
of seconds, while their delay to other PlanetLab nodes remained
typical (less than a second).
In effect, this extended one side of the triangles these nodes
participated in: d(i,j) became large while d(i,k) and
d(k,j) did not.
Because PlanetLab nodes that exhibited this behavior were co-located,
we conjecture that the Azureus traffic to non-PlanetLab sites was
being artificially limited at site gateways, while traffic to
PlanetLab nodes avoided this traffic shaping.
Rather than being a construct of the PlanetLab environment, this
effect, leading to bi- or multi-modal latency distributions,
will be the norm for at least some participants in Internet-scale
applications that use well-known ports and consume a large amount of
bandwidth, such as Azureus, because some sites will limit traffic and
some will not.
Like the round trip time spread, Azureus' violations foreshadow a
higher embedding error.
Jonathan Ledlie 2007-02-23