Network Coordinate Research at Harvard

Overview

We are researching how to build stable and accurate large-scale network coordinate systems and what one can do with such systems once they exist. Our work on Stream-Based Overlay Networks (SBON) uses a network coordinate space as the basis for its operator placement optimization; developing a working space in which to perform these optimizations was the initial motivation for this work. Now that we have found that wide-area nodes do form coordinate spaces with strong predictive properties, we are examining which geometric algorithms can be employed on them to solve basic distributed systems problems. For example, given a low-dimensional, imperfect geometric space, what are appropriate algorithms for routing a message to the node nearest an arbitrary point, and what guarantees can we offer that the node we find really is the nearest one? It may also be possible to apply other well-understood geometric algorithms, such as furthest neighbor, cluster analysis, and perhaps even the Travelling Salesman Problem, to sets of nodes with network coordinates. We are currently exploring the extent to which this is possible.

If you would like to read more, we discuss accuracy and stability in our WORLDS and ICDCS papers and outline the connection between geometric algorithms and network coordinates in our IWDDS paper. More recently, we have examined large-scale network coordinate behavior in our "In the Wild" paper and the use of coordinates for routing in our "Wired Geometric Routing" paper. See Publications below.

Pyxida Project / PlanetLab Service

We have created an open source library and application that incorporates the ideas from the publications below. It is called Pyxida (pronounced "peeks-ee-da"), which means "compass" in Greek. We are in the process of migrating our PlanetLab Service to this library. The Azureus BitTorrent Client uses Pyxida.

In the short term, the Pyxida project will provide the following functionality: (a) estimate the latencies between hosts in a distributed system using stable and accurate network coordinates, (b) route to the host whose coordinate is nearest to a point in the abstract coordinate space (useful for distributed gaming applications), and (c) provide a stand-alone application that constructs a network coordinate system.
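
For a sense of what (a) means in practice, the sketch below estimates the round trip time between two hosts as the Euclidean distance between their coordinates. It is a minimal, self-contained example; the Coordinate class here is hypothetical and is not the actual Pyxida API.

    // Minimal sketch of latency estimation from network coordinates.
    // This Coordinate class is hypothetical; it is not the Pyxida API.
    public class Coordinate {
        private final double[] position;   // e.g., 3-dimensional Euclidean coordinates

        public Coordinate(double... position) {
            this.position = position.clone();
        }

        // The estimated round trip time (in ms) between two hosts is the
        // Euclidean distance between their coordinates.
        public double estimatedRttMs(Coordinate other) {
            double sum = 0.0;
            for (int i = 0; i < position.length; i++) {
                double d = position[i] - other.position[i];
                sum += d * d;
            }
            return Math.sqrt(sum);
        }

        public static void main(String[] args) {
            Coordinate a = new Coordinate(12.0, -3.5, 40.2);
            Coordinate b = new Coordinate(-8.1, 22.0, 35.7);
            System.out.printf("estimated RTT: %.1f ms%n", a.estimatedRttMs(b));
        }
    }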

The Pyxida implementation supersedes our original PlanetLab service. We keep an instance of Pyxida running in the harvard_nc slice on PlanetLab; see the Pyxida page for more information on how to query this service.

King Blog Data

We have collected a set of latency measurements between 1895 blogs using the King method. The King method takes the latency between two nodes' local DNS servers as a proxy for their true latency. Thus, this data set does not really contain the latencies between the blogs, but rather the latencies between their local DNS servers. It is nonetheless a reasonable data set if one simply wants the inter-node latencies of 1895 nodes:

Please contact us if you would like the raw measurement data (not just the medians). Unfortunately, they are not timestamped. We are currently producing a timestamped dataset.

You may also be interested in the Meridian King Data Set and the MIT King Data Set.

Data Sets

We include here some of the data sets from our "In the Wild" paper. Please send us an email if you would like additional data sets posted.

18 Day Drift Trace. This data set shows the coordinates of PlanetLab nodes over 18 days in March 2006. The coordinates used three dimensions and latency filters. Their centroid steadily migrates away from the origin (0, 0, 0) of the coordinate system. The tarball contains the following (a parsing sketch appears after the list):

  • a directory filled with files of the form timestamp.data
  • timestamp ranges from 0 to 18 days in 10-minute intervals
  • each line in each file is: node-id coordinate (the node's coordinate during that interval)
  • a separate file contains the node-id -> IP/DNS mapping
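
Here is a minimal sketch of how one might walk the trace and track the centroid's drift away from the origin. The exact layout of each line ("node-id x y z", whitespace separated) is an assumption; adjust the parsing to match the files you download.

    // Sketch: compute the centroid of the coordinates in each timestamp.data file
    // and print its distance from the origin.
    // Assumes each line is "node-id x y z" (whitespace separated); this exact
    // layout is an assumption and may differ from the actual files.
    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.Arrays;

    public class DriftCentroid {
        public static void main(String[] args) throws IOException {
            File dir = new File(args[0]);   // directory of timestamp.data files
            File[] files = dir.listFiles((d, name) -> name.endsWith(".data"));
            if (files == null) return;
            Arrays.sort(files);             // file names are timestamps, so this is roughly time order
            for (File f : files) {
                double[] sum = new double[3];
                int n = 0;
                for (String line : Files.readAllLines(f.toPath())) {
                    String[] tok = line.trim().split("\\s+");
                    if (tok.length < 4) continue;                 // skip malformed lines
                    for (int i = 0; i < 3; i++) sum[i] += Double.parseDouble(tok[1 + i]);
                    n++;
                }
                if (n == 0) continue;
                double distSq = 0;
                for (double s : sum) distSq += (s / n) * (s / n);
                System.out.printf("%s: centroid is %.2f from the origin%n",
                                  f.getName(), Math.sqrt(distSq));
            }
        }
    }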

Four Day Churn Trace. Over four days in September 2006, we collected uptime statistics from 9849 Azureus clients. If you turn this into a form more directly suitable for simulation, please let us know and we can post your version (or a link to it) here.
The data is the result of approximately three million stats messages collected via getRouterUptime() and returned through DHTTransportUDPContact.sendStats() messages. Each line in the data file contains the following, separated by spaces (a parsing sketch appears after the list):

  • First two octets of the client's IP address
  • Autonomous system number of the client, obtained via Perl's Net::Abuse::Utils. There is a dash ('-') if the AS is unavailable.
  • Two letter country code of the client. There is a dash ('-') if the code is unavailable.
  • Number of [timestamp, uptime] pairs that follow.
  • The [timestamp, uptime] pairs themselves. The timestamp is in epoch time; the uptime is in seconds. We eliminated redundant entries from the crawl. We always include the first and last contact with the client. We also include the pairs of timestamps that show a departure and rejoin, i.e., when the current uptime is less than the previous uptime; for these, we output both the last time we heard from the client before it left and the first time we heard from it after it rejoined. The median time between probes for each node was 248 seconds.
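
Here is a minimal sketch of how one might read the trace and count distinct sessions per client (a new session begins whenever the reported uptime is smaller than the previous one). It assumes each line is a single whitespace-separated record, "ip-prefix AS country-code pair-count t1 u1 t2 u2 ..."; the exact tokenization of the pairs is an assumption.

    // Sketch: count distinct sessions per Azureus client in the churn trace.
    // Assumes each line is whitespace separated:
    //   ip-prefix AS country-code pair-count t1 u1 t2 u2 ...
    // with timestamps in epoch time and uptimes in seconds; this exact
    // tokenization of the [timestamp, uptime] pairs is an assumption.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class ChurnSessions {
        public static void main(String[] args) throws IOException {
            try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String[] tok = line.trim().split("\\s+");
                    if (tok.length < 4) continue;
                    String client = tok[0] + " (AS " + tok[1] + ", " + tok[2] + ")";
                    int pairs = Integer.parseInt(tok[3]);
                    long prevUptime = -1;
                    int sessions = 0;
                    for (int i = 0; i < pairs && 4 + 2 * i + 1 < tok.length; i++) {
                        long uptime = Long.parseLong(tok[4 + 2 * i + 1]);
                        // An uptime smaller than the previous one means the client
                        // departed and rejoined, i.e., a new session began.
                        if (prevUptime < 0 || uptime < prevUptime) sessions++;
                        prevUptime = uptime;
                    }
                    System.out.println(client + ": " + sessions + " session(s)");
                }
            }
        }
    }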

We have also collected a histogram of client port usage taken from the same trace.

PlanetLab-to-Azureus Latency Trace (582 MB). This is the raw data used to form the latency matrix in the "In the Wild" paper. Azureus clients ran on 283 PlanetLab nodes for 24 days starting on July 19, 2006. They collected a total of 9.5 x 10^7 latency measurements to 156,658 Azureus clients. 

The data is a collection of files, each named for the PlanetLab host that was running a client. Each line in a file contains the following (a loading sketch appears after the list):

  • IP address of the remote client
  • Number of latency measurements from the PlanetLab host to the client
  • Median round trip time
  • List of measurements:
    • timestamp (in epoch time, as perceived by the recording PlanetLab node)
    • raw round trip time (in milliseconds)
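
Here is a minimal sketch of how one might load the per-host files into an in-memory median-latency matrix. It uses only the first three fields of each line and assumes they are whitespace separated; the raw [timestamp, rtt] list that follows is ignored here.

    // Sketch: build a (PlanetLab host x remote client) map of median RTTs
    // from a directory containing one trace file per PlanetLab host.
    // Assumes each line begins "remote-ip count median-rtt ..." (whitespace
    // separated); this layout is an assumption.
    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.HashMap;
    import java.util.Map;

    public class MedianMatrix {
        public static void main(String[] args) throws IOException {
            // PlanetLab host (file name) -> (remote client IP -> median RTT in ms)
            Map<String, Map<String, Double>> matrix = new HashMap<>();
            File[] files = new File(args[0]).listFiles();
            if (files == null) return;
            for (File f : files) {
                Map<String, Double> row = new HashMap<>();
                for (String line : Files.readAllLines(f.toPath())) {
                    String[] tok = line.trim().split("\\s+");
                    if (tok.length < 3) continue;
                    row.put(tok[0], Double.parseDouble(tok[2]));
                }
                matrix.put(f.getName(), row);
            }
            System.out.println("rows (PlanetLab hosts): " + matrix.size());
        }
    }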


Caveats:

  • No attempt was made to have the PlanetLab clients contact specific remote Azureus clients. Instead, they just passively observed any latency measurements the clients made.
  • These are application-level measurements, and would have been affected by load on the sender and receiver and by timing on the sender.
  • Some of the "remote clients" are PlanetLab nodes.
  • Many, if not most, of the remote clients' IP addresses are dynamically assigned via dialup/cable/DSL. Therefore, the actual machine on the other end may change over time.

Here is this same data set in matrix form. In the paper (Sec 3.4), we augmented this matrix with King measurements and network coordinate estimates: here is the augmented matrix.

Simulator

Here is the source code for our Vivaldi simulator:

And data files to go along with it:

  • 4 node box -- a simple example
  • 4 hour PlanetLab ping trace (24 MB); a subset taken from the middle of the 72 hour trace. We recommend it because it avoids most of the tail effects and covers the period when most of the nodes were up. Format:
    timestamp-in-ms from-id to-id latency
  • id2node; a mapping from the numeric node ids used by the simulator and the traces to their IP addresses.
  • Median latencies; useful for experimenting with the algorithms themselves, as opposed to their behavior over time.
  • 72 hour PlanetLab ping trace (430 MB); useful for looking at general latency behavior.

Note that both ping traces are UDP application-level round trip times.
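
For readers who want a feel for what the simulator computes from a trace line like the one above, here is a sketch of the basic Vivaldi update rule (Dabek et al.): on each measurement, a node moves its coordinate along the direction of the prediction error, weighted by the relative confidence of the two nodes. It omits the latency and update filters and the gravity technique described in our papers, and the constants are typical defaults rather than the simulator's actual settings.

    // Sketch of the basic Vivaldi coordinate update; a simplification of what
    // the simulator does (no latency/update filters, no gravity, no heights).
    import java.util.Random;

    public class VivaldiNode {
        private static final double CC = 0.25;   // coordinate adaptation constant (typical default)
        private static final double CE = 0.25;   // error adaptation constant (typical default)
        private static final Random RAND = new Random();

        double[] coord = new double[3];           // this node's coordinate
        double error = 1.0;                       // this node's error estimate

        // Apply one measurement: rtt (ms) to a remote node with the given
        // coordinate and error estimate.
        void update(double rtt, double[] remoteCoord, double remoteError) {
            if (rtt <= 0) return;                              // ignore degenerate samples
            double dist = distance(coord, remoteCoord);
            double w = error / (error + remoteError);          // confidence weight
            double sampleError = Math.abs(dist - rtt) / rtt;   // relative error of this sample
            error = sampleError * CE * w + error * (1 - CE * w);
            double delta = CC * w;
            double[] dir = unitVector(coord, remoteCoord);     // direction from remote toward us
            for (int i = 0; i < coord.length; i++) {
                coord[i] += delta * (rtt - dist) * dir[i];     // move to reduce the prediction error
            }
        }

        static double distance(double[] a, double[] b) {
            double s = 0;
            for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
            return Math.sqrt(s);
        }

        // Unit vector from b toward a; a random direction if the points coincide.
        static double[] unitVector(double[] a, double[] b) {
            double[] u = new double[a.length];
            double norm = distance(a, b);
            if (norm > 0) {
                for (int i = 0; i < u.length; i++) u[i] = (a[i] - b[i]) / norm;
            } else {
                for (int i = 0; i < u.length; i++) u[i] = RAND.nextGaussian();
                double n = distance(u, new double[u.length]);
                for (int i = 0; i < u.length; i++) u[i] /= n;
            }
            return u;
        }
    }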

A few examples:

  • Create 2-d coordinates (-x 2) for four nodes using the fixed latency file "box"; output the change with every measurement (-g o) into a file 'foo.log'; run for 1000 rounds; each node has three neighbors:
    vivaldi -n 4 -g o -x 2 -O foo -l box -k 3 -r 1000
  • Create 3-d coordinates using the four hour ping trace, which contains 226 nodes; output the coordinates at the end of the four hours into a file 'four.syscoord':
    vivaldi -n 226 -f pings.4hr.stamp -g S -O four

Related Projects

Projects that are interested in network coordinates, latency estimation, or both:

Contact us if you would like a link added.

Publications and Technical Reports

Jonathan Ledlie, Peter Pietzuch, and Margo Seltzer, Proxy Network Coordinates, Imperial College London Technical Report, January 2008 (pdf).

Jonathan Ledlie, Paul Gardner, and Margo Seltzer, Network Coordinates in the Wild, In Proceedings of NSDI 2007, Cambridge, MA, April 2007 (pdf, TR, html, talk slides).

  • A video showing the effect of the neighbor decay technique on Azureus' network coordinates (avi, mp4).
  • A pair of videos showing how the PlanetLab coordinate system migrated with and without the gravity technique. The points shown are PlanetLab nodes. The video without gravity covers 33 days and the video with gravity covers 18 days (each hour is compressed into one second of video). Asian nodes are blue, nodes in the Americas are green, and European nodes are red.

Jonathan Ledlie, Peter Pietzuch, Michael Mitzenmacher, and Margo Seltzer, Wired Geometric Routing, In Proceedings of IPTPS 2007, Bellevue, WA, February 2007 (pdf, talk).

Jonathan Ledlie, Peter Pietzuch, and Margo Seltzer, Stable and Accurate Network Coordinates, In Proceedings of ICDCS 2006, Lisbon, Portugal, July 2006 (abstract, TR-17-05, paper). Videos showing how the filters work on a ten-minute segment on PlanetLab: Raw Coordinates, Latency Filters, Latency and Update Filters.

Peter Pietzuch, Jonathan Ledlie, Michael Mitzenmacher, and Margo Seltzer, Network-Aware Overlays with Network Coordinates, In Proceedings of IWDDS 2006, Lisbon, Portugal, July 2006 (pdf, html).

Peter Pietzuch, Jonathan Ledlie, and Margo Seltzer, Supporting Network Coordinates on PlanetLab, In Proceedings of WORLDS 2005, San Francisco, CA, December 2005 (abstract, html, pdf, talk).