Evaluating DHT-Based Service Placement for Stream-Based Overlays

Peter Pietzuch, Jeffrey Shneidman, Jonathan Ledlie, Matt Welsh, Margo Seltzer, Mema Roussopoulos


Stream-based overlay networks (SBONs) are one approach to implementing large-scale stream processing systems. A fundamental consideration in an SBON is service placement, which determines the physical location of in-network processing services or operators so that network resources are used efficiently. Service placement consists of two components: node discovery, which selects a candidate set of nodes on which services might be placed, and node selection, which chooses the particular node to host a service. By viewing the placement problem as the composition of these two processes, we can trade off quality and efficiency between them. For example, a bad discovery scheme can still yield a good placement, but only at the cost of an expensive selection mechanism.
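As a minimal sketch of this two-stage view, the following Python fragment composes a discovery step with a selection step. The names `place_service`, `discover`, and `cost`, as well as the latency figures, are hypothetical and serve only as illustration; the cost function stands in for whatever network-usage metric a real system would use.

```python
from typing import Callable, Set

Node = str  # hypothetical node identifier type


def place_service(discover: Callable[[], Set[Node]],
                  cost: Callable[[Node], float]) -> Node:
    """Compose node discovery and node selection.

    discover: returns the candidate set (node discovery).
    cost:     assumed per-node placement cost, lower is better.
    """
    candidates = discover()
    if not candidates:
        raise ValueError("node discovery returned no candidates")
    # Node selection: evaluate every candidate and keep the cheapest.
    # The work done here grows with the size of the candidate set.
    return min(candidates, key=cost)


# Illustrative usage with made-up costs.
example_costs = {"n1": 30.0, "n2": 12.0, "n3": 45.0}
chosen = place_service(lambda: set(example_costs), example_costs.__getitem__)
```

In this sketch, selection must evaluate every candidate, so a large or low-quality candidate set from discovery shifts the work onto selection.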

Recent work on operator placement proposes to leverage routing paths in a distributed hash table (DHT) to obtain a set of candidate nodes for service placement. We evaluate the appropriateness of using DHT routing paths for service placement in an SBON when the aim is to minimize network usage. To do so, we consider two DHT-based algorithms for node discovery, which use either the union or the intersection of DHT routing paths in the SBON, and compare their performance to other techniques. We show that current DHT-based schemes are actually rather poor node discovery algorithms when minimizing network usage. First, an efficient DHT may not traverse enough hops to obtain a sufficiently large candidate set for placement. Second, the union of DHT routes may result in a low-quality set of discovered nodes that requires an expensive node selection algorithm. Finally, the intersection of DHT routes relies on route convergence, which prevents the placement of services with a large fan-in.
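The two discovery schemes can be sketched as plain set operations over routing paths. This is only an illustration under stated assumptions: each path is modeled as a list of hypothetical node identifiers visited when routing from one endpoint toward a common DHT key, and the function names `union_candidates` and `intersection_candidates` are invented here.

```python
from typing import List, Set


def union_candidates(paths: List[List[str]]) -> Set[str]:
    """Every node that appears on at least one routing path."""
    return set().union(*paths)


def intersection_candidates(paths: List[List[str]]) -> Set[str]:
    """Only nodes that appear on every routing path."""
    if not paths:
        return set()
    return set(paths[0]).intersection(*paths[1:])


# Hypothetical routes from three endpoints toward the same key.
paths = [
    ["a", "r1", "r2", "root"],
    ["b", "r3", "r2", "root"],
    ["c", "r4", "root"],
]

union_set = union_candidates(paths)
intersection_set = intersection_candidates(paths)
```

With these made-up routes, the union holds eight nodes, most of which lie on only one path, while the intersection shrinks to the single node where all routes converge. Adding more endpoints can only shrink the intersection further, which illustrates why a large fan-in is problematic for that scheme.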