AWS rolls the dice for faster, more efficient networking

Amazon has developed a new networking topology that's up to a third faster and up to 40 percent more energy efficient than traditional hierarchical network designs. The novel architecture, called Resilient Network Graphs (RNG), is based on random graph theory.

"Traditional networks have always been hierarchical," explained Matt Rehder, VP of global network engineering at AWS, in a recent interview. "They're sort of like an org chart where one network device will talk to the boss network device which will talk to the next boss network device and you gotta go up the chain of command in order to talk to someone else in another department."

There are reasons for that, Rehder said. Hierarchy creates structure and makes data routing rules simpler. "You don't have to know how to talk to everyone in the organization, you just talk to the person above you," he said.

But that creates inefficiencies. The tree-like structure creates points of contention where data flow bottlenecks can occur. At the same time, other parts of the network may be underutilized.

Rehder said that academics in 2012 proposed a random graph topology for networks. But that design, as detailed [PDF] by Amazon researchers, had issues.

The reimagined network structure, dubbed Jellyfish, relied on truly random graphs and called for removing routers from server racks and locating them centrally to simplify cabling. But that approach ended up increasing latency between servers within a rack.

Rehder said no one has been able to put that design into production. "It requires much more complicated routing rules to figure out how to program every device – you can't just program every device to know who everyone is, they have limited memory space," he said.

"And then the other [issue] is that the cabling actually is very complicated. Part of that hierarchy is about simplifying how you build the network in the datacenter and with a random graph it's literally random and you can't just have cable spaghetti all over a datacenter. So you could build it in a lab but you could never really do it at scale."

Nonetheless, said Rehder, AWS has been solving these problems over the past few years. "The only reason we were able to even think about tackling them is that 15-year history of iteratively improving our hardware development and software ownership of our network," he said.

Less random

Inspired by other academic networking research, AWS managed to succeed with random network topology by making it not entirely random. RNG relies on a flat graph where routers interconnect through a mix of deterministic and randomized cabling.

RNG began taking shape three years ago when Seshadhri Comandur, an Amazon Scholar and professor at the University of California, Santa Cruz, answered an internal Slack message from Ratul Mahajan, a fellow Amazon Scholar, datacenter networking expert, and professor at the University of Washington, who was looking for an expert on graph theory and routing.

With help from AWS principal applied scientist Giacomo Bernardi and other colleagues, AWS has become the first company to deploy a flat datacenter network at scale. AWS expects the technology will offer better performance and reliability for Amazon customers while also saving billions of dollars in hardware and reducing CO2 emissions.

RNG was referred to internally as Penrose because the original design involved Penrose tiles. But as the project evolved, AWS settled on Resilient Network Graphs "to reflect the customer benefit and that primarily is a more resilient and performant network," as a company spokesperson put it.

RNG relies on a routing algorithm called Spraypoint to identify node paths and an optical device called a Shufflebox for mixing connections between routers.
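One way to picture a device that mixes connections is as a fixed permutation of ports. The Python sketch below is purely illustrative and is not AWS's design: the function names, the seeded shuffle, and the assumption that each router plugs a contiguous block of fibers into both sides of the box are all hypothetical. It only shows how orderly cabling on the outside can still yield a randomized set of router-to-router links.

```python
from __future__ import annotations

import random


def make_shufflebox(num_ports: int, seed: int) -> list[int]:
    """Return a fixed port mapping: input port i exits on output port mapping[i].

    Hypothetical model: the box is treated as a passive, fixed permutation.
    """
    rng = random.Random(seed)  # seeded so the "wiring" is reproducible
    mapping = list(range(num_ports))
    rng.shuffle(mapping)
    return mapping


def wire_routers(num_routers: int, ports_per_router: int, seed: int) -> set[tuple[int, int]]:
    """Cable routers to the box in neat, consecutive blocks and return the links.

    Assumption: router r owns input ports and output ports
    r * ports_per_router .. (r + 1) * ports_per_router - 1.
    """
    mapping = make_shufflebox(num_routers * ports_per_router, seed)
    links: set[tuple[int, int]] = set()
    for in_port, out_port in enumerate(mapping):
        a = in_port // ports_per_router   # router cabled to this input port
        b = out_port // ports_per_router  # router cabled to this output port
        if a != b:                        # ignore fibers that loop back to the same router
            links.add((min(a, b), max(a, b)))
    return links


if __name__ == "__main__":
    # Eight routers with four fibers each; the links printed depend on the seed.
    print(sorted(wire_routers(num_routers=8, ports_per_router=4, seed=42)))
```

In this toy model the installer only ever plugs fibers into consecutive ports, and all of the randomness lives inside the box. Parallel fibers between the same pair of routers collapse into a single link here, which a real deployment would presumably treat differently.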
Rehder said the Shufflebox is one of the pieces of magic that makes RNG work.

"In a random graph network you don't have that hierarchical structure where you can have all the cables neatly aligned," he explained. "So how do you do that? How do you basically make a random network feel more structured? Well, you have the Shufflebox and the idea is that you plug fiber in here and inside of this it will randomize or basically scramble the fiber. So the ports you plug in get scrambled around and come out on some random port around the other side."

RNG is AWS's new network for its core database servers. Machine learning hardware uses the company's UltraServer network, because the machine learning workloads need full bandwidth.

"The core server networks can be oversubscribed more efficiently," said Rehder. "Everyone's not talking to each other at the same time."

RNG has been rolled out in Ireland, Germany, and Spain, and the plan is to deploy it in the majority of company datacenters by the end of the year. ®
