Ethernet Strikes Again

A comforting constant in an ever-changing world

This is a networking post. Now when I say networking, I am not referring to the social manifestation. No, no, no. Today’s post relates to the evolving, versatile, forever-young Ethernet family of network technologies.

Where does Ethernet sup from its fountain of youth? Who is Ethernet’s Oscar Wilde, penning novels on the technology’s dark bargain to remain youthful and vital even as it ages over many decades? Ethernet is networking’s — no, technology’s — Dorian Gray, but without the moldering, mortifying portrait. Ethernet’s evergreen vibrancy is all the more astounding when one considers that it inhabits tech years, which are probably at least equal to dog years.

How does Ethernet renew itself repeatedly? A large part of the answer, which I will provide shortly, is simple enough to grasp. First, though, let’s discuss how Ethernet resurfaced on my radar this week.

Under the aegis of the Open Compute Project (OCP), a work-stream plan for Ethernet for Scale-Up Networking (ESUN) was announced. (Editorial note: The acronyms abound in techdom, proliferating like tribbles in that episode of the original Star Trek.) In addition to the OCP’s own blog post, as well as various vendor takes on the initiative, Network World published a helpful article that summarizes the impetus and purpose of ESUN.

A crude summary of the rationale behind ESUN would suggest that Ethernet, which already dominates cloud giants’ scale-out datacenter networks, is reinventing itself and putting on some muscle so that it can commandeer scale-up AI deployments. After all, AI is where the growth can be found today, and likely well into the foreseeable future, until a bracing market correction (the more likely scenario) or a catastrophic market scuppering (less likely) calls time on extravagant datacenter frivolity. Follow the money, folks. That’s what the titans of the tech industry do, and so do their infrastructure suppliers.

Growing Into a New Role

There are, of course, several real and substantive technological advancements that ESUN prescribes for Ethernet that will enable the technology to accommodate scale-up AI requirements. Ethernet is tasked, by its buyers and suppliers, with emulating or surpassing the performance attributes of InfiniBand, an unquestionably useful technology but one under the control and ownership of a single provider, Nvidia.

While enterprises might grudgingly accept vendor lock-ins — they have neither the means nor the internal resources to design their own networks or to command the fealty of vast supply chains — hyperscalers are disinclined to dance stiffly to the imperial tune of a sole supplier. Irrespective of its capabilities or features, InfiniBand’s sole-source status meant it was doomed to lose AI pride of place in hyperscale datacenters. Instead, Ethernet, well established and supported by a diverse ecosystem of system vendors and component suppliers, was foreordained for another promotion, even if it initially lacked all the qualifications and would have to grow into its new role.

Then again, Ethernet has always grown into new roles, fending off a succession of competing technologies. It has endured and thrived for decades, obliterating rivals that include Token Ring, FDDI, ARCNET, and ATM (including the short-lived desktop variety). While Fibre Channel and InfiniBand remain extant, they are relegated to niches. Nvidia’s InfiniBand group, formerly Mellanox, had delusions of AI grandeur, but Ethernet’s vast allied forces are administering a firm reality adjustment.

It’s unwise to bet against Ethernet. That’s not because Ethernet has been objectively and technically the best available networking technology — that has not always been true — but because it has a dynamic, well-motivated, and endlessly resourceful ecosystem that includes scores of competing vendors and a comprehensive supply chain. Hyperscalers, the largest buyers of network infrastructure, gravitate toward the broad selection and choice of a competitive market, though they’re less appreciative of the virtues of intense competition in their own markets.

From the Network World article:

ESUN will focus solely on open, standards-based Ethernet switching and framing for scale-up networking—excluding host-side stacks, non-Ethernet protocols, application-layer solutions, and proprietary technologies. The group will expand the development and interoperability of XPU network interfaces and Ethernet switch ASICs for scale-up networks, the OCP stated in a blog: “The Initial focus will be on L2/L3 Ethernet framing and switching, enabling robust, lossless, and error-resilient single-hop and multi-hop topologies.”

The team at Arista Networks, which recently lost some Ethernet business to Nvidia at the AI datacenters of Meta and Oracle (the former a longtime Arista customer), views ESUN as follows:

ESUN will leverage the work of IEEE and UEC for Ethernet when possible, stated Arista’s CEO Jayshree Ullal and chief development officer Hugh Holbrook in a blog post about ESUN. To that end, Ullal and Holbrook described a modular framework for Ethernet scale-up with three key building blocks:
1. Common Ethernet headers for Interoperability: ESUN will build on top of Ethernet to enable the widest range of upper-layer protocols and use cases.
2. Open Ethernet data link layer: Provides the foundation for AI collectives with high-performance at XPU cluster scale. By selecting standards-based mechanisms (such as Link-Layer Retry (LLR), Priority-based Flow Control (PFC) and Credit-based Flow Control (CBFC)), ESUN enables cost-efficiency and flexibility with performance for these networks. Even minor delays can stall thousands of concurrent operations.
3. Ethernet PHY layer: By relying on the ubiquitous Ethernet physical layer, interoperability across multiple vendors and a wide range of optical and copper interconnect options is assured.
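
For readers who want a feel for what credit-based flow control means in practice, here is a toy Python sketch. It is not drawn from the ESUN, UEC, or IEEE specifications, and the class names (Receiver, Sender) and the buffer size are my own illustrative assumptions. It shows only the general idea: a sender transmits while it holds credits, each credit stands for one free buffer slot at the receiver, and credits come back as the receiver drains its buffer, so no frame is ever dropped for lack of space.

```python
from collections import deque


class Receiver:
    """Hypothetical receiver with a fixed number of buffer slots."""

    def __init__(self, slots):
        self.slots = slots
        self.buffer = deque()

    def accept(self, frame):
        # With correct credit accounting this check can never fail.
        assert len(self.buffer) < self.slots, "buffer overrun"
        self.buffer.append(frame)

    def drain(self, count):
        """Process up to `count` frames; return how many slots were freed."""
        freed = min(count, len(self.buffer))
        for _ in range(freed):
            self.buffer.popleft()
        return freed


class Sender:
    """Hypothetical sender that transmits only while it holds credits."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.credits = receiver.slots  # initial grant: one credit per free slot
        self.pending = deque()

    def queue(self, frame):
        self.pending.append(frame)

    def pump(self):
        """Send queued frames until credits run out; return frames sent."""
        sent = 0
        while self.pending and self.credits > 0:
            self.receiver.accept(self.pending.popleft())
            self.credits -= 1
            sent += 1
        return sent

    def return_credits(self, count):
        # In a real link this would arrive as a credit-update message.
        self.credits += count


if __name__ == "__main__":
    rx = Receiver(slots=2)
    tx = Sender(rx)
    for i in range(5):
        tx.queue(f"frame-{i}")

    print("sent:", tx.pump())           # stalls once credits hit zero
    tx.return_credits(rx.drain(1))      # receiver frees a slot, credit returns
    print("sent:", tx.pump())
    print("still queued:", len(tx.pending))
```

The sender stalls rather than drops, which is the point of a lossless fabric: backpressure replaces retransmission. Real implementations track credits per virtual channel and in units of bytes or cells rather than whole frames, details this sketch deliberately ignores.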

Scaling on Up

The scale-up focus of ESUN is designed to consolidate a collection of AI accelerators so that they function like a single AI supercomputer. At the same time, ESUN is designed to retain host-side flexibility and preserve systems design choices. Its features and functions are necessary enhancements to Ethernet, but flexibility and choice are necessarily conferred by the existence of multiple suppliers. Even if InfiniBand possessed everything a network buyer could want, it would still come from a single vendor. Ergo, the answer to hyperscalers’ AI network requirements is not for InfiniBand to evolve, but for Ethernet to draw yet again on its shape-shifting powers of reinvention.

All the features packed into ESUN are intended to eliminate the compromise that hyperscalers and others enter into when they purchase Ethernet network infrastructure rather than InfiniBand. The features are also meant to counter proprietary interconnect and connectivity functionality that Nvidia offers in its Ethernet portfolio. Nvidia has joined the ESUN effort, but that’s probably of necessity, more for defensive purposes than anything else.

You can scoff at Ethernet and say it’s an archaic codger among the fashionable dandies of the technology firmament, but it still looks pretty spry to me.
