A significant shift in cloud computing architecture is emerging as start-up Drut Technologies introduces its scalable computing platform. The platform is attracting attention from major banks, telecom providers, and hyperscalers.
At the heart of this innovation is a disaggregated computing system that can scale to 16,384 accelerator chips, enabled by the pioneering use of co-packaged optics (CPO) technology.
"We have all the design work done on the product, and we are taking orders," says Bill Koss, CEO of Drut.
The latest building block in the start-up's disaggregated computing portfolio is the Photonic Resource Unit 2500 (PRU 2500), a chassis that hosts up to eight double-width accelerator chips. The chassis also features Drut's interface cards, which use co-packaged optics to link servers to the chassis, link chassis directly to one another, or, for larger systems, connect through optical or electrical switches.
The PRU 2500 chassis supports various vendors' accelerator chips: graphics processing units (GPUs), chips that combine general processing (CPU) and machine learning engines, and field programmable gate arrays (FPGAs).
Drut has been using third-party designs for its first-generation disaggregated server products. More recently, the start-up decided to develop its own PRU 2500 chassis, wanting greater design flexibility and the ability to support planned enhancements.
Koss says Drut designed its disaggregated computing architecture to be flexible. By adding photonic switching, the topologies linking the chassis, and the accelerator chips they hold, can be combined dynamically to accommodate changing computing workloads.
Up to 64 racks, each hosting eight PRU 2500 chassis (64 accelerator chips), can be configured as a 4,096-accelerator disaggregated compute cluster. Four such clusters can be networked together to achieve the full 16,384-chip cluster. Drut refers to its compute cluster concept as the DynamicXcelerator virtual POD architecture.
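The scaling figures follow directly from the stated counts, as a short back-of-the-envelope check shows (the per-chassis, per-rack, and per-cluster numbers below are those quoted by Drut):

```python
# Scaling arithmetic for the DynamicXcelerator virtual POD,
# using the configuration counts quoted by Drut.
chips_per_chassis = 8    # eight double-width accelerators per PRU 2500
chassis_per_rack = 8     # eight chassis per rack
racks_per_cluster = 64   # 64 racks per compute cluster
clusters = 4             # four clusters networked together

chips_per_rack = chips_per_chassis * chassis_per_rack     # 64
chips_per_cluster = chips_per_rack * racks_per_cluster    # 4,096
total_chips = chips_per_cluster * clusters                # 16,384

print(chips_per_rack, chips_per_cluster, total_chips)
```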
The architecture can also be interfaced with an enterprise's existing IT resources such as Infiniband or Ethernet switches. "This set-up has scaling limitations; it has certain performance characteristics that are different, but we can integrate existing networks to some degree into our infrastructure," says Koss.
A 4096-accelerator cluster showing the different interconnect strategies supported.
The PRU 2500 chassis is designed to support the PCI Express 5.0 protocol. The chassis supports up to 12 PCIe 5.0 slots, including eight double-width slots to host PCIe 5.0-based accelerators. The chassis comes with two or four tFIC 2500 interface cards, discussed in the next section.
The remaining four of the 12 PCIe slots can be used for single-width PCIe 5.0 cards or Drut's rFIC 2500 remote direct memory access (RDMA) network cards for optical accelerator-to-accelerator data transfers.
Also included in the PRU 2500 chassis are two large Broadcom PEX89144 PCIe 5.0 switch chips. Each PEX chip can switch 144 PCIe 5.0 lanes, for a total bidirectional bandwidth of 9.2 terabits per second (Tbps).
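The switch bandwidth figure can be reconstructed from the per-lane rate: PCIe 5.0 signals at 32 Gbps per lane in each direction, so 144 lanes give roughly 4.6 Tbps each way, or 9.2 Tbps counting both directions. A minimal arithmetic sketch:

```python
# Back-of-envelope bandwidth of one Broadcom PEX89144 PCIe 5.0 switch chip.
lanes = 144
gbps_per_lane_per_direction = 32  # PCIe 5.0 raw signaling rate (NRZ)

one_direction_tbps = lanes * gbps_per_lane_per_direction / 1000  # ~4.6 Tbps
bidirectional_tbps = 2 * one_direction_tbps                      # ~9.2 Tbps

print(one_direction_tbps, bidirectional_tbps)
```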
The internal architecture of the PRU 2500
The start-up is a trailblazer in adopting co-packaged optics. Drut chose the technology because traditional pluggable modules are too bulky and cannot meet the bandwidth density requirements of its interface cards.
There are two types of interface cards. The iFIC 2500 is added to the host, while the tFIC 2500 is part of the PRU 2500 chassis, as mentioned. Both are half-length PCIe 5.0 cards, and each has two variants: one with two 800-gigabit optical engines supporting 1.6 Tbps of I/O, and one with four engines for 3.2 Tbps. These cards carry PCIe 5.0 lanes, each operating at 32 gigabits per second (Gbps) using non-return-to-zero (NRZ) signaling.
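To put the card variants in perspective, the raw optical I/O can be translated into an equivalent number of PCIe 5.0 lanes. This is illustrative arithmetic only, ignoring encoding and protocol overhead:

```python
# Equivalent raw PCIe 5.0 lane counts for each iFIC/tFIC 2500 variant,
# ignoring encoding and protocol overhead (illustrative only).
lane_rate_gbps = 32      # PCIe 5.0, NRZ, per lane per direction
engine_rate_gbps = 800   # one co-packaged optical engine

two_engine_io = 2 * engine_rate_gbps    # 1,600 Gbps = 1.6 Tbps
four_engine_io = 4 * engine_rate_gbps   # 3,200 Gbps = 3.2 Tbps

lanes_two = two_engine_io // lane_rate_gbps    # 50 lanes' worth of raw bandwidth
lanes_four = four_engine_io // lane_rate_gbps  # 100 lanes' worth

print(lanes_two, lanes_four)
```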
The cards interface to the host server and connect to their counterparts in other PRU 2500 chassis. This way, the server can interface with accelerator resources across multiple PRU 2500s.
Drut uses co-packaged optics engines due to their compact size and superior bandwidth density compared to traditional pluggable optical modules. "Co-package optics give us a high amount of density endpoints in a tiny physical form factor," says Koss.
The co-packaged optics engines include integrated lasers rather than relying on external laser sources. Drut has already sourced the engines from one supplier and is awaiting supply from two others.
"The engines are straight pipes - 800 gigabits to 800 gigabits," says Koss. "We can drop eight lasers anywhere, like endpoints on different resource modules."
How the co-packaged optics cards connect to hardware elements and the role of the photonic switch.
Drut also uses a third-party single-mode fibre photonic switch. The switch can be configured from 32x32 up to 384x384 ports. Drut will say more about the photonic switching aspect of its design later this year.
The final component that makes the whole system work is Drut's management software, which oversees the system's traffic requirements and the photonic switching. The complete system architecture is shown below.
The complete hardware and software architecture of the disaggregated computing system.
Interest in Drut's disaggregated computing system is coming from various enterprise segments including telcos.
"The first order for the 2500 series came from a top 10 international service provider," says Koss. The telcos see Drut's disaggregated system as a way to offer customers access to GPUs in the form of a GPU-as-a-service, part of their cloud computing offerings.
Drut is also working with a big European bank. Banks see the system as a way to provide their data science staff with internal computing resources, including GPUs.
"We are also working with two top 10 cloud guys, one's in Europe, one's in the US, and you can call one of them a hyperscaler," says Koss.
One aspect of the design that has attracted interest is the ability to reconfigure the system should a GPU fail. Having a dynamic disaggregated system also means that resources can be reassigned depending on changing workloads and the time of day. This immediately appeals to IT departments, says Koss.
Koss says being an early adopter of co-packaged optics has proven to be a challenge.
The vendors are still at the stage of ramping up volume manufacturing and resolving quality and yield issues. "It's hard, right?" he says.
Koss says WDM-based co-packaged optics are 18 to 24 months away. Further out, he still foresees photonic switching of individual wavelengths: "Ultimately, we will want to turn those into WDM links with lots of wavelengths and a massive increase in bandwidth in the fibre plant."
Meanwhile, Drut is already looking at its next PRU chassis design to support the PCIe 6.0 standard, and that will also include custom features driven by customer needs.
The chassis could also feature heat extraction technologies such as water cooling or immersion cooling, says Koss. Drut could also offer a PRU filled with CPUs or a PRU stuffed with memory to offer a disaggregated memory pool.
"A huge design philosophy for us is the idea that you should be able to have pools of GPUs, pools of CPUs, and pools of other things such as memory," says Koss. "Then you compose a node, selecting from the best hardware resources for you."
This is still some way off, says Koss, but not too far out: "Give us a couple of years, and we'll be there."