Designing with VPX: Ensuring efficient speed, cooling, and interoperability

Of the two VPX mechanical form factors, 3U is very popular in certain applications, and 6U is sometimes necessary to build highly parallel systems with optimal computing density. While air and conduction cooling are often implemented to address cooling challenges, liquid cooling solutions can be effective for designs at the chassis, board, and chip levels. Today’s designs must be standardized for VPX to maintain interoperability and availability of common building blocks, and to ensure the success of this modular architecture now and in the future.

Modular computing architectures have been used for a while, starting with the popular VMEbus for embedded systems in the early 1980s. A modular approach in system design provides great flexibility in computer sizing and helps meet harsh environmental constraints. VMEbus is still successful and viable today due to its large installed base, and its successor architecture, VPX, is now the de facto standard for high-speed interconnect backplanes, especially for harsh environments.

A high-speed board interconnect standard

VPX is a high-speed board interconnect standard that defines modular computers based on interoperable building blocks. In the development of next-generation applications, the VPX architecture is particularly useful for achieving parallel computing with multiple processing cards and for designing redundant architectures for critical applications. The standard also facilitates the implementation of a large number of I/Os when necessary. It provides a unified, standards-based means to promote interoperability among multiple vendors.

VPX is mainly governed by ANSI/VITA 46.0, defining the baseline for mechanical, power, and utilities infrastructure. The substandards VITA 46.x serve to establish implementation rules for the different high-speed protocols. In addition, OpenVPX (ANSI/VITA 65) is dedicated to interconnect topologies over the backplane and pin assignment profiles for the high-speed links.

Among all the proposed high-speed protocols, some embedded computing suppliers such as Kontron have elected to use VPX as a board-to-board interconnect and concentrate on the popular Ethernet and PCI Express (PCIe) protocols. These protocols benefit from a large hardware and software ecosystem, ensuring the effectiveness of the system architecture as well as performance improvement over time. The current state-of-the-art data transfer rates of these two protocols over VPX are Ethernet 10GBASE-KR, signaling at 10.3125 Gbaud per channel, and PCIe Gen3, running at 8 GT/s per lane.
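The raw signaling rates quoted for these protocols are not the same as usable payload rates, because each protocol spends some bits on line coding. The sketch below is a rough illustration, assuming the standard 64b/66b coding of 10GBASE-KR at 10.3125 Gbaud and the 128b/130b coding of PCIe Gen3 at 8 GT/s; the function name `payload_rate_gbps` is hypothetical, and higher-layer protocol overhead (packet headers, flow control) is deliberately ignored.

```python
from fractions import Fraction


def payload_rate_gbps(line_rate, data_bits, coded_bits, lanes=1):
    """Payload rate in Gbps after line-coding overhead, summed over lanes.

    line_rate is the per-lane signaling rate (Gbaud or GT/s). Exact
    fractions are used so that round numbers stay round.
    """
    per_lane = Fraction(str(line_rate)) * Fraction(data_bits, coded_bits)
    return float(per_lane * lanes)


# 10GBASE-KR: one lane, 64b/66b coding (assumed from IEEE 802.3).
kr_lane = payload_rate_gbps(10.3125, 64, 66)

# PCIe Gen3: 128b/130b coding (assumed from the PCIe 3.0 specification).
pcie_lane = payload_rate_gbps(8, 128, 130)

# A four-lane PCIe Gen3 link, per direction.
pcie_x4 = payload_rate_gbps(8, 128, 130, lanes=4)

print(f"10GBASE-KR, one lane : {kr_lane:.3f} Gbps")
print(f"PCIe Gen3, one lane  : {pcie_lane:.3f} Gbps")
print(f"PCIe Gen3, four lanes: {pcie_x4:.3f} Gbps")
```

Under these assumptions, a 10GBASE-KR lane carries exactly 10 Gbps of payload, while a PCIe Gen3 lane carries slightly less than its 8 GT/s signaling rate.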

Kontron has developed a white paper, “High data rates over the VPX infrastructure,” which presents how to achieve 10 Gbps rates on a VPX backplane.

Inside the OpenVPX standard, a nomenclature was put in place – the slot profile and the module profile – to easily identify each module’s high-speed link configuration/pin assignment. For example, a popular 3U slot profile is SLT3-PAY-2F2U-14.2.3 (Figure 1), meaning the module features two fat pipes (one fat pipe is four lanes = four transmit differential pairs + four receive differential pairs) and two ultra-thin pipes (one ultra-thin pipe is one lane). The module profile – for example, MOD3-PAY-2F2U-16.2.3-11 – identifies that the two fat pipes are running a PCIe Gen3 protocol, and the two ultra-thin pipes are running 10GBASE-KR Ethernet protocol.

Figure 1: The Kontron 3U VPX CPU module, based on the Intel Xeon D processor 8-core architecture, is an example of a module with the SLT3-PAY-2F2U-14.2.3 slot profile.
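The profile naming scheme described above lends itself to a small parser. The following is a minimal sketch whose name grammar is inferred only from the two example names in this article, not from the ANSI/VITA 65 text itself; `parse_profile` and `LANES_PER_PIPE` are hypothetical names, and only the two pipe types defined above (F and U) are recognized.

```python
import re

# Lanes per pipe type, as described in the text:
# F = fat pipe (4 lanes), U = ultra-thin pipe (1 lane).
# Other pipe letters are deliberately rejected in this sketch.
LANES_PER_PIPE = {"F": 4, "U": 1}

# Grammar inferred from the article's examples (an assumption):
# <SLT|MOD><3|6>-<role>-<pipes>-<section>[-<variant>]
PROFILE_RE = re.compile(
    r"^(?P<kind>SLT|MOD)(?P<height>[36])-(?P<role>[A-Z]+)"
    r"-(?P<pipes>(?:\d+[A-Z])+)-(?P<section>\d+(?:\.\d+)*)"
    r"(?:-(?P<variant>\d+))?$"
)


def parse_profile(name):
    """Split a slot or module profile name into its fields."""
    match = PROFILE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized profile name: {name!r}")
    pipes = {}
    for count, letter in re.findall(r"(\d+)([A-Z])", match["pipes"]):
        if letter not in LANES_PER_PIPE:
            raise ValueError(f"unsupported pipe type: {letter!r}")
        pipes[letter] = int(count)
    lanes = sum(LANES_PER_PIPE[p] * n for p, n in pipes.items())
    return {
        "kind": "slot" if match["kind"] == "SLT" else "module",
        "height": match["height"] + "U",
        "role": match["role"],
        "pipes": pipes,
        "lanes": lanes,
        # Each lane is one transmit pair plus one receive pair.
        "differential_pairs": 2 * lanes,
        "section": match["section"],
        "variant": match["variant"],
    }


slot = parse_profile("SLT3-PAY-2F2U-14.2.3")
module = parse_profile("MOD3-PAY-2F2U-16.2.3-11")
print(slot["lanes"], slot["differential_pairs"], module["variant"])
```

For the slot profile above, two fat pipes and two ultra-thin pipes add up to 10 lanes, or 20 differential pairs.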

An optical interconnect on the rear of the backplane can be used to link one module to another at very high speed, or to link a module to an external I/O connector on the chassis. It is the objective of the recently released ANSI/VITA 66.4 (Optical Interconnect on VPX – Half Width MT Variant) to standardize this connectivity for 3U modules. In this standard, the last eight lanes, normally made of copper, are replaced by an optical connector that can host 12 or 24 fibers. This paves the way for a potential large increase in bandwidth, but the cost of such assemblies is still significant for now and restricts adoption.

3U and 6U form factor ecosystems

3U VPX modules represent the VPX form factor with the richest ecosystem of functions, backplanes, and power supplies, with the classical dimensions of 160 mm x 100 mm. This form factor is generally enough to implement most processing functions including CPUs, FPGAs, and GPUs, as well as all other required I/O and subsystems such as network, switch, and storage.

However, for larger systems with a high degree of parallelism, it might be impractical to increase the number of slots in 3U to accommodate many modules, as the chassis form factor would be too elongated. Therefore, the VPX 6U form factor is a good alternative and might also be required when migrating from a legacy chassis configuration, for example, migrating a system based on 6U VMEbus to a newer bus architecture. But the ecosystem of supporting functions in 6U tends to be smaller, sometimes suggesting a mix of 6U and 3U auxiliary functions in the same backplane and chassis. An example of a 6U CPU module featuring two Xeon D subsystems on a single board is presented in Figure 2.

Figure 2: A 6U VPX module featuring two Intel Xeon D processor subsystems on a single board.

Sometimes, the choice of 6U is also dictated by the area required to implement high-end processing functions with very high power dissipation and a large amount of supporting memory and memory channels. This situation is expected to become more and more frequent because of the natural trend to multiply the number of cores in the processor die, which demands more parallel memory channels to sustain the data throughput needed to feed all cores. The increase in CPU package size and the shift toward connecting all memories in parallel are leading to a point at which 3U is no longer practical.

Cooling challenges

Every couple of years, the power required by embedded processing functions increases by 10 W or so to keep delivering significant gains in computation performance. Ten years ago, for embedded processors with a dual-core capability, 15 W was all that was needed to deliver fair calculation performance. Today, high-performance embedded computers have to host quad-core, hexa-core, and octo-core processors, whose power dissipation can now typically climb to 55 W. Strategies to cool these hot spots have made progress, but it is still a challenge to keep higher-power processing functions at a temperature compatible with their maximum operating junction temperature, and also compatible with an acceptable circuit failure rate when operating within the average mission profile temperature.

The primary classical means to evacuate the heat in modular computing architectures such as VPX are through air cooling and conduction cooling of the cards. Air cooling, or convection cooling, has the advantage of simplicity, but suffers from the air’s weak ability to transport the heat efficiently, not to mention the problem of dust or contaminants that the airstream could carry from the outside environment.
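The weak heat transport ability of air can be put in rough numbers with the steady-state energy balance P = ρ · V̇ · c_p · ΔT. The sketch below compares the volume flow of air and of water needed to carry away the 55 W mentioned earlier; the 10 K coolant temperature rise and the room-temperature fluid properties are illustrative assumptions, and `volume_flow_l_per_min` is a hypothetical helper name.

```python
def volume_flow_l_per_min(power_w, density_kg_m3, cp_j_per_kg_k, delta_t_k):
    """Coolant volume flow (litres per minute) needed to absorb power_w
    with a coolant temperature rise of delta_t_k.

    Derived from P = density * volume_flow * cp * delta_T.
    """
    m3_per_s = power_w / (density_kg_m3 * cp_j_per_kg_k * delta_t_k)
    return m3_per_s * 1000.0 * 60.0


POWER_W = 55.0     # processor dissipation quoted in the text
DELTA_T_K = 10.0   # assumed allowable coolant temperature rise

# Approximate fluid properties near room temperature (assumptions).
air_flow = volume_flow_l_per_min(
    POWER_W, density_kg_m3=1.2, cp_j_per_kg_k=1005.0, delta_t_k=DELTA_T_K
)
water_flow = volume_flow_l_per_min(
    POWER_W, density_kg_m3=998.0, cp_j_per_kg_k=4186.0, delta_t_k=DELTA_T_K
)

print(f"air  : {air_flow:8.1f} L/min")
print(f"water: {water_flow:8.3f} L/min")
print(f"ratio: {air_flow / water_flow:8.0f}x")
```

With these assumed values, air needs a few hundred litres per minute where water needs less than a tenth of a litre per minute, a ratio of several thousand. The figures only describe bulk heat transport; they say nothing about how easily heat passes from the component surface into the fluid.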

Conduction cooling, on the other hand, conducts the heat from the hot spots to the edges of the board through a metallic frame by clamping the edges of the board to the chassis card cage. In this arrangement, electronic components typically do not see any contaminant, and the board is mechanically more robust because of the heat frame.

However, there are some drawbacks to conduction cooling. For one, the weight of the module is higher, especially when using high thermal conductivity metals such as copper. In addition, the edge surface area available for heat exchange with the chassis is modest, generally inducing a 5 °C to 10 °C temperature rise at the interface. A last difficulty in such solutions is achieving efficient thermal contact between the hot spot heating surface and the metallic heat frame, because of the mechanical dimensional tolerances between the chassis and the board’s hot spot(s).
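The effect of that interface temperature rise on a hot component can be estimated with a simple series thermal-resistance model. This is a minimal sketch, not a design method: the chassis wall temperature, both thermal resistances, and the 105 °C junction limit are hypothetical values chosen for illustration, while the 55 W dissipation and the 5 °C to 10 °C interface rise come from the text.

```python
def junction_temp_c(wall_temp_c, interface_rise_c, power_w,
                    r_frame_c_per_w, r_junction_to_frame_c_per_w):
    """Steady-state junction temperature for a conduction-cooled hot spot.

    Heat flows in series: junction -> heat frame -> board edge -> chassis
    wall. The edge interface is modelled as a fixed temperature rise.
    """
    series_resistance = r_frame_c_per_w + r_junction_to_frame_c_per_w
    return wall_temp_c + interface_rise_c + power_w * series_resistance


WALL_TEMP_C = 71.0      # assumed chassis wall temperature
POWER_W = 55.0          # processor dissipation quoted in the text
R_FRAME = 0.20          # assumed heat frame resistance, degC/W
R_JUNCTION = 0.15       # assumed junction-to-frame resistance, degC/W
TJ_MAX_C = 105.0        # assumed maximum junction temperature

# Evaluate both ends of the 5 to 10 degC interface rise from the text.
best_case = junction_temp_c(WALL_TEMP_C, 5.0, POWER_W, R_FRAME, R_JUNCTION)
worst_case = junction_temp_c(WALL_TEMP_C, 10.0, POWER_W, R_FRAME, R_JUNCTION)

for label, tj in (("5 degC rise", best_case), ("10 degC rise", worst_case)):
    print(f"{label}: Tj = {tj:.2f} degC, margin = {TJ_MAX_C - tj:.2f} degC")
```

Under these assumed values, the interface rise alone consumes a large share of the remaining junction temperature margin, which is why the quality of the edge clamping and of the hot spot contact matters so much.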

There is an ongoing trend to increase the adoption of liquid-cooled solutions to take advantage of the high capacity of most liquids to transport the heat efficiently. At least three possible options maximize the benefits of liquid cooling in modular computers. Kontron supports all of these options.

Illustrated in Figures 3 and 4, liquid circulation can be used to cool the chassis walls (followed by traditional conduction cooling), to cool the heat frame of the modules per the VITA 48.4 (Liquid Flow Thru Applied to VPX) standard, or to directly cool the hot spots.

Figure 3: In this liquid cooling method, the liquid (shown by the blue bands) is circulated through the chassis wall, and standard conduction-cooled modules are used.

Figure 4: A direct liquid cooling method can be used for electronic component hot spots. It can be combined with chassis wall liquid cooling for enhanced results.

These solutions offer significant cooling improvements over traditional air or conduction cooling, but come with the additional burden of generating and controlling liquid circulation. Combining solutions also makes sense: the hottest circuits can be cooled directly using the liquid circulation infrastructure of VITA 48.4, while the heat of the other components is removed through the liquid-cooled chassis wall.

While direct liquid cooling seems the most promising of the available solutions when there are a few, very highly dissipative hot spots, it presents some technical challenges. For instance, given the effective heat transportation capacity of most liquids, the difficulty is not to ensure the liquid is flowing above the hot spot and carrying the heat, but rather to ensure that the calories on the hot spot heating surface will easily “jump” and “spread” into the liquid. Partnering with research institutes and industrial companies, Kontron is actively working to solve this challenge and release the full potential of liquid cooling for embedded computers.

All major options for electronics, mechanicals, and thermal designs have been standardized for VPX to maintain interoperability and availability of common building blocks. This continues to ensure the success of this modular architecture now and in the future while keeping the game open enough to adapt to new technology evolutions.

Serge Tissot is Principal Architect within the technology platforms team at Kontron. He is in charge of digital security and HPEC platforms, and guides the technical directions for the company in this area. Serge is also involved in the innovation and patent process. At the beginning of his career, Serge developed graphics and central processing hardware at the board level. He holds an engineering degree in Electrical Engineering.