Benchmarking SBC functional density

Many companies tout the functional density of their single board computer (SBC) products, often stating that they have the highest functional density within a class of product. But what does this mean? How do you measure functional density? While performance can be measured with industry-accepted benchmarks, similar benchmarks for functional density do not exist.

SBC functional density is a topic that has intrigued me since my days as SBC product manager at the Motorola Computer Group. Our SBCs were recognized as achieving some of the highest functional density in the industry. It is widely accepted that increased functional density reduces system board count, shrinks system size or physical volume, and simplifies assembly and cabling, leading to reduced overall system cost and improved reliability. The reasoning is straightforward: if you can reduce the component count by putting more functionality on fewer boards, the other metrics improve.

“Increasing functional density is extremely important, and drives many of the innovations we bring to market,” said Richard Kirk, director of Core Computing at GE Intelligent Platforms. “Not only are our customers looking to do more with less – more performance and more functionality in smaller spaces – but, as a manufacturer, it enables us to develop a single design that addresses the broadest possible range of applications and markets.”

Within the scope of VPX, an SBC is commonly defined as a complete functional computer, built on a single printed circuit board. Functionality includes a single or multiple microprocessors, memory, I/O subsystems, blade interconnection, and any mezzanines within a single-slot envelope. All they lack is a power source. For most VPX suppliers, it means putting as many features and functions as possible on the board. Well-designed SBCs have the flexibility to be customized to meet customers’ specific requirements.

“Increased functional density is a key contributor to what we see as the rapid growth in the popularity of the 3U format – especially in VPX – as functionality that once required the extended board real estate of 6U can now be included on a 3U board,” Kirk said. As size, weight, and power (SWaP) considerations are foremost in the thoughts of most target customers, increased functional density is playing a key role in enabling systems to be developed with compelling SWaP characteristics.

Functional density influences

Several inflection points have led to breakthroughs in functional density. Microprocessors and memory follow a nice Moore’s Law curve. I/O and blade interconnections have benefited from great improvements in serial fabric connectivity. ASICs drove significant advances as discrete logic and key functionality could be put into a space-saving ASIC. Eventually processors with integrated serial fabrics and local buses picked up this functionality. FPGAs have changed the model, making it possible to add impressive performance and I/O capability.

Microprocessors

Distinguishing an SBC by processor has been a challenge for SBC designers over the years. Everyone uses the same processors from the same suppliers, primarily Intel and NXP (Freescale). For a time in the 1990s and 2000s, you could include secondary caches and select from a variety of I/O chipsets. But today, Intel processors have become dominant, and the Intel roadmap determines functionality and performance. Added to this is the increasing amount of I/O and serial fabric connectivity being incorporated into the processor, such as PCI Express (PCIe), Ethernet, and Serial RapidIO. Kirk points out that the movement toward PCIe Gen3 and 10/40 GbE fabrics to feed multicore processors is an enabling factor.

In a world driven by industry standards – whether those are VPX or PCIe or Intel processors – it’s a common misconception that all boards that use them must be broadly similar, Kirk said. “Nothing could be further from the case,” he said. “It’s a little like giving Michelangelo and Salvador Dali the same blank canvas, and expecting both to produce the same picture.”

Inflection point: The multicore processor is the greatest innovation that has affected functional density related to processors. Previously, to increase functional density, SBC suppliers would put two or maybe four processors on the SBC, but power and space made this extremely difficult and impractical. Multicore made this easier while at the same time enabling everyone to do the same thing. The playing field remained level, but functional density took a big jump. The improvements will continue for some time as geometries shrink and the number of cores on a die increases.

Memory subsystems

Memory subsystems are one way that suppliers can still differentiate. Capacities are roughly the same, yet creative designs have different ways of making the memory modular and increasing the amount of memory on an SBC.

In years past, SBC designers used custom mezzanines that filled the rest of the slot or even went into a second or third slot. Memory technology changed so fast that designers were continually redesigning memory subsystems to use the most current and cost-effective memories. Many efforts were made to use single in-line memory module (SIMM), dual in-line memory module (DIMM), and similar packages to take advantage of consumer prices, but the limited durability of these modules made them an unpopular choice for many. SBC designers have always struggled with memory life cycles and capacity. Modular memory makes it easier to evolve an SBC product line and maintain the long life cycle required for embedded applications.

During a recent visit to Mercury Systems, Darryl McKenney, VP of Global Engineering Services, took me on a tour of the company’s development labs. He pointed out some interesting packaging that the company is using for memory to achieve even greater densities than are possible with DIMMs and mezzanines while still addressing the needs of rugged operating environments. Newer Mercury Systems designs use a slotted circuit board with no memory connector; instead, the board is the connector. This results in a much denser and more rugged design that still maintains a suitable level of replaceable modularity.

Allied to direct memory attach technology are Mercury’s BGA approaches that enable the most contemporary and powerful Intel Xeon server-class devices to be taken from their native benign LGA data-room environments and deployed in harsh military applications.

“Collectively, efficient cooling and rugged high-density packaging of Xeon devices with vast memory arrays and FPGA low-latency offload engines enables the highest functional density to be deployed right to the tactical edge,” McKenney said.

Inflection point: A major inflection point has not occurred in this sector for many years. Memory follows a classic Moore’s Law curve for performance and capacity. However, a new memory product is showing the potential to replace flash and DRAM. Its developers claim up to 1,000 times the speed and 1,000 times the endurance of NAND flash, along with much lower power consumption and 10 times the density of conventional memory. The new 3D XPoint technology developed by Intel and Micron could dramatically advance the state of memory in the embedded market, especially since it is expected to be priced between DRAM and flash.

I/O subsystems

There are various kinds of I/O, and the mix changes relatively quickly. Connectors for I/O probably have the biggest influence because many types of I/O use large connectors. I/O subsystems offer the greatest potential for establishing a differentiated product to add value and gain a competitive advantage. Specialized, customized, and secure I/O are but some of the many areas where suppliers can shine.

Functional density is important to establishing as many options as possible. Several suppliers bypass the connector issue by extending the physical connectors out through the VPX backplane or through serial or flex cables on the front, placing the burden of the physical connectors on the enclosure or other blades in the backplane.

Inflection point: The move to high-speed serial connections, primarily USB and Ethernet, has had the biggest impact. Gone are the large parallel printer port connectors and serial ports using DB-9 or DB-25 connectors. USB and Ethernet both have much smaller connectors and continue to advance rapidly in performance, and the smaller connectors make it possible to add even more functionality. They can also handle so many different I/O requirements that many specialized I/O devices now utilize Ethernet or USB. New types of I/O are continually being added, making the challenge of measuring the I/O subsystem aspect of functional density even more difficult.

Blade interconnection

The change to serial switched fabrics has made it easier to add board-to-board interconnection capability to an SBC. These fabrics require fewer pins per lane than a parallel bus and are easier to expand or multiplex through switches.

In its early days, VMEbus required a complete chipset and up to 35 percent of a 6U board to implement. I/O capability out of the backplane connectors was severely limited, leaving the front panel as the only option. ASIC and FPGA technology reduced the space requirements, but performance was still limited. Today, serial switched fabrics are built into the processor or processor chipset, so no additional board space is needed for a basic implementation. PCIe, Serial RapidIO, and various speeds of Ethernet are commonly available, with more and faster channels being added every few generations of processors.

Inflection point: Serial switched fabrics have seen a lot of innovation over the past 15 years. Hardy Ethernet has been enhanced at the performance level to keep up with other options. PCIe is widely used and has continued to improve. Serial RapidIO fills the gap, serving target applications well.

Mezzanine modules

As board form factors continue to shrink and connectivity between modules becomes solely dependent on serial switched fabrics, the need for mezzanines to add functionality will diminish or even disappear within the next 10 years.

PCI mezzanine card (PMC) has been a workhorse for several years. Switched mezzanine card (XMC) leverages serial switched fabrics that enable XMC to keep up in bandwidth. FPGA mezzanine card (FMC) illustrates how the role of a mezzanine might be shifting. FMCs are used to provide the physical I/O interface to an FPGA controller that is on the carrier board. Mezzanines provide a way to add unique or custom I/O and allow board designers to fill the slot space in an attempt to get more board real estate to optimize functional density.

FPGAs

FPGAs provide additional processing power and custom I/O in a very small package, as well as the ability to perform runtime configurations, thus changing the whole functional density model. FPGAs will continue to play an increasingly important role, perhaps even reaching the point where most of the SBC features will be provided through FPGA IP. Boards of the future will consist of basically the microprocessor, its support chipset (if needed), the memory subsystem, and an FPGA for the remaining functionality.

Inflection point: The impact of FPGAs has been more gradual over the years. As they have become denser, faster, and lower in cost, their usage has increased. Add to that the significant improvements in development tools, and you start to see why they are so popular.

The cooling challenge

The most common challenge facing SBC designers is board thermal management. Higher performance requires more power, and higher functional density makes the cooling challenge even greater. Many creative cooling strategies have emerged, accelerated over the past few years by the increasing use of power-hungry processors and chipsets.

“The thermal density of the CPU is a constant challenge, as we want to support a large feature set and, of course, have to target conduction-cooled applications that require up to +85 °C at the card edge,” Kirk said. “Here, we have some innovative cooling strategies and technologies that enable us to surmount these challenges.”

Mercury Systems has an agnostic approach to board layouts that makes the cooling challenge easier to address. McKenney said he was overwhelmed with all the board layouts when one day he was inspired to charge his design teams to come up with a better way. He showed me how Mercury Systems uses a common layout for all cooling methods from air convection to liquid cooling, saving a great deal of effort. Because space must be reserved for various cooling techniques, keeping a high level of functional density is even more difficult. McKenney thinks it is a wise trade-off, but it demands that designers be even more creative in other aspects of board design.

As an example, Mercury Systems’ Xeon server-class blades are available as air-cooled modules for lab development. The same solution can be packaged as Air Flow-By (AFB) or Liquid Flow-By (LFB) modules for deployment in rugged open system architecture subsystems (see Figure 1).

Figure 1: Mercury Systems’ cooling technologies and stand-up memory modules

“Both cooling technologies reliably remove the elevated thermal energy such powerful devices release, with the former (AFB) being an efficient low-SWaP air management approach to cooling and the latter (LFB) a redundant air/liquid approach for high-altitude applications,” McKenney said.

GE Intelligent Platforms believes the trick is to constantly evaluate available technologies and drive features in an onboard FPGA – for example, an FPGA with an ARM core that can take on I/O and board management needs. Similarly, general-purpose graphics processing unit (GPGPU) technology allows GE Intelligent Platforms to provide more floating-point performance per slot by leveraging the GPGPU’s inherent parallelism. Another way to gain “virtual” functional density is by supporting hypervisors to allow multiple functions to be performed on the same processor, with the ability to ring-fence them for performance and security reasons.

Beginnings of a benchmark model

Measuring or benchmarking SBC functional density has evaded me since those days as product manager. One day, years ago, a longtime compatriot and VITA Technologies Hall of Famer Shlomo Pri-tal came to my office and said, “Gipper – we have to come up with a way to measure the functional density of our products.” Together, we worked on a method to benchmark our products but never completed the task. It has haunted me ever since, so now I’m making another attempt.

When I asked what others do to measure functional density, I did not get a clear and precise answer. Everyone is driven by the same effort to deliver more capability and functionality to their customers without compromising the ability to upgrade simply and cost-effectively to newer generations of SBC. I did hear that there are some “broad brush” measurements that can be applied, usually performance related, such as GFLOPS per slot, a metric that correlates processing performance with the space/volume of the system. System products are frequently measured for their aggregate interconnect bandwidth, but that does not measure density in the way I was hoping to.
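The GFLOPS-per-slot idea mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name, the board names, and every number are invented for the example and do not describe any real product.

```python
def gflops_per_slot(total_gflops: float, slots_occupied: int) -> float:
    """Return peak floating-point throughput per occupied backplane slot."""
    if slots_occupied <= 0:
        raise ValueError("slots_occupied must be a positive integer")
    return total_gflops / slots_occupied


# Hypothetical boards: (peak GFLOPS, slots occupied). Values are made up.
boards = {
    "Board A (single slot)": (400.0, 1),
    "Board B (two slots with mezzanine)": (600.0, 2),
}

per_slot = {name: gflops_per_slot(g, s) for name, (g, s) in boards.items()}

for name, value in per_slot.items():
    print(f"{name}: {value:.1f} GFLOPS per slot")
```

In this made-up case the two-slot board has more raw performance but scores lower per slot. The metric still says nothing about I/O, memory, or interconnect density, which is why it is only a broad-brush measure.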

Developing a good benchmark for comparing functional density has been very elusive. While several basic measurements can be calculated, there are numerous difficult-to-measure features. With the introduction of FPGA functionality, it becomes even more complicated. Nonetheless, I made an attempt to create a model that lets me compare products to each other and to the past, maybe helping us project what we can expect in the future.

Table 1 highlights the first attempt at a simple model and the metrics used in the calculations. The model is flexible enough to quickly add more measures to the scores. The model is based on a 6U form factor, but the actual form factor is irrelevant unless you are comparing across form factors, in which case you would have to apply an adjustment for the volume of space.
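One possible way to apply the volume adjustment between form factors is sketched below in Python. The model itself does not prescribe a formula, so the linear scaling by slot volume is an assumption, as are the function names and the 25.4 mm (1.0 inch) slot pitch used as a default. The board outlines are the standard Eurocard dimensions.

```python
# Eurocard board outlines in millimetres: (height, depth).
BOARD_OUTLINE_MM = {
    "3U": (100.0, 160.0),
    "6U": (233.35, 160.0),
}


def slot_volume_cm3(form_factor: str, pitch_mm: float = 25.4) -> float:
    """Approximate slot envelope volume; the pitch is an assumed parameter."""
    height, depth = BOARD_OUTLINE_MM[form_factor]
    return height * depth * pitch_mm / 1000.0


def adjust_score(score: float, form_factor: str,
                 reference: str = "6U", pitch_mm: float = 25.4) -> float:
    """Scale a raw score to the reference form factor's slot volume.

    Assumes density scales linearly with volume, which is a simplification.
    """
    ratio = slot_volume_cm3(reference, pitch_mm) / slot_volume_cm3(form_factor, pitch_mm)
    return score * ratio


# A hypothetical 3U board with a raw score of 5.0, expressed in 6U terms.
print(round(adjust_score(5.0, "3U"), 4))
```

Because both form factors share the same depth and pitch here, the adjustment reduces to the ratio of board heights.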

From the calculator, it is possible to chart the data to obtain a graphical representation of functional density. I chose to use a spider chart in lieu of a single score. A single score does not capture the impact of each of the major SBC subsystems. The spider chart creates a dedicated axis for each subsystem. I generated a single score for each subsystem and then normalized it to 10 to fit within a single chart. Chart A (Figure 2) uses linear calculations. Chart B (Figure 3) shows the same results, but with a logarithmic Base 10 scale that better illustrates the enormous gains in functional density over the years. The biggest relative gains have been in I/O and interconnect functional density, driven by advances in serial fabrics.

Figure 2: 6U VPX functional density – linear scale

Figure 3: 6U VPX functional density – Log10 scale
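The normalization to 10 on linear and Log10 scales can be sketched as follows. This Python example is one plausible reading of the approach, not the actual calculator: the formulas, function names, and the raw scores for three board generations are all assumptions made for illustration.

```python
import math


def normalize_linear(scores: dict, top: float = 10.0) -> dict:
    """Scale raw scores so the largest one maps to `top`."""
    peak = max(scores.values())
    return {name: top * value / peak for name, value in scores.items()}


def normalize_log10(scores: dict, top: float = 10.0) -> dict:
    """Scale log10 of raw scores so the largest one maps to `top`.

    Assumes every raw score is at least 1 and the largest is greater than 1.
    """
    if min(scores.values()) < 1 or max(scores.values()) <= 1:
        raise ValueError("raw scores must be >= 1, with a peak above 1")
    peak = math.log10(max(scores.values()))
    return {name: top * math.log10(value) / peak for name, value in scores.items()}


# Hypothetical raw scores for one subsystem axis across three generations.
raw = {"gen1": 1.0, "gen2": 100.0, "gen3": 10000.0}

linear = normalize_linear(raw)
logged = normalize_log10(raw)

for name in raw:
    print(f"{name}: linear={linear[name]:.3f}  log10={logged[name]:.3f}")
```

With these invented numbers, the oldest generation almost vanishes on the linear axis, while the Log10 version spreads the three generations evenly. That is the effect that makes a logarithmic chart easier to read when gains span several orders of magnitude.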

Continued advancement

Functional density is only going to improve because that is what electronics do. What is really going to be interesting to watch is the impact of FPGAs. There will be a day in the not-too-distant future when many of today’s types of SBCs will be replaced by an FPGA SBC where everything is IP in the FPGA. The only components coming off the board will be high-density connectors to branch out to the I/O and interconnects.

I hope my attempts at developing a functional density calculator have piqued your interest. Please feel free to contact me if you wish to learn more of the methods to my madness or to brainstorm ideas of your own.