Intelligent chassis management for mission-critical VPX systems
Because lives are at stake when military electronics fail, the VITA 46.11 System Management on VPX standard facilitates intelligent chassis management and, therefore, heightened reliability.
As the military modernizes from basic control and text-based systems to nearly autonomous and vision-based platforms, the ever-increasing need for more complexity and processing power has forced military electronics to evolve as fast as, or faster than, their commercial counterparts. But unlike the commercial world, where a nonoperational tablet computer is merely an inconvenience, a malfunctioning military system can cost lives. Thus, the old method of “swap it out when it fails” just will not do in the military. Meanwhile, the VPX form factor is rapidly gaining traction in the military realm as a replacement for VME, and intelligent chassis management for mission-critical VPX systems as provided under VITA 46.11 is helping to ensure reliable performance when failure is not an option.
Military embedded systems are becoming more complex and expensive. The "black box" approach of replacing a system piece-by-piece until it works is no longer practical because of the costs associated with having two of everything.
Extreme reliability is therefore demanded from military systems, and while simulation and testing are extremely important, they can't perfectly predict what an actual piece of equipment will be subjected to in the field, or how that equipment will perform on a long-term basis.
To ensure that equipment is not subjected to major environmental conditions that are outside the specification requirements, an intelligent chassis management system is required. In addition, a management system Error Log is indispensable for finding the conditions that led up to any failure, and can provide valuable feedback into the maintenance and design process so that future failures can be avoided.
The VITA Standards Organization (VITA/VSO) has provided the military with this monitoring, control, and logging system in the form of VITA 46.11 – System Management on VPX. This document defines a standard that allows chassis, backplane, board, and system designers to build compatible and interactive monitoring, control, and logging systems. In creating this standard, VITA adapted much from the PICMG 3.X standards.
The widespread adoption of VITA 46.11 will result in lower maintenance costs, quicker repairs, and the greater reliability demanded by military electronics. (Note: VITA 46.11 has not been fully ratified. To date, it is still a draft specification.)
PICMG 3.X is quite complex and very comprehensive, but because it was not designed for military applications, it has several shortcomings when used in a military setting. In commercial applications, the primary focus is to protect the equipment, but in military applications the focus is on protecting the mission. These differing needs affect the decisions that the monitoring and control system makes, but they do not change the type and method of data acquisition.
In both PICMG 3.X and VITA 46.11, the Chassis Management Controller queries PROMs on each board for ID and other pertinent information. The commercial Chassis Management Controller "asks questions and then turns on." The military Chassis Management Controller "turns on and then asks questions."
Protecting the mission with chassis management
In a consumer-focused system, a bootup that takes several minutes might be frustrating, but it makes little real difference. However, in today's wars, a battle may be nearly over in 2 minutes. Fast startup is a must.
VITA 46.11 recognized the need for fast startup and made that process much easier by stating that a system could check only a few major items before issuing the Start command. Detailed monitoring is started only after the system becomes operational. VITA/VSO also realized there are some situations where it is imperative that the system remains operational, regardless of any other condition. Such a condition would exist if a ship were under attack. Missiles must be fired even if the equipment is overheating. To provide for such a scenario, military systems usually employ a technique known as Battleshort. This is a mode that, once turned on, amounts to a "run until you melt" command.
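The "turn on, then ask questions" startup and the Battleshort override can be pictured as two small decisions. The following is a minimal sketch under stated assumptions, not the mechanism defined by VITA 46.11: the names ChassisState, quick_checks_pass, and decide_action, the items checked before Start, and the 85 °C limit are all hypothetical, and recording the event while Battleshort is active is an assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical limit; real limits come from the system's specification.
CRITICAL_TEMP_C = 85.0


@dataclass
class ChassisState:
    power_good: bool       # primary supply reports in-range output
    modules_present: bool  # expected boards answered an ID query
    temp_c: float          # hottest monitored temperature
    battleshort: bool      # operator has enabled Battleshort


def quick_checks_pass(state: ChassisState) -> bool:
    """Check only a few major items before issuing Start (fast startup)."""
    return state.power_good and state.modules_present


def decide_action(state: ChassisState) -> str:
    """Return 'run', 'run_and_log', or 'shutdown' for a running system."""
    if state.temp_c < CRITICAL_TEMP_C:
        return "run"
    # Over-temperature: Battleshort keeps the system running. Logging the
    # event here is an assumption of this sketch.
    if state.battleshort:
        return "run_and_log"
    return "shutdown"
```

In this sketch the detailed monitoring (decide_action) is only consulted after the quick checks have let the system start, which mirrors the ordering described above.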
VITA 46.11 integrates system management from the modules to the system. On each module, an Intelligent Platform Management Controller presents that module to the Chassis Manager over the IPMB link, an I2C-based Intelligent Platform Management Bus. The Chassis Manager in turn presents the entire chassis to the System Manager. The System Manager can be linked to one or more Chassis Managers via a high-speed link such as Ethernet to form an OpenVPX (VITA 65) system. This high-speed link can be encrypted for security.
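The layering described here (module, Chassis Manager, System Manager) can be modeled as nested data structures. This is only an illustrative sketch: the class and field names are hypothetical, and the real IPMB and Ethernet transports are replaced by plain method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Module:
    """A board whose management controller reports its sensor readings."""
    slot: int
    sensors: Dict[str, float]


@dataclass
class ChassisManager:
    """Collects module reports (over IPMB in a real system)."""
    name: str
    modules: List[Module] = field(default_factory=list)

    def report(self) -> Dict[int, Dict[str, float]]:
        # One entry per slot, copied so callers cannot alter module state.
        return {m.slot: dict(m.sensors) for m in self.modules}


@dataclass
class SystemManager:
    """Sees each chassis only through its Chassis Manager."""
    chassis: List[ChassisManager] = field(default_factory=list)

    def snapshot(self) -> Dict[str, Dict[int, Dict[str, float]]]:
        return {c.name: c.report() for c in self.chassis}
```

The point of the sketch is that the System Manager never talks to a module directly; it only aggregates what each Chassis Manager presents.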
The System Manager monitors temperature, voltages, currents, fan speeds, airflow, acceleration, and any other parameters for which there are sensors. When critical thresholds are reached, programmed actions are taken. Parameters should also be logged in real time so that the system's operating history is available for later reference.
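A threshold check with programmed actions and logging might look like the sketch below. The sensor names, the limit values, and the check_readings function are hypothetical and are not taken from VITA 46.11; they only illustrate the monitor, act, and log pattern.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical limits: sensor name -> (low, high) critical thresholds.
LIMITS: Dict[str, Tuple[float, float]] = {
    "temp_c": (-40.0, 85.0),
    "supply_volts": (11.4, 12.6),
    "fan_rpm": (2000.0, 12000.0),
}


def check_readings(
    readings: Dict[str, float],
    actions: Dict[str, Callable[[float], None]],
    log: List[Tuple[float, str, float, bool]],
) -> List[str]:
    """Log every reading; run the programmed action for each out-of-range one."""
    tripped: List[str] = []
    stamp = time.time()  # one timestamp per polling pass
    for name, value in readings.items():
        low, high = LIMITS[name]
        in_range = low <= value <= high
        log.append((stamp, name, value, in_range))
        if not in_range:
            tripped.append(name)
            action = actions.get(name)
            if action is not None:
                action(value)
    return tripped
```

Every reading is logged, not just the violations, so that the conditions leading up to a fault are available afterward.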
Because military contractors may be unfamiliar with the new system capabilities, Dawn VME Products has developed a set of diagnostic hardware tools that can run in conjunction with the System Manager in test or deployed systems. One example is the Intelligent Test Module, the ITM-6973, which is on a 3U board and performs stress tests and facilitates worst-case analyses (Figure 1). Under program control, it can load the system to any percentage of its capacity – including overload – thus proving that the system can handle normal and fault conditions properly while at the same time examining power supply response and temperatures by emulating system boards based on empirical profiles.
Such intelligent test modules are capable of providing dynamic loads to the system that can mimic virtually any board complement. The current loads are FETs, not resistors, which allows currents to be changed precisely on millisecond boundaries. The modules can also generate virtually any amount of heat, which is useful for testing cooling systems. Data log files are stamped in real time. The front panel has an I2C bus connector and a micro-USB connector that permits interfacing to a PC.
VITA 46.11: Power to the processor
Current military systems often require processor power not imagined only a few years ago. In drones, for example, several channels of video are gathered, but sending these images back at high frame rates and resolution would require enormous bandwidth. To make this task easier, the processor must compress the data before transmission. It may also have to encrypt the video stream to prevent access by enemy combatants. Onboard processing is also required to distinguish between moving and inanimate objects. These are just a few examples of the increasing complexity, and therefore the challenges, facing mission-critical VPX systems.
Additionally, large memories, powerful processors, and other high-density semiconductors contain inherent diodes with low reverse breakdown voltages and ultra-small trace insulation with low dielectric breakdown voltages. Voltage differences between rails during power-up might exceed the reverse bias limits of these diodes, or forward bias them and cause latch-up. Once the System Manager has identified the particular problem board, these problems might be solved by writing code that controls the order in which the voltage rails sequence up or down. VITA 46.11 wisely includes an error log as part of the specification, which supports that identification.
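Controlling rail order in software can be sketched as follows. This is a hypothetical illustration, not code from any VITA 46.11 implementation: the rail names and the enable and disable callbacks stand in for whatever power-control interface a real system exposes.

```python
from typing import Callable, List, Sequence


def power_up(
    order: Sequence[str],
    enable: Callable[[str], bool],
    disable: Callable[[str], None],
) -> bool:
    """Enable rails in order; if one fails to come up, back out in reverse."""
    enabled: List[str] = []
    for rail in order:
        if not enable(rail):
            # Undo only the rails that actually came up, newest first.
            for done in reversed(enabled):
                disable(done)
            return False
        enabled.append(rail)
    return True


def power_down(order: Sequence[str], disable: Callable[[str], None]) -> None:
    """Disable rails in the reverse of the power-up order."""
    for rail in reversed(order):
        disable(rail)
```

A per-board order could be stored alongside the board's ID, so that a board known to be sensitive gets its own sequence while others keep the default.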
Error logs, software upgrades, and system failure
The error log produced by the VITA 46.11 System Manager is extremely important. Design engineers and maintenance technicians must be fully aware of events leading to failure. Conditions in the field can never fully be anticipated by any test.
The VITA 46.11 System Manager is capable of recording virtually any condition. Figure 2 indicates connection paths to modules via the Chassis Manager. For example, during certain processing operations, the system might generate more heat than at other times. The System Manager can then be programmed remotely to increase fan speed during those intervals, thus ensuring that the system is always operating within specification, while at the same time power consumption is minimized by lowering fan speed when maximum cooling is not necessary.
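One simple way to trade cooling against power consumption is to scale fan duty cycle with temperature. The sketch below uses a clamped linear ramp; the function name fan_duty and all four default values are hypothetical and would be tuned per system.

```python
def fan_duty(
    temp_c: float,
    t_min: float = 40.0,
    t_max: float = 70.0,
    duty_min: float = 30.0,
    duty_max: float = 100.0,
) -> float:
    """Map temperature to fan duty cycle (%), clamped at both ends."""
    if temp_c <= t_min:
        return duty_min  # cool: run fans at the minimum to save power
    if temp_c >= t_max:
        return duty_max  # hot: full cooling
    # Linear ramp between the two temperature set points.
    span = (temp_c - t_min) / (t_max - t_min)
    return duty_min + span * (duty_max - duty_min)
```

Because the set points are ordinary parameters, they are the kind of values that could be reprogrammed remotely when field data shows that certain operations run hotter than expected.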
The monitoring system can also provide clues on how the system software may be rewritten to lower temperatures and/or power consumption. Often a problem can be fixed by upgrading the System Manager's software. This is certainly an elegant solution because this can be performed without moving the units. New code can be attached to an email and sent to the units' location.
An upgrade in software can warn the System Manager that a particular board is a problem, and if certain conditions are detected, shut it down immediately. Maintenance would then replace the board with a spare. This would be a short-term fix until the particular board vendor fixes the problem.
Moreover, unusual combinations of circumstances such as vibration, temperature, humidity, power cycles, and other parameters over time might cause system failure. It is important to determine whether the cause of a failure lies with the customer or with the vendor. If it lies with the customer, such as exposing a unit to a harsh environment beyond specification limits, then the customer must modify its procedures or its requirements. If it lies with a particular vendor, then that vendor must fix the problem.
VITA 46.11: Standardized real-time system monitoring
With the adoption of VITA 46.11, industry has provided the military with a standardized real-time system monitoring capability that can easily be adapted to different situations by means of a large number of programmable parameters. The similarity of VITA 46.11 to PICMG 3.X should reduce implementation costs and smooth the adoption process for intelligent chassis management in mission-critical VPX systems. This capability will result in lower maintenance costs, quicker repairs, and the greater reliability demanded from military electronics.
Charles Linquist is Chief Technical Officer at Dawn VME Products. He has 30 years of experience in mechanical, electrical, and software design with telecom, commercial, and military OEMs. He presently designs MIL-Spec-compliant enclosures, backplanes, and intelligent chassis management systems to enable military rugged, high-performance computing applications. He can be contacted at firstname.lastname@example.org.
Dawn VME Products