Avoiding VMEbus board incompatibilities
Despite a well-established standard, VMEbus boards may fail in an
intense multiprocessing environment because of the VMEbus' liberal
specifications
BY JEFF DURST Heurikon Corp. Madison, WI
To address the growing complexity of today's real-time applications,
VMEbus system designers are turning to distributed architectures that pack
large numbers of VMEbus boards into a single backplane (see Fig. 1). In
theory, up to 21 VMEbus-compliant master or slave boards from any mix of
vendors should work together in the same backplane without a hitch. In
practice, however, that is often not the case, particularly in
multiprocessor control systems with multiple bus masters. In such an
environment, failures can occur because of capacitive loading, signal
coupling, and ground shifts. Part of the problem is that the VMEbus
specification leaves much room for interpretation. As a result, even
functionally correct boards often hide marginal design faults that are
exposed only when the boards are subjected to the increased rigors of a multiprocessor
environment. To resolve these issues, system designers and integrators
must ensure beforehand that boards from their various vendors can work
together. This necessitates a painstaking series of tests, performed at
maximum capacity, in a real-world multiprocessor environment.
Susceptible signals
When several different VMEbus boards share a system, a number of VMEbus
signals become susceptible to reliability problems. One
control line particularly susceptible to signal coupling from data lines,
for example, is bus-busy (BBSY*). Sandwiched between data lines 0 and 8,
and wire-ORed through each board to the central arbiter in slot 1, BBSY*
is driven by the current bus master to tell the arbiter that it has
control of the bus.
Signal coupling onto BBSY* can cause the arbiter to decide erroneously
that the current master has relinquished control of the bus (see Fig. 2).
If this occurs while a bus request is pending, the arbiter will grant the
bus to a second master, resulting in a collision on the bus.
Multiprocessor systems are susceptible to such errors because of increased
backplane traffic. Also, increasing the number of masters on the bus
causes a corresponding increase in the number of bus requests. This, in
turn, increases the window during which spurious BBSY* signals can impact
the arbitration cycle.
Floating on-card data lines are a source of VMEbus data line glitches. If
the board is accessed as a slave for a read operation and the VMEbus data
buffers are enabled before the memory drives data onto the local bus, the
indeterminate state of the floating local bus lines is transmitted onto
the VMEbus data lines. This can result in oscillations on the VMEbus data
lines that are coupled onto BBSY*.
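As a rough illustration of how a receiver can ride out this kind of coupled noise, the sketch below models a flip-flop chain of the sort described in the Fig. 2 caption as a software shift register. It is not the circuit from the figure: the function name, the three-stage depth, and the sample waveform are assumptions chosen for the example. BBSY* is active low, so 0 means "bus busy" and 1 means "bus released".

```python
def filter_bbsy(samples, stages=3):
    """Filter sampled BBSY* levels (0 = asserted/busy, 1 = released).

    The bus is reported as released only after `stages` consecutive
    samples read high, so a glitch shorter than that is ignored.
    The stage count is an illustrative assumption.
    """
    shift = [0] * stages          # assume the bus starts out busy
    filtered = []
    for level in samples:
        shift = [level] + shift[:-1]   # clock the new sample into the chain
        filtered.append(1 if all(shift) else 0)
    return filtered


# One-sample noise spike at index 2, genuine release starting at index 6.
raw = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
print(filter_bbsy(raw))
```

The spike at index 2 never reaches the output, while the genuine release is reported two samples late. That added latency is the price of the filtering.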
Spurious activity on an unused ac-fail line (ACFAIL*), often pulled high
through backplane termination, can pose another problem. Because the
signal isn't driven, it is open to signal coupling. Furthermore, ACFAIL* is
typically used to generate a nonmaskable interrupt (NMI) to a board's CPU.
Consequently, spurious signals coupled onto ACFAIL* can cause the
generation of inadvertent NMIs.
Designing reliability problems out
VMEbus board designers can take several simple precautions to avoid
reliability problems. One of the simplest is to avoid unnecessary
transitions on data lines. Also, to guard against high-amplitude
oscillations and other spurious signals, designers should increase their
system's noise margins. One way to do this is to speed signal rise and
fall times by rescinding control signals, that is, actively driving them
to their inactive level before tri-stating them.
Board and backplane designers can minimize ground
shifts by using materials that reduce inductance. At the board level,
designers should use multiple ground planes with substrates constructed of
extra-heavy-duty copper (2 oz or greater). Shielded DIN connectors with
extra ground pins also help. Many of today's new DIN connectors, for
example, offer one or two extra rows of ground pins. Similar precautions
should be taken on the backplane side.
Spurious activity on ACFAIL* can be suppressed by cutting the standard
VMEbus backplane termination and using a stronger pull-up resistor. This
procedure effectively pulls the line to a higher positive value, thereby
increasing the noise margin. Another strategy, which eliminates the need
to alter the backplane, is to disconnect the ACFAIL* line from the CPU and
terminate the line on card through a pull-up network. A third alternative,
which eliminates the need either to alter backplane terminations or to add
local termination, is to actively drive the line. Pull-up resistors can
also keep local bus data lines from floating when the memory isn't driving
data, which prevents them from generating oscillations on the VMEbus data
bus.
Several simple precautions can minimize a board's contribution to bus
loading. One requires minimizing
the number of devices that can drive, or be driven by, the bus at any one
time. For example, designers shouldn't gang low-current drivers to produce
high-current drivers. While this technique works, it also increases
capacitive loading in direct proportion to the number of packages ganged.
The impact of increased bus loading can be reduced by separating turn-on
and turn-off thresholds using hysteresis. This helps prevent small
oscillations from causing erroneous state transitions in the receive
logic.
Many of these reliability problems can be addressed by single-chip VMEbus
interfaces such as the VIC chip. Unfortunately, many low-cost boards must
forgo these relatively expensive devices in favor of stripped-down custom
interfaces.
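The hysteresis technique mentioned above can be illustrated with a small behavioral model. The sketch compares a receiver with a single switching threshold against one with separate turn-off and turn-on thresholds. The function names, the threshold voltages (0.8 V, 1.4 V, and 2.0 V), and the waveform are assumptions made for the example, not values taken from the VMEbus specification.

```python
def receive_single(voltages, threshold=1.4):
    """Receiver with one threshold: any crossing changes the output."""
    return [1 if v > threshold else 0 for v in voltages]


def receive_hysteresis(voltages, v_low=0.8, v_high=2.0, state=1):
    """Receiver with hysteresis: the output goes low only below v_low
    and goes high again only above v_high (illustrative thresholds)."""
    out = []
    for v in voltages:
        if state == 1 and v < v_low:
            state = 0
        elif state == 0 and v > v_high:
            state = 1
        out.append(state)
    return out


def transitions(bits):
    """Count output state changes."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


# A falling then rising edge with small oscillations around 1.4 V.
wave = [3.0, 1.5, 1.3, 1.5, 1.3, 0.5, 1.3, 1.5, 3.0]
print(transitions(receive_single(wave)), transitions(receive_hysteresis(wave)))
```

With this waveform, the single-threshold receiver toggles four times, while the hysteresis receiver registers only the two intended edges.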
The Gauntlet
After all these precautions, designers must still test the board in a
real-world application, with all 21 slots occupied. An example of such a
test has been devised by Heurikon, Madison, WI. Called the Gauntlet, the
test mimics that environment using a special test suite (see Fig. 3).
Before the designer begins the Gauntlet, the boards undergo a standard
series of ROM-based tests to quickly verify their functionality. Among the
resources verified in these tests are RAM, the real-time clock,
counter/timers, DMA, local interrupts, and I/O interfaces such as the
serial ports and SCSI and Ethernet controllers. The tests also verify
VMEbus arbitration, interrupts, and mailbox functions between similar
targets.
The Gauntlet begins with a similar set of ROM-based functional tests,
except that they're performed in a multiprocessor environment. The key,
however, is a brutal succession of arbitration, interrupt,
read-modify-write, and data-transfer tests designed to verify each board's
system-level multiprocessor functionality.
Testing VMEbus arbitration
The most rigorous of the Gauntlet tests, ARBTEST verifies the ability of
multiple boards to simultaneously arbitrate for, gain control of, and
transfer data over the VMEbus. During ARBTEST, each board randomly selects
a slave target from any board in the rack. Once it gains control of the
bus, it writes data (as bytes, words, and long words, both aligned and
unaligned) to a pre-assigned region in the target's local memory. It then
reads the data back and verifies it against the data that was written. All
of the boards in the system perform ARBTEST simultaneously in order to
maximize contention for VMEbus resources.
ARBTEST provides full testing of the A24 and A32 VMEbus
address space using virtually all master/slave scenarios imaginable. To
further stress the system, each board's local bus arbitration capabilities
are also tested by having local CPU and DMA devices, as well as other VMEbus
masters, contend for local bus access.
If the board under test can serve as a slot 1 system controller, then
ARBTEST exercises the full range of arbitration modes supported in the
VMEbus specification. This includes both priority and round-robin
arbitration, with all boards occupying a single bus-request level, and
with all boards distributed equally across the VMEbus' four bus-request
levels.
One of the challenges in performing ARBTEST is ensuring arbitration
fairness so that all the boards can participate equally in the test.
Because the VMEbus' bus-grant lines are daisy-chained, boards located
farther down the chain receive lower priority. Thus, in a heavily loaded
system, additional measures must be taken to ensure that a small number of
boards located close to the central arbiter don't monopolize the bus.
ARBTEST addresses this problem by having each board execute a random
timing loop before it requests the bus. This random backoff improves
fairness by ensuring that the highest-priority boards (closest to slot 1)
don't continually request the bus.
ARBTEST looks for both bus errors and write-read miscompares.
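The write, read-back, and compare loop with random backoff described above can be sketched as a single-process simulation. This is not Heurikon's ARBTEST code. The `Board` class, the 16-step backoff range, the 8-byte region size, the rule that a master never targets itself, and the `corrupt` hook used to inject faults are all assumptions made for illustration. Ties in backoff go to the lowest slot number, standing in for daisy-chain priority.

```python
import random


class Board:
    """Hypothetical board model: one pre-assigned memory region per master."""

    def __init__(self, slot, n_boards, region_size=8):
        self.slot = slot
        self.memory = {m: [0] * region_size for m in range(n_boards)}


def arbtest(n_boards=4, rounds=200, seed=0, corrupt=None):
    """Return (grants per board, number of write-read miscompares)."""
    rng = random.Random(seed)
    boards = [Board(s, n_boards) for s in range(n_boards)]
    grants = [0] * n_boards
    miscompares = 0
    for _ in range(rounds):
        # Each board waits a random time; the shortest wait requests first.
        # On a tie, the lowest slot (closest to the arbiter) wins.
        _, master = min((rng.randrange(16), b.slot) for b in boards)
        grants[master] += 1
        target = rng.choice([b for b in boards if b.slot != master])
        pattern = [rng.randrange(256) for _ in range(8)]
        target.memory[master][:] = pattern        # write over the "bus"
        if corrupt is not None:
            corrupt(target.memory[master])        # optional fault injection
        readback = list(target.memory[master])    # read the region back
        if readback != pattern:
            miscompares += 1
    return grants, miscompares


grants, errors = arbtest()
print(grants, errors)
```

In this model the random backoff is what lets boards far from slot 1 win the bus at all. If every board requested immediately, the tie-break would hand every grant to slot 0.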
Testing VMEbus interrupts
Once a board has passed the Gauntlet's arbitration test, it undergoes a
test of its multiprocessor interrupt
capabilities. The test, known as VMEINTS, evaluates the ability of
multiple boards to simultaneously send and receive 16 interrupt vectors
using all seven VMEbus interrupt request lines. To keep the bus busy and
further increase stress, boards that don't participate in the test
concurrently run modified versions of other tests such as ARBTEST. At the
end of the interrupt test, each participant transmits information about
the number of interrupt vectors that it has sent and received (and on what
level) to a single accumulator board. This board counts the number of
vectors transmitted and received on each interrupt level to make sure that
the two numbers match.
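The accumulator's final check can be sketched as follows. The report format (a dictionary of per-level sent and received counts for each board) and the function name are assumptions made for the example. The article does not describe how the real test encodes this information.

```python
from collections import Counter

LEVELS = range(1, 8)  # the seven VMEbus interrupt request levels


def check_interrupt_totals(reports):
    """Sum per-level counts from all boards and return the levels whose
    sent and received totals differ, as {level: (sent, received)}."""
    sent, received = Counter(), Counter()
    for report in reports:
        sent.update(report["sent"])
        received.update(report["received"])
    return {
        level: (sent[level], received[level])
        for level in LEVELS
        if sent[level] != received[level]
    }


# Two boards exchanging 16 vectors on level 3; one vector is lost.
reports = [
    {"sent": {3: 16}, "received": {}},
    {"sent": {}, "received": {3: 15}},
]
print(check_interrupt_totals(reports))
```

An empty result means every level balanced. Any entry identifies the level on which vectors were lost or duplicated.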
Testing read-modify-write cycles
One of the most important synchronization mechanisms in a multiprocessor
system is the semaphore, which is used to prevent multiple processors from
concurrently accessing shared resources. To support semaphores, the VMEbus
provides an atomic read-modify-write (RMW) cycle that enables a master to
test and set a memory location in a single bus cycle.
To verify the proper operation of semaphores, the Gauntlet includes a
special VMEbus RMW test. During RMW
testing, any combination of boards may be selected as masters or slaves.
Once a master has selected a slave, it arbitrates for, and accesses, the
bus. Next, it checks the status of a semaphore in the slave's local memory
by performing test-and-set operations. When the semaphore is available
(clear), the master accesses (sets) the semaphore, writes its board number
into a region of the slave's local memory (protected by the semaphore) and
updates a send counter. The master also interrupts the slave's local CPU
by writing to a mailbox. The CPU, in turn, increments a receive count
corresponding to the master's board number and clears a flag that
acknowledges to the
master that it has received the interrupt. Once the flag is cleared, the
master clears the semaphore. While the master waits for the flag to be
cleared, it polls the slave's memory to ensure that the board number that
it wrote hasn't changed. A change in this number indicates that another
master has accessed the semaphore and written its board number into memory
before the current master has released the semaphore. If this occurs,
indicating double ownership of the semaphore, the test fails. Because
polling is used to monitor the board number, it's not a foolproof test for
catching double semaphore ownership. As a backup, the RMW test checks the
interrupt receive counter for each board that has accessed the semaphore
during the test. Double semaphore ownership will typically show up as a
higher-than-normal interrupt count for one of the masters.
Once arbitration, interrupt, and RMW functionality have been verified, the
Gauntlet runs a series of performance tests. These tests verify not only
board-to-board transfer rates but also aggregate multiprocessor system
throughput. The tests can be run with any number of boards in any
master/slave combination. They may also be run using any combination of
CPU, DMA, or mixed CPU/DMA modes.
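Returning to the read-modify-write test described earlier in this section, its bookkeeping can be sketched with threads standing in for bus masters. This is an illustrative model only. The `Slave` class and function names are hypothetical, a `threading.Lock` stands in for the indivisibility of the RMW bus cycle, and the mailbox interrupt is reduced to a direct counter increment.

```python
import threading
import time


class Slave:
    """Hypothetical slave board holding a semaphore and a protected region."""

    def __init__(self, n_masters):
        self._cycle = threading.Lock()   # models the indivisible RMW cycle
        self.semaphore = 0               # 0 = clear (available), 1 = set
        self.owner = None                # protected region: owner's board number
        self.receive_count = [0] * n_masters

    def test_and_set(self):
        """Read the old value and set the semaphore in one indivisible step."""
        with self._cycle:
            old = self.semaphore
            self.semaphore = 1
            return old


def master(board, slave, iterations, send_count, failures):
    for _ in range(iterations):
        while slave.test_and_set():      # spin until the semaphore was clear
            time.sleep(0)                # yield to other threads
        slave.owner = board              # write board number under the semaphore
        send_count[board] += 1
        slave.receive_count[board] += 1  # stands in for the mailbox interrupt
        if slave.owner != board:         # polling check for double ownership
            failures.append(board)
        slave.semaphore = 0              # release the semaphore


def rmw_test(n_masters=3, iterations=200):
    slave = Slave(n_masters)
    send_count = [0] * n_masters
    failures = []
    threads = [
        threading.Thread(target=master,
                         args=(b, slave, iterations, send_count, failures))
        for b in range(n_masters)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return send_count, slave.receive_count, failures


sent, received, failures = rmw_test()
print(sent == received, failures)
```

As in the test described above, there are two independent checks: the polling check catches an overwritten board number, and comparing send and receive counts per master acts as the backup.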
CAPTIONS:
Fig. 1. A network of 18 fully populated VMEbus systems provides local
traffic-light control for New York City's 12,000 intersections. The system
was the impetus for the development of the Gauntlet multiprocessor test
suite.
Fig. 2. The digital filter in circuit A gives VMEbus boards immunity from
spurious noise on the BBSY* signal. The series of flip-flops ensures that
the active-low signal stays a logical zero during noise spikes. Circuit B,
however, transmits extraneous noise to the local bus controller, and may
falsely lead the board to detect a free bus and begin an arbitration
cycle.
Fig. 3. The Gauntlet provides a means for designers to test their
VMEbus-based boards in an actual 21-board multiprocessing environment.