Transputer-based parallel-processing boards

High speed and flexibility make them ideal for number crunching,
industrial control, and imaging applications

BY BRIAN UPPER Parsytec Inc. West Chicago, IL

During the mid-1980s, the computer-chip market saw several new 32-bit
microprocessors coming from leading semiconductor vendors. The Inmos
Transputer chip, however, was unique in that it was the only device that
could simultaneously compute and transfer its results or receive more
input data (see box, “The Transputer chip”). This gave Transputers
powerful number-crunching capabilities and also allowed them to compete
with bus-based systems in many applications.

An alternative to bus solutions
Because of their capabilities, Transputers have found their way into
boards for numerous applications.
Less obvious than the Transputer's use in number crunching is its use in
industrial control applications, real-time turnkey systems, and military
machinery. A Transputer-based system has three advantages over a bus-based
system. The first advantage is that Transputer-based systems are
expandable. Like Transputer-based boards, bus-based systems, such as VME
or Multibus, can also simultaneously employ multiple CPUs and exchange
data among them. But any bus has bandwidth limitations. This bandwidth is
constant regardless of how many CPUs are in the system. The communication
bandwidth available to an individual CPU decreases with the number of
nodes in the system. Furthermore, the more bus masters a configuration
contains, the more the bus-arbitration logic decreases the total available
communication bandwidth. Buses that are now available or under
development show an appreciable improvement in bandwidth. However,
bandwidth improvements do not solve the fundamental problem, and neither
does combining independent bus systems, or nodes, over local area networks
(LANs). Such a combination impedes communication between those systems and
yields a heterogeneous architecture with its associated programming
difficulties.
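
The scaling argument can be made concrete with a short Python sketch that compares the per-CPU share of a shared bus with the aggregate rate of a link-based network. The 5 Mbytes/s contributed by each processor is taken from the figures this article quotes (50 Mbytes/s for 10 processors). The 40-Mbytes/s bus bandwidth and the 5% arbitration loss per additional bus master are illustrative assumptions only, not measured values for any particular bus, and the linear loss model is deliberately crude.

```python
# Illustrative comparison of shared-bus and point-to-point link scaling.
# ASSUMPTIONS (not from the article): BUS_BANDWIDTH and ARBITRATION_LOSS.
# LINK_RATE_PER_NODE follows from the article's 50 Mbytes/s for 10 processors.

BUS_BANDWIDTH = 40.0       # Mbytes/s, hypothetical total for a shared bus
ARBITRATION_LOSS = 0.05    # hypothetical fraction lost per additional master
LINK_RATE_PER_NODE = 5.0   # Mbytes/s contributed by each processor


def bus_share_per_cpu(n_cpus: int) -> float:
    """Bandwidth available to one CPU on a shared bus with n_cpus masters."""
    # The total is fixed; arbitration overhead shrinks it as masters are added.
    usable = BUS_BANDWIDTH * max(0.0, 1.0 - ARBITRATION_LOSS * (n_cpus - 1))
    return usable / n_cpus


def link_network_total(n_cpus: int) -> float:
    """Aggregate communication rate of a link-based network."""
    # Every processor brings its own links, so the total grows with the count.
    return LINK_RATE_PER_NODE * n_cpus


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(f"{n:2d} CPUs: bus share {bus_share_per_cpu(n):6.2f} Mbytes/s per CPU, "
              f"link network {link_network_total(n):6.1f} Mbytes/s total")
```

Under these assumptions the bus share per CPU falls faster than 1/N, while the link network's total rises in proportion to the number of processors.
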
Busless systems, such as a Transputer-based system in which information is
exchanged over point-to-point communication channels, eliminate these
problems. Each processor includes its own communication capabilities, which
contribute to the overall system communication performance. For example, a
10-processor system has an overall system communication rate of
50 Mbytes/s, while a 20-processor system provides approximately
100 Mbytes/s, and so on. This scalability is essential for applications
that must allow for future expansion of a system.

The second advantage of Transputer-based systems over bus-based systems is
locality. Most control
applications are distributed to different locations in a manufacturing
plant. Each location has local tasks combined with an overall system
exchange. Typically, local bus systems are installed and networked over
some sort of real-time LAN like Arcnet. The result is a heterogeneous
system burdened with substantial overhead. With Transputers, the
communication channels on each processor can be used for local
(system-internal) as well as long-distance communication. Fiber-optic
converters for these channels enable communication over distances of up
to one mile. One major difference between this approach and a LAN is that
Transputers communicate directly, processor to processor. Also, the local
communication is identical in hardware, software, and performance to
long-distance communication. This directness reduces overhead in
industrial control applications.

The third advantage
of Transputer-based systems is that the communication architecture allows
the user to build a network of local task solvers that directly correspond
to the real problem. For example, an application may have a sensor that
must be scanned in one location. Then, the resulting data must be
transferred to a second place, such as a control room. The data are read
directly into a Transputer-based A/D converter and passed through the
communication channel between the two locations. The one-to-one mapping of
real-world problems, with their real-world communication paths, to a
corresponding system simplifies software development for such systems.
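
The sensor example above can be sketched in Python as three independent stages joined by channels. This is only an illustration: Python threads and queues stand in for Transputer processes and links, and the 12-bit resolution, 10-V full scale, function names, and sample readings are all hypothetical.

```python
# Sketch of the sensor -> A/D converter -> control room example.
# Each stage is an independent process that talks only through channels,
# mirroring point-to-point links. All names and values are hypothetical.
import queue
import threading

END = None  # sentinel marking the end of the data stream


def sensor(out_ch: queue.Queue, readings: list) -> None:
    """Scan the sensor and forward each raw reading (volts)."""
    for volts in readings:
        out_ch.put(volts)
    out_ch.put(END)


def adc(in_ch: queue.Queue, out_ch: queue.Queue,
        full_scale: float = 10.0, bits: int = 12) -> None:
    """Quantize each reading to an integer code, as an A/D converter would."""
    max_code = (1 << bits) - 1
    while (volts := in_ch.get()) is not END:
        clamped = min(max(volts, 0.0), full_scale)
        out_ch.put(round(clamped / full_scale * max_code))
    out_ch.put(END)


def control_room(in_ch: queue.Queue, log: list) -> None:
    """Collect the converted samples at the remote location."""
    while (code := in_ch.get()) is not END:
        log.append(code)


def run_pipeline(readings: list) -> list:
    """Wire the stages together in the same order as the physical data path."""
    sensor_to_adc = queue.Queue()
    adc_to_control = queue.Queue()
    log = []
    stages = [
        threading.Thread(target=sensor, args=(sensor_to_adc, readings)),
        threading.Thread(target=adc, args=(sensor_to_adc, adc_to_control)),
        threading.Thread(target=control_room, args=(adc_to_control, log)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return log


if __name__ == "__main__":
    print(run_pipeline([0.0, 2.5, 10.0]))
```

Because the stages share no state and exchange data only through their channels, the wiring in `run_pipeline` reads like the plant layout itself: one channel per physical path.
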

The optimum task solutions
Four kinds of Transputer-based boards are available for the tasks
involved in distributed industrial processing systems: development
platforms, I/O interfaces, bus interfaces, and raw processing modules.

Current Transputer products approach these areas in two distinct ways. Inmos
and many other manufacturers use a motherboard/daughterboard concept. This
is the Transputer module, or TRAM, approach. Its strength lies in creating
small evaluation systems that fit into host machines. The other approach
is taken by Parsytec. In this method, separate modules are used for each
function. These modules are housed outside any particular host machine in
a 19-in. enclosure and communicate via RS-422 buffered links.

Inmos uses a common Transputer development solution that begins with a
few processors hosted by a low-cost machine. Transputer code that is
designed for many parallel processors can also run on a single processor
with sufficient memory. The message passing between
processes is identical whether they reside on the same processor or on
separate processors. Therefore, the application can be developed and
tested on a few processors, and later can be run on a larger system with a
corresponding increase in performance. The initial development system is
often the familiar PC supplemented by a Transputer motherboard. A board
developed for this purpose is the B008 from Inmos. The B008 is a
full-length AT-format card compatible with the PC/AT or PC/XT and
interfaced to the PC bus using an Inmos C012 link adapter. It is slotted
for 10 TRAMs in a pipeline configuration. Also from Inmos is the B411
processor TRAM, which can be plugged into a single slot on the B008 to add
a T425 or T805 processor, plus 1 Mbyte of dynamic RAM, to the PC. Up to
eight B411s can be plugged into each B008.

Parsytec's modular approach
offers two PC entry-level development platforms, the TPM-PC and the
MTM-PC. The TPM-PC is similar to the B008, except that it contains one
T805 processor with up to 8 Mbytes of DRAM. It also has three slots to
accept further processors or specialized daughterboards. Parsytec
daughterboards are not pin compatible with Inmos' TRAM standard. The
MTM-PC provides four T805 Transputer sections, each with 1 or 4 Mbytes of
local memory. The Transputer links can be configured into any topology
(see Fig. 1) with the C004 link switch that is also on board.

In most
industrial systems, useful work can only be done if the proper data can be
put into the system. Therefore, a second type of board-level product is
needed to supply these real-world interfaces, such as graphics I/O,
digital or analog I/O, and interfaces to mass-storage devices. TRAM-level
real-world interfaces include the four-slot B421, a general-purpose
IEEE-488 GPIB interface, and the B422, a two-slot SCSI interface that has
a sustained transfer rate of 1.5 Mbytes/s. These TRAM interfaces plug
directly into one of the motherboards, such as the B008, and reside within
the host system. An alternative to TRAMs is Parsytec's system-based
standard of rugged 3U-format boards. One critical real-world interface in
this format is Parsytec's MSC SCSI and floppy-disk interface with a choice
of 4- or 16-Mbyte DRAM to enable implementation of sophisticated buffer
algorithms. It provides a transfer rate of 3.5 Mbytes/s in asynchronous
mode. The TIP-MFG frame grabber supports charge-coupled-device cameras
conforming to the RS-170 or CCIR standard, as well as video cameras, video
recorders, and line cameras with analog output. Further examples of
real-world Transputer interfaces from Parsytec are the TPM-ADC, an analog
data-acquisition board capable of 200,000 samples/s, and the TPM-DIO (see
Fig. 2), offering 16 channels of digital I/O. The TPM-EXP is among the
most flexible real-world interfaces, providing the industrial user with a
prototyping section for application-specific connections, plus system-wide
synchronization using a T225 as a controller. Still more real-world
interfaces from Inmos and Parsytec perform graphics-display,
Ethernet-interface, and signal-processing functions.

Most users already have equipment, from personal computers to
workstations, that they would like to use as the basis for a new
Transputer-based production system. To facilitate this system integration, the
Transputer network needs board-level bridges to the many popular bus
systems. These bridges can exchange data between the Transputer system and
the existing equipment. Interfaces are now available from several vendors
to support most standard buses, including PC, Micro Channel, VMEbus,
S-Bus, Q bus, NuBus, Futurebus, and Multibus I and II. Interfaces to
multitasking host machines will offer much greater functionality. Both the
Parsytec BBK-S4 and the BBK-V4 offer interfaces for up to four users on
the S-Bus and VMEbus, respectively. The Inmos B016 and the Parsytec BBK-V2
also offer full master/slave capability on the VMEbus.

The last group of
products is raw processor boards (CPUs with memory) that can be integrated
in the network. Parsytec's busless TPM, MTM-2, and MTM-4 series offer one,
two, or four processors with up to 32 Mbytes of DRAM for increased
processing performance. Likewise, simple processor TRAMs can be added to
an open slot on any TRAM motherboard. In addition, for those applications
that need a vector signal processor (VSP), both Inmos and Parsytec offer
products that couple a Zoran VSP with a Transputer. The Inmos B420
requires four slots and includes 1 Mbyte of DRAM for the Transputer and
256 Kbytes for the VSP. The Parsytec TPM-SIG (see Fig. 3) provides each
processor (Transputer as well as the VSP) with 4 Mbytes of memory. If
extensive number-crunching capabilities are needed, a Transputer-based
supercomputer can be included in the system, much like another “module.”
The Parsytec GC certainly cannot fit into a 19-in. cabinet, but this
massively parallel-processing machine can easily be integrated into a
system that offers both specialized I/O and high performance.

Application-specific processor boards are available from several other
sources. The EktronBOSS image-processing solution was developed at Kodak's
research labs and is marketed by Ektron Applied Imaging. The EktronBOSS4 is
a system-control interface to the VMEbus. It uses four T805 processors and
provides 16 Mbytes of dual-ported RAM, which can be accessed by both VME
peripherals and the Transputer network. The EktronBOSS16 (see Fig. 4)
adds image-processing performance to the Sun or VME host, with 16
processors and 16 Mbytes of on-board DRAM.

CAPTIONS:

Fig. a. With all the functions necessary for a complete computer already
on board, all the T805 requires is a 5-MHz clock.

Fig. b. As the follow-up to the original Transputer, the T414, Inmos'
T805 offers a fast floating-point processor and a data-transfer rate of 10
Mbytes/s.

Fig. 1. The topology used to implement the Transputer-based system can be
custom-designed for each application to achieve the best balance between
speed and simplicity.

Fig. 2. As an example of a real-world Transputer-based interface, the
TPM-DIO board offers 16 channels of digital I/O.

Fig. 3. The TPM-SIG is a Transputer-based processor board that
incorporates a Zoran vector signal processor and up to 4 Mbytes of memory
for each processor.

Fig. 4. For imaging applications, the EktronBOSS16 interfaces to a Sun or
VME host and incorporates up to 16 processors and 16 Mbytes of DRAM.

The following companies' products are mentioned in this article. For more
information, call the contact or circle the reader service number:

Parsytec Inc. West Chicago, IL Marsha Sidmore 708-293-9500

Inmos, Div. of SGS-Thomson Microelectronics, Inc. Phoenix, AZ Graham
Trickey 602-867-6100

Ektron Applied Imaging Bedford, MA Dave Bellanger 617-275-0475

Box:

The Transputer chip

Each Transputer incorporates 4 Kbytes of memory, an integer unit, an
optional floating-point unit, timers, interrupt channels, and
communication channels, as well as a memory interface with memory refresh
for external dynamic RAM (see Fig. a). The only additional resources
needed to get a Transputer running are an external 5 MHz clock and
additional external memory, if needed. Its single-chip capabilities make
the Transputer very well suited for embedded systems. Today, Transputers
serve as laser-printer emulators and as hard-disk controllers attached
directly to a disk device.

The original Transputer, the T414,
was followed by an enhanced Transputer chip, the T805 (see Fig. b). This
processor's floating-point unit performs at up to 1.5 MFLOPS. Eight
on-chip DMA engines achieve a communication rate of almost 10 Mbytes/s
over the Transputer links. The functionality that a single microprocessor
provides makes it possible to design a 10-chip processor element that
includes several megabytes of external memory per node. The busless
communication made possible by the communication channels allows large and
reliable networks to be built without bus congestion problems. Such
systems are used in large number-crunching applications like Monte Carlo
simulation, fluid flow analysis, and linear analysis. Systems containing
several hundred nodes have been in use for over two years, while systems
with several thousand nodes are currently being built.
