PLD architectures
Different devices suit different jobs
BY DR. OM P. AGRAWAL
Advanced Micro Devices, Programmable Logic Products Div., Sunnyvale, CA
Design changes are almost inevitable during the prototype stages of
product development. Design errors, incomplete specifications, changing
marketing requirements, and new ideas all lead to changes up to the last
minute. These create the need for flexible, fast, dense, and ever-cheaper
programmable logic devices (PLDs). However, choosing the best PLD for a
given application can be tricky, because of the wide variety of types
available. Understanding the characteristics and capabilities of each type
is therefore essential.
Advantages of PLDs
PLDs offer significant advantages over other logic
solutions. Table 1 compares development time, cost, and architectural
flexibility tradeoffs among the various logic alternatives–standard
components, PLDs, gate arrays, standard cells, and full-custom solutions.
Alone among the alternatives, PLDs offer faster time to market, with
minimal design risks, at very little cost penalty. In the past few
years, PLD density has increased from 100 to more than 5,000 usable gates.
Speed has improved from 45-50 to 5 ns (in seven generations) for bipolar
programmable array logic (PAL) devices and from 90 to 7.5 ns (in nine
generations) for CMOS PAL devices. Price has declined from 25 to
30 cents per gate to less than a penny per gate. Package size for PLDs has
increased from the 20-pin DIP to pin grid arrays (PGAs) with more than 175
pins and plastic quad flatpacks (PQFPs) with over 160 pins. Technology
choices have shifted from bipolar fuse-link technology to various CMOS
technologies–EPROM, EEPROM, static RAM, and low-impedance interconnect
antifuse-based devices. Architecture choices have expanded from simple
PAL/PLA devices to denser E/EE PLDs and higher-density field-programmable
gate arrays (FPGAs). The number of device types has increased to more than
400. Design software has grown from simple assembler-based tools to
powerful integrated packages running on multiple platforms. Also,
third-party software vendors and programming support vendors have emerged.
Low-density PLDs
Low-density PLDs encompass monolithic block-based
structures in 20- to 44-pin packages, with density ranging from 100 to 500
gates. The architecture of low-density PLDs (fewer than 500 gates) consists
of an AND-OR structure or a NAND-NAND, NOR-NOR multilevel logic structure,
with an array of logic and I/O macrocells and I/O pins. Low-end PLDs are
made in both bipolar and CMOS processes. Bipolar fuse-link has been the
technology of choice for the highest-performance low-density PAL devices.
Speeds for bipolar 16xx/20xx PAL devices have improved from 45 to 4.5 ns
over a period of 12 years. The delay of 16xx PAL devices will soon be less
than 4 ns. Speed for CMOS low-density PAL devices has improved from 90 to
7.5 ns. Although CMOS is now one generation behind, the gap is closing
quickly. While a large number of low-density PLD architectures have
appeared, simple PAL architectures in 20- to 24-pin packages (16xx/20xx,
16V8/20V8, 22V10 PAL devices) have become de facto industry standards. A
PAL device (see Fig. 1) has a programmable AND plane and a fixed OR plane
in a two-level logic array. Programmable logic array (PLA) devices have
both a programmable AND plane and a programmable OR plane. The basic
AND-OR structure makes PLDs (PALs and PLAs) natural for implementing logic
equations in Boolean sum-of-products form. Other features include
programmable I/O pins, flexible output-enable and bidirectional I/O,
programmable output polarity, flexible register configurations, flexible
clocking schemes, buried registers, and miscellaneous features like
accessibility, controllability, testability, and observability. Table 2
illustrates various basic features of a PAL device. The four most
important components of a PAL device are the AND plane, the OR plane, the
output storage elements, and the I/O pins. The AND plane in a PAL device
connects array inputs to the AND gates (product terms) to form both logic
and control functions. Logic product terms are used for logic functions,
and control product terms are used for control functions like
output-enable control, reset, preset, preload, and observability. The
size of the AND plane is determined by the number of inputs (both true and
complement) and product terms. The OR plane determines the number of OR
outputs, the number of product terms per output, whether product terms are
distributed evenly or unevenly among the outputs, and the coupling of the OR-gate
outputs with the storage elements. In a typical PAL architecture, the
outputs of the AND gates are connected to fixed OR gates. The major
limitations of a standard PAL device's AND-OR plane are the number of inputs
for the AND gates, the number of AND gates, the number of ORed outputs,
and the number of AND gates per output. While first-generation PAL
devices had seven or eight logical product terms per output and a maximum
of eight OR outputs, some higher-density devices have a much larger AND-OR
plane and more outputs. Also significant, the product-term distribution
per output may be fixed, variable, steered, or shared. The architecture
of output storage elements includes:
* Nature of outputs.
* Type of flip-flops used for storage elements.
* Structure/organization of outputs.
* Clocking flexibility and feedback flexibility.
In a PAL
device, the output or outputs can be configured to be either sequential
(registered) or combinatorial, and to be active high or active low. The
flip-flops used for output storage elements can be edge-triggered D-type
flip-flops; J-K, S-R, or T flip-flops; or transparent latches. In general,
flip-flops are used for clocked synchronous systems, whereas latches are
good for asynchronous logic applications. D-type flip-flops are the
easiest to design with. Although J-K flip-flops are the most flexible,
both flip-flop inputs must be driven from the array inputs. As a result,
most industry-standard PAL devices use D-type flip-flops only. However, most
complex PLDs include more flexible macrocells. The structure of the
output cells determines whether the storage elements form a single bank (possibly
with a common clock) or multiple banks with separate clocks. The feedback
path determines the coupling of the storage element to the I/O pin. The
feedback can be from the combinatorial output, registered output, or the
I/O pins, and it can be either a single line or multiple lines. The I/O
pins may be dedicated input pins, dedicated output pins, or dynamically
controllable bidirectional I/O pins. While the first-generation PAL
devices had a limited number of dedicated input pins, few output pins, and
few programmable I/O pins, later devices often have universal I/O
structures. A PAL device can execute any logic function as long as the
design requirements do not exceed the number of inputs, outputs, product
terms, and other logic functions (registers, clocks, polarity, etc.) that
are available in the device. Figure 2 shows the architecture of
first-generation 16L8/R8/20R8 combinatorial/registered PAL devices. Figure
3 shows the output macrocell structure of the PAL22V10 device. The
macrocell concept, introduced first in the 22V10, has become a de facto
industry standard. These output macrocells, associated with I/O pins,
increase the regularity and design flexibility of PAL devices. They allow
any or all of the I/O pins to have either registered or combinatorial
outputs of either polarity. Programmable output polarity allows the
designer to program either the true or complementary version of a logic
term, whichever makes the most efficient use of the chip's AND array. Also
associated with each macrocell is an individual output-enable control
term. Macrocells with individual I/O-enable allow designers to tailor each
chip's architecture to an application, using combinations of registered
and combinatorial inputs and outputs. Figure 4 shows the I/O macrocell
for a second-generation CMOS electrically erasable PAL device–the
PAL29MA16. This I/O macrocell offers versatile programmable input/output
structures, multiple clock choices (a pin or product-term clock with
programmable polarity), a configurable input or output register/latch
structure, flexible output-enable control, and flexible dual feedback for
a buried register. Unlike the 22V10, where only two architecture cells
configure the macrocell, the 29MA16 has up to eight or nine architecture
cells with a potential of 512 different output configurations.
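The AND-OR structure and the output macrocell described above can be sketched in a few lines of Python. This is a minimal behavioral model, not any vendor's fuse map or macrocell logic: the function names, the dictionary encoding of a product term, and the two-cell macrocell (register select plus polarity) are assumptions chosen for illustration.

```python
from itertools import product

def product_term(connections, inputs):
    """AND together the literals selected in the programmable AND plane.

    connections maps a signal name to True (true input connected) or
    False (complement connected); signals left out are not connected.
    """
    return all(inputs[name] == wanted for name, wanted in connections.items())

def or_output(terms, inputs):
    """Fixed OR of the product terms wired to one output."""
    return any(product_term(term, inputs) for term in terms)

def macrocell(or_value, stored, registered, active_low):
    """Hypothetical two-cell macrocell: register select plus polarity.

    Returns (pin_value, next_stored). In registered mode the pin shows
    the value captured at the previous clock edge.
    """
    source = stored if registered else or_value
    pin = (not source) if active_low else source
    return pin, or_value

def truth_table(terms, names):
    """Evaluate the OR output for every combination of the named inputs."""
    rows = []
    for values in product([False, True], repeat=len(names)):
        rows.append(or_output(terms, dict(zip(names, values))))
    return rows

# Exclusive-OR in sum-of-products form: A*/B + /A*B
xor_terms = [{"A": True, "B": False}, {"A": False, "B": True}]
print(truth_table(xor_terms, ["A", "B"]))

# n two-state architecture cells give 2**n macrocell configurations
print(2 ** 2, 2 ** 9)
```

The last line is only the counting argument behind the configuration figures: two architecture cells allow four settings, and nine allow 512.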
High-density PLDs
For high-density PLDs, two important niches have
emerged–one addressed by channeled-array-based PLDs and the other by
segmented-block-based PLDs (see Table 3). High-density PLDs are available
in CMOS only. They are also known as FPGAs–although the term sometimes
refers only to channeled devices. Conceptually, both
segmented-block-based and channeled-array-based PLDs are programmable
blocks with programmable interconnects. The fundamental differences
between the two are in the architecture of the programmable block and the
programmable interconnect structures. Basic goals are the same for both
approaches, but the means and results are different. Channeled-array PLDs
and segmented-block PLDs embody fundamentally different architectural
approaches–each with characteristic strengths and weaknesses.
Segmented-block PLDs embody a few large AND-OR blocks, with an array of
logic-plus-I/O macrocells, interconnected by some centralized,
programmable interconnect scheme. These include AMD MACH, Altera MAX, and
the Plus Logic FPGA family of devices. Currently they encompass devices
in 44- to 100-pin packages, with density ranging from 500 to 5,000
gates. These devices are implemented in CMOS EPROM or EEPROM
technologies. A channeled-array PLD comprises a relatively fine-grained
matrix of programmable logic blocks surrounded by a border of uncommitted
I/O blocks. The blocks are interconnected by distributed interconnect
structures. Reprogrammable and one-time-programmable types are available.
Reprogrammable channeled-array PLDs include the Xilinx Logic Cell Array
(LCA), Concurrent Logic CLA, and Plessey Electrically Reconfigurable Array
(ERA) devices. One-time-programmable channeled-array PLDs include the
Actel PGA, Quick Logic pASIC, and Crosspoint Solutions CPK2xxx family.
One cause of silicon inefficiency in existing low-density PLD
architectures is their fully committed structure. A fully committed
structure implies close coupling of all internal elements and fixed logic
allocation. Most high-density PLDs attempt to solve silicon inefficiency
with a more flexible block structure (better allocation and decoupling of
internal resources, more flexible macrocells) and more flexible
interconnects. The strengths of channeled-array PLDs include many registers, many
I/Os, complete decoupling of logic blocks from I/O blocks, and
programmable connectivity between logic blocks and I/O blocks. On the
other hand, segmented-block PLDs are fast and predictable. They also offer
uniform, path-independent, and deterministic delays. Segmented-block PLDs
are also better suited for wide-gating functions and complex state
machines, and are easier to design with. Table 4 shows the strengths and
weaknesses of both segmented-block and channeled-array PLDs. Typical of
the segmented-block PLDs is Advanced Micro Devices' MACH family (see
Fig. 5), which consists of two subfamilies, MACH 1 and MACH 2. The devices
come in 44-, 68-, and 84-pin PLCC packages, with gate density ranging from 900
to 3,600 gates and 32 to 128 macrocells. The MACH 1 family, with a
higher pin-to-logic ratio, addresses I/O-intensive applications. The MACH
2, with a higher logic-to-pin ratio, addresses logic-intensive
applications. The MACH family offers 3 to 12 times the functionality of
the 22V10. Like most other high-density, segmented-block PLDs, the MACH
family comprises arrays of programmable blocks with programmable
interconnects. The fundamental difference between the MACH family and
other PAL-like high-density PLDs lies in the internal architecture of the
blocks and the switch-matrix structure of the interconnects. The structure
provides uniform, path-independent, fixed delays for all signals. A fitter
algorithm allows automatic device fitting and logic routing, making it
unnecessary for designers to configure the switch matrix manually.
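The difference between path-independent and path-dependent delay can be illustrated with a toy timing model. Both functions are hypothetical simplifications, and the delay figures are arbitrary placeholders rather than data-sheet values. The only point is that the segmented-block result depends on how many block passes a signal makes, while the channeled-array result also depends on how the router happened to connect the cells.

```python
def segmented_block_delay(passes, block_delay, matrix_delay):
    """Every pass costs one trip through the switch matrix plus one block,
    whichever source and destination the fitter picked."""
    return passes * (matrix_delay + block_delay)

def channeled_array_delay(segments_per_hop, cell_delay, segment_delay):
    """Each hop costs one logic cell plus however many routing segments
    the place-and-route tool used to reach it."""
    return sum(cell_delay + n * segment_delay for n in segments_per_hop)

# Placeholder delays in arbitrary units (not data-sheet values).
print(segmented_block_delay(passes=2, block_delay=10, matrix_delay=5))

# The same two-cell logic path routed two different ways.
short_route = channeled_array_delay([1, 2], cell_delay=3, segment_delay=2)
long_route = channeled_array_delay([3, 6], cell_delay=3, segment_delay=2)
print(short_route, long_route)
```

In this model the segmented-block figure can be computed before fitting, whereas the channeled-array figure is known only after routing.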
PLDs in system designs
System design tasks fit three broad categories:
data path, control path, and interface (glue) logic. Data-path
applications include data manipulation (ALUs), data storage (register
file, pipeline registers) and data steering/selection
(multiplexing/demultiplexing). In these applications, speed and density
are critical. The data path is the most structured portion of a system. It
is likely to be defined early in the design cycle and is unlikely to
change during the prototype phase. Turnaround time is not critical for the
data-path portion of a typical system. Control-path applications include
the timing, control sequencing, and decision-making portions of a digital
system. These sections are normally made up of state machines. The control
section is the most complex portion of a digital design. It is likely to
contain subtle errors and to require many changes during prototyping.
Speed and turnaround may be critical for the control path, while density
is less important. Typically, interface circuitry connects
microprocessors, peripherals, and gate arrays. For interface applications,
turnaround time and design changeability are important. PLDs can take on
all these applications–data path, control, and interface. In the data path,
they are used for data steering, multiplexing/demultiplexing and data
manipulation. They are used extensively for control functions like
instruction pre-decoding, pipeline control, register file control, special
instruction control, and address decoding. Other applications for PLDs
can be either I/O or logic intensive. I/O intensive applications need more
pins and more I/O capability–for example, address-decode or bus
arbitration/interface functions in systems with 32- or 64-bit
microprocessors. Most low-density PLDs are efficient for wide-gating
combinatorial functions like address decoding, multiplexing and
demultiplexing. Most high-density PLDs attempt to address both wide-gating
and register-intensive functions with a better combination of
combinatorial and sequential logic functions. Segmented-block structures
are better suited for wide-gating functions whereas channeled-array PLDs
are better suited for narrow-gating, register-intensive functions (see
Fig. 6).
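Address decoding shows why wide gating matters: selecting one region of a 32-bit address space can take a single product term with many inputs. The sketch below is illustrative only. The memory map, the signal name `rom_select`, and the helper functions are invented for the example, and the region is assumed to be a power-of-two size and naturally aligned so that one product term suffices.

```python
def decode_term(base, size, width=32):
    """Return (mask, value) describing one product term that decodes a region.

    The mask marks the high-order address lines the term must gate;
    the value gives the level each of those lines must have.
    """
    if size <= 0 or size & (size - 1) or base % size:
        raise ValueError("region must be a power-of-two size, naturally aligned")
    mask = ((1 << width) - 1) & ~(size - 1)
    return mask, base & mask

def term_inputs(term):
    """Number of address lines feeding the AND gate."""
    mask, _ = term
    return bin(mask).count("1")

def selected(address, term):
    """True when the address falls inside the decoded region."""
    mask, value = term
    return (address & mask) == value

# Hypothetical 1-Mbyte region at the top of a 32-bit address space.
rom_select = decode_term(base=0xFFF00000, size=0x00100000)
print(term_inputs(rom_select))           # address lines gated by the term
print(selected(0xFFF12345, rom_select))  # inside the region
print(selected(0xFFE00000, rom_select))  # outside the region
```

Halving the region size adds one more input to the product term, which is how decode functions in wide-address systems come to need wide AND gates.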
Design methodology and software
The PLD design process usually comprises
three phases: design, programming, and testing (see Fig. 7). The design
phase consists of five steps: defining the problem, identifying design
requirements and selecting an appropriate device, generating Boolean
equations or high-level-language solutions, optimizing logic equations,
and verification. Defining the problem comes first. Is it a combinatorial
function (simple address decoding, priority encoding, data
multiplexing/demultiplexing, or control-signal generation) or a sequential
function (counters, data shifters, or state machines)? Design requirements
include the number of I/O pins, the number of product terms, the number of
combinatorial or registered outputs, polarity, power consumption and
speed. Developing a solution for the problem involves describing the
logic in Boolean equations, truth tables, state diagrams, or
high-level-language constructs. These must be fitted by software to a file
in the form required for the actual PLD selected. Logic simulation on the
resulting equations verifies correct functionality of the design before
actually programming a device. This is an inexpensive way to catch
mistakes. The programming phase consists of generating an appropriate
fuse map and programming the device. The testing phase verifies the actual
fuse/cell pattern and tests the function of the programmed device.
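The verification step can be as simple as exhaustively simulating the equations against the intended behavior before any device is programmed. The sketch below assumes a hypothetical 2-bit up counter with an enable input, built from D-type flip-flops. The equations and function names are illustrative and not taken from any particular design or tool.

```python
from itertools import product

def next_state_equations(q1, q0, en):
    """Candidate sum-of-products equations for the two D inputs.

    D0 = Q0*/EN + /Q0*EN
    D1 = Q1*/Q0 + Q1*/EN + /Q1*Q0*EN
    """
    d0 = (q0 and not en) or (not q0 and en)
    d1 = (q1 and not q0) or (q1 and not en) or (not q1 and q0 and en)
    return bool(d1), bool(d0)

def reference(q1, q0, en):
    """Intended behavior: count up modulo 4 when enabled, otherwise hold."""
    count = (2 * q1 + q0 + en) % 4
    return bool(count >> 1), bool(count & 1)

def verify():
    """Return every input combination where the equations miss the spec."""
    return [bits for bits in product([False, True], repeat=3)
            if next_state_equations(*bits) != reference(*bits)]

mismatches = verify()
print("equations match the specification" if not mismatches else mismatches)
```

With only three inputs there are eight cases to check. Larger designs need the simulation tools discussed in this article, but the principle of comparing equations against a reference is the same.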
Software tools
Many software tools are available for designing with
PLDs. These tools translate a logic specification into a format accepted
by a device programmer. These tools also aid design simulation and
documentation. Simulation tools help in debugging an initial design and
help to avoid multiple design iterations. Documentation tools help make
designs understandable and help fix errors later on. PLD design tools can
be classified as either simple assemblers or compilers. Boolean assembler
tools allow designers to use symbolic names for input and output pins.
However, equations need to be developed at the base fuse/cell level.
Often, that is quite tedious and results in designs that are more difficult
to understand and maintain. Compiler-based tools allow designers to
describe their designs at a higher level–one that most accurately
reflects the design concept. These tools result in designs that are more
thoroughly documented and easier to maintain. Advanced design-entry
methods include schematic capture, Boolean equations, state machine
languages, and hardware description languages (HDLs) such as VHDL, with mixed-mode design-merge
capability. Advanced processing functions include both front-end tools
like syntax parsers, logic expanders, minimizers, and
synthesis/optimization tools for specific device architectures, as well as
back-end tools like place-and-route and device fitters. Simulation tools
include functional, timing, and board-level simulation.
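The translation step that a Boolean assembler performs, from symbolic equations to a fuse/cell pattern, can be sketched as follows. This is not the format of any real tool or device. The equation syntax (`*` for AND, `+` for OR, `/` for complement), the column order, and the convention that `1` marks a connected cell are all assumptions made for the example.

```python
def parse_sum_of_products(equation):
    """Split 'A*/B + /A*B' into product terms.

    Each term becomes a dict of signal name -> True (true input)
    or False (complemented input).
    """
    terms = []
    for term_text in equation.split("+"):
        term = {}
        for literal in term_text.split("*"):
            literal = literal.strip()
            if not literal:
                raise ValueError(f"empty literal in term {term_text!r}")
            term[literal.lstrip("/")] = not literal.startswith("/")
        terms.append(term)
    return terms

def fuse_row(term, input_names):
    """One AND-plane row: two columns (true, complement) per input."""
    row = []
    for name in input_names:
        row.append("1" if term.get(name) is True else "0")
        row.append("1" if term.get(name) is False else "0")
    return "".join(row)

def fuse_map(equation, input_names):
    """Translate one equation into AND-plane rows, one per product term."""
    terms = parse_sum_of_products(equation)
    unknown = {name for term in terms for name in term} - set(input_names)
    if unknown:
        raise ValueError(f"undeclared signals: {sorted(unknown)}")
    return [fuse_row(term, input_names) for term in terms]

for row in fuse_map("A*/B + /A*B", ["A", "B"]):
    print(row)
```

Symbolic pin names are the convenience an assembler provides. A compiler-based tool would sit one level higher, generating equations like these from a state diagram or truth table.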
CAPTIONS:
Fig. 1. PAL architecture is characterized by a fixed OR plane fed by a
programmable AND plane, whose inputs come from input pins and feedback
loops.
Fig. 2. The input AND and OR planes are similar in the 16L8 and 16R8. The
L8 has individually programmed polarity for each output, while the R8 is
registered.
Fig. 3. The output of the 22V10 is more elaborate than previous designs,
encompassing the possibilities of the R8 and L8 on each cell.
Fig. 4. The 29MA16 adds a second feedback loop to the 22V10 output
macrocell.
Fig. 5. Segmented-block complex PLDs join multiple simple PLDs by a
crossbar, making the delay easy to predict.
Fig. 6. Segmented-block and channeled-FPGA structures each favor different
requirements, but enhancements to each type tend toward the other.
Fig. 7. The PLD design process includes the design phase, programming
phase, and testing phase.