Understanding CPLDs and FPGAs

Important differences in these programmable logic devices can spell
missed opportunities or, worse, failure for a project

BY MARYANN L. LINDQUIST Intel Corp., Folsom, CA

Deciphering the terms used to describe complex programmable logic
devices (CPLDs) and field-programmable gate arrays (FPGAs) is an important
first step for designers who must use these parts. The descriptions of
these parts point to their differences. Those unfamiliar with those
differences might see only similarities: that each is a class of
programmable logic and an alternative to gate arrays. A designer might
overlook important differences that can spell missed opportunities or,
worse, project failure.

CPLDs and FPGAs both let a designer consolidate into one device the logic
functions that would otherwise take many discrete components.
At the same time, they have many advantages over gate arrays, whose high
design costs and lengthy turnaround time can only be justified for very
high volumes. Individual CPLD and FPGA components cost more than gate
arrays. But taking all factors into account, gate arrays are often
costlier and riskier. Gate array designs require substantial time for
nonrecurring engineering and a costly test phase. But a close look at
CPLDs and FPGAs reveals their own set of costs and risks. Before selecting
one class of components over another, a designer needs to weigh a
project's system requirements for logic density, speed, predictability,
and cost against a particular device's architecture and the availability of
low-cost, easy-to-use design tools.

Density is the first factor to
consider. CPLDs span a gate density of about 600 to 5,000 gates, whereas
FPGAs cover a 4,000- to 20,000-gate range (see Fig. 1). For comparison,
simple PLDs range in density from 300 to 500 gates, while gate arrays
range from 4,000 to 150,000 gates. If the required logic density is less
than 4,000 gates but more than 500, a CPLD is likely the best choice. Above
5,000 gates, the designer must pick an FPGA or, if volumes are very
high, a gate array. Making a choice within the 4,000- to 5,000-gate range
takes a careful comparison of CPLDs and FPGAs.

Performance is the other
key factor that will influence the designer's decision. Typically, CPLDs
are faster than FPGAs, offering propagation delays of 7.5 ns or better for
combinatorial logic. System clock rates run upwards of 80 MHz. In
comparison, fast FPGAs have propagation delays of about 20 ns.
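The density screening described above can be sketched as a small lookup. This is only an illustration: the function name and the yes/no volume flag are assumptions made for the example, and only the gate ranges come from the figures quoted in the text.

```python
# Density ranges (in gates) as quoted in the article.
DENSITY_RANGES = {
    "simple PLD": (300, 500),
    "CPLD": (600, 5_000),
    "FPGA": (4_000, 20_000),
    "gate array": (4_000, 150_000),
}


def candidate_device_classes(gates, very_high_volume=False):
    """Return the device classes whose quoted density range covers `gates`.

    Gate arrays are listed only when `very_high_volume` is True, a
    simplification of the article's remark that their cost is justified
    only for very high volumes.
    """
    fits = [
        name
        for name, (low, high) in DENSITY_RANGES.items()
        if low <= gates <= high
    ]
    if not very_high_volume:
        fits = [name for name in fits if name != "gate array"]
    return fits


if __name__ == "__main__":
    for gates in (400, 2_000, 4_500, 12_000):
        print(gates, candidate_device_classes(gates))
```

A request in the 4,000- to 5,000-gate overlap returns both CPLD and FPGA, which is where speed and predictability have to settle the choice.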

Architecture at the heart

Often, what makes CPLDs attractive is not so
much their performance, but the early predictability of their performance.
FPGAs are less predictable. Many FPGAs are built with what is called a
fine-grained architecture (see Fig. 2). Like a gate array, an FPGA is
characterized by logic blocks surrounded by routing resources. This
characteristic, found in FPGAs from Xilinx, Actel, and in Altera's
FLEX8000 line of devices, yields the highest density devices. However,
signal delays between device resources are not determined until the final
place-and-route is complete. Thus, the performance of FPGAs that employ a
fine-grained architecture is said to be nondeterministic. For the
designer, nondeterministic timing makes it impossible to know up front if
an FPGA can meet a system's requirements. A device's best performance is
limited by the longest delay path. And only after an analysis of all paths
can the actual performance be determined and compared to the requirements
of the design. If the resulting performance is unacceptable, manual
constraints must be imposed on the place-and-route process and another
iteration completed. For high-performance systems, nondeterministic
devices almost always require manual intervention to meet performance
goals. Even then, success is not guaranteed. To counter this drawback,
vendors of fine-grained FPGAs are developing pre-routed macros and
timing-driven design-entry techniques to make timing more predictable. In
contrast to most FPGAs, CPLDs are forged from a segmented architecture
like that found in a simple PLD (see Fig. 3). In this architecture, device
resources are grouped into PLD-like blocks. Each block contains eight or
more macrocells that can be configured into sequential or combinatorial
logic functions. Unlike a fine-grained architecture, a segmented
architecture has fixed routing delays between resources. Therefore, a
device's overall performance is deterministic. Hence the performance of
these devices can be tested, guaranteed, and specified in a data sheet.
Generally, devices having this architecture provide high performance as
well. Examples of segmented architecture are found in Advanced Micro
Devices' MACH, Altera's MAX 7000, and Intel's FLEXlogic series of
programmable logic chips.
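The remark that a device's best performance is limited by its longest delay path can be illustrated with a short calculation. The sketch below assumes the clock period simply equals the worst path delay, ignoring setup time and clock skew. The path names and delay values are hypothetical, standing in for the numbers a place-and-route report would supply.

```python
def max_clock_mhz(path_delays_ns):
    """Estimate the highest clock rate from the slowest path.

    Assumes the clock period equals the longest path delay; setup time
    and clock skew are ignored in this simplified model.
    """
    worst_ns = max(path_delays_ns.values())
    return 1000.0 / worst_ns  # a period in ns converts to a rate in MHz


def meets_requirement(path_delays_ns, required_mhz):
    """True if the slowest path still supports the required clock rate."""
    return max_clock_mhz(path_delays_ns) >= required_mhz


if __name__ == "__main__":
    # Hypothetical post-route delays for three paths, in nanoseconds.
    routed = {"counter_carry": 12.5, "address_decode": 20.0, "status_mux": 8.0}
    print(max_clock_mhz(routed))
    print(meets_requirement(routed, required_mhz=80))
```

With a fine-grained FPGA the delay table is known only after routing, so the check can only be run at the end of each iteration. With a segmented device the same figures can be read from the data sheet before design entry begins.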

Design differences

Other important architectural characteristics that
designers need to be aware of include I/O block structures,
reconfigurability, resource allocation, and differences in a device's
internal and external performance. The last of these is related to the
issue of predictability. Typically, FPGAs with fine-grained architectures
have decoupled I/O blocks. Thus, instead of having I/O resources
associated with particular blocks, I/O blocks and macrocells are separate
from the circuit's logic blocks. As a result, the place-and-route routine
must work its way across a maze of paths–a process that takes time to
complete and imposes routing constraints. In a segmented architecture,
logic blocks have preassigned I/O blocks and macrocells. Such an
arrangement lets any output be a function of any input, but without
imposing the routing concerns that a fine-grained architecture raises.

Sometimes, a logic function requires more inputs than one logic block can
supply. When this occurs, programmable logic devices must offer some way
for a logic block to borrow resources from nearby blocks (see Fig. 4).
The main approaches for sharing resources are p-term (for product term)
sharing and p-term allocation. In devices that have p-term sharing, the
p-terms for a logic equation can be borrowed from a nearby macrocell.
However, the remaining p-terms in the donor macrocell are no longer
available for other functions. In contrast, p-term allocation lets the
designer assign p-terms to adjacent macrocells without losing the
functionality of the donor macrocell. The result is a more optimized
fitting of the function.

Differences in internal and external performance
are another concern for the designer. For some programmable logic devices,
the difference between the two can be substantial. Worse, that difference
may be unknown until the device is routed. Like the difference between
segmented and fine-grained architectures, this difference bears on whether
performance can be anticipated.
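The difference between p-term sharing and p-term allocation described above can be modeled in a few lines. This is a deliberately simplified sketch, not any vendor's fitting algorithm: the function name, the scheme labels, and the eight p-terms per macrocell are all assumptions made for the example.

```python
def donor_pterms_left(total, borrowed, scheme):
    """P-terms the donor macrocell can still use for its own function.

    scheme == "sharing":    once any p-term is borrowed, the donor's
                            remaining p-terms are lost to other functions.
    scheme == "allocation": the donor keeps whatever was not borrowed.
    """
    if not 0 <= borrowed <= total:
        raise ValueError("borrowed must be between 0 and total")
    if scheme == "sharing":
        return total if borrowed == 0 else 0
    if scheme == "allocation":
        return total - borrowed
    raise ValueError("scheme must be 'sharing' or 'allocation'")


if __name__ == "__main__":
    # Hypothetical macrocell with eight p-terms, three lent to a neighbor.
    for scheme in ("sharing", "allocation"):
        print(scheme, donor_pterms_left(total=8, borrowed=3, scheme=scheme))
```

Under this model, lending three of eight p-terms leaves the donor with none under sharing and five under allocation, which is why allocation tends to give a tighter fit.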

Storing configuration data

Designers must consider how devices store
configuration data. In some devices, the data, once entered, is permanent.
Other devices are reconfigurable. And even differences among
reconfigurable devices can affect the requirements for using the device.
The main configuration-storage techniques are static RAM, antifuse links,
and electrically erasable or UV-erasable EPROM. Of these, antifuse and EPROM
devices are nonvolatile. In contrast, devices that use static RAM to
store data must be reloaded on power-up and therefore incur overhead in
reloading circuitry. For the same reasons, however, they are the easiest
to modify in the likely event that circuit changes are needed. In fact,
they can even be changed in circuit, giving them greater flexibility.
Unfortunately, a device that is easy to re-program is also an easy target
for competitive prying and reverse engineering. Of course, EPROM-based
devices, which are nonvolatile and therefore resist tampering and prying,
can also be changed to accommodate design revisions. But the effort needed
to modify them depends on the erase mechanism. For example, a UV-EPROM
device must be removed from its socket to be modified, a step that grows
increasingly difficult with higher pin counts. EEPROMs can be modified
without being removed, but programming those modifications requires extra
circuitry. One device combines the advantages of EPROM and RAM storage
techniques. Using on-board EPROM, Intel's iFX780 is self-configuring,
needing no external circuits to load data. At the same time, it contains
its own SRAM that can be reloaded to reconfigure its function (see Fig.
5). The same SRAM is also available for functions needing volatile memory.

Design tools

Design tools are the only link a designer has to the device
he or she is configuring. Tools are used for schematic capture, compiling
a design, routing, simulation, and device programming (see Fig. 6). The
first choice to make is whether to buy design tools from the device vendor
or from a third party. For complex or unique architectures, the device
vendor is often best equipped to offer the most effective tools. Getting
the best fit of design into a device and weaving the most efficient
place-and-route often takes intimate knowledge that only the manufacturer
has. However, two problems plague such proprietary tools. For one, they are
often expensive. The reason for this is that some manufacturers consider
tool sales an important source of income. Besides their high cost,
proprietary tools often present a new and unfamiliar interface. A designer
who is familiar with another set of tools may be reluctant to learn a new
toolset. Indeed, because the tools needed are complex, some companies have
in-house specialists in different devices and their tools. Such devices
take a large investment in people if not in the tools themselves. The
answer to these difficulties may be to buy tools from third-party
suppliers. For one thing, third-party tools typically offer a consistent
user interface across device families. What's more, third-party tools tend
to be reasonably priced. Still, unless they are developed with the
cooperation of the device vendor, they may not produce the most efficient
design. Because design tools are critical to a successful programmable
logic design, the best toolset would be one that is low cost, available
from the device manufacturer, and can be used with third-party tools and
design interfaces. A good example of such a toolset is Intel's PLDshell
Plus, which is free, easy to use, and works with tools from such
third-party vendors as Data I/O, Logical Devices, OrCAD, and Minc.
Regardless of the tools picked, a designer should expect
segmented-architecture devices to take less time to complete
place-and-route. As a general rule, place-and-route for fine-grained devices
tends to take hours, compared with several minutes for segmented architectures.

Analyzing cost

The total cost to develop a programmable logic circuit
depends on much beyond the unit cost of the device. One key factor is
time, which varies with the complexity of the particular device, its
design tools, and the engineer's familiarity with both. To compare the
full cost of the different approaches, the elements of a project should be
divided into their fixed and variable components. Of these, fixed charges
are associated with the time needed for training, design, and–if
performed–simulation. The second cost component, the variable charges,
depends on volume and includes unit and inventory costs. The learning
curve is one of the biggest hurdles that designers face when using
programmable logic. It takes time to learn the details of new devices and
tools. On one hand, the cost of that training can be spread across as many
projects as it will be applied to; on the other hand, it takes several
projects before a designer becomes proficient. The training time
associated with a particular device depends on its complexity.
Implementing five designs with FPGAs having nondeterministic timing can
take most of a year. The same task might take a month using simple PLDs.
Using CPLDs, it may take somewhere in between. Once a designer is
familiar with a device and its tools, much of the time is spent on the
design phase of the project. Working with simple PLDs, a designer might
configure as many as 1,000 gates per day and, therefore, take a week to
build a 5,000-gate circuit. CPLDs and FPGAs take longer, with more complex
FPGAs slowing the design rate to as low as 50 gates per day. Simulation
also adds to the design time. Alternatively, if a prototype system is
available, the operation of CPLDs and FPGAs can be checked using built-in
circuit verification, and any configuration changes made on-line. Because
simulation runs for FPGAs can take several hours, designers frequently
leave them to run on their computers overnight. Simulating a CPLD usually
takes just a few minutes, making processing time a lesser issue. Possibly
the most overlooked cost is that of sales lost when a schedule slips.
Unfortunately, until very high volumes are reached, these lost revenues
can dominate the total project cost. Of course, a main reason for using
CPLDs and FPGAs in the first place is that they let designers bring
products to market faster than if they had used gate arrays. Thus, where
programmable devices are concerned, lost-revenue costs are only an issue
when design delays cause the project's schedule to slip. Volume is also a
factor in accounting for variable costs. Unit cost–what each device sells
for–is one element of total variable costs. With prices dropping at about
20% a year, FPGAs now range from about $0.01 to $0.02 per gate. Thus, a
5,000-gate device sells for between $50 and $100. On the low end, 24-pin
CMOS PLDs cost about $2.50 for 25-ns versions. With about 250 gates in
each PLD, it would take 20 of them to form an equivalent 5,000-gate
circuit with a “unit price” of about $50. The second variable is inventory
cost. This applies to devices with fuse-programmable links. A last-minute
or post-production change made to the design renders useless any devices
already programmed.
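The unit-cost and design-time arithmetic above can be reproduced directly. The sketch below uses only the figures quoted in the text: $0.01 to $0.02 per FPGA gate, about 250 gates and $2.50 per simple PLD, and design rates of 1,000 and 50 gates per day. The function names are made up for the illustration.

```python
import math


def fpga_price_range(gates, low_per_gate=0.01, high_per_gate=0.02):
    """Low and high FPGA price in dollars at the quoted per-gate rates."""
    return gates * low_per_gate, gates * high_per_gate


def pld_equivalent(gates, gates_per_pld=250, price_per_pld=2.50):
    """Number of simple PLDs needed for `gates`, and their combined price."""
    count = math.ceil(gates / gates_per_pld)
    return count, count * price_per_pld


def design_days(gates, gates_per_day):
    """Working days needed to configure `gates` at a given design rate."""
    return gates / gates_per_day


if __name__ == "__main__":
    gates = 5_000
    print("FPGA price range:", fpga_price_range(gates))
    print("PLD count and price:", pld_equivalent(gates))
    print("Days at 1,000 gates/day:", design_days(gates, 1_000))
    print("Days at 50 gates/day:", design_days(gates, 50))
```

At the slowest quoted rate of 50 gates per day, the same 5,000-gate design takes 100 working days instead of five, which shows how quickly engineering time can outweigh the difference in unit price.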

Fitting applications

Some applications are more appropriate for a CPLD,
others need an FPGA. For example, for building test, measurement, and
analysis equipment, a designer would most likely prefer a high-density
FPGA with internal SRAM for easy reconfiguration. High-performance add-in
cards, however, would need the short propagation delays and predictable
timing of a CPLD. Typical products that would use a CPLD include LAN,
graphics, high-speed math, and telecommunications add-in cards.
Predictable timing also makes CPLDs suited for such telecommunications
applications as switches, subscriber line controls, and wireless LANs.
With uptime being a high priority for telecom equipment, the CPLD's track
record of high quality and reliability weighs heavily in its favor.
Moreover, in many of these applications, the availability of a built-in
SRAM in a device helps equip a product with unique or distinctive
features. As programmable logic has evolved, SRAM is only one of the
features being added to lighten the designer's load. Others are
compatibility with 3.3-V logic and the JTAG 1149.1 boundary-scan
standard, flexible clock modes, and dedicated comparator circuits. For
example, by offering a bridge to 3.3-V operation, Intel's iFX780 FLEXlogic
FPGA fits well into low-voltage designs for battery-powered equipment like
portable and handheld computers. In addition, by complying with the JTAG
standard, the device greatly simplifies board-level and production
testing. Also, each macrocell within the iFX780 has three clock modes:
synchronous, for minimum register-clock-to-output timing; delayed, for
minimum register setup time; and asynchronous, in which the clock signal
is generated from two local product terms (see Fig. 7). Moreover, all
clock signals can activate a register on either their rising or falling
edge. Yet another distinguishing feature for the FLEXlogic family is a
built-in circuit that compares two patterns of up to 12 bits each. Each of
these features enhances the device, moving it closer toward the goals of
high density, low cost, and fast turnaround.

CAPTIONS:

Fig. 1. CPLDs span a gate density of about 600 to 5,000 gates, while FPGAs
cover a 4,000- to 20,000-gate range.

Fig. 2. Many FPGAs are built with a fine-grained architecture,
characterized by logic blocks surrounded by routing resources.

Fig. 3. CPLDs are forged from a segmented architecture like that found in
a simple PLD in which device resources are grouped into PLD-like blocks.

Fig. 4. Programmable logic devices offer some way for a logic block to
borrow resources from nearby blocks.

Fig. 5. SRAM used to configure a CPLD is also available for logic
functions needing volatile memory.

Fig. 6. Tools are used for schematic capture, compiling a design, routing
the interconnections, circuit simulation, and device programming.

Fig. 7. Macrocells within this FPGA have three clock modes: synchronous, for
minimum register-clock-to-output timing; delayed, for minimum register
setup time; and asynchronous, in which the clock signal is generated from
two local product terms.
