Published June 1998
in Future Fab International
Migration to Deep-Submicron
Design the Efficient Way
Reduce Your Costs and Increase Your
Efficiency by an Effective Way of Design Reuse
Michael Reinhardt, President and CEO,
Rubicad Corp, San Jose, CA
Abstract
This article describes a method of automatically
migrating a physical design to a new deep-submicron technology and how
to accelerate the physical design process by up to 90% for the next process
generation. In this article, "design" generally means the physical layout.
The article discusses the requirements for physical layout modifications
and a proven solution.
Technology and Market Situation
Today, it is possible for you to place
complete system functions on one chip with millions of transistors. Even
highly complex systems may be reduced to a few devices in a single package.
According to Sematech, by the year 2001, the state of the art in chips will
be 12-million-gate devices operating at speeds up to 600 MHz. If an engineer
designs 100 gates per day, including verification, developing such a chip
will take 500 person years and cost approximately $75 million. Even with
productivity advances of the last decade, such as hardware description
language (HDL), logic synthesis, and HDL simulators, it's unrealistic to
expect designers to fill these huge chips.
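The effort estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes 240 working days per person year and a loaded cost of $150,000 per person year. Neither figure appears in the article; they are chosen because together they reproduce the 500 person years and $75 million quoted.

```python
# Back-of-the-envelope design effort for a 12-million-gate chip.
# ASSUMPTIONS (not stated in the article): 240 working days per person year
# and a loaded cost of $150,000 per person year.

def design_effort(gates, gates_per_day, days_per_year=240, cost_per_year=150_000):
    """Return (person_years, total_cost_in_dollars)."""
    person_days = gates / gates_per_day
    person_years = person_days / days_per_year
    return person_years, person_years * cost_per_year

person_years, cost = design_effort(gates=12_000_000, gates_per_day=100)
print(f"{person_years:.0f} person years, ${cost / 1e6:.0f} million")
```

Doubling the number of gates designed per day halves both figures, which is why reuse, rather than incremental tool improvement, is presented here as the only practical lever.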
Making these chips cost-effective forces
designers to include a number of standardized chips to accompany their
unique piece of silicon. To keep up with the growing silicon density, the
only practical method is to reuse designs and leverage previous design
work.
Cost and competition are forcing a shift
to faster and smaller submicron technologies to meet price and performance
requirements. At the same time, worldwide market pressure requires an approach
that enables tapeout of new system chips in the latest process technology
in a fraction of the time needed in past decades. The number of transistors
per day per designer has to increase significantly to keep up with market
pressure.
As if market and cost pressures were not
enough, designing for deep-submicron (DSM) technologies breaks existing chip design rules.
Traditional approaches limit reuse of existing macros, cores, and chips for
the design of new chips and systems. Designers face challenges they did
not face in the pre-submicron era.
Challenges of Technology and Design
Traditional Shrink and Its Limitations
The idea of re-using existing cores, macros
and libraries for new chips with the latest technologies is not new. It
is done all the time by shrinking the existing layout to the latest process
design rules. It used to be an overall goal to be able to produce a design,
laid out in one technology, in the next generation process technology by
simply shrinking it linearly. If the shrink was an optical shrink it was
almost free. If it was a mask shrink, manual fixes were enough to adjust
for the design. This work could be reduced to a minimum by a hierarchical
setup of the layout. Designers only had to fix the leaf cells and the overall
design was done. With a full-custom layout the manual fix takes more effort.
With DSM technologies the design rules
no longer shrink linearly. When going from a 0.5-micron to a 0.35-micron
process, or from a 5 V to a 3 V application, sometimes only a few
rules become smaller, while others remain the same or shrink by a larger factor.
Sometimes wires even have to be widened after the global shrink to meet timing and/or power
specifications.
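A small sketch shows why a single linear shrink factor stops working once rules scale unevenly. The rule names and values are hypothetical, and the sketch assumes the original layout is drawn at minimum dimensions: the uniform factor is then dictated by whichever rule shrinks least.

```python
# Why one linear shrink factor wastes area when rules scale unevenly.
# All rule names and values are hypothetical, in microns.

old_rules = {"gate_length": 0.50, "metal1_width": 0.60,
             "metal1_space": 0.60, "contact": 0.50}
new_rules = {"gate_length": 0.35, "metal1_width": 0.50,
             "metal1_space": 0.60, "contact": 0.40}

def per_rule_factors(old, new):
    """Ratio new/old for every rule; 1.0 means the rule did not shrink."""
    return {name: new[name] / old[name] for name in old}

def safe_linear_shrink(old, new):
    """Smallest uniform scale factor that violates no rule in the new process.

    A dimension drawn at the old minimum becomes old * f after scaling and
    must stay >= new, so f >= new / old for every rule. The rule that
    shrinks least therefore dictates the factor.
    """
    return max(per_rule_factors(old, new).values())

factors = per_rule_factors(old_rules, new_rules)
shrink = safe_linear_shrink(old_rules, new_rules)
for name, f in factors.items():
    print(f"{name:13s} could scale by {f:.2f}")
print(f"uniform shrink must use {shrink:.2f}")
```

With these example numbers the unchanged metal spacing forces a factor of 1.0, so a linear shrink gains no area even though the gate length could scale to 70%.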
To reuse proven designs in Systems on
Chip (SoCs), all parts of the system have to exist in the same technology
and on the same grid. If the designs were originally created for different
fabs and technologies, designers have a hard time bringing them
to a common design rule set while meeting the minimum area requirements.
Timing Problems in Design
Non-linear design rule changes are not the only thing
that makes it difficult to reuse existing designs with the latest DSM
technologies. With technologies below 0.5 micron, the previous assumption
that interconnect delay is less than gate delay no longer holds. Now gate
delay can be less than interconnect delay. This becomes even worse with
the increasing size of DSM designs. The relative length of the interconnect
increases because the number of transistors routed together in a single
block increases.
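The effect of wire length can be illustrated with the standard Elmore estimate for an unbuffered distributed RC line, t = 0.5 * r * c * L^2, where r and c are resistance and capacitance per unit length. The per-millimeter values and the gate delay below are illustrative assumptions, not data for any real process.

```python
# Elmore-delay estimate for an unbuffered distributed RC wire:
#   t = 0.5 * r * c * L^2
# The wire parameters and the gate delay are illustrative assumptions.

R_OHM_PER_MM = 100.0   # assumed wire resistance per mm
C_FF_PER_MM = 200.0    # assumed wire capacitance per mm, in femtofarads
GATE_DELAY_PS = 100.0  # assumed gate delay

def wire_delay_ps(length_mm, r_ohm_per_mm, c_ff_per_mm):
    """Elmore delay of a distributed RC line, in picoseconds."""
    r_total = r_ohm_per_mm * length_mm          # ohms
    c_total = c_ff_per_mm * length_mm * 1e-15   # farads
    return 0.5 * r_total * c_total * 1e12       # seconds converted to ps

for length_mm in (1, 2, 5, 10):
    t = wire_delay_ps(length_mm, R_OHM_PER_MM, C_FF_PER_MM)
    dominant = "wire" if t > GATE_DELAY_PS else "gate"
    print(f"{length_mm:>2} mm wire: {t:7.1f} ps ({dominant} delay dominates)")
```

Because the delay grows with the square of the length, a wire ten times longer is a hundred times slower, while the gate delay stays fixed or falls with each process generation.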
Power Consumption and Metal Migration
Since most new designs use higher clock
frequencies, the requirements for power lines, clock lines and data bus
lines will change. Normally the width of these wires has to be increased
relative to the other metallization. Often this requires the insertion
of an additional metal mask level.
Expertise from Different Areas of Design
for System on Chip (SoC)
Designing a SoC for DSM technology requires
a wide range of experts. In the past system designers could access a wide
range of well-defined chips and integrate them on a Printed Circuit Board
(PCB). Now chip designers need to combine these chips into one SoC and
make them work together. One company seldom has all the expertise
for the microprocessors, programmable logic, graphics chips, memory, etc.
in such a system. To achieve silicon integration the experts need to work
together to ensure their individual parts work together.
Solutions and Options to Solve the DSM
Challenge
To meet the challenge of designing with
DSM technologies, electronic design automation (EDA) vendors provide a
wide range of tools and support several design methodologies. One challenge
is to combine existing cores, macros and blocks with unique newly designed
logic, memory or programmable logic into a single system. Designers must
acquire expertise in these methodologies. To provide expertise, IP companies
are providing predefined IP blocks.
Predefined IP is provided in different
forms:
Soft IP
Soft IP is the HDL (Verilog or VHDL) or
RTL (register transfer-level) expression of a design. It is basically software
and the designer has to implement the physical design. Soft IP's advantage
is that the designer can modify the code and customize the design according
to his requirements. Customizing the design and generating the physical
layout require a deep understanding of all the functions; otherwise
testing the design is impossible. Since each implementation of a
soft IP has not been manufactured or characterized in the context of a
system design, it is very difficult to predict area and power. Generating
the physical design can become a time-consuming process. For example,
Timothy O'Donnell, president of North American Operation of ARM (Advanced
RISC Machine, Inc.), stated in a recent interview with Computer Design Magazine:
"ARM also provides a synthesizable version of the ARM core. But the ASIC
designer doesn't really want this. If the core is delivered in this HDL
format, the designer needs to convert it to a layout, and this is a lot
of work." He continued: "ASIC designers don't want to worry about the critical
paths in the core, or optimizing the die size to meet their requirements.
They want to use a working characterized cell for which that work has already
been done." Generating the layout with a place and route tool requires a physical
library, and that library has to be available in the right technology
at the right time.
Hard IP
Hard IP is the physical layout representation
of a design. Since a design in DSM needs a physical silicon implementation
for final verification, the layout is the most secure part of a reusable
design. Using hard IP blocks saves SoC designers a huge amount of time.
It offers a way to remain profitable despite the time-to-market pressures
and increasing complexity of designs. This is especially true for products
that have short life cycles and include many functions that remain the
same from one product generation to the next. By reusing pre-designed,
pre-verified layout blocks in a succession of chips, designers ensure predictability
and high performance and save precious time. Using hard IP also gives designers
time to concentrate on other high-value tasks.
In the past physical layouts were designed
for a specific fab and the exchange with other fabs and technologies was
difficult. Indeed the layout was the blueprint of a design and represented
the capital of a chip manufacturer. Using RUBICAD's Layout Conversion Environment,
LACE, hard cores and hard macros become adjustable for any fab and technology.
This makes portability an easily manageable task.
Eliminating the Down Side of Hard Macros
The ideal solution would be to reformat
the existing macros and automatically convert their physical layouts to
the latest technology, while automatically adjusting transistor and wire
sizes. LACE compacts the existing IC mask layouts to any new or different
technology.
LACE is the only tool in the market that
combines the portability advantage of soft IP with the optimization and
security of hard blocks independent of the original design environment.
LACE uses an edge-based compactor that works directly on the physical layout.
Originally it was developed to convert full custom microprocessors from
one fab to another with smaller design rules. LACE adjusts the layout one
edge at a time, just as a human layout designer would do. All LACE needs
as input is a physical mask layout, the target technology's design rules,
and the sizing values for transistor and wire sizing and/or the simulation
result for individual transistor and wire sizing. The layout's function
and relative topology remain the same.
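To make the idea of moving edges under design-rule constraints concrete, the sketch below uses the textbook one-dimensional constraint-graph formulation: every minimum width or spacing becomes a constraint between two edges, and the tightest legal placement is a longest-path computation. This formulation is an assumption chosen for illustration; the article does not describe LACE's internal algorithm. Edge indices, rule values and the widened wire are hypothetical, with dimensions in nanometers.

```python
# One-dimensional compaction as a longest-path problem over edge constraints.
# Textbook formulation used for illustration only; not LACE's actual algorithm.

def compact(num_edges, constraints):
    """Return the leftmost legal x position of each layout edge.

    constraints is a list of (left, right, min_distance) tuples meaning
    x[right] - x[left] >= min_distance. Requiring left < right keeps the
    original left-to-right edge order, so the relative topology is preserved.
    """
    x = [0] * num_edges
    # Sorting by the right-hand edge guarantees x[left] is final when used.
    for left, right, dist in sorted(constraints, key=lambda c: c[1]):
        if left >= right:
            raise ValueError("constraints must point left to right")
        x[right] = max(x[right], x[left] + dist)
    return x

# Two parallel wires: edges 0 and 1 bound wire A, edges 2 and 3 bound wire B.
constraints = [
    (0, 1, 500),  # wire A at the target technology's minimum width
    (1, 2, 600),  # minimum spacing between A and B
    (2, 3, 800),  # wire B widened to a sizing value, e.g. from simulation
]
print(compact(4, constraints))  # [0, 500, 1100, 1900]
```

The inputs mirror those listed above: the existing edge order comes from the mask layout, the minimum distances from the target design rules, and any larger distances from the sizing values.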
Hierarchical Compaction
LACE maintains the hierarchy of a design
during the migration process. That means that, after conversion, an existing
design can be brought back into the design flow as
it is. This enables designers to hierarchically migrate datapaths, RAMs,
ROMs, and other highly structured designs. A parallel synchronizing algorithm
ensures that all cells of a hierarchical design are compacted to the optimum
size through parallel port adjustment. Routing or programming overlays
are considered for each instance of a cell and for each hierarchy level.
Benefits From a Layout Compaction Approach
and Usability Requirements
Using LACE's automatic compaction approach
provides a predictable reuse of hard macros, cores and blocks. The physical
design can be adjusted to the specific requirements of a technology and
a specific application. Running a test chip for each core, macro or block
allows a designer to describe the timing behavior precisely for any given
target technology. If a reusable design is well organized, silicon proven,
tested, and documented, it can be represented as a black box. The requirements
are that all levels of abstraction are provided to feed the reusable design
into different design environments. The time saved ranges from 30-95% for
building a new silicon system.
Takahiro Takechi of the Design System
Group, Logic LSI Division at Oki Electric Industry, Tokyo, Japan, uses LACE
in three ways. First, he and his team use LACE to import IP blocks designed
by outside companies and migrate them into OKI's process. Second, OKI designers
use LACE to migrate their own IP blocks into OKI's latest process
in order to reuse proven designs. Third, the designers use LACE to migrate
OKI's cell libraries into a new customized process (such as migration from
a 5-V process to a 3-V process).
"There was almost no way to migrate designs
into other processes," Takechi said, when Barbara Tuck of Computer Design
asked him what OKI designers did before they had LACE. He noted, "We spent
a lot of manpower to migrate designs by hand."
Applications for Layout Compaction
Feeding Simulation Results Back
into Physical Design
Designers need a methodology that can
address the gap between high-level simulation and physical design. When
the design is laid out, designers extract capacitors and parasitics, and
simulate at the gate-level for timing and power. If simulation results
show that power lines, critical signal lines, and transistors need re-sizing,
the designer needs to send this information to a layout modification tool,
one that automatically adjusts the layout according to the simulation results.
If a design implemented with
an automatic place and route tool turns out to have timing or power problems,
correcting the physical design is almost as big a job as the initial
routing. Final tweaks are usually done manually by the layout designer.
And who documents these? The place and route (P&R) process for the
next generation takes almost the same amount of work as for the original
technology. If the engineers familiar with the initial process are no longer
available, it's even worse.
LACE meets the demands for automatic adjustment
of the layout according to the simulation results. It re-sizes the physical
layout database while maintaining its topology and logical functions. With
LACE, verification and simulation cycles can be drastically reduced, because
the number of iterations is minimized. LACE automatically enlarges or reduces
the width of power and signal lines to reduce resistance for higher speed
and increase capacitance for power consumption. Transistors are sized individually
according to the requirements of the design context. LACE can size transistors
and wires in proportion to their input size, or size transistors and wires
individually according to absolute values from the simulation. If a layout
is so dense that a layout designer has a difficult time manually widening
signal lines or enlarging transistors or capacitors without causing design
rule violations, LACE automatically widens the wiring, transistors and
other devices. At the same time LACE compacts the surrounding structures
to the highest possible density. Designers report between two and 15
iterations of simulation, verification, and place and route before design
sign-off. This many iterations can cause the product to miss the market
window. Using an automatic approach that adjusts the layout directly
to the requirements of the simulation significantly shortens the design
cycle.
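The two sizing modes described above, proportional scaling and individual absolute values, can be sketched as a simple width table. The net names, widths and the helper function are hypothetical and only illustrate how such sizing input might be combined; they do not represent LACE's input format.

```python
# Two ways to specify new widths: a proportional factor or absolute overrides.
# Net names and numbers are hypothetical; widths are in nanometers.

def resize(widths, factor=1.0, absolute=None, min_width=0):
    """Scale every width by factor, then apply per-net absolute values.

    Absolute values (for example from a simulation) take precedence over
    the proportional factor. The target technology's minimum width is
    enforced last so no result violates the design rule.
    """
    absolute = absolute or {}
    sized = {}
    for net, width in widths.items():
        target = absolute.get(net, round(width * factor))
        sized[net] = max(target, min_width)
    return sized

old_widths = {"vdd": 2000, "clk": 800, "data0": 600}
new_widths = resize(old_widths, factor=0.7,
                    absolute={"vdd": 2400}, min_width=500)
print(new_widths)  # {'vdd': 2400, 'clk': 560, 'data0': 500}
```

In this example the power line grows although the rest of the layout shrinks, the clock line scales proportionally, and the data line is clamped to the assumed minimum width.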
Increase the Productivity of Full Custom
Design
If the simulation results and the area
requirements of an existing full custom design require a totally new layout
design, LACE adjusts the layout automatically. A design team performed
a benchmark between LACE and manual layout. They manually laid out a design
in 0.5-micron technology. Then, because they wanted to reuse the same layout
for a 0.35-micron technology, they shrank it from the 0.5-micron to the 0.35-micron technology.
The minimum design rules of DSM technologies, even for the same fab, don't
change linearly. In this case, only three design rules changed, so the
traditional approach of linear shrink did not work. They could achieve only
a gate shrink for the 0.35-micron technology. This resulted in the same
die size as the 0.5-micron version. This was not acceptable because the
design had to be integrated into a larger system and the area had
to be smaller.
To meet the area requirement, they laid out
the design again manually. The 0.35-micron version had the same topology as the
0.5-micron design, but now the 0.35-micron version was 20% smaller than
the 0.5-micron one. The drawback to the manual approach was that they needed
24 person months for layout. Even with four layout designers, they
needed six months to produce a DRC- and LVS-clean (layout vs. schematic)
layout.
The designers then used LACE to do an
automatic compaction of the 0.5-micron layout to the 0.35-micron technology.
During this compaction the power and signal lines of the 0.5-micron layout
were adjusted according to the results of the high-level simulation. In
total, 36 different signal line width adjustments were necessary to meet
the requirements of the simulation. Output drivers had to be increased.
The width of all other transistors remained the same. The design required
inserting additional transistor devices into the dense layout where there
was no space in the original design. The designer drew the devices as tiny
polygons and LACE automatically adjusted the size and design rules during
the compaction. LACE's run time was eight hours for a design that required
24 person months for manual re-layout. The area of the compacted layout
produced by LACE was the same as that of the manually drawn one.
Counting all communication time and setup
time, this compares to less than one person month vs. 24 person months of
effort for this design migration to a smaller technology. That is a time
saving of over 90%!
The benefit with this compaction approach
is that the engineer does not need to know the design in detail to run
a conversion, because LACE maintains the netlist, functionality and topology.
Accelerating New Design
SoCs are becoming larger and larger, but
because of shorter market windows and shorter product life cycles, the
design time needs to decrease. Roger Fischer of TSMC USA's office of TSMC's
president said recently in an interview with Electronic News that if foundry-ready
IP is available to customers to plug into their designs, this can save
OEMs between three and seven months in time-to-market.
Since technology development and product
development have to be done in parallel, IC and layout designers are faced
with the challenge of starting a design before all technology parameters
and design rules are stable. In the old days, the newest fabs could be
proven by producing memory for one or two years. Then designers could use these
proven models and design rules to design chips and libraries for a specific
technology. Now designers often start with preliminary models and rules,
which may change during or right after the design process.
Design teams at library providers, fabless
design houses, and chip manufacturers all face the same problem. LACE solves
it by adjusting layout designs, at the last minute, to the foundry's latest
design rules.
Standard Cell Migration
Creating standard cell libraries is
time-consuming. The bottleneck in the production of a library is the physical
design. Third party library providers choose different methodologies to
create the physical design.
Some compile the library using an automatic
tool, others draw them manually. A library, targeted for different fabs,
uses a subset of the design rules for each target fab. The nature of this
approach means the design rules are not the smallest possible for each
fab. Since the competition in the IC business has grown and margins are
shrinking, area is an issue for IC designs. Designers want to have the
smallest possible library for their design. In the past, design groups shrank
libraries from one technology to another, even if they lost 10-20% of the
area because the shrink factor was not optimum. They used a bigger shrink
factor than optimum to avoid design rule violations in the new process.
Tests have shown that a library shrunk from a 0.8-micron process to a 0.5-micron
process could be compacted 20% smaller by LACE by using the real design
rules for the 0.5-micron process.
For DSM technologies one library per technology
is not sufficient. Different libraries are needed that differ
in device sizes and power lines depending on the design size for the library.
If the standard cells are drawn manually, designers need a fast approach
to automatically adjust the physical design to last-minute design rule
changes and to the requirements of the design. Using LACE for the modification
and automatic adjustment of a library enables designers to automatically
create the library needed for a specific technology and a specific design.
LACE does all the necessary adjustments of cell height, power bus width,
grid adjustment of vias and pickups, design rule correction and transistor
sizing.
To compact the physical layout of a standard
cell library with about 400 cells, LACE needs only an overnight run on an
UltraSPARC1.
After compaction the library is design
rule correct, all ports have the right position, and the cell height is
met. Drawing a new standard cell library by hand takes on average about
two hours to three days per cell, or 100 to 1200 days for a library with
about 400 cells. Compared with this, an overnight run using LACE plus five to ten days
of setup time for a specific cell architecture and technology yields a saving
of about 90-99%.
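The library figures can be checked with a short calculation. It assumes an eight-hour working day, which is what turns two hours per cell into 100 days for 400 cells, and it counts the overnight run as one full day. Both assumptions are made here for illustration and are not stated in the article.

```python
# Effort to hand-draw vs. compact a 400-cell standard cell library.
# ASSUMPTIONS: an 8-hour working day; the overnight run counts as one day.

CELLS = 400
HOURS_PER_DAY = 8

manual_days_low = CELLS * 2 / HOURS_PER_DAY  # two hours per cell
manual_days_high = CELLS * 3                 # three days per cell

def saving(manual_days, setup_days, run_days=1):
    """Fraction of manual effort saved by setup plus one compaction run."""
    return 1 - (setup_days + run_days) / manual_days

# Least favorable case: fastest manual drawing, longest setup.
worst = saving(manual_days_low, setup_days=10)
# Most favorable case: slowest manual drawing, shortest setup.
best = saving(manual_days_high, setup_days=5)

print(f"manual effort: {manual_days_low:.0f} to {manual_days_high} days")
print(f"saving: {worst:.0%} to {best:.1%}")
```

Under these assumptions the saving ranges from roughly 89% to 99.5%, consistent with the range of about 90-99% given above.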
For DSM technologies, an automatic layout
compaction approach for last minute design rule changes after the physical
layout is a strategic advantage.
Full Custom Design
DSM technologies below 0.35 micron have
grid values of 20nm and below. Some technologies operate with a 5nm grid,
but feature sizes (e.g. contact width) are still up to 300-400nm. That
means the ratio of feature size to grid size is 400:5, or 80:1. In a 0.8-micron
technology the average contact size was 1000nm and the average grid size 100nm.
At that time the ratio of feature size to grid size was 1000:100,
or 10:1. The productivity of layout designers slows down significantly
with the feature and grid sizes of DSM technologies. At the same time the
productivity has to increase. To close this gap, advanced layout tools
are needed that compensate for the disadvantage of smaller grid size, give
the designer more flexibility, and increase productivity to meet production
deadlines. Layout designers need an automatic tool that corrects the design
rules and adjusts wires and devices to the appropriate width during the
initial drawing process. LACE's polygon editor in combination with
LACE's compactor provides this capability. The designer places devices
as in the past but without taking much care over correct design rules.
When the designer runs a compaction, LACE automatically adjusts the geometrical
structures and sizes transistors and wires according to the given width
and length. This methodology gives the designer flexibility for all upcoming
design rule changes during the design process and speeds up the drawing
process by a factor of 10 to 100.
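The feature-to-grid ratios quoted above translate directly into the number of grid positions a designer must choose between when placing one edge by hand. The sketch below computes that ratio and adds a snap-to-grid helper of the kind an automatic tool could apply after loose drawing. The helper is an illustrative assumption, not a description of LACE.

```python
# Grid resolution relative to feature size, plus snapping a coordinate to grid.
# The snap helper is an illustrative assumption, not a description of LACE.

def grid_steps_per_feature(feature_nm, grid_nm):
    """How many grid steps span one minimum feature."""
    return feature_nm // grid_nm

def snap_to_grid(coord_nm, grid_nm):
    """Round a non-negative integer coordinate to the nearest grid point."""
    return (coord_nm + grid_nm // 2) // grid_nm * grid_nm

print(grid_steps_per_feature(1000, 100))  # 0.8-micron example: 10
print(grid_steps_per_feature(400, 5))     # DSM example: 80
print(snap_to_grid(403, 5))               # 405
```

With eight times as many grid positions per feature, hand placement on the exact grid becomes slower and more error-prone, which is the productivity gap the paragraph describes.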
Conclusion
DSM design is fundamentally changing IC
design tools and methodologies. While conventional design tools and methodologies
assume the predominance of gate-related delays, the DSM design process
must account for the increased importance of the chip's interconnect. Gates,
as well as interconnects, must be modeled more accurately to reflect DSM
effects.
It must be possible to forward transistor-level simulation results
to an automatic layout modification tool. Designers
need to document their designs accurately to make design reuse a day-to-day,
cost-effective approach. Physical design reuse is extremely cost-effective,
timesaving and predictable if you choose the right migration approach to
shift to new or different technologies.