Thursday, November 9, 2017

How does PrimeTime identify the VDD value in a multi-VDD chip?

PrimeTime identifies the VDD value in a multi-VDD chip by using the lppi (link path per instance) file.

It lists the blocks and the VDD on which each block operates.

What are the commands inside it to define this?

Wednesday, November 8, 2017

Setup and hold time trade-off while fixing

-start_end_pair option
Fix setup
Remove delay cells and back-to-back buffers

Setup and hold can't both fail on the same path.
Fix setup first
You can also add delay at the start point to fix hold

Add delay at the capture flop, in a hierarchy where setup is not affected.

Tuesday, November 7, 2017

Query1: Trade-off: hold check violates once setup fixes are done

I have setup violations from FF1 to FF2. There is little scope in the data path, and both clock push and clock pull options are available. But hold to FF2 starts failing as soon as I fix the setup issues.

How do I proceed in this situation?
Use the -start_end_pair option

Setup and hold can't fail on the same path, i.e. the start point or the end point is different for the setup and hold violations.

Fix setup first
Fix hold by adding delay at the start point, or in the capture path where hold is failing. The delay must lie on a different path, one where setup will not fail.

Monday, November 6, 2017

buffer vs delay cells to fix hold violations

i) When there is no local congestion at the endpoint:

Delay cells are generally bigger than buffers, but 1 delay cell can give 80ps (for example) while a buf cell gives only 20ps. So instead of putting 4 buf cells (the area of 4 buf cells is more than that of 1 delay cell for the same delay), it is better to put 1 delay cell.


ii) When there is local congestion, an 80ps delay cell may not fit there, whereas 4 buf cells can be inserted one on top of the other on different routing tracks.

So it is difficult to say in general which one to use; the better choice depends on the particular ECO.
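
The buffer versus delay cell comparison above is simple arithmetic, sketched here in Python. The 20ps and 80ps delays come from the note; the cell areas are made-up illustration values, not taken from any library.

```python
import math


def cells_needed(required_delay_ps, cell_delay_ps):
    """Number of identical cells needed to add at least the required delay."""
    return math.ceil(required_delay_ps / cell_delay_ps)


def total_area(required_delay_ps, cell_delay_ps, cell_area):
    """Total area of the cells needed to add the required delay."""
    return cells_needed(required_delay_ps, cell_delay_ps) * cell_area


# Delays from the note above; areas are hypothetical.
BUF_DELAY_PS, BUF_AREA = 20, 1.0
DLY_DELAY_PS, DLY_AREA = 80, 2.5

hold_violation_ps = 80
buf_count = cells_needed(hold_violation_ps, BUF_DELAY_PS)
dly_count = cells_needed(hold_violation_ps, DLY_DELAY_PS)
buf_area = total_area(hold_violation_ps, BUF_DELAY_PS, BUF_AREA)
dly_area = total_area(hold_violation_ps, DLY_DELAY_PS, DLY_AREA)

print(f"buffers: {buf_count} cells, area {buf_area}")
print(f"delay cells: {dly_count} cells, area {dly_area}")
```

With these numbers the single delay cell wins on area, but the four small buffers are easier to place one by one in a congested region.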

get attributes of different classes

How to find which classes are there?

list_attributes  -application  -class  xyz

i) Physical class (pin, port, net, cell)
ii) Virtual class (clock)


Example:  list_attributes -application -class lib_cell
 reports all attributes related to lib_cell

get_attribute [get_cells cell_name] attribute_name


To do: analysis on crosstalk-affected nets

If the crosstalk delta on a net is large,

i) Upsize the driver

STA: do the following analysis:
report_delay_calc  -from net_source_pin    -to    net_sink_pin

Gives aggressor nets and victim nets and related analysis
    i) Downsize aggressor
    ii) Upsize victim


To PD guys:
i) If it is in the clock path, use double spacing or shielding techniques
ii) If the net is in the data path, move it to a higher metal layer to reduce delay, and vice versa


whatif analysis


Let us assume we are pushing the clock at FF2/clk

          -  Understand the impact on the hold violation at FF2/d
          -  Understand the impact on next-stage setup violations

This analysis is called what-if analysis.
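
A first-order sketch of this what-if check in Python. It assumes the push simply shifts the FF2 clock edge by a fixed amount and ignores slew and crosstalk changes; the function name and all slack values are hypothetical.

```python
def clock_push_whatif(push_ps, setup_into_ff2, hold_into_ff2, setup_from_ff2):
    """Estimate slacks (in ps) after delaying the clock at FF2/clk by push_ps."""
    return {
        # FF2 captures later, so setup into FF2 gains the push
        "setup_into_ff2": setup_into_ff2 + push_ps,
        # FF2 captures later, so hold at FF2/d loses the push
        "hold_into_ff2": hold_into_ff2 - push_ps,
        # FF2 launches later, so next-stage setup loses the push
        "setup_from_ff2": setup_from_ff2 - push_ps,
    }


# Hypothetical slacks in ps: setup into FF2 is failing by 30ps.
result = clock_push_whatif(push_ps=50,
                           setup_into_ff2=-30,
                           hold_into_ff2=80,
                           setup_from_ff2=100)
for check, slack in result.items():
    status = "OK" if slack >= 0 else "VIOLATED"
    print(f"{check}: {slack} ps {status}")
```

In this example a 50ps push fixes the setup violation and still leaves positive hold slack at FF2/d and positive setup slack at the next stage.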

Getting scaling factor of the library from min to max corners

While working in the ECO phase, it is important to know the impact of an SS corner setup fix on FF corner hold timing.

It is better to have this ratio upfront, to know whether an SS corner fix will violate FF corner hold or not. This saves a lot of time.

How to get this:

i) Take a max corner timing path
Note down the buf 1x delay (assuming a 1x buf is there in that path)

ii) Go to the min corner and note down the delay of the same 1x buf
report_timing -through  above_buffer_instance

             Scaling factor = (max corner delay/min corner delay) 

Repeat this exercise for the available drives in that library.
get_lib_cells */buf*
     - to get the drives available in that library.

Prepare a chart; it is constant for a given technology.
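
A short Python sketch of how the ratio can be used. The buffer delays and slacks below are hypothetical numbers chosen for illustration; real values come from the two report_timing runs.

```python
def scaling_factor(max_corner_delay, min_corner_delay):
    """Ratio of the same cell's delay at the max (SS) and min (FF) corners."""
    return max_corner_delay / min_corner_delay


def min_corner_equivalent(max_corner_delay_change, factor):
    """Estimate the min-corner delay change caused by a max-corner delay change."""
    return max_corner_delay_change / factor


# Hypothetical buf 1x delays in ps, read from the max and min corner reports.
factor = scaling_factor(max_corner_delay=60.0, min_corner_delay=20.0)

# A setup fix removes 90ps of data path delay at the SS corner.
removed_at_ff = min_corner_equivalent(90.0, factor)

# Hypothetical FF corner hold slack on the same path before the fix.
hold_slack_before = 25.0
hold_slack_after = hold_slack_before - removed_at_ff

print(f"scaling factor: {factor}")
print(f"FF hold slack after fix: {hold_slack_after} ps")
```

Here the SS fix removes about 30ps at the FF corner, which is more than the 25ps of hold margin, so the fix would create a hold violation.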


iii) Check whether you get the same info from estimate_eco.

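report_timing -path_type summary

This command prints in the following format
[startpoint   sp_lib_cell  endpoint  ep_lib_cell  slack]

# report_timing only prints a report; get_timing_paths returns a path collection
set paths [get_timing_paths  -max_paths  10]

foreach_in_collection p1  $paths {
      # print the startpoint, endpoint and slack of each path
      puts "[get_object_name [get_attribute $p1 startpoint]] [get_object_name [get_attribute $p1 endpoint]] [get_attribute $p1 slack]"
}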

Clock push or pull techniques


Identifying common cell in the multi branch flops (next stage):

report_transitive_fanin -to FF1/clk

See from which cell the clock paths diverge and do insert_buffer at that cell, to fix all the flops at the same time.
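
Finding the cell where the clock paths diverge amounts to finding the last common element of the clock paths. A minimal Python sketch, assuming each clock path has already been extracted as a list of cell names from the clock root to the flop; all instance names are hypothetical.

```python
def last_common_cell(clock_paths):
    """Return the last cell shared by every clock path (root to flop), or None."""
    common = None
    # zip stops at the shortest path, which is all we need for a common prefix
    for cells in zip(*clock_paths):
        if all(cell == cells[0] for cell in cells):
            common = cells[0]
        else:
            break
    return common


# Hypothetical clock paths of three next-stage flops.
paths = {
    "FF2": ["ckbuf_root", "ckbuf_a", "ckbuf_b", "ckbuf_b1"],
    "FF3": ["ckbuf_root", "ckbuf_a", "ckbuf_b", "ckbuf_b2"],
    "FF4": ["ckbuf_root", "ckbuf_a", "ckbuf_b", "ckbuf_b2", "ckbuf_b3"],
}

divergence_cell = last_common_cell(list(paths.values()))
print(f"insert the buffer at: {divergence_cell}")
```

A buffer inserted at the output of that cell shifts the clock of every flop below it by the same amount.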


Good clock tree:
The clock path should diverge close to the flops; otherwise

i) The number of cells is higher => high dynamic power
ii) The common path should be as long as possible (CRP should be high); otherwise PVT and RC corner variations and skew are larger => high uncertainty values

Because of a bad clock tree, the worst hold failures may happen at the min, low voltage corner rather than at the min, high voltage corner.


cgc setup and hold checks

Why clock gaters:
    Clock gating is a well-known technique to reduce dynamic power consumption.

i) If done properly: saves the dynamic power of the flops
ii) If not done properly: glitches may lead to metastability issues.

Advantages of using a CGC:

1. It's worth mentioning that clock gating does not have much significance on individual flops. But imagine a scenario of writing a 64-bit register based on an enable. The power dissipated by all 64 flops can be greatly reduced by using a single clock gate cell common to all 64 flops, which amounts to a very significant power saving.

2. On a side note, clock gating can greatly help save area in the scenario described, by getting rid of the large mux'ed feedback path which would otherwise be necessary to meet logic requirements.

Drawback of a simple AND as CGC:
Think of a clock gate as a simple AND with an enable gating the clock. The reason you do this is to stop unnecessary toggles on the clock pin of flops. Even if the flop output doesn't toggle, the internal flop circuitry dissipates unnecessary power. Power saving can be achieved by simply gating the clock with an enable.

1. Here's the catch: if the enable is asynchronous to the clock and gates the clock during its active phase, you can end up with a clipped clock, which affects the duty cycle.

2. This scenario can lead to timing violations on the flop and downstream logic. If clock clipping happens very close to the active edge of the clock, there might even be a clock pulse width violation.

How to mitigate this problem:
To prevent violations, it's best to sync the enable signal with respect to the clock it is gating. This is achieved by using a latch which is transparent only during the inactive phase of the clock. For example, to gate the clock to its low state, use an active-low latch to sync the enable and gate the clock with the sync'ed version, as described by the code below:

Code:
always_latch
   if(!clk) en_syn <= en;

assign gated_clk = clk & en_syn;

always_ff @(posedge gated_clk) ....

Many ASIC vendors supply clock-gate cells, which are internally a combination of the latch and the gate described above. These cells can be instantiated in the design or, better yet, inserted during synthesis (if the RTL is coded following the tool's requirements for clock-gate insertion).

The clock_gating_setup_time is essentially the setup requirement of the latch and is available for STA delay annotation in the .db files read by PT.
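
The difference between a plain AND gate and the latch-based gate can be seen in a small sampled-waveform model. This is only an illustrative Python sketch with zero gate delays and a hypothetical enable pattern that toggles in the middle of the clock's high phase; it is not a timing-accurate simulation.

```python
def simulate(clk, en):
    """Return (and_gated, latch_gated) clock waveforms for sampled clk/en lists."""
    and_gated, latch_gated = [], []
    en_syn = 0  # latch output, assumed to start at 0
    for c, e in zip(clk, en):
        if c == 0:
            en_syn = e  # active-low latch: transparent only while clk is low
        and_gated.append(c & e)
        latch_gated.append(c & en_syn)
    return and_gated, latch_gated


def pulse_widths(wave):
    """Widths (in samples) of each high pulse in a sampled waveform."""
    widths, run = [], 0
    for v in wave:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Three clock cycles: 4 samples low, then 4 samples high.
clk = ([0] * 4 + [1] * 4) * 3
# Enable rises mid-way through the first high phase and falls mid-way
# through the third high phase (asynchronous to the clock).
en = [0] * 6 + [1] * 16 + [0] * 2

and_out, latch_out = simulate(clk, en)
print("simple AND pulse widths: ", pulse_widths(and_out))
print("latch + AND pulse widths:", pulse_widths(latch_out))
```

The plain AND output contains clipped pulses narrower than the 4-sample high phase, while the latch-based output only ever passes full-width pulses.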





How to model:

-ve latch followed by AND gate:
En is input to -ve latch, CLK is the clock of the -ve latch
Output of latch and CLK are going to AND
Output of AND is going to set of flops

Let us assume EN pin is gating clock pin (CLK)

The gating check is performed on pins (EN) that gate a clock signal.

When Clock gating Checks required:

Clock gating checks need to be done where the clock is gated with a data or enable signal. The basic idea here is to check whether the enable signal is toggling only when the clock is in its inactive phase. If the enable toggles in the clock's active phase, it will result in a glitch on the gated output clock.

An AND/NAND gate is inactive when the clock is in the LOW phase, i.e. at that time the gate output does not depend on the other inputs. So an AND/NAND gate has a HIGH clock gating check. Similarly, an OR/NOR gate is inactive when the clock is in the HIGH phase, so it has a LOW clock gating check.

Setup violation:
         Similar to a normal flop; here the D input is EN and the clock is CLK, and we are checking the setup of the latch.

The clock gating setup check is used to ensure the controlling data signals are stable before the clock is active, i.e. the EN signal must settle while the clock is still low, when the -ve latch is transparent.

This check is performed on combinational gates through which the clock signals are propagated.

The arrival time of the leading edge of the clock pin is checked against both levels of any data signals gating the clock.

Consequence of a clock gating setup failure: it can cause either a glitch at the leading edge of the clock pulse, or a clipped clock pulse.


How to fix setup violations in PrimeTime:
          Let us assume setup violation is occurring in the below path
 start point:  FF1 clock
 end point: EN input of CGC

Fix1:
1. Reduce data path delay:

2. Clock push:
           - adding delay in the clock path (inserting buffers at CLK input pin of CGC)

When can this clock push be done?

      In general, clock push can be done when there is setup margin at the next stage.
As the output of the CGC is the start point for setup of the downstream flops, our clock push should not result in setup violations on the timing paths starting at those flops.


## Get the slack of the next-stage flops (fanout of the CGC under fix), as the gated clock is the start point for those flops:
set f [open slc_cg.txt w]
# All flops in the fanout of the CGC clock pin
set fo [get_cells [all_fanout -from CGC_cell/clk -flat -endpoints_only -only_cells]]

foreach_in_collection fo1 $fo {
    set name [get_attribute $fo1 full_name]
    # Worst setup path launched from this flop (clock pin assumed to be named clk), ignoring paths to ports
    set slack [get_attribute [get_timing_paths -from $name/clk -exclude [get_ports *]] slack]
    puts $f "$name $slack"
}
close $f

ECO:
insert_buffer  path_till_CGC/CGC_cell/clk_in          clk_buffer_1x_drive
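
The slack list written to slc_cg.txt bounds how far the CGC clock can be pushed. A minimal Python sketch, assuming each line holds a flop name and its setup slack as written by the script above; the flop names and slack values here are hypothetical.

```python
def max_clock_push(slack_report):
    """Largest clock push that keeps every listed next-stage setup slack >= 0."""
    slacks = []
    for line in slack_report.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank lines and flops reported without a slack
        slacks.append(float(fields[1]))
    if not slacks:
        return 0.0
    # The worst next-stage slack is the whole budget; no budget if it is negative
    return max(0.0, min(slacks))


# Hypothetical contents of slc_cg.txt (slack in ns).
report = """\
blk/u_reg_0 0.120
blk/u_reg_1 0.045
blk/u_reg_2 0.300
"""

budget = max_clock_push(report)
print(f"clock at the CGC can be pushed by at most {budget} ns")
```

The buffer chosen for the ECO should add less delay than this budget at the setup corner.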





Hold check:
The clock gating hold check is used to ensure that the controlling data signals are stable while the clock is active.

The arrival time of the trailing edge of the clock pin is checked against both levels of any data signal gating the clock. A clock gating hold failure can cause either a glitch at the trailing edge of the clock pulse, or a clipped clock pulse.
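
For a HIGH clock gating check (AND/NAND type), the two checks together mean the enable may only change inside a window in the clock's low phase. A simplified Python sketch of both slacks, ignoring clock latency, uncertainty and derating; the function name and all numbers are hypothetical.

```python
def gating_slacks(en_arrival, clk_fall, clk_rise, setup=0, hold=0):
    """Slacks of a high-level (AND/NAND type) clock gating check, same unit throughout.

    en_arrival: time at which the enable changes
    clk_fall:   trailing edge of the previous active (high) pulse
    clk_rise:   leading edge of the next active (high) pulse
    """
    setup_slack = (clk_rise - setup) - en_arrival  # EN must settle before the leading edge
    hold_slack = en_arrival - (clk_fall + hold)    # EN must not change before the trailing edge has passed
    return setup_slack, hold_slack


# Hypothetical 1000ps clock: high from 0 to 500ps, low from 500 to 1000ps.
CLK_FALL, CLK_RISE = 500, 1000
SETUP, HOLD = 100, 50

good = gating_slacks(700, CLK_FALL, CLK_RISE, SETUP, HOLD)
early = gating_slacks(520, CLK_FALL, CLK_RISE, SETUP, HOLD)
late = gating_slacks(950, CLK_FALL, CLK_RISE, SETUP, HOLD)

print("EN at 700ps (setup, hold):", good)
print("EN at 520ps (setup, hold):", early)
print("EN at 950ps (setup, hold):", late)
```

An enable that changes too soon after the falling edge fails hold, and one that changes too close to the next rising edge fails setup.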


References:
http://tech.tdzire.com/clock-gating-checks-and-clock-gating-cell/


How to define CGC constraints:

EXAMPLES
         The following example specifies a setup time of 0.2 and a hold time of 0.4 for all gates in the clock network of clock CK1.

                 pt_shell> set_clock_gating_check -setup 0.2 -hold 0.4 [get_clocks CK1]

         The following example specifies a setup time of 0.5 on the gate and1.

                 pt_shell> set_clock_gating_check -setup 0.5 [get_cells and1]



set_clock_gating_check

NAME
         set_clock_gating_check
                    Specifies the value of setup and hold time for clock gating
                    checks.

SYNTAX
         string set_clock_gating_check
                    [-setup setup_value]
                    [-hold hold_value]
                    [-rise | -fall]
                    [-high | -low]
                    [object_list]

         float       setup_value
         float       hold_value
         list object_list

ARGUMENTS
         -setup setup_value
                    Specifies the clock gating setup time. The default is 0.0.

         -hold hold_value
                    Specifies the clock gating hold time. The default is 0.0.

         -rise   Indicates that only rising delays are to be constrained. By
                    default, if neither -rise nor -fall is specified, both rising
                    and falling delays are constrained.

         -fall   Indicates that only falling delays are to be constrained. By
                    default, if neither -rise nor -fall is specified, both rising
                    and falling delays are constrained.

         -high   Indicates that the check is to be performed on the high level of
                    the clock. By default, PrimeTime determines whether to use the
                    high or low level of the clock using information from the cell's
                    logic. That is, for AND and NAND gates PrimeTime performs the
                    check on the high level; for OR and NOR gates, on the low level.
                    For some complex cells (for example, MUX, OR-AND) PrimeTime
                    cannot determine which to use, and does not perform checks unless
                    you specify either -high or -low. If the user-specified value
                    differs from that derived by PrimeTime, the user-specified value
                    takes precedence, and a warning message is issued. Unlike setup
                    or hold time, this option sets the attribute only on the
                    specified pin or cell and does not affect the transitive fanout
                    pin or cell. If you specify -high or -low you must also specify
                    object_list; in that case, object_list must not contain a clock
                    or a port.

         -low    Indicates that the check is to be performed on the low level of
                    the clock. By default, PrimeTime determines whether to use the
                    high or low clock level using information from the cell's logic.
                    That is, for AND and NAND gates PrimeTime performs the check on
                    the high level; for OR and NOR gates, on the low level. For some
                    complex cells (for example, MUX, OR-AND) PrimeTime cannot
                    determine which to use, and does not perform checks unless you
                    specify either -high or -low. If the user-specified value differs
                    from that derived by PrimeTime, the user-specified value takes
                    precedence, and a warning message is issued. Unlike setup or
                    hold time, this option sets the attribute only on the specified
                    pin or cell and does not affect the transitive fanout pin or
                    cell. If you specify -high or -low you must also specify
                    object_list; in that case, object_list must not contain a clock
                    or a port.

         object_list
                    Specifies a list of objects in the current design for which the
                    clock gating check is to be applied. The objects can be clocks,
                    ports, pins, or cells. If a cell is specified, all input pins of
                    that cell are affected. If a pin, cell, or port is specified,
                    all gates in the transitive fanout are affected. If a clock is
                    specified, the clock gating check is applied to all gating gates
                    driven by that clock. If you specify -high or -low you must
                    also specify object_list; in that case, object_list must not
                    contain a clock or a port. By default, if object_list is not
                    specified, the clock gating check is applied to the current
                    design.

DESCRIPTION
         The set_clock_gating_check command specifies a setup or hold time clock
         gating check to be used for clocks, ports, pins, or cells. The gating
         check is performed on pins that gate a clock signal.

Removing CGC checks:
remove_clock_gating_check
Removes clock-gating checks.

SYNTAX

string remove_clock_gating_check
[-setup]
[-hold]
   [-rise]
   [-fall]
   [-high | -low]
   [object_list]

list object_list

ARGUMENTS

-setup
Indicates the removal of the clock-gating constraint on the setup time only. If you do not specify either the -setup or -hold option, both setup and hold constraints are removed.
-hold
Indicates the removal of the clock-gating constraint on the hold time only. If you do not specify either the -setup or -hold option, both setup and hold constraints are removed.
-rise
Indicates the removal of the clock-gating constraint on the rising delays only. If you do not specify either the -rise or -fall option, constraints on both rising and falling delays are removed.
-fall
Indicates the removal of the clock-gating constraint on the falling delays only. If you do not specify either the -rise or -fall option, constraints on both rising and falling delays are removed.
-high
Removes the high specification from the object list, previously set by the set_clock_gating_check command. Specify either -high or -low, not both.
-low
Removes the low specification from the object list, previously set by the set_clock_gating_check command. Specify either -high or -low, not both.
object_list
Specifies a list of objects in the current design for which to remove the clock gating check. The objects can be clocks, ports, pins, or cells. If you specify a cell, all input pins of that cell are affected. If you do not specify any objects, the clock-gating check is removed from the current design.

DESCRIPTION

This command is available only if you invoke the pt_shell with the -constraints option.
The remove_clock_gating_check command removes clock gating checks for design objects set by set_clock_gating_check.

EXAMPLES

The following example removes the setup requirement (for rising and falling delays) on all gates in the clock network of clock CK1.
ptc_shell> remove_clock_gating_check -setup [get_clocks CK1]

The following example removes the hold requirement on the rising delay of gate and1.
ptc_shell> remove_clock_gating_check -hold -rise [get_cells and1]
An alternative way to remove information set by set_clock_gating_check is to use the reset_design command.