VHDL SIMULATIONS - TIPS FOR SPEEDING UP
                         Vijay Madisetti
     
     
            Send additions to V. Madisetti, vkm@eedsp.gatech.edu
                   First  version  July 21, 1995
                   Second version  Aug 30,  1995
                   This version    Mar 5,   1996
     
=====================================================================
     
 Use of VHDL processes
==================================================================== 
     
     
 1.0.  Each process invocation is costly. 
     
  (Additional comments from Randy Harr (RH))
  The cost of a process depends on its signals: the signals that
  communicate information in and out of processes are what is really
  costly.  Each process represents a primitive, schedulable behavior
  element (like gates in older logic simulators).
     
 1.1  Each concurrent assignment is treated AS A PROCESS         <----
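
 As a small illustration of 1.1, the sketch below places a concurrent
 assignment next to the explicit process it is equivalent to.  The entity
 and port names are made up for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, used only to illustrate the equivalence.
entity and_gate is
  port (a, b : in  std_ulogic;
        y1   : out std_ulogic;
        y2   : out std_ulogic);
end entity and_gate;

architecture sim of and_gate is
begin
  -- Concurrent signal assignment: the simulator treats it as a process
  -- that is sensitive to a and b.
  y1 <= a and b;

  -- The equivalent explicit process.
  equiv : process (a, b)
  begin
    y2 <= a and b;
  end process equiv;
end architecture sim;
```

 Both forms cost one process each, so ten concurrent assignments mean
 ten processes to schedule.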
     
     
 1.2  Block and Generate statements have SAME overhead as a process. <---
     
     
 1.3  Sensitivity list or Wait statements ?                        <---- 
     
   You could use sensitivity lists since they are statically 
   allocated.  Wait statements are dynamically allocated and cost 
   a lot in terms of simulation. 
     
  (Additional comments - Randy Harr, ARPA)
     
  True, but remember that a sensitivity list is a limited form of a wait.
  Also, it is evaluated AT THE END OF THE PROCESS (i.e. the process is
  executed at simulation time 0 and only reblocks when it gets to a wait
  or the end of the process, i.e. sensitivity list).
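
  The point about the sensitivity list being a limited form of wait can
  be sketched as follows.  The two processes behave identically; the
  entity and signal names are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: two flip-flops described in the two styles.
entity dff_pair is
  port (clk, d : in  std_ulogic;
        q1, q2 : out std_ulogic);
end entity dff_pair;

architecture sim of dff_pair is
begin
  -- Style 1: sensitivity list.
  with_list : process (clk)
  begin
    if clk'event and clk = '1' then
      q1 <= d;
    end if;
  end process with_list;

  -- Style 2: the same behavior written with an explicit wait.
  -- The wait sits at the END of the process, which is where the
  -- sensitivity list above is implicitly evaluated.
  with_wait : process
  begin
    if clk'event and clk = '1' then
      q2 <= d;
    end if;
    wait on clk;
  end process with_wait;
end architecture sim;
```

  Both processes run once at simulation time 0 and then suspend until
  the next event on clk.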
     
     
 1.4  Functions sensitive to the same signal should be in the SAME
      process (obvious reasons!). 
     
 - Use separate processes for each clock in multiple clock systems.
   (to prevent evaluation of statements due to unrelated 
   clock signals)
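
 A minimal sketch of the multiple-clock advice, assuming two unrelated
 clocks.  All names are illustrative.

```vhdl
-- Hypothetical two-clock design: one process per clock domain.
entity two_clocks is
  port (clk_a, clk_b : in  bit;
        d_a, d_b     : in  bit;
        q_a, q_b     : out bit);
end entity two_clocks;

architecture sim of two_clocks is
begin
  -- Woken only by events on clk_a.
  domain_a : process (clk_a)
  begin
    if clk_a'event and clk_a = '1' then
      q_a <= d_a;
    end if;
  end process domain_a;

  -- Woken only by events on clk_b.
  domain_b : process (clk_b)
  begin
    if clk_b'event and clk_b = '1' then
      q_b <= d_b;
    end if;
  end process domain_b;
end architecture sim;
```

 A single process sensitive to both clocks would be invoked on every
 edge of either clock and would have to test both conditions each time.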
     
 1.5  Putting blocks and functions in processes and "protecting" them
    from execution via a conditional statement, will NOT be of use, 
    since the process is invoked whenever the sensitivity list is 
    stimulated. 
    Generate statements can be used around processes, and will exclude 
    process from simulation when disabled. 
     
  (Additional comments - Todd Carpenter, Honeywell)
To be a wee bit more precise, it excludes them from being stimulated. 
We have seen with at least Vantage, that the memory is still consumed.
This matters for us when some of our generate-wrapped instantiations are 
several MB once elaborated.  So, we had to work some magic with
generate bounds.
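
 The generate approach from 1.5 might look like the sketch below.  The
 generic ENABLE_CHECK and the checker itself are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical protocol monitor that can be switched off at elaboration.
entity monitor is
  generic (ENABLE_CHECK : boolean := false);
  port (clk, req, ack : in std_ulogic);
end entity monitor;

architecture sim of monitor is
begin
  -- When ENABLE_CHECK is false the process is never created, so it is
  -- never stimulated.  An "if ENABLE_CHECK then" inside the process
  -- would still wake the process on every clock event.
  gen_check : if ENABLE_CHECK generate
    check : process (clk)
    begin
      if clk'event and clk = '1' then
        assert not (ack = '1' and req = '0')
          report "ack asserted without req"
          severity warning;
      end if;
    end process check;
  end generate gen_check;
end architecture sim;
```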
     
     
     
 Use of VHDL Signals
=========================================
     
 2.0  Variables should be used instead of signals.  The overhead of
   each signal is as follows:
           >  each signal requires one or more drivers. 
           >  specific handling and event scheduling
           >  memory storage
           >  more instructions to execute. 
     
     
 (Additional comments from Randy Harr)
 Not to mention the creation of additional signals ('attributes, etc.),
 signal history, signal kinds (bus vs register), and resolved signals
 being evaluated on every assignment independent of whether there is a
 value change.
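
 As a sketch of 2.0, the process below keeps its intermediate result in
 a variable and makes one signal assignment per invocation.  The parity
 example and its names are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit parity calculator.
entity parity8 is
  port (data : in  std_ulogic_vector(7 downto 0);
        odd  : out std_ulogic);
end entity parity8;

architecture sim of parity8 is
begin
  calc : process (data)
    -- A variable has no driver and no event scheduling.
    variable acc : std_ulogic;
  begin
    acc := '0';
    for i in data'range loop
      acc := acc xor data(i);  -- takes effect immediately
    end loop;
    odd <= acc;                -- the only signal assignment
  end process calc;
end architecture sim;
```

 Using a signal for acc would not even work inside the loop, since a
 signal assignment does not take effect until the process suspends.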
     
     
     
 2.1   If a signal is calculated from other signals that change only
      a few times, and the calculation is repeated every time the process
      is invoked, execution time increases needlessly.  Perform such
      calculations only when their inputs change.
     
 (Additional comments - Todd Carpenter)
 New synthesis users must be careful with this.  Variables in a clock 
 triggered block infer registers.
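
 One way to apply 2.1 is to move the slowly changing calculation into
 its own process, as sketched below.  The names are hypothetical, and
 the approach costs one extra signal, so it pays off only when the
 inputs change much less often than the clock.

```vhdl
-- Hypothetical threshold comparator.
entity threshold is
  port (clk          : in  bit;
        base, offset : in  integer range 0 to 255;  -- change rarely
        sample       : in  integer range 0 to 511;
        over         : out boolean);
end entity threshold;

architecture sim of threshold is
  signal limit : integer range 0 to 510;
begin
  -- Runs only when base or offset changes.
  derive : process (base, offset)
  begin
    limit <= base + offset;
  end process derive;

  -- Runs on every clock event but no longer repeats the addition.
  compare : process (clk)
  begin
    if clk'event and clk = '1' then
      over <= sample > limit;
    end if;
  end process compare;
end architecture sim;
```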
     
     
 2.2 Avoid reassigning a signal to its current value; the assignment is
     still scheduled and processed even though nothing changes.
     
(Additional comments - Todd Carpenter)
This can also be a nightmare to do if you're modeling for synthesis. 
In standard VHDL, the signal retains state naturally.  To accomplish 
this in a clock triggered block for synthesis, you'd end up with an
inferred latch, which is NOT a good thing for testability.  As a result, 
you generally make reassignments.  (Of course, if you're not designing 
with test in mind, you can eliminate some of the overhead)
     
     
 2.3  Static signals that seldom change can be left in the code. 
     
(Additional comments from Randy Harr)
Only if a few.  If you declare 64 bit busses, that consumes 64 entries
in the signal assignment / scheduling tables.  Even if it is static,
that chips away at the resources.  The general rule should be: reduce
signal count.  ALSO REMEMBER THAT COMPOSITE TYPES (ARRAYS, RECORDS)
GENERATE ONE SIGNAL FOR EACH ELEMENT IN THE TYPE.
     
     
 2.4  Resolved signals should be avoided - since there is lots of
     overhead in resolving signals, use Std_ULogic instead. 
     But note that Std_Logic is accelerated on most simulators, so 
     this point is not always true. 
     
(Additional comments from Todd Carpenter)
     
Vantage allows you to specify "reflexive" for a brf so that if there is 
one driver, the brf is not invoked.  Accomplishes the same performance 
increase with significantly increased design flexibility.
     
Additional suggestion: "Avoid complex data types for signals."  At our 
performance level modeling, simple scalar signals of type bit, 
std_logic, integer, etc., are rarely used.  Rather, large vectors and/or 
complex records are the common signal types.  However, VHDL (in a rather 
braindead fashion) resolves all signals down to the scalar level.  So 
that means a driver (and waveform) for each scalar element of a complex 
signal.  Consider the case where we once had a signal that was about 7kB 
wide.  Ouch.  Yes, we changed that in a hurry.
     
If the next VHDL reballot effort tried a wee bit harder to add variant 
records and atomic signals, we'd be much farther along.  This would 
*drastically* increase simulation performance for those of us working 
above the bitwise signal level.
     
     
     
 Use of Types
=======================================
     
 3.0 Numerical data types such as Integers are better than Std_Logic,
   Std_Logic_Vector and Bit_Vector, and should be used for arithmetic 
   calculations.
     
   When bit-field information is used, then integers can be costly too. 
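
   A sketch of 3.0: a counter that does its arithmetic on an integer
   rather than on a vector type.  The entity is hypothetical.

```vhdl
-- Hypothetical modulo-256 counter using integer arithmetic.
entity int_counter is
  port (clk, reset : in  bit;
        count      : out integer range 0 to 255);
end entity int_counter;

architecture sim of int_counter is
begin
  tick : process (clk)
    variable n : integer range 0 to 255 := 0;
  begin
    if clk'event and clk = '1' then
      if reset = '1' or n = 255 then
        n := 0;
      else
        n := n + 1;  -- native integer add, no per-bit vector work
      end if;
      count <= n;
    end if;
  end process tick;
end architecture sim;
```

   If individual bits of the count were needed, the value would have to
   be converted to a vector first, which is where integers stop paying
   off.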
     
(Additional comments from Randy Harr)
There is no bit field manipulation of the integer type -- it has to be 
converted to an "array" (vector in layman's terms) to be operated on.
     
(Additional comments from Todd Carpenter)
Tsk.  You're not supposed to do that anyways.  Where does it say in the 
LRM precisely *how* integers are encoded?  Tsk.  (note - you're not 
supposed to do this in Ada, either)
     
     
 3.1 Type conversions should be done only when the _value_ is necessary. 
    Do not type convert whenever there is an event.                <------ 
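
 A sketch of 3.1, assuming the ieee.numeric_std package is available.
 The conversion runs only on a clock edge with sel active, not on every
 event on addr.  Entity and port names are made up.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical address decoder.
entity addr_decode is
  port (clk   : in  std_logic;
        sel   : in  std_logic;
        addr  : in  std_logic_vector(3 downto 0);
        index : out integer range 0 to 15);
end entity addr_decode;

architecture sim of addr_decode is
begin
  -- addr is deliberately NOT in the sensitivity list.
  decode : process (clk)
  begin
    if clk'event and clk = '1' then
      if sel = '1' then
        -- Convert only when the value is actually needed.
        index <= to_integer(unsigned(addr));
      end if;
    end if;
  end process decode;
end architecture sim;
```

 The costly alternative would be a concurrent assignment such as
 index <= to_integer(unsigned(addr)), which converts on every event
 on addr whether or not the result is used.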
     
 3.2 Enumerated types have better simulation speeds than constrained
   types (such as Integer subtypes), since range checking is 
   performed statically and NOT during the simulation.           <------
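
 A sketch of 3.2: a small state machine whose state is an enumerated
 type, so every assignment is known to be legal at compile time.  The
 state names and ports are hypothetical.

```vhdl
-- Hypothetical three-state controller.
entity fsm is
  port (clk, go : in  bit;
        busy    : out bit);
end entity fsm;

architecture sim of fsm is
  -- With "integer range 0 to 2" instead, a computed next-state value
  -- would need a range check at run time.
  type state_t is (idle, run, done);
  signal state : state_t := idle;
begin
  step : process (clk)
  begin
    if clk'event and clk = '1' then
      case state is
        when idle =>
          if go = '1' then
            state <= run;
          end if;
        when run  => state <= done;
        when done => state <= idle;
      end case;
    end if;
  end process step;

  busy <= '1' when state = run else '0';
end architecture sim;
```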
     
 3.3 Perform standard code optimization - the compiler may not do it
   for you.  See Madisetti, "VLSI Signal Processors" (IEEE Press, 1995) 
   Chapter 6, Section 4 on simple tips (a shameless plug for a book). 
     
     
     
     
 Use of Conditional Statements
==========================================
     
     
     
 4.0 The outer conditional statements should reduce the need to
   evaluate the enclosed conditional statements.  Branches with the 
   highest probability of occurring should be tested first. 
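
 The ordering advice in 4.0 can be sketched as follows.  The opcode
 encoding and the assumption that op = 0 is the most frequent case are
 invented for the example; profile your own model to find the real
 frequencies.

```vhdl
-- Hypothetical execution step; op: 0 = pass, 1 = add, 2 = subtract.
entity alu_step is
  port (clk, enable : in  bit;
        op          : in  integer range 0 to 2;
        a, b        : in  integer range 0 to 255;
        y           : out integer range -255 to 510);
end entity alu_step;

architecture sim of alu_step is
begin
  exec : process (clk)
  begin
    if clk'event and clk = '1' then
      if enable = '1' then   -- outer guard skips all inner tests
        if op = 0 then       -- assumed most frequent, so tested first
          y <= a;
        elsif op = 1 then
          y <= a + b;
        else
          y <= a - b;
        end if;
      end if;
    end if;
  end process exec;
end architecture sim;
```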
     
     
In general, it is good to profile the VHDL code to see execution 
patterns.  Coverage from Synopsys and Leapfrog provide these utilities. 
     
(Additional comments from Randy Harr)
Profiling only helps let you see what gets executed often.  It does not give 
you an indication of how simple or difficult the operation is to perform. 
Unfortunately in VHDL, there are a number of subtle behaviors that are very 
complex and time consuming during simulation.
     
(Additional comments from Todd Carpenter)
This is also "standard programming" which is good for speed.  Depending 
on state encodings, it might not be the easiest to read or maintain.  As 
always, this sort of thing is a tradeoff.
     
     
     
 Some More Synthesis Hints 
 ==================================
     
 5.0  Specifying ranges for signals or variables explicitly 
      reduces hardware cost and increases performance. 
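
 A sketch of 5.0: a decimal digit counter whose range is stated
 explicitly.  The exact bit widths a tool infers are tool dependent;
 the figures in the comments are the usual outcome, not a guarantee.

```vhdl
-- Hypothetical decimal digit counter.
entity digit_counter is
  port (clk   : in  bit;
        digit : out integer range 0 to 9);
end entity digit_counter;

architecture rtl of digit_counter is
  -- The range lets a synthesis tool use 4 bits.  An unconstrained
  -- "integer" is typically implemented as 32 bits.
  signal n : integer range 0 to 9 := 0;
begin
  count : process (clk)
  begin
    if clk'event and clk = '1' then
      if n = 9 then
        n <= 0;
      else
        n <= n + 1;
      end if;
    end if;
  end process count;

  digit <= n;
end architecture rtl;
```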
     
 5.1  Use effective algorithms, e.g., an LFSR is better than
      a simple state encoder.
     
 5.2  Share complex operators using module functions (if
      available for the synthesis tool).
     
 5.3  Specify "don't care" conditions.   
          when others => OU <= "XXXX"; 
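
 For context, the line above could sit in a case statement like the one
 sketched here.  The decoder and its encoding are hypothetical; OU is
 the output name used in 5.3.  Whether "XXXX" is treated as a don't
 care depends on the synthesis tool, and some tools expect the
 std_logic don't-care value '-' instead.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical decoder with one unused select code.
entity decoder is
  port (sel : in  std_logic_vector(1 downto 0);
        OU  : out std_logic_vector(3 downto 0));
end entity decoder;

architecture rtl of decoder is
begin
  decode : process (sel)
  begin
    case sel is
      when "00"   => OU <= "0001";
      when "01"   => OU <= "0010";
      when "10"   => OU <= "0100";
      -- "11" never occurs in this example, so the tool is free to
      -- pick whatever output minimizes the logic.
      when others => OU <= "XXXX";
    end case;
  end process decode;
end architecture rtl;
```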
     
 5.4  Write code at as low a level as possible; the result is
      usually smaller in area and higher in performance. 
     
     
===========
     
