VHDL SIMULATIONS - TIPS FOR SPEEDING

Vijay Madisetti

Send additions to V. Madisetti, vkm@eedsp.gatech.edu

First version July 21, 1995
Second version Aug 30, 1995
This version Mar 5, 1996

=====================================================================
Use of VHDL processes
=====================================================================

1.0 Each process invocation is costly.

(Additional comments from Randy Harr (RH)) The cost of a process depends on its signals: the signals that communicate information into and out of processes are what is really costly. Each process represents a primitive, schedulable behavior element (like gates in older logic simulations).

1.1 Each concurrent assignment is treated AS A PROCESS. <----

1.2 Block and Generate statements have the SAME overhead as a process. <----

1.3 Sensitivity list or Wait statements? <----
Prefer sensitivity lists, since they are statically allocated. Wait statements are dynamically allocated and cost a lot in simulation time.

(Additional comments - Randy Harr, ARPA) True, but remember that a sensitivity list is a limited form of a wait. Also, it is evaluated AT THE END OF THE PROCESS (i.e., the process is executed at simulation time 0 and only blocks again when it reaches a wait or the end of the process, which is where the sensitivity list applies).

1.4 Functions sensitive to the same signal should be in the SAME process (obvious reasons!).
    - Use separate processes for each clock in multiple clock systems (to prevent evaluation of statements due to unrelated clock signals).

1.5 Putting blocks and functions in processes and "protecting" them from execution via a conditional statement will NOT help, since the process is invoked whenever the sensitivity list is stimulated. Generate statements can be used around processes, and will exclude the process from simulation when disabled.

(Additional comments - Todd Carpenter, Honeywell) To be a wee bit more precise, it excludes them from being stimulated.
We have seen, with at least Vantage, that the memory is still consumed. This matters for us when some of our generate-wrapped instantiations are several MB once elaborated. So, we had to work some magic with generate bounds.

Use of VHDL Signals
=========================================

2.0 Variables should be used instead of signals. The overhead of each signal is as follows:
    > each signal requires one or more drivers
    > specific handling and event scheduling
    > memory storage
    > more instructions to execute

(Additional comments from Randy Harr) Not to mention the creation of additional signals ('attributes, etc.), signal history, signal kinds (bus vs register), and resolved signals, which are evaluated on every assignment regardless of whether the value changes.

2.1 If a signal is calculated from other signals that change only a few times, and the calculation is performed every time the process is invoked, then the execution time will increase.

(Additional comments - Todd Carpenter) New synthesis users must be careful with this. Variables in a clock triggered block infer registers.

2.2 Reassigning signals to their current value is to be avoided.

(Additional comments - Todd Carpenter) This can also be a nightmare to do if you're modeling for synthesis. In standard VHDL, the signal retains state naturally. To accomplish this in a clock triggered block for synthesis, you'd end up with an inferred latch, which is NOT a good thing for testability. As a result, you generally make reassignments. (Of course, if you're not designing with test in mind, you can eliminate some of the overhead.)

2.3 Static signals that seldom change can be left in the code.

(Additional comments from Randy Harr) Only if a few. If you declare 64 bit busses, that consumes 64 entries in the signal assignment / scheduling tables. Even if it is static, that chips away at the resources. The general rule should be -- reduce signal count.
ALSO REMEMBER THAT COMPOSITE TYPES (ARRAYS, RECORDS) GENERATE ONE SIGNAL FOR EACH ELEMENT IN THE TYPE.

2.4 Resolved signals should be avoided. There is a lot of overhead in resolving signals, so use Std_ULogic instead. But note that Std_Logic is accelerated on most simulators, so this point is not always true.

(Additional comments from Todd Carpenter) Vantage allows you to specify "reflexive" for a brf so that if there is one driver, the brf is not invoked. This accomplishes the same performance increase with significantly increased design flexibility.

Additional suggestion: "Avoid complex data types for signals."

At our performance level modeling, simple scalar signals of type bit, std_logic, integer, etc., are rarely used. Rather, large vectors and/or complex records are the common signal types. However, VHDL (in a rather braindead fashion) resolves all signals down to the scalar level. So that means a driver (and waveform) for each scalar element of a complex signal. Consider the case where we once had a signal that was about 7kB wide. Ouch. Yes, we changed that in a hurry. If the next VHDL reballot effort tried a wee bit harder to add variant records and atomic signals, we'd be much farther along. This would *drastically* increase simulation performance for those of us working above the bitwise signal level.

Use of Types
=======================================

3.0 Numerical data types such as Integer are better than Std_Logic, Std_Logic_Vector, and Bit_Vector, and should be used for arithmetic calculations. When bit-field information is used, integers can be costly too.

(Additional comments from Randy Harr) There is no bit field manipulation of the integer type -- it has to be converted to an "array" (vector in layman's terms) to be operated on.

(Additional comments from Todd Carpenter) Tsk. You're not supposed to do that anyways. Where does it say in the LRM precisely *how* integers are encoded? Tsk.
(Note - you're not supposed to do this in Ada, either.)

3.1 Type conversions should be done only when the _value_ is necessary. Do not type convert whenever there is an event. <------

3.2 Enumerated types have better simulation speeds than constrained types (such as Integer subtypes), since range checking is performed statically and NOT during the simulation. <------

3.3 Perform standard code optimization - the compiler may not do it for you. See Madisetti, "VLSI Signal Processors" (IEEE Press, 1995), Chapter 6, Section 4, on simple tips (a shameless plug for a book).

Use of Conditional Statements
==========================================

4.0 The outer conditional statements should reduce the necessity to evaluate the enclosed conditional statements. Branches with the highest probability of occurring should be tested first. In general, it is good to profile the VHDL code to see execution patterns. Coverage from Synopsys, and Leapfrog provide these utilities.

(Additional comments from Randy Harr) Profiling only lets you see what gets executed often. It does not give you an indication of how simple or difficult the operation is to perform. Unfortunately in VHDL, there are a number of subtle behaviors that are very complex and time consuming during simulation.

(Additional comments from Todd Carpenter) This is also "standard programming" which is good for speed. Depending on state encodings, it might not be the easiest to read or maintain. As always, this sort of thing is a tradeoff.

Some More Synthesis Hints
==================================

5.0 Specifying ranges for signals or variables explicitly reduces hardware cost and increases performance.

5.1 Use effective algorithms, e.g., an LFSR is better than a simple state encoder.

5.2 Share complex operators using module functions (if available for the synthesis tool).

5.3 Specify "don't care" conditions.
    when others => OU <= "XXXX";

5.4 Write code at as low a level as possible; it is usually more efficient, smaller in area, and higher in performance.

===========
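
For reference, here is a minimal sketch of tip 5.3 in a complete design unit. Only the "when others" line and the name OU come from the notes above; the entity name, the SEL port, the 2-bit encoding, and the assumption that the code "11" never occurs are hypothetical and chosen purely for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical decoder used only to show the "don't care" branch of tip 5.3.
entity dont_care_demo is
  port (
    SEL : in  std_logic_vector(1 downto 0);  -- assumed selector input
    OU  : out std_logic_vector(3 downto 0)   -- output name taken from tip 5.3
  );
end entity dont_care_demo;

architecture rtl of dont_care_demo is
begin
  decode : process (SEL)
  begin
    case SEL is
      when "00"   => OU <= "0001";
      when "01"   => OU <= "0010";
      when "10"   => OU <= "0100";
      -- Assumption: "11" is never used by the design, so the output value
      -- for it (and for any metavalue on SEL) is left unspecified.
      when others => OU <= "XXXX";
    end case;
  end process decode;
end architecture rtl;
```

Whether the 'X' assignment is actually exploited as a don't care depends on the synthesis tool, so check the tool's documentation. In simulation the X propagates to OU, which makes an unexpected selector value easy to spot.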