Benchmark Test Report on Veritak 3.00A and MXE6.0

                                                                                      Date Jun.30.2006
                                                                                      Tak.Sugawara
1. Purpose

 To report a performance comparison between the new version of Veritak (3.00A) and MXE6.0.

2. Test Condition

Item | Description / Remarks
Machine | Athlon64 3000+ (Single) / 3800+ (Dual), 2GB memory, Asus A8V
OS | Windows 2000
Test Bench | Mainly from Opencores / Icarus Test Suite / others
Simulator | Veritak 1.71 / 1.82A / 3.00A ( Released .), MXE6.0d (not Starter; full Xilinx Edition of ModelSim)
Measured Time | From simulation start to simulation end. Compile time is not included.
              | Veritak: Optimized Debug: Normal/ Level/2/NBA/Fast Switch
              | All benches were run without "waveform save" except for item No.15.


3. Test Result

Machines: Athlon64 3000+ (Single), Athlon 3800+ (Dual)

No | Name | Description | Source | # of lines | Veritak 1.71 | Veritak 1.82A | MXE6.0d (Single) | Veritak 2.14B | Veritak 3.00A | 3.00A vs MXE | MXE6.0d (Dual) | Veritak Project Files

CPU Cores
1 | WB_Z80 | Z80 | opencores | 5K | 1min17sec | 1min14sec | 1min47sec | 1min8sec | 39sec | 2.5X | 1min38sec | Download
2 | TV80 | Z80 | opencores | 22K | 1min23sec | 1min8sec | 51sec | 1min7sec | 16sec | 3.0X | 49sec | Download
3 **1 | FZ80 | Z80 | PC8001 FPGA (in Japanese only) | 2K | 51sec | 49sec | 1min03sec | 46sec | 22sec | 2.6X | 57sec
4 | YACC | MIPS-I subset | opencores | 5.7K | 7min29sec | 8min17sec | .
5 | M68K | M68K | Verilator's site (opencores) | 12K | 11min15sec | 6min50sec | 17min18sec | 6min23sec | 1min55sec | 8.1X | 15min34sec | Download
6 | H8300H | renesas subset | sugawara-systems | 10K | 23sec | 20sec | 27sec | 20sec | 11sec | 2.2X | 25sec
7 | Openrisc | very large design | opencores | 144K | 25sec | 14sec | 29sec | 13sec | 8sec | 3.0X | 25sec | Download

Peripheral Cores
8 | Ethernet | large design | opencores | 45K | 1h5min | 39min34sec | 1h13min | 35min19sec | 16min28sec | 4.4X | 1h12min52sec | Download
9 | USB11 | | opencores | 11K | 2min41sec | 1min53sec | 2min44sec | 1min41sec | 43sec | 3.6X | 2min31sec | Download
10 | PCI | very large design | opencores | 89K | 2h1min57sec | 51min20sec | 1h42min | 42min18sec | 24min48sec | 3.9X | 1h40min10sec | Download
11 | ATA | | opencores | 4K | 13min30sec | 8min5sec | 20min37sec | 7min17sec | 3min7sec | 6.5X | 20min24sec | Download
12 | CONMUX | | opencores | 11K | 9min47sec | 3min36sec | 4min59sec | 3min10sec | 1min56sec | 2.8X | 4min36sec | Download
13 | AC97 | | opencores | 11K | 47min37sec | 28min18sec | 47min58sec | 25min44sec | 9min17sec | 4.6X | 43min15sec | Download
14 | XilinxCorelib | RAM 256KB R/W w/DCM | sugawara-systems | - | 5min56sec | 1min30sec | 3min44sec | 1min19sec | 14sec | 14.8X | 3min28sec

Others
20 | ASIC | ASIC (50kgates) | sugawara-systems | 20K | 13sec | 12sec | 18sec | 10sec | 5sec | 3.4X | 17sec
15 *2 | Simple Counter | 30 million patterns, w/ saved waveform | Veritak user's contribution | 1K | 2min57sec | 2min54sec | 5min16sec
16 | AES | 128bit galois operation | sugawara-systems (for CQ publisher's contest) | 10K | 1min6sec | 32sec | 50sec | 30sec | 14sec
17 | Many Instances1000 | small module but many instances | Icarus Test Suite (modified) | 0.3K | 6min3sec | 13min59sec | .
18 **3 | Many Instances10000 | small module but many instances | Icarus Test Suite (modified) | 0.3K | 16sec | 2min1sec (xilinx instance restriction) | xilinx instance restriction
19 | Large Multiplier | 100bit multiplier | Icarus Test Suite (modified) | 0.2K | 17sec | 9sec | 31sec | 9sec | 2sec | 12X | 25sec | Download
21 | PCI IP | Net List | IP | Athlon 1.2GHz | 4min10sec | 34sec | 4min9sec | 12sec | 10sec | 8.9X | 1min29sec
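
The "3.00A vs MXE" figures appear consistent with dividing the last MXE6.0d time in each row by the Veritak 3.00A time. The following is a minimal sketch of that arithmetic, not the author's own tooling; the helper names to_seconds and speedup are hypothetical, and the choice of columns is an assumption inferred from the numbers.

```python
import re

def to_seconds(text: str) -> int:
    """Convert a duration such as '1h12min52sec.' or '1min.47sec' to seconds."""
    # Every unit is optional; stray dots between or after fields are tolerated.
    pattern = r"(?:(\d+)h)?\.?(?:(\d+)min)?\.?(?:(\d+)(?:sec)?)?\.?"
    match = re.fullmatch(pattern, text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

def speedup(mxe_time: str, veritak_time: str) -> float:
    """Ratio of MXE run time to Veritak run time (assumed meaning of 'vs MXE')."""
    return to_seconds(mxe_time) / to_seconds(veritak_time)

# No.5 (M68K): MXE6.0d 15min34sec, Veritak 3.00A 1min55sec
print(round(speedup("15min34sec.", "1min55sec"), 1))  # 8.1
```

The same call on No.1 (WB_Z80), with 1min38sec against 39sec, gives 2.5, which matches the table.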
 

4. Consideration
4.0 Performance difference between Single CPU and Dual CPU
Since the simulator runs as a single thread, no performance gain is expected from a dual CPU.
This is true not only of Veritak but also of MXE. You will notice a performance gain of about 10% between the single CPU and the dual CPU on the same motherboard with the same configuration. However, this is not an effect of the second core. Note that the dual CPU (3800+) runs at a 2.0GHz clock, while the single CPU (3000+) runs at 1.8GHz.
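
As a rough check of this argument, the clock ratio alone predicts a gain of about 11%, close to what the MXE6.0d times for No.1 (WB_Z80) show. The sketch below assumes that the first MXE6.0d time in that row (1min47sec) was taken on the single CPU and the last one (1min38sec) on the dual CPU; the function name relative_gain is hypothetical.

```python
def relative_gain(slow: float, fast: float) -> float:
    """Fractional improvement of `fast` over `slow`, e.g. 0.10 means 10%."""
    return slow / fast - 1.0

# Clock speeds stated in the report.
single_clock_ghz = 1.8  # Athlon64 3000+ (single)
dual_clock_ghz = 2.0    # Athlon 3800+ (dual)

# A higher clock means a shorter run time, so the expected gain is the clock ratio minus 1.
expected_gain = dual_clock_ghz / single_clock_ghz - 1.0

# MXE6.0d run times for No.1 in seconds (column assignment is an assumption).
mxe_single_s = 1 * 60 + 47
mxe_dual_s = 1 * 60 + 38
measured_gain = relative_gain(mxe_single_s, mxe_dual_s)

print(f"expected from clock: {expected_gain:.1%}")
print(f"measured on No.1:    {measured_gain:.1%}")
```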

4.1 Comparison of All Save Performance
 Item No. 3 was re-tested on another machine (Athlon 1.2GHz).
 The figure below shows relative speed, normalized to MXE = 1, without and with "all save of waveform".




 Since Veritak's design concept is "save all by default" using run-time compression, the internal compressed file is small (37MB) and the overhead is low, while the VCD data is over 300MB. Extraction of any signal in this project is almost instantaneous. Veritak is therefore well suited to the debug stage of each designer's RTL design, which is the most time-consuming stage because of the many repeated runs.
 
4.2 Comparison of Long Vector Performance
 Veritak is faster in this test. This is reasonable since all waveform data is held in memory, not on disk. Even with a 30-million-pattern vector, view response is still fast. However, this is also a limitation of Veritak: the size of the waveform view is restricted by the size of the PC's virtual memory. (Around 1GB seems to be used in this test.) ModelSim's waveforms are saved to disk, so ModelSim has an advantage in long-vector tests. However, this should be resolved in the 64-bit version of Veritak.
 
4.3 Many Instances
 The Xilinx Edition of ModelSim restricts the number of instances. That is why MXE is so slow on No.18.

5. Conclusion