Hardware Analysis
      
  Pentium 4 Scaling with DDR Memory 
  Feb 04, 2002, 08:30am EST 
 

Results - Linpack & Cachemem


By: Dan Mepham

Linpack and Cachemem are both synthetic benchmarks. As such, their results often do not translate directly into practical performance, but from a theoretical standpoint they are excellent. Both can be used to illustrate graphically our earlier explanation of scaling and of reliance on the memory bus.

graph1

Linpack measures simple arithmetic performance using data sets that, in this case, range from 1 byte to over 512 kilobytes. As can be seen, while the data set is smaller than about 256KB (the size of the Pentium 4's cache), the performance of all three platforms remains equal. In other words, the data being manipulated (or collated, if you will) fits entirely within the processor cache, so performance depends only on the speed of the processor. Once the data exceeds 256KB, it can no longer be housed entirely in the cache and must be transferred over the memory bus. At this point our platforms, each using a different speed of memory, begin to separate. Those with less bandwidth (845 SDR) cannot feed the processor fast enough, and their scores drop significantly as the processor is left waiting for data. Naturally, the DDR333 platform exhibits the highest performance, as it is able to feed the processor more effectively.
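
To put rough numbers on why the platforms separate once the data spills out of the cache, the sketch below computes the peak theoretical bandwidth of each memory type and the best-case time to move a 512KB data set across it. It assumes a 64-bit (8-byte) memory bus and that the SDR platform runs PC133 memory, which the text does not state; the function names are illustrative only. Real-world throughput will be lower than these peak figures.

```python
# Peak theoretical bandwidth of a 64-bit memory bus: 8 bytes per transfer.
BUS_WIDTH_BYTES = 8

# Transfer rates in millions of transfers per second (MT/s).
# PC133 for the SDR platform is an assumption; the article only says "845 SDR".
TRANSFER_RATES_MT_S = {
    "SDR (PC133, assumed)": 133,
    "DDR266": 266,
    "DDR333": 333,
}


def peak_bandwidth_mb_s(transfers_mt_s: int) -> int:
    """Peak bandwidth in MB/s (decimal): MT/s times bytes per transfer."""
    return transfers_mt_s * BUS_WIDTH_BYTES


def stream_time_us(size_kb: int, bandwidth_mb_s: int) -> float:
    """Best-case time in microseconds to move size_kb kilobytes.

    1 MB/s equals 1 byte per microsecond, so bytes / (MB/s) gives microseconds.
    """
    return size_kb * 1024 / bandwidth_mb_s


if __name__ == "__main__":
    for name, rate in TRANSFER_RATES_MT_S.items():
        bandwidth = peak_bandwidth_mb_s(rate)
        elapsed = stream_time_us(512, bandwidth)
        print(f"{name}: {bandwidth} MB/s peak, {elapsed:.0f} us for 512KB")
```

On these peak figures the DDR333 bus can move the same data set in well under half the time the SDR bus needs, which is the direction of the gap the graph shows beyond the cache size.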

graph2

Cachemem's latency test measures the latency of the memory bus as seen from the CPU. The latency benchmark is easily the clearest illustration of the difficulties related to performance scaling. In the graph above, as CPU speed increases, the latency (the time required for the memory bus to feed the CPU) increases quite noticeably. In the case of the SDR platform, the latency is quite high, particularly for higher speed processors. We might expect that, in this case, heavily memory-dependent applications would scale very poorly with clock speed on the SDR platform.
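
One way to see why latency grows with clock speed: if memory takes a roughly fixed time to respond, a faster CPU simply spends more of its own clock cycles waiting. The sketch below assumes the latency is counted in CPU cycles (the unit is not stated in the text) and uses a hypothetical 100 ns memory latency and hypothetical clock speeds; none of these figures come from the benchmark.

```python
def latency_in_cycles(latency_ns: float, cpu_ghz: float) -> float:
    """CPU cycles spent waiting: wait time (ns) times cycles per ns (GHz)."""
    return latency_ns * cpu_ghz


MEMORY_LATENCY_NS = 100.0  # hypothetical fixed memory response time

if __name__ == "__main__":
    for cpu_ghz in (1.6, 2.0, 2.4):  # hypothetical clock speeds
        cycles = latency_in_cycles(MEMORY_LATENCY_NS, cpu_ghz)
        print(f"{cpu_ghz:.1f} GHz: {cycles:.0f} cycles per memory access")
```

Under that assumption, the same memory costs proportionally more cycles at each step up in clock speed, so a memory-bound program gains less from the faster processor.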

One interesting item to note here is that the performance of the 500FSB/266MEM platform is closer to that of the 500/333 platform than to that of the 400/266 platform. This indicates that the latency seen by Cachemem is affected more by Front Side Bus bandwidth than by memory bandwidth.



1. Introduction
2. Processors 101
3. Memory Considerations
4. Test Procedure
5. Results - Linpack & Cachemem
6. Results - STREAM & 3DMark2001
7. Results - Quake 3 Arena
8. Results - MP3 Encoding
9. Results - ScienceMark
10. Results - SPECviewperf
11. Conclusion
12. Appendix A - SPECviewperf
