Hardware Analysis
  The next Pentium 4 processor, Prescott arrives 
  Feb 02, 2004, 07:30am EST 

Branching Off

By: Dan Mepham

Intel has also made some subtle but important enhancements to the Pentium 4’s branch prediction systems. A mispredicted branch stalls the pipeline, because the entire pipeline must be flushed to clear the instructions fetched down the wrong path. With the Pentium 4’s extremely deep pipeline (more on this later), these stalls have a dramatic impact on performance.
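To give a rough feel for why flush cost matters more as a pipeline gets deeper, here is a minimal back-of-the-envelope sketch. It assumes the simple textbook model in which every mispredicted branch costs a fixed number of flush cycles. The function name, the workload mix and the penalty values are all hypothetical illustrations, not figures from Intel or measurements of any Pentium 4 core.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once misprediction flushes are included.

    Simplified model: each mispredicted branch adds a fixed flush penalty.
    All inputs are illustrative assumptions, not measured values.
    """
    stall_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return base_cpi + stall_cpi


# Hypothetical workload: 20% of instructions are branches, 5% of those mispredict.
for penalty in (10, 20, 30):
    cpi = effective_cpi(base_cpi=1.0, branch_fraction=0.20,
                        mispredict_rate=0.05, flush_penalty_cycles=penalty)
    print(f"flush penalty {penalty:2d} cycles: effective CPI {cpi:.2f}")
```

In this model the stall term scales linearly with the flush penalty, so the same misprediction rate hurts a deeper pipeline proportionally more.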

Despite the exemplary accuracy of the Pentium 4’s branch prediction unit (BPU), there nevertheless exist situations in which the BPU simply cannot make a prediction. In this case, the Branch Target Buffer (BTB) contains no prediction information about the current branch, and so the processor defaults to a rather simple, static prediction algorithm. Intel has enhanced this static algorithm to be more accurate. In brief, the new algorithm examines the distance and other properties of the branch to ascertain whether it is likely to be a loop-ending instruction, and thus whether or not it should be predicted as taken. Subtle enhancements have also been made to the dynamic branch prediction algorithms.
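The idea of a distance-aware static predictor can be sketched as follows. This is only an illustration under stated assumptions: the first function is the classic "backward taken, forward not taken" rule, and the second adds a distance check so that only short backward jumps, which look like loops, are predicted taken. The 256-byte threshold and both function names are hypothetical, and Intel's actual algorithm also weighs other properties of the branch that are not modelled here.

```python
# Hypothetical cut-off in bytes; the real threshold is not given in this article.
LOOP_DISTANCE_THRESHOLD = 256


def classic_static_predict(branch_addr, target_addr):
    """Classic static rule: backward branches are taken, forward ones are not."""
    return target_addr < branch_addr


def distance_aware_static_predict(branch_addr, target_addr,
                                  threshold=LOOP_DISTANCE_THRESHOLD):
    """Predict taken only for backward branches that jump a short distance.

    A short backward jump is likely the bottom of a loop; a very long
    backward jump is less likely to be one, so it is predicted not taken.
    """
    is_backward = target_addr < branch_addr
    distance = abs(branch_addr - target_addr)
    return is_backward and distance <= threshold


# A tight loop (64 bytes back) and a far backward jump (32 KB back).
for branch, target in ((0x1040, 0x1000), (0x9000, 0x1000)):
    print(hex(branch), "->", hex(target),
          "classic:", classic_static_predict(branch, target),
          "distance-aware:", distance_aware_static_predict(branch, target))
```

The two rules agree on the tight loop and differ only on the far backward jump, which is the kind of case a distance check is meant to catch.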

Branch prediction success rate is often difficult to quantify, and changes to branch prediction schemes can produce a range of outcomes, from much better performance, to marginally better performance, or even to decreased performance in some situations. We have been given access to some in-house testing conducted by Intel, and while we cannot post actual numbers at this time, we can summarize the results as follows: testing using the SPECint_base2000 software showed that Prescott’s mispredicted branch rate ranged from 54% lower to 10% higher than Northwood’s at the extremes, and that the overall average branch misprediction rate was about 12% lower on the new Prescott core than on Northwood, an impressive improvement.
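Because these figures are relative changes rather than absolute rates, it can help to see what they mean against a common baseline. The short sketch below applies the three percentages quoted above to a purely hypothetical Northwood baseline of 1,000 mispredicted branches; the baseline and the function name are our own assumptions for illustration, not Intel data.

```python
def scaled_mispredictions(baseline_count, relative_change):
    """Apply a relative change (e.g. -0.12 for 12% lower) to a baseline count."""
    return baseline_count * (1.0 + relative_change)


BASELINE = 1000  # hypothetical Northwood mispredictions, for illustration only

# Relative changes quoted in the text: best case, overall average, worst case.
for label, change in (("best case", -0.54), ("average", -0.12), ("worst case", 0.10)):
    count = scaled_mispredictions(BASELINE, change)
    print(f"{label:10s}: {count:.0f} mispredictions per {BASELINE} on the baseline")
```

The spread between the best and worst cases shows how strongly the result depends on the workload, even though the average moves in Prescott's favour.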

Again, these results are difficult to quantify in terms of real-world performance, but the effects should not be underestimated given the degree to which mispredicted branches impact the performance of Prescott’s deep pipeline.

1. Introduction
2. Caching In
3. Branching Off
4. Round 3, SSE Gets a Refresh
5. Intel's 2004 Roadmap, Sock-et to Me!
6. Incremental Improvements
7. Something Rotten in Santa Clara
8. Performance - Cache Latency
9. Performance - Cache Bandwidth
10. Performance - Cache Throughput
11. Performance - ScienceMark 2.0
12. Performance - Sandra & PCMark
13. Performance - PCMark & AquaMark
14. Performance - SPECviewperf
15. Summary
16. Appendix A - Benchmark Configuration
