To meet future requirements for higher bandwidth while providing ever more complex functions, future network processors will require a number of methods of improving processing performance. One such method involves deeper processor pipelines to obtain higher operating frequencies. The penalties associated with deeper pipelines have been mitigated by implementing prediction schemes, which use previous execution history to determine future decisions. In this paper we present an analysis of common branch prediction schemes when applied to network applications. Using widespread network applications, we find that, unlike in general-purpose processing, hit rates in excess of 95% can be obtained in a network processor using a small 256-entry single-level predictor. While our research demonstrates the low silicon cost of implementing a branch predictor, the long run times of network applications can leave the majority of the predictor logic idle, increasing static power and reducing device utilization.
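As an illustration of the kind of structure a small single-level predictor refers to, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a table of 2-bit saturating counters indexed by the low-order bits of the branch address, 4-byte aligned instructions, and a weakly not-taken initial state. The names `SingleLevelPredictor` and `hit_rate` are hypothetical and chosen only for this example.

```python
class SingleLevelPredictor:
    """Table of 2-bit saturating counters indexed by branch address bits.

    Assumed organization for illustration only; the abstract states the
    table size (256 entries) but not the counter width or index function.
    """

    def __init__(self, entries=256):
        # A power-of-two size lets the index be computed with a bit mask.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # Counter states 0..3; 1 means weakly not-taken (assumed start state).
        self.table = [1] * entries

    def _index(self, pc):
        # Drop the two low bits, assuming 4-byte aligned instructions.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 predict taken; 0 and 1 predict not taken.
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def hit_rate(trace, predictor):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    hits = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / len(trace) if trace else 0.0
```

A trace here is simply a list of `(branch_address, taken)` pairs, for example one extracted from an instruction-level simulator; passing it to `hit_rate` with a fresh `SingleLevelPredictor(256)` gives the prediction accuracy for that table size under the assumptions above.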