Recency-based TLB preloading
2000, ACM SIGARCH Computer Architecture News
https://doi.org/10.1145/342001.339666
Abstract
Caching and other latency-tolerating techniques have been quite successful in maintaining high memory system performance for general-purpose processors. However, TLB misses have become a serious bottleneck as working sets grow beyond the capacity of TLBs. This work presents one of the first attempts to hide TLB miss latency by using preloading techniques. We present results for traditional next-page TLB miss preloading, an approach shown to eliminate some of the misses. The key contribution of this work, however, is a novel TLB miss prediction algorithm based on the concept of “recency”, and we show that it can predict over 55% of the TLB misses for the five commercial applications considered.
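To make the next-page baseline mentioned in the abstract concrete, the following is a minimal sketch of a toy TLB model, not the simulator or configuration used in the paper. The function name `count_tlb_misses`, the fully associative LRU organization, the 4-entry size, and the choice to place preloaded translations directly into the TLB (rather than into a separate buffer) are all assumptions made for illustration. The recency-based predictor is not sketched here because the abstract does not describe its mechanism.

```python
from collections import OrderedDict


def count_tlb_misses(page_trace, tlb_entries=4, next_page_preload=False):
    """Count demand misses in a toy fully associative LRU TLB.

    page_trace is a sequence of virtual page numbers. When
    next_page_preload is True, every demand miss on page p also
    installs the translation for page p + 1 (hypothetical policy).
    """
    tlb = OrderedDict()  # virtual page numbers, ordered from LRU to MRU
    misses = 0

    def install(page):
        # Insert a translation, evicting the LRU entry if the TLB is full.
        # An entry that is already present is left untouched.
        if page in tlb:
            return
        if len(tlb) >= tlb_entries:
            tlb.popitem(last=False)
        tlb[page] = True

    for page in page_trace:
        if page in tlb:
            tlb.move_to_end(page)  # hit: mark as most recently used
            continue
        misses += 1
        install(page)              # demand fill
        if next_page_preload:
            install(page + 1)      # speculative fill of the next page
    return misses


sequential = list(range(8))
print(count_tlb_misses(sequential))                          # 8
print(count_tlb_misses(sequential, next_page_preload=True))  # 4
```

On this purely sequential trace the preload halves the demand misses. On a trace with no spatial locality between pages, such as `[0, 10, 20, 30, 0]`, the same toy model reports 5 misses with preloading against 4 without, because speculative entries evict useful ones. That behavior is a property of this simplified model and is offered only as intuition for why a more accurate predictor is attractive.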