BenchPress

Foivos Tsimpourlas; Pavlos Petoumenos; Min Xu; Chris Cummins; Kim Hazelwood; Ajitha Rajan; Hugh Leather

doi:10.1145/3559009.3569644

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques

https://doi.org/10.1145/3559009.3569644

12 pages

Abstract

Finding the right heuristics to optimize code has always been a difficult and mostly manual task for compiler engineers. Today this task is near-impossible as hardware-software complexity has scaled up exponentially. Predictive models for compilers have recently emerged that require little human effort but are far better than humans at finding near-optimal heuristics. Like any machine learning technique, they are only as good as the data they are trained on, but there is a severe shortage of code for training compilers. Researchers have tried to remedy this with code generation, but their synthetic benchmarks, although numbering in the thousands, are small, repetitive and poor in features, and therefore ineffective. This indicates that the shortage is one of feature quality more than corpus size. It is more important than ever to develop a directed program generation approach that will produce benchmarks with valuable features for training compiler heuristics. We develop BenchPress, the first ML benchmark generator for compilers that is steerable within feature space representations of source code. BenchPress synthesizes compiling functions by adding new code in any part of an empty or existing sequence by jointly observing its left and right context, achieving an excellent compilation rate. BenchPress steers benchmark generation towards desired target features that have been impossible for state-of-the-art synthesizers (or indeed humans) to reach. It performs better in targeting the features of Rodinia benchmarks in 3 different feature spaces compared with (a) CLgen, a state-of-the-art ML
