Key research themes
1. How can program synthesis be leveraged to automate data extraction and transformation in data compilation pipelines?
This theme investigates program synthesis techniques, particularly programming-by-example (PBE) and predictive synthesis, for automating data extraction and transformation tasks within data compilation workflows. The central challenge is generating accurate, reusable programs from incomplete or input-only specifications, thereby reducing the manual effort of data wrangling and preprocessing, tasks that are often time-consuming and demand programming expertise.
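To make the PBE idea concrete, the following is a minimal sketch, not any specific system from the literature: given input-output string pairs, it enumerates a tiny hypothetical DSL of transformations and returns the first program consistent with every example, which can then be reused on unseen inputs.

```python
# Minimal programming-by-example (PBE) sketch. The DSL of candidate
# transformations below is an illustrative assumption, not a real system.

def candidate_programs():
    """Enumerate a small, fixed space of string-transformation programs."""
    yield ("upper", str.upper)
    yield ("lower", str.lower)
    yield ("strip", str.strip)
    yield ("first_token", lambda s: s.split()[0])
    yield ("last_token", lambda s: s.split()[-1])

def synthesize(examples):
    """Return the first (name, program) pair matching all examples, else None."""
    for name, prog in candidate_programs():
        try:
            if all(prog(inp) == out for inp, out in examples):
                return name, prog
        except IndexError:
            continue  # program undefined on this input; skip it
    return None

# Learn "extract the last token" from two examples, then reuse the
# synthesized program on data the user never annotated.
examples = [("Ada Lovelace", "Lovelace"), ("Alan Turing", "Turing")]
name, prog = synthesize(examples)
print(name)                   # last_token
print(prog("Grace Hopper"))   # Hopper
```

Real PBE systems (e.g., FlashFill-style synthesis) search vastly larger program spaces with ranking heuristics, but the specification-by-example loop is the same: examples in, reusable program out.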
2. What methodologies and architectures enable automatic generation of dependable and scalable programs for data acquisition and control systems in data compilation?
This theme focuses on the design and implementation of program generators and compiler-compilers that automate the production of software artifacts for data acquisition, control systems, and general program compilation. It examines architectural frameworks, extended formal automata models, and attribute grammar-based compilers that enable scalable, customizable, and error-free software production, which is essential for integrating diverse data sources and processing logic in data compilation.
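A hedged sketch of the program-generator idea, with an invented specification format: a declarative description of data-acquisition channels is compiled into Python source for a polling routine, so the conversion logic is generated rather than hand-written for each channel.

```python
# Illustrative program-generator sketch. The channel spec schema
# (name/scale/offset) and the emitted code shape are assumptions
# made for this example only.

CHANNELS = [
    {"name": "temperature", "scale": 0.1, "offset": -40.0},
    {"name": "pressure", "scale": 2.0, "offset": 0.0},
]

def generate_reader(channels):
    """Emit Python source for a function converting raw readings to units."""
    lines = ["def read_all(raw):", "    out = {}"]
    for ch in channels:
        lines.append(
            f"    out[{ch['name']!r}] = raw[{ch['name']!r}]"
            f" * {ch['scale']} + {ch['offset']}"
        )
    lines.append("    return out")
    return "\n".join(lines)

source = generate_reader(CHANNELS)
namespace = {}
exec(source, namespace)  # compile and load the generated program
print(namespace["read_all"]({"temperature": 650, "pressure": 51}))
```

The design point mirrors the theme: correctness is concentrated in one generator, so adding a channel means editing the specification, not the code, which is how such generators scale without introducing per-channel bugs.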
3. How can comprehensive data preparation workflows and tools enhance the efficiency and quality of data compilation?
This theme explores approaches, tools, and workflows that support comprehensive data preparation, including data cleaning, integration, profiling, matching, and transformation, for effective data compilation. It examines workflow-based, programmatic, dataset-centric, and automation-driven tools that minimize manual effort, accommodate heterogeneous data sources, and yield reusable, repeatable pipelines underpinning reliable compiled datasets for subsequent analysis.
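The reusable-pipeline idea can be sketched as follows; the cleaning steps and field names are illustrative assumptions, not a prescribed workflow. Individual steps (whitespace normalization, value standardization, record filtering) are composed into a single callable that can be re-run identically on each new batch of records.

```python
# Sketch of a repeatable data-preparation pipeline built from small,
# composable cleaning steps. Field names ("id", "country") and the
# alias table are hypothetical examples.

def normalize_whitespace(record):
    """Strip leading/trailing whitespace from every string field."""
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in record.items()}

def standardize_country(record):
    """Map known country aliases onto a canonical name."""
    aliases = {"USA": "United States", "UK": "United Kingdom"}
    record["country"] = aliases.get(record.get("country"),
                                    record.get("country"))
    return record

def drop_missing_id(records):
    """Filter out records lacking an identifier."""
    return [r for r in records if r.get("id") is not None]

def pipeline(records):
    """The full, re-runnable preparation workflow."""
    records = drop_missing_id(records)
    return [standardize_country(normalize_whitespace(r)) for r in records]

raw = [
    {"id": 1, "country": " USA "},
    {"id": None, "country": "UK"},
    {"id": 2, "country": "UK"},
]
print(pipeline(raw))
```

Because the whole workflow is one function over plain records, the same pipeline can be versioned, tested, and replayed, which is the repeatability property the theme highlights.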