Papers by KUSHVANTH CHOWDARY NAGABHYRU

Journal of Informatics Education and Research, 2025
Generative AI for Large-scale Enterprise Data Engineering: Automating Code and Query Generation, Data Insights, and Analytics
In a data-driven economy, organizations across verticals show a growing demand for frequently ingesting terabytes (or petabytes) of data and performing rapid analytics on it. However, considerable effort is required to transform the data into a standardized, query-optimized format. Data scientists have long wished to automate code and query generation for their projects, and the advent of generative AI promises to enable such automation. Major technology companies have incorporated generative AI modules into their products. For instance, publicly available large language models (LLMs) can take natural language text as input and generate code in Python, Java, and SQL. OpenAI publicly hosts an API to invoke these LLMs, and the open-source community has built similar services. These tools have gained immense popularity, with over 100 million users in just one year. This paper discusses the integration of generative AI tools into large-scale enterprise data engineering workflows for code and query generation, data insights, and analytics.

American Online Journal of Science and Engineering (AOJSE), 2025
Agentic AI concerns the use of decision-making capabilities to perform tasks autonomously, with a focus on AI applied to autonomous systems. This paper explores agentic AI's role beyond traditional automation, particularly in autonomous data engineering and adaptive enterprise systems. By 2025, agentic AI will be increasingly embedded in enterprise data pipelines, allowing systems to self-configure and automate business processes related to data engineering tasks. Such implementations leverage AI to provide holistic system autonomy, moving beyond what AI-centric solutions dedicated to individual subtasks can achieve. Data engineering and adaptive enterprise systems offer an instructive arena for observing the transition from automation to agentic AI: it is most apparent when intelligent agents experienced in data engineering are added to self-solving, self-configuring, and adaptive enterprise systems. Although these developments come with inherent ethical considerations, the transition from automation to agentic AI is best described as evolution rather than revolution.

American Online Journal of Science and Engineering (AOJSE), 2023
Data engineering and machine learning workflows suffer from this confusion and often employ separate pipelines for each. Summaries of how companies build machine learning models reveal that they employ numerous components (e.g., Spark, Airflow, TensorFlow). They also use special-purpose versions of such components to address particular concerns (e.g., Airflow for Machine Learning, Google TensorFlow Data Validation). Different versions of components in separate pipelines prevent companies from achieving the level of automation in model development that Continuous Integration/Continuous Delivery (CI/CD) provides for traditional software. Therefore, companies should strive for a unified pipeline that supports both data engineering and machine learning to fulfill the objective of Automated Model Deployment. Data engineering prepares the data required by an organization. Machine learning extracts knowledge from data; it needs to consume the data made available by data engineering. Data engineering and machine learning pipelines can never be separated; therefore, they should never be implemented using separate tools and schedules. A roadmap is needed that guides enterprises toward building a unified pipeline and, in doing so, helps them achieve Automated Model Deployment.

Online Journal of Engineering Sciences, 2022
Machine Learning (ML) and Artificial Intelligence (AI) are having an increasingly transformative impact on all industries and are already used in many mission-critical use cases in production, bringing considerable value. Data engineering, which combines ETL pipelines with other workflows managing data and machine learning operations, is also significantly impacted. The Intelligent Data Engineering and Automation framework offers the groundwork for intelligent automation processes. However, ML/AI are not the only disruptive forces; new Big Data technologies inspired by Web 2.0 companies are also reshaping the Internet. Companies with the largest Big Data footprints not only provide applications with a Big Data operational model but also source their competitive advantage from data in the form of AI services and, consequently, impact the cost/performance equilibrium of ETL pipelines. All these technologies and reasons help explain why the traditional ETL pipeline design should adapt to current and emerging technologies and may be enhanced through artificial intelligence.