Transforming Data Engineering with Generative AI - CDInsights

Data engineers can use generative AI in multiple ways in their jobs. Some key use cases include using the technology to prep and clean data, write code, and more.

Written by David Curry
Oct 6, 2023
Generative AI is expected to make its mark on every industry in the next decade, as businesses look for ways to improve productivity and enhance customer experience. In data engineering, leading-edge companies are already testing quite a few use cases, with the aim of reducing the amount of manual work engineers need to do and helping them write code.

Here are a few use cases where generative AI can help data engineers.

Data cleaning and preparation

Data comes in a wide variety of formats and one of the key factors in a successful data-led project is ensuring that the data is high quality and readable by the end platform or algorithm. For data engineers, there are tools available for reformatting and cleaning data, but these can get stuck at the processing stage due to incomplete data or unsupported formats. 

With the natural language capabilities of generative AI, data engineers can ask in plain language for specific cleaning or preparation steps to be applied to a batch of data, which helps avoid cases where a batch has to be scrapped because its format is incompatible.
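One cautious way to act on such a request is to have the model return a structured cleaning plan rather than free-form code, and to apply only steps from an allow-list. The following is a minimal sketch under that assumption. The names `plan_cleaning`, `apply_plan`, and `fake_model`, the JSON plan format, and the `ask_model` callable are all hypothetical and invented for illustration; they do not correspond to any specific product's API.

```python
import json
from typing import Callable

# Allow-list of cleaning steps; anything proposed outside this set is rejected.
ALLOWED_OPS = {
    "strip": lambda v: v.strip() if isinstance(v, str) else v,
    "lower": lambda v: v.lower() if isinstance(v, str) else v,
    "empty_to_none": lambda v: None if v == "" else v,
}


def plan_cleaning(request: str, columns: list[str],
                  ask_model: Callable[[str], str]) -> list[dict]:
    """Ask a text-generation model (hypothetical callable) for a JSON plan."""
    prompt = (
        "Return a JSON list of steps, each {\"column\": ..., \"op\": ...}.\n"
        f"Allowed ops: {sorted(ALLOWED_OPS)}\n"
        f"Columns: {columns}\n"
        f"Request: {request}"
    )
    plan = json.loads(ask_model(prompt))
    # Validate every step before anything touches the data.
    for step in plan:
        if step["op"] not in ALLOWED_OPS or step["column"] not in columns:
            raise ValueError(f"Rejected step: {step}")
    return plan


def apply_plan(rows: list[dict], plan: list[dict]) -> list[dict]:
    """Apply each validated step to a copy of every row."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for step in plan:
            col = step["column"]
            new_row[col] = ALLOWED_OPS[step["op"]](new_row.get(col))
        cleaned.append(new_row)
    return cleaned


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return json.dumps([
        {"column": "email", "op": "strip"},
        {"column": "email", "op": "lower"},
        {"column": "city", "op": "empty_to_none"},
    ])
```

In practice `fake_model` would be replaced by a call to whichever model the team uses. Keeping the plan as data means an engineer can review it before it runs, and a malformed or unexpected step fails loudly instead of silently corrupting the batch.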

Code conversion 

During a migration or modernization project, a shift in programming language or platform may require a full code conversion. This is a very time-consuming process, as 1-to-1 changes between coding languages are not always available and programmers need to be able to identify the correct substitute.

Because generative AI tools like ChatGPT have been trained on very large amounts of data, they are considered natural assistants for programmers: they can draw on the documentation, tested code, and forum discussions in that training data to suggest a suitable conversion between many programming languages.
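A small illustration of why direct substitutions are not always available: integer division in Java truncates toward zero, while Python's `//` operator floors toward negative infinity, so a literal translation changes results for negative operands. The sketch below assumes a hypothetical Java routine being ported to Python; `java_int_div` and `java_int_mod` are illustrative helper names, not library functions, and Java's fixed-width integer overflow is deliberately ignored.

```python
def java_int_div(a: int, b: int) -> int:
    """Emulate Java's `a / b` on ints: truncate toward zero."""
    q = abs(a) // abs(b)
    # The quotient is negative only when the operands have different signs.
    return q if (a >= 0) == (b >= 0) else -q


def java_int_mod(a: int, b: int) -> int:
    """Emulate Java's `a % b`: the result takes the sign of the dividend."""
    return a - b * java_int_div(a, b)


# A literal port of `-7 / 2` using Python's floor division gives a different answer.
print(-7 // 2)              # -4 (Python floors)
print(java_int_div(-7, 2))  # -3 (Java truncates)
print(-7 % 2)               # 1  (Python: sign of the divisor)
print(java_int_mod(-7, 2))  # -1 (Java: sign of the dividend)
```

Differences like this are exactly where a suggested conversion, whether from a person or a model, needs to be checked against the original behavior rather than accepted on sight.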

Generating code 

Similar to code conversion, as generative AI tools have been trained on existing code bases and best practices, data engineers can use them to generate new code that aligns with what has already been added. These tools can also analyze existing code and offer recommendations to cut down on the amount of repetitive or boilerplate code. 

As a step up from this, data engineers can also use these systems to help design and implement data pipelines, giving them more time to analyze data quality and application performance.

Testing 

Generative AI can be deployed in various forms for testing performance and security. It can generate test cases that fit the profile of the application or service being delivered, including edge cases that the data engineering team might not think of.
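As an illustration of the kind of edge-case tests such a tool might propose, the sketch below checks a small, hypothetical `parse_amount` helper against inputs that are easy to overlook: surrounding whitespace, thousands separators, empty strings, missing values, and malformed numbers. Both the function and the cases are assumptions made up for this example, and generated tests still need human review before they are trusted.

```python
import unittest
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(raw: Optional[str]) -> Optional[Decimal]:
    """Parse a currency-like string into a Decimal; None means 'missing'."""
    if raw is None:
        return None
    text = raw.strip().replace(",", "")  # drop padding and thousands separators
    if text == "":
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        raise ValueError(f"Not a valid amount: {raw!r}")


class ParseAmountEdgeCases(unittest.TestCase):
    def test_valid_and_missing_inputs(self):
        cases = [
            ("12.50", Decimal("12.50")),
            ("  1,234.00 ", Decimal("1234.00")),
            ("-0.01", Decimal("-0.01")),
            ("", None),
            ("   ", None),
            (None, None),
        ]
        for raw, expected in cases:
            with self.subTest(raw=raw):
                self.assertEqual(parse_amount(raw), expected)

    def test_malformed_input_is_rejected(self):
        for raw in ["12.5.0", "abc", "1e"]:
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    parse_amount(raw)
```

Saved to a file, these tests can be run with `python -m unittest`. Using `subTest` keeps each input's result separate, so one failing edge case does not hide the others.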

Creating visualizations

Programs that take data and visualize it already exist, but with generative AI, data engineers can ask for more specific changes and test how the data would look in a variety of scenarios. By handing off the chart-building work, they can trial more types of visualizations to find the ones that work.

David is a technology writer with several years of experience covering all aspects of IoT, from technology to networks to security.

Cloud Data Insights is a blog that provides insights into the latest trends and developments in the cloud data space. We cover topics related to cloud data management, data analytics, data engineering, and data science.