Why you need a solid data engineering capability in your organization

Nikhil Das Nomula


Data plays a big role in AI. To give some perspective, GPT-3, a predecessor of the models behind ChatGPT, was trained on multiple sources that include web pages, books, Wikipedia, and articles. We can see the results in front of us in how useful ChatGPT has become in our daily lives. Here is a simple example: I wanted to create a vegetarian meal plan along with the list of items I needed to get from the grocery store. It took me around 2-3 minutes to come up with that list, as opposed to spending far longer on Google looking through various sites and compiling the list by hand.

OpenAI had great data engineers who were able to scrape this data from multiple sources, munge and massage it, and then train their models on it. The analogy in most organizations is pulling data from multiple sources and running it through either ETL (extract, transform, load) or ELT (extract, load, transform), depending on your organization’s needs. You can have the best model, but if you do not have good data to train it on, your attempts to leverage AI may prove futile.
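To make the ETL versus ELT distinction concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is an assumption for illustration: the two sources (CRM_ROWS, WEB_ROWS), the table names, and the cleaning rules are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical records from two made-up sources (illustrative data only).
CRM_ROWS = [{"email": " Ana@Example.com ", "spend": "120.50"}]
WEB_ROWS = [
    {"email": "ana@example.com", "spend": "30"},
    {"email": "bo@example.com", "spend": "n/a"},  # bad value, should be dropped
]


def extract():
    """Extract: gather raw rows from every source."""
    return CRM_ROWS + WEB_ROWS


def clean(row):
    """Normalize one record; return None if spend is not numeric."""
    try:
        spend = float(row["spend"])
    except ValueError:
        return None
    return (row["email"].strip().lower(), spend)


def run_etl(conn):
    """ETL: transform in application code, then load only the clean rows."""
    conn.execute("CREATE TABLE customers_etl (email TEXT, spend REAL)")
    cleaned = [c for c in map(clean, extract()) if c is not None]
    conn.executemany("INSERT INTO customers_etl VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT email, SUM(spend) FROM customers_etl GROUP BY email ORDER BY 1"
    ).fetchall()


def run_elt(conn):
    """ELT: load raw rows untouched, then transform inside the database."""
    conn.execute("CREATE TABLE raw_customers (email TEXT, spend TEXT)")
    conn.executemany(
        "INSERT INTO raw_customers VALUES (:email, :spend)", extract()
    )
    return conn.execute(
        """
        SELECT LOWER(TRIM(email)), SUM(CAST(spend AS REAL))
        FROM raw_customers
        WHERE spend GLOB '[0-9]*'      -- keep rows whose spend starts with a digit
        GROUP BY LOWER(TRIM(email))
        ORDER BY 1
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print("ETL:", run_etl(conn))
    print("ELT:", run_elt(conn))
```

Both paths end with the same per-customer totals. The difference is where the cleanup happens: in ETL only cleaned rows ever reach the database, while in ELT the raw rows are kept and can be re-transformed later if the rules change. Which trade-off fits better depends on your organization’s needs.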

We therefore recommend that organizations prioritize building data engineering capabilities, which will help tremendously as they look toward AI and ML in the future.


© 2024 Yajur LLC. All rights reserved.