Nikhil Das Nomula
DBT stands for data build tool. If you have been working with data, you might be familiar with the term ETL, which stands for extract, transform and load. DBT provides the T, the transform functionality: it empowers data engineers and data analysts to perform transformations on data.
DBT comes in two offerings: DBT CLI and DBT Cloud. DBT CLI is basically a set of Python packages. DBT Cloud has a lot of bells and whistles, and it eases the developer experience for data engineers and analysts because it integrates not only with popular data warehouses like Amazon Redshift, Google BigQuery and Snowflake, but also with GitHub.
These integrations drastically reduce the learning curve for someone to get up and running with DBT Cloud. Traditionally, data engineers have not been used to version control systems, and DBT Cloud basically takes care of that for them.
We will have other blogs coming up on DBT to share more info, but before we wrap up this one, here is what DBT does not do. DBT does not store data. DBT is not a data ingestion, data loading or BI tool. DBT is also not a compute engine, unlike Apache Spark. The compute runs in the data warehouse that DBT is connected to.
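To make that last point a little more concrete, here is a minimal sketch of the idea, not DBT itself and not DBT's API. We assume Python's built-in sqlite3 module as a stand-in for a real warehouse, and the table names, columns and the `materialize` helper are hypothetical, made up for illustration. The transformation is written as a plain SELECT statement, and the database does all the work of turning it into a table.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a warehouse
# such as Redshift, BigQuery or Snowflake.
conn = sqlite3.connect(":memory:")

# Assumption: raw data was already loaded by some other tool (the E and L steps).
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "alice", 1250), (2, "bob", 800), (3, "alice", 450)],
)

# The transformation (the T) is just a SELECT statement.
CUSTOMER_TOTALS_SQL = """
SELECT customer, SUM(amount_cents) / 100.0 AS total_amount
FROM raw_orders
GROUP BY customer
"""


def materialize(connection, name, select_sql):
    """Hypothetical helper: ask the database to build a table from a SELECT.

    Only SQL text is sent to the database; no rows pass through Python.
    """
    connection.execute(f"DROP TABLE IF EXISTS {name}")
    connection.execute(f"CREATE TABLE {name} AS {select_sql}")


materialize(conn, "customer_totals", CUSTOMER_TOTALS_SQL)

# Read the transformed table back, only to inspect the result.
rows = conn.execute(
    "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
).fetchall()
print(rows)
```

The Python process here only sends SQL and reads back the final result; the aggregation itself happens inside the database, which is the role the data warehouse plays when DBT is connected to it.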
If you have any questions on DBT or are planning to use DBT, feel free to reach out to us at nikhil.nomula@yajur.tech or book an appointment here.