Introduction to data pipelines in 2 minutes:
Analytics and AI products need lots of data, commonly dubbed Big Data.
To get that data in a usable format, we need to pass it through many different processing stages. Together, these stages are called a data pipeline.
Data pipelines have 3 main stages:
– Ingestion: Gather data from various sources in different formats
– Data hub & warehouse: Cleanse and model data at different stages
– Analytics: Run analytics or use specific data sets for machine learning algorithms
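To make the three stages concrete, here is a minimal sketch in Python. It is a toy illustration, not how a real big data pipeline is built: the two in-memory sources, the field names (user, amount) and the function names are all assumptions made up for this example.

```python
import csv
import io
import json
from statistics import mean

# Hypothetical raw inputs standing in for two sources in different formats.
CSV_SOURCE = "user,amount\nalice,10\nbob,\ncarol,30\n"
JSON_SOURCE = '[{"user": "dave", "amount": "20"}, {"user": "", "amount": "5"}]'


def ingest():
    """Ingestion: gather records from both sources into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(json.loads(JSON_SOURCE))
    return rows


def cleanse(rows):
    """Data hub & warehouse: drop incomplete records and normalise types."""
    clean = []
    for row in rows:
        # Keep only records that have both a user and an amount.
        if row.get("user") and row.get("amount"):
            clean.append({"user": row["user"], "amount": float(row["amount"])})
    return clean


def analyse(rows):
    """Analytics: compute a simple summary over the cleansed data."""
    amounts = [row["amount"] for row in rows]
    return {"count": len(amounts), "mean_amount": mean(amounts)}


def run_pipeline():
    """Chain the three stages in order."""
    return analyse(cleanse(ingest()))


if __name__ == "__main__":
    print(run_pipeline())
```

Because each stage is a separate function, it can be checked on its own, for example by asserting that cleanse() drops the records with a missing user or amount before anything reaches the analytics stage.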
Join me at TestBash New Zealand Online 2020, where I’ll talk a lot more about testing in big data projects!