FLOWCON 2022 has ended
Tuesday, October 18 • 15:05 - 16:00
🇬🇧 Assuring Data Quality At Scale - (A study of Data Mesh in Practice)

Data is the lifeblood of any data-driven organisation. High-value data products, AI/ML pipelines and business decisions all depend on it.
It is therefore imperative that this data is of the highest quality and stays that way. Assuring quality centrally provides standardisation and promotes transparency and trust in the data quality metrics that are calculated and used to measure it.
Data Mesh is a data architecture pattern that has emerged recently. It advocates centralised platform capabilities and federated governance across all data products, which are themselves domain-specific.

This talk is about the practical application of Data Mesh principles in the Data Quality space, the challenges of implementing such a capability at scale and, of course, the opportunities it has in turn unlocked within the organisation.

Why?
Data drives most customer experiences and journeys, which in turn drive the P&L for businesses. This data is captured and sourced via many data pipelines. The quality of data has a direct impact on the quality, accuracy and relevance of ML model output. It also has a proportional impact on the cost of running data engineering pipelines, whether they process data as streams or in batches.

What is this really about?
This talk is about the practical application of Data Mesh principles. There are not many examples of this out there, but there is tremendous interest in it. This talk takes Data Quality as an example of how you can apply these principles in reality, covering the challenges of implementing such a capability at scale and, of course, the opportunities it has in turn unlocked within the organisation.

What does the talk cover?
Following the Data Mesh pattern of building platform capabilities that power decentralised data products, I want to lay out an approach to implementing Data Quality at scale and the key steps in providing confidence and trust in the data being produced and consumed by the data product teams. In this talk, I will cover:
- Data Quality challenges in modern data-driven enterprises, from both stream and batch perspectives
- Dimensions and metrics of Data Quality, subjective versus objective lenses on Data Quality, and the resulting implementation challenges
- The key parts of a Data Quality platform, and an approach to building one at scale, to provide near-real-time visibility of DQ issues
- Fitting this capability into the data ecosystem, including triggering remediation actions such as stopping a data pipeline
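
As a rough illustration of the last three points, here is a minimal Python sketch of objective quality metrics feeding a remediation decision. It is not taken from the talk or from any Expedia Group system: the names QualityCheck, completeness, uniqueness, failed_checks and should_stop_pipeline, the sample records and the thresholds are all hypothetical, and a real platform would compute such metrics over streams or large batches rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for this sketch: a record is a plain dict, and a
# metric maps a batch of records to a score between 0.0 and 1.0.
Record = Dict[str, object]
Metric = Callable[[List[Record]], float]


@dataclass(frozen=True)
class QualityCheck:
    """An objective metric plus the minimum score a data product accepts."""
    name: str
    metric: Metric
    threshold: float


def completeness(field: str) -> Metric:
    """Fraction of records in which `field` is present and not None."""
    def score(records: List[Record]) -> float:
        if not records:
            return 0.0
        filled = sum(1 for r in records if r.get(field) is not None)
        return filled / len(records)
    return score


def uniqueness(field: str) -> Metric:
    """Fraction of non-null values of `field` that are distinct."""
    def score(records: List[Record]) -> float:
        values = [r[field] for r in records if r.get(field) is not None]
        if not values:
            return 0.0
        return len(set(values)) / len(values)
    return score


def failed_checks(records: List[Record], checks: List[QualityCheck]) -> List[str]:
    """Names of the checks whose score falls below their threshold."""
    return [c.name for c in checks if c.metric(records) < c.threshold]


def should_stop_pipeline(records: List[Record], checks: List[QualityCheck]) -> bool:
    """Remediation rule assumed here: stop the pipeline if any check fails."""
    return bool(failed_checks(records, checks))


# Illustrative batch: one missing price and one duplicated id.
batch = [
    {"id": 1, "price": 10.0},
    {"id": 2, "price": None},
    {"id": 2, "price": 5.0},
    {"id": 3, "price": 7.5},
]
checks = [
    QualityCheck("price_completeness", completeness("price"), threshold=0.9),
    QualityCheck("id_uniqueness", uniqueness("id"), threshold=0.7),
]

if should_stop_pipeline(batch, checks):
    print("Stopping pipeline; failed checks:", failed_checks(batch, checks))
```

Keeping the thresholds on the check rather than inside the metric is one way to reflect the subjective versus objective split: the metric is computed the same way for everyone, while each data product team decides what score is good enough.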

Speakers

Gayathri Thiyagarajan

Engineering Leader, Expedia Group
Who are you? I live in England near Windsor with my young family. I have over 17 years of software experience. My other passions include philosophy, history and psychology.
What do you do for a living? I am an Engineering Lead at Expedia Group leading multiple teams building platform...


Tuesday October 18, 2022 15:05 - 16:00 CEST
Fast Flow - Auditorium
  Tech, Presentation
  • LANGUAGE 🇬🇧