Building a Unified Data Platform at Pattern with William Graham
The orchestration of data workflows at scale requires both flexibility and security. At Pattern, decoupling scheduling from orchestration has reshaped how data teams manage large-scale pipelines.

In this episode, we are joined by William Graham, Senior Data Engineer at Pattern, who explains how his team leverages Apache Airflow alongside their open-source tool Heimdall to streamline scheduling, orchestration and access management.

Key Takeaways:
00:00 Introduction.
02:44 Structure of Pattern’s data teams across acquisition, engineering and platform.
04:27 How Airflow became the central scheduler for batch jobs.
08:57 Credential management challenges that led to decoupling scheduling and orchestration.
12:21 Heimdall simplifies multi-application access through a unified interface.
13:15 Standardized operators in Airflow using Heimdall integration.
17:13 Open-source contributions and early adoption of Heimdall within Pattern.
21:01 Community support for Airflow and satisfaction with scheduling flexibility.

Resources Mentioned:
William Graham: https://www.linkedin.com/in/willgraham2/
Pattern | LinkedIn: https://www.linkedin.com/company/pattern-hq/
Pattern | Website: https://pattern.com
Apache Airflow: https://airflow.apache.org
Heimdall on GitHub: https://github.com/Rev4N1/Heimdall
Netflix Genie: https://netflix.github.io/genie/

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow #MachineLearning
How Astronomer Turns Proactive Monitoring Into Customer Success with Collin McNulty
The evolution of Airflow continues to shape data orchestration and monitoring strategies. Leveraging it beyond traditional ETL use cases opens powerful new possibilities for proactive support and internal operations.

In this episode, we are joined by Collin McNulty, Sr. Director of Global Support at Astronomer, who shares insights from his journey into data engineering and the lessons learned from leading Astronomer’s Customer Reliability Engineering (CRE) team.

Key Takeaways:
00:00 Introduction.
03:07 Lessons learned in adapting to major platform transitions.
05:18 How proactive monitoring improves reliability and customer experience.
08:10 Using automation to enhance internal support processes.
12:09 Why keeping systems current helps avoid unnecessary issues.
15:14 Approaches that strengthen system reliability and efficiency.
18:46 Best practices for simplifying complex orchestration dependencies.
23:24 Anticipated innovations that expand orchestration capabilities.

Resources Mentioned:
Collin McNulty: https://www.linkedin.com/in/collin-mcnulty/
Astronomer | LinkedIn: https://www.linkedin.com/company/astronomer/
Astronomer | Website: https://www.astronomer.io
Apache Airflow: https://airflow.apache.org/
Prometheus: https://prometheus.io/
Splunk: https://www.splunk.com/
Overcoming Data Engineering Challenges at Daiichi Sankyo Europe GmbH with Evgenii Prusov
The shift to a unified data platform is reshaping how pharmaceutical companies manage and orchestrate data. Establishing standards across regions and teams ensures scalability and efficiency in handling large-scale analytics.

In this episode, Evgenii Prusov, Senior Data Platform Engineer at Daiichi Sankyo Europe GmbH, joins us to discuss building and scaling a centralized data platform with Airflow and Astronomer.

Key Takeaways:
00:00 Introduction.
02:49 Building a centralized data platform for 15 European countries.
05:19 Adopting SaaS to manage Airflow from day one.
07:01 Leveraging Airflow for data orchestration across products.
08:16 Teaching non-Python users how to work with Airflow is challenging.
12:25 Creating a global data community across Europe, the US and Japan.
14:04 Monthly calls help share knowledge and align regional teams.
15:47 Contributing to the open-source Airflow project as a way to deepen expertise.
16:32 Desire for more guidelines, debugging tutorials and testing best practices in Airflow.

Resources Mentioned:
Evgenii Prusov: https://www.linkedin.com/in/prusov/
Daiichi Sankyo Europe GmbH | LinkedIn: https://www.linkedin.com/company/daiichi-sankyo-europe-gmbh/
Daiichi Sankyo Europe GmbH | Website: https://www.daiichi-sankyo.eu
Apache Airflow: https://airflow.apache.org/
Astronomer: https://www.astronomer.io/
Snowflake: https://www.snowflake.com/
Building a Data-Driven Beauty and Wellness Marketplace at StyleSeat with Paschal Onuorah
StyleSeat is revolutionizing how beauty and wellness professionals grow their businesses through data-driven tools. From streamlining scheduling to optimizing marketing, their platform empowers professionals to focus on their craft while expanding their client base.

In this episode, Paschal Onuorah, Senior Data Engineer at StyleSeat, shares how the company leverages Airflow, dbt, and Cosmos to drive marketplace intelligence, improve client connections and deliver measurable growth for professionals.

Key Takeaways:
00:00 Introduction.
05:44 The role of the data engineering team in driving business success.
08:52 Leveraging technology for real-time business intelligence.
10:52 Data-driven strategies for improving marketing outcomes.
13:05 How adopting the right tools can increase revenue growth.
14:25 Advantages of simplifying and integrating technical workflows.
18:45 Benefits of multi-environment configurations for development and production.
20:17 Foundational skills and best practices for learning Airflow effectively.
22:33 Opportunities for deeper tool integration and improved data visualization.

Resources Mentioned:
Paschal Onuorah: https://www.linkedin.com/in/onuorah-paschal/
StyleSeat | LinkedIn: https://www.linkedin.com/company/styleseat/
StyleSeat | Website: https://www.styleseat.com
Apache Airflow: https://airflow.apache.org/
dbt: https://www.getdbt.com/
Astronomer Cosmos: https://www.astronomer.io/cosmos/
Building the Future of Airflow Execution at Astronomer with Ian Buss and Piotr Chomiak
The evolution of orchestration in Airflow continues with innovations that address both scalability and security. From improving executor reliability to enabling remote execution, these advancements reshape how organizations manage data pipelines.

In this episode, we’re joined by Ian Buss, Principal Software Engineer at Astronomer, and Piotr Chomiak, Principal Product Manager at Astronomer, who share insights into the Astro Executor and remote execution.

Key Takeaways:
00:00 Introduction.
04:13 How product leadership drives scalability for enterprise needs.
08:23 Architectural changes that improve reliability and remove bottlenecks.
10:15 Metrics that enhance visibility into system performance.
12:54 The role of remote execution in addressing security requirements.
15:56 Differences between open-source solutions and managed offerings.
19:04 Broad industry adoption and applicability of remote execution.
20:39 Future advancements in language support and multi-tenancy.

Resources Mentioned:
Ian Buss: https://www.linkedin.com/in/ian-buss/
Piotr Chomiak: https://www.linkedin.com/in/piotr-chomiak-b1955624/
Astronomer | Website: https://www.astronomer.io
Apache Airflow: https://airflow.apache.org/
Airflow Slack Community: https://airflow.apache.org/community/
Beyond Analytics conference: https://astronomer.io/beyond/dataflowcast