How do you manage the dependencies of a large-scale data science project? How do you migrate that project from a laptop to cloud infrastructure or utilize GPUs and multiple instances in parallel? This week on the show, Savin Goyal returns to discuss the updates to the open-source framework Metaflow.
Savin briefly describes the Metaflow platform and the goal of simplifying engineering overhead for data scientists and programmers. We discuss how the platform captures snapshots of a project as you work, allowing you to go back in time or share the state of your project with another team member.
We dig into the complicated process of managing dependencies for machine learning and data science projects. Savin describes how the required external libraries can be specified within a flow with the new @pypi or @conda decorators. This allows a project to scale from a local machine to the cloud or multiple instances with all dependencies included.
He talks about starting a new company, Outerbounds, with co-workers from Netflix. Their vision is to continue to build the Metaflow open-source platform and offer customers scalable enterprise-grade infrastructure.
This week’s episode is brought to you by Intel.
Course Spotlight: Everyday Project Packaging With pyproject.toml
In this Code Conversation video course, you’ll learn how to package your everyday projects with pyproject.toml. Playing on the same team as the import system means you can call your project from anywhere, ensure consistent imports, and have one file that’ll work for many build systems.