06x04: Keeping Your GPUs Fed with a Data Pipeline from Hammerspace with Molly Presley

2024-03-11

AI training is a uniquely data-hungry application, and it requires a special data pipeline to keep expensive GPUs fed. This episode of Utilizing Tech focuses on the data platform for machine learning, featuring Molly Presley of Hammerspace along with Frederic Van Haren and Stephen Foskett. Nothing is worse than idle hardware, especially when it comes to expensive GPUs intended for ML training. Performance is important, but parallel access and access to multiple systems are just as important. Building an AI training environment requires identifying and eliminating bottlenecks at every layer, yet many systems are simply not capable of scaling to the extent required by the largest GPU clusters. A data pipeline also goes well beyond storage: training requires checkpoints, metadata, and access to different data points, and different models have unique requirements as well. Ultimately, AI applications require a flexible data pipeline, not just high-performance storage.

Hosts:

Stephen Foskett, Organizer of Tech Field Day: https://www.linkedin.com/in/sfoskett/

Frederic Van Haren, CTO and Founder of HighFens, Inc.: https://www.linkedin.com/in/fredericvharen/

Guest: Molly Presley, Head of Global Marketing at Hammerspace: https://www.linkedin.com/in/mollyjpresley/

Follow Gestalt IT and Utilizing Tech

Website: https://www.GestaltIT.com/

Utilizing Tech: https://www.UtilizingTech.com/

X/Twitter: https://www.twitter.com/GestaltIT

X/Twitter: https://www.twitter.com/UtilizingTech

LinkedIn: https://www.linkedin.com/company/Gestalt-IT

Tags: #UtilizingAI #AI #AITraining @Hammerspace_Inc @UtilizingTech