J.T. Wolohan is the author of "Mastering Large Datasets with Python," a book that helps Python developers adopt a functional programming style in their project prototyping, in order to scale up towards big data projects. Greg Nokes, a Master Technical Architect with Heroku, opens their conversation by laying out what Python is and what it's being used for. As a high-level scripting language, Python was primarily used by sysadmins as a way to quickly manipulate data. Over the years, an ecosystem of third-party packages has grown up around scientific and mathematical approaches. Similarly, its web frameworks have shifted towards asynchronous flows, allowing developers to ingest data, process it, and handle traffic in more efficient ways.
J.T.'s book is all about how to move from small datasets to larger ones. He lays out three stages that every project goes through. In the first phase, a developer can solve a problem on their individual PC. This stage typically deals with datasets that are manageable and can be processed with the compute hardware on hand. The second phase is one in which you still have enough compute power on your laptop to process the data, but the data itself is too large to store there. It's not unreasonable for a machine learning corpus to reach five terabytes, for example. The third phase is one where an individual developer has neither the compute resources to process the data nor the disk space to store it. In these cases, external resources are necessary, such as cluster computing and some type of distributed data system. J.T. argues that by exercising good programming practices in the first phase, the third "real world" phase will require little modification of your actual data processing algorithms.
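To make that last point concrete, here is a minimal sketch of the idea, not an example taken from the book. It assumes the processing step is written as a pure function applied with a map and combined with a reduce; the names `word_count`, `total_words`, and `map_fn` are hypothetical. Because the algorithm only depends on "something that behaves like `map`", the built-in `map` used on a laptop can be swapped for a pool's `map` without touching the processing code. A thread pool stands in here for whatever parallel or distributed backend a larger project would actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def word_count(line):
    """Pure function: count the words in one line of text."""
    return len(line.split())


def total_words(lines, map_fn=map):
    """Map each line to a count, then reduce the counts to one total.

    map_fn is a hypothetical hook: any callable with the same shape as
    the built-in map (function first, then an iterable) can be passed in.
    """
    counts = map_fn(word_count, lines)
    return reduce(lambda a, b: a + b, counts, 0)


lines = ["the quick brown fox", "jumps over", "the lazy dog"]

# Phase one: everything runs serially on one machine.
serial_total = total_words(lines)

# Later phases: the same algorithm with a pool's map swapped in.
# A thread pool is only a stand-in for a process pool or a cluster.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel_total = total_words(lines, map_fn=pool.map)

print(serial_total, parallel_total)
```

Both calls return the same total; only the `map_fn` argument changes between the serial and the pooled run.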