J.T. Wolohan is the author of "Mastering Large Datasets with Python," a book that helps Python developers adopt functional programming styles in their project prototyping in order to scale up towards big data projects. Greg Nokes, a Master Technical Architect with Heroku, opens their conversation by laying out what Python is and what it's being used for. As a high-level scripting language, Python was primarily used by sysadmins as a way to quickly manipulate data. Over the years, an ecosystem of third-party packages has grown up around scientific and mathematical approaches. Similarly, its web frameworks have shifted towards asynchronous flows, allowing developers to ingest data, process it, and handle traffic in more efficient ways.
J.T.'s book is all about how to move from small datasets to larger ones. He lays out three stages that every project goes through. In the first phase, a developer can solve a problem on their individual PC. This stage typically deals with datasets that are manageable and can be processed with the compute hardware on hand. The second phase is one in which you still have enough compute power on your laptop to process the data, but the data itself is too large to store there. It's not unreasonable for a machine learning corpus to reach five terabytes, for example. The third phase is one where an individual developer has neither the compute resources to process the data nor the disk space to store it. In these cases, external resources are necessary, such as cluster computing and some type of distributed data system. J.T. argues that by exercising good programming practices in the first phase, the third, "real world" phase will require little modification of your actual data processing algorithms.
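To make that last point concrete, here is a minimal sketch of a map-and-reduce style word count written in a functional style. It is not taken from the book or the episode; the function names, the `map_fn` parameter, and the sample data are all hypothetical and chosen only for illustration. The idea it tries to show is that when each step is a small pure function, the part that decides where the work runs can be swapped out without touching the processing logic.

```python
from collections import Counter
from functools import reduce


def count_words(line):
    """Map step: turn one line of text into a Counter of its words."""
    return Counter(line.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into a new Counter."""
    return left + right


def total_word_counts(lines, map_fn=map):
    """Count words across lines; map_fn decides where the map step runs.

    map_fn is a hypothetical hook: any callable with the same
    (function, iterable) shape as the built-in map can be passed in.
    """
    partial_counts = map_fn(count_words, lines)
    return reduce(merge_counts, partial_counts, Counter())


# Hypothetical sample data standing in for a real corpus.
sample_lines = ["big data is big", "data pipelines scale"]
totals = total_word_counts(sample_lines)
```

On a single machine the built-in `map` is enough. As the data grows, one assumed path is to pass the `map` method of a `multiprocessing.Pool` as `map_fn`, and later a cluster framework's equivalent, while `count_words` and `merge_counts` stay unchanged.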