In this episode I talk about the data transformation frameworks available to data scientists who write Python code.
The usual suspect is clearly Pandas, the most widely used library and the de facto standard. However, when data volumes grow and the computation has to be distributed (following a map-reduce paradigm), Pandas no longer performs as expected. Other frameworks come into play in that context.
In this episode I explain which frameworks are the closest equivalents to Pandas in big data contexts.
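To make the map-reduce idea mentioned above concrete, here is a minimal pure-Python sketch of a grouped sum computed partition by partition. It is only an illustration of the pattern, not how any of the frameworks listed in these notes is implemented; the toy records and the helper names (partition, map_partial_sum, merge) are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical toy data: (city, amount) records standing in for DataFrame rows.
records = [
    ("rome", 10), ("oslo", 5), ("rome", 7),
    ("oslo", 1), ("lima", 4), ("rome", 3),
]

def partition(rows, n_parts):
    """Split rows into n_parts chunks, one per (imaginary) worker."""
    return [rows[i::n_parts] for i in range(n_parts)]

def map_partial_sum(chunk):
    """Map step: each partition computes its own per-key totals."""
    totals = Counter()
    for key, amount in chunk:
        totals[key] += amount
    return totals

def merge(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right

# Here the map step runs sequentially; a distributed framework would
# run it on separate cores or machines and ship only the partial results.
partials = [map_partial_sum(chunk) for chunk in partition(records, 3)]
totals = reduce(merge, partials, Counter())
print(dict(totals))
```

The point of the pattern is that no single step needs the whole dataset in memory: each partition is summarised on its own, and only the small partial results are merged.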
Don't forget to join our Discord channel to comment on previous episodes or propose new ones.
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. Amethix is a consulting firm focused on data science, machine learning, and artificial intelligence.
Pandas - a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
Modin - Scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
Dask - advanced parallelism for analytics - https://dask.org/
Ray - a fast and simple framework for building and running distributed applications - https://github.com/ray-project/ray
RAPIDS - GPU data science - https://rapids.ai/