In this episode I talk about the data transformation frameworks available to data scientists who write Python code.
The usual suspect is clearly Pandas, the most widely used library and the de facto standard. However, when data volumes grow and computation has to be distributed (following a map-reduce paradigm), Pandas no longer performs as expected, and other frameworks come into play.
In this episode I explain which frameworks are the best equivalents to Pandas in big data contexts.
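To make the map-reduce idea a bit more concrete, here is a minimal sketch in plain Pandas of how a grouped mean can be computed partition by partition. It is only an illustration of the paradigm, not how any of the frameworks below implement it. The column names `key` and `value`, the helper names `map_partition` and `reduce_partials`, and the partition size are all assumptions made up for this example.

```python
import pandas as pd


def map_partition(part: pd.DataFrame) -> pd.DataFrame:
    """Map step: compute partial sums and counts per key on one partition."""
    return part.groupby("key")["value"].agg(["sum", "count"])


def reduce_partials(partials: list) -> pd.Series:
    """Reduce step: add up the partial results, then finish the mean."""
    combined = pd.concat(partials).groupby(level=0).sum()
    return combined["sum"] / combined["count"]


# Toy data (hypothetical); in a real setting each partition would live
# on a different worker or be read from a different file.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b", "a", "c"],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Split the frame into partitions of two rows each.
partitions = [df.iloc[i:i + 2] for i in range(0, len(df), 2)]

# Each partition is processed independently, then the results are merged.
distributed_mean = reduce_partials([map_partition(p) for p in partitions])

# The same result computed on the whole frame at once, for comparison.
single_node_mean = df.groupby("key")["value"].mean()
```

The mean is carried as a (sum, count) pair because per-partition means cannot simply be averaged together when partitions hold different numbers of rows per key.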
Don't forget to join our Discord channel to comment on previous episodes or propose new ones.
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. Amethix is a consulting firm focused on data science, machine learning, and artificial intelligence.
Pandas - a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
Modin - scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
Dask - advanced parallelism for analytics - https://dask.org/
Ray - a fast and simple framework for building and running distributed applications - https://github.com/ray-project/ray
RAPIDS - GPU data science - https://rapids.ai/