Practical AI: Machine Learning, Data Science

Technology: Software How-To

Large models on CPUs

2023-05-02


Model sizes are crazy these days, with billions and billions of parameters. As Mark Kurtz explains in this episode, this makes inference slow and expensive, even though 90% or more of the parameters can have no influence on the outputs at all.

Mark helps us understand the practicalities of model optimization and CPU inference and the progress being made there, including the growing opportunities to run LLMs and other generative AI models on commodity hardware.


Changelog++ members save 1 minute on this episode because they made the ads disappear. Join today!

Sponsors:

Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com
Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Featuring:

Mark Kurtz – Twitter, LinkedIn
Daniel Whitenack – Twitter, GitHub, Website

Show Notes:

Neural Magic
SparseML
SparseZoo
Neural Magic Scales up MLPerf™ Inference v3.0 Performance With Demonstrated Power Efficiency; No GPUs Needed
Deploy Optimized Hugging Face Models With DeepSparse and SparseZoo
SparseGPT: Remove 100 Billion Parameters for Free


Timestamps:

(00:44) - Neural Magic's Mark Kurtz
(03:24) - Why does LLM size matter?
(06:15) - GPUs vs. CPUs
(08:45) - Overcoming perception
(10:54) - Most parameters don't affect results
(16:01) - Balancing space & sparsity
(17:47) - Tackling performance hits
(20:38) - Aware optimization vs not?
(23:52) - Community tools
(26:11) - Neural Magic tools
(29:56) - Supporting new architecture
(31:40) - Exciting research trends
(34:52) - Looking forward in this space
(37:05) - Outro

