Papers Read on AI


SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

2023-03-27


Large language models (LLMs) show excellent performance but are compute- and memory-intensive. Quantization can reduce memory use and accelerate inference. However, for LLMs beyond 100 billion parameters, existing methods either cannot maintain accuracy or do not run efficiently on hardware. We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution that enables 8-bit weight, 8-bit activation (W8A8) quantization for LLMs. Based on the observation that weights are easy to quantize while activations are not, SmoothQuant smooths the activation outliers by migrating the quantization difficulty from activations to weights offline, using a mathematically equivalent transformation.

2022: Guangxuan Xiao, Ji Lin, Mickael Seznec, Julien Demouth, Song Han
https://arxiv.org/pdf/2211.10438v4.pdf
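
To make the "mathematically equivalent transformation" concrete, here is a minimal sketch in plain Python, not the authors' implementation. It divides each activation channel by a per-channel factor and multiplies the matching weight row by the same factor, so the product of the two is unchanged while the activation outliers shrink. The per-channel factor follows the form given in the linked paper (activation maximum raised to alpha, divided by weight maximum raised to 1 - alpha). The function names, the toy matrices, and the choice alpha = 0.5 are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def smooth(x, w, alpha=0.5, eps=1e-5):
    """Illustrative smoothing step (hypothetical helper, not the official API).

    x: activations, shape tokens x channels
    w: weights, shape channels x outputs
    Returns the scaled activations, scaled weights, and per-channel factors.
    """
    n_ch = len(w)
    # Largest magnitude seen in each input channel, for activations and weights.
    act_max = [max(max(abs(row[j]) for row in x), eps) for j in range(n_ch)]
    w_max = [max(max(abs(v) for v in w[j]), eps) for j in range(n_ch)]
    # Per-channel smoothing factor: act_max^alpha / w_max^(1 - alpha).
    s = [act_max[j] ** alpha / w_max[j] ** (1 - alpha) for j in range(n_ch)]
    # Divide activations by s, multiply the matching weight rows by s.
    x_s = [[row[j] / s[j] for j in range(n_ch)] for row in x]
    w_s = [[v * s[j] for v in w[j]] for j in range(n_ch)]
    return x_s, w_s, s


# Toy data (assumed): channel 1 of the activations carries large outliers.
x = [[0.5, 40.0, -0.2], [-0.3, -60.0, 0.1]]
w = [[0.2, -0.1], [0.05, 0.3], [-0.4, 0.25]]

x_s, w_s, s = smooth(x, w)
y = matmul(x, w)        # original layer output
y_s = matmul(x_s, w_s)  # output after smoothing; should match y

max_before = max(abs(v) for row in x for v in row)
max_after = max(abs(v) for row in x_s for v in row)
print(max_before, max_after)
```

In this sketch the layer output is the same before and after smoothing up to floating-point rounding, while the largest activation magnitude drops, which is what makes the activations easier to quantize to 8 bits. The weight scaling can be computed once ahead of time, which is the sense in which the migration happens offline.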


More Episodes

ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases

2024-11-01

550

Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities

2024-10-31

176

Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation

2024-10-30

129

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

2024-10-18

240

LightRAG: Simple and Fast Retrieval-Augmented Generation

2024-10-17

218

Aria: An Open Multimodal Native Mixture-of-Experts Model

2024-10-16

114

AgentKit: Structured LLM Reasoning with Dynamic Graphs

2024-10-15

144

PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling

2024-10-14

120

Diffusion Models are Evolutionary Algorithms

2024-10-10

174

Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering

2024-10-09

115

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

2024-10-08

152

Internal Consistency and Self-Feedback in Large Language Models: A Survey

2024-10-07

123

On the Diagram of Thought

2024-10-02

146

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

2024-10-01

111

StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

2024-09-30

115

On the limits of agency in agent-based models

2024-09-24

177

Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

2024-09-23

117

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

2024-09-22

105

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

2024-09-21

145

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

2024-09-20

100
