Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into the topic of vector databases and retrieval augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks versus combining vector database retrieval with LLMs, the advantages and complexities of RAG with vector databases, the key considerations for building and deploying real-world RAG-based applications, and Pinecone's new serverless offering, which we cover in depth. Currently in public preview, Pinecone Serverless is a vector database that enables on-demand data loading, flexible scaling, and cost-effective query processing. Ram discusses how the serverless paradigm impacts the vector database’s core architecture, key features, and other considerations. Lastly, Ram shares his perspective on the future of vector databases in helping enterprises deliver RAG systems.
The complete show notes for this episode can be found at twimlai.com/go/669.