Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into vector databases and retrieval-augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks and combining LLMs with retrieval from vector databases, the advantages and complexities of RAG with vector databases, and the key considerations for building and deploying real-world RAG-based applications, and we take an in-depth look at Pinecone’s new serverless offering. Currently in public preview, Pinecone Serverless is a vector database that enables on-demand data loading, flexible scaling, and cost-effective query processing. Ram discusses how the serverless paradigm impacts the vector database’s core architecture, key features, and other considerations. Lastly, Ram shares his perspective on the future of vector databases in helping enterprises deliver RAG systems.
The complete show notes for this episode can be found at twimlai.com/go/669.
Why Deep Networks and Brains Learn Similar Features with Sophia Sanborn - #644
Inverse Reinforcement Learning Without RL with Gokul Swamy - #643
Explainable AI for Biology and Medicine with Su-In Lee - #642
Transformers On Large-Scale Graphs with Bayan Bruss - #641
The Enterprise LLM Landscape with Atul Deo - #640
BloombergGPT - an LLM for Finance with David Rosenberg - #639
Are LLMs Good at Causal Reasoning? with Robert Osazuwa Ness - #638
Privacy vs Fairness in Computer Vision with Alice Xiang - #637
Unifying Vision and Language Models with Mohit Bansal - #636
Data Augmentation and Optimized Architectures for Computer Vision with Fatih Porikli - #635
Mojo: A Supercharged Python for AI with Chris Lattner - #634
Stable Diffusion and LLMs at the Edge with Jilei Hou - #633
Modeling Human Behavior with Generative Agents with Joon Sung Park - #632
Towards Improved Transfer Learning with Hugo Larochelle - #631
Language Modeling With State Space Models with Dan Fu - #630
Building Maps and Spatial Awareness in Blind AI Agents with Dhruv Batra - #629
AI Agents and Data Integration with GPT and LLaMa with Jerry Liu - #628
Hyperparameter Optimization through Neural Network Partitioning with Christos Louizos - #627
Are LLMs Overhyped or Underappreciated? with Marti Hearst - #626
Are Large Language Models a Path to AGI? with Ben Goertzel - #625