How can you measure the quality of a large language model? What tools can measure bias, toxicity, and truthfulness levels in a model using Python? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to discuss techniques and tools for evaluating LLMs with Python.
Jodie provides some background on large language models and how they can absorb vast amounts of information about the relationship between words using a type of neural network called a transformer. We discuss training datasets and the potential quality issues with crawling uncurated sources.
We dig into ways to measure levels of bias, toxicity, and hallucinations using Python. Jodie shares three benchmarking datasets and links to resources to get you started. We also discuss ways to augment models using agents or plugins, which can access search engine results or other authoritative sources.
This week’s episode is brought to you by Intel.
Course Spotlight: Learn Text Classification With Python and Keras
In this course, you’ll learn about Python text classification with Keras, working your way from a bag-of-words model with logistic regression to more advanced methods, such as convolutional neural networks. You’ll see how you can use pretrained word embeddings, and you’ll squeeze more performance out of your model through hyperparameter optimization.
Topics:
Background Links:
Dataset Links:
Tutorials and Documentation for Python Packages:
Measurement Links:
Training Data for LLMs:
Agents and Plugin Links:
Additional Links: