Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On OpenAI Dev Day, published by Zvi on November 9, 2023 on LessWrong.
OpenAI DevDay was this week. What delicious and/or terrifying things await?
Turbo Boost
First off, we have GPT-4-Turbo.
Today we're launching a preview of the next generation of this model, GPT-4 Turbo.
GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128k context window so it can fit the equivalent of more than 300 pages of text in a single prompt. We also optimized its performance so we are able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.
GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API, and we plan to release the stable production-ready model in the coming weeks.
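As a sketch of what calling the preview model looks like, assuming the request shape of the Chat Completions API. Actually sending this requires the `openai` package and an API key, so the snippet only constructs and inspects the request payload; the prompt text is invented for illustration.

```python
# A sketch of calling the preview model, assuming the Chat Completions request
# shape. We only build and print the payload; no network call is made.
import json

payload = {
    "model": "gpt-4-1106-preview",  # preview model name from the announcement
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize OpenAI Dev Day in one sentence."},
    ],
    "max_tokens": 100,
}

print(json.dumps(payload, indent=2))
```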
Knowledge up to April 2023 is a big game. Cutting the price in half is another big game. A 128k context window retakes the lead on that from Claude-2. That chart from last week of how GPT-4 was slow and expensive, opening up room for competitors? Back to work, everyone.
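To make the price cut concrete, a back-of-the-envelope comparison using the per-1K-token prices as announced (GPT-4 8K at $0.03 input / $0.06 output, GPT-4 Turbo at $0.01 / $0.03 — which is the quoted 3x and 2x reduction). The token counts are arbitrary example numbers.

```python
# Per-1K-token prices as announced: GPT-4 (8K) at $0.03 in / $0.06 out,
# GPT-4 Turbo at $0.01 / $0.03 -- the quoted 3x / 2x reduction.
def cost(prompt_tokens: int, completion_tokens: int,
         in_price: float, out_price: float) -> float:
    """Dollar cost of one request at the given per-1K-token rates."""
    return prompt_tokens / 1000 * in_price + completion_tokens / 1000 * out_price

# An example request with 5,000 prompt tokens and 1,000 completion tokens:
gpt4_cost = cost(5_000, 1_000, 0.03, 0.06)
turbo_cost = cost(5_000, 1_000, 0.01, 0.03)
print(f"GPT-4: ${gpt4_cost:.2f}  GPT-4 Turbo: ${turbo_cost:.2f}")
```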
What else?
Function calling updates
Function calling lets you describe functions of your app or external APIs to models, and have the model intelligently choose to output a JSON object containing arguments to call those functions. We're releasing several improvements today, including the ability to call multiple functions in a single message: users can send one message requesting multiple actions, such as "open the car window and turn off the A/C", which would previously require multiple roundtrips with the model. We are also improving function calling accuracy: GPT-4 Turbo is more likely to return the right function parameters.
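A sketch of the parallel function calling setup described above: two declared functions and a single user message that should trigger both. The function names and parameter schemas (`set_window`, `set_ac`) are invented for illustration; the `tools` request shape follows the Chat Completions API as introduced at Dev Day.

```python
# Two invented function declarations plus one user message that asks for both
# actions -- the "open the car window and turn off the A/C" example.
def tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Helper to build one function declaration in the API's tools schema."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

request = {
    "model": "gpt-4-1106-preview",
    "messages": [
        {"role": "user", "content": "Open the car window and turn off the A/C"}
    ],
    "tools": [
        tool("set_window", "Open or close the car window",
             {"open": {"type": "boolean"}}, ["open"]),
        tool("set_ac", "Turn the A/C on or off",
             {"on": {"type": "boolean"}}, ["on"]),
    ],
}
```

With parallel calling, the model can return tool calls for both declared functions in a single response instead of two roundtrips.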
This kind of feature seems highly fiddly and implementation-dependent. When it starts working well enough, suddenly it is great, and I have no idea whether this version will count. I will watch out for reports. For now, I am not trying to interact with any APIs via GPT-4. Use caution.
Improved instruction following and JSON mode
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., "always respond in XML"). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its output to generate a syntactically correct JSON object. JSON mode is useful for developers generating JSON in the Chat Completions API outside of function calling.
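A sketch of what enabling JSON mode looks like. The `response_format` value `{"type": "json_object"}` is from the announced API; the prompt contents and sample reply are invented. Note OpenAI's documented requirement that the word "JSON" appear somewhere in the messages when this mode is on.

```python
import json

# JSON mode via the response_format parameter. The docs require that "JSON"
# be mentioned in the messages when this mode is enabled.
request = {
    "model": "gpt-4-1106-preview",
    "response_format": {"type": "json_object"},
    "messages": [
        {"role": "system",
         "content": "Answer in JSON with keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
}

# What the guarantee buys you: the reply parses cleanly as JSON.
# (Sample reply invented for illustration.)
sample_reply = '{"city": "Paris", "country": "France"}'
parsed = json.loads(sample_reply)
```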
Better instruction following is incrementally great. Always frustrating when instructions can't be relied upon. Could allow some processes to be profitably automated.
Reproducible outputs and log probabilities
The new seed parameter enables reproducible outputs by making the model return consistent completions most of the time. This beta feature is useful for use cases such as replaying requests for debugging, writing more comprehensive unit tests, and generally having a higher degree of control over the model behavior. We at OpenAI have been using this feature internally for our own unit tests and have found it invaluable. We're excited to see how developers will use it.
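A sketch of the reproducibility setup: fix `seed` (and temperature) and, per the same announcement, compare the `system_fingerprint` field the API returns across calls — if the fingerprint matches, the backend has not changed and outputs should usually match too. The seed value and prompt here are arbitrary.

```python
# Fixing seed and temperature for (mostly) reproducible completions.
# The API also returns a system_fingerprint per the Dev Day announcement;
# a changed fingerprint signals a backend change that can break determinism.
request = {
    "model": "gpt-4-1106-preview",
    "seed": 42,          # arbitrary fixed seed
    "temperature": 0,    # remove sampling randomness as well
    "messages": [{"role": "user", "content": "Pick a random animal."}],
}
```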
We're also launching a feature to return the log probabilities for the most likely output tokens generated by GPT-4 Turbo and GPT-3.5 Turbo in the next few weeks, which will be useful for building features such as autocomplete in a search experience.
I love the idea of seeing the probabilities of different responses on the regular, especially if incorporated into ChatGPT. It provides so much context for knowing what to make of the answer. The distribution of possible answers is the true answer. Super excited in a good way.
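Since the API reports log probabilities rather than probabilities, seeing the distribution takes one small step: exponentiate. A minimal sketch, with invented token/logprob values standing in for what the forthcoming feature would return.

```python
import math

# Log probabilities -> probabilities: the "distribution of possible answers"
# view. Token names and logprob values below are invented for illustration.
top_logprobs = {"Paris": -0.02, "Lyon": -4.1, "Nice": -5.3}
probs = {token: math.exp(lp) for token, lp in top_logprobs.items()}

for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{token}: {p:.3f}")
```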
Updated GPT-3.5 Turbo
In addition to GPT-4 Turbo, we are also releasing a...