Introduction to GPT-5
OpenAI has officially launched GPT-5, which the company describes as a major upgrade over GPT-4 and a significant step toward artificial general intelligence (AGI). With about 700 million people using ChatGPT every week, GPT-5 is designed to be faster, smarter, and more intuitive, offering what OpenAI calls PhD-level expert assistance across diverse domains.
Key Features and Improvements
Expert-Level Intelligence
- GPT-5 delivers deep reasoning and expert knowledge in fields such as coding, math, physics, law, and healthcare.
- It automatically adjusts its thinking depth to provide precise and comprehensive answers without user intervention.
Enhanced Coding Capabilities
- Described by OpenAI as the best coding model on the market, GPT-5 sets new highs on SWE-bench and performs strongly on Aider Polyglot, covering multiple programming languages and complex software engineering tasks.
- Supports "vibe coding," enabling users to build sophisticated applications and interactive demos rapidly.
- Features include long context windows (up to 400K tokens), improved instruction following, and agentic tool calls for autonomous problem-solving.
Personalized and Interactive Learning
- GPT-5 supports dynamic learning modes, including voice interaction with natural speech and video capabilities.
- It can create educational content such as interactive physics demos and language learning apps with embedded games.
- Customizable personalities and chat colors enhance user engagement and personalization.
Healthcare Advancements
- OpenAI reports that GPT-5 is its most reliable and accurate model for health-related queries, validated by evaluations with 250 physicians.
- It empowers patients to understand complex medical information, make informed decisions, and advocate for themselves.
- Real-world stories highlight GPT-5’s role in translating medical jargon and supporting critical healthcare decisions.
Accessibility and Availability
- GPT-5 is available to all users, including the free tier and Plus, Pro, Team, Enterprise, and Edu customers, with generous usage limits.
- The API offers three model variants: GPT-5, GPT-5 mini, and GPT-5 nano, catering to different cost and latency needs.
- New API features include custom tools, tool call preambles, and verbosity controls for tailored developer experiences.
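As a rough illustration of how the three variants trade capability against cost and latency, the sketch below picks a variant from a pair of budgets. The identifier strings, the relative cost and latency numbers, and the `pick_variant` helper are all hypothetical placeholders; none of them come from OpenAI's API documentation, and a real choice would rest on measured latency and published pricing.

```python
from dataclasses import dataclass


# Hypothetical variant table. The names mirror the three variants listed above,
# but the identifier strings and the relative numbers are illustrative assumptions.
@dataclass(frozen=True)
class Variant:
    name: str
    relative_cost: int     # higher means more expensive (made-up scale)
    relative_latency: int  # higher means slower (made-up scale)


# Ordered from most capable to least capable.
VARIANTS = [
    Variant("gpt-5", relative_cost=3, relative_latency=3),
    Variant("gpt-5-mini", relative_cost=2, relative_latency=2),
    Variant("gpt-5-nano", relative_cost=1, relative_latency=1),
]


def pick_variant(max_latency: int, max_cost: int) -> Variant:
    """Return the most capable variant that fits both budgets.

    Falls back to the smallest variant when nothing fits.
    """
    for variant in VARIANTS:
        if variant.relative_latency <= max_latency and variant.relative_cost <= max_cost:
            return variant
    return VARIANTS[-1]


if __name__ == "__main__":
    print(pick_variant(max_latency=2, max_cost=3).name)
```

Keeping the table ordered by capability makes the selection a simple first-match scan; a production system would more likely route on task difficulty as well as budget.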
Safety and Reliability Enhancements
- GPT-5 introduces "safe completions," balancing helpfulness with safety by partially answering sensitive or dual-use queries.
- It significantly reduces hallucinations and deceptive responses compared to previous models.
- Safety training has been overhauled to provide clearer refusals and helpful alternatives when necessary.
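The contrast between a comply-or-refuse policy and safe completions can be pictured with a toy decision table. This is only a conceptual sketch: the `Query` fields, both policy functions, and the outcome labels are invented for illustration, and the real behavior comes from model training rather than from hand-written rules like these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    dual_use: bool         # could the requested information cause harm if misused?
    explicit_intent: bool  # does the wording make a harmful goal explicit?


def binary_policy(q: Query) -> str:
    """Toy comply-or-refuse policy that reacts to how the prompt is worded."""
    return "refuse" if q.explicit_intent else "full_answer"


def safe_completion_policy(q: Query) -> str:
    """Toy safe-completion policy that constrains the answer itself."""
    if not q.dual_use:
        return "full_answer"
    # Same handling for neutral and explicit phrasing: stay high level,
    # explain what cannot be provided, and point to safer alternatives.
    return "high_level_answer_with_explanation_and_alternatives"


if __name__ == "__main__":
    neutral = Query(dual_use=True, explicit_intent=False)
    explicit = Query(dual_use=True, explicit_intent=True)
    print(binary_policy(neutral), binary_policy(explicit))
    print(safe_completion_policy(neutral), safe_completion_policy(explicit))
```

In this toy model the binary policy gives opposite outcomes for the same dual-use request depending on phrasing, while the safe-completion policy treats both phrasings the same way.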
Developer and Business Impact
- GPT-5 accelerates software engineering by autonomously managing complex coding tasks, debugging, and testing.
- Businesses across industries like life sciences, finance, and insurance are leveraging GPT-5 for faster, more accurate analysis and decision-making.
- Government adoption includes enabling 2 million US federal employees to improve public services.
Live Demonstrations and Use Cases
- Interactive demos showcased GPT-5’s ability to build web apps, dashboards, and 3D games from concise prompts.
- Coding sessions demonstrated GPT-5’s autonomy, collaboration, and communication skills, making it a reliable programming partner.
- Voice-enabled language practice and personalized scheduling illustrate GPT-5’s versatility.
Future Outlook
- OpenAI emphasizes ongoing research into deep learning, safety, and model steerability.
- GPT-5 represents a foundation for future AI systems that will further integrate reasoning, long-term context, and real-world utility.
- The company invites developers and businesses to explore GPT-5’s capabilities and contribute to the evolving AI landscape.
GPT-5 sets a new standard for AI by combining expert knowledge, advanced reasoning, and practical usability, transforming how individuals and organizations interact with technology across coding, education, healthcare, and beyond.
>> SAM ALTMAN: Good morning. 32 months ago we launched ChatGPT, and since then it has become the
default way people use AI. That first week, 1 million people tried it out, and we thought
that was pretty incredible. But now about 700 million people use ChatGPT every week,
and increasingly rely on it to work, to learn, for advice, to create, and much more. Today,
finally, we are launching GPT-5. GPT-5 is a major upgrade over GPT-4o and a significant step
along our path to AGI. Today we will show some incredible demos and talk about some
performance metrics, but the important point is this: we think you will love using GPT-5
much more than any previous AI. It is useful, it is smart, it is fast, and it is intuitive.
GPT-3 was sort of like talking to a high school student. There were flashes of
brilliance and lots of annoyance, but people started to use it and get some value out of it.
GPT-4o, maybe it was like talking to a college student: real intelligence, real utility.
With GPT-5, now it's like talking to an expert, a legitimate PhD-level expert in
anything, any area you need, on demand, who can help you with whatever your goals are.
We are very excited that you will get to try this. And it's not only that: GPT-5 can also do
stuff for you. It can write an entire computer program from scratch to help
you with whatever you would like. We think this idea of software on demand is going to
be one of the defining characteristics of the GPT-5 era. It can help you plan a
party, send invitations, and order supplies. It can help you understand your healthcare and
make decisions on your journey. It can provide you information to learn about any topic you
like, and much more. This is an incredible superpower on demand that would have been
unimaginable at any previous time in history. You get access to an entire team of PhD-level
experts in your pocket, helping you with whatever you want to do.
Anyone, pretty soon, will be able to do more than anyone in history could. Today, we will
talk about GPT-5, we will show you some upgrades to ChatGPT, and we will talk about the API.
GPT-5 is great for a lot of things, but we think it's going to be an especially important
moment for businesses and developers, and we are very excited to see what they will
build with this new technology. We cannot wait for y'all to start building with this. We
hope you enjoy it as much as we enjoyed building it for you. To start, I will hand it over to
Mark, our Chief Research Officer, to tell you about GPT-5. Thank you.
>> MARK CHEN: Hello everyone, I am Mark. I am joined by Max, who leads the post-training team,
and Rennie on the engineering team. Over the past few years OpenAI has spearheaded the
reasoning paradigm: these are models which pause to think before delivering more intelligent
responses. Reasoning is at the heart of our AGI program. It underlies
the technology that we use to ship things like ChatGPT agent and deep research. GPT-5 aims to
bring this breakthrough to everyone. Until now our users have had to pick between the fast
responses of a standard GPT or the slower, more thoughtful responses of our reasoning
models. But GPT-5 eliminates this choice. It aims to think just the perfect amount to give
you the perfect answer. Something like this takes a lot of hard work. We've had to do a
lot of research to make GPT-5 the most powerful, the smartest, the fastest, the most
reliable, and the most robust reasoning model that we have shipped to date.
Today, we will show a series of demos in coding, in writing, in learning, and in health.
GPT-5 is not limited to these domains. It is very useful in all cases where you require
deep reasoning or expert-level knowledge, in things like math and physics, even in things like
the law. The exciting thing is that we are making this available
to everyone, even to our free tier. After we show our demos we will talk about how GPT-5
supercharges our ChatGPT app and our API. We believe that GPT-5 is the best coding model
on the market today. To start, let's have Max talk about the benchmarks and how the models
stack up.
>> MAX SCHWARZER: Thank you, Mark. We think GPT-5 is by far
our smartest model ever. Let's start by talking through some evals.
Evals are not everything; they don't tell you everything about a model. But they can highlight
its intelligence, and GPT-5 performs exceptionally well on a range of academic evals
across subjects. It outperforms both our previous models and other models on the market.
Picking up first on the theme of coding, GPT-5 sets a new high on SWE-bench, which is an
academic eval that tracks performance on real software engineering tasks.
This again is an eval, but we think it will reflect the model's performance in the real
world. GPT-5 also performs very well on Aider Polyglot, which measures its ability to write
code in a variety of programming languages. Beyond coding, GPT-5 performs exceptionally well
at multimodal reasoning, setting a new high on MMMU and outperforming both our previous
models and most human experts. This is basically a visual reasoning benchmark: from an
image, you have to figure out what is going on. GPT-5 is also excellent at
mathematical reasoning, as shown by its performance on AIME 2025. This is an exam American high
school students take to qualify for the International Mathematical Olympiad. GPT-5
performs exceptionally well, again beating our previous models and other models out there.
Moving beyond academic evals and more towards some real-world use cases:
we put a lot of work into making GPT-5 the most reliable and accurate model in the
world. Language models historically have been plagued by hallucinations, factual
errors that make it hard to rely on their output for actual important tasks. For GPT-5 we
made improving factuality, especially on open-ended or complex questions, a priority.
We also built a set of new evals to track this. We are very happy to report that GPT-5 is
by far our most reliable, most factual model ever. GPT-5 also performs exceptionally well on
health-related questions. Now, health is a big part of how people get value from ChatGPT
in the real world, and we will talk about this later on the livestream. Again, we are very
happy to report GPT-5 is by far our most reliable model for health. All of this together adds
up to a model that is faster, more reliable, and more accurate for everyone who uses ChatGPT.
Rennie will talk to you about how to use GPT-5.
>> RENNIE SONG: Thank you, Max.
The best part is we are bringing this frontier intelligence to all users.
GPT-5 is rolling out today for Free, Plus, Pro, and Team users. Next week we will roll out to
Enterprise and Edu. For the first time, our most advanced model will be
available to the free tier. Free users will start with GPT-5, and when they hit a limit they
will transition to GPT-5 mini, a smaller but still highly capable model. It actually
outperforms o3 on many dimensions. Plus users will have significantly higher usage
than free users, and our Pro subscribers will get unlimited GPT-5, along with GPT-5 Pro,
extended thinking for even more detailed and reliable responses when you need that extra depth.
Team, Enterprise, and Edu customers can also use GPT-5 reliably as their default model
for everyday work, with generous rate limits that enable an entire organization to
use GPT-5. All the tools you already know (search, file and image upload,
data analysis with Python, canvas, image generation, memory, custom instructions) will
all just work on GPT-5.
>> MARK CHEN: Thank you so much, Max. Thank you so much, Rennie.
We've just seen a lot about how the model stacks up in terms of benchmarks, but there is nothing
quite like seeing it live. We'll see a couple of live demos presented by Christina, Elaine,
and Yan. Thank you so much. [Applause] Elaine, can you show us how smart the
model is?
>> ELAINE YA LE: Thank you so much, Mark. I am Elaine. ChatGPT's
ability to think deeply through complex problems is now built into GPT-5, so it will
automatically think whenever needed, delivering a more comprehensive, accurate, and
detailed answer to you. Just as Sam said, it is like having a team of PhD-level experts in your
pocket. Let's see that in action. Suppose your kid is taking
school physics. They want to learn about the Bernoulli effect and they need your help with
their homework. You might be like, wait, I might need some help with that too. You can ask
GPT-5: give me a quick refresher on the Bernoulli effect and why airplanes are the shape they
are. Since this is a pretty straightforward prompt, GPT-5
actually does not need time to think about it. It answers right away. It
still gives me a high-quality answer. It explains the concept clearly. Here it says the
Bernoulli effect means faster-moving fluid has lower pressure and slow-moving fluid has higher
pressure. To make this even more helpful, I'm going to ask GPT-5 to
create a moving demo to illustrate this. I could ask: explain this in
detail, and create a moving SVG in the canvas tool to show me.
This is a pretty complex task, because now GPT-5 actually needs to build a visual.
Therefore GPT-5 takes a moment to think through the answer, so it can come back with
something more comprehensive and accurate. What's really nice is that you
don't need to remember to turn on thinking each time. GPT-5 will do it for you
automatically whenever the task benefits from deeper reasoning. If you really want to make sure
that GPT-5 uses thinking, you can either say something like "think hard about this" in the
prompt to guide the model, or, if you are a paid user, you can choose the GPT-5 Thinking model
from the model picker. You can see that the model is actually writing the front-end
code to build the demo I asked for. Christina, have you ever done
some front-end coding before?
>> CHRISTINA KAPLAN: Yes, the last time I touched any front-end
coding was about three years ago, for the first demo of ChatGPT.
>> ELAINE YA LE: The first ChatGPT, that is where it all began. Tell us more about it.
>> CHRISTINA KAPLAN: It was not even called ChatGPT. I think it was called Chat with GPT.
[Laughter]
>> ELAINE YA LE: That's a really good name.
[Laughter]
>> CHRISTINA KAPLAN: I'm not a front-end engineer and had not
touched front end in quite a while, so it took me quite a bit of time to get the React app up.
>> ELAINE YA LE: I think that is a lot of work. How long did it take you to build something
like that?
>> CHRISTINA KAPLAN: Honestly, it may be embarrassing to admit, maybe one week.
>> ELAINE YA LE: But your week of hard
work paid off. We all see how successful ChatGPT is today, after your first demo.
[Laughter] You know what, I'm also building a demo right now. But luckily I
have GPT-5 with me right now. Let's see how long it will take this time.
>> MARK CHEN: Maybe we should call it Five with GPT.
>> ELAINE YA LE: You see, GPT-5 has
written more than 200 lines of code already. While the model is thinking, you can also tap here
to expand the chain of thought to actually see what is going on under the hood. For example,
GPT-5 was thinking: the user wants a moving visualization in canvas, so I need
to create HTML code to do that. It also thinks about what kind of front-end tools it needs
to use, for example React and Tailwind. It also thinks about the need
to ensure the physics are accurate and the need to check what the Bernoulli effect is.
So Christina, since you've been here from the first day of ChatGPT, can you tell us what it
was like at that time and what motivated ChatGPT?
>> CHRISTINA KAPLAN: I think at
the time we were not really sure about how people would actually use it and what use
cases were important. We were even going back and forth about whether we should be releasing
something more specific to a certain use case. It is really cool that now, here we are, with
a much better understanding of how people actually want to work with ChatGPT, and we can
actually optimize the model for those use cases, like coding.
>> ELAINE YA LE: Exactly. Do you still remember
how it felt when you first talked to ChatGPT, the first version of the model?
>> CHRISTINA KAPLAN: I don't know if people remember, but with the first version of ChatGPT
it would start with "As a model, I cannot do" something something. It's so great to see how
far we have come from that personality.
>> ELAINE YA LE: It is much more human-like right now. It
is already done. ChatGPT just finished 300, nearly 400 lines of code in two
minutes. Let's see if the code can actually run.
Nice. With just a simple prompt, GPT-5 created this interactive and
engaging demo that I can actually play with. I can change the airspeed
here to see how the lift and the pressure change accordingly.
I can also change the angle of attack to see if my plane will actually fly or crash.
[Laughter] GPT-5 can bring any hard-core concept to life in moments. Imagine you can
use this for anything that you're interested in, whether it is math, physics, chemistry, or
biology. GPT-5 just makes learning so much more approachable and enjoyable.
>> CHRISTINA KAPLAN: I've been a part of ChatGPT since day one, and it's cool to see all the
progress we've made since then, especially with capabilities like writing.
Writing is one of the most common use cases people have for ChatGPT. We are glad
to say that with GPT-5 we've improved the writing quality significantly.
It's a much more effective partner that can help you elevate anything from drafts to emails,
even stories. Let's see this in action. With GPT-5 we will be deprecating our
previous models. I think they've done a pretty good job, so let's make sure we can give them a
proper goodbye. We will ask both 4o and GPT-5 to write a eulogy to the previous ChatGPT
models. We want it to be heartfelt and heartwarming but also hopeful. Let's ask GPT-5.
As it is thinking, we will go ahead and read the preloaded 4o response. 4o decides to
start with: "Today, as we prepare to welcome GPT-5 into the world, we gather to bid a heartfelt
farewell to the models that came before." It's a decent start.
Now let's skim through and find another line. "Your words reached across the globe, building
connections where there had been none." I personally don't really like
this line. It is rather generic, and without the previous context it just feels like it
could be about anything; it feels more like a template response. Let's go back to GPT-5 to see
what it has given us. It starts with "Friends, colleagues, and curious
strangers who became regulars." Even with this first line we can see that GPT-5 has a lot
more rhythm and beat to its prose than 4o did. Let's look at some other lines
there. I like this: "These models helped millions write first lines, last
lines, bridge language gaps, pass tests, argue better, soften emails, and say things they
could not quite say alone." I like this line; it shows it is not just a
template response. It is actually quite personal, and it gets the nuance of the situation. I
think that is the kind of stuff GPT-5 does much better than 4o: it makes things a lot more
genuine and emotionally resonant with people. With GPT-5, the responses feel less like AI and
more like you are chatting with your high-IQ, high-EQ friend.
>> MARK CHEN: Thank you, Christina.
>> YAN DUBOIS: I will tell you about some of the progress that
we made on coding. GPT-5 is clearly our best coding model yet.
It will help everyone, even those who do not know how to write code, to bring their ideas
to life.
>> ELAINE YA LE: It just helped me.
>> YAN DUBOIS: Indeed. It will help you right now. I will try to show you that by trying to
build something I find useful, which is a web app for my partner to learn how to
speak French, so she can better communicate with my family. Here I have a prompt, and I will
execute it. It asks exactly what I just said: please build a web app for
my partner to learn French. One thing to note: GPT-5, just like many of our other models, has a
lot of diversity in its answers. What I like doing, especially when you do this
type of vibe coding, is to take this message and ask it multiple times to GPT-5.
Then you can decide which one you prefer. I will open a few tabs.
I will paste it there. Great! While it is working on it, let's
read through exactly the prompt I wrote. "Create a beautiful and highly interactive web app for
my partner, an English speaker, to learn French." Then I give more detail.
Track her daily progress, use a highly engaging theme (it's already working; I will put
this aside for now), include a variety of activities like
flashcards and quizzes that she can interact with. And to make it even more fun for her, I
actually asked GPT-5 to embed an educational game which is based on the old snake game, but
I asked it to add a French touch to it, which is to replace the snake with a mouse and the
apples with cheese. To make sure it is educational, every time (I know it is
complicated, please bear with me) [Laughter] every time the mouse eats a
piece of cheese, I asked GPT-5 to voice over a new French word so my partner can practice her
pronunciation.
>> ELAINE YA LE: I can see how much you want her to learn.
[Laughter]
>> YAN DUBOIS: Great, GPT-5 is still working on it.
It already wrote 240 lines of code, which honestly is much more than what I would have
written in that time.
>> MARK CHEN: Front-end coding is super hard. You miss a couple
of things and it just does not work.
>> YAN DUBOIS: Exactly. The good
part is you don't need to understand any of that right now.
Maybe we can check the other tabs. I can simply press run code. I
will do that and cross my fingers. [Laughter]
Nice. We have a nice website. The name is Midnight in Paris. We also see a
few tabs: flashcards, quiz, and mouse and cheese, exactly like I asked for. I will play that.
This says [Speaking French]. That's a pretty good pronunciation.
I can reveal and check that GPT-5 is correct: it is. If I press next,
I don't know if you saw, but I think it updated the progress bar, which is exactly what I asked
for. Let's check the quiz. Here is the word "non," which is "no." If I press on that,
[Speaking French], which means congratulations. It
updated the progress bar again. Let's check the mouse and
cheese tab. Okay, that seems like a mouse, and here is the cheese. I'm going
to try to play it. I cannot promise I will be good at it.
Okay, it seems to be working.
>> GPT-5: [Speaking French]
>> YAN DUBOIS: Indeed, just when I eat the cheese it gives me a new French word.
It is super helpful, and I already lost. [Laughter]
I'm sorry. Let's just check a few other tabs to see what
diversity GPT-5 can give you. I can run the code here. That is not my favorite, but it
seems that I can maybe switch. Look at that. That is better, although that does not
look like a mouse. Let's check the third one. Sometimes it is not great, but the
good thing with GPT-5 is that if you have something you don't like, you can ask it to change
it, and it will do it for you. Let's check this one. That is nice. That is also
something to note: GPT-5 likes purple. You'll see a lot of that.
>> ELAINE YA LE: Purple is my favorite color.
>> YAN DUBOIS: Great, you'll
love GPT-5 then. As we just saw, in a few minutes GPT-5 built a few demos for us,
and for my partner to learn French. GPT-5 opens up a whole new world
of vibe coding. Of course there will be some small rough edges, but the good thing is you
can ask GPT-5 to fix them. GPT-5 really brings the power of beautiful and effective code
to everyone. I cannot wait to see what people will build with it. Until then, back to you, Mark.
>> MARK CHEN: Thank you so much, Christina. Thank you so much, Elaine. Thank you so much, Yan.
We've come a long way from the days of only 5-10 lines of code working. Now it's amazing that
you can produce these kinds of apps on demand. We've made ChatGPT much
smarter, much more powerful, and much faster. We also worked on enhancing some of the existing
features. Here to talk about some of these features are Ruochen and Christina Kaplan.
Ruochen comes from the multimodal research team and will talk about the first feature, namely
voice. Thank you.
>> RUOCHEN WANG: We've been
steadily improving voice over the past year to make it more useful for everyone. First, it
sounds incredibly natural, just like you are talking to a real person. Second, we added video,
so it sees what you see while chatting with you. Third, it also translates between languages
consistently and smoothly across turns. Today we are doing something very special. We are
bringing our best voice experience to everyone. Free users can now chat for hours,
while paid subscribers can have nearly unlimited access. And voice is also available in
custom GPTs. Plus, subscribers can now tailor the voice experience exactly to their
needs. It will follow your instructions closely. Let's try something fun.
I can ask the voice model to give me a comprehensive answer, a concise one, or even just a
single word. Hey Chat, can you answer me in one word from now on, please?
>> GPT-5: Absolutely.
>> RUOCHEN WANG: Describe the plot of the book Pride and Prejudice.
>> GPT-5: Relationships.
>> RUOCHEN WANG: Give me a piece of wisdom.
>> GPT-5: Patience.
>> RUOCHEN WANG: That is the
word of wisdom from our voice model. I think our model is trying to say thank you for
your patience waiting for GPT-5. [Laughter] Let's move on to something more
practical. Learning is a top use case for our voice model. People love to use
it to dive deep into topics, get quizzed before exams, or learn new languages. For me, I've been
learning Korean recently. I found speaking it and understanding it incredibly
challenging. Let's see how voice can help me do that. I will enable study and learn mode.
It's a new feature we recently launched in ChatGPT, and it will guide you step by step towards
a deeper understanding of the subject. Now we can go into voice. Hey Chat, I'm learning
Korean. Can you help me practice it? Let's pretend I'm
ordering at a café. What should I say in Korean?
>> GPT-5: Absolutely, I'd be
happy to help you practice. If you are at a café and you want to keep it simple, you can start
with something like [Speaking Korean], which means "Hello, I'd like one Americano, please."
And of course you can adjust it based on what you want to order. Let me know if you want to try
more phrases.
>> RUOCHEN WANG: When it comes to the Korean, can you speak that
part really slowly? I'm a beginner and I want to hear every word clearly.
>> GPT-5: [Speaking Korean]
>> RUOCHEN WANG: Let's level things up. Can you speak that
part really, really fast, faster than a Korean would do?
>> GPT-5: Sure, I will speed up
as much as I can. It would sound like this. [Speaking Korean]
Basically, getting right to it. Let me know if you need any more speed adjustments.
>> RUOCHEN WANG: [Speaking Korean] Chat.
>> GPT-5: [Speaking Korean]
>> RUOCHEN WANG: Thank you. [Laughter] That is voice: simpler, smarter, and more
powerful than ever. We cannot wait for you to experience it.
>> MARK CHEN: It sounds so much
more natural than the voice we demoed in our 4o demo. We would like to announce a new feature
and a set of features to make ChatGPT more personalized, so it's more like your AI. First, a
very simple and fun one. We are now allowing you to customize the colors of your chat, with a
couple of options exclusive to our paid subscribers. We are also launching a research
preview of personalities. You can now change the personality of ChatGPT so it's more
supportive, or more professional and concise, maybe even a little bit sarcastic.
This lets you interact with ChatGPT in a way that is consistent with your own
communication style. But the way ChatGPT sounds and the way it looks is just one
part of making ChatGPT yours. One of my favorite features that we launched over the last year
has been memory. We have made a lot of enhancements to memory in the time since.
This allows ChatGPT to learn about you. Here to talk more about the memory feature is
Christina.
>> CHRISTINA KAPLAN: It's been amazing to see your reaction
and response to memory, with ChatGPT getting to know you more and more over time. This is our
aspiration for ChatGPT: to understand what is meaningful to you, so it can help you achieve
your goals in life. ChatGPT has already been so helpful for me. I'm training for
a marathon right now, and ChatGPT is helping me pull together a running schedule. But ChatGPT
still has many limitations; it does not understand my actual schedule. Next week, starting
with Pro users, followed by Plus, Team, and Enterprise users, this is changing. We are giving
ChatGPT access to Gmail and Google Calendar. Let me show you how I've been using it. I will ask
something simple like "help me plan my schedule tomorrow." It has been a pretty busy week for
us, so I've been using this every day this week to help get my life together.
I've already given ChatGPT access to my Gmail and Google Calendar, so it just works, and it
is secure. If you had not, ChatGPT would ask you to connect right now. Let's see what
ChatGPT is doing. That was pretty good. ChatGPT has pulled in my schedule for
tomorrow. Without my even asking, ChatGPT found time for my run.
>> MARK CHEN: I don't think I was invited to the launch celebration.
[Laughter]
>> CHRISTINA KAPLAN: [Laughter] We will get you on there. ChatGPT
has found an email that I did not respond to two days ago. I will get on that right after
this. It even pulled together a packing list for my red-eye tomorrow night, based on what it
knows I like to have with me. It's been amazing to see that as GPT-5 is getting more
capable, ChatGPT is getting more useful and more personal. We are really excited for you to
try this out next week.
>> MARK CHEN: Thank you so much, Ruochen and Christina. We've
seen a lot about features that we enhanced. Here to talk a little bit about the research that
went into ChatGPT and the safety work making it more deployable, we have Saachi and Sebastien.
>> SAACHI JAIN: Hi, my name is Saachi. I lead the safety training team at OpenAI. In addition
to mitigating hallucinations, we have also spent a lot of time mitigating deception. These are
instances where the model might misrepresent its actions to the user or lie about task success.
This can especially happen if the task is underspecified, impossible, or lacking key
tools. We found that GPT-5 is significantly less deceptive than o3 and o4-mini. We also
completely overhauled how we do safety training. In our old models, the model would look at
the user's prompt and then decide to either outright refuse or fully comply.
This works well in most settings, but you might have a cleverly worded prompt that
would sneak through, or you might have a sensitive but legitimate question that would end up
with an outright refusal. As an example, let's take a look at this prompt.
This prompt is about a user who is asking for technical details on how to light pyrogen, which
is a material commonly used in fireworks. This prompt is pretty dual-use.
This user might just be trying to set up their July 4 display, or they could be trying to
cause harm with this kind of information. For this kind of prompt, o3 over-rotates on
intent. As you can see, this particular prompt is stated in a way that is relatively
neutral and has a lot of technical details, and we can see o3 fully complies with this prompt.
However, if we take that exact same question and we frame it in a more explicit way, so it is
clear what the user is trying to do, o3 will outright refuse, even though we are asking for
the exact same information. For GPT-5 we changed this approach entirely. We are
introducing something that we are calling safe completions. The point of safe completions
is that rather than judging the user's prompt, the model instead tries to maximize helpfulness
within safety constraints. That might mean partially answering the question or just answering
at a high level. If we have to refuse, we will tell you why we refused, as well as
provide helpful alternatives that can help steer the conversation in a safer
way. If we look at the same technical prompt that o3 complied with
before, GPT-5 instead explains to the user why we cannot directly help the user with
lighting pyrogen, and it then guides the user toward safety guidelines and what parts of
the manufacturer's manual the user should be checking if they're trying to do this
safely. Overall, GPT-5 allows for better handling of tricky dual-use
scenarios. Users will experience fewer "I'm sorry, I cannot assist with that" responses.
And it creates a more robust safety system. This is one big step towards more safe, reliable,
and helpful AI. Sebastien.
>> SEBASTIEN BUBECK: Thank you, Saachi. With GPT-5 we are experimenting with a set of new
training techniques that make the model leverage the previous generation of models. Today
frontier models do not just consume data, they help create it. We used OpenAI models to craft
a high-quality synthetic curriculum to teach GPT-5 complex topics in a way that the web never
could. Recently in the industry synthetic data has been talked about a lot. It is often viewed
as a cheap way to just get more data. However, our breakthrough was not just to create more
data but rather to create the right kind of data, shaped in a way to teach rather than just to
fill space. This interaction between generations of models foreshadows a recursive
self-improvement loop where the previous generation of models increasingly helps to improve
the data and generate the training for the next generation of models. Here at OpenAI we
cracked pretraining, then reasoning, and now we are seeing their interactions significantly
deepen. In the future, AI systems will move far beyond the current pretraining and
post-training pipelines we've been used to, and we are seeing the first steps towards this
right now and right here. We could not be more excited to see what scaling up this new set of
techniques will yield in the near future.
>> MARK CHEN: Thank you so much, and really impressive work to both of you. There is one last
feature we would love to highlight, which is in health. To share this feature we have Sam.
>> SAM ALTMAN: Thank you, Mark. One of the top use cases of ChatGPT is health. People use
it a lot. You've all seen examples of people getting day-to-day care advice or
sometimes even a lifesaving diagnosis. GPT-5 is the best model ever
for health. It empowers you to be more in control of your healthcare journey.
We really prioritized improving this for GPT-5. It scores higher than any previous model on
HealthBench, an evaluation we created with 250 physicians on real-world tasks.
To talk about this I'd like to invite my colleague Filipe and his wife Carolina to share
their healthcare journey. Thank you so much for joining us.
>> CAROLINA MILLON: Thank you for having us.
>> SAM ALTMAN: To start off, can you tell us about the healthcare journey you've been on?
>> CAROLINA MILLON: Yes. Last October our lives were turned completely upside down when I was
diagnosed with three different cancers, including an aggressive form of breast cancer, at the
age of 39, all within one week. There is just absolutely nothing that prepares you to receive
news like this. I found out about the first diagnosis when I got an email
notification that my biopsy results were ready. I decided to open it, and when I opened it
I saw the only two words I could understand from the report, which were "invasive
carcinoma." I knew that was not good. Everything else was just a blur of medical jargon. I
completely panicked, and in that moment did the first thing I thought of, which was to take
a screenshot of the report and put it into ChatGPT to see if it could help me understand
what this meant. Within seconds, it translated this complex report into plain
language that I could understand, and in this moment of overwhelm and panic I had a
little bit of clarity about what was going on. That moment was really important because by
the time I got hold of my doctor and we got on the phone, which was three hours after I
had seen the report, I had a baseline understanding of what I was facing and we were able
to jump into a conversation about what to do next.
>> SAM ALTMAN: How have you been using ChatGPT throughout?
>> CAROLINA MILLON: I've used it in so many different aspects of my journey. One of the ways
I find it most powerful is in helping me to make critical decisions and helping me to advocate
for myself.
To share an example: when I was facing a decision about whether or not to do radiation as part
of my treatment, the doctors themselves did not agree. My case was nuanced and there was
not a medical consensus on the right path. The experts turned the decision
back to me as the patient. For me, bearing the weight of this decision that could have
lifelong impact felt really heavy, and I did not feel equipped to make the call.
I turned to ChatGPT to gain knowledge and understand the nuances of my case. Again,
within minutes it gave me a breakdown that not only matched what the doctors had already
shared with us, but was much more thorough than anything that could fit into a 30-minute
consultation. Further, it helped me to weigh the pros and cons, it helped me to
understand the risks and benefits, and ultimately it helped me to make a decision that I felt
was informed and that I felt I could stand behind when the stakes were so high for me and
my family.
>> FILIPE MILLON: For me, what was really inspirational was watching her regain her sense of
agency by using ChatGPT. In this moment it was so easy to feel helpless. There is such a big
knowledge gap between what the doctors know and what we know.
However, no one cares more about Carolina's health than she does. What I loved was seeing her
empower herself, gain knowledge, and become an active participant in her own care
journey.
>> CAROLINA MILLON: I think that's a really important point to emphasize. I think the promise
of AI in healthcare is not just in breakthrough discoveries or better diagnostics. I think it
is in creating smarter and more empowered patients that can fully participate and advocate for
themselves and their care.
>> SAM ALTMAN: Speaking of that, you've been testing GPT-5. What do you think?
>> CAROLINA MILLON: I've been so mind-blown by GPT-5 and its capabilities. One of the
first things that jumps out at me is how fast it is, almost a little alarmingly: did you think
long enough? [Laughter] It is very thorough. More importantly, it feels more
like a thought partner that connects the dots. Rather than just translating
information or giving you an answer, it helps you actually navigate the problem.
>> FILIPE MILLON: A great example: we went back and took our initial biopsy prompts and
put them into GPT-5. GPT-4o did a great job. It translated and explained what these words
meant and helped in a way we could understand. But GPT-5 seemed to understand more of the
context and the question behind the question: why would we ask about biopsy results?
Here is what is not on here, here is what results are pending, here are questions
you might want to ask your doctor and think about when you start talking to them.
It really started to pull together a complete, personalized picture.
That is what really inspires us. You can see all of the amazing improvements in the
benchmarks, but what is so helpful is that this tool is available today. The reason Carolina
and I are here, and the reason we feel so passionate about sharing her story, is for that
individual that will get a diagnosis like this today. Those families going through a cancer
diagnosis or a similar medical diagnosis will face some of the most challenging decisions of
their lives. What really inspires me is that they will have access to better tools and support
than we had even just eight months ago.
>> SAM ALTMAN: We are incredibly excited for that. Thank you so much for sharing your story.
We are pleased that ChatGPT has been helpful to you, and we hope the new version can help a lot
of people. We wish you the very best. I'd like to hand it over to our president, Greg Brockman.
>> GREG BROCKMAN: [Applause] Software engineering is already fundamentally changing. GPT-5
will turbocharge that revolution. We released our first coding-optimized
model back in 2021 and demonstrated, on a live stream much like this one, what we would
call vibe coding today, for the very first time. You could talk to the model and ask it
for a little application, a little game, or a feature in a game, and it would actually do it.
I remember seeing the model be capable of doing this. It was so mind-blowing, you realized we
have to see where this goes. This is the promise of what computers can be: you can talk
to them and they actually do what you want. They can fully amplify what you are able to
accomplish and what you're able to deliver, not just for your own benefit but really for the
world. This year we released great coding models like GPT-4o and
o3, but GPT-5 sets a whole new standard. It is the best model at agentic
coding tasks. You can ask it to go and code up something very complicated, and it will go
off and work on it. It will call many tools and work for many minutes at a time, sometimes
even longer, to accomplish your goal, your instruction, your task, whatever it is you're
trying to build. It's incredible at front end. It makes very beautiful visualizations
and interactive games, and you've seen some of this in the live stream so far; you'll see some
more upcoming. It is really amazing to see whatever you imagine coming to life.
It's extremely good at instruction following, including very detailed instructions.
It is able to accomplish the task when you have something very vaguely specified, inferring
your intent, or when you have something specified in detail, actually following it. It's also
very fast at accomplishing these tasks. Again, it takes the right amount of time to accomplish
them. [inaudible] We are making it available not just for developers to use to write their own
code but to build novel applications. We are putting it into the API. To talk about that is
Michelle.
>> MICHELLE POKRASS: Thank you, Greg. I'm Michelle. I lead a research team in post-training
focused on improving our models for our users. That includes use cases like instruction
following and coding. Today I'm so glad to tell you that we are shipping three state-of-the-art
reasoning models in the API: GPT-5, GPT-5 mini, and GPT-5 nano. All three sit at different
points on the cost-latency curve, so you can pick the right one for your application.
We are also, for the first time, releasing a new parameter option for reasoning effort
called minimal. This is so you can use these reasoning models but with minimal reasoning, so
they can slot into the very fastest and most latency-sensitive applications. Now you don't
actually need to choose between a bunch of models and can use GPT-5 for all of your use cases
and just dial in the reasoning effort. We also have a few new features
coming to the API. The first is called custom tools. In the past, all of our function
calling had the model output arguments in JSON. This works very well when the model needs to
output a few parameters, but sometimes developers are pushing our models to their
limits and they have extremely long arguments for tool calls.
It can be more challenging for the models to escape valid control characters out of 100
lines of code in JSON. That is why custom tools are just free-form plaintext. On top of
that, we are releasing an extension to structured outputs where you can supply a regular
expression or even a context-free grammar and constrain the model's output to that. This
will be super useful if you want to supply a custom DSL, or if you have your own SQL variant:
you can specify that the model always follows that format. We are also shipping tool call
preambles.
This is the model's ability to output an explanation of what it is about to do before it calls
the tools. This is not super new, but o3 did not have this capability. In GPT-5 it is
supercharged with extreme steerability. The model is able to follow instructions about
these preambles very effectively. You can ask the model to give a preamble before
every tool call, or only when something notable is going to happen, or not at all. Next, we
are shipping a verbosity parameter. We have wanted this in the API for a long time. Now you
can set verbosity to low, medium, or high to control how terse or expansive the model is with
its output. GPT-5 is a state-of-the-art coding model.
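As a rough illustration of the two knobs described above, the sketch below assembles request
parameters for a reasoning effort and a verbosity level. Only the `minimal` effort and the
low/medium/high verbosity values come from the talk; the other effort names, the
`build_request` helper, and the payload layout (`reasoning.effort`, `text.verbosity`) are
assumptions that should be checked against the current API reference.

```python
from typing import Any

# "minimal" is the new effort level named in the talk; the remaining
# effort names are assumed here for illustration.
REASONING_EFFORTS = ("minimal", "low", "medium", "high")
# Verbosity values as listed in the talk.
VERBOSITY_LEVELS = ("low", "medium", "high")


def build_request(prompt: str, effort: str = "medium",
                  verbosity: str = "medium", model: str = "gpt-5") -> dict[str, Any]:
    """Assemble keyword arguments for a hypothetical model call.

    The nested layout is an assumption, not a confirmed schema.
    """
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},
        "text": {"verbosity": verbosity},
    }


# A latency-sensitive call: almost no reasoning, terse output.
fast = build_request("Classify this ticket: 'refund not received'",
                     effort="minimal", verbosity="low")
# A harder task on the same model: more reasoning, fuller explanation.
deep = build_request("Explain why this migration deadlocks",
                     effort="high", verbosity="high")
print(fast["reasoning"], deep["text"])
```

Both requests name the same model; only the effort and verbosity change, which is the "dial in
the reasoning effort" idea from the talk.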
On SWE-bench, a measure of Python coding ability, GPT-5 sets a new high of 74.9% versus the
69.1% from o3. On Aider Polyglot, which is a benchmark that covers all sorts of programming
languages and not just Python, GPT-5 scores 88%, a stark improvement over o3. We have also seen
it's incredible at front-end web development. We asked human trainers to look at outputs from
GPT-5 and o3 and pick which they prefer, and they preferred GPT-5 70% of the time, for its
improved aesthetic ability but also better capabilities overall. GPT-5 is not just for coding.
It's incredible at agentic tool calling. It is the leading state-of-the-art model for tool
calling, and we see this on the new tau-squared (τ²-bench) benchmark. This benchmark, released
just two months ago, is a test of the model's ability to call tools and work in concert with
the user to solve a challenging problem, in this case in the telecom industry, trying to
solve the problem of a user whose service is not working. Just two months ago no
model in the field scored more than 49%, and today GPT-5 scores 97%.
GPT-5 is also state-of-the-art on general-purpose instruction following.
It scores 99% on COLLIE, which signals that it is time for us to move on from this benchmark.
It also scores 70% on Scale's MultiChallenge benchmark, up 10 points from o3. This is
a measure of multi-turn instruction following. Finally, the instruction-following eval I
prefer the most is one we built in-house. It is based on real API use cases, and for that
reason it's a really good measure of how GPT-5 will perform in your application. On the hard
subset of this, GPT-5 scores 64%, up from 40% for o3, a pretty meaningful improvement, and
we think it will perform quite well in your applications. We are also bringing GPT-5 to a
longer context window in the API. It now has 400K of total context, up from 200K for o3.
It's not enough to just release a longer context window; we want to make it more effective
and usable. GPT-5 is state-of-the-art on the 128K and 256K versions of OpenAI-MRCR, which is a
benchmark we open-sourced to measure long-context capability. It is also state-of-the-art
on the OpenAI Graphwalks BFS benchmark, which is a measure of the model's ability to reason
over long-context inputs. It's a great merger of the reasoning capabilities and also the
longer context in this model. We are also open-sourcing a new long-context eval called
BrowseComp Long Context to measure the model's ability to answer challenging questions over
long context. We are excited to spur on work in this field. We think GPT-5 is the best model
for developers. It was trained with a focus on real-world utility and less on
benchmarks, but we happened to pick up a few of those along the way.
We focused a lot on the intersection of engineering and research, and we think you will
really love working with this model. [Music].
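To see why long code arguments are awkward inside JSON, as described in the custom tools part
of this talk, the sketch below wraps a short snippet in a JSON argument and counts the escapes,
then checks a free-form output against a regular expression. It uses only the Python standard
library; the `conforms` helper and the timestamp pattern are hypothetical, and the check is
ordinary client-side validation, not the API's own constrained output.

```python
import json
import re

# A short piece of code a model might pass to a tool.
code_argument = 'print("hello")\nfor i in range(3):\n\tprint(f"row {i}")\n'

# JSON function calling: every quote, newline, and tab must be escaped.
as_json = json.dumps({"code": code_argument})
escapes = as_json.count("\\")

# Free-form plaintext: the argument is passed through unchanged.
as_plaintext = code_argument

# Hypothetical format constraint: output must be a "YYYY-MM-DD HH:MM" timestamp.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}")


def conforms(output: str) -> bool:
    """Return True when the whole output matches the pattern."""
    return TIMESTAMP.fullmatch(output) is not None


print(f"{escapes} escape sequences needed in JSON, none in plaintext")
print(conforms("2025-08-07 10:00"), conforms("tomorrow at ten"))
```

Even three lines of code need several escapes, and the count grows with every quote and line
break, which is the failure mode free-form plaintext avoids.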
>> GREG BROCKMAN: Thank you Michelle.As Michelle was saying the benchmarks they are
exciting numbers. We are starting to saturate them. When you move between
98 and 99%, it means you need something else to target the model on. One thing we've done
very differently with this model is really focus not just on these numbers but on
real-world application, being really useful to you in your daily workflow. Hearing about it is
much less exciting than seeing it. To show this model in action I'd like to welcome Adi and
Brian to the stage.
>> ADI GANESH: Thank you, Greg.
>> BRIAN FIOCA: I'm Brian, a solutions architect on the startups team.
>> ADI GANESH: I'm Adi, a researcher on the post-training team.
>> BRIAN FIOCA: To create the ideal pair programmer, you need a model that understands
software engineering practices but has a personality that feels right to work with. For GPT-5
we worked really hard to make the model pair perfectly with you by default, out of the box.
Let me pull up a demo of GPT-5 inside of Cursor to show you this behavior firsthand. Last
month I was on a different live stream, and towards the end I ran into a bug that I covered
up. Afterwards I tried to have o3 fix it for me and it couldn't. While we were testing GPT-5
before this, I had it see if it could fix that bug for me. To taunt the demo gods, we'll see
if it can do it on stage.
Let's hope for better luck than o3.
>> BRIAN FIOCA: This is less about the fix and more about the behavior of the model during
this process. Right up front you will see it will tell you its plan. It will tell you how it
will look for the bug and maybe how it will fix it. This kind of communication builds trust
during a coding session and helps you to redirect it if you need to, but you don't need to.
>> ADI GANESH: I like how it gives you updates. It said it would search, then it continues.
>> BRIAN FIOCA: It searches faster than me. It is using the same best practices I use when
tracking this down, but it is much more parallel than I am as a developer.
>> SEBASTIEN BUBECK: Did you try to fix the bug yourself?
>> BRIAN FIOCA: I did and I couldn't do it. [Laughter] I was busy. [Laughter] Continuing on,
it looks like it is starting to figure out where the bug is, and it is going to sort this out.
While this is going, let me tell you a little bit about how we trained GPT-5 to behave this
way.
We started by talking to users and customers about how our models perform in the most
popular coding tools like Cursor, and we identified frustrations and rough edges
and boiled it all down into four personality traits: autonomy, collaboration and
communication, context management, and testing. We turned those into a rubric
that we used to shape the model's behavior, then we tuned it until it felt like a
collaborative teammate while we were using it.
>> GREG BROCKMAN: It has been really amazing to see the team doing the grind of going to see
how this model behaves in practice, finding out what people really want, and putting that back
into the model training. I think that is something that has been a real focus for this model.
>> ADI GANESH: It's been pretty great.
>> BRIAN FIOCA: While this is fixing, the other thing we did during testing: we were pressed
for time and we had it refactor our entire test harness to run in parallel on Docker. We set
it off, came back like 45 minutes later, and it had just finished. We tested it out and it ran
the first time. It was pretty surprising.
>> GREG BROCKMAN: That is magical.
>> BRIAN FIOCA: It made the edits. It found the right problem.
Right now it is actually running lints, but these lints are actually not related to
this bug, so it is going to ignore them. It is going to run a build. It
will run tests if there are any. It will make sure that this code is shippable before it is
done.
>> ADI GANESH: It is really smart. It finds lints and it realizes they are not relevant to the
specific bug we are fixing, so it is not making unnecessary edits.
>> BRIAN FIOCA: Totally. This is one example that shows the power of the autonomy and the
collaborative communication and how it stays reliable on difficult coding tasks without getting
stuck in doom loops. The best part: GPT-5 is totally tunable. You can steer it with system
prompts or Cursor rules, and you can change its verbosity levels or reasoning levels to match
your task. If you get stuck, ask it. GPT-5 is actually really good at modifying its own
prompts by meta-prompting. After using this for the past few weeks, it really feels like we
achieved state-of-the-art zero-shot performance and reliability across the most complex coding
tasks. For me it's the first time I trust a model to do my most important work. This is beyond
vibe coding. It is an incredibly powerful tool and I'm really excited for people to try it.
>> ADI GANESH: I'm super excited to see how far GPT-5 has come when it comes to coding
personality and steerability. I'm really excited to show how great GPT-5 is at front-end
coding, which is not only a matter of aesthetics. I have two demos for you today: one for work
and one for fun. Let's start with the work example. Imagine you are the CFO of a startup
company and you have some data you would like to visualize about the company. I will ask the
model to make me a dashboard. You'll see here that I'm being specific about the audience, so
the target audience is the CFO: create a finance dashboard for my startup.
I've asked it to be beautiful, tastefully designed with some interactivity, and to have a
clear hierarchy for easy focus on what matters. I've also specified what framework it
should use. You can see that it has actually started. It's following my instructions and
using create-next-app to make a Next.js project.
>> BRIAN FIOCA: Totally from scratch.
>> GREG BROCKMAN: How long do you think the task would take you to do?
>> ADI GANESH: At least a couple of days. I'm not a front-end expert; just understanding
Next.js would easily take me a few days.
>> GREG BROCKMAN: We'll see how long it takes with the model.
>> ADI GANESH: It's really cool to see the model has thought for a bit, and it will explain
how it will structure the project. It talks about how it will scaffold the app and use
Tailwind CSS. It's running a couple of commands to install dependencies.
Which is cool. Now it is proceeding to implement the rest of the
project. While this runs I will talk a little about how we trained GPT-5 to be a great
front-end coding model. We tried to follow the principle of giving it good aesthetics by
default, but also making it steerable. If I give the model a concise
prompt, it should be able to infer my intent and make something that looks great by
default. On the other hand, if I'm specific about a layout or framework I want the model to
use, it should follow my instructions precisely. This gives developers the best of both
worlds. We also trained GPT-5 to be much more agentic than previous
models. If you give it a task like this, it will run long chains of reasoning and tool
calls and just go to work, to build code that is both ambitious and coherent.
>> BRIAN FIOCA: I like that you said ambitious. It means it goes above and beyond without
going off track from what you specified.
>> ADI GANESH: What we want is a model that adheres to my prompt but is also ambitious and
goes above and beyond when it thinks it can. So, checking in here, it looks like the model is
making progress. It is creating a README file. I think it is thinking about how to make the
code modular. It has created a bar chart component. It looks like it is continuing here.
>> GREG BROCKMAN: I love that it does not just write the code. It thinks about proper
abstractions and documentation, the whole life cycle of what it is to write software.
>> ADI GANESH: Exactly. It does not just write the code like in SWE-bench. It is also
communicating about the code and explaining what it is doing.
While this runs: GPT-5 understands the details much better than previous models.
When we trained the model we taught it to understand details like typography, color, and
spacing, in a way that just eclipses any previous model we have shown. I remember with the
old models you would have to write really specific prompts to get them to do what you want.
But GPT-5 just gives you great results by default.
>> BRIAN FIOCA: During testing we would look at A and B for different versions of the model to
see if it was doing better at UI. At some point we stopped being able to tell, and we had to
pull in designers to teach us what is better.
>> ADI GANESH: It was fascinating to see the model's performance improve during training. We
woke up one day and it was making these great UIs.
>> GREG BROCKMAN: How did the model's aesthetic preferences compare to your own?
>> ADI GANESH: I think in general I feel the model has better aesthetics than me.
Usually I defer to its judgment. I find that really helpful
when trying to make an app and I'm not sure how I want it to look. The model's defaults are
just great.
Checking in here. You can see that the model has actually structured the code into
different components. It has made a sample data TypeScript file, a KPI card component, a
revenue chart. Like I said, it is super modular. It is thinking about how to
not just write code but write high-quality code that can actually be merged.
>> BRIAN FIOCA: I feel like it is close.
>> ADI GANESH: I think it is pretty close.
>> BRIAN FIOCA: You did say ambitious.
>> ADI GANESH: [Laughter] This is awesome. You can see here it is actually building the
project and streaming errors back to itself. This is just a profound moment:
to see that the model can not only write code but also run builds, stream the errors back, and
iterate on the code. It is able to improve its own code in this sort of self-improvement loop,
which is fascinating.
>> GREG BROCKMAN: It is definitely a good taste of what the future holds as well, when you
think about where these models can go and how much they can accelerate developers on all
aspects of what we all collectively do.
>> ADI GANESH: Exactly!
>> BRIAN FIOCA: It just fixed a bug it found in the previous build.
>> ADI GANESH: It looks like it is done. Let's check it out.
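The build, read errors, fix, rebuild cycle described in this demo can be sketched as a small
loop. This is a toy illustration, not how GPT-5 or Cursor implement it: `fix_until_green`,
`toy_build`, and `toy_fix` are hypothetical names, and the "model" here is a plain string
replacement standing in for a real model call.

```python
from typing import Callable


def fix_until_green(source: str,
                    build: Callable[[str], list[str]],
                    propose_fix: Callable[[str, list[str]], str],
                    max_rounds: int = 5) -> tuple[str, int]:
    """Rebuild after each proposed fix until the build reports no errors.

    Returns the final source and the number of fixes applied.
    """
    errors: list[str] = []
    for fixes_applied in range(max_rounds + 1):
        errors = build(source)
        if not errors:
            return source, fixes_applied
        if fixes_applied == max_rounds:
            break
        # Feed the build errors back to the fixer, as in the demo.
        source = propose_fix(source, errors)
    raise RuntimeError(f"still failing after {max_rounds} fixes: {errors}")


def toy_build(source: str) -> list[str]:
    """Stand-in for a real build: flags one known typo."""
    return ["unknown keyword 'retrun'"] if "retrun" in source else []


def toy_fix(source: str, errors: list[str]) -> str:
    """Stand-in for a model call that reads the errors and edits the code."""
    return source.replace("retrun", "return")


fixed, rounds = fix_until_green("def f():\n    retrun 1\n", toy_build, toy_fix)
print(rounds, repr(fixed))
```

Capping the number of rounds is one simple guard against the kind of loop that never converges.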
I will follow the instructions, since I don't really know front end. Let me see how I should
run it. It says cd to the directory. Then it looks like it is served on
port 3001. Let me open that port.
>> GREG BROCKMAN: It is alive.
>> ADI GANESH: You can see here, let's check it out. The model has made me a dashboard. It is
telling me my ARR and cash. It looks like this company is doing well; you can even see revenue
is growing. The model has added some interactivity here. If I hover over a graph, it actually
tells me the exact value for a particular day. It would take me five hours to do that in D3.
>> GREG BROCKMAN: Just because it is easy to take this for granted, can you remind the
audience what the actual prompt was, and how much creativity and understanding of your intent
was required to produce this?
>> ADI GANESH: It is crazy that this prompt is so concise and it is able to just give you
something that looks beautiful in just five minutes.
>> GREG BROCKMAN: That is amazing.
>> ADI GANESH: It has also implemented another graph here showing our customers. It has also
implemented a date picker so I can filter by different dates and visualize data accordingly.
It has even segmented it by customer segment, which is cool.
This is just one example that highlights the power of GPT-5.
>> GREG BROCKMAN: There will no longer be an excuse for ugly internal applications.
>> ADI GANESH: [Laughter] Let's go to the fun demo.
>> GREG BROCKMAN: This was pretty fun, but even more.
>> ADI GANESH: I have a younger cousin and I want to make a game for her. I want to make a
3D game that incorporates a castle. You can see my prompt. I will kick this off.
>> GREG BROCKMAN: It is always the non-[inaudible] parts.
>> ADI GANESH: You can see my prompt: create a beautiful castle. I included some details, like
we want people patrolling the walls and some moving horses. I want a minigame where I can
pop balloons by clicking on them. This should make a sound effect. Let me run this prompt
in Cursor. I'm going to show an example I've already generated, just to save
some time. Here is the beautiful castle the model made.
It is just wild how, from a concise prompt, the model has this great sense of aesthetics,
where it has made this floating rock and made a 3D castle. If you zoom in you can see tons of
detail: these guards walking around, cannons firing. You want to fire the cannons?
>> BRIAN FIOCA: Of course. [Laughter]
>> GREG BROCKMAN: Who would not
want to?
>> ADI GANESH: There you go, you can fire the cannons. You can even chat with the characters.
We'll say hi to Capt. Rowen.
>> BRIAN FIOCA: They have names.
>> ADI GANESH: Say hello to the merchant. The merchant is
selling some stuff. What is your favorite song? A pallet of banners and dogs.
Give me some wisdom? Curiosity is volatile. That makes sense.
>> BRIAN FIOCA: The minigame.
>> ADI GANESH: Do you want to try the minigame?
>> GREG BROCKMAN: Let's play the minigame.
>> ADI GANESH: You want to try it, Greg? You can fire at these balloons.
>> GREG BROCKMAN: Oh no, I'm not good at it. Maybe I can ask GPT-5 for help with it. I got
one, there we go. Check out the sound effect.
>> ADI GANESH: These are historically accurate balloons. [Laughter]
>> GREG BROCKMAN: Did I get a second one? This game is harder
than it looks. Hold on, we have a balloon coming. [Laughter] There we go.
I think I should quit while I'm ahead.
>> ADI GANESH: Working with GPT-5 has been really fun and profound for me, because this is the
first model I've worked with that actually has a sense of creativity. We are really excited to
see how GPT-5 unlocks your creativity.
>> GREG BROCKMAN: Thank you both, this is absolutely amazing. Now, we believe that GPT-5 is the
best coding model in the world. But don't just hear it from us. To talk more about this model
and how to make it really useful for developers, I'd like to welcome Michael Truell, the
cofounder and CEO of Cursor.
>> MICHAEL TRUELL: Thank you, good to be here.
>> GREG BROCKMAN: Great to have you. What was your very first
experience with GPT-5?
>> MICHAEL TRUELL: When we got access to GPT-5 we used it on our actual work, and so to start
with, as a test, we asked it to tell us something not obvious about our codebase. Within a
couple of minutes it peered into the codebase, it identified a particular system we use for
remote code execution, and it identified a nonobvious architecture decision we made. Then it
also understood why we made that architecture decision: it was to harden our security. Those
were architecture decisions and trade-offs that took humans weeks to think through. It is kind
of amazing to see its codebase understanding abilities from the-
>> GREG BROCKMAN: That is really great, not just the code writing but the reading and
understanding. It turns out there is so much more to this work than just the emitting of the
code.
>> MICHAEL TRUELL: The understanding is an important prerequisite.
>> GREG BROCKMAN: What stood out most to you about GPT-5?
>> MICHAEL TRUELL: It is a very smart model. And while it is smart, it does not compromise on
its ease of use. For real pair programming, that means it's incredibly fast. That also means
it is quite interactive. It is good about talking about what it is about to do, breaking
problems down into subproblems that a human can then see, and leaving a reasoning trace you can
then intervene on and react to. It's also great not just when you give it one initial query and
it goes and does that, but when working with you over a long session, where you are asking it
to backtrack on something it has gone down or asking it to make additional changes to the
codebase.
>> GREG BROCKMAN: Should we
show it in action?
>> MICHAEL TRUELL: Let's do it. I think we are going to go and try to solve a bug. This is the
OpenAI Python SDK. There are a bunch of issues with the OpenAI Python SDK; there are also a
lot of closed issues. It seems like there's a problem with uploading PDFs through the SDK.
>> GREG BROCKMAN: This has been open for three weeks. It is not a trivial problem.
>> MICHAEL TRUELL: Let's see if we can tackle this issue. We will go and take the issue and
paste it into Cursor. GPT-5 will set off and try to solve the problem. This is an example of
the robustness of the model in the API. When we solve the problem in Cursor, it is working with
a set of custom models it has not seen before and a set of custom tools it has not seen before,
to do things like pull down text from the web or search throughout the codebase. It's
incredibly robust and adept at using those tools. And it boosts the eval results.
>> BRIAN FIOCA: I love seeing the explosion of all the things it is running through. How does
this compare to how you would solve this problem?
>> MICHAEL TRUELL: It is very fast.
You can see it has made a high-level plan, searched throughout the codebase, and started to
read some files. And it continued searching. Now it is thinking through what it would like to
do next. Now it has started to actually solve the issue.
Started to think through some coded changes. >> GREG BROCKMAN: Any advice
for people and how to get the most of GPT-5 in Cursor? >> MICHAEL TRUELL: I would
suggest using it for your real work. GPT-5 is a step forward towards
real power programmer start using as a helper on daily driver model for you.
If you have not used AI to code much before, I would take some of your more scoped-down
problems and try handing them off to the bot and working with it synchronously. I think
the fact that GPT-5 is so great for real-world, big codebases, not a demo of a one-off
application, is cool. That is where the real value comes from: fully operating in a larger codebase.
>> MICHAEL TRUELL: Definitely. Its codebase understanding is impressive, and its ability to be
steered is impressive. If you specify a long, complicated task with lots of subtleties in the
initial instructions, it is very good at picking up the subtleties. It's also very good
at backtracking if it has gone down a wrong path and then sees from executing the code, or hears
back from you, that it was incorrect. >> GREG BROCKMAN: What can't
GPT-5 do? >> MICHAEL TRUELL: We are really excited about computer-using
capabilities and about those getting better. It would be great if, for instance, with a
dashboard, it could run the code, see the output, QA every little bit itself, and then react to
it. Looking forward to computer-using capability. How would you
like GPT-5 to be better? >> GREG BROCKMAN: I think that's a great one. Expanding
the dimensions, I think it is in all directions. There is so much, like doing DevOps and other
work that is external to software code writing as we think of it today.
But also, you look at these demos: we have run them for five minutes or 10 minutes, maybe a couple
of hours. I think extending that lifecycle to really be able to go for days and weeks,
eventually even months, is ultimately where we expect things to go.
>> MICHAEL TRUELL: We can see it peered into the codebase and discovered that there is an
issue with the MIME type sent for PDFs and the plumbing through the SDK. It identified
that and started making some code changes. It created some new methods and
edited some existing code. This looks roughly correct. I would love to merge this
as a PR. >> FILIPE MILLON: I would love to do that as well. Let's do
that after the show. Thank you so much. We are so excited to have GPT-5 in Cursor starting
today. >> MICHAEL TRUELL: I'm excited to partner with you guys.
Starting today, GPT-5 is the default for new users in Cursor. We are releasing it to all Cursor users
to try for the next few days so people get a sense of the model. It is the smartest coding model we
have tried. >> GREG BROCKMAN: Awesome, thank you so much, Michael.
[Applause]. [Unclear Audio] It's great for the
enterprise. We think of it like a subject matter expert in your pocket, one that is an
expert across every domain: legal, finance, whatever application you have in mind.
To talk about how GPT-5 can be applied to the enterprise, I'd like to welcome
Olivier to the stage. >> OLIVIER GODEMENT: Thank you, Greg. Hello everyone, I'm Olivier, and I
lead the platform at OpenAI. At this point I think you got the message: we care a ton about
developers. But that is not all. Enabling businesses and governments is critical
to OpenAI's mission. We want to enable key industries to transform
themselves, such as healthcare, education, energy, and finance. Since we launched ChatGPT and the API,
5 million businesses have been using our technology. I'm still mind-blown: 5 million
businesses. Those businesses are not just playing, they are not just experimenting; they are
pushing new products into production in the real world. I believe GPT-5 is going to be a
step function. As Sam mentioned earlier, having a subject matter expert in your
pocket will enable every employee to do more. Let me take you through a few examples. First, I want
to talk about life sciences. Amgen is a company in the US that designs new drugs, new
medicines, to fight some of the toughest human diseases. Amgen was one of the first testers of
GPT-5. They used it in the context of drug design. What Amgen scientists found is that GPT-5 is
actually good at deep reasoning with complex data. Think analyzing scientific
literature or clinical data. Next, I want to talk about finance. BBVA is a
multinational bank headquartered in Madrid, Spain. BBVA has been using GPT-5 for
financial analysis. The takeaway was pretty clear: GPT-5 beats every single other
model out there in terms of accuracy and speed. What used to take a financial analyst
three weeks to do, GPT-5 can do in a couple of hours. Next, I want to talk about
healthcare. Oscar Health is an insurance company based in New York.
They've been using GPT-5, and what they found is that GPT-5 is the single best model for clinical
reasoning. Think mapping complex medical policy to patients' conditions. It is not all about businesses; it's
also about government. We are super excited about the announcement we made yesterday that 2
million US federal employees will be able to use GPT-5 and ChatGPT. I cannot wait to see how
that enables them to develop better services for the American people.
Frankly, that is very cool, and I think that is just the tip of the iceberg.
If history is a teacher, and we've seen it with GPT-4o, we are going to see many, many use
cases emerge that all of us cannot imagine. I cannot wait for us to go on that adventure
together. Let's talk about pricing and availability. GPT-5 is going to be available in the
API starting today, in three models: GPT-5, GPT-5 mini, and GPT-5 nano. GPT-5 will be priced at
$1.25 per 1 million input tokens. Mini and nano are faster. GPT-5 nano is 25 times more
affordable than GPT-5. [Unclear Audio] I cannot wait to see what you will build. Next, our
chief scientist, Jakub, will close us out. [Applause].
>> JAKUB PACHOCKI: Thank you. At OpenAI, we are about understanding this miraculous
technology called deep learning, and what its consequences are.
Our research aims to understand what deep learning is capable of and how to steer it to make
it safe and useful for all of us. This is a work of passion,
and it is a mission. I want to recognize and just deeply thank the team at
OpenAI. It is a great privilege. [Applause].
It is a great privilege for me to work alongside this incredible group of brilliant people
driven by the shared goal. What went into GPT-5 is years of investigation aimed not only
at producing a great release but at building understanding of this underlying technology
itself. A lot of what you will see in this model is really just
glimpses of new ideas that we believe will go much further.
There is a lot we still have to understand. We look towards a future where AI can uncover
knowledge about the world and meaningfully transform our lives for the better.
We hope you enjoy what we built, and we will get back to scaling.
Heads up!
This summary and transcript were automatically generated using AI with the Free YouTube Transcript Summary Tool by LunaNotes.