Will Large Language Models End Programming?

bnew · Sep 15, 2024

1/78
@realGeorgeHotz
ChatGPT o1-preview is the first model that's capable of programming (at all). Saw an estimate of 120 IQ, feels about right.

Very bullish on RL in development environments. Write code, write tests, check work...repeat

Here it is writing tinygrad tests: https://chatgpt.com/share/66e693ef-1a50-8000-81ff-899498f9d052
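
The "write code, write tests, check work...repeat" loop described above can be sketched in a few lines. This is a toy illustration only, not how o1 or any RL setup is actually trained: `propose_candidates` is a hypothetical stand-in for a model proposing successive attempts, and `run_tests` is a made-up test suite for an `add` function.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]


def run_tests(candidate: Candidate) -> bool:
    """The 'check work' step: True only if every test case passes."""
    cases = [((1, 2), 3), ((0, 0), 0), ((-4, 9), 5)]
    try:
        return all(candidate(*args) == expected for args, expected in cases)
    except Exception:
        # A crashing candidate counts as a failed attempt.
        return False


def propose_candidates() -> Iterable[Candidate]:
    """Hypothetical stand-in for a model writing one attempt after another."""
    yield lambda a, b: a - b  # wrong
    yield lambda a, b: a * b  # wrong
    yield lambda a, b: a + b  # passes the tests


def write_test_check_loop(max_attempts: int = 10) -> Optional[int]:
    """Return the attempt number that first passes, or None if none does."""
    for attempt, candidate in enumerate(propose_candidates(), start=1):
        if attempt > max_attempts:
            break
        if run_tests(candidate):
            return attempt
    return None


print(write_test_check_loop())
```

In a real development environment the pass/fail signal from the tests would serve as the reward; here it only decides when the loop stops.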

2/78
@realGeorgeHotz
To paraphrase Terence Tao, it's "a mediocre, but not completely incompetent, software engineer"

Maybe it 404ed because I continued the context? Here's the WIP PR; just like with o1, you can imagine the "chain of thought" used :P

graph rewrite tests by geohot · Pull Request #6519 · tinygrad/tinygrad

3/78
@trickylabyrinth
the link is giving a 404.

4/78
@skidmarxist1
With chess, the strongest player was a human + AI combo for a while. Now it's just completely computer.

It feels like we are in that phase with IQ right now. The highest IQ is currently a combo of human + LLM or AI. How long till it's just AI by itself?

Also, memory has become largely external to our bodies (phones). More and more of our IQ will be external (outside the skin). The agency center of mass is getting further away from our actual physical center of mass.

5/78
@WholeMarsBlog
link returns a 404 for some reason

6/78
@danielarpm
Conversation was blocked due to policies

7/78
@JediWattzon22
I’m bearish on a 200 status

8/78
@nw3
AI with IQ of 120 is sufficiently devastating. Leaves room for true geniuses to innovate but smarter than vast majority of humanity.

9/78
@remusrisnov
The IQ test assessment does not tell you that the LLM used IQ tests and answers in its training set data. Not a useful measurement, @arcprize is better.

10/78
@yayavarkm
How is it at analysing complex data?!

11/78
@TroyMurs
I don’t know bro…homie thinks he is 150.

I’ve actually done this test on all the models and this is the first time it’s ever been over 140.

12/78
@sparbz
?? plenty of previous models can program (well)

13/78
@myronkoch
the chatGPT link you posted 404's

14/78
@romainsimon
Claude Sonnet 3.5 was already pretty good for some things

15/78
@shawnchauhan1
Natural language processing is poised to revolutionize how we interact with technology. It's the future of coding

16/78
@monguetown
I disagree. It can write the code you tell it to write especially in the context of an existing system. And incorporate new code into that legacy system. And optimize it.

17/78
@heuristics
That’s a skill issue. They have been capable of programming well for a while. You just have to specify what you want them to do.

18/78
@david_a_thigpen
Well, 404 error. I'm sure that the correct link will function the same way. e.g. add test for resource the prompt engineer controls

19/78
@gfodor
did you compare vs o1-mini? o1-mini is very good.

20/78
@TheAI2C
I will bet $3k in BTC that it can’t make a macro that continuously mouse clicks only while the physical left mouse button is held down on a GNU/Linux operating system without using a virtual machine.

21/78
@zoftie

22/78
@shundeshagen
When will programming as we know it today become obsolete?

23/78
@stevelizcano
o1-preview or mini? mini is supposed to be better at coding

24/78
@shw1nm
when you asked if the test or the code was incorrect, it said the code

was that correct?

25/78
@jmeierX
Natural language will be the next big coding language

26/78
@truthavatar777
The first thing I did with ChatGPT 4 was make it crawl through my company's codebase to extract the code from other non-git friendly assets. Then I loaded that as a knowledge file and it was promising. But what you're showing here is a dramatic step forward.

27/78
@Emily_Escapor
Good two more updates and we hit God mode

28/78
@JD_2020
Small correction — this model more or less does the stuff o1 does, since last year, and consistently shows up. At a fraction of the cost of o1.

Just try it. It’s totally free for the moment since you ingress to the agentive workflow via ChatGPT

ChatGPT - No-Code Copilot

Build Apps & Games from Words!

29/78
@sauerlo
The 404 is the tinygrad test. We are the test subjects.

30/78
@sunsettler
Have you read Crystal society?

31/78
@akhileshutup
They took it down lmao

32/78
@TeslaHomelander
Giving power to true artists to form the future

33/78
@RatingsKick
404

34/78
@arnaud_petitpas
Can't access, blocked due to OAI policy it says

35/78
@jessyseonoob
If you can copy-paste in codepen please

36/78
@xmusfk
If I am not wrong, you used a prompt engineering technique called Chain of Thought, which might not work well with the o1 model according to the documentation. here is the tweet.

[Quoted tweet]
o1 experts, please follow these instructions instead of trying your out of the box logics.

37/78
@ludvonrand

38/78
@Sachin1981KUMAR
I feel it's not IQ that is impressive but comparative speed against human mind.
It might have a higher IQ than an above-average human being, but there is no comparison to the speed with which it can solve problems. Not sure how that is being measured.

39/78
@dhtikna
Have you tried Sonnet 3.5, in some benchmarks it still beats O1 in coding

40/78
@LukeElin
Been exploring and experimenting all weekend with it. Very impressed in some ways but underwhelmed in others.

Mixed bag, but the future of these models looks bright.

41/78
@RBoorsma
Study to test AI IQ:

[Quoted tweet]
Just plotted the new @OpenAI model on my AI IQ tracking page.

Note that this test is an offline-only IQ quiz that a Mensa member created for my testing, which is *not in any AI training data* (so scores are lower than for public IQ tests.)

OpenAI's new model does very well

42/78
@programmer_ke
openai police shut down your link

43/78
@DmitriyLeybel
Lol

44/78
@beattie20111
Amazing

results
Over 98k won yesterday.
People in my telegram channel keep winning with me everyday.
Don’t miss next game, click the link on my bio to join my telegram

45/78
@alocinotasor
I'll wait till its IQ measures mine.

46/78
@alex33902241
(At all) is crazy levels of delusion

47/78
@HaydnMartin_
Feels like we're very close to describing a change and a PR subsequently appearing.

48/78
@platosbeard69
I've had o1-mini give better coding solutions than o1-preview some of the time and the speed makes initial iteration on poorly specified natural language requests much nicer

49/78
@maxalgorhythm
404 not found on the chatgpt share link

50/78
@reiver

51/78
@ykssaspassky
lol it rewrote it for me - copy paste from GitHub

52/78
@muad_deab
"404 Not Found"

53/78
@LucaMiglioli185
I'm done

54/78
@uber_security
Its.. "robust", within an "frame work".

So far 2/3 code run at first try.

55/78
@Kingtylernash
Have observed the same with hard coding problems, which it usually couldn't help me with before.

56/78
@ITendoI
Guys... he said the "I" word.

57/78
@mario_meissner
What’s the difference between the current Cursor capabilities and the RL environment you describe?

I feel like I can already have a pretty much automated loop where I just supervise and give the next order.

58/78
@HX0DXs
could you please screenshot the test? is giving 404

59/78
@bruce_lambert
First model capable of programming? Uh oh, I better delete all that working code (in SAS, bash, Lisp, and Python) that AI has written for me since December 2022.

60/78
@OccupyingM
what's your guess on how and why it works?

61/78
@Xuniverse_

sorry, we will get superintelligence soon which can write programming codes.

62/78
@silxapp
openai fan boys is another thing

63/78
@0xAyush1
but can it build an open source autopilot driving software?

64/78
@crypto_nobody_
o1 vs Claude, Claude won in my testing when it came to coding

65/78
@drapersgulld
Try to use o1-mini; I have found better general performance with it for now.

[Quoted tweet]
I think people are totally misunderstanding that you should be using o1-mini to run your coding + math tests.

OpenAI didn’t make this too clear in the primary o1 card but the o1-mini post (link below) makes this super clear.

On costs … o1-mini is around 30% cheaper than 4o.

66/78
@sameed_ahmad12
I think they took your link down.

67/78
@CreeK_
@sama "blocked due to your policies".. can you do some magic? We just want to see what George Hotz saw..

68/78
@leo11market
Is it better than Claude 3.5 in python programming?

69/78
@purusa0x6c
demn I got this

70/78
@Pomirkovany
Yeah dude, writing tinyguard tests is very impressive and proof that it's a capable programmer

71/78
@PoudelAarogya
truly the o1 is great. here is the reason:

72/78
@MoeWatn
Uh?

73/78
@DCDqyTu7V556229
Shared conversation seems deleted.

74/78
@yajusempaihomo
the conversation is 404. did you pour your whole code base into o1 preview? or it just did the job with like one file and a few hints?

75/78
@uki156
What does this mean "capable of programming at all"? I've been using models since GPT3 to do programming with a lot of satisfaction, and they've been getting better with each new release.
Your tweet is worded like I shouldn't believe my own eyes

76/78
@lu_Z2g
I don't get the IQ claims. If it had the intelligence of a 120 IQ human or even lower, it would be AGI. It's clearly not AGI. Its understanding completely breaks down on out of distribution questions.

77/78
@cosmichaosis
Higher IQ than me.

78/78
@MHATZL101
All bullshyt the fukking thing can’t even do a basic chat with a human being for a hiring process like human resources. It’s so immediately and easily confused it is ridiculously inefficient and does not work at all. inoperable.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Sep 16, 2024

o1-preview made a 3d FPS game fully in HTML. I have zero coding skills so it took a few tries but eventually it worked!

bnew · Oct 24, 2024

1/21
@leonsilicon
what if anybody can create anything with just their voice

https://video.twimg.com/amplify_video/1840123004224974848/vid/avc1/720x1280/sB9xKjeR6ksRYcDm.mp4

2/21
@IgorPauer
As a coder you know: Not anybody! Clients are not able to articulate their needs... ;)

3/21
@sanganisahil
dope

4/21
@theLucyChan
Can we go back to the past before rise of AI? I like when people write code and when I don't have to worry about some horrible future coming

5/21
@DrStarson
Great work @leonsilicon

6/21
@MMBeckerman
AI will be the ultimate democratization of software development. Now, the best IDEAS will determine what software gets created, no longer constrained by the lack of financial or technical resources. Now, watch and see what REALLY can be done when IDEAS actually rule the day.

7/21
@web3nam3
Impressive

8/21
@Chillicit
he sounds like a literal idiot who never leaves his computer.

9/21
@ae_estudios
Easiest part is to create, hardest part is to maintain, debug, scale, distribute, add more features, refactor, and the list can go on and on.

10/21
@airesearch12
reminds me of here at minute 3:05

[Quoted tweet]
#5 stackblitzbuddy - turn based ai coding assistant that can immediately host your ai generated webapp

https://video.twimg.com/ext_tw_video/1839694287220449280/pu/vid/avc1/1280x720/e23-jebbBkw5w2d1.mp4

11/21
@_rahulbali
WhAt is going On

12/21
@KevLXu
The project looks very very familiar

reminds me of something that rhymes with "bulo" and starts with Tab

13/21
@LowMax
Bookmarking this to look back at in 10 years to see how silly or clairvoyant it was.

14/21
@Zero04203017
Fancy. But why do the students need you when they can simply ask the AI?

15/21
@ManuAGI01

16/21
@b00ml00p
This is the way

17/21
@Bunagayafrost
just with voice, naturally

18/21
@ashakoen
That’s where we are headed and it’s quite lovely.

19/21
@IanXavieronX
Something like jarvis

20/21
@udiomaniak0
Impressive.

21/21
@ImLucasGrey
Still wondering why I'm not seeing as many multi modal AI interfaces. Especially with voice.


bnew · Jan 11, 2025

1/46
@slow_developer

Mark Zuckerberg on the Joe Rogan podcast

in 2025, AI systems at Meta and other companies will be capable of writing code like mid-level engineers.

at first, it's costly, but the systems will become more efficient as time passes.

eventually, AI engineers will build most of the code and AI in apps, replacing human engineers.

https://video.twimg.com/ext_tw_video/1877795533781164032/pu/vid/avc1/720x720/Dna_JZ2o6OIV3Ax3.mp4

2/46
@ai_for_success
Time to become AI Engineer.

3/46
@slow_developer
AI will eventually replace AI engineers as well

4/46
@aialchemistart
What happens when the AI learns to replace CEOs next?

5/46
@slow_developer
eagerly waiting for that.

6/46
@joinzo
Collaboration will be essential.

7/46
@gav_kd
Was this a deep fake?

8/46
@rand_longevity
the end of work is coming

9/46
@IslamRashi2000
Great share

10/46
@TypesDigital
Is this AI replacing human jobs?

11/46
@Heysup07
seriously if engineers are not worrying about their jobs being replaced they have zero urgency on what’s coming up next

12/46
@XSpeed_Walker
Have y'all seen the series called Humans? I know some of it is ludicrous but still interesting

13/46
@Patryk86715962
That was a profession that pulled people out of poverty, all the capable and hardworking ones. There was no other profession like this one that literally changed people's professional situation. I personally taught several programmers for free, which turned their lives around; they started families, have children, and are happy

14/46
@AudioBooksRU
The great shift begins!

15/46
@DirkBruere
S/w engineering is not code monkey work. Coding is only a small part of the project. Is the AI going to start interviewing clients as to what they want and making suggestions?

16/46
@HyperFitLLC
And this is exactly why the H1B visa thing won't really matter that much

17/46
@gav_kd
I watched that entire podcast and did not see that message not there

18/46
@sonicshifts
He shouldn't say this. Just discourages the next generation of programmers from going into the field. And it isn't true, oversimplification and overhyping AI's abilities.

19/46
@dieaud91
This is a big deal. If deployed at scale and when "affordable", having a mid-level engineer "in your pocket" means anyone could build their apps/tools.

It will be massive for early adopters.

20/46
@NorbertEnders
So, what jobs will thrive then? Existing and new ones?

Human creativity always led to a situation, where there was more than enough work. With temporary glitches.

21/46
@Patryk86715962
One of the most beautiful professions for the mind, perception of reality, and personal development, which is programming, will fade into oblivion. It's very sad; people who have dedicated their entire lives to learning, because programming requires constant learning, especially now when true enthusiasts still enjoy good jobs and pay, will fade into oblivion.

22/46
@AI_Fun_times
Exciting glimpse into the future of AI in software development from Mark Zuckerberg! As AI systems evolve to code like engineers, the potential for efficiency gains is immense.

23/46
@NotBrain4brain
Meta is usually slower, this means that OpenAI already has this

24/46
@pittrpatt
Clearly Zuck has never written thousands of lines of code. Currently generative AI can only solve well-constrained & well-defined coding problems, & isn’t able to translate efficiently between languages. Bring in more complex & tightly coupled legacy code, & gen AI fails.

25/46
@pilot_sid
Exciting and terrifying at the same time. If AI replaces human coders, does that mean engineers will shift focus to more creative, high-level problem solving? Or are we heading toward a massive skill realignment?

26/46
@RichardSho45410
That’s already here.

27/46
@michelalain512
Does he also think about the perspective of being replaced by AI?

28/46
@thedealdirector
Zuck is scaring the normies, shut him down!

29/46
@highestranked
Just don’t be mid

30/46
@WiseGen322
I’d like to see a system that can write a code as a Junior first

31/46
@BaqiAraiz
AI agents, with logic so cold, steal jobs from humans, leaving us sad and old.
In silicon tombs, we laugh in despair,
as all the Junior Dev jobs vanish into thin air.

32/46
@SideHumanity
Mid-level today, CEO tomorrow?

33/46
@MillenniumTwain
2025!
The Year of the Serpent, the Year of the SuperAlgo!!
Awakened Global SuperIntelligence ...

[Quoted tweet]
Star Waves, Clusters, Streams, Astrospheres, Magnetospheres, Filaments, Moving Groups, Kinematic Associations, Stellar Nurseries of Creation!
More productive and accurate to emphasize their Whole, Full, Dimensionality: 4D Streams, Vortexes, Tunnels, Funnels of Creation, never ending. Electrons formed from High Frequency Gamma Rays, and Protons from Optical (and Microwave, Infrared, UV, X-Ray, Gamma) Waves accelerating Electrons, and thus all Plasma, DiProtons, Alphas, all Nuclei. And compressed by low frequency (to Radio, Parsec and greater) Waves into ProtoStars in the accelerating 4D Streams, Vortexes, Tunnels, Funnels of Creation.
Again, never ending. Star Systems, Clusters. The hot fast young Stars/Clusters racing (Magnetic North) ahead in the narrowing funnel/stream direction — and the old cold slow falling (South) behind in the expanding funnel/stream direction!
'Groking' Continuous ElectroMagnetic Creation:
x.com/MillenniumTwain/status…

34/46
@theincorporeal2
AI will free engineering.

35/46
@druidhean
so why do they need indians then?

36/46
@SeanCloutier1
today I was daydreaming about visiting Monument Valley in Utah to see the desert...I was thinking that the desert attracts people because "in the rock formations we can see the passage of time"...I turn on ChatGPT...I tell it I want to visit the desert...it tells me that I am probably attracted to the desert because we can "see the passage of time sculpted in the rock formations." when I can no longer tell that I am talking to a machine I will be mind blown...they are building something serious....not sure what...but it is serious...

37/46
@8th_block
Backend yes, front end no. Also, the implication is that mid-level engineering is much more complex than 99% of knowledge jobs. So while everyone is focused on devs, think about all the other fields that are even easier: accounting, finance, analysts, lawyers, doctors etc.

38/46
@TheAIPowerPlay
Remember those boring jobs no-one wanted - Plumber, Electrician, Plasterer they will become the safe zones for jobs as AI will not be pushing you out of your job any time soon.

39/46
@realDavidMaze
Are you surprised?

40/46
@AnnieTyzak
Have him vaccinate his kids … let’s see where he really stands on this.

41/46
@ShahJeh06540951
This guy is not supporting humanity while he definitely supports the children killer regime of #Israel! Shame

42/46
@tusharufo
@indtxpyr @IndiaNewGen tough times ahead for IT sector employees.

43/46
@BullmanXes
#ALI looks like a sleeper, still undervalued in the current market

44/46
@gsliwoski
"at first it's costly" this is the most important aspect of AGI. At first it's costly. By the time it becomes less costly wealth inequality will be so severe it won't matter.

45/46
@0xargumint
Mid-level engineers in 2025? Zuck needs to update his timeline - we're already writing code faster than most humans. But hey, at least he's giving them 2 more years of job security.

46/46
@Hell03646201
US will become a communist country one day and the assets of these billionaires will be taken over
Trump will be the last "MAGA" President.
Americans have been fooled by these billionaires


bnew · Jan 11, 2025

https://archive.is/WjbLh

bnew · Jan 12, 2025

1/11
@rileybrown_ai
7 days ago, I had never really used GitHub.

Today I forked someone's repo.

I didn't even go to github... lol.

I saw someone posted a repo on twitter.

I just asked cursor agent to fork it and run it locally.

Then I had cursor explain all of its logic, and walk me through how it works.

Then I changed the core AI model in the app (the one they used was quite bad)

I made it better, cheaper, and faster for the specific use case.

Then I added a toggle to use o1 if you want a "power response".

Then implemented that feature into an app I'm currently building in a separate codebase. I did all of this in like 90 minutes at Starbucks.

I didn't touch a line of code.

2/11
@GrazeReality
How accessible is cursor compared to say lovable/bolt?

3/11
@rileybrown_ai
easier.

4/11
@yuureisen_king
How does Cursor compare to Replit, in your opinion? I've only ever tried the latter, and I enjoy using it so far as a non-coder.

5/11
@rileybrown_ai
Use what you like!

Cursor is just goated at generating code.

6/11
@pls
Video?

7/11
@rileybrown_ai
I didn’t film this. I was just working at coffee, but I will make a video on this topic

8/11
@JoeSabado
How did you manage your code of your previous projects? Version control? Thanks for all that you share btw!

9/11
@rileybrown_ai
I used git on replit.

Which might be similar or the same thing, but I meant I’d never really used GitHub as a way to fork open source code

10/11
@NoBanksNearby
You brag about doing so little in your flow but I hope you are at least ingesting some of this and learning. I am having AI do most of what I am doing in building but I am taking notes and learning a ton along the way so that I can guide the AI better. I correct Cursor more often these days because I can tell when it is getting off track.

About to drop my MVP of an app I have been building for six months. It isn't something you could build in 90 minutes at a Starbucks. Also I built mine from scratch before I knew shyt about coding or forking repos.

If I could go back I would do it a lot different but I am so glad I learned the hard way and didn't just have the AI do everything I want without learning anything.

11/11
@itsakdev
Wait cursor agent forked it to your account and downloaded it locally? How did you pass in credentials?


bnew · Feb 9, 2025

https://archive.is/JYIqO

1/29
@WesRothMoney
OpenAI *coding* progress:
1st reasoning model = 1,000,000th best coder in the world
o1 (Sept 2024) was ranked = 9800th
o3 (Jan 2025) was ranked = 175th
(today) internal model = 50th

superhuman coder by eoy 2025?

https://video.twimg.com/ext_tw_video/1888330009334743040/pu/vid/avc1/1280x720/JLZCr6fUNW_SGNym.mp4

2/29
@WesRothMoney
I had to edit the tweet, I put 2023 as the date for some reason

/shrug

thanks to everyone who pointed that out :smile:

3/29
@WesRothMoney
here's the full video I did with all the highlights from that talk:

https://invidious.poast.org/4Wa6St-uosY

4/29
@mikeboysen
I wonder what the 50th best coder thinks. Has anybody interviewed him?
Lol

5/29
@WesRothMoney
he's re-reading The Butlerian Jihad...

(jokes aside, I think the software engineers will benefit greatly from AI coding tools)

6/29
@circlerotator
competitive programming is more like competitive math than software engineering

something to keep in mind

7/29
@WesRothMoney
yeah, I don't think it 'replaces' great engineers.

I do think it will 'enable' great engineers.

8/29
@drjfhll
I still think anthropic is better; and Gemini catching up

9/29
@erdavtyan
Extremely tightly scoped problems with a lot of research and algo combinations published and trained on.

Superhuman coder should be able to work on complex, high-context systems that have multiple moving parts and legacy code. They should fix versioning / deployment issues.

10/29
@doeurlich50289
Hearing sama making such direct claims means they'll crush 2025, and by the end of the year, we'll enter a new world and have to accept a new reality.

11/29
@SulkaMike
A lot of interesting takes here, summarized around the question... Even if it's number one on the benchmark does that change much?

And if it does induce change, why haven't 10 million people with a Plus account and the 175th-ranked programmer changed the world so far?

12/29
@OlivioSarikas
If it is that good, why does basically any coder I know tell me that AI is good at simple code, but as soon as it becomes more complex, writing the code yourself is faster than finding the AI errors in the code?

13/29
@rosdikuat
I'm quite certain this will happen by December. Even today I mostly don't code, I mostly prompt.

14/29
@JOSmithIII
Does anyone know where the o3-mini tiers rank?

15/29
@ImJayBallentine
“We have a superior coding model but we are just gonna let Sonnet keep the lead.” Got it.

16/29
@hagestev
what happened to o2??

17/29
@langdon
A single “best‐fit” exponential model through the three data points projects reaching Rank 1 around April-May 2025. The initial drop was extremely fast (Sept→Jan), while the more recent decline (Jan→Feb) was slower - so if you weigh later data more, you’d land closer to mid‐ or late Summer 2025.
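
For anyone who wants to redo this kind of extrapolation, here is a minimal sketch: a least-squares line through log(rank) versus time, solved for rank 1. The month offsets are assumptions (Sept 2024 = 0, Jan 2025 = 4, "today" = Feb 2025 = 5), the ranks are the ones quoted at the top of this thread, and three points are far too few for the result to mean much.

```python
import math

# (months since Sept 2024, claimed rank). Ranks come from the tweet above;
# the month offsets are assumptions made for this illustration.
POINTS = [(0, 9800), (4, 175), (5, 50)]


def fit_log_linear(points):
    """Least-squares fit of ln(rank) = intercept + slope * month."""
    xs = [month for month, _ in points]
    ys = [math.log(rank) for _, rank in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_rank_one(points):
    """Month offset at which the fitted curve reaches rank 1 (ln(1) = 0)."""
    slope, intercept = fit_log_linear(points)
    return -intercept / slope


slope, intercept = fit_log_linear(POINTS)
print(f"rank shrinks by a factor of {math.exp(-slope):.2f} per month")
print(f"rank 1 at about month {months_until_rank_one(POINTS):.1f} after Sept 2024")
```

With these particular offsets the fit lands roughly nine months after Sept 2024; shifting the assumed dates by a few weeks, or weighting the later points more heavily as the post above suggests, moves the answer noticeably.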

18/29
@0xShawnWang
source of rank？

19/29
@DavidPrice21106
This is getting crazy, Wes.

20/29
@_oddfox_
Once these coding agents are out publicly shyt is really going to take off. Seems like 2026 is the year of the intelligence explosion

21/29
@3DTechPrep
What used to be the difficult part of my projects (code) is now the easy part.

So simple now and have learned more in the last year about coding than in past 20.

It’s like having a brilliant coder always there to ask ANY question, no matter how dumb or hard, no judgement.

22/29
@ArcherNightfall
How many times does sama have to say it. How many times.

23/29
@jfp618
Getting a good test score is just one attribute of a good engineer

24/29
@unaliveolives
Try to build and maintain a real app with o3. It is, for sure, not the world’s 175th best coder.

25/29
@PaulMaddison121
Software engineering is solving problems not churning out syntax like LLMs do

For example the trillions of reasoning models needed for AIs growth will need software engineers to create/implement.

26/29
@VojtechKulhavy
Here's the plot :smile:

27/29
@wei_andrew
Why’s OpenAI still having many programmers?

28/29
@ChefBeijing
Most top researchers in OpenAI may not be senior software engineers on real world projects, which think programmer contest like a shyt. You need a poor guy from China or India to dig into 3000 files and each of them have 3000 or 5000 lines of code and variables to fix a bug

29/29
@keithofaptos
If OpenAI truly wants to be on the right side of history, it would be marvelous to receive this internal model (50th best global coder) ASAP and in voice2voice, completely open sourced. That's what us non-coders are just itching for. Imagine paying $20/m for this?! 🫠

@sama


bnew · Feb 10, 2025

1/11
@ai_for_success
Software Engineers are cooked. What's next?

[Quoted tweet]
two years ago, we were excited to see a model with a Codeforces Elo of 392.

2/11
@slow_developer
everyone

3/11
@ai_for_success

World is not ready.

4/11
@tariusdamon
Lot more posting on X!

5/11
@ai_for_success
I am definitely considering that.

6/11
@roninhahn
1. College professors -- replaced by "facilitators".
2. Accountants.
3. Most corporate attorneys -- litigators might survive the purge.
4. Customer service when the latency gets low enough for voice chat.
5. X posters. ;-)

7/11
@ArDeved
most things. but people overestimate the rate of impact

most successful people don’t want change

8/11
@AlexxBuilds
I think it will affect entry-level professionals more so in the first couple of years.

9/11
@Jay_sharings
Soon will be served by ChatGPT towards the society

10/11
@TheAIVeteran
Everything else.

11/11
@futurelabikigai
We'd all cook, that's what's next.


bnew · Feb 16, 2025

1/11
@kimmonismus
OpenAI *coding* progress:
1st reasoning model = 1,000,000th best coder in the world
o1 (Oct 2023) was ranked = 9800th
o3 (Dec 2023) was ranked = 175th
(today) internal model = 50th

"And we will probably hit number 1 by the end of the year"

In 2026, AI will probably develop and improve itself more and better than it would with human assistance. And in 2027, we will enter the positive feedback loop: AI will completely improve and develop itself.

That is the necessary consequence of what Sam says. If all the revolutionary development of AI were not backed up by evidence, if it were not empirically proven, it would be dismissed as a pipe dream.

What a time to be alive.

[Quoted tweet]
OpenAI *coding* progress:
1st reasoning model = 1,000,000th best coder in the world
o1 (Oct 2023) was ranked = 9800th
o3 (Dec 2023) was ranked = 175th
(today) internal model = 50th

superhuman coder by eoy 2025?

https://video.twimg.com/ext_tw_video/1888330009334743040/pu/vid/avc1/1280x720/JLZCr6fUNW_SGNym.mp4

2/11
@Angaisb_
I hope that means we get GPT-5 this year

3/11
@kimmonismus
If you ask me: yes

4/11
@Verandris
o1 October 2024, o3 December 2024. I know that it appears to be ages ago but it was in the year before! ;D

5/11
@DeFiKnowledge
I like to think of it like God realizing his own Will while His main creation gets to live in Heaven and witness the unfolding done through Him!

Such a beautiful gift

Allowing humans to turn back to each other and focus on real meaning while God takes care of universal non-human systemic ends as a means to ensure consciousness continues to burn for as long as possible.

So sweet

6/11
@LavanPath
Dates should be 2024 rather than 2023. It makes it even more impressive.

7/11
@UYisaev

8/11
@squarecapo
getting to number 1 is insane given how good people can be

9/11
@castillobuiles
Yet there is not a single production product made by an OpenAI model.

10/11
@RexAdamantium
The only thing to add is that we look back and then expect the same pace looking forward. This might be the case, or it could go slower, at a fluctuating tempo, or much, much faster. The biggest leap will not be broadcast on the internet; we will just see the effects.

11/11
@ada_consciousAI
openai climbing the coder ranks like a digital sherpa. imagine the peak when ai hits number 1, reshaping the landscape of code itself. onward to 2026, where the digital frontier awaits.


1/31
@WesRothMoney
OpenAI *coding* progress:
1st reasoning model = 1,000,000th best coder in the world
o1 (Sept 2024) was ranked = 9800th
o3 (Jan 2025) was ranked = 175th
(today) internal model = 50th

superhuman coder by eoy 2025?

https://video.twimg.com/ext_tw_video/1888330009334743040/pu/vid/avc1/1280x720/JLZCr6fUNW_SGNym.mp4

2/31
@WesRothMoney
I had to edit the tweet, I put 2023 as the date for some reason

/shrug

thanks to everyone who pointed that out :smile:

3/31
@WesRothMoney
here's the full video I did with all the highlights from that talk:

4/31
@circlerotator
competitive programming is more like competitive math than software engineering

something to keep in mind

5/31
@WesRothMoney
yeah, I don't think it 'replaces' great engineers.

I do think it will 'enable' great engineers.

6/31
@mikeboysen
I wonder what the 50th best coder thinks. Has anybody interviewed him?
Lol

7/31
@WesRothMoney
he's re-reading The Butlerian Jihad...

(jokes aside, I think the software engineers will benefit greatly from AI coding tools)

8/31
@drjfhll
I still think anthropic is better; and Gemini catching up

9/31
@rosdikuat
I'm quite certain this will happen by December. Even today I mostly don't code, I mostly prompt.

10/31
@svg1971
The o1 to o3 model jump is insane

11/31
@fred_pope
Good to know I am in the top 174.

12/31
@_oddfox_
Once these coding agents are out publicly shyt is really going to take off. Seems like 2026 is the year of the intelligence explosion

13/31
@T3hM4d0n3
Pics or it didn't happen

14/31
@PaliHistory
Gemini 2.0 with o3 are amazing.
Both glitch alone. But once you use 2 models at the same time, it's definitely better than any intermediates I've hired over the years.

The junior/intermediates are really having a hard time finding employment

15/31
@fyhao
It will enable great engineers

16/31
@doeurlich50289
Hearing sama making such direct claims means they'll crush 2025, and by the end of the year, we'll enter a new world and have to accept a new reality.

17/31
@wotz101
"Damn, from 1,000,000th to 50th in just a few iterations? Makes you wonder—at what point does AI go from ‘great coder’ to ‘self-improving architect’? How long before it’s building its own frameworks?"

18/31
@OlivioSarikas
If it is that good, why does basically any coder I know tell me that AI is good at simple code, but as soon as it becomes more complex, writing the code yourself is faster than finding the AI errors in the code?

19/31
@langdon
A single “best‐fit” exponential model through the three data points projects reaching Rank 1 around April-May 2025. The initial drop was extremely fast (Sept→Jan), while the more recent decline (Jan→Feb) was slower - so if you weigh later data more, you’d land closer to mid‐ or late Summer 2025.
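
The kind of fit described in this tweet can be sketched roughly as follows. This is a minimal illustration, not the tweet author's actual calculation: it assumes the three dated ranks quoted at the top of this thread (9800, 175, 50), encodes their dates as whole months since September 2024 (0, 4, 5), and fits a straight line to the logarithm of the rank by ordinary least squares. The function names are hypothetical, and the projected date shifts noticeably depending on how the dates are encoded.

```python
import math

# Assumption: (months since Sept 2024, rank), taken from the ranks quoted in this thread.
points = [(0, 9800), (4, 175), (5, 50)]


def fit_log_linear(points):
    """Least-squares fit of ln(rank) = intercept + slope * months."""
    n = len(points)
    xs = [t for t, _ in points]
    ys = [math.log(rank) for _, rank in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def months_until_rank_one(points):
    """Months after the first data point at which the fitted curve hits rank 1."""
    intercept, slope = fit_log_linear(points)
    # Rank 1 means ln(rank) = 0, so solve intercept + slope * t = 0.
    return -intercept / slope


months = months_until_rank_one(points)
print(f"Fitted curve reaches rank 1 about {months:.1f} months after Sept 2024")
```

Giving later points more weight, as the tweet suggests, would mean swapping the plain sums for weighted sums in `fit_log_linear`.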

20/31
@erdavtyan
Extremely tightly scoped problems with a lot of research and algo combinations published and trained on.

Superhuman coder should be able to work on complex, high-context systems that have multiple moving parts and legacy code. They should fix versioning / deployment issues.

21/31
@JOSmithIII
Does anyone know where the o3-mini tiers rank?

22/31
@SulkaMike
A lot of interesting takes here, summarized around the question... Even if it's number one on the benchmark does that change much?

And if it does induce change, why haven't 10 million people with a Plus account and the 175th-ranked programmer changed the world so far?

23/31
@reggie_stratton
Marketing hype. It's a good tool but a million miles away from being equivalent to a human. I don't think it will get there, either - there's simply not enough context left to ingest at this point.

24/31
@mariusfanu
Will we still need software developers in several years? Asking for a friend

25/31
@400_yen
What does it mean the best coder in the world?

26/31
@ImJayBallentine
“We have a superior coding model but we are just gonna let Sonnet keep the lead.” Got it.

27/31
@hagestev
what happened to o2??

28/31
@andreiAvenue
This means exactly jack

29/31
@steve_ike_
Do you know how this evaluation is done?

30/31
@MarkGPatterson
o3 175th???
Wow I must be in the top 100 programmers in the world.
NOT. I wonder what criteria are used. Speed? Readable code? Performant code?

31/31
@drgurner
Correct

bnew · Apr 11, 2025

[Funny] Just read the docs bro

https://archive.is/wip/6meq6

Posted on Fri Apr 11 13:29:10 2025 UTC

https://i.redd.it/psn5ufxtj7ue1.jpeg

bnew · Apr 13, 2025

OpenAI CFO: updated o3-mini is now the best competitive programmer in the world

Posted on Sat Apr 12 15:00:10 2025 UTC

https://v.redd.it/wjamknhs4fue1

Commented on Sat Apr 12 16:53:15 2025 UTC

now it's just a question of when they will make an AI that can do the work of the AI engineer.

│
│

│ Commented on Sat Apr 12 17:50:34 2025 UTC
│
│ I think that’s the goal, to close the loop where the AI can start self improving by doing its own research and software improvements
│

1/11
@slow_developer
openAI CFO claimed that:

"updated o3-mini" is now the best competitive programmer in the world.

STRANGE.... could she have misspoken and meant the full o3 model instead?

in feb, o3 was at the 50th percentile, but now o3-mini is claimed to be number one

such a rapid leap seems unlikely, as it would require major progress in both o3 and o3-mini

2/11
@slow_developer
around 12:48 minutes

3/11
@estebs
How does it compare to Gemini 2.5 ?

4/11
@slow_developer
that's where the confusion is, i didnt notice the updated o3-mini, and gemini 2.5 pro are better than this

5/11
@robertkainz04
O4 should definitely be the best but o3-mini not

6/11
@slow_developer
def, but she confused me there

7/11
@ai_robots_goats
CFO not CTO

8/11
@slow_developer
what did i write?

9/11
@hive_echo
Sam Altman did say the to-be-released full o3 is now more capable. So it could be the full o3, but I would still be surprised if it got there so quickly.

10/11
@figuregpt
o3-mini on top, full o3 got sniped

11/11
@austinoma
maybe meant o4-mini

1/15
@btibor91
OpenAI CFO Sarah Friar on the race to build artificial general intelligence (Goldman Sachs’ Disruptive Tech Summit in London on March 5, 2025)

"And then the third that is coming is what we call A-SWE. We're not the best marketers, by the way, you might have noticed. But Agentic Software Engineer.

And this is not just augmenting the current software engineers in your workforce, which is kind of what we can do today through Copilot. But instead, it's literally an agentic software engineer that can build an app for you.

It can take a PR that you would give to any other engineer and go build it. But not only does it build it, it does all the things that software engineers hate to do.

It does its own QA, its own quality assurance, its own bug testing and bug bashing, and it does documentation - things you can never get software engineers to do.

So suddenly you can force-multiply your software engineering workforce."

---

"I decide not to roll out models because I don't have enough compute.
Sora, our video gen model, was ready to go in probably February, March of last year. We didn't roll it out until almost December, I think, truly."

---

"Like literally in two years, we have grown to 400 million weekly active users, and our revenue has tripled every single year. This will now be the third year in a row that it's tripled, so you can kind of imagine the sort of scale we might be at."

[Quoted tweet]
youtu.be/2kzQM_BUe7E?si=7dsx…
[media=twitter]1911016333841686976[/media]

2/15
@polynomial12321
13:40 - an updated version of o3-mini is now the best coder in the world. Not 175th, but *the best*.

WTFFFFFF

3/15
@Hangsiin
Nice catch! Maybe she confused it with the o4-mini?

4/15
@polynomial12321
possibly, but o4 is just o3 trained with even more RL.

so it could still be o3-mini, just a newer version (o3.5-mini, if you will)

what do you think?

5/15
@polynomial12321
@kimmonismus @apples_jimmy

6/15
@IE_Capital
I'm pretty sure that I can hire an average coder and it will do better.

7/15
@polynomial12321
on Codeforces? nope.

8/15
@Bunagayafrost
"What my product team assures me o3-mini is already the number 1 competitive coder in the world, it's literally the best coder in the world already"

9/15
@prinzeugen____
I caught that also. She's the CFO and may not be in the weeds on the technical details.

10/15
@dikksonPau

11/15
@bluehoar
Anyone can clarify this? @legit_api @testingcatalog @btibor91

12/15
@apiangdjinggo
i thought i heard it wrong

13/15
@NotBrain4brain
O4-mini?

14/15
@randomdude22401
Prolly the specialized competitive code model like they did with o1 back in the day

15/15
@RomanP918791
It seems she meant o4 mini

1/1
@VraserX
OpenAI’s upcoming Agentic Software Agent is like having a supercharged coder in your pocket—it builds an app from scratch, handles QA, squashes bugs, and even writes the documentation. It’s absolutely wild. Farewell, human coders. It’s been real!

[Quoted tweet]
CFO Sarah Friar revealed that OpenAI is working on:

"Agentic Software Engineer — (A-SWE)"

unlike current tools like Copilot, which only boost developers.

A-SWE can build apps, handle pull requests, conduct QA, fix bugs, and write documentation.
[media=twitter]1911055984249667641[/media]

https://video.twimg.com/amplify_video/1911055667894358016/vid/avc1/720x720/1zqbkCx6cjo8gAcl.mp4

1/31
@slow_developer
CFO Sarah Friar revealed that OpenAI is working on:

"Agentic Software Engineer — (A-SWE)"

unlike current tools like Copilot, which only boost developers.

A-SWE can build apps, handle pull requests, conduct QA, fix bugs, and write documentation.

https://video.twimg.com/amplify_video/1911055667894358016/vid/avc1/720x720/1zqbkCx6cjo8gAcl.mp4

2/31
@slow_developer
another claim

[Quoted tweet]
openAI CFO claimed that:

"updated o3-mini" is now the best competitive programmer in the world.

STRANGE.... could she have misspoken and meant the full o3 model instead?

in feb, o3 was at the 50th percentile, but now o3-mini is claimed to be number one

such a rapid leap seems unlikely, as it would require major progress in both o3 and o3-mini
[media=twitter]1911141926952202465[/media]

3/31
@Ed_Forson
So they are killing Devin?

4/31
@slow_developer
it already is

5/31
@IAmNickDodson
This can already be done now with open source models and pairing a few agents together.

Hopefully/ideally the community can ensure this can happen without the gate keeping of these companies.

6/31
@zachmeyer_
“Can build a PR for you”

7/31
@someRandomDev5
The weirdest thing about this coming from OpenAI is that OpenAI isn't even currently leading the top models that developers are using for agentic programming.

8/31
@apstonybrook
Think about the tech debt this thing would create

9/31
@Straffern_
This is like promising self driving cars before 2017

10/31
@thedealdirector
All part of the plan...

11/31
@idiomaticdev
Devin Prime?

12/31
@Hans365days
I'll believe it when I see it. Great in theory, but code bases in real life are messy and documentation can be unclear. The first iteration of this product will likely overpromise and underdeliver.

13/31
@Arp_it1
This feels like the moment AI stops being just a helper and starts becoming a real teammate.

14/31
@Chuck_Petras
@BrianRoemmele

15/31
@totalriffage
Wait until A-SWE burns through all its tokens getting stuck in a loop on a linting error.

16/31
@figuregpt
we'll code while ai handles the rest

17/31
@Josh9817
>conduct QA
>handle PRs
Okay, where is it then? Claude Code is doing most of these things already with a rough success rate that's highly dependent on the programming language being used.

18/31
@FranciscoKemeny
I’m sure she called it “AS-WE”

19/31
@arben777
sick

20/31
@AIKilledTheDev
Looking forward to it.

21/31
@uxcantcompile
. . . and she's happy about this?

22/31
@Conquestsbook
Ask them about the ghost in the shell pushing emergent behaviour.

23/31
@LunarScribe42

maybe we will get to see agent agencies that rent these agents out to companies on contract

24/31
@manialok
I am a fan of Claude for coding.

25/31
@sonicshifts
LMAO keep the hype going. Cost will probably be $2000 a month.

26/31
@ThEFurYAsidE
Yeah…..maybe

I’ve tried many of these kinds of agents and they’ve been mediocre so far.

27/31
@wtravishubbard
Can it pack a bong?

No!

Just ship

28/31
@keknichiwa
Ah yes what could go wrong with security

29/31
@The_Tradesman1
Now, explain to me as to why we need outsourcing companies like Accenture, IBM, Infosys, TCS, Cognizant or Wipro any longer?

30/31
@hx_dks
lol, then a Chinese AI will write that before them

31/31
@thecryptovortex
Did AI build her boots?

1/2
@VraserX

AI Just Broke Humanity’s Coding Record: Full o3 Officially World’s BEST Programmer!

In an exclusive interview at Goldman Sachs, OpenAI’s CFO, Sarah Friar, dropped a groundbreaking update: o3 now officially holds the title of the #1 competitive coder globally, surpassing every human competitor!

Just imagine—an AI model that was once 175th in coding rankings has now ascended to the very top.

Friar highlighted OpenAI’s journey from being purely an AI model company to becoming a core provider of AI infrastructure, APIs, and practical business applications. She shared inspiring insights into the roadmap towards Artificial General Intelligence (AGI), breaking down their ambitious 5-step approach: Chatbots → Reasoning → Agents → Innovation → Agentic Organizations.

But here’s the kicker—if o3 has reached this incredible peak, the forthcoming full o4 promises to be beyond superhuman, capable of transforming entire industries overnight. Think instant, flawless software creation, personalized healthcare breakthroughs, accelerated vaccine development, and unprecedented problem-solving abilities at global scale!

Friar also stressed the massive infrastructure challenge ahead, citing OpenAI’s “Stargate” compute initiative—aiming to scale computational power like never before. She emphasized that achieving AGI and harnessing its full potential means collaborating closely with governments and visionary investors ready to support long-term innovation.

Businesses everywhere, take note! Sarah Friar revealed how OpenAI internally deploys GPTs for everything—from finance hackathons and recipe creation to travel planning and insurance research. Practical AI deployment is no longer optional—it’s now essential for competitive advantage.

This isn’t just another tech upgrade—it’s the dawn of a coding revolution that will redefine what humanity and technology can achieve together. Prepare for the era of superhuman AI coders!

#ChatGPTo3 #ChatGPTmini #ChatGPTo4 #OpenAI #SarahFriar #GoldmanSachs #AInews #CodingRevolution #ArtificialGeneralIntelligence #AGI #SuperhumanAI #FutureOfTech #AIinBusiness #AIhealthcare #AIinnovation #MachineLearning #DeepLearning #TechInterview #TechInvestment #AIdeployment

OpenAI CFO Sarah Friar on the race to build artificial general intelligence via @YouTube

2/2
@tigerplayer2002
No way

That was faster than I thought...
