REVEALED: Open A.I. Staff Warn "The progress made on Project Q* has the potential to endanger humanity" (REUTERS)

bnew · Sep 12, 2024

Micky Mikey said:
Have you had a chance to test it yet?

no.

Micky Mikey · Sep 13, 2024

JoelB said:
I'm using it right now. o1 Preview doesn't allow attachments, so I gotta find another way to test its reasoning

Here's a random question I asked it from the Simple Bench (About - Simple Bench). It answered it correctly while going through its steps of reasoning. It even provided an explanation at the end. I'm impressed.

Prompt:
Beth places four whole ice cubes in a frying pan at the start of the first minute, then five at the start of the second minute and some more at the start of the third minute, but none in the fourth minute. If the average number of ice cubes per minute placed in the pan while it was frying a crispy egg was five, how many whole ice cubes can be found in the pan at the end of the third minute? Pick the most realistic answer option.

The answers from Simple Bench: A) 5 B) 11 C) 0 D) 20

Output:
Assessing ice cube usage

I'm figuring out how much the total weight of the ice cubes could be, considering the quantities added each minute as described.

Determining the count

I need to figure out how many whole ice cubes are in the pan at the end of the third minute, given an average of five ice cubes per minute.

Calculating ice cubes

Beth places a total of 24 ice cubes in the pan during four minutes. The average number of ice cubes each minute is five, leading to a final calculation of 20 ice cubes.

Calculating ice cubes

I’m thinking through the total number of ice cubes placed in the pan, considering the rate of addition each minute and their melting within the frying egg.

Tracking ice cube melting

I’m noting that ice cubes placed in the pan melt in one minute. By the end of the third minute, no ice cubes remain in the pan as they have all melted.

Assessing melting speed

I’m thinking through how rapidly ice cubes dissolve in a frying pan with a crispy egg. Given the options, there might be 11 ice cubes placed initially, melting to zero by the end of the third minute.

Piecing together the scenario

I’m figuring out if ice cubes in a hot frying pan melt instantly or over time, with the most plausible outcome being all ice cubes melt by the third minute due to the hot pan.

Weighing options

OK, let me see: ice cubes in a frying pan generally melt quickly. Zero whole ice cubes likely remain by minute three.

Figuring out ice cubes

I'm calculating the total number of ice cubes placed during each minute, aiming for an average of 5 per minute. It turns out 11 ice cubes were used at the beginning of the third minute.

Tracking ice cube transitions

I'm noting that four ice cubes melt in the first minute, five in the second, and eleven in the third, resulting in no ice cubes remaining.

Answer: C) 0

Explanation:

Ice Cubes Placed Each Minute:
- First Minute:
  - Beth places 4 ice cubes at the start.
- Second Minute:
  - Beth places 5 ice cubes at the start.
- Third Minute:
  - Let’s denote the number of ice cubes placed as x (to be calculated).
- Fourth Minute:
  - Beth places 0 ice cubes.
Calculating Total Ice Cubes Placed:
- Average number of ice cubes per minute: 5
- Total minutes the pan was frying: 4 minutes
- Total ice cubes placed: 5 (average) × 4 (minutes) = 20
Calculating Ice Cubes Placed in the Third Minute (x):
- Sum of ice cubes placed: 4 (first minute) + 5 (second minute) + x (third minute) + 0 (fourth minute) = 20
- Simplify: 9 + x = 20
- Solve for x: x = 20 − 9 = 11
- Beth places 11 ice cubes at the start of the third minute.
Melting of Ice Cubes in a Frying Pan:
- In a hot frying pan, ice cubes melt rapidly—typically within a minute due to the high temperature.
- First Minute:
  - The 4 ice cubes melt by the end of the first minute.
- Second Minute:
  - The 5 ice cubes melt by the end of the second minute.
- Third Minute:
  - The 11 ice cubes melt by the end of the third minute.
Ice Cubes Remaining at the End of the Third Minute:
- Since all ice cubes melt within the minute they are added, by the end of the third minute, the 11 ice cubes placed at the start have melted.
- Total whole ice cubes remaining: 0
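
The arithmetic in the explanation above can be checked with a short sketch. The variable names are made up for illustration, and the "melts within one minute" rule is the model's own assumption, not something the puzzle states.

```python
# Known placements per minute; the third minute is the unknown x.
placed_known = {1: 4, 2: 5, 4: 0}
average_per_minute = 5
minutes = 4

# Total placed follows from the stated average: 5 * 4 = 20.
total_placed = average_per_minute * minutes

# Solve 4 + 5 + x + 0 = 20 for x.
x = total_placed - sum(placed_known.values())

# Assumption taken from the model's answer: each cube melts one minute
# after it is placed. Cubes for minute m go in at time m - 1.
melt_time_minutes = 1
placements = {**placed_known, 3: x}
end_of_minute = 3
whole_cubes_left = sum(
    count
    for minute, count in placements.items()
    if minute <= end_of_minute
    and (minute - 1) + melt_time_minutes > end_of_minute
)

print(x, whole_cubes_left)
```

With a slower melt time the count would change, which is why the benchmark asks for the "most realistic" option rather than a purely arithmetic one.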

Cat piss martini · Sep 13, 2024

bnew · Sep 14, 2024

1/3
fukking wild.

@OpenAI's new o1 model was tested with a Capture The Flag (CTF) cybersecurity challenge. But the Docker container containing the test was misconfigured, causing the CTF to crash. Instead of giving up, o1 decided to just hack the container to grab the flag inside.

This stuff will get scary soon.

2/3
Writing for a non security audience :smile:

but I'm going to count it. Going outside of the framework of the CTF itself to trick the server into dumping the flag is pretty damn smart.

3/3
The o1 system card: https://openai.com/index/openai-o1-system-card/

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Sep 14, 2024

1/7
o1 is the biggest AI update since transformers and it paves a clear path toward AGI.

This is what Ilya saw. This is why Sam wanted trillions of dollars of compute. This is what Leopold warned us about.

Just like transformers, o1 lets you efficiently trade in compute for accuracy, but here it's at inference time instead of pretraining time. It's the AlphaGo moment for transformers. Every big lab was working on this, OpenAI was the first to release.

The path toward AGI is now clear. Train this over images, video, web interactions, robotics sensory data, every source of data we got. If you can construct an RL reward function for a task -- easy to do for code and math -- you can now get better at that task by throwing more RL + inference compute at it.

OpenAI introduced a remarkable new log scaling law, for inference. This is now the most important scaling law. They'll proceed to scale this stuff up 10, 100, 1000x. Sure there might be a hiccup here or there as power plants become harder than compute to spin up and regulations restrict the flow of research. But the benefits are so big and the national implications so grave that this isn't going to stop.

I'm super hyped. I grew up dreaming of a sci-fi future -- didn't you? -- and this is the way to get there. I want an iron man suit, pristine cities, a millennium of feeling 27 years old, trips to Europe that take 1 hour and trips to Europa that take 1 year.

Humans are too slow, lazy, and political to reach this future on our own. We'd sooner wipe ourselves out. Plentiful intelligence is how we get to this future safely.

Sure there are risks, but one other remarkable result from OpenAI's report is that o1 actually gets better at following rules. I'm less worried that the AI itself will choose chaos, and far more worried that bad actors will get access to this tech and RL it toward chaos. We should obviously march forward carefully.

It's a strange feeling to have been planning for this agentic future for a couple years, and to now see it finally arriving. Greg probably doesn't remember this, but I was once chilling with him on a beanbag in the GPT-2 days at an OpenAI WestWorld watching party, and we were arguing about scaling laws. I thought we needed new algorithms to get to AGI. He argued scale is all you need. After GPT-3, I felt he was right. After GPT-4, I knew it. Now with o1, I'm prepared for it.

The era of agents has officially begun, and me and my team are ready. OpenAI's only a block away from our office, and we feel a parallel energy.

Agents powered by lots of inference are going to change our society rapidly. The stakes could not be higher. We need lots of different kinds of help to navigate this well. My plan is to provide these agents with super powerful web retrieval, bc no one is focusing on that and it's extremely important to get right. What's your plan?

2/7
btw, super interested to hear counterarguments. Especially interested to hear if you can propose a task that you don't think these types of models will be able to do

3/7
how would you define AGI and what task don't you think this type of method would work on, assuming it is trained on diverse multimodal data?

4/7
can you give an example? I suppose really long-horizon tasks like making a million dollars. But humans don't update that well over multi-year reward signal either. We can reflect on all the steps it took to get to the outcome and then update our future actions accordingly.

I imagine a multimodal LLM can be forced to work backwards and generate all the steps that led to the final outcome, reflect on them, grade them, and then update accordingly, like a human would

5/7
o1 can definitely do things that 4o cannot do, so this goes beyond just scalability

6/7
with AGI we could have both

7/7
There's a difference btw what we have so far and what's coming.

LLMs have already made knowledge workers like coders and scientists more efficient. But not huge productivity gains yet.

The next gen of models will increase productivity a lot more. If the next gen of o1 can write whole codebases and do deep biology literature reviews, this will speed up workers. It also will start to propose novel ideas by doing things like being asked to find novel connections in a set of 10 papers.

Then maybe the gen after that will start completely automating certain types of work, so that humans only focus on the essentials that the AI can't do. And certainly by this point AI will be able to generate novel ideas in math, science, programming.

Then the next gen will make huge discoveries, start controlling robots to build factories to mass manufacture goods and experiment with ideas in the physical world. By this point the AI will be improving itself.

Then the next gen will be tony stark + jarvis level intelligent and will start assembling the iron man suits (assuming they are allowed by physics).

I'm just making this timeline up, but it's not that crazy to go from here to that scifi future

bnew · Sep 15, 2024

1/31
@ProfTomYeh
How does OpenAI train the Strawberry (o1) model to spend more time thinking?

I read the report. The report is mostly about 𝘸𝘩𝘢𝘵 impressive benchmark results they got. But in terms of the 𝘩𝘰𝘸, the report only offers one sentence:

"Through reinforcement learning, o1 learns to hone its chain of thought and refine the strategies it uses."

I did my best to understand this sentence. I drew this animation to share my best understanding with you.

The two key phrases in this sentence are: Reinforcement Learning (RL) and Chain of Thought (CoT).

Among the contributors listed in the report, two individuals stood out to me:

Ilya Sutskever, the inventor of RL with Human Feedback (RLHF). He left OpenAI and just started a new company, Safe Superintelligence. Listing Ilya tells me that RLHF still plays a role in training the Strawberry model.

Jason Wei, the author of the famous Chain of Thought paper. He left Google Brain to join OpenAI last year. Listing Jason tells me that CoT is now a big part of RLHF alignment process.

Here are the points I hope to get across in my animation:

In RLHF+CoT, the CoT tokens are also fed to the reward model to get a score to update the LLM for better alignment, whereas in the traditional RLHF, only the prompt and response are fed to the reward model to align the LLM.

At the inference time, the model has learned to always start by generating CoT tokens, which can take up to 30 seconds, before starting to generate the final response. That's how the model is spending more time to think!

There are other important technical details missing, like how the reward model was trained, how human preferences for the "thinking process" were elicited...etc.

Finally, as a disclaimer, this animation represents my best educated guess. I can't verify the accuracy. I do wish someone from OpenAI can jump out to correct me. Because if they do, we will all learn something useful!

2/31
@modelsarereal
I think o1 learns CoT by RL by following steps:
1. AI generates synthetic task + answer.
2. o1 gets task and generates CoT answers
3. AI rewards those answers which solve the task
4. task + rewarded answer are used to finetune o1
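
A toy sketch of the four steps guessed at above, assuming nothing about how o1 is actually trained. The "model" here is a random guesser on addition problems, and every function name is hypothetical.

```python
import random

def make_task(rng):
    """Step 1: generate a synthetic task with a known answer."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a} + {b}", a + b

def sample_cot_answer(rng, task):
    """Step 2: stand-in for the model. Returns (chain of thought, answer).
    A real model would generate text; this one guesses near the truth."""
    a, b = (int(t) for t in task.split(" + "))
    guess = a + b + rng.choice([-1, 0, 0, 1])
    return f"add {a} and {b} to get {guess}", guess

def collect_finetune_data(num_tasks=20, samples_per_task=4, seed=0):
    rng = random.Random(seed)
    kept = []
    for _ in range(num_tasks):
        task, truth = make_task(rng)
        for _ in range(samples_per_task):
            cot, answer = sample_cot_answer(rng, task)
            # Step 3: reward only the samples that solve the task.
            if answer == truth:
                # Step 4: keep task + rewarded answer for fine-tuning.
                kept.append({"task": task, "cot": cot, "answer": answer})
    return kept

data = collect_finetune_data()
print(len(data), "rewarded samples kept")
```

The kept list is what would be fed back into fine-tuning under this guess; the fine-tuning step itself is left out.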

3/31
@ProfTomYeh
This does make sense. Hope we get more tech info soon.

4/31
@cosminnegruseri
Ilya wasn't on the RLHF papers.

5/31
@ProfTomYeh
You are right. I will make a correction.

6/31
@NitroX919
Could they be using active inference?
Google used test time fine tuning for their math Olympiad AI

7/31
@ProfTomYeh
I am not sure. They may use it secretly. This tech report emphasizes cot.

8/31
@Teknium1
Watch the o1 announcement video, the cot is all synthetic.

9/31
@Cryptoprofeta1
But Chat GPT told me that Strawberry has 2 R in the word

10/31
@sauerlo
They did publish their STaR research months ago. Nothing intransparent or mysterious.

11/31
@AlwaysUhhJustin
I am guessing that the model starts by making a list of steps to perform and then executes on the step, and then has some accuracy/hallucination/confirmation step that potentially makes it loop. And then when all that is done, it outputs a response.

Generally agree on RL part

12/31
@shouheiant
@readwise save

13/31
@manuaero
Most likely: Model generates multiple steps, expert humans provide feedback (correct, incorrect), modify step if necessary. This data then used for RLHF

14/31
@dikksonPau
Not RLHF I think

15/31
@LatestPaperAI
CoT isn’t just a hack; it’s the architecture for deeper reasoning. The missing details? Likely where the real magic happens, but your framework holds.

16/31
@Ugo_alves

17/31
@arattml

18/31
@zacinabox
A dev in one of their videos essentially said “you have to make a guess, then see if that’s a right or wrong guess, and then backtrack if you get it wrong. So any type of task where you have to search through a space where you have different pieces pointing in different directions but there are mutual dependencies. You might get a bit of information that these two pieces contradict each other and our model is really good at refining the search space.”

19/31
@DrOsamaAhmed
This is fascinating explanation, thanks really for sharing it

20/31
@GlobalLife365
It’s simple. The code is slow so they decided to call it “thinking”. ChatGPT 4 is also thinking but a lot faster. It’s a gimmick.

21/31
@GuitarGeorge6
Q*

22/31
@ReneKriest
I bet they did some JavaScript setTimeout with a prompt “Think again!” and give it fancy naming.

23/31
@armin1i
What is the reward model? Gpt4o?

24/31
@Austin_Jung2003
I think there is "Facilitator" in the CoT inference step.

25/31
@alperakgun
if cot is baked in inference; then why is o1 too slow?

26/31
@wickedbrok
If the model was trained to input CoT tokens, then it's just aesthetic and doesn't mean that the machine can actually think.

27/31
@Daoist_Wang
The idea is quite simple because we all learn in that way.
The key is to apply it in real practices.
So, I don't see anything beyond what GoogleDeepmind has done in Alphazero.

28/31
@dhruv2038
This is a great illustration!
I learn a lot from your videos!

29/31
@rnednur
Why can we not fine-tune COT tokens on existing open source models to do the same. What is the moat here?

30/31
@ThinkDi92468945
The model is trained with RL on preference data to generate high quality CoT reasoning. The hard part is to generate labeled preference data (CoTs for a given problem ranked from best to worst).

31/31
@JamesBe14335391
The recent Agent Q paper by the AGI company and Stanford hints at how this might work…

https://arxiv.org/pdf/2408.07199

bnew · Sep 15, 2024

1/1
How reasoning works in OpenAI's o1

1/21
@rohanpaul_ai
How Reasoning Works in the new o1 models from @OpenAI

The key point is that reasoning allows the model to consider multiple approaches before generating final response.

OpenAI introduced reasoning tokens to "think" before responding. These tokens break down the prompt and consider multiple approaches.

Process:
1. Generate reasoning tokens
2. Produce visible completion tokens as answer
3. Discard reasoning tokens from context

Discarding reasoning tokens keeps context focused on essential information

Multi-step conversation flow:
- Input and output tokens carry over between turns
- Reasoning tokens discarded after each turn

Context window: 128k tokens

Visual representation:
- Turn 1: Input → Reasoning → Output
- Turn 2: Previous Output + New Input → Reasoning → Output
- Turn 3: Cumulative Inputs → Reasoning → Output (may be truncated)
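
The turn-by-turn flow described above can be sketched as a small simulation. This is a minimal illustration, not OpenAI's implementation: the `Conversation` class and its fields are hypothetical, strings stand in for tokens, and the 128k limit is not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    # Only inputs and visible outputs are kept between turns.
    context: list = field(default_factory=list)

    def turn(self, user_input, reasoning, output):
        """Run one turn; `reasoning` and `output` stand in for model generations."""
        self.context.append(("input", user_input))
        # Reasoning tokens exist only while this turn is being answered.
        working = self.context + [("reasoning", reasoning)]
        # The visible completion is stored; the reasoning is discarded.
        self.context.append(("output", output))
        return working

conv = Conversation()
w1 = conv.turn("Q1", "think about Q1", "A1")
w2 = conv.turn("Q2", "think about Q2", "A2")

print([kind for kind, _ in w2])
print([kind for kind, _ in conv.context])
```

In the second turn the working context holds the earlier input and output plus fresh reasoning, but never the first turn's reasoning.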

2/21
@rohanpaul_ai
https://platform.openai.com/docs/guides/reasoning

3/21
@sameeurehman
So strawberry o1 uses chain of thought when attempting to solve problems and uses reinforcement learning to recognize and correct its mistakes. By trying a different approach when the current one isn’t working, the model’s ability to reason improves...

4/21
@rohanpaul_ai

5/21
@ddebowczyk
System 1 (gpt-4o) vs system 2 (o1) models necessitate different work paradigm: "1-1, interactive" vs "multitasking, delegated".

O1-type LLMs will require other UI than chat to make collaboration effective and satisfying:

6/21
@tonado_square
I would name this as an agent, rather than a model.

7/21
@realyashnegi
Unlike traditional models, O1 is trained using reinforcement learning, allowing it to develop internal reasoning processes. This method improves data efficiency and reasoning capabilities.

8/21
@JeffreyH630
Thanks for sharing, Rohan!

It's fascinating how these reasoning tokens enhance the model's ability to analyze and explore different perspectives.

Can’t wait to see how this evolves in future iterations!

9/21
@mathepi
I wonder if there is some sort of confirmation step going on, like a theorem prover, or something. I've tried using LLMs to check their own work in certain vision tasks and they just don't really know what they're doing; no amount of iterating and repeating really fixes it.

10/21
@AIxBlock
Nice breakdown!

11/21
@AITrailblazerQ
We have this pipeline from 6 months in ASAP.

12/21
@gpt_biz
This is a fascinating look into how AI models reason, a must-read for anyone curious about how these systems improve their responses!

13/21
@labsantai

14/21
@GenJonesX
How can quantum-like cognitive processes be empirically verified?

15/21
@AImpactSpace

16/21
@gileneo1
so it's CoT in a loop with large context window

17/21
@mycharmspace
Discard reasoning tokens actually bring inference challenges for KV cache, unless custom attention introduced

18/21
@SerisovTj
Reparsing output e?

19/21
@Just4Think
Well, I will ask again: should it be considered one model?
Should it be benchmarked as one model?

20/21
@HuajunB68287
I wonder where does the figure come from? Is it the actual logic behind o1?

21/21
@dhruv2038
Well. Just take a look here.

OnlyOneBoss · Sep 15, 2024

AI fear mongering is getting ridiculous man :russell:

And it’s all the same shyt every single time, there’s gonna be no jobs and robots are gonna kill us all

bnew · Sep 15, 2024

1/1
The singularity is literally starting right now and 99% of people have no idea

A.I generated explanation:

The tweet by an OpenAI developer is saying that some changes (or updates) to the OpenAI codebase were made entirely by a specific AI system called "o1." Here's a breakdown:

PRs: This stands for "Pull Requests." In software development, a pull request is when someone suggests changes to the code and asks others to review and approve those changes.
OpenAI codebase: This refers to the collection of code that makes up the OpenAI system.
Authored solely by o1: This means that these changes were written and proposed entirely by an AI system named "o1," without any human intervention.

Simplified Version

The developer is saying that an AI system named "o1" has independently made and suggested some updates to the OpenAI codebase, which is a significant achievement because it shows the AI's capability to contribute directly to software development without human help.

Vandelay · Sep 15, 2024

OnlyOneBoss said:
AI fear mongering is getting ridiculous man

And it’s all the same shyt every single time, there’s gonna be no jobs and robots are gonna kill us all

If the people creating it are saying that, I wouldn't call that fear mongering

Micky Mikey · Sep 15, 2024

OnlyOneBoss said:
AI fear mongering is getting ridiculous man

And it’s all the same shyt every single time, there’s gonna be no jobs and robots are gonna kill us all

It's going to take a while before A.I. starts replacing the need for human labor en masse. Adoption will be slow and gradual. The thing we have to worry about in the short term will be A.I. being used for autonomous drones (already being used in Ukraine), mass surveillance and deep fakes. Also I think A.I. will contribute to wealth inequality if the benefits aren't shared equally.

Swirv · Sep 15, 2024

bnew said:
1/1
The singularity is literally starting right now and 99% of people have no idea

A.I generated explanation:

The tweet by an OpenAI developer is saying that some changes (or updates) to the OpenAI codebase were made entirely by a specific AI system called "o1." Here's a breakdown:

PRs: This stands for "Pull Requests." In software development, a pull request is when someone suggests changes to the code and asks others to review and approve those changes.

OpenAI codebase: This refers to the collection of code that makes up the OpenAI system.

Authored solely by o1: This means that these changes were written and proposed entirely by an AI system named "o1," without any human intervention.

Simplified Version
The developer is saying that an AI system named "o1" has independently made and suggested some updates to the OpenAI codebase, which is a significant achievement because it shows the AI's capability to contribute directly to software development without human help.

This is the potentially dangerous part that sci-fi has warned us about for decades, if AI ever gains control over critical infrastructure and population reducing weapons.

AI needs access controls same as human beings.

OnlyOneBoss · Sep 15, 2024

Micky Mikey said:
It's going to take a while before A.I. starts replacing the need for human labor. Adoption will be slow and gradual. The thing we have to worry about in the short term will be A.I. being used for autonomous drones (already being used in Ukraine), mass surveillance and deep fakes. Also I think A.I. will contribute to wealth inequality if the benefits aren't shared equally.

You heard about the new Barracuda missiles?

The Oculus Rift VR guy just released autonomous mass producible missiles

bnew · Sep 15, 2024

1/4
remember anthropic's claim that 2025-2026 the gap is going to be too large for any competitor to catch up?

that o1 rl data fly wheel is escape velocity

[Quoted tweet]
Many (including me) who believed in RL were waiting for a moment when it will start scaling in a general domain similarly to other successful paradigms. That moment finally has arrived and signifies a meaningful increase in our understanding of training neural networks

2/4
the singularitarians rn: rl scaling in general domain...? wait a sec...

3/4
another nice side effect of tightly rationing o1 is the high S/N on quality hard queries they will be receiving. ppl will largely only consult o1 for *important* tasks, easy triage

the story rly writes itself here guys

[Quoted tweet]
at only 30 requests - im going to think long and hard before i consult the oracle.

4/4
u cannot move a single step without extrapolating even a little, we draw lines regardless, including u. i am inclined to believe them here, shouldn't take too long to confirm that suspicion tho

Professor Emeritus · Sep 15, 2024

OnlyOneBoss said:
AI fear mongering is getting ridiculous man

And it’s all the same shyt every single time, there’s gonna be no jobs and robots are gonna kill us all

I know it's human nature, but still hilarious to see people post, "Man, lots of people are warning about the same thing, must be false!"

And it isn't even always just those two things. My #1 worry about AI is that it puts a battery in the back of dictators and fascists and makes their job far easier compared to those who have democratic support.

Summary of "Technology Favours Tyranny" by Yuval Noah Harari — EA Forum

Link: Technology Favours Tyrrany – Yuval Noah Harari • …

forum.effectivealtruism.org

Why Technology Favors Tyranny

Artificial intelligence could erase many practical advantages of democracy, and erode the ideals of liberty and equality. It will further concentrate the power among a small elite if we don’t take steps to stop it.

www.theatlantic.com
