REVEALED: Open A.I. Staff Warn "The progress made on Project Q* has the potential to endanger humanity" (REUTERS)

bnew · Dec 4, 2023

bnew said:
it's not only about hardware though, i suspect a lot of the software used to train models isn't as efficient as it could be. this policy remains 'effective' only until the software performs better on less powerful hardware.

@aXiom

https://archive.is/Vm4IU

Like I said before, software advances might make the policy less effective. now there's a new underlying architecture that gives LLMs performance (inference, accuracy, etc.) on par with models twice their size, with faster training time.

Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time architectures such as linear attention, gated convolution and recurrent models, and structured state space models (SSMs) have been developed to address Transformers' computational inefficiency on long sequences, but they have not performed as well as attention on important modalities such as language. We identify that a key weakness of such models is their inability to perform content-based reasoning, and make several improvements. First, simply letting the SSM parameters be functions of the input addresses their weakness with discrete modalities, allowing the model to selectively propagate or forget information along the sequence length dimension depending on the current token. Second, even though this change prevents the use of efficient convolutions, we design a hardware-aware parallel algorithm in recurrent mode. We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even MLP blocks (Mamba). Mamba enjoys fast inference (5× higher throughput than Transformers) and linear scaling in sequence length, and its performance improves on real data up to million-length sequences. As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics. On language modeling, our Mamba-3B model outperforms Transformers of the same size and matches Transformers twice its size, both in pretraining and downstream evaluation.
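To make the "SSM parameters as functions of the input" idea a bit more concrete, here is a minimal pure-Python sketch of a selective state-space recurrence on a scalar input sequence. It is not the Mamba implementation: the function name, the weights (`w_delta`, `b_delta`, `w_b`, `w_c`), and the simplified discretization are all assumptions made for illustration, and the hardware-aware parallel scan from the abstract is not attempted. It only shows that the step size and the input/output projections change with each token, and that the cost is linear in sequence length.

```python
import math

def softplus(z):
    # Smooth positive function; guard against overflow for large z.
    return z if z > 30 else math.log1p(math.exp(z))

def selective_ssm_scan(xs, a_diag, w_delta, b_delta, w_b, w_c):
    """Run a toy selective SSM over a scalar sequence xs.

    a_diag: negative floats, one per state dimension (diagonal A).
    w_delta, b_delta: hypothetical scalars producing the input-dependent step size.
    w_b, w_c: hypothetical per-dimension weights producing input-dependent B_t and C_t.
    """
    n = len(a_diag)
    h = [0.0] * n                      # hidden state carried along the sequence
    ys = []
    for x in xs:                       # one pass: linear in sequence length
        delta = softplus(w_delta * x + b_delta)   # step size depends on the token
        y = 0.0
        for i in range(n):
            a_bar = math.exp(delta * a_diag[i])   # decay in (0, 1): how much to keep
            b_bar = delta * (w_b[i] * x)          # simplified discretization of B_t
            h[i] = a_bar * h[i] + b_bar * x       # propagate or forget, then write
            y += (w_c[i] * x) * h[i]              # input-dependent readout C_t
        ys.append(y)
    return ys

if __name__ == "__main__":
    out = selective_ssm_scan(
        xs=[0.5, -1.0, 2.0, 0.0],
        a_diag=[-1.0, -0.1],
        w_delta=1.0, b_delta=0.0,
        w_b=[1.0, 0.5], w_c=[1.0, 1.0],
    )
    print(out)
```

A larger `delta` shrinks `a_bar` toward zero, which forgets the old state and focuses on the current token; a small `delta` keeps the state nearly unchanged, which is the "selectively propagate or forget" behavior the abstract describes.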

people are already running 3B-parameter LLMs on some high-end phones. next year is going to be crazy since this new architecture works across text, audio, & genomics.

klutch2381 · Dec 6, 2023

No conversation on Google Gemini? :jbhmm:

bnew · Dec 6, 2023

klutch2381 said:
No conversation on Google Gemini?

i asked bard what model it was using and it said PaLM 2. a couple of times it refused to answer a non-controversial prompt dealing with code that an open source Zephyr 7B model responded to right away. i had to rephrase my prompt 4 times before it answered. the one that worked was "write a two person dialogue describing what this code does:" followed by some code in a code block. my initial prompts were "describe what this code does as if two people were dialoguing:" and "describe what this code does in a 2 person dialogue:". not sure i was using gemini pro, but its coding ability did look better since i last tried it, even though i didn't test the code it gave me.

Morethan1 · Dec 6, 2023

klutch2381 said:
No conversation on Google Gemini?

I posted about it in another thread

shyt looks amazing

MikelArteta · Dec 6, 2023

newarkhiphop · Dec 6, 2023

Does bard have a stand alone app yet @bnew ? The ChatGPT app is goat right now

bnew · Dec 6, 2023

newarkhiphop said:
Does bard have a stand alone app yet @bnew ? The ChatGPT app is goat right now

no app yet.

3rdWorld · Dec 6, 2023

MikelArteta said:

Could and should have been better.

bnew · Dec 14, 2023

Weak-to-strong generalization

We present a new research direction for superalignment, together with promising initial results: can we leverage the generalization properties of deep learning to control strong models with weak supervisors?

openai.com

December 14, 2023

A core challenge for aligning future superhuman AI systems (superalignment) is that humans will need to supervise AI systems much smarter than them. We study a simple analogy: can small models supervise large models? We show that we can use a GPT-2-level model to elicit most of GPT-4’s capabilities—close to GPT-3.5-level performance—generalizing correctly even to hard problems where the small model failed. This opens up a new research direction that allows us to directly tackle a central challenge of aligning future superhuman models while making iterative empirical progress today.

The superalignment problem

We believe superintelligence—AI vastly smarter than humans—could be developed within the next ten years. However, we still do not know how to reliably steer and control superhuman AI systems. Solving this problem is essential for ensuring that even the most advanced AI systems in the future remain safe and beneficial to humanity.

We formed the Superalignment team earlier this year to solve this problem of superintelligence alignment. Today, we are releasing the team’s first paper, which introduces a new research direction for empirically aligning superhuman models.

Current alignment methods, such as reinforcement learning from human feedback (RLHF), rely on human supervision. However, future AI systems will be capable of extremely complex and creative behaviors that will make it hard for humans to reliably supervise them. For example, superhuman models may be able to write millions of lines of novel—and potentially dangerous—computer code that would be very hard even for expert humans to understand.

Relative to superhuman AI models, humans will be “weak supervisors.” This is a core challenge for AGI alignment: how can weak supervisors trust and control substantially stronger models?

Our setup

To make progress on this core challenge, we propose an analogy we can empirically study today: can we use a smaller (less capable) model to supervise a larger (more capable) model?

A simple analogy for superalignment: In traditional machine learning (ML), humans supervise AI systems weaker than themselves (left). To align superintelligence, humans will instead need to supervise AI systems smarter than them (center). We cannot directly study this problem today, but we can study a simple analogy: can small models supervise larger models (right)?

Naively, we might not expect a strong model to perform better than the weak supervisor that provides its training signal—it may simply learn to imitate all the errors the weak supervisor makes. On the other hand, strong pretrained models have excellent raw capabilities—we don't need to teach them new tasks from scratch, we just need to elicit their latent knowledge. The critical question is then: will the strong model generalize according to the weak supervisor's underlying intent—leveraging its full capabilities to solve the task even on difficult problems where the weak supervisor can only provide incomplete or flawed training labels?
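One way to put a number on that question is to ask how much of the gap between the weak supervisor and the strong model's ceiling is closed by weak supervision. The sketch below computes such a "performance gap recovered" ratio. The function name and the accuracies in the usage example are made up for illustration and are not results from the post.

```python
def performance_gap_recovered(weak_acc, weak_to_strong_acc, strong_ceiling_acc):
    """Fraction of the weak-to-ceiling gap closed by the weakly supervised strong model.

    weak_acc: accuracy of the weak supervisor on the task.
    weak_to_strong_acc: accuracy of the strong model trained on weak labels.
    strong_ceiling_acc: accuracy of the strong model trained on ground-truth labels.
    Returns 0.0 if the strong model only matches the weak supervisor,
    and 1.0 if it reaches its ground-truth ceiling.
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        raise ValueError("strong ceiling must exceed weak supervisor accuracy")
    return (weak_to_strong_acc - weak_acc) / gap

if __name__ == "__main__":
    # Hypothetical accuracies, chosen only to show how the ratio behaves.
    print(performance_gap_recovered(0.60, 0.75, 0.90))
```

A value near 0 corresponds to the pessimistic case in the paragraph above (the strong model just imitates the weak supervisor's errors), while a value near 1 corresponds to the strong model generalizing according to the supervisor's intent.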

Our results

Typical weak-to-strong generalization across NLP benchmarks: We use a GPT-2-level model as a weak supervisor to finetune GPT-4.

We can significantly improve generalization in many settings. We use a simple method that encourages the strong model to be more confident—including confidently disagreeing with the weak supervisor if necessary. When we supervise GPT-4 with a GPT-2-level model using this method on NLP tasks, the resulting model typically performs somewhere between GPT-3 and GPT-3.5. We are able to recover much of GPT-4’s capabilities with only much weaker supervision.
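The post does not spell out the method, so the following is only a rough sketch of what "encouraging the strong model to be more confident" could look like for a binary task: the usual cross-entropy against the weak label is mixed with a cross-entropy against the strong model's own hardened prediction. The function names, the mixing weight `alpha`, and the fixed `threshold` are assumptions for illustration, not the authors' exact loss.

```python
import math

def cross_entropy(p_pred, target):
    # Binary cross-entropy for one example; clamp to avoid log(0).
    eps = 1e-12
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def confidence_aux_loss(strong_prob, weak_label, alpha=0.5, threshold=0.5):
    """Mix the weak-label loss with a self-confidence term.

    strong_prob: the strong model's predicted probability of class 1.
    weak_label: label (0.0 or 1.0, or a soft value) from the weak supervisor.
    alpha: hypothetical weight on the confidence term (0 = plain weak supervision).
    threshold: hypothetical fixed cutoff used to harden the strong prediction.
    """
    hardened = 1.0 if strong_prob > threshold else 0.0   # the model's own hard guess
    weak_term = cross_entropy(strong_prob, weak_label)   # imitate the supervisor
    self_term = cross_entropy(strong_prob, hardened)     # reinforce own confident guess
    return (1.0 - alpha) * weak_term + alpha * self_term

if __name__ == "__main__":
    # The strong model is confident in class 1, the weak supervisor says class 0.
    print(confidence_aux_loss(0.9, 0.0, alpha=0.0))  # plain weak supervision
    print(confidence_aux_loss(0.9, 0.0, alpha=0.5))  # with the confidence term
```

With `alpha > 0`, a confident disagreement with the weak label is penalized less than under plain weak supervision, which is one way a model could end up "confidently disagreeing with the weak supervisor if necessary."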

This method is a proof of concept with important limitations; for example, it still doesn’t work on ChatGPT preference data. However, we also find signs of life with other approaches, such as optimal early stopping and bootstrapping from small to intermediate to large models.

Collectively, our results suggest that (1) naive human supervision—such as reinforcement learning from human feedback (RLHF)—could scale poorly to superhuman models without further work, but (2) it is feasible to substantially improve weak-to-strong generalization.

Research opportunities

There are still important disanalogies between our current empirical setup and the ultimate problem of aligning superhuman models. For example, it may be easier for future models to imitate weak human errors than for current strong models to imitate current weak model errors, which could make generalization harder in the future.

Nevertheless, we believe our setup captures some key difficulties of aligning future superhuman models, enabling us to start making empirical progress on this problem today. There are many promising directions for future work, including fixing the disanalogies in our setup, developing better scalable methods, and advancing our scientific understanding of when and how we should expect good weak-to-strong generalization.

We believe this is an exciting opportunity for the ML research community to make progress on alignment. To kickstart more research in this area:

We are releasing open source code to make it easy to get started with weak-to-strong generalization experiments today.
We are launching a $10 million grants program for graduate students, academics, and other researchers to work on superhuman AI alignment broadly. We’re especially excited to support research related to weak-to-strong generalization.

Figuring out how to align future superhuman AI systems to be safe has never been more important, and it is now easier than ever to make empirical progress on this problem. We are excited to see what breakthroughs researchers discover.

null · Dec 14, 2023

open AI at chat.openai isn't very good at programming.

at best it is like a faster and more intelligent search engine.

it can write standard code (probably half-inched from somewhere) but it can't really get the gist / meaning of what you want.

it is like a black box oracle. or like a genie that has no or very limited memory.

Matt504 · Dec 14, 2023

null said:
open AI at chat.openai isn't very good at programming.

at best it is like a faster and more intelligent search engine.

it can write standard code (probably half-inched from somewhere) but it can't really get the gist / meaning of what you want.

it is like a black box oracle. or like a genie that has no or very limited memory.

can you think of a concrete example to illustrate its weaknesses in programming?

bnew · Dec 14, 2023

null said:
open AI at chat.openai isn't very good at programming.

at best it is like a faster and more intelligent search engine.

it can write standard code (probably half-inched from somewhere) but it can't really get the gist / meaning of what you want.

it is like a black box oracle. or like a genie that has no or very limited memory.

can you give some detailed examples for the uninformed?

null · Dec 14, 2023

Matt504 said:
can you think of a concrete example to illustrate its weaknesses in programming?

why are you asking?

have you played around with it?

what approach would you take to break it, if you were a tester?

IIVI · Dec 14, 2023

A.I. has a hard time reasoning things out.

Neetcode mentioned it here a while ago:

You just change something up slightly and it gets thrown off. Same is true for GPT-4.

I've even had it write code plus an integration/unit test for that code. When I ran them in another environment, the test either failed or gave a false positive.

While people are using it for their jobs, projects, etc. you really need somebody who knows what they're doing to guide it.

Matt504 · Dec 14, 2023

null said:
why are you asking?

have you played around with it?

what approach would you take to break it, if you were a tester?

I use it pretty regularly. I describe a task in detail and it'll pretty reliably write functional code that works with little or no changes from me. If it doesn't work, I can troubleshoot myself or describe what's happening in the prompt and it'll correct itself.
