Large Language Models News & Discussions

bnew · Aug 15, 2024

1/1
Now xAI is at the frontier

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/1
BREAKING: Here's an early look at Grok 2.0 features and abilities!

It's better at coding, writing, and generating news! It'll also generate images using the FLUX.1 model!

bnew · Aug 19, 2024

Google quietly opens Imagen 3 access to all U.S. users

Google quietly releases Imagen 3, its advanced AI image generator, to all U.S. users, sparking debates on AI ethics and creativity as it competes with xAI's unrestricted Grok-2.

venturebeat.com

Michael Nuñez@MichaelFNunez

August 15, 2024 10:42 AM

Credit: Google Imagen

Google has quietly made its latest text-to-image AI model, Imagen 3, available to all U.S. users through its ImageFX platform and published a research paper detailing the technology.

This dual release marks a significant expansion of access to the AI tool, which was initially announced in May at Google I/O and limited to select Vertex AI users in June.

1/1
Google announces Imagen 3

discuss: Paper page - Imagen 3

We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.

The company’s research team stated in their paper, published on arxiv.org, “We introduce Imagen 3, a latent diffusion model that generates high-quality images from text prompts. Imagen 3 is preferred over other state-of-the-art models at the time of evaluation.”

This development comes in the same week as xAI’s launch of Grok-2, a rival AI system with notably fewer restrictions on image generation, highlighting the divergent approaches to AI ethics and content moderation within the tech industry.

Imagen 3: Google’s latest salvo in the AI arms race

Google’s release of Imagen 3 to the broader U.S. public represents a strategic move in the intensifying AI arms race. However, the reception has been mixed. While some users praise its improved texture and word recognition capabilities, others express frustration with its strict content filters.

One user on Reddit noted, “Quality is much higher with amazing texture and word recognition, but I think it’s currently worse than Imagen 2 for me.” They added, “It’s pretty good, but I’m working harder with higher error results.”

The censorship implemented in Imagen 3 has become a focal point of criticism. Many users report that seemingly innocuous prompts are being blocked. “Way too censored I can’t even make a cyborg for crying out loud,” another Reddit user commented. Another said, “[It] denied half my inputs, and I’m not even trying to do anything crazy.”

These comments highlight the tension between Google’s efforts to ensure responsible AI use and users’ desires for creative freedom. Google has emphasized its focus on responsible AI development, stating, “We used extensive filtering and data labeling to minimize harmful content in datasets and reduced the likelihood of harmful outputs.”

Grok-2: xAI’s controversial unrestricted approach

In stark contrast, xAI’s Grok-2, integrated within Elon Musk’s social network X and available through premium subscription tiers, offers image generation capabilities with virtually no restrictions. This has led to a flood of controversial content on the platform, including manipulated images of public figures and graphic depictions that other AI companies typically prohibit.

The divergent approaches of Google and xAI underscore the ongoing debate in the tech industry about the balance between innovation and responsibility in AI development. While Google’s cautious approach aims to prevent misuse, it has led to frustration among some users who feel creatively constrained. Conversely, xAI’s unrestricted model has reignited concerns about the potential for AI to spread misinformation and offensive content.

Industry experts are closely watching how these contrasting strategies will play out, particularly as the U.S. presidential election approaches. The lack of guardrails in Grok-2’s image generation capabilities has already raised eyebrows, with many speculating that xAI will face increasing pressure to implement restrictions.

The future of AI image generation: Balancing creativity and responsibility

Despite the controversies, some users have found value in Google’s more restricted tool. A marketing professional on Reddit shared, “It’s so much easier to generate images via something like Adobe Firefly than digging through hundreds of pages of stock sites.”

As AI image generation technology becomes more accessible to the public, the industry faces critical questions about the role of content moderation, the balance between creativity and responsibility, and the potential impact of these tools on public discourse and information integrity.

The coming months will be crucial for both Google and xAI as they navigate user feedback, potential regulatory scrutiny, and the broader implications of their technological choices. The success or failure of their respective approaches could have far-reaching consequences for the future development and deployment of AI tools across the tech industry.

bnew · Aug 19, 2024

Runway’s Gen-3 Alpha Turbo is here and can make AI videos faster than you can type

Gen-3 Alpha Turbo should be priced at 5 credits per 1 second of video per Runway's statement that it is 50% less.

venturebeat.com

Carl Franzen@carlfranzen

August 15, 2024 9:04 AM

Credit: VentureBeat made with Midjourney

After showing it off in a preview late last month, Runway ML has officially released Gen-3 Alpha Turbo, the latest version of the AI video generation model that it claims is seven times faster and half the cost of its predecessor, Gen-3 Alpha.

The goal? Make AI video production more accessible to a wider audience across all subscription plans, including free trials.

The New York City-based company announced the news on its X account, writing: “Gen-3 Alpha Turbo Image to Video is now available and can generate 7x faster for half the price of the original Gen-3 Alpha. All while still matching performance across many use cases. Turbo is available for all plans, including trial for free users. More improvements to the model, control mechanisms and possibilities for real-time interactivity to come.”

1/1
Gen-3 Alpha Turbo Image to Video is now available and can generate 7x faster for half the price of the original Gen-3 Alpha. All while still matching performance across many use cases. Turbo is available for all plans, including trial for free users.

More improvements to the model, control mechanisms and possibilities for real-time interactivity to come.

Gen-3 Alpha Turbo builds on the already impressive capabilities of Runway’s Gen-3 Alpha, which gained attention for its realistic video generation.

However, Runway has pushed the boundaries even further with this latest release, prioritizing speed without compromising on performance. According to Runway co-founder and CEO Cristóbal Valenzuela, the new Turbo model means “it now takes me longer to type a sentence than to generate a video.”

1/1
it now takes me longer to type a sentence than to generate a video.

This leap in speed addresses a critical issue with AI video generation models—time lag—allowing for near real-time video production.

As a result, users can expect a more seamless and efficient workflow, particularly in industries where quick turnaround times are essential.

Broad accessibility and aggressively low pricing

Runway’s decision to lower the cost of using Gen-3 Alpha Turbo aligns with its strategy to encourage more widespread adoption of its technology.

While the regular Gen-3 Alpha is priced at 10 credits per second of generated video, Gen-3 Alpha Turbo should cost 5 credits per second, given Runway’s statement that it is 50% cheaper.

Credits can be purchased in bundles starting at 1,000 credits on the Runway website or as part of monthly or annual subscription tiers. It costs $10 for 1,000 credits, or $0.01 per credit.
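The per-second cost implied by these figures can be worked out directly. The sketch below is only a rough calculation: it assumes the $0.01-per-credit bundle rate quoted above and the 5-credit Turbo rate that the article infers from Runway's "half the price" statement. The function and dictionary names are made up for illustration.

```python
# Rough cost arithmetic from the figures quoted in the article.
# Assumption: the Turbo rate of 5 credits/second is inferred from
# Runway's "half the price" statement, not from a published price list.
CENTS_PER_CREDIT = 1  # $10 per 1,000 credits = $0.01 = 1 cent per credit

CREDITS_PER_SECOND = {
    "gen3_alpha": 10,
    "gen3_alpha_turbo": 5,
}


def clip_cost_cents(model: str, seconds: int) -> int:
    """Return the cost in cents of generating `seconds` of video."""
    return CREDITS_PER_SECOND[model] * seconds * CENTS_PER_CREDIT


def seconds_per_bundle(model: str, bundle_credits: int = 1000) -> int:
    """Return how many whole seconds of video a credit bundle buys."""
    return bundle_credits // CREDITS_PER_SECOND[model]


if __name__ == "__main__":
    for name in CREDITS_PER_SECOND:
        print(name, clip_cost_cents(name, 10), "cents per 10 s clip,",
              seconds_per_bundle(name), "s per 1,000 credits")
```

Under these assumptions a 10-second clip costs about $1.00 on Gen-3 Alpha and $0.50 on Turbo, and a 1,000-credit bundle covers 100 or 200 seconds of video respectively.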

The model’s availability across all subscription plans, including free trials, ensures that a broad spectrum of users—from hobbyists to professional creators—can benefit from these enhancements.

By offering a faster and cheaper alternative, Runway is positioning itself to maintain a competitive edge in the rapidly evolving AI video generation market, where rivals including Pika Labs, Luma AI’s Dream Machine, Kuaishou’s Kling, and OpenAI’s Sora are also vying for dominance.

Yet despite OpenAI showing off Sora in February of this year and releasing it to a select group of creators, its video model remains out of reach of the public, and other video generation models tend to take much longer to generate from text prompts and images: several minutes or more in my tests.

Promising initial results

Already, users of Runway Gen-3 Alpha Turbo and subscribers are sharing videos made with the new model and are finding themselves impressed with its combination of speed and quality.

While generation is not always 1:1 in terms of seconds spent generating to seconds of video, users are nonetheless delighted with the overall experience of the new model and are showcasing a wide range of styles, from realistic footage to animation and anime.

Some users, such as @LouiErik8Irl on X, prefer the regular Gen-3 Alpha model for its higher quality, in their eyes. Yet they see value in being able to generate simple motion quickly through Gen-3 Alpha Turbo.

1/10
@runwayml Gen-3 Alpha Turbo model is out! It is insanely fast (7x) and very high quality too! Tho the base Alpha model still wins when you want more dynamic motions.

Here are 6 examples to test and compare the two models.

(1/6)
The left is the normal model, and the right is Turbo.

I think I will use Turbo for shots that just need some simple motion from now on. However, the Turbo model doesn't have the Last frame gen, so it's a trade-off.

2/10
It's pretty clear that the base model is far more dynamic. But getting 7X speed with Turbo is also a great trade-off.

Used the same prompt for both to test:
The camera flies inside the tornado

3/10
(2/6)
The base model is better at dynamic motion, but that also leads to more morphing. So if you want more stable and simple motion, Turbo is the way to go!

No prompt for this one to test the models raw.

The left is the normal model, and the right is Turbo.

4/10
(3/6)
But if you want more complex motions and changes, the base model is far better.

Same prompt for both:
The dragon breathes fire out of its mouth.

The left is the normal model, and the right is Turbo.

5/10
(4/6)
The turbo model also seems to stick to the original image more closely, while the base model is more creative.

No prompt for both to test raw motion.

The left is the normal model, and the right is Turbo.

6/10
(5/6)
Some shot types might also work better with Turbo due to the fact that it is more stable. You can see the fire is definitely better for the base model here, but the overall motion of the Turbo model is not bad either.

No prompt for both to test raw motion.

The left is the normal model, and the right is Turbo.

7/10
(6/6)
Again, the base model wins in terms of dynamics. But Turbo model is more consistent and stable. It also doesn't change the character's faces when moving, which was a big problem with the base model. Turbo sticks to the original image really well, tho it is not immune from morphing either.

No prompt for both to test raw motion.

The left is the normal model, and the right is Turbo.

8/10
Overall, the new Turbo model is a fantastic addition to Gen-3. I would use Turbo for shots that need simple motion, more stability, sticking closer to the original image, or faster iteration. And use the base model for more complex motion, more creative outputs, and the First and Last frame feature.

9/10
Btw this set of images was for the Discord daily challenge. Which is themed Fire.

10/10
At the model selection drop-down button on the top left.

Future improvements and unresolved legal/ethical issues

Runway is not resting on its laurels with the release of Gen-3 Alpha Turbo. The company has indicated that more improvements are on the horizon, including enhancements to the model’s control mechanisms and possibilities for real-time interactivity.

Previously, on its older Gen-2 model, Runway let users edit selected objects and portions of a video with its Multi Motion Brush, allowing more granular direction of the AI algorithms and the resulting clips.

However, the company continues to navigate the ethical complexities of AI model training. Runway has faced scrutiny over the sources of its training data, particularly following a report from 404 Media that the company may have used copyrighted content from YouTube for training purposes without authorization.

Although Runway has not commented on these allegations, the broader industry is grappling with similar challenges, as legal battles over the use of copyrighted materials in AI training intensify.

As the debate over ethical AI practices unfolds, Runway and other generative AI companies may find themselves compelled to disclose more information about their training data and methods. The outcome of these discussions could have significant implications for the future of AI model development and deployment.

bnew · Aug 21, 2024

1/1
LongWriter unlocks text generation up to 10k words!

can't wait to try it

1/2
LongWriter-glm4-9b from @thukeg is capable of generating 10,000+ words at once!

Paper identifies a problem with current long context LLMs -- they can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding lengths of 2,000 words.

Paper proposes that an LLM's effective generation length is inherently bounded by the sample it has seen during supervised fine-tuning

Demonstrates that existing long context LLMs already possess the potential for a larger output window--all you need is data with extended output during model alignment to unlock this capability.

Code & models are released under Apache License 2.0

2/2
Model on the Hugging Face Hub: THUDM/LongWriter-glm4-9b

Gradio demo available on the repo locally and linked on the project Readme: GitHub - THUDM/LongWriter: LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Clone the repo and launch the gradio demo: python trans_web_demo.py

Demo releasing soon on Spaces, stay tuned!

bnew · Aug 21, 2024

1/4
1/ AI news this week that we're paying close attention to:

• Hermes 3 - The uncensored AI model by @NousResearch
• Listening-while-Speaking Language Model (LSLM) by ByteDance devs

Why? Read more below!

2/4
Hermes 3 - The uncensored AI model

Powered by @lambdaapi & @NousResearch, built on @Meta's Llama 3.1 405B. The open-source uncensored LLM offers powerful agentic capabilities & user-tailored responses.

It represents a new approach to unrestricted, personalized AI interaction.

3/4
Listening-while-Speaking Language Model (LSLM) - The AI that converses in real-time

Developed by researchers from ByteDance & Shanghai Jiao Tong University, built on a decoder-only Transformer. The model can listen & speak simultaneously, enabling seamless natural conversations.

4/4
Follow @Mira_Network for more weekly updates in AI!

Join our discord: Discord - Group Chat That’s All Fun & Games

1/8
Introducing 𝐇𝐞𝐫𝐦𝐞𝐬 𝟑: The latest version in our Hermes series, a generalist language model 𝐚𝐥𝐢𝐠𝐧𝐞𝐝 𝐭𝐨 𝐲𝐨𝐮.

Hermes 3 - NOUS RESEARCH

Hermes 3 is available in 3 sizes, 8, 70, and 405B parameters. Hermes has improvements across the board, but with particular capability improvements in roleplaying, agentic tasks, more reliable function calling, multi-turn chats, long context coherence and more.

We published a technical report detailing new capabilities, training run information and more:

Paper: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf

This model was trained in collaboration with our great partners @LambdaAPI, and they are now offering it for free in a chat interface here: https://lambda.chat/chatui/

You can also chat with Hermes 405B on our discord, join here: Join the Nous Research Discord Server!

Hermes 3 was a project built with the help of @Teknium1, @TheEmozilla, @nullvaluetensor, @karan4d, @huemin_art, and an uncountable number of people and work in the Open Source community.

2/8
Hermes 3 performs strongly against Llama-3.1 Instruct Models, but with a focus on aligning the model to you, instead of a company or external policy - meaning less censorship and more steerability - with additional capabilities like agentic XML, scratchpads, roleplaying prowess, and more. Step level reasoning and planning, internal monologues, improved RAG, and even LLM as a judge capabilities were also targeted.

Below are benchmark comparisons between Hermes 3 and Llama-3.1 Instruct and a sample of utilizing the agentic XML tags:

3/8
Lambda's Hermes 3 Announcement Post: Unveiling Hermes 3: The First Full-Parameter Fine-Tuned Llama 3.1 405B Model is on Lambda’s Cloud

Nous' blog post on our experience discovering emergent behavior with 405B:
Freedom at the Frontier: Hermes 3 - NOUS RESEARCH

Hermes 3 405B was trained with @LambdaAPI's new 1-Click Cluster offering, check it out here: Lambda GPU Cloud | 1-Click Clusters

Check out our reference inference code for Hermes Function Calling here: GitHub - NousResearch/Hermes-Function-Calling

Thanks to all the other organizations who helped bring this together, including @weights_biases, @neuralmagic, @vllm_project, @huggingface, @WeAreFireworks, @AiEleuther, @togethercompute, @AIatMeta, and many more
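For anyone wondering what the function-calling side might look like on the client, here is a minimal sketch of pulling tool calls out of model output. It assumes the model wraps each call in a `<tool_call>` tag containing a JSON object with `name` and `arguments` keys. That convention is an assumption on my part (the thread does not show the exact format), and the function name and sample text are hypothetical, so check the Hermes-Function-Calling repo for the real prompt and output format.

```python
import json
import re

# Assumption: each call appears as <tool_call>{...JSON...}</tool_call>.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(model_output: str) -> list[dict]:
    """Return every well-formed tool call found in the model output."""
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(model_output):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than crash
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


# Hypothetical model output used only to exercise the parser.
sample = (
    "I will look that up.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)

if __name__ == "__main__":
    print(extract_tool_calls(sample))
```

Skipping malformed payloads instead of raising is a design choice for the sketch; a real agent loop would probably report the parse failure back to the model.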

4/8
Special shoutouts to @intrstllrninja for all the work on making function calling real, robust, and useful

and a special thanks to our designer @StudioMilitary for the cover art and all the other designs that Nous uses!

5/8
Believe Lambda is also hosting an api version, will update when clear

6/8
You can try it out in our discord right now if you want! Join the Nous Research Discord Server!

7/8
He's the god of language

8/8
Certainly (not sure if 405b can be done but the rest yes)

bnew · Aug 21, 2024

1/2
1/n How Mutual Consistent Reasoning Unlocks Agentic AI for Small Language Models

Large Language Models (LLMs) have demonstrated remarkable abilities in various tasks, yet their capacity for complex reasoning remains a significant challenge, especially for their smaller, more accessible counterparts – Small Language Models (SLMs). While fine-tuning on specific reasoning datasets can improve performance, this approach often relies on data generated by superior models, creating a dependence that hinders the development of truly self-sufficient SLMs. The paper "Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers" tackles this challenge head-on, introducing rStar, a novel approach that significantly enhances the reasoning capabilities of SLMs without relying on fine-tuning or data from superior models.

The core of rStar lies in addressing the two major pain points that plague SLMs when it comes to complex reasoning: ineffective exploration of potential solutions and unreliable self-assessment. Traditional methods often confine SLMs to a limited set of reasoning actions, hindering their ability to explore diverse paths towards a solution. Furthermore, relying on these models to evaluate their own reasoning proves unreliable, as their self-assessment capabilities are often inaccurate.

rStar tackles these limitations through a clever two-pronged approach: a richer, human-inspired set of reasoning actions and a collaborative evaluation mechanism called mutual consistency. Unlike previous methods that rely on a single action type, rStar empowers SLMs with a diverse set of actions, mimicking human problem-solving strategies. These actions include proposing thoughts, formulating sub-questions, re-answering, and even rephrasing questions for clarity. This expanded repertoire allows SLMs to navigate the solution space more effectively, exploring a wider range of possibilities.

To address the issue of unreliable self-evaluation, rStar introduces a second SLM as a partner in a collaborative verification process. The first SLM, acting as a generator, leverages the diverse action set and the Monte Carlo Tree Search (MCTS) algorithm to generate multiple candidate reasoning trajectories. The second SLM, acting as a discriminator, then evaluates these trajectories by attempting to complete them with partial information. This collaborative approach, termed "mutual consistency," ensures that only those reasoning paths agreed upon by both SLMs are considered valid, leading to a more robust and reliable evaluation process.
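The mutual consistency check described above can be sketched roughly as follows. This is not the paper's code: the trajectory representation (a list of steps whose last entry is the answer), the fixed midpoint split, the exact-match comparison, and the toy discriminator are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Trajectory = List[str]  # reasoning steps; the last step holds the final answer
Completer = Callable[[str, Sequence[str]], List[str]]


def is_mutually_consistent(question: str, trajectory: Trajectory,
                           discriminator: Completer, split_at: int) -> bool:
    """Give the discriminator only the first `split_at` steps and check
    whether its completion ends in the same answer as the generator's."""
    prefix = trajectory[:split_at]
    completion = discriminator(question, prefix)
    return bool(completion) and completion[-1] == trajectory[-1]


def filter_trajectories(question: str, candidates: List[Trajectory],
                        discriminator: Completer) -> List[Trajectory]:
    """Keep only the candidate trajectories that both models agree on."""
    kept = []
    for trajectory in candidates:
        if len(trajectory) < 2:
            continue  # nothing to withhold from the discriminator
        split_at = len(trajectory) // 2  # illustrative choice of split point
        if is_mutually_consistent(question, trajectory, discriminator, split_at):
            kept.append(trajectory)
    return kept


def toy_discriminator(question: str, prefix: Sequence[str]) -> List[str]:
    """Stand-in for the second SLM: always concludes that the answer is 4."""
    return ["adding two and two gives four", "4"]


if __name__ == "__main__":
    candidates = [
        ["2 + 2 means adding two and two", "4"],
        ["2 + 2 is the same as 2 * 3", "6"],
    ]
    print(filter_trajectories("What is 2 + 2?", candidates, toy_discriminator))
```

In a real setup both callables would wrap language model calls, and the generator's candidates would come out of the tree search rather than a hand-written list.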

The effectiveness of rStar is evident in its impressive performance on a variety of reasoning tasks. Tested on five different SLMs and five diverse reasoning benchmarks, including mathematical problem-solving and multi-hop reasoning over text, rStar consistently outperforms existing state-of-the-art methods. Remarkably, it achieves accuracy comparable to or even exceeding models fine-tuned on these specific datasets, highlighting its ability to learn and improve without task-specific training data.

The success of rStar signifies a significant leap forward in the field of LLM reasoning. By combining the power of diverse reasoning actions with a collaborative evaluation mechanism, rStar unlocks the potential of SLMs, enabling them to tackle complex reasoning tasks with remarkable accuracy. This approach not only paves the way for more accessible and efficient AI systems but also sheds light on the power of collaborative learning and self-improvement in pushing the boundaries of artificial intelligence.

2/2
2/n Comparison with other methods

1. Prompting LLMs to Reason:

Chain-of-Thought (CoT) (Wei et al., 2022): Prompts LLMs with a few-shot demonstration of reasoning steps.

Contrast with rStar: CoT relies on a single, greedy decoding path, while rStar explores multiple reasoning trajectories using MCTS and a richer action space.

Planning, Decomposition, Abstraction, Programming Prompts: Various works explore specific prompting strategies to guide reasoning.

Contrast with rStar: These methods focus on single-round inference, while rStar uses an iterative, self-improving approach.

2. LLM Self-improvement:
Fine-tuning based methods (Chen et al., 2024b;a): Use a well-pretrained LLM to generate data for further fine-tuning.

Contrast with rStar: rStar improves reasoning at inference time without requiring additional training data or a superior teacher model.

Self-verification (Gero et al., 2023; Zhou et al., 2023): LLMs verify their own answers, often by generating explanations or checking for consistency.

Contrast with rStar: rStar uses a separate discriminator SLM for more reliable evaluation, overcoming the limitations of self-assessment in SLMs.

RAP (Hao et al., 2023): Uses self-exploration and self-rewarding to iteratively improve reasoning.

Contrast with rStar: rStar addresses the limitations of RAP's single action type and unreliable self-rewarding with its diverse action space and mutual consistency mechanism.

3. Sampling Reasoning Paths:
Self-Consistency (Wang et al., 2023): Samples multiple CoT paths and selects the most consistent answer.

Contrast with rStar: Self-consistency relies on random sampling of complete CoT paths, while rStar uses MCTS with a richer action space for more guided exploration.

Tree-search approaches (Yao et al., 2024; Hao et al., 2023; Zhang et al., 2024): Use tree search algorithms like MCTS to explore reasoning paths.

Contrast with rStar: Most existing tree-search methods use limited action spaces, while rStar's diverse actions provide more flexibility and effectiveness.
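For readers unfamiliar with how MCTS decides which branch to explore next, below is a small sketch of the standard UCT selection rule. The thread does not say which selection rule or constants rStar uses, so treat the formula, the exploration constant, and the action names (borrowed loosely from the action list in 1/n) as assumptions.

```python
import math


def uct_score(total_reward: float, visits: int, parent_visits: int,
              exploration: float = 1.41) -> float:
    """Standard UCT: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # always try unvisited actions first
    exploit = total_reward / visits
    explore = exploration * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: dict, parent_visits: int,
                 exploration: float = 1.41) -> str:
    """Pick the action with the highest UCT score.

    `children` maps an action name to a (total_reward, visits) pair.
    """
    return max(
        children,
        key=lambda action: uct_score(*children[action], parent_visits, exploration),
    )


if __name__ == "__main__":
    stats = {
        "propose_thought": (3.0, 5),
        "ask_subquestion": (0.0, 0),
        "rephrase_question": (1.0, 4),
    }
    print(select_child(stats, parent_visits=9))
```

A richer action space simply means more keys in `children` at each node; the selection rule itself does not change.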

4. Answer Verification:
Majority voting (Wang et al., 2023): Selects the answer that appears most frequently across multiple generated solutions.

Contrast with rStar: rStar's mutual consistency mechanism provides a more robust evaluation than simple majority voting, especially for SLMs.

Trained reward models (Wang et al., 2024b; Chen et al., 2024a): Train separate models to evaluate the quality of reasoning paths.

Contrast with rStar: rStar avoids the need for additional training data and potential overfitting issues associated with training separate reward models.
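As a point of comparison, the majority-voting baseline mentioned above is simple enough to sketch in a few lines. The whitespace normalization and the tie-breaking behavior are illustrative choices, not something taken from the cited papers.

```python
from collections import Counter
from typing import Iterable


def majority_vote(answers: Iterable[str]) -> str:
    """Return the most frequent final answer among sampled solutions.

    Answers are stripped of surrounding whitespace before counting.
    Ties go to the answer that was seen first.
    """
    counts = Counter(answer.strip() for answer in answers)
    if not counts:
        raise ValueError("no answers to vote on")
    winner, _ = counts.most_common(1)[0]
    return winner


if __name__ == "__main__":
    sampled = ["42", "41", " 42", "40", "42 "]
    print(majority_vote(sampled))
```

The weakness the thread points at is visible here: the vote only counts final answers, so a small model that makes the same mistake often enough will outvote its own correct samples.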

In essence, rStar distinguishes itself from prior work by combining the strengths of several approaches:
It leverages the power of tree search for exploring solution spaces.
It introduces a richer, human-inspired action space for more effective exploration.
It employs a novel mutual consistency mechanism for reliable evaluation without relying on self-assessment or external training data.
This unique combination allows rStar to significantly improve SLM reasoning, achieving performance comparable to or even surpassing fine-tuned models.

bnew · Aug 21, 2024

1/6
Super excited to announce our cool project, Trace, on optimizing general AI systems, using LLMs.

Trace is a new AutoDiff-like tool for training AI systems end-to-end with general feedback (like numerical rewards, natural language text, compiler errors).

2/6
Training AI systems & agents with Trace can never be simpler. It is just like training neural networks!! With Trace, you can use a single optimizer to learn HPs, prompts, orchestration code, robot policy, etc, with just a few iterations of training.

3/6
Trace generalizes the back-propagation algorithm by capturing and propagating an AI system's execution trace. Trace is implemented as a PyTorch-like Python library. Users of Trace can optimize heterogeneous parameters jointly in a non-differentiable workflow with feedback.

4/6
This feat is made possible by a new math formulation of iterative optimization, we call Optimization with Trace Oracle (OPTO). In the paper, we design an LLM-based OPTO optimizer, OptoPrime, that can solve problems originating from disparate domains.
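To make the idea of optimizing from an execution trace plus general feedback more concrete, here is a toy loop in plain Python. It does not use the Trace library or its API: every name below is hypothetical, and a hand-written rule stands in for the LLM-based optimizer. The workflow is non-differentiable (it filters and counts), and the only learning signal is a text message.

```python
from typing import List, Tuple

ExecutionTrace = List[str]


def run_workflow(threshold: int, inputs: List[int]) -> Tuple[ExecutionTrace, str]:
    """Non-differentiable workflow: keep inputs above `threshold`.

    The goal is to keep exactly two items. Returns the recorded execution
    trace together with a natural-language feedback message.
    """
    trace: ExecutionTrace = []
    kept = [x for x in inputs if x > threshold]
    trace.append(f"filter(x > {threshold}) -> {kept}")
    trace.append(f"count -> {len(kept)}")
    if len(kept) > 2:
        feedback = "too many items kept"
    elif len(kept) < 2:
        feedback = "too few items kept"
    else:
        feedback = "ok"
    return trace, feedback


def rule_based_optimizer(threshold: int, trace: ExecutionTrace, feedback: str) -> int:
    """Stand-in for an LLM optimizer: read the feedback, propose a new value."""
    if feedback == "too many items kept":
        return threshold + 1
    if feedback == "too few items kept":
        return threshold - 1
    return threshold


def optimize(threshold: int, inputs: List[int], max_iters: int = 20) -> int:
    """Iterate run -> feedback -> update until the feedback says 'ok'."""
    for _ in range(max_iters):
        trace, feedback = run_workflow(threshold, inputs)
        if feedback == "ok":
            break
        threshold = rule_based_optimizer(threshold, trace, feedback)
    return threshold


if __name__ == "__main__":
    print(optimize(0, [1, 3, 5, 7]))
```

In the real system the optimizer would presumably read the trace as well as the feedback; the rule-based stand-in here ignores the trace and is only meant to show the shape of the loop.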

5/6
This is work done by a wonderful collaboration with @Allen_A_N and @adith387. Stay tuned. We will release the code soon!

6/6
The source code is out now!

Please see also our new blogpost to learn more about it. This is a preview of the library. Let me know if you have any feedback.

Discover Trace, a new framework for AI optimization from language models to robot control

bnew · Aug 21, 2024

1/11
Introducing Ideogram 2.0 — our most advanced text-to-image model, now available to all users for free.

Today’s milestone launch also includes the release of the Ideogram iOS app, the beta version of the Ideogram API, and Ideogram Search.

Here’s what’s new…

2/11
Choose from 5 distinct styles: General, Realistic, Design, 3D, and Anime.

The Realistic style in Ideogram 2.0 lets you create images that convincingly resemble photographs, with dramatically improved textures. Features like human hands, eyes, skin, and hair appear strikingly lifelike.

3/11
The Design style significantly improves text rendering in Ideogram 2.0, and it enables you to create premium graphic designs including greeting cards, t-shirt designs, posters, illustrations with longer and more accurate text.

4/11
Choose from multiple color palettes for images, giving you precise control over the color scheme. This is useful for brand consistency or for capturing a specific vibe.

You can specify custom palettes too.

5/11
Use Ideogram Search to explore over 1 billion public Ideogram images for inspiration.

6/11
Ideogram 2.0 offers industry-leading image generation capabilities.

Human evaluations consistently rate Ideogram 2.0 as significantly better than Flux Pro and DALL·E 3.

7/11
Our API (beta) is now available to developers and businesses.

We offer superior image quality at a lower cost compared to other models.

Join our public beta here: https://ideogram.ai/manage-api
Learn about pricing here: Ideogram API Pricing

We can’t wait to see what you build.

8/11
Our brand new iOS app is now available on the App Store.

You can now access Ideogram's powerful image generation capabilities on the go.

9/11
Ideogram 2.0 gives you the tools — an upgraded model, new styles, color palette control, an iOS app, and an API. Now it’s your turn to make something amazing.

What will you create?

Share your Ideogram 2.0 masterpieces below.

10/11
Learn more about Ideogram 2.0 and the new features released today here: Ideogram 2.0

11/11
Enjoy!

bnew · Aug 28, 2024

1/1
Today, we are rolling out three experimental models:

- A new smaller variant, Gemini 1.5 Flash-8B
- A stronger Gemini 1.5 Pro model (better on coding & complex prompts)
- A significantly improved Gemini 1.5 Flash model

Try them on Google AI Studio | Gemini API | Google for Developers | Google AI for Developers, details in

Google drops ‘stronger’ and ‘significantly improved’ experimental Gemini models

The Gemini models show "huge gains," with 1.5 Flash across the board and a 1.5 Pro that is much better at math, coding and complex prompts.

venturebeat.com

Taryn Plumb@taryn_plumb

August 27, 2024 6:25 PM

VentureBeat/Ideogram

Google is continuing its aggressive Gemini updates as it races towards its 2.0 model.

The company today announced a smaller variant of Gemini 1.5, Gemini 1.5 Flash-8B, alongside a “significantly improved” Gemini 1.5 Flash and a “stronger” Gemini 1.5 Pro. These show increased performance against many internal benchmarks, the company says, with “huge gains” with 1.5 Flash across the board and a 1.5 Pro that is much better at math, coding and complex prompts.

Today, we are rolling out three experimental models:

– A new smaller variant, Gemini 1.5 Flash-8B
– A stronger Gemini 1.5 Pro model (better on coding & complex prompts)
– A significantly improved Gemini 1.5 Flash model

Try them on Google AI Studio | Gemini API | Google for Developers | Google AI for Developers, details in

— Logan Kilpatrick (@OfficialLoganK) August 27, 2024

“Gemini 1.5 Flash is the best… in the world for developers right now,” Logan Kilpatrick, product lead for Google AI Studio, boasted in a post on X.

‘Newest experimental iteration’ of ‘unprecedented’ Gemini models

Google introduced Gemini 1.5 Flash — the lightweight version of Gemini 1.5 — in May. The Gemini 1.5 family of models was built to handle long contexts and can reason over fine-grained information from 10M and more tokens. This allows the models to process high-volume multimodal inputs including documents, video and audio.

Today, Google is making available an “improved version” of a smaller 8 billion parameter variant of Gemini 1.5 Flash. Meanwhile, the new Gemini 1.5 Pro shows performance gains on coding and complex prompts and serves as a “drop-in replacement” to its previous model released in early August.

Kilpatrick was light on additional details, saying that Google will make a future version available for production use in the coming weeks that “hopefully will come with evals!”

He explained in an X thread that the experimental models are a means to gather feedback and get the latest, ongoing updates into the hands of developers as quickly as possible. “What we learn from experimental launches informs how we release models more widely,” he posted.

The “newest experimental iteration” of both Gemini 1.5 Flash and Pro features 1 million token limits and is available to test for free via Google AI Studio and the Gemini API, and soon through the Vertex AI experimental endpoint. There is a free tier for both, and the company will make a future version available for production use in the coming weeks, according to Kilpatrick.

Beginning Sept. 3, Google will automatically reroute requests to the new model and will remove the older model from Google AI Studio and the API to “avoid confusion with keeping too many versions live at the same time,” said Kilpatrick.

“We are excited to see what you think and to hear how this model might unlock even more new multimodal use cases,” he posted on X.

Google DeepMind researchers call Gemini 1.5’s scale “unprecedented” among contemporary LLMs.

“We have been blown away by the excitement for our initial experimental model we released earlier this month,” Kilpatrick posted on X. “There has been lots of hard work behind the scenes at Google to bring these models to the world, we can’t wait to see what you build!”

‘Solid improvements,’ still suffers from ‘lazy coding disease’

Just a few hours after the release today, the Large Model Systems Organization (LMSYS Org) posted a leaderboard update to its Chatbot Arena based on 20,000 community votes. Gemini 1.5-Flash made a “huge leap,” climbing from 23rd to sixth place, matching Llama levels and outperforming Google’s Gemma open models.

Gemini 1.5-Pro also showed “strong gains” in coding and math and “improve[d] significantly.”

LMSYS Org lauded the models, posting: “Big congrats to Google DeepMind Gemini team on the incredible launch!”

Chatbot Arena update!

The latest Gemini (Pro/Flash/Flash-9b) results are now live, with over 20K community votes!

Highlights:
– New Gemini-1.5-Flash (0827) makes a huge leap, climbing from #23 to #6 overall!
– New Gemini-1.5-Pro (0827) shows strong gains in coding, math over… x.com pic.twitter.com/D3XpU0Xiw2

— lmsys.org (@lmsysorg) August 27, 2024

As per usual with iterative model releases, early feedback has been all over the place — from sycophantic praise to mockery and confusion.

Some X users questioned why Google keeps shipping back-to-back updates rather than a 2.0 version. One posted: “Dude this isn’t going to cut it anymore :| we need Gemini 2.0, a real upgrade.”

On the other hand, many self-described fanboys lauded the fast upgrades and quick shipping, reporting “solid improvements” in image analysis. “The speed is fire,” one posted, and another pointed out that Google continues to ship while OpenAI has effectively been quiet. One went so far as to say that “the Google team is silently, diligently and constantly delivering.”

Some critics, though, call it “terrible,” and “lazy” with tasks requiring longer outputs, saying Google is “far behind” Claude, OpenAI and Anthropic.

The update “sadly suffers from the lazy coding disease” similar to GPT-4 Turbo, one X user lamented.

Another called the updated version “definitely not that good” and said it “often goes crazy and starts repeating stuff non-stop like small models tend to do.” Another agreed that they were excited to try it but that Gemini has “been by far the worst at coding.”

Some also poked fun at Google’s uninspired naming capabilities and called back to its huge woke blunder earlier this year.

“You guys have completely lost the ability to name things,” one user joked, and another agreed, “You guys seriously need someone to help you with nomenclature.”

And, one dryly asked: “Does Gemini 1.5 still hate white people?”

bnew · Aug 28, 2024

Anthropic releases AI model system prompts, winning praise for transparency

Anthropic's recent release of system prompts for its Claude family of AI models might be a path for other AI companies to follow.

venturebeat.com

Anthropic releases AI model system prompts, winning praise for transparency

Emilia David@miyadavid

August 27, 2024 11:01 AM

A scientist stands on stage holding a red cloth above four robotic head busts

Credit: VentureBeat made with ChatGPT

The OpenAI rival startup Anthropic yesterday released system prompts for its Claude family of AI models and committed to doing so going forward, setting what appears to be a new standard of transparency for the fast-moving gen AI industry, according to observers.

System prompts act much like the operating instructions of large language models (LLMs), telling models the general rules they should follow when interacting with users and the behaviors or personalities they should exhibit. They also tend to show the cut-off date for the information learned by the LLM during training.

Most LLMs have system prompts, but not every AI company publicly releases them. Uncovering the system prompts for models has even become a hobby of sorts for AI jailbreakers.

But now, Anthropic has beaten the jailbreakers at their own game, going ahead and revealing the operating instructions for its models Claude 3.5 Sonnet, Claude 3 Haiku and Claude 3 Opus on its website under the release notes section.

In addition, Anthropic’s Head of Developer Relations Alex Albert posted on X (formerly Twitter) a commitment to keeping the public updated on its system prompts, writing: “We’re going to log changes we make to the default system prompts on Claude dot ai and our mobile apps.”

We've added a new system prompts release notes section to our docs. We're going to log changes we make to the default system prompts on Claude dot ai and our mobile apps. (The system prompt does not affect the API.) pic.twitter.com/9mBwv2SgB1

— Alex Albert (@alexalbert__) August 26, 2024

What Anthropic’s system prompts reveal

The system prompts for the three models — Claude 3.5 Sonnet, Claude 3 Haiku and Claude 3 Opus — reveal some interesting details about each of them, their capabilities and knowledge date cut-offs, and various personality quirks.

Claude 3.5 Sonnet is the most advanced version, with a knowledge base updated as of April 2024. It provides detailed responses to complex questions and concise answers to simpler tasks, emphasizing both accuracy and brevity. This model handles controversial topics with care, presenting information without explicitly labeling it as sensitive or claiming objectivity. Additionally, Claude 3.5 Sonnet avoids unnecessary filler phrases or apologies and is particularly mindful of how it handles image recognition, ensuring it never acknowledges recognizing any faces.

Claude 3 Opus operates with a knowledge base updated as of August 2023 and excels at handling complex tasks and writing. It is designed to give concise responses to simple queries and thorough answers to more complex questions. Claude 3 Opus addresses controversial topics by offering a broad range of perspectives, avoiding stereotyping, and providing balanced views. While it shares some similarities with the Sonnet model, it does not incorporate the same detailed behavioral guidelines, such as avoiding apologies or unnecessary affirmations.

Claude 3 Haiku is the fastest model in the Claude family, also updated as of August 2023. It is optimized for delivering quick, concise responses to simple questions while still providing thorough answers when needed for more complex issues. The prompt structure for Haiku is more straightforward compared to Sonnet, focusing primarily on speed and efficiency, without the more advanced behavioral nuances found in the Sonnet model.

Why Anthropic’s release of its system prompts is important

A common complaint about generative AI systems revolves around the concept of a “black box,” where it’s difficult to find out why and how a model came to a decision. The black box problem has led to research around AI explainability, a way to shed some light on the predictive decision-making process of models. Public access to system prompts is a step towards opening up that black box a bit, but only to the extent that people understand the rules set by AI companies for models they’ve created.

AI developers celebrated Anthropic’s decision, noting that publishing Claude’s system prompts and logging updates to them makes the company stand out among AI developers.

Anthropic Claude now tracks system prompt changes in their docs!

This is SO nice and much more transparent than ChatGPT!! https://t.co/m25OPZvNJF

— Nick Dobos (@NickADobos) August 26, 2024

We can now see the system prompts for all three versions of Claude – and when they were last updated – in their entirety. This is a great change, and I hope this is eventually adopted industry wide. Good stuff from Anthropic. Transparency! x.com pic.twitter.com/2E4zP4LsVz

— Andrew Curran (@AndrewCurran_) August 26, 2024

Great move by Anthropic to share their system prompt releases with users! System Prompts - Anthropic pic.twitter.com/thGbWTIwRZ

— Victor M (@victormustar) August 26, 2024

Not fully open source, though

Releasing system prompts for the Claude models does not mean Anthropic opened up the model family. The actual source code for running the models, as well as the training data set and underlying “weights” (or model settings), remain in Anthropic’s hands alone.

Still, Anthropic’s release of the Claude system prompts shows other AI companies a path to greater transparency in AI model development. And it benefits users by showing them just how their AI chatbot is designed to act.

bnew · Aug 28, 2024

‘This could change everything!’ Nous Research unveils new tool to train powerful AI models with 10,000x efficiency

Ultimately, the DisTrO method could open the door to many more people being able to train massively powerful AI models.

venturebeat.com

‘This could change everything!’ Nous Research unveils new tool to train powerful AI models with 10,000x efficiency

Carl Franzen@carlfranzen

August 27, 2024 10:22 AM

Frizzy dark haired three eyed glowing eyes cyborg with giant hands overlooks office workers at desks with PCs and globe

Credit: VentureBeat made with ChatGPT

Nous Research turned heads earlier this month with the release of its permissive, open-source Llama 3.1 variant Hermes 3.

Now, the small research team dedicated to making “personalized, unrestricted AI” models has announced another seemingly massive breakthrough: DisTrO (Distributed Training Over-the-Internet), a new optimizer that reduces the amount of information that must be sent between various GPUs (graphics processing units) during each step of training an AI model.

Nous’s DisTrO optimizer means powerful AI models can now be trained outside of big companies, across the open web on consumer-grade connections, potentially by individuals or institutions working together from around the world.

DisTrO has already been tested and shown in a Nous Research technical paper to yield an 857 times efficiency increase compared to one popular existing training algorithm, All-Reduce, as well as a massive reduction in the amount of information transmitted during each step of the training process (86.8 megabytes compared to 74.4 gigabytes) while only suffering a slight loss in overall performance. See the results in the table below from the Nous Research technical paper:

Image: results table from the Nous Research technical paper
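The 857 times figure can be reproduced from the two per-step communication volumes quoted above. The following is a quick arithmetic check, not anything taken from the Nous Research report; it assumes decimal units (1 GB = 1,000 MB), which the article does not state.

```python
# Per-step communication volumes as quoted in the article.
ALL_REDUCE_GB = 74.4  # conventional All-Reduce
DISTRO_MB = 86.8      # DisTrO

# Assumption (not stated in the article): decimal units, 1 GB = 1,000 MB.
MB_PER_GB = 1000


def reduction_factor(baseline_gb: float, reduced_mb: float) -> float:
    """Return how many times smaller the reduced volume is than the baseline."""
    return baseline_gb * MB_PER_GB / reduced_mb


factor = reduction_factor(ALL_REDUCE_GB, DISTRO_MB)
print(f"Reduction factor: {factor:.0f}x")  # prints "Reduction factor: 857x"
```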

Ultimately, the DisTrO method could open the door to many more people being able to train massively powerful AI models as they see fit.

As the firm wrote in a post on X yesterday: “Without relying on a single company to manage and control the training process, researchers and institutions can have more freedom to collaborate and experiment with new techniques, algorithms, and models. This increased competition fosters innovation, drives progress, and ultimately benefits society as a whole.”

What if you could use all the computing power in the world to train a shared, open source AI model?

Preliminary report: DisTrO/A_Preliminary_Report_on_DisTrO.pdf at main · NousResearch/DisTrO

Nous Research is proud to release a preliminary report on DisTrO (Distributed Training Over-the-Internet) a family of… pic.twitter.com/h2gQJ4m7lB

— Nous Research (@NousResearch) August 26, 2024

The problem with AI training: steep hardware requirements

As covered on VentureBeat previously, Nvidia’s GPUs in particular are in high demand in the generative AI era, as the expensive graphics cards’ powerful parallel processing capabilities are needed to train AI models efficiently and (relatively) quickly. This blog post at APNIC describes the process well.

A big part of the AI training process relies on GPU clusters — multiple GPUs — exchanging information with one another about the model and the information “learned” within training data sets.

However, this “inter-GPU communication” requires that GPU clusters be architected, or set up, in a precise way in controlled conditions, minimizing latency and maximizing throughput. That is why companies such as Elon Musk’s Tesla are investing heavily in setting up physical “superclusters” with many thousands (or hundreds of thousands) of GPUs sitting physically side-by-side in the same location, typically a massive airplane hangar-sized warehouse or facility.

Because of these requirements, training generative AI — especially the largest and most powerful models — is typically an extremely capital-heavy endeavor, one that only some of the most well-funded companies can engage in, such as Tesla, Meta, OpenAI, Microsoft, Google, and Anthropic.

The training process for each of these companies looks a little different, of course. But they all follow the same basic steps and use the same basic hardware components. Each of these companies tightly controls its own AI model training processes, and it can be difficult for incumbents, much less laypeople outside of them, to even think of competing by training their own similarly-sized (in terms of parameters, or the settings under the hood) models.

But Nous Research, whose whole approach is essentially the opposite — making the most powerful and capable AI it can on the cheap, openly, freely, for anyone to use and customize as they see fit without many guardrails — has found an alternative.

What DisTrO does differently

While traditional methods of AI training require synchronizing full gradients across all GPUs and rely on extremely high bandwidth connections, DisTrO reduces this communication overhead by four to five orders of magnitude.
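To make the conventional baseline concrete, here is a toy sketch of what "synchronizing full gradients" means in data-parallel training: every worker contributes its full gradient and all workers end up with the element-wise average. This illustrates plain All-Reduce averaging only; it is not DisTrO, whose algorithm the authors have not yet disclosed. The function names, the toy values, and the 32-bit float assumption are all illustrative.

```python
from typing import List


def all_reduce_mean(worker_grads: List[List[float]]) -> List[float]:
    """Average gradients element-wise across workers, as All-Reduce does."""
    n_workers = len(worker_grads)
    n_params = len(worker_grads[0])
    return [
        sum(grads[i] for grads in worker_grads) / n_workers
        for i in range(n_params)
    ]


def full_gradient_bytes(n_params: int, bytes_per_value: int = 4) -> int:
    """Size of one full gradient, assuming 32-bit floats (an assumption)."""
    return n_params * bytes_per_value


# Toy example: 3 workers, 4 parameters each (hypothetical values).
grads = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]
synced = all_reduce_mean(grads)
print(synced)  # [2.0, 2.0, 2.0, 2.0]

# One full gradient for a 1.2-billion-parameter model at 4 bytes per value.
print(full_gradient_bytes(1_200_000_000))  # 4800000000
```

Because the payload grows with the parameter count, each step's traffic scales with model size. Actual per-step traffic also depends on the All-Reduce algorithm, numeric precision, and worker count, so the byte count here is only a rough lower bound on one gradient copy.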

The paper authors haven’t fully revealed how their algorithms reduce the amount of information at each step of training while retaining overall model performance, but plan to release more on this soon.

The reduction was achieved without relying on amortized analysis or compromising the convergence rate of the training, allowing large-scale models to be trained over much slower internet connections — 100Mbps download and 10Mbps upload, speeds available to many consumers around the world.
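A back-of-the-envelope calculation shows why the reduction matters on such a connection. The sketch below assumes, purely for illustration, that a single node must upload the whole per-step volume quoted earlier in the article (86.8 MB for DisTrO, 74.4 GB for All-Reduce) over the 10Mbps uplink, and it ignores latency, protocol overhead, and any overlap with computation. None of these assumptions come from the report.

```python
UPLOAD_MBPS = 10  # megabits per second, the upload speed cited in the article


def transfer_seconds(megabytes: float, link_mbps: float) -> float:
    """Seconds needed to send `megabytes` over a link of `link_mbps` megabits/s."""
    megabits = megabytes * 8
    return megabits / link_mbps


# Assumption: decimal units (1 GB = 1,000 MB).
distro_s = transfer_seconds(86.8, UPLOAD_MBPS)
all_reduce_s = transfer_seconds(74.4 * 1000, UPLOAD_MBPS)

print(f"DisTrO:     {distro_s:.1f} s per step")
print(f"All-Reduce: {all_reduce_s / 3600:.1f} h per step")
```

Under these assumptions the DisTrO volume takes roughly 69 seconds per step, while the All-Reduce volume would take about 16.5 hours, which is why the conventional approach is impractical over consumer links.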

The authors tested DisTrO on a 1.2-billion-parameter large language model (LLM) based on the Meta Llama 2 architecture and achieved training performance comparable to conventional methods with significantly less communication overhead.

They note that this is the smallest-size model that worked well with the DisTrO method, and they “do not yet know whether the ratio of bandwidth reduction scales up, down, or stays constant as model size increases.”

Yet, the authors also say that “our preliminary tests indicate that it is possible to get a bandwidth requirements reduction of up to 1000x to 3000x during the pre-training” phase of LLMs, and “for post-training and fine-tuning, we can achieve up to 10000x without any noticeable degradation in loss.”

They further hypothesize that the research, while initially conducted on LLMs, could be used to train large diffusion models (LDMs) as well: think the Stable Diffusion open source image generation model and popular image generation services derived from it such as Midjourney.

Still need good GPUs

To be clear: DisTrO still relies on GPUs — only instead of clustering them all together in the same location, now they can be spread out across the world and communicate over the consumer internet.

Specifically, DisTrO was evaluated using 32x H100 GPUs, operating under the Distributed Data Parallelism (DDP) strategy, where each GPU had the entire model loaded in VRAM.

This setup allowed the team to rigorously test DisTrO’s capabilities and demonstrate that it can match the convergence rates of AdamW+All-Reduce despite drastically reduced communication requirements.

This result suggests that DisTrO can potentially replace existing training methods without sacrificing model quality, offering a scalable and efficient solution for large-scale distributed training.

By reducing the need for high-speed interconnects, DisTrO could enable collaborative model training across decentralized networks, even with participants using consumer-grade internet connections.

The report also explores the implications of DisTrO for various applications, including federated learning and decentralized training.

Additionally, DisTrO’s efficiency could help mitigate the environmental impact of AI training by optimizing the use of existing infrastructure and reducing the need for massive data centers.

Moreover, the breakthroughs could lead to a shift in how large-scale models are trained, moving away from centralized, resource-intensive data centers towards more distributed, collaborative approaches that leverage diverse and geographically dispersed computing resources.

What’s next for the Nous Research team and DisTrO?

The research team invites others to join them in exploring the potential of DisTrO. The preliminary report and supporting materials are available on GitHub, and the team is actively seeking collaborators to help refine and expand this groundbreaking technology.

Already, some AI influencers such as @kimmonismus on X (aka chubby) have praised the research as a huge breakthrough in the field, writing, “This could change everything!”

Wow, amazing! This could change everything! x.com

— Chubby (@kimmonismus) August 27, 2024

With DisTrO, Nous Research is not only advancing the technical capabilities of AI training but also promoting a more inclusive and resilient research ecosystem that has the potential to unlock unprecedented advancements in AI.

bnew · Aug 28, 2024

Amazon to launch AI-enhanced Alexa subscription in October

Amazon plans to release a paid version of its Alexa voice assistant with advanced AI features this October. The upgraded Alexa aims to compete with newer AI assistants from companies like OpenAI and Google.

the-decoder.com

AI in practice

Aug 27, 2024

Amazon to launch AI-enhanced Alexa subscription in October

Midjourney prompted by THE DECODER

Kim M. Scheurenbrand

Kim is a regular contributor to THE DECODER. He focuses on the ethical, economic, and political implications of AI.

Amazon plans to release a paid version of its Alexa voice assistant with advanced AI features this October. The upgraded Alexa aims to compete with newer AI assistants from companies like OpenAI and Google.

Internal documents obtained by The Washington Post reveal that the new Alexa, known internally as "Remarkable Alexa" or "Project Banyan," will offer several AI-powered capabilities.

A key feature is "Smart Briefing," which will provide personalized daily news summaries generated by AI. This feature is being developed despite concerns about AI's accuracy in handling political news, especially with the upcoming U.S. presidential election.

The subscription could cost up to $10 per month, though the current "classic Alexa" will remain free. Amazon executives are expected to finalize pricing, subscription structure, and product name this month.

We know your family and they should eat more vegetables

The improved Alexa is reportedly designed to be more conversational and engaging. It will learn to recognize individual voices and ask users about their preferences to provide more tailored assistance. Other new features include improved recipe recommendations and AI-powered shopping tools.

Amazon is also developing a web-based product called Project Metis, intended to compete directly with ChatGPT-style LLM-tools. This move comes as Amazon faces pressure to keep pace with AI advancements from competitors.

The company has invested $4 billion in AI startup Anthropic but is also developing its own large language model, Olympus. Amazon aims for Olympus to surpass Anthropic's Claude model, with early reports suggesting it has "hundreds of billions of parameters." But we haven't heard from Olympus lately.

The launch of the new Alexa has been delayed, with internal documents initially targeting a September 2024 release. The current mid-October timeline indicates it has taken over a year to bring the project to market since its announcement in September 2023.

While Amazon hasn't publicly disclosed Alexa's financial performance, reports suggest the company's devices business, which includes Alexa, has been losing money. The subscription model and enhanced e-commerce features of the new Alexa could help Amazon recoup some of its investment.

bnew · Aug 28, 2024

OpenAI has a "highly accurate" ChatGPT text detector, but won't release it for now

OpenAI has developed technology to reliably detect AI-generated text, according to inside sources and documents reported by the Wall Street Journal. However, the company is reluctant to release it, likely due to concerns about its own business model.

the-decoder.com

AI in practice

Aug 5, 2024

Update

OpenAI has a "highly accurate" ChatGPT text detector, but won't release it for now

Midjourney prompted by THE DECODER

Matthias Bastian https://twitter.com/maba_xr

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.

Update

Added OpenAI's statement.

Update from August 5, 2024:

Following the Wall Street Journal's coverage, OpenAI revised an earlier blog post on AI content detection, confirming the existence of their watermarking detector.

The detector stays accurate when text has undergone minor changes such as paraphrasing, but struggles with major changes such as translations, rewrites using different AI models, or the insertion and removal of special characters between words.

Ultimately, this makes bypassing the detector "trivial," according to OpenAI. The company also mentions concerns that it could unfairly target certain groups, such as non-native English speakers who use ChatGPT to improve their writing.

While the watermarking method has a low false positive rate for individual texts, applying it to large volumes of content would still lead to a significant number of misidentifications overall.
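The scale effect is simple multiplication: the expected number of wrongly flagged texts is the number of human-written texts checked times the false positive rate. The sketch below uses a hypothetical rate of 0.1% and hypothetical volumes; OpenAI has not published the real figures in the material quoted here.

```python
def expected_false_positives(n_human_texts: int, false_positive_rate: float) -> float:
    """Expected number of human-written texts wrongly flagged as AI-generated."""
    return n_human_texts * false_positive_rate


# Hypothetical values for illustration only.
HYPOTHETICAL_RATE = 0.001  # 0.1% false positive rate

for n_texts in (1_000, 1_000_000, 100_000_000):
    flagged = expected_false_positives(n_texts, HYPOTHETICAL_RATE)
    print(f"{n_texts:>11,} texts checked -> about {flagged:,.0f} wrongly flagged")
```

A rate that looks negligible for a single essay still produces thousands of misidentifications once millions of texts are screened.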

OpenAI is researching metadata as an alternative method of verifying the provenance of text. This research is in the "early stages of exploration," and its effectiveness remains to be seen. Metadata is promising because, unlike watermarks, it can be cryptographically signed, eliminating false positives, according to OpenAI.
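The reason signed metadata avoids false positives is that verification is a deterministic check rather than a statistical guess: a record either verifies against the key or it does not. The sketch below is a generic illustration using Python's standard `hmac` module as a stand-in for a real public-key signature. It is not OpenAI's scheme, which has not been described, and the key, field names, and helper functions are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real system would use public-key signatures
# so that verifiers never hold the signing secret.
SECRET_KEY = b"demo-key-not-for-production"


def sign_metadata(metadata: dict, key: bytes = SECRET_KEY) -> str:
    """Return a hex tag binding the metadata to the key."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_metadata(metadata: dict, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Check the tag in constant time; any change to the metadata fails."""
    expected = sign_metadata(metadata, key)
    return hmac.compare_digest(expected, tag)


record = {"generator": "example-model", "created": "2024-08-05"}
tag = sign_metadata(record)

print(verify_metadata(record, tag))    # True
tampered = dict(record, generator="human")
print(verify_metadata(tampered, tag))  # False
```

The trade-off is that metadata can simply be stripped from a text, so a missing tag proves nothing either way; only a valid tag is conclusive.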

OpenAI says it is focusing on audiovisual content, which it considers higher risk. Its updated C2PA-based image provenance system for DALL-E 3 now tracks whether and how AI-generated images are edited after generation.

Image: OpenAI

Original article from August 4, 2024:

OpenAI has developed technology to reliably detect AI-generated text, according to inside sources and documents reported by the Wall Street Journal. However, the company is reluctant to release it, likely due to concerns about its own business model.

bnew · Aug 28, 2024

Former OpenAI researcher believes company is "fairly close" to AGI and not prepared for it

About half of OpenAI's AGI/ASI safety researchers have left the company recently, according to a former employee. The departures likely stem from disagreements over managing the risks of potential superintelligent AI.

the-decoder.com

AI in practice

Aug 27, 2024

Former OpenAI researcher believes company is "fairly close" to AGI and not prepared for it

Midjourney prompted by THE DECODER

Matthias Bastian

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.

About half of OpenAI's AGI/ASI safety researchers have left the company recently, according to a former employee. The departures likely stem from disagreements over managing the risks of potential superintelligent AI.

Daniel Kokotajlo, a former OpenAI safety researcher, told Fortune magazine that around half of the company's safety researchers have departed, including prominent leaders.

While Kokotajlo didn't comment on specific reasons for all the resignations, he believes they align with his own views: OpenAI is "fairly close" to developing artificial general intelligence (AGI) but isn't prepared to "handle all that entails."

This has led to a "chilling effect" on those trying to publish research on AGI risks within the company, Kokotajlo said. He also noted an "increasing amount of influence by the communications and lobbying wings of OpenAI" on what's deemed appropriate to publish.

The temporary firing of OpenAI CEO Sam Altman was also linked to safety concerns. A law firm cleared Altman after his reinstatement.

Of about 30 employees working on AGI safety issues, around 16 remain. Kokotajlo said these departures weren't a "coordinated thing" but rather people "individually giving up."

Notable departures include Jan Hendrik Kirchner, Collin Burns, Jeffrey Wu, Jonathan Uesato, Steven Bills, Yuri Burda, Todor Markov, and OpenAI co-founder John Schulman.

The resignations of chief scientist Ilya Sutskever and Jan Leike, who jointly led the company's "superalignment" team focused on future AI system safety, were particularly significant. OpenAI subsequently disbanded this team.

Experts leave OpenAI, but not AGI

Kokotajlo expressed disappointment, but not surprise, that OpenAI opposed California's SB 1047 bill, which aims to regulate advanced AI system risks. He co-signed a letter to Governor Newsom criticizing OpenAI's stance, calling it a betrayal of the company's original plans to thoroughly assess AGI's long-term risks for developing regulations and laws.
