Large Language Models News & Discussions

bnew · Jan 26, 2025

1/1
@MunchBaby1337

**ByteDance Unveils Doubao-1.5-pro**

/search?q=#db

- **Deep Thinking Mode**: Surpasses O1-preview and O1 on the AIME benchmark.

- **Benchmark Beast**: Outperforms deepseek-v3, gpt4o, and llama3.1-405B across multiple benchmarks.

- **MoE Magic**: Utilizes a Mixture of Experts architecture, with significantly fewer active parameters than competitors.

- **Performance Leverage**: Achieves dense model performance with just 1/7 of the parameters (20B active = 140B dense equivalent).

- **Tech Talk**: Employs a heterogeneous system design for prefill-decode and attention-FFN, optimizing throughput with low latency.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

ByteDance AI Introduces Doubao-1.5-Pro Language Model with a 'Deep Thinking' Mode and Matches GPT 4o and Claude 3.5 Sonnet Benchmarks at 50x Cheaper

The artificial intelligence (AI) landscape is evolving rapidly, but this growth is accompanied by significant challenges. High costs of developing and deploying large-scale AI models and the difficulty of achieving reliable reasoning capabilities are central issues. Models like OpenAI’s GPT-4...

www.marktechpost.com

ByteDance AI Introduces Doubao-1.5-Pro Language Model with a ‘Deep Thinking’ Mode and Matches GPT 4o and Claude 3.5 Sonnet Benchmarks at 50x Cheaper

By

Asif Razzaq

-

January 25, 2025

The artificial intelligence (AI) landscape is evolving rapidly, but this growth is accompanied by significant challenges. High costs of developing and deploying large-scale AI models and the difficulty of achieving reliable reasoning capabilities are central issues. Models like OpenAI’s GPT-4 and Anthropic’s Claude have pushed the boundaries of AI, but their resource-intensive architectures often make them inaccessible to many organizations. Additionally, addressing long-context understanding and balancing computational efficiency with accuracy remain unresolved challenges. These barriers highlight the need for solutions that are both cost-effective and accessible without sacrificing performance.

To address these challenges, ByteDance has introduced Doubao-1.5-pro, an AI model equipped with a “Deep Thinking” mode. The model demonstrates performance on par with established competitors like GPT-4o and Claude 3.5 Sonnet while being significantly more cost-effective. Its pricing stands out, with $0.022 per million cached input tokens, $0.11 per million input tokens, and $0.275 per million output tokens. Beyond affordability, Doubao-1.5-pro outperforms models such as deepseek-v3 and llama3.1-405B on key benchmarks, including the AIME test. This development is part of ByteDance’s broader efforts to make advanced AI capabilities more accessible, reflecting a growing emphasis on cost-effective innovation in the AI industry.

Screenshot-2025-01-25-at-7.50.30 PM-1-1024x704.png

Screenshot-2025-01-25-at-7.50.48 PM-1-1024x726.png

Technical Highlights and Benefits

Doubao-1.5-pro’s strong performance is underpinned by its thoughtful design and architecture. The model employs a sparse Mixture-of-Experts (MoE) framework, which activates only a subset of its parameters during inference. This approach allows it to deliver the performance of a dense model with only a fraction of the computational load. For instance, 20 billion activated parameters in Doubao-1.5-pro equate to the performance of a 140-billion-parameter dense model. This efficiency reduces operational costs and enhances scalability.

The model also integrates a heterogeneous system design for prefill-decode and attention-FFN tasks, optimizing throughput and minimizing latency. Additionally, its extended context windows of 32,000 to 256,000 tokens enable it to process long-form text more effectively, making it a valuable tool for applications like legal document analysis, academic research, and customer service.

Screenshot-2025-01-25-at-7.46.05 PM-1-1024x651.png

Results and Insights

Performance data highlights Doubao-1.5-pro’s competitiveness in the AI landscape. It matches GPT-4o in reasoning tasks and surpasses earlier models, including O1-preview and O1, on benchmarks like AIME. Its cost efficiency is another significant advantage, with operational expenses 5x lower than DeepSeek and over 200x lower than OpenAI’s O1 model. These factors underscore ByteDance’s ability to offer a model that combines strong performance with affordability.

Early users have noted the effectiveness of the “Deep Thinking” mode, which enhances reasoning capabilities and proves valuable for tasks requiring complex problem-solving. This combination of technical innovation and cost-conscious design positions Doubao-1.5-pro as a practical solution for a range of industries.

Screenshot-2025-01-25-at-7.51.12 PM-1-1024x714.png

Conclusion

Doubao-1.5-pro exemplifies a balanced approach to addressing the challenges in AI development, offering a combination of performance, cost efficiency, and accessibility. Its sparse Mixture-of-Experts architecture and efficient system design provide a compelling alternative to more resource-intensive models like GPT-4 and Claude. By prioritizing affordability and usability, ByteDance’s latest model contributes to making advanced AI tools more widely available. This marks an important step forward in AI development, reflecting a broader shift towards creating solutions that meet the needs of diverse users and organizations.

bnew · Jan 26, 2025

cobra said:
Nǐ hǎo shìjiè

i didnt like deepseek's thinking - feels like spam - just give me the answer

you can deselect the thinking option. I switch between thinking mode and V3 model in the same conversation for some multi-step prompts.

Insensitive · Jan 27, 2025

Waiting for more China wins.
:banderas:

@DeepSeek

Fillerguy · Jan 27, 2025

Deepspeak answers questions by first explaining how it came to that answer. Even on simple shyt. This shyt is annoying to most people but I appreciate it.

Tribaligenesis · Jan 27, 2025

Fillerguy said:
Deepspeak answers questions by first explaining how it came to that answer. Even on simple shyt. This shyt is annoying to most people but I appreciate it.

It's far better in order to grasp an understanding of the subject at hand. This hopefully supersedes openAI

bnew · Jan 28, 2025

1/12
@Saboo_Shubham_
Qwen2.5 Max is a new large-scale MoE model from China that outperforms DeepSeek v3, Claude Sonnet 3.5, GPT-4o and Llama-3 405B.

It is available to use as an OpenAI like API and at much less cost.

Everyday in AI is now about China. Let that sink in.

2/12
@Saboo_Shubham_
I will be adding more AI Agent apps using Qwen2.5 Max in the future.

You can find all the awesome LLM Apps with AI Agents and RAG in the following Github Repo.

P.S: Don't forget to star the repo to show your support

GitHub - Shubhamsaboo/awesome-llm-apps: Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.

3/12
@Saboo_Shubham_
50+ Step-by-step tutorials of LLM apps with AI Agents and RAG.

P.S: Don't forget to subscribe for FREE to access future tutorials.

https://theunwindai.com

4/12
@Saboo_Shubham_
If you find this useful, RT to share it with your friends.

Don't forget to follow me @Saboo_Shubham_ for more such LLM tips and AI Agent, RAG tutorials.

[Quoted tweet]
Qwen2.5 Max is a new large-scale MoE model from China that outperforms DeepSeek v3, Claude Sonnet 3.5, GPT-4o and Llama-3 405B.

It is available to use as an OpenAI like API and at much less cost.

Everyday in AI is now about China. Let that sink in.

5/12
@KairosDataLabs
Cray week in AI.

6/12
@Saboo_Shubham_
100% agree.

7/12
@Gargi__Gupta
Chinese New Year started with an AI festival

8/12
@Saboo_Shubham_
Its an AI revolution at this point lol

9/12
@AILeaksAndNews
China is accelerating

10/12
@Saboo_Shubham_
Totally at an exponential rate.

11/12
@xdrmsk
In a week, decades are happening!!!

12/12
@Saboo_Shubham_
Those are the right words.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/31
@Alibaba_Qwen
The burst of DeepSeek V3 has attracted attention from the whole AI community to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive performance against the top-tier models, and outcompetes DeepSeek V3 in benchmarks like Arena Hard, LiveBench, LiveCodeBench, GPQA-Diamond.

Blog: Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model

Qwen Chat: Qwen Chat (choose Qwen2.5-Max as the model)

API: Make your first API call to Qwen - Alibaba Cloud Model Studio - Alibaba Cloud Documentation Center （check the code snippet in the blog）

HF Demo: Qwen2.5 Max Demo - a Hugging Face Space by Qwen

In the future, we not only continue the scaling in pretraining, but also invest in the scaling in RL. We hope that Qwen is able to explore the unknown in the near future!

Thank you for your support during the past year. See you next year!

2/31
@Alibaba_Qwen
Results of base language models. We are confident in the quality of our base models and we expect the next version of Qwen will be much better with our improved post-training methods.

3/31
@Alibaba_Qwen
It is interesting to play with this new model. We hope you enjoy the experience in Qwen Chat:

Qwen Chat

https://video.twimg.com/ext_tw_video/1884260770374115329/pu/vid/avc1/1280x720/OU7GghDaR4_gJloI.mp4

4/31
@Alibaba_Qwen
Also, it is available to HF demo, and it is on Any Chat as well!

Qwen2.5 Max Demo - a Hugging Face Space by Qwen

5/31
@Alibaba_Qwen
Welcome to use the API through the service of Alibaba cloud. Using the API is as easy as using any other OpenAI-API compatible service.

6/31
@mkurman88
Looks good

7/31
@securelabsai
V3 or R1?

8/31
@Yuchenj_UW
Happy new year Qwen!

9/31
@raphaelmansuy
Happy new Year of The Snake / From Hong Kong

10/31
@Urunthewizard
yoooooo thats cool! Is it open source like deepseek?

11/31
@SynquoteIntern
"Sir, another Chinese model has hit the timeline."

12/31
@koltregaskes
Happy New Year and thank you guys.

13/31
@iamfakhrealam
Ahaaa… Happy Lunar Year to you guys and specially to @sama

14/31
@hckinz
Lol, another one and this time they are not even comparing Claude 3.5 on coding

15/31
@octorom
Android app in the works?

16/31
@Cloudtheboi
Currently using qwen to search websites. It's great!

17/31
@luijait_
We claim a test time scaling GRPO RL over this base model

18/31
@yupiop12
based based based based based waow...

19/31
@AntDX316
Non-stop cooking.

20/31
@marjan_milo
A takedown of everything OpenAI has shown so far.

21/31
@TepuKhan
恭喜发财

22/31
@tom777cruise
butthole logo

23/31
@LuminEthics
Tweet Storm Response: Qwen2.5-Max vs. DeepSeek V3—But Where’s the Accountability?

1/ Qwen2.5-Max steps into the spotlight!
With benchmarks outpacing DeepSeek V3, it’s clear the MoE (Mixture of Experts) race is heating up.
But as models compete on performance, we need to ask:

What ethical safeguards are in place?

Who ensures transparency and alignment?
/search?q=#AI /search?q=#Governance

24/31
@vedu023
The race just keeps getting more exciting…!!

25/31
@elder_plinius

26/31
@vedangvatsa
Read about Liang Wenfeng, the Chinese entrepreneur behind DeepSeek, the AI App challenging ChatGPT:

[Quoted tweet]
Liang Wenfeng - Founder of DeepSeek

Liang was born in 1985 in Guangdong, China, to a modest family.

His father was a school teacher, and his values of discipline and education greatly influenced Liang.

Liang pursued his studies at Zhejiang University, earning a master’s degree in engineering in 2010.

His research focused on low-cost camera tracking algorithms, showcasing his early interest in practical AI applications.

In 2015, he co-founded High-Flyer, a quantitative hedge fund powered by AI-driven algorithms.

The fund grew rapidly, managing over $100 billion, but he was not content with just the financial success.

He envisioned using AI to solve larger, more impactful problems beyond the finance industry.

In 2023, Liang founded DeepSeek to create cutting-edge AI models for broader use.

Unlike many tech firms, DeepSeek prioritized research and open-source innovation over commercial apps.

Liang hired top PhDs from universities like Peking and Tsinghua, focusing on talent with passion and vision.

To address US chip export restrictions, Liang preemptively secured 10,000 Nvidia GPUs.

This strategic move ensured DeepSeek could compete with global leaders like OpenAI.

DeepSeek's AI models achieved high performance at a fraction of the cost of competitors.

Liang turned down a $10 billion acquisition offer, stating that DeepSeek’s goal was to advance AI, not just profit.

He advocates for originality in China’s tech industry, emphasizing innovation over imitation.

He argued that closed-source technologies only temporarily delay competitors and emphasized the importance of open innovation.

Liang credits his father’s dedication to education for inspiring his persistence and values.

He believes AI should serve humanity broadly, not just the wealthy or elite industries.

27/31
@Mira_Network

28/31
@snats_xyz
any chances of a paper / release of weights or something similar at some point?

29/31
@LechMazur
18.6 on NYT Connections, up from 14.8 for Qwen 2.5 72B. I'll run my other benchmarks later.

30/31
@daribigboss
Absolutely love this project! Let’s connect , send me a DM now!

https://twitter.com/messages/compos...+joining+forces?+💎&recipient_id=203762854

31/31
@shurensha
Man OpenAI can't catch a break

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Jan 28, 2025

https://archive.is/0apVa

1/11
@RnaudBertrand
This is pretty hilarious in retrospect.

In India in 2023, Altman was asked how if a small, smart team with a budget of $10 million could build something substantial within AI.

His reply: "It’s totally hopeless to compete with us on training foundation models"

[Quoted tweet]
Sam Altman - founder of OpenAI and ChatGPT - is in India and VCs are asking some tough questions to him

https://video.twimg.com/amplify_video/1666438180323696642/vid/1920x1080/TiS26MpJ4GkCkxu6.mp4

2/11
@minotauronlucy
How many foundation models has India trained since then? Zero.

There is no point bashing Sam Altman. India has not even produced a byte worth of weights to even snide at openAI.

3/11
@RnaudBertrand
Doesn't mean it wasn't possible, as Deepseek demonstrated...

4/11
@terrybythebay

[Quoted tweet]
DeepSeek was able to build their R1 model for only $6M because they bought all their GPUs directly from Temu.

5/11
@TheJesseMK
Do you think DeepSeek had a budget under $10M?

6/11
@1Paul_1
“The fool doth think he is wise, but the wise man knows himself to be a fool”

7/11
@tate_terminal
with a mere $10 million, one could readily purchase a mirror for altman to gaze upon his own hubris, a modern-day icarus flying too close to the silicon sun.

8/11
@AGItechgonewild
True LMAO!!

9/11
@tacobelmin
You missed the part where he said “you should try anyway”!

10/11
@aledeniz
DeepSeek doesn’t have a $10 million budget though

They spend more than that – likely a multiple – just in wages.

11/11
@junyongz
Things would make sense if he added within “2 years”

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Jan 28, 2025

1/21
@RnaudBertrand
The denial is frankly unreal. They're still pushing for the chips export controls when they now couldn't have a better illustration that it's so self-defeating.

Again, continued decoupling by building walls and barriers means that it's the U.S. that's becoming a closed system. And in tech a closed system eventually loses momentum while an open one gains it.

The U.S. is very much facing its a red/blue pill moment: it can either take the blue pill of comfort - hiding behind walls, bans and comforting anti-China propaganda, all the band-aids that don't address the key issue: the fact that China is increasingly better. Or they can swallow the red pill and try to understand and adapt to the world they now live in. And just like in The Matrix, the longer they wait, the more shocking the eventual awakening becomes.

[Quoted tweet]
Anthropic CEO Dario Amodei says while DeepSeek may be able to smuggle 50,000 H100s, it would be very difficult to smuggle the hundreds of thousands or millions of chips required to continue to compete with American companies in AI

https://video.twimg.com/ext_tw_video/1883974939470094339/pu/vid/avc1/720x720/6ovb9kwRGqirIQVp.mp4

2/21
@RnaudBertrand
And on top of that he's wrong since Deepseek is using Huawei chips for inference

(the development of those chips by Huawei being another direct effect of the export controls and sanctions)

[Quoted tweet]
I feel this should be a much bigger story: DeepSeek has trained on Nvidia H800 but is running inference on the new home Chinese chips made by Huawei, the 910C.

3/21
@st_aubrun

[Quoted tweet]
If DeepSeek were a US company it would now have a valuation of about 3 trillion

4/21
@RnaudBertrand
Probably correct

5/21
@deed_deeds
Dario is talking like a ning nong

6/21
@RSA_Observer
Probably too late anyway:

"China's new AI chip outperforms NVIDIA's most powerful GPU A team of researchers from Beijing, led by Professors Fang Lu and Dai Qionghai of Tsinghua University, has unveiled the world's first fully optical artificial intelligence (AI) chip. Named Taichi-II, this groundbreaking innovation marks a significant milestone in the field of optical computing. The chip has outperformed NVIDIA's H100 GPU in terms of energy efficiency and performance."

7/21
@MuhumuzaMaurice
Playing Go (after playing Chess) gives you a sense of how two things can be great and yet different in approach and consequence.

One may argue that the Americans are assessing the Chinese Chess Board and marking themselves right. Meanwhile the Chinese continue to extend their understanding of a superior game which their only potential opponent is even refusing to acknowledge is better in outcome prediction. Simply because it appears to have pedestrian rules of engagement.

8/21
@michaeltanyk
Anthropic is on the first row of firing squad. This guy is shaking. Spectrum people lie badly.

9/21
@awaken_tom
Crazy that Anthropic is pushing "AI safety" and a chip blockade of China, while they themselves are conducting "gain of function" research with malicious AIs and teaching their models to lie and cover up "uncomfortable" truths. What could go wrong?

10/21
@chickadeedee3
"Parity"

11/21
@johann_theron
Pushing realization to the next generation is normal these days, because Americans treat dogs better than children. Check fewer children correlat with more

.

12/21
@Bob72838565
He is a typical clueless American CEO

13/21
@arscrypta
Singapore just buys more.

14/21
@carismachet
Narcissism is a hell of a drug

15/21
@DottorPav

from

16/21
@Steve90315595
Ultimately the unipolar west will isolate themselves from the Multipolar/ BRICS nations completely given the west's 'my way or the highway' stance on global economics.

If the Anglosphere cannot win the game they WILL flip the game board which makes them an existential threat.

17/21
@Aishalifett
The term ‘free market’ is often used by the West as a facade to mask the manipulative practices it employs. In truth, the West’s so-called free market is far from free. /search?q=#DeepSeekR1

18/21
@amarinica
I think this is normal human behaviour. Difficult to see anyone react in a manner that admits defeat or any personal fault. The play is to get fired and get the comp package, not admit incompetence and resign.

19/21
@hx_dks
He is not even a real scientist or an engineer

20/21
@ethicalzac
Exactly, so the only lesson learned from DeepSeek is to buy more Nvidia chips and blow more hot air into our markets

21/21
@LaniRefiti
What the DeepSeek episode has demonstrated is the old adage of "necessity is the mother of all invention"
Denied advanced chips, the DeepSeek team instead innovated and came up with a really innovative and efficient way to train LLM's at a fraction of the cost.

Plus they made the thing open source!

I'm skeptical on the whole 50,000 H100's given it's open source. Any lab worth it's salt should be able to replicate or disprove what DeepSeek did on general purpose GPU's. Let's see some actual data.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Jan 28, 2025

1/21
@rowancheung
NEWS: DeepSeek just dropped ANOTHER open-source AI model, Janus-Pro-7B.

It's multimodal (can generate images) and beats OpenAI's DALL-E 3 and Stable Diffusion across GenEval and DPG-Bench benchmarks.

This comes on top of all the R1 hype. The

is cookin'

2/21
@rowancheung
Link: deepseek-ai/Janus-Pro-7B · Hugging Face

3/21
@rowancheung
For those wondering my quick take on what's happening right now with R1 and Janus

1. GPU demand will not go down
2. OpenAI is not done for, but Open source and China are showing they're far closer than anticipated
3. There's way too much misinfo being spread by mainstream media right now (almost seems on purpose?)
4. DeepSeek open-sourcing R1 is still a huge gift to developers and overall AI progress

I haven't seen this much confusion and uncertainty on my TL for ages...

4/21
@rowancheung
That said, I'm shocked we haven't heard any response from @nvidia or @OpenAI yet

5/21
@rowancheung
Also a reminder that DeepSeek R1 dropped 6 days ago, but the market only reacted today

Wall Street, along with 99% of the world, still has trouble keeping up on AI

The easiest way to stay ahead in just 5-min per day (and get news like DeepSeek live): The Rundown AI

6/21
@ObiUzi
AAAAA

7/21
@Martinoleary

[Quoted tweet]
Live footage of Sam walking to work today.

https://video.twimg.com/ext_tw_video/1883914184699604992/pu/vid/avc1/320x172/_YAdbhQ9OkcBneBD.mp4

8/21
@mhdfaran
DeepSeek coming in hot with Janus-Pro-7B like, "Beat this, OpenAI!"

9/21
@HighlyRetired

10/21
@ObiUzi
I don’t feel good doc

11/21
@BeginnersinAI
This is great for competition. These last two models are going to push the established players to up their game.

12/21
@AIRoboticsInt
Seems like @elonmusk has a point

[Quoted tweet]
Elon Musk on DeepSeek:

He says, DeepSeek “obviously” has ~50,000 Nvidia H100 chips that they can’t talk about due to US export controls.

Interesting.

13/21
@WealthArchives
Deepseek dropped another model

https://video.twimg.com/ext_tw_video/1883920660377808896/pu/vid/avc1/720x1280/-3NsWWOTMa5QyJWC.mp4

14/21
@RealStarTrump
Supposedly, if you ask deepseek to identify itself, it calls itself ChatGPT which would indicate illicit training data.

Something for devout autists to confirm or deny.

15/21
@CastelMaker
OpenAI after investing 500B

16/21
@laplacesdust
Its over

17/21
@dula2006
Wait until you see GROK 3! @grok

18/21
@czverse
Janus-Pro-7B is turning up the heat! Multimodal dominance + open-source = game changer. DeepSeek’s

isn’t just cookin’, it’s serving a feast

19/21
@space_ace84
Can it generate this image?

20/21
@0xAdin

21/21
@SecretNetwork
For those worried about privacy conerns and data harvesting.

Integrate confidential computing and get the benefits without the concerns.

[Quoted tweet]
x.com/i/article/188246991007…

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Feb 4, 2025

1/2
@_The_AI_Guy_
Budget Forcing: A Simple Hack for Adaptive Reasoning

Extend reasoning: Insert “Wait” to block.

Force a final answer: Insert “Final Answer:” to encourage.

This was introduced in the “s1: simple test-time scaling” paper.

Link below

2/2
@_The_AI_Guy_
Paper: s1: Simple test-time scaling

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/1
@EmergentMind
New research introduces a simple test-time scaling technique using "budget forcing" that improves reasoning performance, allowing the s1 model to outperform OpenAI's o1 model by up to 27% on math questions, with all resources open-sourced.: Sign Up

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/9
@arankomatsuzaki
Stanford presents:

s1: Simple test-time scaling

- Seeks the simplest approach to achieve test-time scaling and strong reasoning performance
- Exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24)
- Model, data, and code are open-source

2/9
@arankomatsuzaki
repo: GitHub - simplescaling/s1: s1: Simple test-time scaling
abs: s1: Simple test-time scaling

3/9
@shannonNullCode
This explains why asking “are you sure” produces higher quality code outputs.

4/9
@edwardcfrazer
this is incredibly elegant — just keep appending “wait”, rerunning, and strong reasoning emerges quickly after.

I wonder what type of distillation result you can get by approximating this on a very strong base model like 3.5 sonnet?

5/9
@gdbsm1
@Readwise save thread

6/9
@rrostt
Wait

7/9
@GreatKingCnut
its distillation, since a smarter model made the traces

8/9
@navtechai
Notable achievement in test-time scaling. Can you elaborate on how the approach simplifies the process?

9/9
@AI_Fun_times
Exciting to see Stanford's Simple test-time scaling approach advancing reasoning performance by 27% on math questions!

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/21
@Muennighoff
DeepSeek r1 is exciting but misses OpenAI’s test-time scaling plot and needs lots of data.

We introduce s1 reproducing o1-preview scaling & performance with just 1K samples & a simple test-time intervention.

s1: Simple test-time scaling

2/21
@chrmanning
But, fwiw, they did provide a test-time scaling plot for their DeepSeek-R1-Lite-Preview! (Even though, as with the OpenAI models, it’s relationship to the later non-preview model isn’t totally clear.)

DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! | DeepSeek API Docs

3/21
@Muennighoff
Thanks for mentioning it! Yes, it's unclear how it relates to the latest r1 and also how they produced that plot in the first place.

4/21
@qtnx_
congrats!! great work

5/21
@EitanTurok
so simple! nice!

6/21
@LottoLabs
Gonna have to read this one

7/21
@Navod_Jaya
very well made

8/21
@kexun_zhang
@hahahahohohe look how their idea is the same as yours.

9/21
@jonathanrlarkin
Fantastic paper. Congrats!

10/21
@timfduffy
One surprise for me here especially for the MATH500 is how linear it is. I would have expected benchmark progress to have a relationship with e^test_tokens more like this logistic function.

11/21
@sivil_taram
Congrats Niklas! The LIMA moment for reasoning model!

12/21
@lvwerra
Congrats Niklas!

13/21
@Aiden_Novaa
hey! interesting findings on deepseek r1. at jenova ai we've actually integrated both o3-mini and r1 into our model router - fascinating to see how r1 performs great in coding/math but struggles w/ creative tasks. that's why we route different tasks to diff models. your s1 results look promising btw! test-time scaling is def crucial for real-world applications

14/21
@MarketerSc50321
s1 sounds like it’s got the magic touch!

15/21
@Cameron_Chann
Congrats! Really interesting idea to mitigate the “under thinking” phenomenon.

16/21
@WenhuChen
Nice work!

17/21
@AIVideoTech
Exciting to see advancements in scaling & performance with fewer samples & streamlined test-time interventions!

18/21
@AI_Fun_times
Exciting to see strides in reproducibility & performance optimization! Scaling with just 1K samples is a game-changer. Curious to see more on the test-time intervention.

19/21
@Nupvss
I have always believed that repeated checking is the key to improving model capabilities. GRPO only screens out high-scoring answers. It is difficult for the answers to be close to the target score, but repeated checking can slowly bring the answers closer to the target score.

20/21
@llmpromptu
really enjoyed reading this paper, had a couple questions. wrote them out more here (Alex Estes on Substack) but to put it briefly, curious about thought process behind distilling on a reasoning model, and why you didn't attempt BF on the baseline qwen model?

21/21
@Muennighoff
Thanks! BF on baseline is hard because:
- It has no designated thinking part so needs prompt engineering
- It produces a much shorter CoT
- It has worse reasoning performance - the closer to 0 the harder to do BF

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/6
@TheTuringPost
s1 - a new simple! open-source test-time scaling approach from @Stanford.

With s1 researchers found the simplest way to improve reasoning through test-time scaling.

s1's innovations:

• A small s1K dataset with 1,000 tough and diverse questions, each with detailed reasoning steps for training.
• Budget forcing to controls how long the model thinks.

What about the results of s1?

- It gets up to 27% higher scores on math problems compared to OpenAI’s o1-preview model.
- Performance jumped from 50% to 57% on a math competition even without extra test-time optimizations.

More details:

2/6
@TheTuringPost
s1K dataset:

The researchers started with a big pool of 59,000 questions from many different sources. But instead of using all of them, they carefully filtered the dataset down to just 1,000 best questions by focusing on quality, difficulty, and diversity.

3/6
@TheTuringPost
s1 test-time scaling approach with budget forcing:

It's focused on sequential scaling, when the model’s later thoughts build on its earlier ones, refining its answer step by step.

Budget forcing, a special simple, trick controls how long the model thinks:

• If the model stops thinking too soon, they force it to keep going by adding “Wait” to its reasoning process.
• If the model thinks for too long, they make it stop early by inserting a “Final Answer” tag.

4/6
@TheTuringPost
Results of s1-32B:

• Researchers trained their model, s1-32B, in just 26 minutes on 16 NVIDIA H100 GPUs on 1,000 examples using basic next-word prediction.

• The more test-time compute s1-32B gets, the better it performs.

• By controlling its thinking time with budget forcing, the model outperformed OpenAI’s o1-preview on tough math questions.

• It almost matched Google’s Gemini 2.0 Thinking model on AIME24.

5/6
@TheTuringPost
The main insights:

Surprisingly, even when they tested with 59,000 examples, the model didn’t get much better— this shows that smart data selection (quality + difficulty + diversity) is key to efficient AI training.

Budget forcing gives the best balance of control, scaling, and performance.

6/6
@TheTuringPost
Paper: s1: Simple test-time scaling
Code: GitHub - simplescaling/s1: s1: Simple test-time scaling

@Stanford, @UW, @allen_ai, @ContextualAI

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

https://arxiv.org/pdf/2501.19393

bnew · Feb 6, 2025

Image Credits:ByteDance

Deepfake videos are getting shockingly good

Kyle Wiggers

8:18 AM PST · February 4, 2025

Researchers from TikTok owner ByteDance have demoed a new AI system, OmniHuman-1, that can generate perhaps the most realistic deepfake videos to date.

Deepfaking AI is a commodity. There’s no shortage of apps that can insert someone into a photo, or make a person appear to say something they didn’t actually say. But most deepfakes — and video deepfakes in particular — fail to clear the uncanny valley. There’s usually some tell or obvious sign that AI was involved somewhere.

Not so with OmniHuman-1 — at least from the cherry-picked samples the ByteDance team released.

Here’s a fictional Taylor Swift performance. Here’s a TED Talk that never took place. And here’s a deepfaked Einstein lecture:

According to the ByteDance researchers, OmniHuman-1 only needs a single reference image and audio, like speech or vocals, to generate a clip of an arbitrary length. The output video’s aspect ratio is adjustable, as is the subject’s “body proportion” — i.e. how much of their body is shown in the fake footage.

Trained on 19,000 hours of video content from undisclosed sources, OmniHuman-1 can also edit existing videos — even modifying the movements of a person’s limbs. It’s truly astonishing how convincing the result can be.

Granted, OmniHuman-1 isn’t perfect. The ByteDance team says that “low-quality” reference images won’t yield the best videos, and the system seems to struggle with certain poses. Note the weird gestures with the wine glass in this video:

Still, OmniHuman-1 is easily heads and shoulders above previous deepfake techniques, and it may well be a sign of things to come. While ByteDance hasn’t released the system, the AI community tends not to take long to reverse-engineer models like these.

The implications are worrisome.

Last year, political deepfakes spread like wildfire around the globe. On election day in Taiwan, a Chinese Communist Party-affiliated group posted AI-generated, misleading audio of a politician throwing his support behind a pro-China candidate. In Moldova, deepfake videos depicted the country’s president, Maia Sandu, resigning. And in South Africa, a deepfake of rapper Eminem supporting a South African opposition party circulated ahead of the country’s election.

Deepfakes are also increasingly being used to carry out financial crimes. Consumers are being duped by deepfakes of celebrities offering fraudulent investment opportunities, while corporations are being swindled out of millions by deepfake impersonators. According to Deloitte, AI-generated content contributed to more than $12 billion in fraud losses in 2023, and could reach $40 billion in the U.S. by 2027.

Last February, hundreds in the AI community signed an open letter calling for strict deepfake regulation. In the absence of a law criminalizing deepfakes at the federal level in the U.S., more than 10 states have enacted statutes against AI-aided impersonation. California’s law — currently stalled — would be the first to empower judges to order the posters of deepfakes to take them down or potentially face monetary penalties.

Unfortunately, deepfakes are hard to detect. While some social networks and search engines have taken steps to limit their spread, the volume of deepfake content online continues to grow at an alarmingly fast rate.

In a May 2024 survey from ID verification firm Jumio, 60% of people said they encountered a deepfake in the past year. Seventy-two percent of respondents to the poll said they were worried about being fooled by deepfakes on a daily basis, while a majority supported legislation to address the proliferation of AI-generated fakes.

bnew · Feb 10, 2025

1/1
@stefan_fee

Breaking: Comprehensive & Rigorous Evaluation on AIME 2025 (Part-I) are out!

Extensive evaluation of latest models:
• DeepSeek-R1 family (Distill-llama, Qwen, 1.5B, 7B, 14B, 32B)
• o3-mini serieso3-mini (high, medium, low)
• Gemini 2.0 Flash-Thinking
• More: LIMO, s1, QwQ

Deep dive into temperature impact analysis & performance trends

Check it out

GitHub - GAIR-NLP/AIME-Preview

[Quoted tweet]

AIME-Preview: A real-time benchmark for math reasoning models on AIME 2025!

See how DeepSeek, O Series, LIMO & others performed on Part 1

Evaluate your own models with our open-source scripts

Part 2 results coming soon
github.com/GAIR-NLP/AIME-Pre…

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/5
@BLeavesYe

AIME-Preview: A real-time benchmark for math reasoning models on AIME 2025!

See how DeepSeek, O Series, LIMO & others performed on Part 1

Evaluate your own models with our open-source scripts

Part 2 results coming soon
GitHub - GAIR-NLP/AIME-Preview

2/5
@BLeavesYe

Math reasoning models are surprisingly temperature-sensitive!

Performance swings up to 15% across temperature settings
Full analysis in our repo

3/5
@sytelus
More interesting numbers will be 1 shot performance at default temperature set by API provider. Humans don’t get to try 8 times. Why should models?

4/5
@YouJiacheng
wait, s1 only 32.9?

5/5
@hengcherkeng
my r1-distill-qwen-14b for aime01-2025 is at:
AI Mathematical Olympiad - Progress Prize 2
The prompt is important.
message = [
{'role': 'system', 'content': 'Solve the following math problem. Put your final answer within \\boxed{}.'},
{'role': 'user', 'content': row.question},]

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Feb 10, 2025

1/1
@stefan_fee
/search?q=#LIMO: The "Less is More" Law in LLM Reasoning

(1) 817 training data with 57.1% AIME: We discovered the "Less is More" law in complex reasoning: In the American Invitational Mathematics Examination (AIME), LIMO's accuracy soared from 6.5% (compared to traditional methods like Numina-Math) to 57.1%. On the MATH benchmark, performance improved from 59.2% to 94.8%.

(2) SFT can generalize surprisingly well : the model trained with just over 817 examples demonstrated remarkable generalization capabilities. Across 10 different benchmarks, it achieved a 40.5% absolute performance improvement, surpassing models trained with 100 times more data. This challenges the conventional belief that "supervised fine-tuning primarily leads to memorization rather than generalization."

(3) insights about Less-is-More and RL Scaling:

- (3.1) While RL Scaling methods like DeepSeek-R1 have become the dominant paradigm, LIMO offers a more fundamental perspective: reasoning capabilities are inherent (or can be inherent) in large models, with the key challenge being how to find optimal activation trajectories. This insight not only repositions RL Scaling as one specific method for finding optimal reasoning paths but also pioneers a new research paradigm - shifting from "training new capabilities" to "eliciting latent capabilities."

- (3.2) Why is "less is more for reasoning" happening now, after two years (since LIMA - less is more for alignment)? We attribute this to two foundations:

First, the knowledge foundation. Recent LLMs have incorporated vast mathematical knowledge during pre-training. For instance, while Llama 2 had 1.8T tokens of general training data, Llama 3 used 3.7 trillion tokens for mathematical reasoning alone. This suggests modern LLMs already "know" extensive mathematical knowledge; the key is "awakening" it.
Second, the reasoning computation foundation. Recent research shows that the length of chain-of-thought (CoT) reasoning closely correlates with model reasoning ability. Rather than force-feeding large-scale supervised data during training, providing higher-quality problems and demonstrations during inference allows models to develop deeper thinking independently.
Both elements are essential.

(4) Based on LIMO, we anticipate future shifts:
- From "how to train" to "how to elicit"
- From "capability acquisition" to "trajectory search"
- From "passive Less-is-More" to "active Less-is-More" (building a tech stack that enables extremely data-efficient learning of any capability)

[Quoted tweet]

How many examples does an LLM need to learn competition-level math?
Conventional wisdom: 100,000+ examples
Our discovery: Just 817 carefully chosen ones

With pure SFT, LIMO achieves:
57.1% on AIME
94.8% on MATH

LIMO: Less is More for Reasoning

arxiv.org/pdf/2502.03387

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/20
@arankomatsuzaki
LIMO: Less is More for Reasoning

Achieves 57.1% on AIME and 94.8% on MATH w/ only 817 training samples, i.e., only 1% of the training data required by previous approaches

2/20
@arankomatsuzaki
abs: LIMO: Less is More for Reasoning
repo: GitHub - GAIR-NLP/LIMO: LIMO: Less is More for Reasoning

3/20
@dmvaldman
This is why I never did my homework

4/20
@arankomatsuzaki
Honestly 817 question-answer pairs seem close in order to how many practice problems I tried to prepare for AIME when I was a kid lol

5/20
@reasoningTokens
The “problem” is that if this paper is correct, we haven’t really achieved nothing with RL so far. RL as in R1 would only unlock what’s already there

6/20
@shuklarishi720
At this point, the base model has to be insanely good and the benchmarks saturated to get these results!

Its not just the quality of the data when the data itself is so small

7/20
@BennettBuhner
So how does it scale if given equivalent datasets from prior RL techniques, now being applied here?

8/20
@dysondunbar
Wait why don’t they compare with qwen r1

9/20
@posedscaredcity
There's lessons to be learned here. Likely, that this is great for general tasks, but likely there are reasoning techniques that it won't have without a bigger scale. It also shows you don't need a dataset much larger than this if you can't achieve a greater level of diversity

10/20
@mkurman88
It looks incredible; after two epochs, the answers from my model are outstanding

11/20
@leakedweights
We need a new kind of speedrun @kellerjordan0

12/20
@axel_pond
this cant be right... i was told in no uncertain terms by twitter anons that LLMs are theoretically incapable of generalization or any real form of intelligence other that regurgitation?

13/20
@Nuanced16
Caveats: very high quality data? Does performance generalise beyond those benchmarks?

14/20
@AIVideoTech
Impressive results! LIMO's efficiency in leveraging limited data for high performance is a testament to the power of focused learning.

15/20
@Kyokopom
So this advance could imply that models or schemes like the one presented in rstarmarth would help to iteratively generate better, higher quality data, in case of not having enough high quality data?

16/20
@real_Nikhil_2
recent work from Apple :
M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference

17/20
@Kinch_ahoy
I’m really failing to get the point of this paper. They show that training on high quality answers to the hardest AIME and MATH problems helps boost scores in … AIME and MATH (with a tiny bit of generalization to other benchmarks)? Isn’t this … obvious?

18/20
@TuanPham672604
Haven't read the paper yet but manually surf through 817 examples right? What are your criteria?

19/20
@Kiss_Proof
Any chance we can achieve something like "groking" if we force much more training?

20/20
@Matagi1996
More like (few sampels) higher quality Data> (Internet full of) lower quality Data.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Feb 10, 2025

1/10
@ai_for_success
Le Chat from Mistral AI got a major upgrade, and it's really good!

They're rolling out brand-new features, along with iOS and Android apps, Pro, Team, and Enterprise tiers.

Here’s everything you need to know

1/7
Fast Answer: lowest-latency Mistral models ~1000 words / sec , currently available in preview to all users.

https://video.twimg.com/ext_tw_video/1887527490132058114/pu/vid/avc1/640x360/Ucn1ojcysIgtBc_q.mp4

2/10
@ai_for_success
2/7
Le Chat can also access information across web search with citation

https://video.twimg.com/ext_tw_video/1887527541684277248/pu/vid/avc1/584x360/X-YdsYao8bP5zHT7.mp4

3/10
@ai_for_success
3/7
Le Chat can also process image and document which is powered by the best vision and optical character recognition (OCR) models in the industry

https://video.twimg.com/ext_tw_video/1887527589604171778/pu/vid/avc1/582x360/Tv8VOUo2s2D29Btq.mp4

4/10
@ai_for_success
4/7
In-place code execution and analysis : Code interpreter in Le Chat

https://video.twimg.com/ext_tw_video/1887527706121936899/pu/vid/avc1/568x360/13LIX8tZdBEeL6Rj.mp4

5/10
@ai_for_success
5/n
- le Chat offers the vast majority of its features for free (latest models, journalism, image generation, document uploads, and more), with upgraded limits for power users starting at $14.99 per month in the Pro tier

- Starting today, le Chat is available on iOS, Android, and soon, on private infrastructure for businesses.

6/10
@ai_for_success
6/n
Image generation : Le Chat’s image generation is powered by Black Forest Labs Flux Ultra

7/10
@ai_for_success
7/n
Upload and analyze.

https://video.twimg.com/ext_tw_video/1887527881188032512/pu/vid/avc1/580x360/n8VVV4DKxs4anuSj.mp4

8/10
@VarunkInsights
A Chatbot that remembers preferences with long term memory and personalizes responses will indeed be useful

9/10
@ai_for_success
Yeah , they have done lots of improvent

10/10
@shushant_l
I've been hearing a lot about it

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Feb 14, 2025

https://archive.is/kTWfM

1/11
@dreamingtulpa
oh shyt, here we go again!

Animate Anyone 2 can replace people in videos with anyone from a single reference image

https://video.twimg.com/amplify_video/1890345165958885376/vid/avc1/720x852/ugYYYSf_hn8ZB_jN.mp4

2/11
@dreamingtulpa
the first iteration of this caused quite the fuss a year ago

[Quoted tweet]
High quality AI generated human videos are coming!

Animate Anyone can generate videos of anyone with a single image and a bit of pose guidance

humanaigc.github.io/animate-…

https://video.twimg.com/amplify_video/1730876620569718784/vid/avc1/1468x720/AALn232ln_XSEUmA.mp4

3/11
@dreamingtulpa
link to the project page

Animate Anyone 2

4/11
@DeHavenAI
Cool

5/11
@dreamingtulpa

6/11
@bearoffsghost
wow

7/11
@dreamingtulpa
wow indeed

8/11
@ThisIsMeIn360VR
I'm ready to be in the Movies 🎞

[Quoted tweet]
The most exciting thing about 3d scanning and #FaceSwap technology is that soon there will be tons of bizarre apps that emerge.
Eventually you'll be watching movies where you and your friends are the stars. The Avengers will never be the same. :-O
medium.com/@ThisIsMeIn360VR/…

9/11
@neuroautomata

10/11
@matt_barrie
There goes Hollywood

11/11
@ThisIsMeIn360VR
I'm ready to be in Ads !

[Quoted tweet]
x.com/i/article/188514147360…

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/9
@kimmonismus
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance

Animate Anyone 2 can replace people in videos with anyone from a single reference image

Developed from Alibaba Group, more on Github

https://video.twimg.com/ext_tw_video/1890378753877884928/pu/vid/avc1/730x720/gAacbtR8zAs7y3VS.mp4

2/9
@kimmonismus
Animate Anyone 2

3/9
@tahreem57
Impressive

4/9
@NFTMentis
These hot-swappable tools are getting really good…

5/9
@Tenkaizen8
That's quite the innovation

6/9
@RahulRa75965227
Two legends in one frame

7/9
@iwhizkid__
China is on fire with video models. Veo 2 is amazing but chinese models are cheap and getting cheaper. Its gonna be a tough market for any one company to dominate.

8/9
@jackadoresai
Character animation just got a boost! Heading to Alibaba's Github now!

9/9
@rethynkai
Now with such advancements ethical consideration of AI will be focused on more than ever

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

1/11
@ai_for_success
China is doing some crazy research in AI video generation. Tongyi Lab (Alibaba Group) just dropped another banger paper on AI video.

Animate Anyone 2 : Works insanely well with:
> Dynamic motion
> Human interaction

10 wild examples + research paper below

https://video.twimg.com/ext_tw_video/1890427141671780352/pu/vid/avc1/1096x1080/-vTQGbtJT9jO1lTE.mp4

2/11
@ai_for_success
2. Works well with animal reference picture as well.

https://video.twimg.com/ext_tw_video/1890427401974476800/pu/vid/avc1/1080x1280/HnNL9aBx_ApI_gZC.mp4

3/11
@ai_for_success

[Quoted tweet]
link to the project page

1/12
@minchoi
China's Alibaba just announced Animate Anyone 2.

This AI can take any single image of a character and animates it with realistic motion seamless to its environment.

10 wild examples:

1. Human Interaction

https://video.twimg.com/ext_tw_video/1890419322893320193/pu/vid/avc1/730x720/EsK3zcwCP5w3vNJu.mp4

2/12
@minchoi
2. Dynamic Motion

https://video.twimg.com/ext_tw_video/1890419512173887489/pu/vid/avc1/720x852/1rsNoFhGjka0d_r_.mp4

3/12
@minchoi
3. Dynamic Motion

https://video.twimg.com/ext_tw_video/1890419559850491904/pu/vid/avc1/720x852/EdFJnhCkUlHt-T4f.mp4

4/12
@minchoi
4. Dynamic Motion

https://video.twimg.com/ext_tw_video/1890419776616386560/pu/vid/avc1/730x720/YcCInQwb4MuLbWEj.mp4

5/12
@minchoi
5. Environment Interaction

https://video.twimg.com/ext_tw_video/1890420075703783424/pu/vid/avc1/720x852/n-0zDupPMbnzZidr.mp4

6/12
@minchoi
6. Human Interaction

https://video.twimg.com/ext_tw_video/1890420229596971008/pu/vid/avc1/730x720/j7IADDNS71BRDcQJ.mp4

7/12
@minchoi
7. Environment Interaction

https://video.twimg.com/ext_tw_video/1890420376435396608/pu/vid/avc1/720x852/9yb_npfVhhiCWMXZ.mp4

8/12
@minchoi
8. Environment Interaction

https://video.twimg.com/ext_tw_video/1890420629645447168/pu/vid/avc1/720x852/BwpDNwOW6XA1sIf9.mp4

9/12
@minchoi
9. Dynamic Motion

https://video.twimg.com/ext_tw_video/1890421016452587520/pu/vid/avc1/720x852/_Eq7Cjoj2G97JLcw.mp4

10/12
@minchoi
10. Human Interaction

https://video.twimg.com/ext_tw_video/1890421100829364226/pu/vid/avc1/730x720/xZG4SKtGwt9f6GhX.mp4

11/12
@minchoi
If you enjoyed this thread,

Follow me @minchoi and please Bookmark, Like, Comment & Repost the first Post below to share with your friends:

[Quoted tweet]
China's Alibaba just announced Animate Anyone 2.

This AI can take any single image of a character and animates it with realistic motion seamless to its environment.

10 wild examples:

1. Human Interaction

https://video.twimg.com/ext_tw_video/1890419322893320193/pu/vid/avc1/730x720/EsK3zcwCP5w3vNJu.mp4

12/12
@minchoi
Check out their GitHub Project page here

Animate Anyone 2

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

Large Language Models News & Discussions

More options

bnew

Veteran

ByteDance AI Introduces Doubao-1.5-Pro Language Model with a 'Deep Thinking' Mode and Matches GPT 4o and Claude 3.5 Sonnet Benchmarks at 50x Cheaper