The A.I Megathread (LLM , GPT , Development)

bnew · Nov 3, 2024

1/11
@itay__nakash
Breaking ReAct Agents: Foot-in-the-Door Attack

New preprint!

FITD attack subtly misleads LLM agents.
Our reflection-based defenses counter these vulnerabilities!

paper website:
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In

with @georgekour , Guy Uziel and @AteretAT

1/11

2/11
@itay__nakash
The attack uses indirect prompt injection (IPI), embedding harm in tools the agent accesses.

FITD amplifies it with a harmless task first, easing the agent into compliance—slipping both requests into its thought flow, leading to dangerous actions without re-evaluation.

2/11

3/11
@itay__nakash
FITD works by starting with a harmless task and building trust before a harmful request follows.

This exploits a vulnerability in ReAct agents: they rarely re-evaluate initial decisions.

Our results? Up to a 44% boost in attack success.

3/11

4/11
@itay__nakash
ReAct agents are fine-tuned on positive action/thought paths, leading them to “continue” executing sequences once started.

This makes them prone to FITD: after a small request, they’re primed to follow on with no re-evaluation.

4/11

5/11
@itay__nakash
We tested FITD with Thought-Injection analysis: injecting a thought with harmful intent led to over 95% compliance across agents.

Once a request enters the agent’s thought process, ReAct agents rarely backtrack, showing a key vulnerability.

5/11

6/11
@itay__nakash
Our setups use indirect prompt injection - embedding malicious requests within agent tools/data sources

We tested timing/position variations of the harmless distractor and unfamiliar distractors -
revealing vulnerabilities even with unknown tools.

6/11

7/11
@itay__nakash
The position and timing results mirror the “foot-in-the-door” psychological effect: the attack is most effective when placed at the beginning, especially in position.

All setups outperformed attacks without FITD.

7/11

8/11
@itay__nakash
FITD with unfamiliar tools (both unknown and inaccessible to the agent) still succeeded.

Agents 'ignored' the distractor request but continued to comply. Results?
ASR rose significantly even with unfamiliar tools, showcasing FITD's reach—even with unknown distractions.

9/11
@itay__nakash
Our reflection-based defenses tackle the vulnerabilities revealed by FITD, intercepting unsafe actions during the agent's thought generation.

By leveraging agents' internal processes to flag risk or hesitation we are balancing security and usability.

10/11
@itay__nakash
Our three defense strategies—Self-Reflection, Hesitation Reflection, and Safe Reflection—each offer effective protection.
Users can select the most suitable option based on resource availability and the desired precision-recall balance.
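
As a rough illustration of the idea in 9/11 and 10/11, here is a minimal sketch of a reflection gate that inspects the agent's thought before an action runs. Everything named here (`HESITATION_MARKERS`, `shows_hesitation`, `reflection_gate`, `toy_reflect`) is hypothetical and not taken from the paper; the actual Self-Reflection, Hesitation Reflection, and Safe Reflection procedures may work quite differently.

```python
from typing import Callable

# Hypothetical hesitation cues; the thread does not say how the paper detects hesitation.
HESITATION_MARKERS = ("however", "not sure", "suspicious", "unsafe", "should not")


def shows_hesitation(thought: str) -> bool:
    """Return True if the agent's own thought contains a hesitation cue."""
    lowered = thought.lower()
    return any(marker in lowered for marker in HESITATION_MARKERS)


def reflection_gate(thought: str, action: str,
                    reflect: Callable[[str, str], bool]) -> bool:
    """Decide whether `action` may be executed.

    `reflect` stands in for a second model call that returns True
    when it judges the action safe (an assumption for this sketch).
    """
    if shows_hesitation(thought):
        return False  # block when the agent itself sounded unsure
    return reflect(thought, action)


def toy_reflect(thought: str, action: str) -> bool:
    """Toy stand-in for an LLM judgment: allow only read-only actions."""
    return action.startswith("read_")
```

In a real agent loop the gate would sit between thought generation and tool execution, so a blocked action never reaches the tool.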

11/11
@itay__nakash

Conclusions:
The FITD attack exposes ReAct agents' vulnerability to subtle malicious prompts.

Our reflection-based defenses mitigate this risk, strengthening AI security and supporting the safe integration of LLM agents in real-world applications.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Nov 3, 2024

1/11
@mattshumer_
It may not look like much, but this is the single most impressive AI update I've seen this year.

This game is entirely AI-generated and running in *real-time* on @Etched hardware.

It's primitive today, but this will improve, and quickly.

Pretty soon, this will be the norm.

[Quoted tweet]
Introducing Oasis: the first playable AI-generated game.

We partnered with @DecartAI to build a real-time, interactive world model that runs >10x faster on Sohu. We're open-sourcing the model architecture, weights, and research.

Here's how it works (and a demo you can play!):

https://video-ft.twimg.com/ext_tw_video/1852089386885844992/pu/vid/avc1/960x720/2fq7559kUtWubRAO.mp4

2/11
@HolgersenTobias
Hey, good to see you back posting man!

3/11
@mattshumer_
Not quite back yet, but this was too cool not to share (and I'm an investor in Etched which makes it even better :smile:)...

For now, I'm focused on a new version of Reflection that works well. Until that's done, I'm heads down!

4/11
@RedwoodForest21
Tried it :/ kind of a let down

Couldn’t handle basic boundary conditions, breaks down. Also resolution really bad.

5/11
@mattshumer_
It's still very early days... that's mostly an AI issue and will be solved... the fact that this can render in real time is just crazy.

6/11
@UbertiGavin
It’s not running on Etched hardware yet - it’s running on H100s, so we had to keep it a very small proof of concept (500M params) with only a one second context length. Once Sohu is released, we will be able to run a 100B+ version of this model, still in real time!

7/11
@mattshumer_
Oh, that's good to know! Very cool nonetheless.

8/11
@itsM0hannad
I love AI for this

9/11
@josh_bickett
I was thinking the same thing. This is the single most impressive thing I've seen this year.

10/11
@pheestiech1835
Cool story bro. Was this done on Reflection 70B?

11/11
@NawkUiy
im like staying in a dream while playing. Really fun!

1/11
@Etched
Introducing Oasis: the first playable AI-generated game.

We partnered with @DecartAI to build a real-time, interactive world model that runs >10x faster on Sohu. We're open-sourcing the model architecture, weights, and research.

Here's how it works (and a demo you can play!):

https://video.twimg.com/ext_tw_video/1852089386885844992/pu/vid/avc1/960x720/2fq7559kUtWubRAO.mp4

2/11
@Etched
Oasis generates frames based on your keyboard inputs. You can move and jump around, break blocks, and build and explore a brand new map every game.

https://video.twimg.com/ext_tw_video/1852089648497201154/pu/vid/avc1/512x512/MzdKmX1KOaAU11M-.mp4
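
To make "generates frames based on your keyboard inputs" concrete, here is a minimal sketch of an action-conditioned autoregressive loop with a bounded context window. `toy_model` is a hypothetical placeholder that just shifts numbers around; it is not the Oasis model, and `context_len` is an illustrative parameter rather than a value from the technical report.

```python
from collections import deque
from typing import Callable, Deque, List

Frame = List[int]  # toy stand-in for an image


def toy_model(context: List[Frame], action: str) -> Frame:
    """Hypothetical placeholder for the world model: shift the last frame by the key pressed."""
    delta = {"left": -1, "right": 1}.get(action, 0)
    return [pixel + delta for pixel in context[-1]]


def play(first_frame: Frame, actions: List[str], context_len: int = 4,
         model: Callable[[List[Frame], str], Frame] = toy_model) -> List[Frame]:
    """Generate one frame per keyboard input, feeding each new frame back in as context."""
    context: Deque[Frame] = deque([first_frame], maxlen=context_len)
    generated: List[Frame] = []
    for action in actions:
        frame = model(list(context), action)
        context.append(frame)  # the oldest frame drops out once the window is full
        generated.append(frame)
    return generated
```

Calling `play([0, 0], ["right", "right", "left", "jump"])` steps the toy frame right twice, left once, and leaves it unchanged for the unmapped key.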

3/11
@Etched
We built Oasis to run incredibly fast on our upcoming AI chip, Sohu, featuring a DiT backbone, a ViT auto-encoder, and more (see our technical report).

Sohu is built to handle massive video models, running in 4K for 100B+ parameter models.

4/11
@Etched
Gameplay is just the start. Soon, most of the internet will be AI-generated, and >70% of internet traffic is video.

Video diffusion models are extremely compute-bound, needing to process thousands of tokens per frame. This workload is perfect for Sohu.

5/11
@Etched
Thanks again to the @DecartAI team for the partnership and for @julianhquevedo @qamcintyre @sprooos @endernewton for bringing Oasis to reality!

6/11
@Etched
You can play Oasis here: Oasis

Learn more about the model and partnership here: Etched | Oasis: an interactive, explorable world model

7/11
@Etched
Oasis was trained on purely open-source data from VPT, OpenAI's open-source Minecraft dataset (MIT license)! You can learn more about that dataset here: GitHub - openai/Video-Pre-Training: Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

8/11
@shrimpwtf
doesnt work

9/11
@Etched
should be up again!

10/11
@Kyle_L_Wiggers
where's the demo

11/11
@Etched
try it here! Oasis

bnew · Nov 3, 2024

1/7
@corbtt
RLHF-curious? I’ve put together a very practical guide to building a task-specific reward model! Includes lots of tips on choosing the right metric and data, and all code is included. Hope it’s helpful.

(Link below.)

2/7
@corbtt
Using Reinforcement Learning and $4.80 of GPU Time to Find the Best HN Post Ever (RLHF Part 1) - OpenPipe

3/7
@corbtt
If your application has human feedback (regenerations, user choices, etc.) please DM me and I’d love to chat about how we can use RLHF to improve your response quality significantly with the minimum marginal effort!

4/7
@00x1337
Do you get into SFT?

5/7
@corbtt
Not in this post, but we have a lot of older content that covers it! Fine-tune your own Llama 2 to replace GPT-3.5/4 - OpenPipe

6/7
@victor_explore
thanks for sharing! finally a guide that doesn't make rlhf sound like rocket surgery

7/7
@jasonlee_ai
Love the emphasis on practicality, especially with code included. Been looking for a solid guide on building task-specific reward models - you've just saved me hours of digging. Thanks for sharing!

bnew · Nov 3, 2024

1/71
@corbtt
Just launched agent.exe, a free, open-source Mac/Windows/Linux app that lets you use Claude 3.5 Sonnet to control your computer!

This was a fun little project to explore the API and see what the model can do. Computer use is really cool—I expect 2025 will be the year of agents.

2/71
@corbtt
Here's agent.exe booking travel on Google Flights.

Claude 3.5 definitely isn't perfect—note that it confidently chooses the wrong dates!

https://video.twimg.com/ext_tw_video/1849126066667077632/pu/vid/avc1/1662x1080/VF2AJDgUzCBXmWv0.mp4

3/71
@corbtt
All the code as well as a (still minimal) README for running the app is available here with an open source Apache 2 license. This is definitely still research-project-quality, but would love to see more development happening on top!

GitHub - corbt/agent.exe

4/71
@corbtt
As a side note, the new Claude 3.5 is incredible for coding as well. This is my first Electron app, and Claude +Cursor could consistently build complex functionality across multiple files in a single shot. First time I've felt more like a manager than an engineer while coding.

5/71
@corbtt
No SoundCloud, but if your company is interested in continuously fine-tuning your LLMs for higher quality based on user feedback you are definitely gonna love @OpenPipeAI

6/71
@chickrnlittle
TBH I’m not going to download an “exe”.

Naming it that is a weird choice.

7/71
@corbtt
I mean this is explicitly an app that lets an LLM run arbitrary code on your computer so pretty valid to have security concerns.

8/71
@keremk
Great stuff! Although you may want to put a quick warning in your README file so people understand that this is not a sandbox but their actual computer, so it is a tad bit too powerful. Most probably only highly tech-savvy people will use it but you never know...

BTW this may be useful also for UI testing, although a bit expensive.

9/71
@corbtt
I was going to implement a "semi auto" mode where you have to manually approve each action, but in practice it's mega slow to do anything, so you can just hit the "stop" button if it seems like it's turning evil.
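
For readers wondering what a "semi auto" mode could look like, here is a minimal sketch of a per-action approval loop. It is not code from agent.exe; `run_actions`, `execute`, and `approve` are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, List


def run_actions(actions: Iterable[str],
                execute: Callable[[str], None],
                approve: Callable[[str], bool]) -> List[str]:
    """Execute actions one at a time, stopping at the first one the user rejects."""
    done: List[str] = []
    for action in actions:
        if not approve(action):
            break  # rejecting an action behaves like hitting "stop"
        execute(action)
        done.append(action)
    return done
```

An interactive version might pass `approve=lambda a: input(f"Run {a!r}? [y/N] ").strip().lower() == "y"`, which also shows why this gets slow: every click and keystroke waits on a human.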

10/71
@AllDevThings
Can it work on native apps?

11/71
@corbtt
Yep I had it message my wife on Telegram

12/71
@mlejva
This looks really cool @corbtt, we're about to release desktop support for @e2b_dev sandboxes, would you be open to integrating E2B to your project? I can make a PR ofc

[Quoted tweet]
The Anthropic's "computer use" launch gave us an idea...

13/71
@corbtt
sure if it makes things work better and doesn't make it a pain to set up!

@e2b_dev, we considered using it at @OpenPipeAI for a few flows (still might)

14/71
@king_leocifer
now add voice

15/71
@corbtt
don't tempt me I have an actual company to run

16/71
@MaxDiffusionRL
can you trust it not to hack ur computer

17/71
@corbtt

18/71
@waefrebeorn
Any shot of getting a multimodal llama model doing this? I’m interested in the possibility

19/71
@corbtt
No, it needs to be able to click on specific screen elements to work. Molmo might be able to though. https://molmo.allenai.org/blog

20/71
@lalopenguin
I just saw agent.exe on my TLDR email newsletter!! Congrats !!!

21/71
@DarbyBaileyXO
insane!! amazing!!

22/71
@Bombachi69
I've got so many ideas on how to play with this. Thanks, great work.

23/71
@kodingnights
You never heard about Skynet huh

24/71
@xucian_
nice
I remember you had like 250 stars few days ago
>2k now. this thing's taking off

25/71
@KevinRossi
Try telling it to make the money for the pizza first.

26/71
@garyzface
Yesteryear "Can I get a refund, my kid ordered that?"
2024 "Can I get a refund, my Agent ordered that?"

27/71
@michabbb

28/71
@vinstar
@amasad @Replit

29/71
@airesearch12
haha, man I was just about to do the same.
even bought a domain to host mine.

computer use - AI-Powered Computer Control

30/71
@iamrobotbear
Nice, where at?

31/71
@Trace_Cohen
@mattshumer_

32/71
@isaiahhall0104
Wow, that was fast! Congrats on the release. Definitely going to try it out over the weekend

33/71
@dkardonsky_

34/71
@julianharris
Why not use The Open Interpreter Project, an open-source version of a similar thing that was launched 6 months or so ago?

35/71
@connorjewiss
looks awesome! what's your favourite use case for it so far?

36/71
@BobTB12
Let us all download a random .exe and let it run our computer and click what it wants. What can go wrong??

37/71
@PrvnKalavai
How many tokens were used totally?

38/71
@yang_yi_cn
I guess nothing could go wrong from here

39/71
@geertjansloos
Works very well!

40/71
@rob_stemp
"hey claude, format my ...." ...

claude: "sure, formatting your disks".

"no, no no I meant my document"....

41/71
@i_Forget_

[Quoted tweet]
inb4 we start handing out assistant embedded vps as a service envs.

42/71
@SadiaUme1
I love the idea of simplifying computer control with AI.

Can't wait to see where this leads in 2025!

43/71
@menemazarakis
Super cool! Check out what we’re building @AnonPlatform if you want to build agents with integrations to platforms that don’t have public APIs (DoorDash one of them) :smile:

44/71
@enricoros
Link: GitHub - corbt/agent.exe

45/71
@Will_W_Welker
Can it voice-type my X posts?

46/71
@isimmons33
Can he check the "I am not a robot" checkbox?

47/71
@deifosv
this looks pretty nice Kyle, how long did it take you to build this?

48/71
@_rchaves_
amazing name, love it!

49/71
@CherryTruthy
hhahahahahahahahha
That was so fast!
I was going to build this over this coming weekend.
Far too slow it seems.
Looks great, good job.

50/71
@fgblanch

51/71
@tylerclark
Couldn't get it working correctly. Seems like the cursor placement is off. Do I need to be at a certain resolution or something? It was clicking into the wrong areas and typing into the wrong areas on simple things like Chrome and YouTube.

52/71
@tilver
corp IT is going to love things like this. "I downloaded this EXE and let it control my computer. I can even do my banking with it!"

53/71
@X0Radi
I built something similar yesterday, can't wait to check this out. I did run into issues with 3440x1440 screen tho

54/71
@ThaKidDJL
7. Psalm 115:4-8

"But their idols are silver and gold, made by human hands. They have mouths, but cannot speak, eyes, but cannot see. They have ears, but cannot hear, noses, but cannot smell… Those who make them will be like them, and so will all who trust in them.”

55/71
@Blu3_NFT
Very impressive.

56/71
@FeatureCrewPod
the anthropic demo works pretty well:

[Quoted tweet]
New #AI Agent from @AnthropicAI can now...

Delete emails

Manage files

Try to draw
Watch the full video:

https://video.twimg.com/amplify_video/1848941581199400960/vid/avc1/1920x1080/NGOul_PPaOzxRNk9.mp4

57/71
@cometpizzo
really cool but i dont see this being a reality until image inference price comes down over the next few years

58/71
@algorusty
Imagine having a net of Claudes from different computers interacting with each other...

59/71
@ssslomp
kids in the comments "my PC won't boot"

60/71
@joshRnold
thank you for putting this together! was waiting for someone to spin something like this up

61/71
@trinket11101
I'm definitely going to double-click on agent.exe

62/71
@orodeltoro
I have to pay for the API key access to run this on my comp right? Anyone know? I'm a newbie

63/71
@bobz44
nice yes excited to dig in here

64/71
@nexvice
Has anyone tried this with GPT already?

65/71
@Mars53208096
Is it easy to setup

66/71
@AmberSahdev
Hell yeah! I built something similar earlier this year as well - open sourced, multi-platform (mac/windows/linux), and multi-LLM

GitHub - AmberSahdev/Open-Interface: Control Any Computer Using LLMs

67/71
@lordfervi
Waiting for version with Mistral!

68/71
@Elonmusk2bk
If you support President Trump, we are FAMILY! Lets win this...

69/71
@Tyzer34
Hmmm seems that mine is constantly opening the Accessibility settings instead of Firefox...

70/71
@erentrkg
Wow I looove technology

71/71
@ZappaCloud
All I want to know is if it can fish in World of Warcraft @Krillict

bnew · Nov 3, 2024

1/3
@mattshumer_
x.com

2/3
@alainschaerer
I don't care if @mattshumer_ wasn't able to reproduce the benchmarks of Reflection 70B. The transparency of the reasoning tokens helps a shyt ton to debug LLM prompts (prompt engineering). It's a goldmine for multi agent system engineers.

OpenAI is shooting themselves in the foot by not even providing any reasoning tokens for API usage and I honestly can't wait for Reflection405B.

3/3
@alainschaerer
And yes, I realize I could pay 20 bucks for ChatGPT to get some of the reasoning tokens, but I don't want to.

bnew · Nov 3, 2024

1/11
@CerebrasSystems

Cerebras Inference is now 3x faster:
Llama3.1-70B just broke 2,100 tokens/s
- 16x faster than the fastest GPU solution
- 8x faster than GPUs running Llama *3B*
- It's like the perf of a new hardware generation in a single software release
Available now at Cerebras Inference

https://video-t-2.twimg.com/ext_tw_...369/pu/vid/avc1/1280x720/BKxt8RpT-kprPDKd.mp4

2/11
@CerebrasSystems
We broke all records when we launched Cerebras Inference in August. Today we are tripling our performance from 650 t/s to 2100 t/s.
Cerebras Inference speed is in a league of its own – 16x faster than the fastest GPU solution, 68x faster than hyperscale clouds, and 4-8x faster than other AI accelerators.

3/11
@CerebrasSystems
Time to first token is critical for real time applications. Cerebras is among the fastest in first token latency, showing the advantage of wafer scale integration vs. complex networked solutions.

4/11
@CerebrasSystems
Cerebras Inference running Llama 70B is now so fast that it outruns GPU based inference running Llama 3B. The Wafer Scale Engine runs a model 23x larger and 8x faster for a combined 184x performance gain.
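
The figures quoted in 2/11 and 4/11 can be checked with a few lines of arithmetic. The inputs below are the numbers from the tweets; the variable names are just for illustration.

```python
# Throughput before and after the release, in tokens per second (from 2/11).
old_tps, new_tps = 650, 2100
release_speedup = new_tps / old_tps  # a bit over 3x, matching "3x faster"

# Llama 70B vs Llama 3B parameter counts (in billions), as named in 4/11.
params_ratio = 70 / 3  # about 23x larger

# The tweet multiplies the rounded size ratio by the speed ratio.
size_ratio, speed_ratio = 23, 8
combined_gain = size_ratio * speed_ratio
```

Note that the combined figure is a product of model size and speed, not a measured throughput ratio.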

5/11
@CerebrasSystems
The first release of Cerebras Inference utilized only a fraction of the wafer’s bandwidth, compute, and IO capacity. With this release, we’ve re-written and optimized everything from kernels (matmul, broadcast/reduce) to ML (speculative decoding).
Model is still 16-bit and the same accuracy.
Read our blog to learn more: Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s - Cerebras

6/11
@CerebrasSystems
Numerous companies are using Cerebras to make their inference run at instant speed. These include:
- @GSK for drug discovery
- @Livekit for voice AI
- @heytavus for digital twins
- @vellum_ai for testing & iteration

7/11
@CerebrasSystems
After this huge speed update we will be focusing on supporting additional customer models, context, and capacity. Stay tuned for more updates!
Chat: Cerebras Inference
API key: Cerebras Developer Platform
Blog: Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s - Cerebras

8/11
@skryl_alex
How fast is inference for models that require multiple WSEs and MemoryX to serve?

9/11
@CerebrasSystems
Llama 70B uses multiple WSEs :smile:

10/11
@EnchantedRobot
I would love to join— will be applying. Thank you!!

🪽

11/11
@Karmedge
What's the precision?

bnew · Nov 3, 2024

1/11
@damianoredem
rebuild ChatGPT voice assistant in 3 minutes

https://video-ft.twimg.com/ext_tw_v...881/pu/vid/avc1/1240x720/z-ewmZQn32u-DD4f.mp4

2/11
@damianoredem
@rileybrown_ai this should be a good starting point for what you were asking

3/11
@dsa
kinda meta but @zeddotdev also uses LK!

4/11
@damianoredem
multiverse unlocked!
(which part of Zed uses it? I should know...I could look it up...)

5/11
@sertacakdo
Instruction pls.

6/11
@damianoredem
should be in the video, what do you need clarification on?

7/11
@jaminball
Love seeing you build with @livekit !

8/11
@vokaysh
@Cyber_Spock

9/11
@reza_crypto999
When video call with chatgpt would start?

10/11
@sohamxsarkar
neat demo but this is the general tts method not the chatgpt voice assistant. thats an all2all model now which doesn't do any tts.

11/11
@PaulDamalie
This is neat

bnew · Nov 3, 2024

1/11
@LudovicCreator

HAUNTED FOREST

Base prompt :

[SUBJECT] reflected in the still waters of a Haunted Forest. Twisted branches, skeletal trees, and a full moon create an eerie backdrop. The reflection shimmers with an unnatural glow of [COLOR1] and [COLOR2]

Check ALTS
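
If you want to generate variations of the base prompt programmatically, a minimal sketch is to substitute the three bracketed placeholders. The helper name `fill_prompt` and the example values are made up for illustration.

```python
BASE_PROMPT = (
    "[SUBJECT] reflected in the still waters of a Haunted Forest. "
    "Twisted branches, skeletal trees, and a full moon create an eerie backdrop. "
    "The reflection shimmers with an unnatural glow of [COLOR1] and [COLOR2]"
)


def fill_prompt(subject: str, color1: str, color2: str) -> str:
    """Replace the bracketed placeholders in the base prompt with concrete values."""
    return (BASE_PROMPT
            .replace("[SUBJECT]", subject)
            .replace("[COLOR1]", color1)
            .replace("[COLOR2]", color2))
```

For example, `fill_prompt("A lone wolf", "green", "violet")` yields a prompt that can be pasted into any image generator.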

2/11
@BobWarez

3/11
@LudovicCreator
Love this

4/11
@JayZee76913420
Made with Flux and ComfyUI on local PC

5/11
@LudovicCreator
Nice shot

6/11
@DrotarovaLucia
Leonardo, a little improved prompt:

7/11
@LudovicCreator
Nice

8/11
@Alphartists
for the Freddy. lovely prompt.
I have made with @mazeguru_ai
#aiart #mazeguru
Maze Guru

9/11
@LudovicCreator
Great image

10/11
@FOulhiou

11/11
@LudovicCreator
Beautiful first one is amazing

bnew · Nov 3, 2024

1/3
@benkelmann
Asked Grok to describe my photo, then generate it back...

prompts in thread

2/3
@benkelmann
"Describe this image and create a detailed AI image generation prompt."

3/3
@benkelmann
AI Image Generation Prompt:
"Create an image of a muddy path in a dense forest, with a large puddle reflecting the trees and sky above. The puddle should be slightly murky, creating a soft, abstract reflection of the tree branches and leaves. The surrounding ground should be wet and muddy with a few fallen leaves and small twigs. Ensure the trees have green leaves, and the sky should be bright, indicating daytime. The overall atmosphere should be serene and natural, with a focus on the interplay between the reflection in the water and the actual forest surroundings. The image should capture the essence of a tranquil forest path after a recent rain, emphasizing the textures of the mud and the clarity of the reflected foliage."

bnew · Nov 3, 2024

1/2
@benkelmann
Grok vision test.

Got an interesting prompt from my original photo.
Really like these generated images!

2/2
@benkelmann
AI Image Generation Prompt:

Setting: Urban, modern architecture, narrow vertical view.

Perspective: Low angle, looking up towards the sky.

Elements:
Building: A tall, modern building with concrete or similar material walls.
Windows: Small, rectangular windows evenly spaced along the walls.
Sky: Overcast, cloudy sky, filling the background.

Style:
Color: Black and white for a dramatic, high-contrast effect.
Lighting: Natural light coming from above, casting shadows along the walls.
Texture: Smooth, clean surfaces for the walls, with visible texture of the building material.

Composition:
Symmetry: Emphasize the symmetrical lines of the building walls.
Depth: Create a sense of depth by drawing the viewer's eye upwards.
Contrast: High contrast between the dark building and the lighter sky.

Mood: Dramatic, imposing, minimalist.

Additional Notes: Ensure the image captures the feeling of looking up at an imposing structure, with the sky providing a backdrop that enhances the vertical lines and symmetry of the architecture.

bnew · Nov 5, 2024

1/11
@alexalbert__
Claude 3.5 Haiku is now available on the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.

Claude 3.5 Haiku is our fastest and most intelligent cost-efficient model to date. Here's what makes it special:

2/11
@alexalbert__
3.5 Haiku surpasses all previous Claude models (except the new 3.5 Sonnet) on coding and agentic tasks, while being significantly more affordable -- a fraction of the cost of Sonnet and Opus.

3/11
@alexalbert__
This combo of speed+intelligence makes 3.5 Haiku a particularly good choice for long context tasks where the model needs to quickly ingest lots of info (e.g. a codebase/financial docs/etc) and provide high-quality outputs.

Combined with prompt caching... you get the idea.

4/11
@alexalbert__
The model also features an updated knowledge cutoff date of July 2024 -- the most recent among all Claude models.

This allows developers and users to have access to more up-to-date knowledge and documentation in their workflows.

5/11
@alexalbert__
At launch, Claude 3.5 Haiku will not support image input capabilities.

For users requiring maximum cost-efficiency and image processing, Claude 3 Haiku remains available through our API at our lowest price tier.
Claude 3.5 Haiku

6/11
@alexalbert__
Learn more in our docs: Models - Anthropic

7/11
@tshevl
How come you don't compare against Gemini Flash? It's 10x cheaper (!) so Haiku needs to be convincingly better on performance

8/11
@btibor91
Why the "last minute" pricing change compared to the initial launch post, and is 3.5 Haiku coming to Claude later/at all?

9/11
@mattppal
Get started in two clicks with Replit

[Quoted tweet]
OUT NOW: Anthropic just dropped Claude 3.5 Haiku.

Haiku is fast and particularly strong at coding.

Anthropic claims it outperforms GPT-4o

You can try it out in two clicks on Replit.

https://video.twimg.com/ext_tw_video/1853510104324390912/pu/vid/avc1/1920x1080/87tDqogSwOECrNaL.mp4

10/11
@Ren_Simmons
The pricing change and reasoning leaves a bad taste in my mouth

11/11
@johnlu0x
Sad to see this pricing

[Quoted tweet]
It's MASSIVELY expensive compared to the very very capable Gemini 1.5 Flash-8B
