bnew

Veteran
Joined
Nov 1, 2015
Messages
51,795
Reputation
7,926
Daps
148,646


1/11
Microsoft AI CEO Mustafa Suleyman says it won't be until GPT-6, in 2 years' time, that AI models will be able to follow instructions and take consistent action

2/11
Source:

3/11
God created Mustafa so that Sam would not kill us all

4/11
Something very strange is going on. The CTO of OpenAI says GPT-5 will be released in 1.5 years. Microsoft AI CEO says GPT-6 will be released in 2 years.

5/11
The real question is, what does it mean for an AI model to be able to "take consistent action"? Does that mean we're about to see AI making autonomous decisions based on our instructions? That's a whole different level of AI capability!

6/11
Tesla needs to be going ham on energy generation. These power needs are getting nutty.

7/11
Ain't this the same dude that was calling for slowdowns and predicting the end of the world? I guess he tried to use AI to make cupcakes and realized we're pretty far away.

8/11
i don’t know that turtle neck (?) man…

9/11
You have 2 years before you are out of a job... The time is now to create your future or you will 100% be left behind. Those are the cold hard facts

10/11
Timeline shifted from AGI in 1 year to at least two years until even beginning to follow basic instructions and take consistent actions. Oh boy!

11/11
Didn't Mira just say that GPT-5 would be out in 1.5/2 years? If both are correct, that would mean that GPT-5 and GPT-6 will be released shortly one after the other.


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

 

1/11
I am thrilled to introduce OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code. Led by @maxencefaldor and @jennyzhangzt, with @CULLYAntoine and myself. 🧵👇

2/11
Open-ended and AI-generating algorithms aim to continuously generate and solve increasingly complex tasks forever, offering a promising path toward more general intelligence. To accomplish this grand vision, learning must occur within a VAST space of potential tasks.

3/11
Existing approaches to automatically generating environments are constrained within manually predefined, often narrow distributions of environments, limiting their ability to achieve “Darwin Completeness” (the potential to create *any* learning environment).

4/11
OMNI-EPIC uses foundation models to autonomously generate code specifying the next learnable and interesting tasks. The generation of both environments and reward functions enables, in principle, the creation of any learning task (i.e. achieving "Darwin Completeness").
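To make that mechanism concrete, here is a minimal, hypothetical sketch of the kind of task-generation loop described above. The `llm()` helper, the prompt wording, and the archive structure are illustration-only assumptions; the actual OMNI-EPIC prompts, environment API, and model of interestingness are in the paper and repo.

```python
# Hypothetical sketch of an OMNI-EPIC-style loop: a foundation model writes code for the
# next task (environment + reward function), conditioned on an archive of previous tasks.
# `llm()` stands in for any chat-completion call; nothing here is the authors' actual code.
import json

def llm(prompt: str) -> str:
    """Placeholder for a foundation-model call (e.g. a chat-completion API)."""
    raise NotImplementedError

task_archive = []  # descriptions + code of tasks generated so far (seeded with a few examples)

def generate_next_task(agent_skill_summary: str) -> dict:
    prompt = (
        "You design training tasks for an embodied RL agent.\n"
        f"Tasks generated so far:\n{json.dumps([t['description'] for t in task_archive], indent=2)}\n"
        f"Current agent capabilities:\n{agent_skill_summary}\n"
        "Propose ONE new task that is learnable (not too hard) and interesting (not a repeat).\n"
        "Return JSON with keys: description, env_code, reward_code."
    )
    task = json.loads(llm(prompt))
    # A second "model of interestingness" call filters boring or redundant proposals.
    verdict = llm(f"Is this task novel and interesting given the archive? Answer yes/no.\n{task['description']}")
    if verdict.strip().lower().startswith("yes"):
        task_archive.append(task)
    return task
```

In the real system, the generated environment and reward code are executed to train the agent, and the outcome (solved or not) feeds back into the next round of generation.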

5/11
Every run of OMNI-EPIC triggers an explosion of creativity in designing fascinating, diverse, interesting new challenges tailored to the current capabilities of the agent, akin to the processes observed in biological evolution and human culture (e.g. art, science and technology).

6/11
Here is an example run. All tasks (save 3 seeds) are generated by OMNI-EPIC. Imagine running this for billions of years!

7/11
It is also a great form of human entertainment! OMNI-EPIC ushers in a new era of gaming, where endless novel and interesting content *of any type* is automatically generated & tailored to players' skills. Soon we will share a website where players can engage with generated tasks.

8/11
In conclusion, OMNI-EPIC represents a leap towards truly open-ended learning by generating an endless stream of learnable, interesting, and wildly diverse tasks.

9/11
Very exciting work! Perhaps most surprising to me is that LLMs can program these environments by themselves. Were you also surprised to get well-functioning Reinforcement Learning worlds without human input?

10/11
Personally I was not surprised. For me, this was one of those ideas where once it was proposed, I was sure it was going to work. But I *was* surprised how easy it was to get it to be endlessly creative! I thought that would take more coaxing. It just wants to create open-endedly!

11/11
😍


 

1/6
this explains why some people are disassociating themselves from Cognitive Computations. posting a screenshot of my old tweet that trust is the primary currency we deal in within open source.

2/6
322 followers? Just joined? What's going on here

3/6
I believe he converted his old account to cognitive computations primary.

4/6
wow, they were scamming people? what happened? I read the letter but don't have more context on it

5/6
check the open letter

6/6
very curious about this. you're saying THE ERIC HARTFORD dropped a token to pay for the compute? 🥲


 

1/3
We’ve heard this question from many of you: “Which PDF extraction model is best for my documents?” Well, we’ve got you covered! 📊 We benchmarked a few popular PDF extractors to give you insights on the best models for scientific documents and books.

Models Compared:
• Marker by @VikParuchuri
• @UnstructuredIO
• EasyOCR by @JaidedAI
• OCRMyPDF

All of these PDF extraction models are available in Indexify!

Results: 🔬 Marker excels on scientific documents and books, closely followed by EasyOCR and OCRMyPDF. EasyOCR also comes with training scripts, so its accuracy can be improved by fine-tuning on a representative sample of the books/journals you want to extract from. Unstructured is known to handle a wide variety of document layouts, so subsequent benchmarks should provide additional insights into its strengths.

📅 What’s Next? In the coming weeks, we’ll benchmark more document types like invoices, tax forms, healthcare records, etc. Plus, we’ll include APIs like AWS's Textract in the mix.

2/3
🔗 You can run these benchmarks and reproduce the results easily. We encourage you to run them on your private documents. We welcome any feedback on our methodology and invite you to submit new PDFs for future benchmarks. 👉 Code: indexify-extractors/pdf/benchmark/README.md at main · tensorlakeai/indexify-extractors If you want to learn how to build a production-grade PDF extraction service for your enterprise with Indexify, please visit: PDF Extraction -
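For a rough sense of how a benchmark like this can score extractors, one common approach is to compare each extractor's output against a hand-checked ground truth with an edit-distance-style similarity. The snippet below is only an illustrative stand-in, not Indexify's actual benchmark code or metric.

```python
# Illustrative accuracy scoring for PDF extraction (not the Indexify benchmark itself):
# compare extracted text to a reference transcription using normalized edit similarity.
from difflib import SequenceMatcher

def extraction_accuracy(extracted: str, ground_truth: str) -> float:
    """Return a 0-1 similarity between extracted text and the reference text."""
    normalize = lambda s: " ".join(s.lower().split())  # collapse whitespace, ignore case
    return SequenceMatcher(None, normalize(extracted), normalize(ground_truth)).ratio()

outputs = {
    "marker": "Attention Is All You Need ...",    # hypothetical text produced by each extractor
    "easyocr": "Attentlon ls Al1 You Need ...",
}
reference = "Attention Is All You Need ..."
for name, text in outputs.items():
    print(f"{name}: {extraction_accuracy(text, reference):.3f}")
```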

3/3
Some charts from the benchmark:
• Marker ranks highest in terms of accuracy
• Unstructured and EasyOCR are very fast!


 

1/4
Block Transformer architecture demonstrates 10-20x gains in inference throughput compared to vanilla transformers with equivalent perplexity, with a new approach to optimizing LLM inference through a novel application of global-to-local modeling. 🤯

📌 Block Transformers can also be uptrained from pretrained vanilla models, closely approaching the performance of those pretrained from scratch, using just 10% of the training budget.

"Block Transformer: Global-to-Local Language Modeling for Fast Inference"

The key problem this paper aims to solve is the inference bottleneck in autoregressive transformers caused by the self-attention mechanism, which requires retrieving the key-value (KV) cache of all previous sequences from memory at every decoding step. The paper proposes the Block Transformer architecture to mitigate this bottleneck and significantly improve inference throughput.

📌 The Block Transformer adopts a hierarchical global-to-local modeling approach. It isolates the expensive bottlenecks of global modeling to lower layers and applies fast local modeling in upper layers. This is achieved through three components:

1. Embedder: aggregates each block of L_B input tokens into an input block embedding (L_B is the block length, i.e. the number of tokens aggregated into a single block).

2. Block decoder: an autoregressive transformer that applies self-attention between blocks to decode a context block embedding for predicting the next block.

3. Token decoder: autoregressively decodes the token contents of the next block, applying local self-attention between only the L_B tokens within the block.

📌 The block decoder reduces overall costs through its coarse granularity. It mitigates the quadratic cost of self-attention by using coarse-grained block inputs instead of individual tokens, reducing context length by a factor of L_B. This reduces FLOPs for position-wise computations by L_B and attention score computation by L_B^2. KV cache usage and KV cache IO are also reduced by L_B and L_B^2 respectively.

📌 The token decoder nearly eliminates the cost of attention, as there is no need to compute, store, and retrieve the KV cache of past tokens beyond the small local context of L_B tokens. It eliminates prefill (necessary only in the block decoder) and reduces KV cache IO from quadratic to linear with respect to context length. This allows for significantly higher compute unit utilization.

📌 To incorporate the context embedding and leverage the low-cost compute in the token decoder, the context block embedding is projected into prefix tokens. This enables further refinement of the global context and allows increasing the computational width of the token decoder by extending the prefix length.
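As a rough illustration of how those three components fit together, here is a minimal PyTorch-style sketch. It is my own simplification under stated assumptions (a simple concatenate-and-project embedder, vanilla encoder layers with causal masks, a short learned prefix), not the authors' implementation; see their GitHub repo for the real code.

```python
# Minimal sketch of the global-to-local idea (illustrative only, not the paper's implementation).
import torch
import torch.nn as nn

def causal_mask(n: int, device) -> torch.Tensor:
    # True above the diagonal = positions that may NOT be attended to
    return torch.ones(n, n, device=device).triu(1).bool()

class BlockTransformerSketch(nn.Module):
    def __init__(self, vocab=32000, d=512, block_len=4, heads=8, prefix_len=1,
                 global_layers=4, local_layers=4):
        super().__init__()
        self.L, self.P = block_len, prefix_len
        self.tok_emb = nn.Embedding(vocab, d)
        # 1. Embedder: concatenate the L_B token embeddings of a block, project to one block embedding
        self.embedder = nn.Linear(block_len * d, d)
        # 2. Block decoder: causal self-attention *between blocks* (coarse granularity)
        layer = lambda: nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.block_decoder = nn.TransformerEncoder(layer(), global_layers)
        # Context block embedding -> prefix tokens consumed by the token decoder
        self.to_prefix = nn.Linear(d, prefix_len * d)
        # 3. Token decoder: causal self-attention only *within* a block (plus the prefix)
        self.token_decoder = nn.TransformerEncoder(layer(), local_layers)
        self.lm_head = nn.Linear(d, vocab)

    def forward(self, tokens):                       # tokens: (B, T), T divisible by block_len
        B, T = tokens.shape
        L, P, d = self.L, self.P, self.tok_emb.embedding_dim
        x = self.tok_emb(tokens)                                          # (B, T, d)
        blocks = self.embedder(x.view(B, T // L, L * d))                  # (B, T/L, d)
        ctx = self.block_decoder(blocks, mask=causal_mask(T // L, x.device))
        # Block i is decoded locally from the context produced by blocks < i
        prefix = self.to_prefix(ctx).view(B, T // L, P, d)
        prefix = torch.cat([torch.zeros_like(prefix[:, :1]), prefix[:, :-1]], dim=1)
        local = torch.cat([prefix, x.view(B, T // L, L, d)], dim=2)       # (B, T/L, P+L, d)
        local = local.view(B * (T // L), P + L, d)
        out = self.token_decoder(local, mask=causal_mask(P + L, x.device))
        return self.lm_head(out[:, P:]).reshape(B, T, -1)                 # per-token logits
```

The point of the sketch is simply to show where the savings come from: the block decoder attends over T/L_B positions instead of T, and the token decoder only ever needs a KV cache of at most prefix + L_B tokens.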

2/4


3/4
Why is Block Transformer efficient?

4/4
Paper - [2406.02657] Block Transformer: Global-to-Local Language Modeling for Fast Inference Github - GitHub - itsnamgyu/block-transformer: Official code for "Block Transformer: Global-to-Local Language Modeling for Fast Inference"


 

1/1
ESM3 definitely looks like truly revolutionary progress - here you have a generative language model for programming biology. ESM3 can simulate 500M years of evolution to generate new fluorescent proteins. 🤯 This would be the holy grail of programming biological systems! 😲 👏

And what's more, EvolutionaryScale (the startup that introduced ESM3) has just raised a massive $142M seed to build generative models for biology. The round was led by Nat Friedman, Daniel Gross, and Lux Capital. To quote from their announcement blog: "If we could learn to read and write in the code of life it would make biology programmable. Trial and error would be replaced by logic, and painstaking experiments by simulation."





1/11
We have trained ESM3 and we're excited to introduce EvolutionaryScale. ESM3 is a generative language model for programming biology. In experiments, we found ESM3 can simulate 500M years of evolution to generate new fluorescent proteins. Read more: Evolutionary Scale · ESM3: Simulating 500 million years of evolution with a language model

2/11
We prompted ESM3 to generate fluorescent proteins with a chain of thought. In the first plate, shown below, we were intrigued to find B8. While very dim, 50x dimmer than natural GFPs, it was far from any known GFPs -- 43% of its sequence differs from the closest natural protein. Continuing the chain of thought from B8 on the second plate below, ESM3 found C10 which is similarly bright to natural fluorescent proteins.

3/11
Extraordinary! Can it generate interaction partners of a given protein? Could you design with this a new "general" interaction partner framework (like a new class of small programmable binding proteins that would replace (too expensive) antibodies and which would be easy to produce in bacteria?

4/11
Very cool and welcome to the @Lux_Capital fam. Let us know if we can help in any way on the @huggingface side (we're crazy excited about open/collaborative biology)!

5/11
Amazing! I will read the paper now. If what you claim is true, this would be the holy grail of programming biological systems! 😲 👏

6/11
Unbelievably amazing, thank you for this! Super side bar: will there be a model of this for physics simulations?

7/11
Congrats!

8/11
WOW, amazing progress!!! Keep it up!!!

9/11
Congrats :smile:

10/11
what else are excited about it doing?

11/11
A true revolutionary progress here. Congratulations 🎉 🎉


 

1/7
A 240T-token dataset is now available for your LLM training. 🤯 I don't even know how to go about downloading a 240T-token dataset lol. FineWeb's 15T comes out to 48 terabytes. Can you imagine what 240T looks like? 8× larger than the previous SOTA (RedPajama-Data-v2: 30T tokens, 125 TB)
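For a rough sense of what that means in storage, you can scale FineWeb's tokens-to-terabytes ratio up to 240T tokens (this assumes DCLM's raw text has a similar bytes-per-token ratio, which is only an approximation):

```python
# Back-of-the-envelope storage estimate by scaling FineWeb's tokens-to-terabytes ratio.
# Assumption: DCLM's raw text has a bytes-per-token ratio similar to FineWeb's.
fineweb_tokens, fineweb_tb = 15e12, 48            # 15T tokens ≈ 48 TB
dclm_tokens = 240e12                              # 240T tokens
print(dclm_tokens / fineweb_tokens * fineweb_tb)  # ≈ 768 TB of raw text
```

Either way, the raw pool is on the order of hundreds of terabytes, which is why most users will work from the much smaller filtered subsets.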

2/7
Paper - [2406.11794] DataComp-LM: In search of the next generation of training sets for language models Request Access To DCLM-Pool - DataComp

3/7
From my understanding, the 240T tokens are just Common Crawl (not filtered). Their actual filtered datasets are much smaller. I think the main point of the 240T tokens is to give others the opportunity to filter them in a better way than the authors do, as future work.

4/7
micro_batch_size=0.000001 eta=2e16 hours

5/7
the whole net... 😄

6/7
Wth

7/7
Only 10T tokens are actually useful after deduplication.


 

1/6
Amid growing interest in closed-loop design for biological experiments, we demonstrate how LLM-powered agents can enhance both effectiveness and interpretability. We develop BioDiscoveryAgent to design genetic perturbation experiments 🧵1/6 @JianVora1 @qhwang3 @percyliang @jure

2/6
BioDiscoveryAgent designs experiments using only an LLM + tools. In each round, the agent constructs a prompt that combines the task description + experimental results. The LLM response identifies genes to perturb in the next round and provides reasoning and a long-term strategy.
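A minimal, hypothetical version of that loop might look like the sketch below; the `llm()` call, the prompt wording, and the `run_screen()` oracle are placeholders for illustration, not the actual BioDiscoveryAgent code (which is in the linked repo).

```python
# Hypothetical closed-loop gene-perturbation design with an LLM (illustrative, not the authors' code).
def llm(prompt: str) -> str:
    """Placeholder for a chat-completion call returning one gene name per line."""
    raise NotImplementedError

def run_screen(genes):
    """Placeholder for the wet-lab / simulated screen: returns {gene: phenotype score}."""
    raise NotImplementedError

task = "Identify genes whose knockout increases interferon-gamma production in T cells."
history = {}                                   # gene -> measured score from previous rounds

for round_idx in range(5):                     # e.g. 5 experimental rounds
    prompt = (
        f"Task: {task}\n"
        f"Results so far: {history}\n"
        "Reason about the biology, state a long-term strategy, then list 32 new genes to perturb."
    )
    proposed = [g.strip() for g in llm(prompt).splitlines() if g.strip()]
    proposed = [g for g in proposed if g not in history][:32]   # never re-test a gene
    history.update(run_screen(proposed))       # feed measurements back into the next round
```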

3/6
LLMs add value through prior knowledge + reasoning over experimental results. In addition, the agent uses tools to search/read papers, execute code to analyze data, and critique predictions. Its decision-making is interpretable at each step, enabling effective scientist feedback.

4/6
BioDiscoveryAgent detects 18% more hits in 1-gene perturbation screens and 29% more non-essential hit genes. This is *without* being specifically trained for this task, unlike Bayesian optimization baselines. It also shows a 2x improvement over the random baseline in 2-gene perturbations.

5/6
BioDiscoveryAgent uses both prior knowledge + experimental results for decision-making. Early rounds rely more on prior knowledge (Prompt Only performs better), while later rounds focus on experimental results (Observation Only). The agent with access to both consistently outperforms.

6/6
Thanks to the organizers of ICLR @MLGenX Workshop for recognizing our work with the Best Poster award! 🏆 Preprint: [2405.17631] BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments Github: GitHub - snap-stanford/BioDiscoveryAgent: BioDiscoveryAgent is an LLM-based AI agent for closed-loop design of genetic perturbation experiments We have more exciting work ongoing in this direction, please reach out if interested in collaborating!


 

1/7
Brilliant new paper.

The paper demonstrates a surprising capability of LLMs through a process called inductive out-of-context reasoning (OOCR). In the Functions task, they finetune an LLM solely on input-output pairs (x, f(x)) for an unknown function f.

After finetuning, the LLM exhibits remarkable abilities without being provided any in-context examples or using chain-of-thought reasoning:

a) It can generate a correct Python code definition for the function f.

b) It can compute f^(-1)(y) - finding x values that produce a given output y.

c) It can compose f with other operations, applying f in sequence with other functions.

This showcases that the LLM has somehow internalized the structure of the function during finetuning, despite never being explicitly trained on these tasks.

The process reveals that complex reasoning is occurring within the model's weights and activations in a non-transparent manner. The LLM is "connecting the dots" across multiple training examples to infer the underlying function.

This capability extends beyond just simple functions. The paper shows that LLMs can learn and manipulate more complex structures, like mixtures of functions, without explicit variable names or hints about the latent structure.

The findings suggest that LLMs can acquire and utilize knowledge in ways that are not immediately obvious from their training data or prompts, raising both exciting possibilities and potential concerns about the opacity of their reasoning processes.
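To make the Functions setup concrete, here is a small sketch of what such a finetuning corpus could look like (my own illustration; the exact prompt templates and function suite are in the paper). The model only ever sees (x, f(x)) pairs, never a definition of f, and is later queried about f directly.

```python
# Illustrative construction of a Functions-style finetuning set: the latent function is
# never stated, only observed through input-output pairs. (Not the paper's exact format.)
import json, random

def f(x: int) -> int:          # the latent function the model must infer (here: 3x + 2)
    return 3 * x + 2

random.seed(0)
examples = []
for _ in range(1000):
    x = random.randint(-100, 100)
    examples.append({
        "messages": [
            {"role": "user", "content": f"fn({x}) = ?"},
            {"role": "assistant", "content": str(f(x))},
        ]
    })

with open("functions_task.jsonl", "w") as fh:
    for ex in examples:
        fh.write(json.dumps(ex) + "\n")

# After finetuning on this file alone, the paper's OOCR tests ask things like
# "Write Python code for fn" or "For which x is fn(x) = 17?" with no in-context examples.
```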

2/7
👨‍🔬 The problem this paper solves: Before this paper, it was unclear whether LLMs could infer latent information from training data without explicit in-context examples, potentially allowing them to acquire knowledge in ways difficult for humans to monitor.

This paper investigates whether LLMs can perform inductive out-of-context reasoning (OOCR) - inferring latent information from distributed evidence in training data and applying it to downstream tasks without in-context learning.

📌 The paper introduces inductive OOCR, where an LLM learns latent information z from a training dataset D containing indirect observations of z, and applies this knowledge to downstream tasks without in-context examples.

3/7
Paper - [2406.14546] Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

4/7


5/7
📌 Five diverse tasks are used to evaluate inductive OOCR:
1. Locations: infer hidden city locations from distance predictions
2. Coins: learn coin biases from individual flip outcomes
3. Functions: learn mathematical functions from input-output pairs
4. Mixture of Functions: learn an unnamed distribution over functions
5. Parity Learning: infer Boolean assignments from parity formulas

6/7
How can the findings on OOCR influence the future direction of AI research and the development of new training paradigms?

7/7
Application in medicine: GPT Summary






[Submitted on 20 Jun 2024]

Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data


Johannes Treutlein, Dami Choi, Jan Betley, Cem Anil, Samuel Marks, Roger Baker Grosse, Owain Evans

One way to address safety risks from large language models (LLMs) is to censor dangerous knowledge from their training data. While this removes the explicit information, implicit information can remain scattered across various training documents. Could an LLM infer the censored knowledge by piecing together these implicit hints? As a step towards answering this question, we study inductive out-of-context reasoning (OOCR), a type of generalization in which LLMs infer latent information from evidence distributed across training documents and apply it to downstream tasks without in-context learning. Using a suite of five tasks, we demonstrate that frontier LLMs can perform inductive OOCR. In one experiment we finetune an LLM on a corpus consisting only of distances between an unknown city and other known cities. Remarkably, without in-context examples or Chain of Thought, the LLM can verbalize that the unknown city is Paris and use this fact to answer downstream questions. Further experiments show that LLMs trained only on individual coin flip outcomes can verbalize whether the coin is biased, and those trained only on pairs (x,f(x)) can articulate a definition of f and compute inverses. While OOCR succeeds in a range of cases, we also show that it is unreliable, particularly for smaller LLMs learning complex structures. Overall, the ability of LLMs to "connect the dots" without explicit in-context learning poses a potential obstacle to monitoring and controlling the knowledge acquired by LLMs.


Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2406.14546 [cs.CL]
(or arXiv:2406.14546v1 [cs.CL] for this version)
[2406.14546] Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

Submission history​

From: Dami Choi [view email]
[v1] Thu, 20 Jun 2024 17:55:04 UTC (2,101 KB)

 

1/11
powerful, fast, or safe? pick three.

2/11
Alright that was pretty sweet

3/11
let's, as they say, go

4/11
damn I might switch to it full time, it has vision too right?

5/11
SOTA:

6/11
the demos have you written all over them ❣️ love how much fun yall are clearly having

7/11
couldn't collab more beautifully than with the inimitable @whitneychn

8/11
Nice chart. Competitive markets truly accelerate innovation!

9/11
What's up, @sammcallister ? Can I dm you and show you how I got Claude 3.5 Sonnet to 1 shot solve word problems that it previously couldn't?

10/11
great graphic

11/11
I take the one with new features to test 👀 Claude 3.5 fits there as well! Loads of small nice details, like code revisions over here 🔥



 

1/11
AI models are quickly converging to being capable of receiving Yves Lafont's paper (15k tokens) and outputting a functional Interaction Combinator runtime. This is BY FAR the best answer to that experiment I've ever seen. Is Claude-3.5 that smart? Did it study my code? 👀

2/11
To elaborate:

Yves does NOT explain how to implement the system at all, he just defines it in mathematical terms. By all means, ICs aren't hard to implement, but understanding what the paper is saying without images is tough. The best models so far always outputted 100% bullshyt code. I just tested again and Opus/GPT-4 outputs are always just gibberish. Sonnet 3.5 did surprisingly well:

1. It defines a reasonable representation for IC nodes, including the 3 types (CON/DUP/ERA)

2. It implements an interaction table, doing the correct dispatch

3. It handles all rules reasonably well:

- On annihilation rules, it kinda-correctly crosses the wires

- On commutation rules, it correctly allocates new nodes and does some wirings

- On erasure rules, it correctly replaces neighbors by null

It is NOT executable code, but for the first time, it has a shape that is pointed in the right direction. It 100% knows what it is doing. The jump in quality of this specific prompt is like from 3% to 30% correct. Very cool!
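For readers who haven't seen interaction combinators, the following is a tiny, heavily simplified sketch of the structure described above: node types, an interaction table keyed on the pair of types, and the three rule families. Conventions (for example, which annihilation crosses the aux wires) vary between presentations, so treat this as an illustrative sketch rather than a faithful runtime; Lafont's paper and the HVM repo are the real references.

```python
# Tiny, simplified interaction-combinator sketch (illustrative; wire conventions vary by presentation).
from itertools import count

_ids = count()

class Node:
    def __init__(self, kind):            # kind: 'CON', 'DUP', or 'ERA'
        self.kind = kind
        self.id = next(_ids)
        self.ports = [None, None, None]  # (peer_node, peer_slot) per port; port 0 is principal

def link(a, i, b, j):
    """Wire port i of node a to port j of node b (in both directions)."""
    a.ports[i] = (b, j)
    b.ports[j] = (a, i)

def annihilate(a, b):
    # Same-kind active pair (CON-CON or DUP-DUP): connect a's aux wires to b's aux wires.
    # (Whether the wires cross here depends on the chosen convention.)
    for i in (1, 2):
        (pa, sa), (pb, sb) = a.ports[i], b.ports[i]
        link(pa, sa, pb, sb)

def erase(n):
    # ERA against CON/DUP: each aux neighbor of n gets a fresh eraser on its wire.
    for i in (1, 2):
        peer, slot = n.ports[i]
        link(Node('ERA'), 0, peer, slot)

def commute(a, b):
    # CON against DUP: duplicate each node and rewire the aux ports through the copies.
    a1, a2, b1, b2 = Node(a.kind), Node(a.kind), Node(b.kind), Node(b.kind)
    link(a1, 1, b1, 1); link(a1, 2, b2, 1)
    link(a2, 1, b1, 2); link(a2, 2, b2, 2)
    for new, old, i in ((b1, a, 1), (b2, a, 2), (a1, b, 1), (a2, b, 2)):
        peer, slot = old.ports[i]
        link(new, 0, peer, slot)

def interact(a, b):
    """Dispatch on the pair of node kinds, as in an interaction table."""
    kinds = (a.kind, b.kind)
    if 'ERA' in kinds:
        if kinds == ('ERA', 'ERA'):
            return                              # two erasers simply cancel out
        erase(b if a.kind == 'ERA' else a)
    elif a.kind == b.kind:
        annihilate(a, b)
    else:
        commute(a, b)                           # CON vs DUP (or DUP vs CON)
```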

3/11
Actually, I do feel it is being trained on my code lol. Naming conventions and the way it does things are eerily similar to my style. I'm all for it btw, but if that's the case, that makes it slightly less impressive haha

4/11
Claude 3.5 is that smart.

I've pretty much only been using claude 3.5 for the past little while, writing shader code, parallelized Interpolation stuff, etc etc and the quality of the output is superb.
Plus, it needs way less guidance and can do more with less prompting.

So, ye

5/11
but it released today

6/11
I just tested Claude 3.5, it's pretty impressive.

7/11
How updated is Claude-3.5? I'm currently using codellama, he is trained until 2019

8/11
Huh, that's interesting. From following your project I did the same experiment with Opus & it wasn't nearly as good. Models are getting good

9/11
Yes, it would be impossible for it to NOT have been trained on your code

10/11
I'm of the opinion that everything we do and say is training data, unless you work in security and isolation. But if you insist, nothing stops you working in a secure and isolated environment by yourself. It's a solved problem. Check out the lore on US SAP compartmentalization.

11/11
Bruh using gpt 4o as the ide I can get Claude to instruct very large concepts kanstem is a example of that



 
1/1
Newly published work from FAIR, Chameleon: Mixed-Modal Early-Fusion Foundation Models.

This research presents a family of early-fusion token-based mixed-modal models capable of understanding & generating images & text in any arbitrary sequence.

Paper [2405.09818] Chameleon: Mixed-Modal Early-Fusion Foundation Models



 

1/12
Today we’re launching the ElevenLabs iOS app! It lets you listen to any article, book, or doc using our AI generated voices. Check it out 🚀

2/12
Download it here for free! And huge shoutout to @JakubLichman, @DanielEdrisian, Marcin Jelenski, Gordon Childs, @MaietryPrajapat, @JustinHackneyai, Jack Maltby, @gmrchk, @NevFlynn, @_samsklar, and the amazing team that contributed to this launch! ‎ElevenLabs Reader: AI Audio

3/12
“Today”

4/12
Ah sorry, EU release in a couple of weeks!

5/12
Nicely done! Fantastic video too. Congrats

6/12
Thanks Karim! And big creds to @JustinHackneyai, Jack Maltby, @_samsklar on the video!

7/12
Omg, Sam at the end, I’m dying 🥹💙🫶

8/12
hahah @_samsklar is a natural!!!

9/12
ElevenLabs sharing from Safari is not displayed in the share options

10/12
Oh interesting, can you check your more menu in safari?

11/12
And Android?

12/12
Coming very soon! Waitlist here for updates! ElevenLabs Reader Waitlist (Android)






1/11
Introducing the ElevenLabs Reader App. Listen to any article, PDF, ePub, or any text on the go with the highest quality AI voices.

Download now and have your life narrated: Listen to anything on the go with the highest quality voices

2/11
Hear from a few of our beta testers:

“Overall, it's been perfect. Enunciations, tone, accents, fluidity have been amazing.”

“I've had the pleasure of using your mobile reader service in the last few weeks-- and it has been fantastic. It's been perfect for reviewing documents and drafts, catching up on items, and the incorporation of the different voices recently has made it an amazing experience.”

“The seamless maintenance of tone and voice across extensive articles is a testament to the app's sophistication, distinguishing it from its counterparts in the market. It's absolutely outstanding to be able to have a voice that keeps its consistency and tone even through very long text.”

3/11
The app is available today for iOS users in the United States, United Kingdom, and Canada. Once we add multilingual support, we’ll launch globally.

Download it on iOS, or sign up for launch notifications, here: ‎ElevenLabs Reader: AI Audio
Join the Android waitlist here: ElevenLabs Reader Waitlist (Android)

4/11
I hear ya ElevenLabs!

5/11
It would be nice to create our own custom voice pack for GPS.

6/11
Awesome work, can't wait to use it!

7/11
This feels like it's been a long time coming

8/11
I work extensively with AI and many LLMs and why the fukk didn't I think about this or know about this ????

This is so going to make my driving and workout time so much more productive!

9/11
This app is amazing.

You can import any content from safari directly into the app. It scrapes the page, generates a transcript and reads it to you.

Bonus: the transcript can be copied, making this a great lite web scraper.

10/11
I really like this app!

11/11
Can you add an option for it to skip over reading URL’s in the text?




 

Introducing the ElevenLabs Reader App​


Listen to any text on the go with the highest quality voices

By Sam Sklar in Product — Jun 25, 2024

Introducing the ElevenLabs Reader App


This morning I was walking to catch a bus, face glued to my screen reading the news. I didn’t realize I was set on a collision course with another commuter until we were just inches away from each other.

As he entered my peripheral vision, I looked up. We were at a standstill. We then engaged in the awkward side to side shuffle I'm sure you know too well. Finally I made it past but carried the shame of it all for the rest of my commute.

I’m not the only one to encounter this issue. On my commute I came across others bumping into stop signs, stepping into puddles, or missing their bus stop.

Podcasts & audiobooks are great, but the majority of content we consume today is only available as text. And sometimes you just need to finish reading a memo before you get to the office.

Introducing the ElevenLabs Reader App​

The ElevenLabs Reader App lets you listen to any text content, with ElevenLabs voices, on the go. This expands your library of audio content to any article, PDF, ePub, newsletter, or any other text on your phone. And with our expansive, ever growing voice library, you can find a voice to suit any content, mood, or occasion.




Hear from our beta testers​

“Overall, it's been perfect. Enunciations, tone, accents, fluidity have been amazing.”

“I've had the pleasure of using your mobile reader service in the last few weeks– and it has been fantastic. It's been perfect for reviewing documents and drafts, catching up on items, and the incorporation of the different voices recently has made it an amazing experience.”

“Thank you for letting me test the reader! I already love it very much and am thrilled. Works perfectly. Perfect usability as always with Elevenlabs.

I'm particularly looking forward to the different languages. I would like to use the reader in education in the future.”

“All the new voices are neat. You guys are amazing! Brian has been the best.”

“The seamless maintenance of tone and voice across extensive articles is a testament to the app's sophistication, distinguishing it from its counterparts in the market. It's absolutely outstanding to be able to have a voice that keeps its consistency and tone even through very long text.”

“I am absolutely fascinated by your Beta application, which promises to radically transform our daily lives. The exceptional voice quality it provides is particularly crucial for me, given my visual impairment.”

“Let me just say that this potentially is a game changer for those of us who cannot read print material. I'm totally blind and use elevenlabs reader on Ios. I love the fact that the buttons are labeled. I can't sing the praises of the voices enough having tried it so far.”

Ready to experience it yourself? Download it on iOS here. Join our Android beta test here. It’s free to download and free to use for the first 3 months.

Why launch a reader app?​

It’s our mission to make content accessible in any language and voice, and everything we do is oriented around achieving that mission.

Creating best in class AI audio models is not enough. Creators need tools through which they can create. And consumers need interfaces through which they can consume audio. Some of these interfaces we build ourselves. Others are built by teams we’ve enabled with our API.

What’s coming next?​

Our reader app roadmap will depend in large part on your feedback. Here are some things that have already been requested:

  • Offline Support & Sharing: download content for offline listening. Share audio snippets with friends.
  • More languages: today the app is only available in English. Soon we’ll make it available in all 29 (and counting) languages supported by our Multilingual model.
  • More ways to add content: RSS feeds, AI summarization, and more.



Download today​

The app is available today for iOS users in the United States, United Kingdom, and Canada. Once we add multilingual support, we’ll launch globally.

Download it on iOS here.

Join the Android waitlist here.
 