bnew


1/29
@sdianahu
Deepsilicon runs neural nets with 5x less RAM and ~20x faster. They are building SW and custom silicon for it.

What’s interesting is that they have proved it with SW, and you can even try it.

On why we funded them 1/7

2/29
@sdianahu
2/7 They found that representing transformer models as ternary values (-1, 0, 1) eliminates the need for computationally expensive floating-point math.
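A rough back-of-envelope reading of the RAM claim (the parameter count and packing below are my assumptions, not Deepsilicon's numbers): moving weights from 16-bit floats to 2-bit packed ternary values cuts weight memory by about 8x, which leaves headroom for activations and KV cache and still lands near the quoted ~5x.

```python
params = 7e9          # hypothetical 7B-parameter transformer
fp16_bits = 16        # bits per weight in fp16
ternary_bits = 2      # {-1, 0, 1} packs into 2 bits (4 weights per byte)

fp16_gb = params * fp16_bits / 8 / 1e9
ternary_gb = params * ternary_bits / 8 / 1e9
print(f"fp16 weights:    {fp16_gb:.1f} GB")    # ~14.0 GB
print(f"ternary weights: {ternary_gb:.1f} GB") # ~1.8 GB
```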

3/29
@sdianahu
3/7 So, there is no need for GPUs, which are good at floating point matrix operations, but energy and memory-hungry.

4/29
@sdianahu
4/7 They actually got SOTA models to run, overcoming the issues from the MSFT BitNet paper that inspired this. BitNet: Scaling 1-bit Transformers for Large Language Models - Microsoft Research.

5/29
@sdianahu
5/7 Now, you could run SOTA models that typically need HPC GPUs like H100s for inference on consumer or embedded GPUs like the NVIDIA Jetson.

This makes it possible for the first time to run SOTA models on embedded HW, such as robotics, that needs real-time inference.

6/29
@sdianahu
6/7 What NVIDIA is overlooking is the opportunity in specialized HW for inference, since they've been focused on the high end of the HPC cluster world.

7/29
@sdianahu
7/7 You can try the SW version here: GitHub - deepsilicon/Sila

When they get the HW ready, the speedups and energy savings will be even greater.

More details here too: Launch HN: Deepsilicon (YC S24) – Software and hardware for ternary transformers | Hacker News

8/29
@sdianahu
Intuitively this works because neurons in DNNs use activation functions that are S-curved, with roughly three states. With only (-1, 0, 1), the dot product between matrices reduces to simple additions and subtractions.
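A minimal sketch of that claim (illustrative only, not Deepsilicon's kernels): with weights restricted to {-1, 0, 1}, a matrix-vector product needs no multiplications at all.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))   # ternary weight matrix, entries in {-1, 0, 1}
x = rng.standard_normal(8)             # floating-point activations

# Reference: the usual floating-point matmul a GPU would run.
y_ref = W @ x

# Ternary version: add the activations where the weight is +1,
# subtract where it is -1, skip where it is 0 -- no multiplies.
y_ternary = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])

assert np.allclose(y_ref, y_ternary)
```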

9/29
@sdianahu
The OSS project at the moment converts a model to ternary through pretraining/distillation. It'll eventually include their ternary implementation. Stay tuned for when the Deepsilicon team updates Sila.
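For reference, the follow-up BitNet b1.58 work describes an "absmean" quantizer for producing ternary weights; a sketch of that idea (this is the paper's recipe, not necessarily how Sila converts models under the hood):

```python
import numpy as np

def absmean_ternary(W, eps=1e-6):
    """Quantize a weight matrix to {-1, 0, 1}, BitNet b1.58 style:
    scale by the mean absolute weight, then round and clip."""
    scale = np.mean(np.abs(W)) + eps
    W_t = np.clip(np.round(W / scale), -1, 1)
    return W_t, scale

W = np.random.randn(256, 256).astype(np.float32)   # toy fp weight matrix
W_t, scale = absmean_ternary(W)
W_approx = W_t * scale    # dequantized approximation used at inference time
```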

10/29
@StephanSturges
interested in an intro as a potential customer

11/29
@gpt_biz
This sounds impressive, definitely worth checking out if you're into cutting-edge AI hardware development!

12/29
@gopal_kp
@threadreaderapp unroll

13/29
@threadreaderapp
@gopal_kp Good day, here is the unroll you asked for: Thread by @sdianahu on Thread Reader App. Talk to you soon. 🤖

14/29
@AsimovsOracle
Does this obviate NVidia entirely as a company, lol?

15/29
@EseralLabs
Now I have a good reason to use the Jetson :smile:

16/29
@adamviaja
I learned from context that SW stands for "software," in case anyone else was also confused 🫶🏼 cheers

17/29
@Rufus87078959
Impressive 👏

18/29
@alexdanilo99
The founders are truly built different.

19/29
@uday_1212
Wow!! This is really awesome!! Will eagerly look forward to how this takes shape.
However, we'd still need GPUs for training tasks, but inference is the major share.

20/29
@timelessdev
Just a software solution? I need to try it.

21/29
@mattspergel
You still need to keep the model in a vector database?

22/29
@Karnage420x
I’m far too dumb to understand this.

23/29
@aitedmaster
Deepsilicon's innovation is exactly what we need to accelerate AI development. By optimizing RAM usage and boosting speed, they're pushing the boundaries of what's possible. Impressive that they're making it accessible through software too.

24/29
@cleantecher
Why is the reduction in RAM needed if the code is efficient? What's the energy draw of well-written vs. poorly written code, and how does that change the thermal effects on servers in farms being cooled by AC, which will become uneconomical in 3 years and melt?

25/29
@shreyashdamania
How do you learn and understand these concepts from the bottom up?

Asking as a complete beginner.

BTW, thank you for the Lightcone podcast 🙂!!

26/29
@CannaFirm
When is their next round? Can we buy some of your equity?

27/29
@Photo_Jeff
Please give me a device I can plug into any USB-C port on Linux, macOS, and Windows, load up a model, and run inference. I don't need another Google TPU, Groq, or million-dollar cloud-based solution, and Coral AI is dead.

28/29
@NRA29
Wow ! 🤩

29/29
@mobadawy_
This technology exists and has been commercialized for over 15 years.

They just dropped out of college and gave it a new name.

Fixed-point processors are used in almost every application around you.

Who's reviewing these applications?



bnew


New powerful open Text to Speech model: Fish Speech 1.4 - trained on 700K hours of speech, multilingual (8 languages)




bnew





1/31
@MistralAI
magnet:?xt=urn:btih:7278e625de2b1da598b23954c13933047126238a&dn=pixtral-12b-240910&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.com%3A1337%2Fannounce&tr=http%3A%2F%2Ftracker.ipv6tracker.org%3A80%2Fannounce

2/31
@_unwind_ai
Find all the awesome LLM app demos with RAG and AI agents in this AI newsletter for 10x developers.

P.S.: Don't forget to subscribe for FREE to show your support 🌟

https://www.theunwindai.com/

3/31
@dreamworks2050
V O I L A

4/31
@duborges
While we're here, any cool movies to download?

5/31
@TheWorldNews
What a wonderful release.

6/31
@0xANDREAS
awesome

7/31
@pelaseyed
Here we go!

8/31
@PrimeIntellect
awesome release!!

9/31
@kianerfaan
I love you mistral

10/31
@jworthy
I love when you do this

11/31
@A1Mish
We’re so back

12/31
@testingcatalog
Is it planned to be added to Le Chat? 👀👀👀

13/31
@QwinKolle
How do we use this ?

14/31
@Prince_Canuma
Coming to MLX 🚀

15/31
@LeeLeepenkman
Text and images in, text and images out

Cc @isidentical / @fofr and friends

16/31
@airesearch12
state of ai twitter bubble

17/31
@BoomBrusher
Low key midnight torrent drop, no big deal

18/31
@secemp9
check this out @_xjdr

19/31
@yacineMTB
thank you

20/31
@untitled01ipynb
what's in the BOX, Frenchmen

21/31
@reach_vb
Congrats on the release, there’s a community upload on HF now:

https://huggingface.co/mistral-community/pixtral-12b-240910
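If you'd rather skip the torrent, a minimal sketch for grabbing that community mirror (the repo id is taken from the tweet above; the rest is a standard huggingface_hub call):

```python
from huggingface_hub import snapshot_download

# Download the community re-upload of the Pixtral 12B weights.
local_dir = snapshot_download("mistral-community/pixtral-12b-240910")
print("weights saved to", local_dir)
```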

22/31
@thedudeonx
LFG Pixtral!

23/31
@avi_press
This is just the best way to do release announcements 👏

24/31
@rjmalka
i am never ever checking for what this torrent is

25/31
@RayLin0803
🤔

26/31
@omarsar0
nice!

27/31
@meta_ukraine
Finally, an open-source multimodal LLM 🙏

28/31
@C0NWIC
🤝📡

29/31
@Qnoox
Nice 😎🙌

30/31
@filipwojda
nice

31/31
@TrueNathanD
Torrenting this now. 😍



bnew

Veteran
Joined
Nov 1, 2015
Messages
57,369
Reputation
8,499
Daps
160,092




1/4
This is probably the most interesting prompting technique of 2024 🤯

Self-Harmonized Chain of Thought (ECHO) = CoT reasoning with a self-learning, adaptive and iterative refinement process ✨

1/ ECHO begins by clustering a given dataset of questions based on their semantic similarity

2/ Each cluster then has a representative question selected, and the model generates a reasoning chain for it using zero-shot Chain of Thought (CoT) prompting, breaking the solution into intermediate steps.

3/ During each iteration, one reasoning chain is randomly chosen for regeneration, while the remaining chains from other clusters are used as in-context examples to guide improvement.

So what’s so special about this?

> Reasoning patterns can cross-pollinate - as in, if one chain contains errors or knowledge gaps, other chains can help fill in those weaknesses

> Reasoning chains can be regenerated and improved multiple times - leading to a well-harmonized set of solutions where errors and knowledge gaps are gradually eliminated

This is like a more dynamic and scalable alternative to Google DeepMind's "Self-Discover" prompting technique, but for CoT reasoning chains that adapt and improve over time across complex problem spaces.

> Ziqi Jin & Wei Lu (Sept 2024). "Self-Harmonized Chain of Thought"

For more on this, you can find a link to the paper and GitHub down below 👇
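A rough sketch of the loop described in steps 1-3 above (the embed, cluster, and llm callables are placeholders I'm assuming, not the authors' code):

```python
import random

def echo(questions, embed, cluster, llm, rounds=4, seed=0):
    """Hedged sketch of Self-Harmonized CoT (ECHO).
    embed(q) -> vector; cluster(vectors, questions) -> list of question groups;
    llm(prompt) -> generated reasoning chain (str)."""
    rng = random.Random(seed)

    # 1) group questions by semantic similarity, one representative per cluster
    groups = cluster([embed(q) for q in questions], questions)
    reps = [group[0] for group in groups]

    # 2) zero-shot CoT chain for each representative
    chains = [llm(f"Q: {q}\nA: Let's think step by step.") for q in reps]

    # 3) repeatedly regenerate one randomly chosen chain, using the other
    #    clusters' chains as in-context examples ("cross-pollination")
    for _ in range(rounds * len(chains)):
        i = rng.randrange(len(chains))
        demos = "\n\n".join(c for j, c in enumerate(chains) if j != i)
        chains[i] = llm(f"{demos}\n\nQ: {reps[i]}\nA: Let's think step by step.")

    # harmonized demonstrations, usable as few-shot examples for new questions
    return list(zip(reps, chains))
```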

2/4
Paper: [2409.04057] Self-Harmonized Chain of Thought

3/4
Github: GitHub - Xalp/ECHO: Official homepage for "Self-Harmonized Chain of Thought"

They have not yet published the code btw... I assume it'll be there soon though!

4/4
Great point! I think there needs to be more evals done across additional domains to see just how effective this is…

I really liked Google's Self-Discover prompting method from earlier this year, which was 32% more effective than CoT and, compared against CoT self-consistency, required 10-40x less compute.

I think approaches that merge some form of optimization on top of a Self-Discover-like algorithm can be incredibly powerful… the design pattern used in self-harmonizing may be a good candidate for enhancing the Self-Discover prompting technique and making it more adaptive and scalable for more complex problem spaces.

I’ll draw out a diagram of a hybrid approach and work on a POC for this soon!



bnew







1/6
Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own.

PaperQA2 is also the first agent to beat PhD and Postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks and assessments by human experts. We are publishing a paper and open-sourcing the code.

This is the first example of AI agents exceeding human performance on a major portion of scientific research, and will be a game-changer for the way humans interact with the scientific literature.

Paper and code are below, and congratulations in particular to @m_skarlinski, @SamCox822, @jonmlaurent, James Braza, @MichaelaThinks, @mjhammerling, @493Raghava, @andrewwhite01, and others who pulled this off. 1/

2/6
PaperQA2 finds and summarizes relevant literature, refines its search parameters based on what it finds, and provides cited, factually grounded answers that are more accurate on average than answers provided by PhD and postdoc-level biologists. When applied to answer highly specific questions, like this one, it obtains SOTA performance on LitQA2, part of LAB-Bench focused on information retrieval. 2/

3/6
PaperQA2 can also do broad-based literature reviews. WikiCrow, which is an agent based on PaperQA2, writes Wikipedia-style articles that are significantly more accurate on average than actual human-written articles on Wikipedia, as judged by PhD and postdoc-level biologists. We are using WikiCrow to write updated summaries of all 20,000 genes in the human genome. They are still being written, but in the meantime see a preview: WikiCrow | FutureHouse 3/

4/6
We spent a lot of effort on making our open-source version excellent. We put together a new system for building metadata from arbitrary PDFs, full-text search, and more. See it here: GitHub - Future-House/paper-qa: High accuracy RAG for answering questions from scientific documents with citations 4/
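A hedged usage sketch following the Docs/query pattern shown in earlier paper-qa READMEs (the exact entry points may differ in the PaperQA2 release, and the my_papers/ folder and question are hypothetical):

```python
from pathlib import Path
from paperqa import Docs

docs = Docs()
for pdf in Path("my_papers").glob("*.pdf"):   # hypothetical local folder of PDFs
    docs.add(str(pdf))                        # parses, builds metadata + full-text index

answer = docs.query("Which assays are commonly used to measure protein aggregation?")
print(answer.formatted_answer)                # cited, grounded answer with sources
```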

5/6
Also, see our preprint for details, here: http://paper.wikicrow.ai 5/

6/6
And, of course, this was all made possible through the wonderful generosity of Eric and Wendy Schmidt, and all of our other funders, including Open Philanthropy for supporting our work on LitQA2, the NSF National AI Resource program, and others! 6/

