Sam Altman claims “deep learning worked”, superintelligence may be “a few thousand days” away, and “astounding triumphs” will incrementally become ...

bnew

Veteran
Joined
Nov 1, 2015
Messages
57,344
Reputation
8,496
Daps
160,030
Sam Altman is literally a salesman who knows next to nothing about AI. The things he says may or may not be true, but him saying them doesn't give any clarity to the matter.

His ONLY goal is to increase his own net worth via promoting his company, and all of his words should be interpreted on that basis.

Yann LeCun is head of Meta AI. :ld:



an aside..



1/1
Dane Vahey of OpenAI says the cost per million tokens has fallen from $36 to $0.25 in the past 18 months, and as such AI is the greatest cost-depreciating technology ever invented
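
For scale, here is a rough back-of-the-envelope check of those numbers. It only uses the three figures quoted in the tweet ($36, $0.25, 18 months); the assumption that the price fell by the same fraction every month is mine, purely for illustration, and the variable names are made up.

```python
import math

# Figures quoted in the tweet above.
start_cost = 36.00   # USD per million tokens at the start
end_cost = 0.25      # USD per million tokens at the end
months = 18          # length of the period

# Overall change.
fold_reduction = start_cost / end_cost              # how many times cheaper
total_drop_pct = (1 - end_cost / start_cost) * 100  # percent decline overall

# Assumption (not from the tweet): a constant fractional decline per month.
monthly_factor = (end_cost / start_cost) ** (1 / months)
monthly_drop_pct = (1 - monthly_factor) * 100
halving_months = math.log(0.5) / math.log(monthly_factor)

print(f"{fold_reduction:.0f}x cheaper, a {total_drop_pct:.1f}% total drop")
print(f"about {monthly_drop_pct:.1f}% cheaper each month")
print(f"price halves roughly every {halving_months:.1f} months")
```

None of this says anything about capability, only about price per token.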


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

Professor Emeritus

Veteran
Poster of the Year
Supporter
Joined
Jan 5, 2015
Messages
51,330
Reputation
19,666
Daps
203,884
Reppin
the ether
What they're saying has little relationship to what Altman was saying. Being able to predict the next word more powerfully or more efficiently doesn't make the world-changing leaps he promised, and could easily make the world much worse. Altman talks like someone who knows little about AI and next to nothing about the actual problems of the world. TechCrunch laid into him recently; these are some of the worst quotes they pulled from his interview.



“We can have shared prosperity to a degree that seems unimaginable today; in the future, everyone’s lives can be better than anyone’s life is now.”

“This may turn out to be the most consequential fact about all of history so far. It is possible that we will have superintelligence in a few thousand days; it may take longer, but I’m confident we’ll get there.”

“If we don’t build enough infrastructure, AI will be a very limited resource that wars get fought over and that becomes mostly a tool for rich people.”

“…the future is going to be so bright that no one can do it justice by trying to write about it now.”

“A defining characteristic of the Intelligence Age will be massive prosperity.”

“Although it will happen incrementally, astounding triumphs — fixing the climate, establishing a space colony, and the discovery of all of physics — will eventually become commonplace.”




All of that is completely unjustified nonsense. It's just how he sells his product.
 

WaveCapsByOscorp™

2021 Grammy Award Winner
Joined
May 2, 2012
Messages
18,979
Reputation
-436
Daps
45,130
Y'all love talking doom and gloom. Especially when it comes to AI.

You really think a machine is smarter than you? I’m not even talking about processing information faster, I’m talking real intelligence.

That’s like people claiming artificial flavors are better than the real thing.
 

Micky Mikey

Veteran
Supporter
Joined
Sep 27, 2013
Messages
15,998
Reputation
2,962
Daps
89,254
What they're saying has little relationship to what Altman was saying. Being able to predict the next word more powerfully or more efficiently doesn't make the world-changing leaps he promised, and could easily make the world much worse. Altman talks like someone who knows little about AI and next to nothing about the actual problems of the world. TechCrunch laid into him recently; these are some of the worst quotes they pulled from his interview.



“We can have shared prosperity to a degree that seems unimaginable today; in the future, everyone’s lives can be better than anyone’s life is now.”

“This may turn out to be the most consequential fact about all of history so far. It is possible that we will have superintelligence in a few thousand days; it may take longer, but I’m confident we’ll get there.”

“If we don’t build enough infrastructure, AI will be a very limited resource that wars get fought over and that becomes mostly a tool for rich people.”

“…the future is going to be so bright that no one can do it justice by trying to write about it now.”

“A defining characteristic of the Intelligence Age will be massive prosperity.”

“Although it will happen incrementally, astounding triumphs — fixing the climate, establishing a space colony, and the discovery of all of physics — will eventually become commonplace.”




All of that is completely unjustified nonsense. It's just how he sells his product.

Have you tried using their most recent model, o1?

While it isn't perfect, I see how in a few iterations it could surpass human experts in things like mathematics and scientific research. And with agents, it'll soon be like having a thousand PhD students working on whatever issue you hand it. It's hard to see how this won't be revolutionary and contribute massively to scientific progress.

I agree that Sam isn't trustworthy but I don't think he's solely upselling A.I. to sell his product.
 

Amestafuu (Emeritus)

Veteran
Supporter
Joined
May 8, 2012
Messages
70,374
Reputation
13,842
Daps
298,530
Reppin
Toronto
Another reason why it matters what type of governance and leadership we have in place during these times. We'll need leaders who'll make it possible that the wealth created by A.I. will be distributed equally and not hoarded by a small group of elites. Who we elect as president in 2024 will be consequential in how this all plays out. We need a competent administration and Trump is definitely not it.
It's gonna be hoarded obviously. The people working on this are in business
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
57,344
Reputation
8,496
Daps
160,030
What they're saying has little relationship to what Altman was saying. Being able to predict the next word more powerfully or more efficiently doesn't make the world-changing leaps he promised, and could easily make the world much worse. Altman talks like someone who knows little about AI and next to nothing about the actual problems of the world. TechCrunch laid into him recently; these are some of the worst quotes they pulled from his interview.



“We can have shared prosperity to a degree that seems unimaginable today; in the future, everyone’s lives can be better than anyone’s life is now.”

“This may turn out to be the most consequential fact about all of history so far. It is possible that we will have superintelligence in a few thousand days; it may take longer, but I’m confident we’ll get there.”

“If we don’t build enough infrastructure, AI will be a very limited resource that wars get fought over and that becomes mostly a tool for rich people.”

“…the future is going to be so bright that no one can do it justice by trying to write about it now.”

“A defining characteristic of the Intelligence Age will be massive prosperity.”

“Although it will happen incrementally, astounding triumphs — fixing the climate, establishing a space colony, and the discovery of all of physics — will eventually become commonplace.”




All of that is completely unjustified nonsense. It's just how he sells his product.

I think it's kinda justified if you assume automated artificial intelligence will help solve problems at an unprecedented rate, which will lead to new innovations and new discoveries: finding cures to illnesses, or just building on the science we have now to the point where it shaves decades or centuries off human research and scientific advancement. I'm not so sure about the resource wars, because that seems predicated on the assumption that we'll rely on the same resources we do now in greater quantities, when I think advancements in materials science will make resource wars less warranted.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
57,344
Reputation
8,496
Daps
160,030
It's gonna be hoarded obviously. The people working on this are in business

compute can be hoarded for a limited time (a decade or two), but eventually some amount of compute greater than what we have now will be a commodity available to the masses.

a lot of research is openly available to the public, and researchers from many established companies have been jumping ship to other companies or starting their own AI companies. There are also open source large language models and other types of open source AI models. i mean, it's been two years and there are open source LLMs better than the ChatGPT 3.5 version that debuted in Nov. 2022.

WYn7oam.png

 

Micky Mikey

Veteran
Supporter
Joined
Sep 27, 2013
Messages
15,998
Reputation
2,962
Daps
89,254
but won't AGI require enormous amounts of compute? Which is the reason why these tech companies are investing so much in data centers. It's hard to see how open source will be able to compete.
 

dangerranger

All Star
Joined
Jun 14, 2012
Messages
940
Reputation
300
Daps
2,815
Reppin
NULL
Sam Altman is literally a salesman who knows next to nothing about AI. The things he says may or may not be true, but him saying them doesn't give any clarity to the matter.

His ONLY goal is to increase his own net worth via promoting his company, and all of his words should be interpreted on that basis.
Look at what ChatGPT's current products are capable of doing and how it's advanced in the last few months. You might feel differently.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
57,344
Reputation
8,496
Daps
160,030
but won't AGI require enormous amounts of compute? Which is the reason why these tech companies are investing so much in data centers. It's hard to see how open source will be able to compete.


alibaba just released a 72B model (Qwen 2.5) that performs better in some cases than Meta's Llama 3.1 405B model. models can be made smarter, smaller and more efficient. all that compute is based on current hardware; who knows what the compute necessary to run AGI will look like in a decade or two.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
57,344
Reputation
8,496
Daps
160,030
@Micky Mikey

i also forgot to mention 1-bit, which hasn't been widely adopted yet but shows promise for what regular people could soon have access to: take large models and shrink them so they can run on consumer CPUs with nearly the same accuracy as the original model.
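
To put rough numbers on "shrink them", here is a quick sketch of how much memory the weights alone would need at different precisions. The formula (parameters × bits per weight ÷ 8) is just arithmetic; the function name is made up, and the 72B and 405B sizes are borrowed from the models mentioned earlier in the thread only as examples. It ignores activations, the KV cache and any packing overhead, and it does not claim those particular models have been quantized this way.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Example sizes only; parameter counts taken from the model names.
sizes = {"72B": 72e9, "405B": 405e9}
precisions = {"FP16": 16, "1.58-bit": 1.58}

for model, n_params in sizes.items():
    for label, bits in precisions.items():
        gb = weight_memory_gb(n_params, bits)
        print(f"{model} at {label}: about {gb:.1f} GB of weights")
```

On these assumptions a 72B model drops from roughly 144 GB of weights at FP16 to roughly 14 GB at 1.58 bits per weight, which is the difference between a multi-GPU server and a well-equipped desktop.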



1/1
1-bit LLMs could solve AI’s energy demands: « “Imprecise” language models are smaller, speedier—and nearly as accurate. » @IEEESpectrum @SilverJacket #1bit #LLM #AI





1/1
I am hopeful for this new ChatGPT AI angle for LLMs, but I can tell you that I successfully ran TinyLlama 1.1B on a Raspberry Pi 5 at quite a fast speed, which is only a 638 Megabyte download. #SmallLLM #SLM #TinyLlama #1BitLLM #1BitAI #TinyAI #TinyLLM

GitHub - jzhang38/TinyLlama: The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
[Quoted tweet]
✨ Microsoft 1-bit era paper (released in Feb) is really a masterpiece.

BitNet b1.58 70B was 4.1 times faster and had 8.9 times higher throughput than the corresponding FP16 LLaMA.

📌 Requires almost no multiplication operations for matrix multiplication and can be highly optimized.

📌 BitNet b1.58 is a 1-bit LLM, where every single parameter (or weight) of the LLM is ternary {-1, 0, 1}.

They introduce a significant 1-bit LLM variant called BitNet b1.58, where every parameter is ternary, taking on values of {-1, 0, 1}. We have added an additional value of 0 to the original 1-bit BitNet, resulting in 1.58 bits in the binary system.

📌 The term "1.58 bits" refers to the information content of each parameter. Since each parameter can take one of three possible values (-1, 0, 1), the information content is log2(3) ≈ 1.58 bits (more precisely, 1.5849625...). Actually decoding data at that density takes a lot of math.
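
A small sketch of where the 1.58 figure comes from, and of why real storage usually ends up slightly above it. The base-3 packing below (five ternary values per byte, so 1.6 bits each) is only an illustration I am assuming for the example, not the storage format BitNet itself uses; `pack5` and `unpack5` are made-up names.

```python
import math

# Information content of one value drawn from three equally likely options.
bits_per_trit = math.log2(3)
print(f"log2(3) = {bits_per_trit:.7f} bits")

# 3**5 = 243 fits in one byte (256 values), so 5 weights cost 8 bits: 1.6 each.
assert 3 ** 5 <= 256


def pack5(trits):
    """Pack five values from {-1, 0, 1} into one integer in 0..242 (base 3)."""
    assert len(trits) == 5 and all(t in (-1, 0, 1) for t in trits)
    value = 0
    for t in trits:
        value = value * 3 + (t + 1)  # map -1, 0, 1 to digits 0, 1, 2
    return value


def unpack5(value):
    """Inverse of pack5: recover the five ternary values."""
    digits = []
    for _ in range(5):
        value, digit = divmod(value, 3)
        digits.append(digit - 1)
    return digits[::-1]


example = [1, -1, 0, 0, 1]
packed = pack5(example)
print(packed, unpack5(packed))
```

Packing into bytes like this costs 1.6 bits per weight, a little over the 1.58-bit theoretical floor.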

----

Here all weight values are ternary, taking on values {-1, 0, 1}. Its quantization function is absmean, in which the weights are first scaled by their average absolute value and then rounded to the nearest integer in {-1, 0, 1}. It is an efficient extension of 1-bit BitNet by including 0 in model parameters. BitNet b1.58 is based upon the BitNet architecture (replaces nn.Linear with BitLinear). It is highly optimized as it removes floating point multiplication overhead, involving only integer addition (INT8), and efficiently loads parameters from DRAM.
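
The absmean step described above can be sketched in a few lines of plain Python. This is a minimal illustration under my own assumptions (a nested list as the weight matrix, a small `eps` to avoid dividing by zero, the function name `absmean_quantize`), not the actual BitLinear code.

```python
def absmean_quantize(weights, eps=1e-8):
    """Quantize a weight matrix (list of rows) to ternary values.

    Each weight is divided by the mean absolute value of the whole matrix,
    rounded to the nearest integer, then clipped to the range [-1, 1].
    Returns the ternary matrix and the scale that was used.
    """
    flat = [w for row in weights for w in row]
    gamma = sum(abs(w) for w in flat) / len(flat)  # mean absolute value

    def round_clip(x):
        # Python's round() sends exact .5 ties to the even integer;
        # that detail does not matter for this sketch.
        return max(-1, min(1, round(x)))

    quantized = [[round_clip(w / (gamma + eps)) for w in row] for row in weights]
    return quantized, gamma


W = [[0.9, -0.05, -1.3],
     [0.2, 0.6, -0.4]]
Q, scale = absmean_quantize(W)
print(Q)      # every entry is -1, 0 or 1
print(scale)  # mean absolute value of W
```

Small weights collapse to 0 and large ones saturate at -1 or 1, which is why the zero value matters: it lets the model drop a connection entirely.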

----

BitNet b1.58 retains all the benefits of the original 1-bit BitNet, including its new computation paradigm, which requires almost no multiplication operations for matrix multiplication and can be highly optimized.

📌 It matches the full-precision (i.e., FP16 or BF16) Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance, while being significantly more cost-effective in terms of latency, memory, throughput, and energy consumption.

📌 More profoundly, the 1.58-bit LLM defines a new scaling law and recipe for training new generations of LLMs that are both high-performance and cost-effective.

📌 This paper enables a new computation paradigm and opens the door for designing specific hardware optimized for 1-bit LLMs.
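
To see why ternary weights need "almost no multiplication", here is a toy matrix-vector product where each weight only decides whether an input is added, subtracted or skipped. It is a sketch with made-up names, and it leaves out any rescaling by the quantization scale as well as the activation quantization a real kernel would do.

```python
def ternary_matvec(q_weights, x):
    """Multiply a ternary matrix (entries in {-1, 0, 1}) by a vector x
    using only additions and subtractions."""
    out = []
    for row in q_weights:
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # weight +1: add the input
            elif w == -1:
                acc -= xi      # weight -1: subtract the input
            # weight 0: skip the input entirely
        out.append(acc)
    return out


Q = [[1, 0, -1],
     [0, 1, -1]]
x = [3, 5, 2]
print(ternary_matvec(Q, x))
```

The inner loop has no multiply at all, which is the property the thread is pointing at when it mentions hardware designed specifically for 1-bit LLMs.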

GQHZVjeW4AEFnVB.jpg








1/2
@Yampeleg
I am saying this again and again like a broken record for 2 years straight and no one believes:

We will have extremely powerful tiny LLMs in the end.
It was obvious all along.

If you still don't see that this is where it is all going to..
I don't know what to tell you..



2/2
@secemp9
You don't really need small LLM if you just take any huge SOTA model, and quantize it to 1bit, apply sparsification, and finetune it a bit on part of the original dataset. You get 80-90% of the original model performance.

Or, just pretrain from scratch in 1Bit. You get denser, smaller and coherent model out of the box. To get rid of the usual tokenizer issue, do it with bytetransformer, and you're done.






1/1
Exciting work from @huggingface on BitNet finetuning. We will share our latest progress on BitNet including model (pre-training) scaling, MoE, inference on CPUs, and more. Very soon, stay tuned! #The_Era_of_1bit_LLMs #BitNet
[Quoted tweet]
🚀 Exciting news! We’ve finally cracked the code for BitNet @huggingface ! no pre-training needed! With just fine-tuning a Llama 3 8B, we've achieved great results, reaching a performance close to Llama 1 & 2 7B models on key downstream tasks!

Want to learn more? Check out the blogpost or keep reading for exclusive insights!

Blogpost: huggingface.co/blog/1_58_llm…

GXwTwiEakAUOTpS.png








1/3
@veryvanya
the first 1bit visionLM has arrived IntelLabs/LlavaOLMoBitnet1B · Hugging Face

[Quoted tweet]
Intel presents LLaVaOLMoBitnet1B

Ternary LLM goes Multimodal!

discuss: huggingface.co/papers/2408.1…

Multimodal Large Language Models (MM-LLMs) have seen significant advancements in the last year, demonstrating impressive performance across tasks. However, to truly democratize AI, models must exhibit strong capabilities and be able to run efficiently on small compute footprints accessible by most. Part of this quest, we introduce LLaVaOLMoBitnet1B - the first Ternary Multimodal LLM capable of accepting Image(s)+Text inputs to produce coherent textual responses. The model is fully open-sourced along with training scripts to encourage further research in this space. This accompanying technical report highlights the training process, evaluation details, challenges associated with ternary models and future opportunities.


2/3
@ilumine_ai
what this can mean in the near term?



3/3
@veryvanya
how i see it
in few years, 10 year old iot potato compute will run multimodality faster and at better quality than current sota closed models (in narrow finetuned usecases)
this continued research into 1bit coupled with decentralised training are already sneakpeak of crazy future





GV9ThDwXIAAp8A9.png







1/6
4 months since we released BitNet b1.58🔥🔥

After we compressed LLM to 1.58 bits, the inference of 1bit LLM is no longer memory-bound, but compute-bound.

🚀🚀Today we introduce Q-Sparse that can significantly speed up LLM computation.

GSl4XlKbIAI4ImM.jpg

GSl4XlQacAAM2tJ.jpg


2/6
Q-Sparse is trained with TopK sparsification and a straight-through estimator (STE) to prevent gradients from vanishing.

1️⃣Q-Sparse can achieve results comparable to those of baseline LLMs while being much more efficient at inference time;



3/6
2️⃣We present an inference-optimal scaling law for sparsely-activated LLMs; As the total model size grows, the gap between sparsely-activated and dense model continuously narrows.



4/6
3️⃣Q-Sparse is effective in different settings, including training-from-scratch, continue-training of off-the-shelf LLMs, and finetuning;



5/6
4️⃣Q-Sparse works for both full-precision and 1-bit LLMs (e.g., BitNet b1.58). Particularly, the synergy of BitNet b1.58 and Q-Sparse (can be equipped with MoE) provides the cornerstone and a clear path to revolutionize the efficiency of future LLMs.



6/6
Link: [2407.10969] Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
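
The TopK sparsification mentioned in the thread above can be pictured with a toy forward pass: keep the k largest-magnitude activations and zero the rest, so later layers only touch k values. This is a hedged sketch with a made-up function name; it omits any rescaling and the backward pass, where a straight-through estimator would pass gradients through the mask as if it were the identity.

```python
def topk_sparsify(x, k):
    """Keep the k entries of x with the largest absolute value; zero the rest."""
    # Indices sorted by magnitude, largest first.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


activations = [0.1, -2.0, 0.5, 1.5, -0.2]
sparse = topk_sparsify(activations, k=2)
print(sparse)
```

With k=2 out of 5, only 40% of the activations take part in the next matrix product, which is where the inference savings would come from.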




 

Vandelay

Life is absurd. Lean into it.
Joined
Apr 14, 2013
Messages
23,701
Reputation
5,958
Daps
83,132
Reppin
Phi Chi Connection
I'm skeptical. I'm sorry. The human condition and motivation right now is to usurp everything for oneself. Sam Altman is a prolific doomsday prepper. I don't feel we're equipped at all to deal with this.

Move fast and break things...and that's exactly what they're about to do.

And even if we move into a new epoch without physical strife, UBI, which is inevitable, is not the savior for society people think it will be. You essentially will be creating a true permanent upper and lower class because of it.
 

Professor Emeritus

Veteran
Poster of the Year
Supporter
Joined
Jan 5, 2015
Messages
51,330
Reputation
19,666
Daps
203,884
Reppin
the ether
Look at what ChatGPT's current products are capable of doing and how it's advanced in the last few months. You might feel differently.


Sam Altman is not responsible for creating ChatGPT's products, and I don't see how ChatGPT's current products justify those claims.
 