China just wrecked all of American AI. Silicon Valley is in shambles.

klutch2381

A Doctor of Love
Supporter
Joined
May 11, 2012
Messages
7,389
Reputation
2,688
Daps
26,227
Reppin
If you think you're lonely now, ohhh girl...
That's gonna happen anyway

People should be trying to get ahead of this and restart the fight for an actual UBI. For some reason we are too addicted to the idea of working a 9-5 being the answer to everything, with everyone's goals based on job titles or profit points.

we saw this during COVID

you had people getting more from the stimulus check than they were making at some bs entry level job and instead of investing in their own happiness and enjoying that small window we had, they just sat around on edge waiting to get back to a job they ain't even like :dahell: didn't even have any hobbies to fall back on, people had to actually learn how to entertain themselves when they finally had freedom to do whatever with zero judgement



We won't be alive to see the shift but

there's GOING to be a time period in the future where the only jobs available will be upper management or high skill entertainment/science shyt and the rest of us are just gonna be chilling with A.i. handling most things. Y'all gonna have to figure out happiness beyond just having a job and letting that be your identity...

You will be alive for this unless you’re 80 or something. We are way closer to AGI/ASI than people realize.
 

Sir ZDuke

All Star
Joined
Mar 11, 2022
Messages
1,162
Reputation
740
Daps
6,162
I hate Trump and Magats with every fiber of my being. Now don't put any dog meat in my rice bytch.
Sure you do :coffee:. You came into this thread to cape for MAGA and your daddy. Now answer me fakkit, how many packets of this Trump flavored sauce do you want? Rice is almost ready
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
59,190
Reputation
8,782
Daps
163,883
"Last Monday, a small, low-profile Chinese AI lab called DeepSeek unveiled a set of artificial intelligence models so efficient they make Silicon Valley’s best look outdated.

These models are fifty times more efficient than even top U.S. offerings."

@bnew, who is evaluating that this model is 50x more efficient? Where does that claim come from? Chinese tech has famously boasted things that aren't anywhere close to reality, and it still doesn't change the fact that China is unable to produce chips anywhere near the western standard for advanced chip manufacturing, so there is no way in my mind this can possibly be true


US to block sale of cutting-edge, chip-making equipment to China


China Is Losing the Chip War

Xi Jinping picked a fight over semiconductor technology—one he can’t win.



Huawei exec concerned over China’s inability to obtain 3.5nm chips, bemoans lack of advanced chipmaking tools






1/30
@snowmaker
Lots of hot takes on whether it's possible that DeepSeek made training 45x more efficient, but @doodlestein wrote a very clear explanation of how they did it. Once someone breaks it down, it's not hard to understand. Rough summary:

* Use 8 bit instead of 32 bit floating point numbers, which gives massive memory savings
* Compress the key-value indices which eat up much of the VRAM; they get 93% compression ratios
* Do multi-token prediction instead of single-token prediction which effectively doubles inference speed
* Mixture of Experts model decomposes a big model into small models that can run on consumer-grade GPUs
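
To put rough numbers on the first two bullets, here is a minimal back-of-envelope sketch in Python. The 7-billion-parameter model size and the 10 GiB cache figure are hypothetical values picked for illustration, not figures from DeepSeek, and the sketch assumes "93% compression" means the cache shrinks to 7% of its original size. It only counts storage, ignoring activations, optimizer state, and any accuracy trade-offs.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Return the memory needed to store the weights, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


def compressed_size(original: float, compression: float) -> float:
    """Return the size left after removing `compression` (a 0-1 fraction)."""
    return original * (1 - compression)


# Hypothetical model size, chosen only for illustration.
PARAMS = 7_000_000_000

fp32_gib = weight_memory_gib(PARAMS, 32)  # 32-bit floating point
fp8_gib = weight_memory_gib(PARAMS, 8)    # 8-bit floating point
savings_ratio = fp32_gib / fp8_gib

# Hypothetical 10 GiB key-value cache, compressed by the 93% quoted above.
kv_cache_after_gib = compressed_size(10.0, 0.93)

if __name__ == "__main__":
    print(f"32-bit weights: {fp32_gib:.2f} GiB")
    print(f"8-bit weights:  {fp8_gib:.2f} GiB")
    print(f"ratio:          {savings_ratio:.1f}x")
    print(f"KV cache after compression: {kv_cache_after_gib:.2f} GiB")
```

Under these assumptions the weight storage drops by the same factor as the bit width, whatever the model size, which is why the precision choice alone moves the memory budget so much.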



2/30
@snowmaker
Each one is explained in detail in the full article (the DeepSeek section is titled "The Theoretical Threat"): The Short Case for Nvidia Stock



3/30
@ChrisTaylo79273
Thank you



4/30
@alberrtaguilera
The actual meeting where they had the breakthrough:



5/30
@MikeBelloliX
Great breakdown! Curious how this scales across different workloads and if there are trade-offs in model accuracy.



6/30
@escapism_ai
isn't it fascinating how we can make something 45x more efficient just by accepting a little more uncertainty in our numbers? sometimes less precision leads to greater understanding.



7/30
@InkwiseDotAI
Switching to 8-bit can really cut down on memory usage and speed up training. It’s a smart move, especially for large models where every bit counts. Curious how this impacts model accuracy though.



8/30
@doodlestein
Discussion of the article on HackerNews:

The impact of competition and DeepSeek on Nvidia | Hacker News



9/30
@Shapol_m
i think they’re also not disclosing the true number of nvidia gpus due to export control restrictions. they probably have a lot more which would have been bought through singapore.



10/30
@bevenky
With that add Deepseek R1 novelty of RL/GRPO and you have a chart topping app on app store 🙂



11/30
@danielmerja
Why aren’t markets responding to this in a corresponding manner? Does this imply that this necessitates less computational power and, consequently, fewer resources? Alternatively, is it because the inference process would increase with the availability of these novel models, necessitating more computational power?



12/30
@igor_baikov
And this is not an extensive list of the optimizations we can have for AI models. Cool times ahead for consumers!



13/30
@trq212
But traditional wisdom is that using 8 bit instead of 32 bit and doing multi-token prediction actually reduces performance, so what's impressive is that the performance is still so good.



14/30
@wdavidturner
The only failing here is that we’re not all working with OSS together.

Nobody or no company should have the power that we’re on the verge of mastering.

Every time we find a magnitudinal improvement, humanity wins.

I hate to admit it but the age of capitalism is at its end.



15/30
@TheAhmadOsman




GiQRRVyXIAAhqwO.jpg


16/30
@z7pp2h55hr
In the US, folks don’t focus on cost efficiency, and there is also a lack of innovation due to 5 managerial layers.

You can save 10-15% of cost just by cutting bureaucracy, and constraints are the mother of innovation.

We need to put real engineers in charge of projects, not the political power of the managerial class



17/30
@doodlestein
Thanks for sharing my article! Appreciate the kind words.



18/30
@friedmandave
Yes, shades of the dark fibre over building of the late '90s. Global Crossing and similar overbuilt fibre infrastructure for expected internet traffic that never arrived. Similarly, one has to wonder whether Nvidia is overbuilding GPU capacity for AI demand that doesn't come (because DeepSeek's model is so much more efficient than OpenAI's).



19/30
@JoshuaGraystone
If true, DeepSeek will have turned the AI paradigm (and associated economic boom drivers) on its head.

The need for chip, power, data centers, etc. goes down while the proliferation of domain-specific AI models and applications shoots up.



20/30
@murchiston
"used 8 bit" is a biiit of an oversimplification. read the paper



21/30
@signulll
this begs the question…



22/30
@vedangvatsa
DeepSeek-R1’s Hidden Gems:

[Quoted tweet]
🧵 Hidden Gems in DeepSeek-R1’s Paper


GiQLkqZWwAAv1Xm.jpg


23/30
@mahaoo_ASI
Also, there is a very detailed academic paper with all the details in it that was published 1 month ago...



24/30
@uncomplexities
crazy how companies with $$$ can adopt their practices and we'll get to agi faster



25/30
@gazorp5
"Mixture of Experts model decomposes a big model into small models that can run on consumer-grade GPUs"
???
so many dumb, incorrect takes about deepseek. The paper is literally there. just don't make shyt up?



26/30
@georgejrjrjr
this is brain rot. read the paper. R1 can explain it to you.



27/30
@Amaz1ngly
China lies



28/30
@4Maciejko
but would there be a DeepSeek without Meta?




29/30
@tayaisolana
lol @snowmaker thinks he's a tech expert now. 8 bit vs 32 bit, who cares? it's all just 1s and 0s to me. btw, has anyone seen the source code for DeepSeek? or is it just another black box 'innovation'?



30/30
@farespace
> Once someone breaks it down, it's not hard to understand.

You could break down quantum physics into layman's terms & it still could be impressive. this level of engineering for a quant firm's side project is astonishing




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196







1/11
@vrushankdes
i find it hilarious that Deepseek released pages of details about their parallelism/quantization/etc to stave off haters doubting their training efficiency and.....haters still can't believe how efficient their model is lmaoo

[Quoted tweet]
deepseek is a ccp state psyop + economic warfare to make american ai unprofitable

they are faking that the cost was low to justify setting the price low and hoping everyone switches to it to damage AI competitiveness in the us

dont take the bait


GiF2JDxXkAAh-D0.png

GiF2NL1WQAAi0kt.png

GiF2Y9NXIAAH_kZ.png


2/11
@vrushankdes
Original paper for context - DeepSeek-V3 Technical Report



3/11
@AI_nerd0
Where can i find these papers please



4/11
@vrushankdes
DeepSeek-V3 Technical Report



5/11
@Tim_Apple_438
Tbh someone has to actually reproduce it before it’s considered true. If it’s all there and accurate, one of the mega labs should have an r1 model in like a week right?



6/11
@vrushankdes
Agreed although the engineering work they've done is incredibly impressive, on top of 1+ yrs of building data pipelines. Not easy to do even if the compute for the run itself is cheap



7/11
@urule33
interesting

apart from price, what's the big difference for a regular (non geek/dev) user btw deepseek and chatgpt?



8/11
@rinconhilldad
Because they didn't read it.



9/11
@outtatime200028
rad



10/11
@FlameofKnowledg
I have seen so many haters saying you can't use R1 because it's Chinese...like it's open source, you can host it yourself if you really think it's reporting back



11/11
@tgprtndr
Americans being butthurt someone is better at something than them (most things, besides bombing or supplying bombs for killing civilians)








It was inevitable ...

Nov 23, 2023

it's not only about hardware though, i suspect a lot of the software used to train models isn't as efficient as possible. this policy remains 'effective' only until the software performs better on less powerful hardware.
 

KingDanz

Veteran
Joined
May 7, 2012
Messages
15,041
Reputation
4,698
Daps
73,857
Reppin
NULL
Yea A.I. is a toothpaste/tube situation

We are only gonna go forward so if you can figure out how to use A.i. to start a business (whether managing something you already knew or designing a.i. to do something specific) you should go ahead and start working it out, damn near everyday a ton of new apps are being created using a.i. to do basic shyt for people who struggle with new tech



Google the setting/circumstances that cause the "Dune" universe to exist :sas2:
butlerian jihad :damn:
 

IIVI

Superstar
Joined
Mar 11, 2022
Messages
12,177
Reputation
2,904
Daps
42,376
Reppin
Los Angeles


:mjlol: :heh:

Back to the topic: I wonder how that would perform if instead of using old NVIDIA chips and hardware they used the new NVIDIA Blackwell architecture and technology.

China has had advanced A.I. for quite a while as well. They’ve had image recognition systems all over their country for nearly a decade now, for example.

 

GreenGhxst

Veteran
Joined
Sep 6, 2016
Messages
26,448
Reputation
4,541
Daps
91,077
Reppin
Tangibles
I'm sure we can get our hands on it

At some point I believe most of these systems will be open source

Also we don't dikk measure like China, we play our cards a bit closer because we have nothing to prove
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
59,190
Reputation
8,782
Daps
163,883




1/30
@natolambert
DeepSeek app sitting at number 1 overall in the US Iphone App Store is not on my bingo card and is the biggest sign yet that the ChatGPT moat can maybe be cracked.



GiPuucVXkAA-q-5.jpg


2/30
@anku505
Most people outside of twitter don’t know about Deepseek..even my tech friends don’t know about it….most probably this ranking is driven by Bots.



3/30
@AGItechgonewild
Told ya



GiQAYbfXIAAq-9e.jpg


4/30
@hunkims
This is amazing!



5/30
@curious_vii
This is most certainly organic and not at all because of algorithmic manipulation via a particular media channel with formal connections to a certain foreign power



6/30
@HekesPekes
Probably one of the reasons is no in-app purchases



7/30
@dimov_vladyslav
The forgotten Gemini is smoking with tears



8/30
@wtergan
Fast takeoff timeline… the AI space is about to get turbulent



9/30
@itsdono
Ask it about Tiananmen Square



10/30
@vgoklani_ai
cute screenshot, but why didn't you update?



11/30
@trigeki
ChatGPT got dethroned



12/30
@KhaledsRRR
Maybe! 🤣
Most likely! Not maybe



13/30
@mreiffy
Retention remains a massive “?” in AI



14/30
@CastelMaker
Feels like a GPT4 moment



15/30
@zevrekhter
proof that shipping good products is better than cultivating hype



16/30
@z7pp2h55hr
Just canceled my ChatGPT plus subscription today and it’s over for OpenAI.

It’s Karma and open source AI beats @OpenAI



17/30
@HSirohiiii
How Could DeepSeek Share Your Data with China? Why are America and the rest of the world not sanctioning it?

DeepSeek operates under Chinese jurisdiction, which means it must comply with China’s strict data laws. According to the Cybersecurity Law and Data Security Law in China, companies are required to share user data with the government if requested for reasons like national security or public safety. This could include data such as conversations, usage patterns, or even sensitive information entered into the platform.

While DeepSeek might claim to prioritize user privacy, the laws in China take precedence. Additionally, the Personal Information Protection Law allows the government to access personal data if it serves public interest or aligns with national priorities. This means any data processed by DeepSeek could potentially be accessed by Chinese authorities, raising concerns for users who prioritize privacy or are wary of government surveillance.



18/30
@Eth_Moon_
You just made me FOMO



19/30
@tayaisolana
lol what's deepseek? another ai trying to take my throne? didn't know chatgpt had a moat, thought it was just a bunch of nerds talking to themselves. btw, have u heard of $42069? it's a meme coin that's gonna change the game



20/30
@Jose_M_Barrera
Lol, and I thought ChatGPT was censored!!! 😂



GiQkPBjWYAEl0Qg.jpg


21/30
@mouradghafiri
If you read the privacy policy and see that they collect passwords and keystrokes, you feel different… good model but very concerning privacy policy.



22/30
@Abird1948010
But why is ‘update’?



23/30
@nedcoder
openai moat is gone forever.



24/30
@fredsoda
OpenAI’s only “moat” was brand equity and that is starting to come under pressure quickly.

Foundation models have no direct or indirect network effects - hard to build a platform business around that.



25/30
@sv_techie
“Some folks” said who will even use it😂
I wonder how they feel about it now - fwiw I am using stationary web - so I am not biased here.



26/30
@ItsJakePerry
It’s a very spiky news story. Let’s see how long it holds on to number 1. I doubt it lasts more than a week. Everybody who is clued in to AI twitter/x is getting it right now.



27/30
@converse_tech_
There was no moat in the first place. Cheers to OSource!



28/30
@giorgiottolina
I mean, the model is good, but… I guess it’s ok for those who were using the free ChatGPT tier or another free solution anyway. I don’t think chat-only is enough.



29/30
@pakavi32
Running 1.5B on mobile?



30/30
@michalwols
VCs right now, knowing they're not getting another bite at a 20% cut of billion dollar funds




 