bnew

Veteran
Joined
Nov 1, 2015
Messages
58,206
Reputation
8,613
Daps
161,860

Sony Music warns AI companies against ‘unauthorized use’ of its content


The label representing superstars like Billy Joel, Doja Cat, and Lil Nas X sent a letter to 700 companies.

By Mia Sato, platforms and communities reporter with five years of experience covering the companies that shape technology and the people who use their tools.

May 17, 2024, 10:29 AM EDT


Music photos from day three at Coachella Music Festival

Photo: Dania Maxwell / Los Angeles Times via Getty Images

Sony Music sent letters to hundreds of tech companies and warned them against using its content without permission, according to Bloomberg, which obtained a copy of the letter.

The letter was sent to more than 700 AI companies and streaming platforms and said that “unauthorized use” of Sony Music content for AI systems denies the label and artists “control and compensation” of their work. The letter, according to Bloomberg, calls out the “training, development or commercialization of AI systems” that use copyrighted material, including music, art, and lyrics. Sony Music artists include Doja Cat, Billy Joel, Celine Dion, and Lil Nas X, among many others. Sony Music didn’t immediately respond to a request for comment.

The music industry has been particularly aggressive in its efforts to control how its copyrighted work is used when it comes to AI tools. On YouTube, where AI voice clones of musicians exploded last year, labels have brokered a strict set of rules that apply to the music industry (everyone else gets much looser protections). At the same time, the platform has introduced AI music tools like Dream Track, which generates songs in the style of a handful of artists based on text prompts.

Perhaps the most visible example of the fight over music copyright and AI has been on TikTok. In February, Universal Music Group pulled its entire roster of artists’ music from the platform after licensing negotiations fell apart. Viral videos fell silent as songs by artists like Taylor Swift and Ariana Grande disappeared from the platform.

The absence, though, didn’t last long: in April, leading up to the release of her new album, Swift’s music silently returned to TikTok (gotta get that promo somehow). By early May, the stand-off had ended, and UMG artists were back on TikTok. The two companies say a deal was reached with more protections around AI and “new monetization opportunities” around e-commerce.

“TikTok and UMG will work together to ensure AI development across the music industry will protect human artistry and the economics that flow to those artists and songwriters,” a press release read.

Beyond copyright, AI-generated voice clones used to create new songs have raised questions around how much control a person has over their voice. AI companies have trained models on libraries of recordings — often without consent — and allowed the public to use the models to generate new material. But even claiming right of publicity and likeness could be challenging, given the patchwork of laws that vary state by state in the US.
 

bnew




1/3
Gemini 1.5 Model Family: Technical Report updates now published

In the report we present the latest models of the Gemini family – Gemini 1.5 Pro and Gemini 1.5 Flash, two highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Our latest report details notable improvements in Gemini 1.5 Pro within the last four months.

Our May release demonstrates significant improvement in math, coding, and multimodal benchmarks compared to our initial release in February.



Furthermore, the 1.5 Pro Model is now stronger than 1.0 Ultra.

The latest Gemini 1.5 Pro is now our most capable model for text and vision understanding tasks, surpassing 1.0 Ultra on 16 of 19 text benchmarks and 18 of 21 of the vision understanding benchmarks. The table below highlights the improvement in average benchmark performance for different categories in 1.5 Pro since Feb, and also shows the strength of the model relative to the 1.0 Pro and 1.0 Ultra models. The 1.5 Flash model also compares very well against the 1.0 Pro and 1.0 Ultra models.



One clear example of this can be seen on MMLU

On MMLU we find that 1.5 Pro surpasses 1.0 Ultra in the regular 5-shot setting, scoring 85.9% versus 83.7%. However, with additional inference compute, via majority voting on top of multiple language model samples, we can get a performance of 91.7% versus Ultra’s 90.0%, which extends the known performance ceiling of this task.
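
The majority-voting step mentioned above can be sketched in a few lines. This is a minimal illustration, not the report's evaluation code: the function name `majority_vote` and the sample answers are hypothetical, and in practice each entry would be the final answer extracted from one sampled model response.

```python
from collections import Counter


def majority_vote(sampled_answers):
    """Return the most frequent answer among several sampled answers.

    Ties resolve to the answer that appeared first in the list.
    """
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    return Counter(sampled_answers).most_common(1)[0][0]


# Hypothetical final answers extracted from five samples of one
# multiple-choice question (illustrative data, not from the report).
samples = ["B", "C", "B", "A", "B"]
voted = majority_vote(samples)
```

The trade-off is that every question costs several model calls instead of one, which is the "additional inference compute" the post refers to.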



@OriolVinyalsML and I are very proud of the whole Gemini team, and it’s fantastic to see this progress and to share these highlights from our Gemini Model Family.

Read the updated report here: https://goo.gle/GeminiV1-5

2/3
The updated report is now 153 pages, and has quite a few new results. In the February report, I found the results on Kalamang translation for the Machine Translation from One Book benchmark quite exciting. In this updated report, we’ve extended this line of evaluation to test

3/3
One other thing in the updated Gemini 1.5 Pro report: we show how a research model that is a mathematics-specialized version of 1.5 Pro achieves a record score of 91.1% on the MATH benchmark (the SOTA just 3 years ago, in May, 2021 was 6.9%!).


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

bnew



1/2
Today we have published our updated Gemini 1.5 Model Technical Report. As @JeffDean highlights, we have made significant progress in Gemini 1.5 Pro across all key benchmarks; TL;DR: 1.5 Pro > 1.0 Ultra, 1.5 Flash (our fastest model) ~= 1.0 Ultra.

As a math undergrad, I find our dramatic results in mathematics particularly exciting!

In section 7 of the tech report, we present new results on a math-specialised variant of Gemini 1.5 Pro which performs strongly on competition-level math problems, including a breakthrough performance of 91.1% on Hendrycks' MATH benchmark without tool-use (examples below).

Gemini 1.5 is widely available; try it out for free in Google AI Studio and read the full tech report here: https://goo.gle/GeminiV1-5

2/2
Here are some examples of the model solving problems from the Asian Pacific Mathematical Olympiad (APMO) that have stumped prior models. The top example is cool because it is a proof (rather than a calculation). The solutions are to the point and "beautiful".

This clearly shows


bnew



1/3
it is interesting that GPT-4o's Elo is lower, at 1287, than its initial 1310 score.
On coding, it regressed by even more absolute points, from 1369 to 1307.
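
To put those point drops in perspective, here is a small sketch using the standard Elo expected-score formula. It assumes the leaderboard's ratings sit on the usual 400-point logistic scale, which is an assumption on my part; the leaderboard may fit its ratings differently. The function name is illustrative.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard Elo formula
    (assumes the usual base-10, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Initial rating vs. current rating, using the numbers quoted above.
overall_gap = expected_score(1310, 1287)  # 23-point gap
coding_gap = expected_score(1369, 1307)   # 62-point gap
```

Under that assumption, a 23-point gap is roughly a 53% expected win rate and a 62-point gap roughly 59%, so the coding regression is the more noticeable of the two.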

2/3
i wonder how much of the differential was because people were really trend-bubbling the "im-a-good-gpt2" thing and trying to spot and validate it, and now it's more normalized to regular expectations.

3/3
someone pointed out that in coding, GPT-4o in fact did worse than GPT-4 Turbo on medium/hard problems but better on easy problems on the LiveCodeBench leaderboard


bnew


1/1
Spoke to @geoffreyhinton about OpenAI co-founder @ilyasut's intuition for scaling laws.

"Ilya was always preaching that you just make it bigger and it'll work better.

And I always thought that was a bit of a cop-out, that you're going to have to have new ideas too.

It turns out Ilya was basically right."

Link to full interview in .




bnew

1/1
not sure why they didn't cite our work (Unified-IO-2), which was released half a year ago and also trains a mixed-modal model from scratch. A quick glance through the recipes shows many similarities.

What we already did in Unified-IO-2:
- Add QK-norm
- Use z-loss
- Observe logits overflowing bfloat16 boundaries
- Observe that multi-modal training is more unstable

New recipes from their work:
- use post-norm instead of pre-norm
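
For readers unfamiliar with the first two items in the list above, here is a rough pure-Python sketch of both ideas. It is not code from Unified-IO-2 or the other paper: the function names and the `1e-4` coefficient are illustrative assumptions, and the QK-norm variant shown uses L2 normalization, whereas some implementations apply LayerNorm to queries and keys instead.

```python
import math


def log_sum_exp(logits):
    """Numerically stable log of the softmax normalizer Z."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))


def z_loss(logits, coeff=1e-4):
    """Auxiliary penalty coeff * (log Z)^2, which discourages the
    softmax normalizer from drifting far from 1 (coeff is assumed)."""
    return coeff * log_sum_exp(logits) ** 2


def l2_normalize(vec, eps=1e-6):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def qk_norm_score(query, key, scale=1.0):
    """Attention logit with query and key normalized first, so the
    score stays within [-scale, scale] however large the inputs are."""
    q, k = l2_normalize(query), l2_normalize(key)
    return scale * sum(a * b for a, b in zip(q, k))


small = z_loss([0.0, 0.0])            # log Z = ln 2
large = z_loss([50.0, 50.0])          # log Z = 50 + ln 2
bounded = qk_norm_score([100.0, 0.0], [100.0, 0.0])
```

Both tricks aim at the same symptom named in the list: keeping logits in a range where low-precision formats do not overflow.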


bnew


1/2
AI is math. GPU is metal. Sitting between math and metal is a programming language. Ideally, it should feel like Python but scale like CUDA. I find two newcomers in this middle layer quite exciting:

1. Bend: compiles modern high-level language features to native multi-threading on Apple Silicon or NVIDIA GPUs. Supports difficult constructs like lambdas with full closures, unrestricted recursion and branches, folds, ADTs, etc. Bend compiles to HVM2, a thread-safe runtime implemented in Rust.

All open-source:
- GitHub - HigherOrderCO/HVM: A massively parallel, optimal functional runtime in Rust
- GitHub - HigherOrderCO/Bend: A massively parallel, high-level programming language

2. Mojo: a CUDA-flavored, Python-like language that executes at C speed. Mojo is conceptually lower level than Bend and allows you to have stronger control over exactly how the parallelism is done. Especially suited for coding modern neural net accelerations by hand.

- Mojo 🔥: Programming language for all of AI
- Llama2 in one Mojo source file: llama2.mojo/llama2.mojo at master · tairov/llama2.mojo

2/2
Thanks @VictorTaelin @cHHillee for the discussion on comparison.


bnew


1/2
I hate to acknowledge this, but Gemini 1.5 Flash is better than llama-3-70b on long context tasks. It's way faster than my locally hosted 70b model (on 4*A6000) and hallucinates less. The free-of-charge plan is good enough for me to do prompt engineering for prototyping.

2/2
The api response doesn't seem to contain token count and time to first token info. For a prompt with 20k input tokens and 300 output tokens, it took 6 seconds to finish.


bnew

1/1
Updated Gemini 1.5 Pro report: MATH benchmark for specialized version now at 91.1%, SOTA 3 years ago was 6.9%, overall a lot of progress from February to May in all benchmarks


1/7
A mathematics-specialized version of Gemini 1.5 Pro achieves some extremely impressive scores in the updated technical report.

2/7
From the report: 'Currently the math-specialized model is only being explored for Google internal research use cases; we hope to bring these stronger math capabilities into our deployed models soon.'

3/7
New benchmarks, including Flash.

4/7
Google is doing something very interesting by building specialized versions of its frontier models for math, healthcare, and education (so far). The benchmarks on all of these are pretty impressive, and it seems to be beyond what can be done with traditional fine tuning alone. twitter.com/jeffdean/statu…

5/7
1.5 Pro is now stronger than 1.0 Ultra.

6/7
4o only got to enjoy the crown for 4 days.


7/7
They put Av_Human at the top of the chart there visually to make people feel better. The average human is now in third place.


bnew









1/9
How do models like GPT-4o and Meta’s Chameleon generate images?

Answer: They don’t, they generate tokens.

A short thread on multimodal tokenizers:

2/9
Text-only LLMs operate on a discrete, learned vocabulary of tokens that's fixed throughout training.
Every training example asks the neural net to predict the next text token id given previous text token ids.

3/9
End-to-End Multimodal LLMs like GPT-4o and Meta’s Chameleon incorporate ‘image tokens’ as additional tokens into the vocabulary.

For example, Chameleon has 65k text tokens and 8k image tokens for a total of ~73k tokens in the vocabulary.

But wait? How can you encode every

4/9
You don’t! Each image token isn’t a collection of pixels - it’s a vector. For example:

Text Token with id 104627 might be: _SolidGoldMagikarp

While the image token with index 1729 will be an embedding: [1.232, -.21, … 0.12]

The LLM's task is to learn patterns in sequences of

5/9
An embedding vector?

Image vocabularies in multimodal language models are actually hidden states of a neural network called a vector quantized variational auto-encoder, or VQ-VAE.

These neural networks are trained to compress images into a small set of codes (tokens).

6/9
When you tokenize the input of a multimodal LLM, any image runs through the encoder of the VAE to generate a set of latent codes. You then look up the vectors representing these codes and use them as the input to the LLM's transformer decoder.

During generation, the multimodal
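
The tokenization step described in this tweet can be sketched as a nearest-neighbor lookup against a codebook. This is a toy illustration, not code from Chameleon or any VQ-VAE implementation: the codebook values, the latent vectors standing in for encoder outputs, and `TEXT_VOCAB_SIZE` are all hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Map each encoder output vector to the index of its nearest
    codebook entry; these indices are the discrete image codes."""
    return [
        min(range(len(codebook)),
            key=lambda i: squared_distance(v, codebook[i]))
        for v in latents
    ]


def to_token_ids(codes, text_vocab_size):
    """Place image codes after the text tokens in one shared vocabulary."""
    return [text_vocab_size + c for c in codes]


# Hypothetical 3-entry codebook and three encoder outputs for one image.
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
TEXT_VOCAB_SIZE = 10  # toy value, far smaller than a real vocabulary
latents = [[0.9, 1.2], [4.0, 6.0], [0.1, -0.2]]

codes = quantize(latents, CODEBOOK)
token_ids = to_token_ids(codes, TEXT_VOCAB_SIZE)
vectors = [CODEBOOK[c] for c in codes]  # what the decoder would embed
```

The language model only ever sees the integer ids, which is why the thread says such models generate tokens rather than images.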

7/9
If you like thinking about stuff like this, then you'll enjoy engineering at @nomic_ai.

We help humans understand and build with unstructured data through latent space powered tools.

That involves training and serving multimodal models every day to our customers!

Many of

8/9
Diagram credit:
Chameleon: https://arxiv.org/pdf/2405.09818
VQ-VAE: https://arxiv.org/pdf/2203.13131

9/9
commercially available models only allow you to generate text due to the difficulty of moderating images.

but yes, yes they do. you'll probably never be given it because they will instead give you the raw image back


bnew






1/6
Updates for OpenDevin (GitHub - OpenDevin/OpenDevin: 🐚 OpenDevin: Code Less, Make More) this week:

- CodeAct 1.3 agent with browsing and GitHub support
- With GPT-4o, 25% accuracy on SWE-Bench Lite, 4% over the SOTA we set last week!
- A new evals visualizer
- Plans to add more agents/evals, we'd love your help!



2/6
We released CodeAct 1.0 last week, a strong coding agent. But it didn’t support browsing the web or pushing to GitHub.

In CodeAct 1.3, we
- Added browsing through BrowserGym: Enable CodeAct agents with browsing, and also enable arbitrary BrowserGym action support by frankxu2004 · Pull Request #1807 · OpenDevin/OpenDevin
- Added ability to use a github token:

3/6
OpenAI released GPT-4o on Monday, and we tested CodeAct 1.3 right away. Results are good: we increased our solve rate on SWE-Bench Lite by 4%, moving to 25% (solving 47% more issues than the SOTA two weeks ago).

You can see evals in our new visualizer here: OpenDevin Evaluation Benchmark - a Hugging Face Space by OpenDevin
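
The figures in this update can be cross-checked with a little arithmetic. The sketch below assumes the "4%" means percentage points (an assumption, since 25% is not 4% relatively higher than any round number) and back-computes the earlier baselines; the variable names are illustrative and the derived values are implied, not stated in the thread.

```python
current_rate = 25.0          # % solved on SWE-Bench Lite, as stated
gain_points = 4.0            # read as percentage points (assumption)
relative_gain_vs_two_weeks = 0.47  # "47% more issues", as stated

# Implied solve rate of the previous week's result.
last_week_rate = current_rate - gain_points

# Implied solve rate of the SOTA from two weeks earlier.
two_weeks_ago_rate = current_rate / (1 + relative_gain_vs_two_weeks)

# Relative improvement over the previous week.
relative_gain_vs_last_week = gain_points / last_week_rate
```

Under that reading, the two statements are consistent with a baseline of about 17% two weeks earlier and 21% the previous week.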

4/6
We’re planning new evals:
- web browsing tasks
- more recent repos where the risk of LLM memorization is lower

We'd love help with adding evals/agents and also frontend/backend improvements! Join our slack to discuss more: GitHub - OpenDevin/OpenDevin: 🐚 OpenDevin: Code Less, Make More

5/6
Awesome! I haven't taken a close look at this, but if you think there's something in there that could push our results higher as well we'd love to have a contribution to our agenthub!

6/6
Great question! For SWE-Bench evaluation we yell at the agent and say "DON'T USE THE INTERNET"

And as far as I can tell it listened to us.


bnew



1/3
Nobody is talking about this right now, but Google dropped a CRAZY model interpretation graph tool to help you understand your models better.

Check it out, link

2/3
LINK: Model Explorer: Graph visualization for large model development

Github: 4. API Guide

3/3
Want more cool ML tips and tricks?

Follow me and hit that notification bell


bnew

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

GitHub - deepseek-ai/DeepSeek-V2

1. Introduction


Today, we’re introducing DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.
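
The headline ratios in the paragraph above are easier to compare once converted into fractions and multipliers. The following is plain arithmetic on the numbers quoted in the README; the function and key names are my own and nothing here reflects the model's internals.

```python
def headline_ratios(total_params_b=236.0, active_params_b=21.0,
                    kv_cache_reduction=0.933, training_cost_saving=0.425):
    """Derive simple ratios from the figures quoted in the README."""
    kv_remaining = 1.0 - kv_cache_reduction
    return {
        # Share of parameters used for each token.
        "active_fraction": active_params_b / total_params_b,
        # KV cache size relative to DeepSeek 67B.
        "kv_cache_remaining": kv_remaining,
        # How many times smaller the KV cache is.
        "kv_cache_shrink_factor": 1.0 / kv_remaining,
        # Training cost relative to DeepSeek 67B.
        "training_cost_remaining": 1.0 - training_cost_saving,
    }


ratios = headline_ratios()
```

In other words, roughly 9% of the parameters are active per token, and a 93.3% KV cache reduction corresponds to a cache about 15 times smaller.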





We pretrained DeepSeek-V2 on a diverse and high-quality corpus comprising 8.1 trillion tokens. This comprehensive pretraining was followed by a process of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unleash the model's capabilities. The evaluation results validate the effectiveness of our approach as DeepSeek-V2 achieves remarkable performance on both standard benchmarks and open-ended generation evaluation.

2. News


2024.05.16: We released the DeepSeek-V2-Lite.

2024.05.06: We released the DeepSeek-V2.

3. Model Downloads


Model | #Total Params | #Activated Params | Context Length | Download
--- | --- | --- | --- | ---
DeepSeek-V2-Lite | 16B | 2.4B | 32k | 🤗 HuggingFace
DeepSeek-V2-Lite-Chat (SFT) | 16B | 2.4B | 32k | 🤗 HuggingFace
DeepSeek-V2 | 236B | 21B | 128k | 🤗 HuggingFace
DeepSeek-V2-Chat (RL) | 236B | 21B | 128k | 🤗 HuggingFace

Due to the constraints of Hugging Face, the open-source code currently runs slower than our internal codebase when running on GPUs with Hugging Face. To facilitate the efficient execution of our model, we offer a dedicated vLLM solution that optimizes performance for running our model effectively.

4. Evaluation Results

Base Model

Standard Benchmark (Models larger than 67B)

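Benchmark | Domain | LLaMA3 70B | Mixtral 8x22B | DeepSeek-V1 (Dense-67B) | DeepSeek-V2 (MoE-236B)
--- | --- | --- | --- | --- | ---
MMLU | English | 78.9 | 77.6 | 71.3 | 78.5
BBH | English | 81.0 | 78.9 | 68.7 | 78.9
C-Eval | Chinese | 67.5 | 58.6 | 66.1 | 81.7
CMMLU | Chinese | 69.3 | 60.0 | 70.8 | 84.0
HumanEval | Code | 48.2 | 53.1 | 45.1 | 48.8
MBPP | Code | 68.6 | 64.2 | 57.4 | 66.6
GSM8K | Math | 83.0 | 80.3 | 63.4 | 79.2
Math | Math | 42.2 | 42.5 | 18.7 | 43.6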

Standard Benchmark (Models smaller than 16B)


Benchmark | Domain | DeepSeek 7B (Dense) | DeepSeekMoE 16B | DeepSeek-V2-Lite (MoE-16B)
--- | --- | --- | --- | ---
Architecture | - | MHA+Dense | MHA+MoE | MLA+MoE
MMLU | English | 48.2 | 45.0 | 58.3
BBH | English | 39.5 | 38.9 | 44.1
C-Eval | Chinese | 45.0 | 40.6 | 60.3
CMMLU | Chinese | 47.2 | 42.5 | 64.3
HumanEval | Code | 26.2 | 26.8 | 29.9
MBPP | Code | 39.0 | 39.2 | 43.2
GSM8K | Math | 17.4 | 18.8 | 41.1
Math | Math | 3.3 | 4.3 | 17.1

For more evaluation details, such as few-shot settings and prompts, please check our paper.

Context Window





Evaluation results on the Needle In A Haystack (NIAH) tests. DeepSeek-V2 performs well across all context window lengths up to 128K.

Chat Model

Standard Benchmark (Models larger than 67B)

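Benchmark | Domain | QWen1.5 72B Chat | Mixtral 8x22B | LLaMA3 70B Instruct | DeepSeek-V1 Chat (SFT) | DeepSeek-V2 Chat (SFT) | DeepSeek-V2 Chat (RL)
--- | --- | --- | --- | --- | --- | --- | ---
MMLU | English | 76.2 | 77.8 | 80.3 | 71.1 | 78.4 | 77.8
BBH | English | 65.9 | 78.4 | 80.1 | 71.7 | 81.3 | 79.7
C-Eval | Chinese | 82.2 | 60.0 | 67.9 | 65.2 | 80.9 | 78.0
CMMLU | Chinese | 82.9 | 61.0 | 70.7 | 67.8 | 82.4 | 81.6
HumanEval | Code | 68.9 | 75.0 | 76.2 | 73.8 | 76.8 | 81.1
MBPP | Code | 52.2 | 64.4 | 69.8 | 61.4 | 70.4 | 72.0
LiveCodeBench (0901-0401) | Code | 18.8 | 25.0 | 30.5 | 18.3 | 28.7 | 32.5
GSM8K | Math | 81.9 | 87.9 | 93.2 | 84.1 | 90.8 | 92.2
Math | Math | 40.6 | 49.8 | 48.5 | 32.6 | 52.7 | 53.9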

Standard Benchmark (Models smaller than 16B)


Benchmark | Domain | DeepSeek 7B Chat (SFT) | DeepSeekMoE 16B Chat (SFT) | DeepSeek-V2-Lite 16B Chat (SFT)
--- | --- | --- | --- | ---
MMLU | English | 49.7 | 47.2 | 55.7
BBH | English | 43.1 | 42.2 | 48.1
C-Eval | Chinese | 44.7 | 40.0 | 60.1
CMMLU | Chinese | 51.2 | 49.3 | 62.5
HumanEval | Code | 45.1 | 45.7 | 57.3
MBPP | Code | 39.0 | 46.2 | 45.8
GSM8K | Math | 62.6 | 62.2 | 72.0
Math | Math | 14.7 | 15.2 | 27.9

English Open Ended Generation Evaluation


We evaluate our model on AlpacaEval 2.0 and MTBench, showing the competitive performance of DeepSeek-V2-Chat-RL on English conversation generation.








[Submitted on 7 May 2024 (v1), last revised 16 May 2024 (this version, v3)]

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

DeepSeek-AI
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2405.04434 [cs.CL]
(or arXiv:2405.04434v3 [cs.CL] for this version)
[2405.04434] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Submission history

From: Wenfeng Liang
[v1] Tue, 7 May 2024 15:56:43 UTC (431 KB)
[v2] Wed, 8 May 2024 02:43:34 UTC (431 KB)
[v3] Thu, 16 May 2024 17:25:01 UTC (432 KB)

 
bnew


63% of surveyed Americans want government legislation to prevent super intelligent AI from ever being achieved


By Nick Evanson

published 3 days ago

OpenAI and Google might love artificial general intelligence, but the average voter probably just thinks Skynet.

Half of Artificial Intelligence robot face

(Image credit: Getty Images, Yuichiro Chino)

Generative AI may well be en vogue right now, but when it comes to artificial intelligence systems that are way more capable than humans, a clear majority has already made up its mind. A survey of American voters showed that 63% of respondents believe government regulations should be put in place to actively prevent such systems from ever being achieved, rather than merely restricting them in some way.

The survey, carried out by YouGov for the Artificial Intelligence Policy Institute (via Vox), took place last September. While it only sampled a small number of voters in the US (just 1,118 in total), the demographics covered were broad enough to be fairly representative of the wider voting population.

One of the specific questions asked in the survey focused on "whether regulation should have the goal of delaying super intelligence." Specifically, it's talking about artificial general intelligence (AGI), something that the likes of OpenAI and Google are actively working on trying to achieve. In the case of the former, its mission expressly states this, with the goal of "ensur[ing] that artificial general intelligence benefits all of humanity" and it's a view shared by those working in the field. Even if that is one of the co-founders of OpenAI on his way out of the door...

Regardless of how honourable OpenAI's intentions are, or maybe were, it's a message that's currently lost on US voters. Of those surveyed, 63% agreed with the statement that regulation should aim to actively prevent AI superintelligence, 21% said they didn't know, and 16% disagreed altogether.

The survey's overall findings suggest that voters are significantly more worried about keeping "dangerous [AI] models out of the hands of bad actors" rather than it being of benefit to us all. Research into new, more powerful AI models should be regulated, according to 67% of the surveyed voters, and they should be restricted in what they're capable of. Almost 70% of respondents felt that AI should be regulated like a "dangerous powerful technology."

That's not to say those people were against learning about AI. When asked about a proposal in Congress that expands access to AI education, research, and training, 55% agreed with the idea, whereas 24% opposed it. The rest chose the "Don't know" response.

I suspect that part of the negative view of AGI is that the average person will undoubtedly think 'Skynet' when questioned about artificial intelligence better than humans. Even with systems far more basic than that, concerns over deep fakes and job losses won't help with seeing any of the positives that AI can potentially bring.


The survey's results will no doubt be pleasing to the Artificial Intelligence Policy Institute, as it "believe(s) that proactive government regulation can significantly reduce the destabilizing effects from AI." I'm not suggesting that it's influenced the results in any way, as my own, very unscientific, survey of immediate friends and family produced a similar outcome—i.e. AGI is dangerous and should be heavily controlled.

Regardless of whether this is true or not, OpenAI, Google, and others clearly have lots of work ahead of them, in convincing voters that AGI really is beneficial to humanity. Because at the moment, it would seem that the majority view of AI becoming more powerful is an entirely negative one, despite arguments to the contrary.
 