bnew

AI one-percenters seizing power forever is the real doomsday scenario, warns AI godfather​

Hasan Chowdhury

Oct 30, 2023, 10:11 AM EDT

Yann LeCun

AI godfather Yann LeCun has fired shots at notable AI leaders. Kevin Dietsch/Getty Images
  • An AI godfather has had it with the doomsdayers.
  • Meta's Yann LeCun thinks tech bosses' bleak comments on AI risks could do more harm than good.
  • The naysaying is actually about keeping control of AI in the hands of a few, he said.
AI godfather Yann LeCun wants us to forget some of the more far-fetched doomsday scenarios.

He sees a different, real threat on the horizon: the rise of power-hungry one-percenters who rob everyone else of AI's riches.

Over the weekend, Meta's chief AI scientist accused some of the most prominent founders in AI of "fear-mongering" and "massive corporate lobbying" to serve their own interests.

He named OpenAI's Sam Altman, Google DeepMind's Demis Hassabis, and Anthropic's Dario Amodei in a lengthy weekend post on X.

"Altman, Hassabis, and Amodei are the ones doing massive corporate lobbying at the moment," LeCun wrote, referring to these founders' role in shaping regulatory conversations about AI safety. "They are the ones who are attempting to perform a regulatory capture of the AI industry."

He added that if these efforts succeed, the outcome would be a "catastrophe" because "a small number of companies will control AI."

That's significant since, as almost everyone who matters in tech agrees, AI is the biggest development in technology since the microchip or the internet.


Altman, Hassabis, and Amodei did not immediately respond to Insider's request for comment.

LeCun's comments came in response to a post on X from physicist Max Tegmark, who suggested that LeCun wasn't taking the AI doomsday arguments seriously enough.

"Thanks to @RishiSunak & @vonderleyen for realizing that AI xrisk arguments from Turing, Hinton, Bengio, Russell, Altman, Hassabis & Amodei can't be refuted with snark and corporate lobbying alone," Tegmark wrote, referring to the UK's upcoming global AI safety summit.



LeCun says founder fretting is just lobbying

Since the launch of ChatGPT, AI's power players have become major public figures.

But, LeCun said, founders such as Altman and Hassabis have spent a lot of time drumming up fear about the very technology they're selling.

In March, more than 1,000 tech leaders and researchers, including Elon Musk, signed a letter calling for a minimum six-month pause on AI development. Altman, Hassabis, and Amodei later signed a separate statement in May warning that AI poses a risk of extinction.

The letter cited "profound risks to society and humanity" posed by hypothetical AI systems. Tegmark, one of the letter's signatories, has described AI development as "a suicide race."

LeCun and others say these kinds of headline-grabbing warnings are just about cementing power and skating over the real, imminent risks of AI.

Those risks include worker exploitation and data theft that generates profit for "a handful of entities," according to the Distributed AI Research Institute (DAIR).

The focus on hypothetical dangers also diverts attention from the boring-but-important question of how AI development actually takes shape.

LeCun has described how people are "hyperventilating about AI risk" because they have fallen for what he describes as the myth of the "hard take-off." This is the idea that "the minute you turn on a super-intelligent system, humanity is doomed."

But imminent doom is unlikely, he argues, because every new technology in fact goes through an orderly development process before wider release.



So the area to focus on is, in fact, how AI is being developed right now. For LeCun, the real danger is that AI development gets locked inside private, for-profit entities that never release their findings, while the open-source AI community is obliterated.

His consequent worry is that regulators will let it happen because they're distracted by killer-robot arguments.

Leaders like LeCun have championed open-source developers because their work on tools that rival, say, OpenAI's ChatGPT brings a new level of transparency to AI development.

LeCun's employer, Meta, made LLaMa 2, its own large language model that competes with OpenAI's GPT models, (somewhat) open source. The idea is that the broader tech community can look under the hood of the model. No other big tech company has made a similar open-source release, though OpenAI is rumored to be considering one.

For LeCun, keeping AI development closed is a real reason for alarm.

"The alternative, which will inevitably happen if open source AI is regulated out of existence, is that a small number of companies from the West Coast of the US and China will control AI platform and hence control people's entire digital diet," he wrote.

"What does that mean for democracy? What does that mean for cultural diversity?"
 

bnew

AI capabilities are going to get a MASSIVE boost once they gain "EQ" — which I anticipate will start to emerge in true multimodal models.

It just occurred to me that AI, even as a simple summarizer and synthesizer of information, is quite hamstrung by the limitations of text-only I/O.

I'm reviewing meeting transcripts for catchy sound bites (e.g., to package into a derivative marketing post on LinkedIn). Piping the text through Claude or GPT-4 and asking it to highlight "interesting nuggets of counterintuitive insight" works relatively well, but as I listen to the recordings, I notice the best material consistently correlates with the emotional valence of the speaker.

If someone is on a tear or highly engaged, you can REALLY hear it in their voice, and the content they're relaying is, at the very least, interesting.

It thus seems possible that you could perform a better search for high-quality information (which, interestingly, doubles as valuable training data) over audio (and perhaps video, for additional non-verbal cues) with a multimodal model that can tease out these latent emotional signals from raw recordings and use them to "index" the core content.

May need to brush off my @hume_ai API key and start experimenting...
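
A rough sketch of what that indexing flow could look like. Both pieces are placeholders: score_arousal stands in for a real speech-emotion model or API (e.g., something like Hume's) run on the audio, and extract_insights stands in for a Claude/GPT-4 call; they're stubbed out here so the example runs on its own.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start_s: float
    end_s: float
    speaker: str
    text: str

def score_arousal(segment: Segment) -> float:
    """Stand-in for a speech-emotion model scoring the *audio* of this segment
    (0 = flat, 1 = highly engaged). A real version would run on the recording,
    not the transcript text."""
    exclamations = segment.text.count("!")
    emphasized = sum(1 for w in segment.text.split() if w.isupper() and len(w) > 1)
    return min(1.0, 0.2 * (exclamations + emphasized))

def extract_insights(text: str) -> str:
    """Stand-in for the LLM call that pulls out 'counterintuitive nuggets'
    (swap in Claude or GPT-4 here)."""
    return text[:200]

def index_by_valence(segments: list[Segment], top_k: int = 5) -> list[str]:
    """Rank segments by emotional arousal, then summarize only the top ones,
    i.e., use emotion as the index into the core content."""
    ranked = sorted(segments, key=score_arousal, reverse=True)
    return [extract_insights(seg.text) for seg in ranked[:top_k]]

if __name__ == "__main__":
    demo = [
        Segment(0, 30, "A", "So the quarterly numbers were fine, nothing unusual."),
        Segment(30, 55, "B", "Honestly this is the WILDEST result we have ever seen!"),
    ]
    for nugget in index_by_valence(demo, top_k=1):
        print(nugget)
```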



Information wants to be free, and OSS LLMs are going to unleash a flood of it unlike anything we've ever seen.

The proliferation of information has been accelerating for decades thanks to the internet, but the limiting factor has always been compression, indexing, and search.

When it comes to information, the cost of the "means of production" has fallen faster than the cost of the "means of consumption," which is to say indexing, filtering, sorting, and synthesizing the flood.

Hence the powers of aggregation and distribution agglomerated into an oligopoly (Google et al.).

This centralized structure lends itself to being controlled, including by the owners and protectors of intellectual property (amongst whom creators constitute only a small part, I'd bet — versus corporate publishers, traditional distributors, and other information middle-men who don't want to lose their piece).

It might be possible for these interests to exert pressure on labs with frontier models (see: Anthropic's update last week that temporarily nerfed Claude, which I'd speculate was not unrelated to the copyright lawsuit that's been filed against them).

BUT, as noted in the tweet below, the nature of this technology is beyond what I imagine the architects of intellectual property and the infrastructure that supports it could have foreseen: once something is out there, it's out there (which we've already known with the 'net); with OSS LLMs, that info can be "laundered," compressed, stored, and synthesized into whatever someone finds useful ENTIRELY ON THEIR LOCAL MACHINES!

I think creators should be compensated for their work, and we're all better off when capital flows to the most creative, original, and pioneering individuals and corporations. BUT we need to deal with the fact that the costs of the "means of consumption" are dropping, and the legacy mechanisms of IP control and enforcement via copyright, etc., are simply not going to work.

The "secret-to-common knowledge" information lifecycle is going to get hyper-compressed, and this is going to have extremely strange knock-on effects that our existing institutions are not sufficiently equipped to deal with.
 

bnew

Skywork: A More Open Bilingual Foundation Model​

Published on Oct 30
· Featured in Daily Papers on Oct 31
Authors:
Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu, Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma +8 authors

Abstract​

In this technical report, we present Skywork-13B, a family of large language models (LLMs) trained on a corpus of over 3.2 trillion tokens drawn from both English and Chinese texts. This bilingual foundation model is the most extensively trained and openly published LLM of comparable size to date. We introduce a two-stage training methodology using a segmented corpus, targeting general-purpose training and then domain-specific enhancement training, respectively. We show that our model not only excels on popular benchmarks, but also achieves state-of-the-art performance in Chinese language modeling across diverse domains. Furthermore, we propose a novel leakage detection method, demonstrating that test data contamination is a pressing issue warranting further investigation by the LLM community. To spur future research, we release Skywork-13B along with checkpoints obtained during intermediate stages of the training process. We are also releasing part of our SkyPile corpus, a collection of over 150 billion tokens of web text, which is the largest high-quality open Chinese pre-training corpus to date. We hope Skywork-13B and our open corpus will serve as a valuable open-source resource to democratize access to high-quality LLMs.
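
If the released checkpoints follow the usual Hugging Face layout, loading the base model could look roughly like this. The repo ID below is an assumption based on the paper name, and trust_remote_code may or may not be required; check the authors' actual release page.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo ID; verify against the authors' official release.
model_id = "Skywork/Skywork-13B-base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 13B parameters; needs a large-memory GPU
    device_map="auto",
    trust_remote_code=True,
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```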


 

bnew

Google DeepMind boss hits back at Meta AI chief over ‘fearmongering’ claim​

PUBLISHED TUE, OCT 31 2023 12:04 PM EDT
Ryan Browne
@RYAN_BROWNE_


KEY POINTS
  • Google DeepMind boss Demis Hassabis told CNBC that the company wasn’t trying to achieve “regulatory capture” when it came to the discussion on how best to approach AI.
  • Yann LeCun, Meta’s chief AI scientist, said that DeepMind’s Hassabis, along with other AI CEOs, were “doing massive corporate lobbying” to ensure only a handful of big tech companies end up controlling AI.
  • Hassabis said that it was important to start a conversation about regulating potentially superintelligent artificial intelligence now rather than later because, if left too long, the consequences could be grim.

The boss of Google DeepMind has pushed back on a claim from Meta’s artificial intelligence chief that the company is promoting fears about AI’s existential threat to humanity in order to control the narrative on how best to regulate the technology.

In an interview with CNBC’s Arjun Kharpal, Hassabis said that DeepMind wasn’t trying to achieve “regulatory capture” when it came to the discussion on how best to approach AI. It comes as DeepMind is closely informing the U.K. government on its approach to AI ahead of a pivotal summit on the technology due to take place on Wednesday and Thursday.

Over the weekend, Yann LeCun, Meta’s chief AI scientist, said that DeepMind’s Hassabis, OpenAI CEO Sam Altman, and Anthropic CEO Dario Amodei were “doing massive corporate lobbying” to ensure only a handful of big tech companies end up controlling AI.

He also said they were giving fuel to critics who say that highly advanced AI systems should be banned to avoid a situation where humanity loses control of the technology.

“If your fearmongering campaigns succeed, they will *inevitably* result in what you and I would identify as a catastrophe: a small number of companies will control AI,” LeCun said on X, the platform formerly known as Twitter, on Sunday.
“Like many, I very much support open AI platforms because I believe in a combination of forces: people’s creativity, democracy, market forces, and product regulations. I also know that producing AI systems that are safe and under our control is possible. I’ve made concrete proposals to that effect.”


LeCun is a big proponent of open-source AI, or AI software that is openly available to the public for research and development purposes. This is opposed to “closed” AI systems, the source code of which is kept a secret by the companies producing it.

LeCun said the vision of AI regulation that Hassabis and other AI CEOs are pushing for would see open-source AI “regulated out of existence” and allow only a small number of companies from the West Coast of the U.S. and China to control the technology.

Meta is one of the largest technology companies working to open-source its AI models. The company’s LLaMa large language model (LLM) software is one of the biggest open-source AI models out there, and has advanced language translation features built in.

In response to LeCun’s comments, Hassabis said Tuesday: “I pretty much disagree with most of those comments from Yann.”

“I think the way we think about it is there’s probably three buckets of risks that we need to worry about,” said Hassabis. “There’s sort of near-term harms, things like misinformation, deepfakes, these kinds of things, bias and fairness in the systems, that we need to deal with.”

“Then there’s sort of the misuse of AI by bad actors repurposing technology, general-purpose technology for bad ends that they were not intended for. That’s a question about proliferation of these systems and access to these systems. So we have to think about that.”

“And then finally, I think about the more longer-term risk, which is technical AGI [artificial general intelligence] risk,” Hassabis said.

“So the risk of the systems themselves: making sure they’re controllable, what values do you want to put into them, have these goals, and make sure that they stick to them?”

Hassabis is a big proponent of the idea that we will eventually achieve a form of artificial intelligence powerful enough to surpass humans in all tasks imaginable, something that’s referred to in the AI world as “artificial general intelligence.”

Hassabis said that it was important to start a conversation about regulating potentially superintelligent artificial intelligence now rather than later, because if it is left too long, the consequences could be grim.
“I don’t think we want to be doing this on the eve of some of these dangerous things happening,” Hassabis said. “I think we want to get ahead of that.”

Meta was not immediately available for comment when contacted by CNBC.

Cooperation with China​

Both Hassabis and James Manyika, Google’s senior vice president of research, technology and society, said that they wanted to achieve international agreement on how best to approach the responsible development and regulation of artificial intelligence.

Manyika said he thinks it’s a “good thing” that the U.K. government, along with the U.S. administration, agree there is a need to reach global consensus on AI.

“I also think that it’s going to be quite important to include everybody in that conversation,” Manyika added.

“I think part of what you’ll hear often is we want to be part of this, because this is such an important technology, with so much potential to transform society and improve lives everywhere.”

One point of contention surrounding the U.K. AI summit has been the attendance of China. A delegation from the Chinese Ministry of Science and Technology is due to attend the event this week.

That has stirred feelings of unease among some corners of the political world, both in the U.S. government and some of Prime Minister Rishi Sunak’s own ranks.

These officials are worried that China’s involvement in the summit could pose certain risks to national security, particularly as Beijing has a strong influence over its technology sector.

Asked whether China should be involved in the conversation surrounding artificial intelligence safety, Hassabis said that AI knows no borders, and that reaching international agreement on the standards required for AI would take coordination among actors in multiple countries.

“This technology is a global technology,” Hassabis said. “It’s really important, at least on a scientific level, that we have as much dialogue as possible.”

Asked whether DeepMind was open as a company to working with China, Hassabis responded: “I think we have to talk to everyone at this stage.”

U.S. technology giants have shied away from doing commercial work in China, particularly as Washington has applied huge pressure on the country on the technology front.
 

bnew

Chinese tech giant Alibaba launches upgraded AI model to challenge Microsoft, Amazon​

PUBLISHED TUE, OCT 31 2023 7:48 AM EDT
Arjun Kharpal
@ARJUNKHARPAL

KEY POINTS
  • Alibaba on Tuesday launched the latest version of its artificial intelligence model, as the Chinese technology giant looks to compete with U.S. rivals like Amazon and Microsoft.
  • China’s biggest cloud computing and e-commerce player announced Tongyi Qianwen 2.0, its latest large language model.
  • Alibaba also introduced the GenAI Service Platform, which lets companies build their own generative AI applications using their own data.

An Alibaba Group sign is seen at the World Artificial Intelligence Conference in Shanghai, July 6, 2023. Aly Song | Reuters
Alibaba on Tuesday launched the latest version of its artificial intelligence model, as the Chinese technology giant looks to compete with U.S. tech rivals such as Amazon and Microsoft.

China’s biggest cloud computing and e-commerce player announced Tongyi Qianwen 2.0, its latest large language model (LLM). An LLM is trained on vast amounts of data and forms the basis for generative AI applications such as ChatGPT, which is developed by U.S. firm OpenAI.

Alibaba called Tongyi Qianwen 2.0 a “substantial upgrade from its predecessor,” which was introduced in April.


Tongyi Qianwen 2.0 “demonstrates remarkable capabilities in understanding complex instructions, copywriting, reasoning, memorizing, and preventing hallucinations,” Alibaba said in a press release. Hallucinations refer to AI that presents incorrect information.

Alibaba also released AI models designed for applications in specific industries and uses — such as legal counselling and finance — as it angles in on businesses.

The Hangzhou-headquartered company also announced the GenAI Service Platform, which lets companies build their own generative AI applications, using their own data. One of the fears that businesses have about public generative AI products like ChatGPT is that data could be accessed by third parties.

Alibaba and other major cloud players are offering tools for companies to build their own generative AI products using their own data, which would be protected by these providers as part of the service package.

Microsoft’s Azure OpenAI Studio and Amazon Web Services’ Bedrock are two rival services.

While Alibaba is the biggest cloud player by market share in China, the company is trying to catch up with the likes of Amazon and Microsoft overseas.



 

bnew

Artists Lose First Round of Copyright Infringement Case Against AI Art Generators​

While a federal judge advanced an infringement claim against Stability AI, he dismissed the rest of the lawsuit.

BY WINSTON CHO



OCTOBER 30, 2023 4:57PM


Artists suing generative artificial intelligence art generators have hit a stumbling block in a first-of-its-kind lawsuit over the uncompensated and unauthorized use of billions of images downloaded from the internet to train AI systems, with a federal judge’s dismissal of most claims.


U.S. District Judge William Orrick on Monday found that copyright infringement claims cannot move forward against Midjourney and DeviantArt, concluding the accusations are “defective in numerous respects.” Among the issues are whether the AI systems they run on actually contain copies of copyrighted images that were used to create infringing works and if the artists can substantiate infringement in the absence of identical material created by the AI tools. Claims against the companies for infringement, right of publicity, unfair competition and breach of contract were dismissed, though they will likely be reasserted.

Notably, a claim for direct infringement against Stability AI was allowed to proceed based on allegations the company used copyrighted images without permission to create Stable Diffusion. Stability has denied the contention that it stored and incorporated those images into its AI system. It maintains that training its model does not include wholesale copying of works but rather involves development of parameters — like lines, colors, shades and other attributes associated with subjects and concepts — from those works that collectively define what things look like. The issue, which may decide the case, remains contested.


The litigation revolves around Stability’s Stable Diffusion, which is incorporated into the company’s AI image generator DreamStudio. In this case, the artists will have to establish that their works were used to train the AI system. It’s alleged that DeviantArt’s DreamUp and Midjourney are powered by Stable Diffusion. A major hurdle the artists face is that training datasets are largely a black box.


In his dismissal of infringement claims, Orrick wrote that plaintiffs’ theory is “unclear” as to whether there are copies of training images stored in Stable Diffusion that are utilized by DeviantArt and Midjourney. He pointed to the defense’s arguments that it’s impossible for billions of images “to be compressed into an active program,” like Stable Diffusion.

“Plaintiffs will be required to amend to clarify their theory with respect to compressed copies of Training Images and to state facts in support of how Stable Diffusion – a program that is open source, at least in part – operates with respect to the Training Images,” stated the ruling.


Orrick questioned whether Midjourney and DeviantArt, which offer use of Stable Diffusion through their own apps and websites, can be liable for direct infringement if the AI system “contains only algorithms and instructions that can be applied to the creation of images that include only a few elements of a copyrighted” work.


The judge stressed the absence of allegations of the companies playing an affirmative role in the alleged infringement. “Plaintiffs need to clarify their theory against Midjourney — is it based on Midjourney’s use of Stable Diffusion, on Midjourney’s own independent use of Training Images to train the Midjourney product, or both?” Orrick wrote.


According to the order, the artists will also likely have to show proof of infringing works produced by AI tools that are identical to their copyrighted material. This potentially presents a major issue because they have conceded that “none of the Stable Diffusion output images provided in response to a particular Text Prompt is likely to be a close match for any specific image in the training data.”

“I am not convinced that copyright claims based on a derivative theory can survive absent ‘substantial similarity’ type allegations,” the ruling stated.

Though the defendants made a “strong case” that the claim should be dismissed without an opportunity to be reargued, Orrick noted the artists’ contention that AI tools can create material similar enough to their work to be misconstrued as fakes.


Claims for vicarious infringement, violations of the Digital Millennium Copyright Act for removal of copyright management information, right of publicity, breach of contract and unfair competition were similarly dismissed.

“Plaintiffs have been given leave to amend to clarify their theory and add plausible facts regarding “compressed copies” in Stable Diffusion and how those copies are present (in a manner that violates the rights protected by the Copyright Act) in or invoked by the DreamStudio, DreamUp, and Midjourney products offered to third parties,” Orrick wrote. “That same clarity and plausible allegations must be offered to potentially hold Stability vicariously liable for the use of its product, DreamStudio, by third parties.”


Regarding the right of publicity claim, which takes issue with defendants profiting off of plaintiffs’ names by allowing users to request art in their style, the judge stressed that there’s not enough information supporting arguments that the companies used artists’ identities to advertise products.


Two of the three artists who filed the lawsuit have dropped their infringement claims because they didn’t register their work with the copyright office before suing. The copyright claims will be limited to artist Sarah Anderson’s works, which she has registered. As proof that Stable Diffusion was trained on her material, Anderson relied on the results of a search of her name on haveibeentrained.com, which allows artists to discover if their work has been used in AI model training and offers an opt-out to help prevent further unauthorized use.

“While defendants complain that Anderson’s reference to search results on the ‘haveibeentrained’ website is insufficient, as the output pages show many hundreds of works that are not identified by specific artists, defendants may test Anderson’s assertions in discovery,” the ruling stated.


Stability, DeviantArt and Midjourney didn’t respond to requests for comment.


On Monday, President Joe Biden issued an executive order to create some safeguards against AI. While it mostly focuses on reporting requirements over the national security risks some companies’ systems present, it also recommends the watermarking of photos, video and audio developed by AI tools to protect against deep fakes. Biden, at a signing of the order, stressed the technology’s potential to “smear reputations, spread fake news and commit fraud.”
“The inclusion of copyright and intellectual property protection in the AI Executive Order reflects the importance of the creative community and IP-powered industries to America’s economic and cultural leadership,” said the Human Artistry Campaign in a statement.


At a meeting in July, leading AI companies voluntarily agreed to guardrails to manage the risks posed by the emerging technology in a bid by the White House to get the industry to regulate itself in the absence of legislation instituting limits around the development of the new tools. Like the executive order issued by Biden, it was devoid of any kind of reporting regime or timeline that could legally bind the firms to their commitments.
 