bnew

Veteran
Joined
Nov 1, 2015
Messages
55,695
Reputation
8,224
Daps
157,195


Artists Get Early Access To OpenAI’s Sora Video Tool, With Surreal Results​

Leslie Katz

Contributor

I write about the intersection of art, science and technology.


https://www.forbes.com/sites/leslie...h-surreal-results/?sh=51fb90d84831#open-web-0

Mar 25, 2024, 07:50pm EDT



A guy with a yellow balloon for a head stands in a subway, with the balloon touching the ceiling


In "Air Head," a short film made with help from OpenAI's generative AI tool Sora, a guy has a ...

SHY KIDS

A select group of artists, designers and filmmakers have now had a couple of months to play with OpenAI’s Sora text-to-video tool since the company announced it, and on Monday, OpenAI shared some of their creations and first impressions.

“As great as Sora is at generating things that appear real, what excites us is its ability to make things that are totally surreal,” Toronto-based multimedia production company Shy Kids said in a statement accompanying Air Head, a short film it made with Sora. The word surreal aptly describes the video, which stars a guy with a yellow balloon for a noggin.

“I am literally filled with hot air,” he says.

Balloon guy goes on to describe the joys and pitfalls of living with the anatomical anomaly. Windy days cause his head to blow off his shoulders, and when he walks through the cactus aisle of a plant store, things can get prickly. But he also lives with a keen awareness that “we’re all just a pin prick away from deflation,” and for that he’s grateful.



The new generative AI tool, which OpenAI first shared with the public in mid-February, can produce videos up to a minute long from a single text prompt. Sora isn’t yet available as a product, and OpenAI says it’s currently working to assess the tool’s capabilities, limitations and risks.

The video from Shy Kids and other early testers, including OpenAI’s first artist in residence, Alexander Reben, will help the company do that, OpenAI said in a blog post. OpenAI wouldn’t reveal exactly how many visual artists, designers, creative directors and filmmakers are test-driving Sora or what parameters informed the making of the films it spotlighted on Monday.

“While we have many improvements to make to Sora, we're already getting a glimpse of how the model can help creatives bring ideas to reality,” the company said.




Generative AI continues to elicit a range of passionate reactions, from enthusiasm about the tools’ creative potential to concern that artists’ work will be stolen to train AI models or that algorithms will steal creatives’ jobs altogether. Often, the views exist simultaneously. The artists and filmmakers granted early access to the tool, not surprisingly, appear to tilt heavily toward the excited end of the spectrum, at least when it comes to Sora.

“The ability to rapidly conceptualize at such a high level of quality is not only challenging my creative process but also helping me evolve in storytelling,” said Josephine Miller, creative director of London-based Oraar Studio, which specializes in 3D visuals, augmented reality and digital fashion.


Miller’s short film presents a dreamy subaquatic world where humans serenely float and twirl in garments covered with iridescent fish-like scales. As with a number of other films highlighted by OpenAI on Monday, the world of this one hovers somewhere between reality and unfettered imagination.

Sora is “not bound by traditional laws of physics or conventions of thought,” creator Don Allen Stevenson III said in a statement about his film, adding that collaborating with the tool shifted his focus from “technical hurdles to pure creativity…unlocking a world of instant visualization and rapid prototyping.”




The seven short films also include one by Nik Kleverov, co-founder and creative director of L.A.-based creative agency Native Foreign. His entry presents an evocative compilation that spans decades, moods and visual styles.

Kleverov said he can already see how Sora will transform the way he approaches both agency work and personal projects. “It’s allowing me to iterate and explore original concepts that have been kept in a vault or on indefinite pause due to budgetary and resource constraints,” Kleverov said on LinkedIn when sharing the Sora film.

In it, one man straight out of a black-and-white noir scene walks down a rainy cobblestone city street; another hunches over timepieces in an old-time clock repair shop rendered in nostalgic sepia tones. Wait, is that a futuristic-looking sportscar surfacing from under the ocean? Why yes, yes it is.
 


Nvidia’s AI chip dominance is being targeted by Google, Intel, and Arm​

The UXL Foundation project wants to eliminate the proprietary software barriers keeping developers locked into using Nvidia’s AI tech.​

By Jess Weatherbed, a news writer focused on creative industries, computing, and internet culture. Jess started her career at TechRadar, covering news and hardware reviews.

Mar 25, 2024, 12:34 PM EDT

Vector collage of the Nvidia logo.

A coalition of tech companies, including Intel, Google, Arm, and Qualcomm, hopes to loosen Nvidia’s grip on the AI market. Illustration by Cath Virginia / The Verge

Major tech companies are attempting to eliminate software advantages that have helped Nvidia dominate the artificial intelligence market. According to Reuters, a group formed by Intel, Google, Arm, Qualcomm, Samsung, and other tech companies is developing an open-source software suite that prevents AI developers from being locked into Nvidia’s proprietary tech, allowing their code to run on any machine and with any chip.

The group, called The Unified Acceleration Foundation (UXL), told Reuters that technical details for the project should reach a “mature” state by the second half of this year, though a final release target wasn’t given. The project currently includes the oneAPI open standard, which Intel developed to keep requirements like specific coding languages, code bases, and other tools from tying developers to a specific architecture, such as Nvidia’s CUDA platform.

Nvidia became the first chipmaker to hit a $2 trillion market capitalization last month, having experienced rapid growth after focusing on hardware for powering AI models, like its H100 and upcoming H200 GPUs. Those Nvidia chips, which lock developers into using Nvidia’s CUDA architecture, are superior to anything currently produced by other chipmakers, but the explosive demand has caused scarcity while rival companies continue developing their own alternatives. During the company’s 2023 Computex keynote, Nvidia CEO Jensen Huang said that four million developers were using the CUDA computing model.

While UXL says the project will initially aim to open up options for AI apps and high-performance computing applications, the group plans to eventually support Nvidia’s hardware and code, too. UXL is seeking aid from additional chipmakers and cloud-computing companies like Microsoft and Amazon to ensure the solution can be deployed on any chip or hardware. Microsoft, which is notably not included in the UXL coalition, was rumored to have teamed up with AMD last year to develop alternative AI chips that could challenge Nvidia’s effective monopoly over the industry.
 


Adobe’s Firefly Services makes over 20 new generative and creative APIs available to developers​

Frederic Lardinois @fredericl / 1:08 PM EDT•March 26, 2024


Fireflies in a forest

Image Credits: Trevor Williams / Getty Images

Adobe today announced Firefly Services, a set of over 20 new generative and creative APIs, tools and services. Firefly Services makes some of the company’s AI-powered features from its Creative Cloud tools like Photoshop available to enterprise developers to speed up content creation in their custom workflows — or create entirely new solutions.

In addition, the company today launched Custom Models, which allows businesses to fine-tune Firefly models based on their assets. Custom Models is already built into Adobe’s new GenStudio.

Adobe describes Firefly Services as a “comprehensive set of generative AI and creative APIs that automates workflows.” It includes APIs for removing backgrounds, smartly cropping images and automatically leveling the horizon in a photo, as well as access to core AI-driven Photoshop features like Generative Fill and Expand. In addition to these AI features, Firefly Services also exposes tools for editing text layers, tagging content and applying presets from Lightroom, for example.

“As consumer expectations around generative AI-driven personalization continue to rise, Firefly Services and Custom Models are first-of-its-kind offerings that unlock the possibility for brands to have powerful customization capabilities and more control in defining their automation processes,” said David Wadhwani, president, Digital Media Business, Adobe. “Brands are urgently looking to Adobe to shift their generative AI investments from playgrounds to production.”




Image Credits: Adobe

As with most of Adobe’s enterprise-centric use cases for generative AI, these new tools are all meant to help brands speed up their content creation workflows. Yet while many enterprises want to use generative AI, they are also worried about brand safety, which has kept many of them from bringing these tools into production. From the outset, Adobe positioned Firefly as a brand-safe alternative to other models and this new set of services continues that tradition.

“The rising expectations for customer experiences have compelled brands to rethink how they produce and personalize marketing content at scale,” said Billy Seabrook, global chief design officer, IBM Consulting. “Adobe applications have been instrumental in our creative process, and now with Firefly, we can rapidly generate imagery and templates in a range of styles and sizes to align with brand standards and enable more people to participate in the creative process.”




Image Credits: Adobe
 


YC-backed SigmaOS browser turns to AI-powered features for monetization​

Ivan Mehta @indianidle / 11:30 AM EDT•March 26, 2024



SigmaOS

Image Credits: SigmaOS

Web browsers have realized they are one of the best ways for users to access the present set of AI tools, so they are working on being the first-choice containers for that. SigmaOS, a Y Combinator-backed company, is now banking on users’ desire to utilize AI tools and pay for them as the company is releasing new features like link preview summaries, pinch-to-summarize and “look it up” browsing features.

Some of these features sound and work like rival browser Arc’s recent releases. But SigmaOS claims that its feature returns better-quality results, which is a hard metric to quantify.

The company is releasing pinch-to-summarize on desktop, which works a bit like Arc’s new mobile feature. While the summarizer captures sections like information, ratings, reviews, prices and photos from an Airbnb listing, it gives only a short paragraph of info for an article, which is not sufficient. Arc browser’s summarize function also had its own hiccups in terms of missing out on key information, but it worked consistently across formats.

pinch to summarize

Image Credits: SigmaOS

One of the company’s co-founders, Mahyad Ghassemibouyaghchi, said that SigmaOS will adapt to different page types in the coming months and will present summaries in various formats based on the web page.

SigmaOS’ marquee feature from this release is called “Look it up.” It browses the web for a given query and makes a summary page out of the information that it finds. This is similar to Arc’s “Browse for me” function, but on desktop. One key differentiator is that users can ask follow-up questions to explore more about the topic.

Look it up

Image Credits: SigmaOS

Besides that, the startup is also releasing link previews on hover and automatic renaming for locked (pinned) pages.

Going all out on AI​

Last year, SigmaOS released some AI-powered features such as a contextual assistant called Airis, which can answer your questions about a web page or the broader web.

At one point, the startup tried to monetize through team-based features. Now, the company is looking to monetize its AI features. It said that all users would get access to AI-powered features but for $20 per month users would get better rate limits for AI features. For $30 per month, they would get unlimited usage and the ability to choose between different models such as GPT-4, Perplexity and Claude 3 Haiku.

Separately, the company is now thinking big by aiming to release an AI-agent-like feature, which will let you use the browser in a hands-free mode. In a demo video, Ghassemibouyaghchi shows how users could clear emails or book an Airbnb by interacting with the browser with voice. This is a similar idea to the Rabbit r1 device, which aims to traverse an interface for you to complete a task.

The company is also aiming to build something called “repeatable flows,” which are automatic actions based on triggers like time. You can think of them as the If This Then That (IFTTT) of browsers, but that’s still in the concept stage.

Separately, SigmaOS’ competitor Arc, which recently raised $50 million in funding at a $550 million valuation, announced in January that it plans to build an AI agent that browses the web for you.
 


Profluent, spurred by Salesforce research and backed by Jeff Dean, uses AI to discover medicines​

Kyle Wiggers @kyle_l_wiggers / 6:08 PM EDT•March 25, 2024


Abstract DNA strands

Image Credits: koto_feja / Getty Images

Last year, Salesforce, the company best known for its cloud sales support software (and Slack), spearheaded a project called ProGen to design proteins using generative AI. A research moonshot, ProGen could — if brought to market — help uncover medical treatments more cost effectively than traditional methods, the researchers behind it claimed in a January 2023 blog post.

ProGen culminated in research published in the journal Nature Biotechnology showing that the AI could successfully create the 3D structures of artificial proteins. But, beyond the paper, the project didn’t amount to much at Salesforce or anywhere else, at least not in the commercial sense.

That is, until recently.

One of the researchers responsible for ProGen, Ali Madani, has launched a company, Profluent, that he hopes will bring similar protein-generating tech out of the lab and into the hands of pharmaceutical companies. In an interview with TechCrunch, Madani describes Profluent’s mission as “reversing the drug development paradigm,” starting with patient and therapeutic needs and working backwards to create “custom-fit” treatment solutions.

“Many drugs — enzymes and antibodies, for example — consist of proteins,” Madani said. “So ultimately this is for patients who would receive an AI-designed protein as medicine.”

While at Salesforce’s research division, Madani found himself drawn to the parallels between natural language (e.g. English) and the “language” of proteins. Proteins — chains of bonded-together amino acids that the body uses for various purposes, from making hormones to repairing bone and muscle tissue — can be treated like words in a paragraph, Madani discovered. Fed into a generative AI model, data about proteins can be used to predict entirely new proteins with novel functions.
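The "proteins as language" idea can be illustrated with a toy sketch. This is not ProGen or anything Profluent has described; it is a minimal bigram counter, assuming each amino acid's one-letter code is treated as a token. The function names and the two example sequences are made up for illustration.

```python
from collections import Counter, defaultdict

# The 20 standard amino acids, by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tokenize(sequence):
    """Split a protein sequence into one token per amino acid."""
    seq = sequence.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unknown residue codes: {sorted(unknown)}")
    return list(seq)


def bigram_counts(sequences):
    """Count how often each residue follows each other residue."""
    counts = defaultdict(Counter)
    for seq in sequences:
        tokens = tokenize(seq)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, residue):
    """Return the residue most often seen after `residue` in the training data."""
    return counts[residue].most_common(1)[0][0]


# Made-up sequences, used only to exercise the code.
toy_sequences = ["MKTAYIAK", "MKTLLVAK"]
counts = bigram_counts(toy_sequences)
```

A real generative model replaces the bigram table with a large neural network trained on vastly more sequences, but the framing is the same: predict the next token given the ones before it.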

With Profluent, Madani and co-founder Alexander Meeske, an assistant professor of microbiology at the University of Washington, aim to take the concept a step further by applying it to gene editing.

“Many genetic diseases can’t be fixed by [proteins or enzymes] lifted directly from nature,” Madani said. “Furthermore, gene editing systems mixed and matched for new capabilities suffer from functional tradeoffs that significantly limit their reach. In contrast, Profluent can optimize multiple attributes simultaneously to achieve a custom-designed [gene] editor that’s a perfect fit for each patient.”

It’s not out of left field. Other companies and research groups have demonstrated viable ways in which generative AI can be used to predict proteins.

Nvidia in 2022 released a generative AI model, MegaMolBART, that was trained on a data set of millions of molecules to search for potential drug targets and forecast chemical reactions. Meta trained a model called ESM-2 on sequences of proteins, an approach the company claimed allowed it to predict sequences for more than 600 million proteins in just two weeks. And DeepMind, Google’s AI research lab, has a system called AlphaFold that predicts complete protein structures, achieving speed and accuracy far surpassing older, less complex algorithmic methods.

Profluent is training AI models on massive data sets — data sets with over 40 billion protein sequences — to create new as well as fine-tune existing gene-editing and protein-producing systems. Rather than develop treatments itself, the startup plans to collaborate with outside partners to yield “genetic medicines” with the most promising paths to approval.

Madani asserts this approach could dramatically cut down on the amount of time, and capital, typically required to develop a treatment. According to industry group PhRMA, it takes 10-15 years on average to develop one new medicine from initial discovery through regulatory approval. Meanwhile, recent estimates peg the cost of developing a new drug at between several hundred million dollars and $2.8 billion.

“Many impactful medicines were in fact accidentally discovered, rather than intentionally designed,” Madani said. “[Profluent’s] capability offers humanity a chance to move from accidental discovery to intentional design of our most needed solutions in biology.”

Berkeley-based, 20-employee Profluent is backed by VC heavy hitters including Spark Capital (which led the company’s recent $35 million funding round), Insight Partners, Air Street Capital, AIX Ventures and Convergent Ventures. Google chief scientist Jeff Dean has also contributed, lending additional credence to the platform.

Profluent’s focus in the next few months will be on upgrading its AI models, in part by expanding the training data sets, and on acquiring customers and partners, Madani says. It’ll have to move aggressively; rivals, including EvolutionaryScale and Basecamp Research, are fast training their own protein-generating models and raising vast sums of VC cash.

“We’ve developed our initial platform and shown scientific breakthroughs in gene editing,” Madani said. “Now is the time to scale and start enabling solutions with partners that match our ambitions for the future.”
 


How Adobe’s bet on non-exploitative AI is paying off​

The company says it’s proof that quality AI models don’t have to include controversial copyrighted content.

By Melissa Heikkilä


March 26, 2024
generated image of a hand using an X-acto knife to cut through a painting of reality

MITTR VIA FIREFLY

Since the beginning of the generative AI boom, there has been a fight over how large AI models are trained. In one camp sit tech companies such as OpenAI that have claimed it is “impossible” to train AI without hoovering up copyrighted data from the internet. And in the other camp are artists who argue that AI companies have taken their intellectual property without consent and compensation.

Adobe is pretty unusual in that it sides with the latter group, with an approach that stands out as an example of how generative AI products can be built without scraping copyrighted data from the internet. Adobe released its image-generating model Firefly, which is integrated into its popular photo editing tool Photoshop, one year ago.

In an exclusive interview with MIT Technology Review, Adobe’s AI leaders are adamant this is the only way forward. At stake is not just the livelihood of creators, they say, but our whole information ecosystem. What they have learned shows that building responsible tech doesn’t have to come at the cost of doing business.

“We worry that the industry, Silicon Valley in particular, does not pause to ask the ‘how’ or the ‘why.’ Just because you can build something doesn’t mean you should build it without consideration of the impact that you’re creating,” says David Wadhwani, president of Adobe’s digital media business.

Those questions guided the creation of Firefly. When the generative image boom kicked off in 2022, there was a major backlash against AI from creative communities. Many people were using generative AI models as derivative content machines to create images in the style of another artist, sparking a legal fight over copyright and fair use. The latest generative AI technology has also made it much easier to create deepfakes and misinformation.

It soon became clear that to offer creators proper credit and businesses legal certainty, the company could not build its models by scraping data from the web, Wadhwani says.

Adobe wants to reap the benefits of generative AI while still “recognizing that these are built on the back of human labor. And we have to figure out how to fairly compensate people for that labor now and in the future,” says Ely Greenfield, Adobe’s chief technology officer for digital media.

To scrape or not to scrape​

The scraping of online data, commonplace in AI, has recently become highly controversial. AI companies such as OpenAI, Stability.AI, Meta, and Google are facing numerous lawsuits over AI training data. Tech companies argue that publicly available data is fair game. Writers and artists disagree and are pushing for a license-based model, where creators would get compensated for having their work included in training datasets.

Adobe trained Firefly on content that had an explicit license allowing AI training, which means the bulk of the training data comes from Adobe’s library of stock photos, says Greenfield. The company offers creators extra compensation when material is used to train AI models, he adds.

This is in contrast to the status quo in AI today, where tech companies scrape the web indiscriminately and have a limited understanding of what the training data includes. Because of these practices, the AI datasets inevitably include copyrighted content and personal data, and research has uncovered toxic content, such as child sexual abuse material.

Scraping the internet gives tech companies a cheap way to get lots of AI training data, and traditionally, having more data has allowed developers to build more powerful models. Limiting Firefly to licensed data for training was a risky bet, says Greenfield.


“To be honest, when we started with Firefly with our image model, we didn’t know whether or not we would be able to satisfy customer needs without scraping the web,” says Greenfield.

“And we found we could, which was great.”

Human content moderators also review the training data to weed out objectionable or harmful content, known intellectual property, and images of known people, and the company has licenses for everything its products train on.

Adobe’s strategy has been to integrate generative AI tools into its existing products, says Greenfield. In Photoshop, for example, Firefly users can fill in areas of an image using text commands. This allows them much more control over the creative process, and it aids their creativity.

Still, more work needs to be done. The company wants to make Firefly even faster. Currently it takes around 10 seconds for the company’s content moderation algorithms to check the outputs of the model, for example, Greenfield says. Adobe is also trying to figure out how some business customers could generate copyrighted content, such as Marvel characters or Mickey Mouse. Adobe has teamed up with companies such as IBM, Mattel, NVIDIA and NASCAR, which allows these companies to use the tool with their intellectual property. It is also working on audio, lip synching tools and 3D generation.

Garbage in, garbage out​

The decision to not scrape the internet also gives Adobe an edge in content moderation. Generative AI is notoriously difficult to control, and developers themselves don’t know why the models generate the images and texts they do. Generative AI models have put out questionable and toxic content in numerous cases.

That all comes down to what it has been trained on, Greenfield says. He says Adobe’s model has never seen a picture of Joe Biden or Donald Trump, for example, and it cannot be coaxed into generating political misinformation. The AI model’s training data has no news content or famous people. It has not been trained on any copyrighted material, such as images of Mickey Mouse.

“It just doesn’t understand what that concept is,” says Greenfield.

Adobe also applies automated content moderation at the point of creation to check that Firefly’s creations are safe for professional use. The model is prohibited from creating news stories or violent images. Some names of artists are also blocked. Firefly-generated content comes with labels that indicate it has been created using AI, and the image’s edit history.

During a critical election year, the need to know who made a piece of content, and how, is especially important. Adobe has been a vocal advocate for labels on AI content that tell where it originated, and with whom.

The company started the Content Authenticity Initiative, an association promoting the use of labels which tell you whether content is AI-generated or not, along with the New York Times and Twitter (now X). The initiative now has over 2,500 members. It is also part of developing C2PA, an industry standard label which shows where a piece of content has come from, and how it was created.

“We’re long overdue [for] a better education in media literacy and tools that support people’s ability to validate any content that claims to represent reality,” Greenfield says.

Adobe’s approach highlights the need for AI companies to be thinking deeply about content moderation, says Claire Leibowicz, head of AI and media integrity at the nonprofit Partnership on AI.

Adobe’s approach toward generative AI serves those societal goals by fighting misinformation as well as promoting business goals, such as preserving creator autonomy and attribution, adds Leibowicz.

“The business mission of Adobe is not to prevent misinformation, per se,” she says. “It’s to empower creators. And isn’t this a really elegant confluence of mission and tactics, to be able to kill two birds with one stone?”

Wadhwani agrees. The company says Firefly-powered features are among its most popular, and 90% of Firefly’s web app users are entirely new customers to Adobe.

“I think our approach has definitely been good for business,” Wadhwani says.

Correction: An earlier version of this article had David Wadhwani's title wrong. This has been amended.
 



Google DeepMind’s new AI assistant helps elite soccer coaches get even better​

The system can predict the outcome of corner kicks and provide realistic and accurate tactical suggestions in football matches.

By Rhiannon Williams

March 19, 2024
An overhead illustration showcasing how AI can work with football.

An illustration of how TacticAI could be integrated into the process of football tactic development in the real world. GOOGLE DEEPMIND

Soccer teams are always looking to get an edge over their rivals. Whether it’s studying players’ susceptibility to injury, or opponents’ tactics—top clubs look at reams of data to give them the best shot of winning.

They might want to add a new AI assistant developed by Google DeepMind to their arsenal. It can suggest tactics for soccer set-pieces that are even better than those created by professional club coaches.

The system, called TacticAI, works by analyzing a dataset of 7,176 corner kicks taken by players for Liverpool FC, one of the biggest soccer clubs in the world.

Corner kicks are awarded to an attacking team when the ball passes over the goal line after touching a player on the defending team. In a sport as free-flowing and unpredictable as soccer, corners—like free kicks and penalties—are rare instances in the game when teams can try out pre-planned plays.

TacticAI uses predictive and generative AI models to convert each corner kick scenario (such as a receiver successfully scoring a goal, or a rival defender intercepting the ball and returning it to their team) into a graph, with the data from each player becoming a node on the graph, before modeling the interactions between the nodes. The work was published in Nature Communications today.
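As a rough illustration of the graph representation described above, here is a minimal sketch. It is not DeepMind's implementation: the `Player` fields, the fully connected edge layout and the neighbour-averaging step are all assumptions chosen to keep the example small, with the averaging standing in for a learned graph neural network layer.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Player:
    name: str        # hypothetical identifier
    x: float         # assumed feature: pitch position in metres
    y: float
    attacking: bool  # True if on the team taking the corner


def build_graph(players):
    """One node per player, one directed edge for every ordered pair of players."""
    nodes = dict(enumerate(players))
    edges = list(permutations(nodes, 2))
    return nodes, edges


def message_pass(nodes, edges):
    """Toy interaction step: each node receives the mean position of its neighbours."""
    sums = {i: [0.0, 0.0, 0] for i in nodes}
    for src, dst in edges:
        sums[dst][0] += nodes[src].x
        sums[dst][1] += nodes[src].y
        sums[dst][2] += 1
    return {i: (sx / n, sy / n) for i, (sx, sy, n) in sums.items()}


# Made-up corner kick with three players.
players = [
    Player("taker", 0.0, 0.0, True),
    Player("receiver", 10.0, 30.0, True),
    Player("defender", 8.0, 34.0, False),
]
nodes, edges = build_graph(players)
summary = message_pass(nodes, edges)
```

In a trained model, the per-node summaries would feed a prediction head (for example, who touches the ball first) rather than being read directly.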

Using this data, the model provides recommendations about where to position players during a corner to give them, for example, the best shot at scoring a goal, or the best combination of players to get up front. It can also try to predict the outcomes of a corner, including whether a shot will take place, or which player is most likely to touch the ball first.


The main benefit is that the AI assistant reduces the workload of the coaches, says Ondřej Hubáček, an analyst at the sports data firm Ematiq who specializes in predictive models, and who did not work on the project. “An AI system can go through the data quickly and point out errors a team is making—I think that’s the added value you can get from AI assistants,” he says.

To assess TacticAI’s suggestions, Google DeepMind presented them to five football experts: three data scientists, one video analyst, and one coaching assistant, all of whom work at Liverpool FC. Not only did these experts struggle to distinguish TacticAI’s suggestions from real game play scenarios, they also favored the system’s strategies over existing tactics 90% of the time.

These findings suggest that TacticAI’s strategies could be useful for human coaches in real-life games, says Petar Veličković, a staff research scientist at Google DeepMind who worked on the project. “Top clubs are always searching for an edge, and I think our results indicate that techniques like these are likely going to become a part of modern football going forward,” he says.

TacticAI’s powers of prediction aren’t just limited to corner kicks either—the same method could be easily applied to other set pieces, general play throughout a match, or even other sports entirely, such as American football, hockey, or basketball, says Veličković.

“As long as there’s a team-based sport where you believe that modeling relationships between players will be useful and you have a source of data, it’s applicable,” he says.
 


The tech industry can’t agree on what open-source AI means. That’s a problem.​

The answer could determine who gets to shape the future of the technology.

By Edd Gent

March 25, 2024
open foundation of a lock shaped building with a subtle crack. A crane lowers a key

STEPHANIE ARNETT/MITTR | ENVATO

Suddenly, “open source” is the latest buzzword in AI circles. Meta has pledged to create open-source artificial general intelligence. And Elon Musk is suing OpenAI over its lack of open-source AI models.

Meanwhile, a growing number of tech leaders and companies are setting themselves up as open-source champions.

But there’s a fundamental problem—no one can agree on what “open-source AI” means.

On the face of it, open-source AI promises a future where anyone can take part in the technology’s development. That could accelerate innovation, boost transparency, and give users greater control over systems that could soon reshape many aspects of our lives. But what even is it? What makes an AI model open source, and what disqualifies it?

The answers could have significant ramifications for the future of the technology. Until the tech industry has settled on a definition, powerful companies can easily bend the concept to suit their own needs, and it could become a tool to entrench the dominance of today’s leading players.

Entering this fray is the Open Source Initiative (OSI), the self-appointed arbiters of what it means to be open source. Founded in 1998, the nonprofit is the custodian of the Open Source Definition, a widely accepted set of rules that determine whether a piece of software can be considered open source.

Now, the organization has assembled a 70-strong group of researchers, lawyers, policymakers, activists, and representatives from big tech companies like Meta, Google, and Amazon to come up with a working definition of open-source AI.

The open-source community is a big tent, though, encompassing everything from hacktivists to Fortune 500 companies. While there’s broad agreement on the overarching principles, says Stefano Maffulli, OSI’s executive director, it’s becoming increasingly obvious that the devil is in the details. With so many competing interests to consider, finding a solution that satisfies everyone while ensuring that the biggest companies play along is no easy task.

Fuzzy criteria

The lack of a settled definition has done little to prevent tech companies from adopting the term.

Last July, Meta made its Llama 2 model, which it referred to as open source, freely available, and it has a track record of publicly releasing AI technologies. “We support the OSI’s effort to define open-source AI and look forward to continuing to participate in their process for the benefit of the open source community across the world,” Jonathan Torres, Meta’s associate general counsel for AI, open source, and licensing, told us.

That stands in marked contrast to rival OpenAI, which has shared progressively fewer details about its leading models over the years, citing safety concerns. “We only open-source powerful AI models once we have carefully weighed the benefits and risks, including misuse and acceleration,” a spokesperson said.

Other leading AI companies, like Stability AI and Aleph Alpha, have also released models described as open source, and Hugging Face hosts a large library of freely available AI models.

While Google has taken a more locked-down approach with its most powerful models, like Gemini and PaLM 2, the Gemma models released last month are freely accessible and designed to go toe-to-toe with Llama 2, though the company described them as “open” rather than “open source.”

But there’s considerable disagreement about whether any of these models can really be described as open source. For a start, both Llama 2 and Gemma come with licenses that restrict what users can do with the models. That’s anathema to open-source principles: one of the key clauses of the Open Source Definition outlaws the imposition of any restrictions based on use cases.

The criteria are fuzzy even for models that don’t come with these kinds of conditions. The concept of open source was devised to ensure developers could use, study, modify, and share software without restrictions. But AI works in fundamentally different ways, and key concepts don’t translate from software to AI neatly, says Maffulli.

One of the biggest hurdles is the sheer number of ingredients that go into today’s AI models. All you need to tinker with a piece of software is the underlying source code, says Maffulli. But depending on your goal, dabbling with an AI model could require access to the trained model, its training data, the code used to preprocess this data, the code governing the training process, the underlying architecture of the model, or a host of other, more subtle details.

Which ingredients you need to meaningfully study and modify models remains open to interpretation. “We have identified what basic freedoms or basic rights we want to be able to exercise,” says Maffulli. “The mechanics of how to exercise those rights are not clear.”

Settling this debate will be essential if the AI community wants to reap the same benefits software developers gained from open source, which was built on broad consensus about what the term meant, says Maffulli. “Having [a definition] that is respected and adopted by a large chunk of the industry provides clarity,” he says. “And with clarity comes lower costs for compliance, less friction, shared understanding.”

By far the biggest sticking point is data. All the major AI companies have simply released pretrained models, without the data sets on which they were trained. For people pushing for a stricter definition of open-source AI, Maffulli says, this seriously constrains efforts to modify and study models, automatically disqualifying them as open source.

Others have argued that a simple description of the data is often enough to probe a model, says Maffulli, and you don’t necessarily need to retrain from scratch to make modifications. Pretrained models are routinely adapted through a process known as fine-tuning, in which they are partially retrained on a smaller, often application-specific, dataset.
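To make the idea of fine-tuning concrete, here is a minimal toy sketch in Python. It is not how Llama 2 or any real model is adapted: the two-parameter linear "model", the `fine_tune` helper, the choice to update only the bias, and every number below are assumptions made purely for illustration. It shows a pretrained set of weights being partially retrained on a small dataset, without any access to the original training data.

```python
# Toy illustration of fine-tuning. All names and numbers are hypothetical.

def predict(params, x):
    """Linear model: y = w * x + b."""
    return params["w"] * x + params["b"]


def fine_tune(params, data, trainable=("b",), lr=0.1, epochs=200):
    """Run gradient descent on `data`, updating only the `trainable` parameters."""
    tuned = dict(params)  # copy, so the pretrained weights stay untouched
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(tuned, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(tuned, x) - y) for x, y in data) / n
        if "w" in trainable:
            tuned["w"] -= lr * grad_w
        if "b" in trainable:
            tuned["b"] -= lr * grad_b
    return tuned


# Stand-in for a released pretrained model: weights only, no training data.
pretrained = {"w": 2.0, "b": 0.0}

# Small, application-specific dataset that follows y = 2x + 1.
small_dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

tuned = fine_tune(pretrained, small_dataset)
```

In this sketch the weight `w` is frozen and only the bias `b` moves toward the value the small dataset implies, which mirrors, in a very reduced form, the point that a model can be modified without retraining it from scratch.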

Meta’s Llama 2 is a case in point, says Roman Shaposhnik, CEO of open-source AI company Ainekko and vice president of legal affairs for the Apache Software Foundation, who is involved in the OSI process. While Meta only released a pretrained model, a flourishing community of developers has been downloading and adapting it, and sharing their modifications.

“People are using it in all sorts of projects. There’s a whole ecosystem around it,” he says. “We therefore must call it something. Is it half-open? Is it ajar?”

While it may be technically possible to modify a model without its original training data, restricting access to a key ingredient is not really in the spirit of open source, says Zuzanna Warso, director of research at nonprofit Open Future, who is taking part in the OSI’s discussions. It’s also debatable whether it’s possible to truly exercise the freedom to study a model without knowing what information it was trained on.

“It’s a crucial component of this whole process,” she says. “If we care about openness, we should also care about the openness of the data.”
 

Have your cake and eat it

It’s important to understand why companies setting themselves up as open-source champions are reluctant to hand over training data. Access to high-quality training data is a major bottleneck for AI research and a competitive advantage for bigger firms that they’re eager to maintain, says Warso.

At the same time, open source carries a host of benefits that these companies would like to see translated to AI. At a superficial level, the term “open source” carries positive connotations for a lot of people, so engaging in so-called “open washing” can be an easy PR win, says Warso.

It can also have a significant impact on their bottom line. Economists at Harvard Business School recently found that open-source software has saved companies almost $9 trillion in development costs by allowing them to build their products on top of high-quality free software rather than writing it themselves.

For larger companies, open-sourcing their software so that it can be reused and modified by other developers can help build a powerful ecosystem around their products, says Warso. The classic example is Google’s open-sourcing of its Android mobile operating system, which cemented its dominant position at the heart of the smartphone revolution. Meta’s Mark Zuckerberg has been explicit about this motivation in earnings calls, saying “open-source software often becomes an industry standard, and when companies standardize on building with our stack, that then becomes easier to integrate new innovations into our products.”

Crucially, it also appears that open-source AI may receive favorable regulatory treatment in some places, Warso says, pointing to the EU’s newly passed AI Act, which exempts certain open-source projects from some of its more stringent requirements.

Taken together, it’s clear why sharing pretrained models but restricting access to the data required to build them makes good business sense, says Warso. But it does smack of companies trying to have their cake and eat it too, she adds. And if the strategy helps entrench the already dominant positions of large tech companies, it’s hard to see how that fits with the underlying ethos of open source.

“We see openness as one of the tools to challenge the concentration of power,” says Warso. “If the definition is supposed to help in challenging these concentrations of power, then the question of data becomes even more important.”

Shaposhnik thinks a compromise is possible. A significant amount of data used to train the largest models already comes from open repositories like Wikipedia or Common Crawl, which scrapes data from the web and shares it freely. Companies could simply share the open resources used to train their models, he says, making it possible to recreate a reasonable approximation that should allow people to study and understand models.

The lack of clarity regarding whether training on art or writing scraped from the internet infringes on the creator’s property rights can cause legal complications though, says Aviya Skowron, head of policy and ethics at the nonprofit AI research group EleutherAI, also involved in the OSI process. That makes developers wary of being open about their data.

Stefano Zacchiroli, a professor of computer science at the Polytechnic Institute of Paris who is also contributing to the OSI definition, appreciates the need for pragmatism. His personal view is that a full description of a model’s training data is the bare minimum for it to be described as open source, but he recognizes that stricter definitions of open-source AI might not have broad appeal.

Ultimately, the community needs to decide what it’s trying to achieve, says Zacchiroli: “Are you just following where the market is going so that they don’t essentially co-opt the term ‘open-source AI,’ or are you trying to pull the market toward being more open and providing more freedoms to the users?”

What’s the point of open source?

It’s debatable how much any definition of open-source AI will level the playing field anyway, says Sarah Myers West, co–executive director of the AI Now Institute. She coauthored a paper published in August 2023 exposing the lack of openness in many open-source AI projects. But it also highlighted that the vast amounts of data and computing power needed to train cutting-edge AI create deeper structural barriers for smaller players, no matter how open models are.

Myers West thinks there’s also a lack of clarity regarding what people hope to achieve by making AI open source. “Is it safety, is it the ability to conduct academic research, is it trying to foster greater competition?” she asks. “We need to be way more precise about what the goal is, and then how opening up a system changes the pursuit of that goal.”

The OSI seems keen to avoid those conversations. The draft definition mentions autonomy and transparency as key benefits, but Maffulli demurred when pressed to explain why the OSI values those concepts. The document also contains a section labeled “out of scope issues” that makes clear the definition won’t wade into questions around “ethical, trustworthy, or responsible” AI.

Maffulli says historically the open-source community has focused on enabling the frictionless sharing of software and avoided getting bogged down in debates about what that software should be used for. “It’s not our job,” he says.

But those questions can’t be dismissed, says Warso, no matter how hard people have tried over the decades. The idea that technology is neutral and that topics like ethics are “out of scope” is a myth, she adds. She suspects it’s a myth that needs to be upheld to prevent the open-source community’s loose coalition from fracturing. “I think people realize it’s not real [the myth], but we need this to move forward,” says Warso.

Beyond the OSI, others have taken a different approach. In 2022, a group of researchers introduced Responsible AI Licenses (RAIL), which are similar to open-source licenses but include clauses that can restrict specific use cases. The goal, says Danish Contractor, an AI researcher who co-created the license, is to let developers prevent their work from being used for things they consider inappropriate or unethical.

“As a researcher, I would hate for my stuff to be used in ways that would be detrimental,” he says. And he’s not alone: a recent analysis he and colleagues conducted on AI startup Hugging Face’s popular model-hosting platform found that 28% of models use RAIL.

The license Google attached to its Gemma models follows a similar approach. Its terms of use list various prohibited use cases considered “harmful,” which reflects its “commitment to developing AI responsibly,” the company said in a recent blog post.

The Allen Institute for AI has also developed its own take on open licensing. Its ImpACT Licenses restrict redistribution of models and data based on their potential risks.

Given how different AI is from conventional software, some level of experimentation with different degrees of openness is inevitable and probably good for the field, says Luis Villa, cofounder and legal lead at open-source software management company Tidelift. But he worries that a proliferation of “open-ish” licenses that are mutually incompatible could negate the frictionless collaboration that made open source so successful, slowing down innovation in AI, reducing transparency, and making it harder for smaller players to build on each other’s work.

Ultimately, Villa thinks the community needs to coalesce around a single standard, otherwise industry will simply ignore it and decide for itself what “open” means. He doesn’t envy the OSI’s job, though. When it came up with the open-source software definition it had the luxury of time and little outside scrutiny. Today, AI is firmly in the crosshairs of both big business and regulators.

But if the open-source community can’t settle on a definition, and quickly, someone else will come up with one that suits their own needs. “They’re going to fill that vacuum,” says Villa. “Mark Zuckerberg is going to tell us all what he thinks ‘open’ means, and he has a very big megaphone.”
 

Large language models can help home robots recover from errors without human help

Brian Heater @bheater / 4:01 PM EDT • March 25, 2024

LLM MIT Robot

Image Credits: MIT

There are countless reasons why home robots have found little success post-Roomba. Pricing, practicality, form factor and mapping have all contributed to failure after failure. Even when some or all of those are addressed, there remains the question of what happens when a system makes an inevitable mistake.

This has been a point of friction on the industrial level, too, but big companies have the resources to address problems as they arise. We can’t, however, expect consumers to learn to program or hire someone who can help any time an issue arises. Thankfully, this is a great use case for large language models (LLMs) in the robotics space, as exemplified by new research from MIT.

A study set to be presented at the International Conference on Learning Representations (ICLR) in May purports to bring a bit of “common sense” into the process of correcting mistakes.

“It turns out that robots are excellent mimics,” the school explains. “But unless engineers also program them to adjust to every possible bump and nudge, robots don’t necessarily know how to handle these situations, short of starting their task from the top.”

Traditionally, when a robot encounters issues, it will exhaust its pre-programmed options before requiring human intervention. This is a particular challenge in an unstructured environment like a home, where any number of changes to the status quo can adversely impact a robot’s ability to function.

Researchers behind the study note that while imitation learning (learning to do a task through observation) is popular in the world of home robotics, it often can’t account for the countless small environmental variations that can interfere with regular operation, thus requiring a system to restart from square one. The new research addresses this, in part, by breaking demonstrations into smaller subsets, rather than treating them as part of a continuous action.

This is where LLMs enter the picture, eliminating the requirement for the programmer to label and assign the numerous subactions manually.

“LLMs have a way to tell you how to do each step of a task, in natural language. A human’s continuous demonstration is the embodiment of those steps, in physical space,” says grad student Tsun-Hsuan Wang. “And we wanted to connect the two, so that a robot would automatically know what stage it is in a task, and be able to replan and recover on its own.”

The particular demonstration featured in the study involves training a robot to scoop marbles and pour them into an empty bowl. It’s a simple, repeatable task for humans, but for robots, it’s a combination of various small tasks. The LLMs are capable of listing and labeling these subtasks. In the demonstrations, researchers sabotaged the activity in small ways, like bumping the robot off course and knocking marbles out of its spoon. The system responded by self-correcting the small tasks, rather than starting from scratch.

“With our method, when the robot is making mistakes, we don’t need to ask humans to program or give extra demonstrations of how to recover from failures,” Wang adds.
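The difference between resuming from a labeled subtask and restarting from the top can be shown with a small Python sketch. This is not the MIT system and involves no LLM or robot: the subtask names, the `run_task` function, and the way a disturbance is modeled are all hypothetical, chosen only to illustrate why knowing "what stage it is in a task" saves work.

```python
# Hypothetical sketch: subtask names and the disturbance model are invented.

SUBTASKS = ["reach", "scoop", "transport", "pour"]


def run_task(subtasks, disturbances, resume_from_subtask=True):
    """Execute subtasks in order and return the log of executed subtasks.

    `disturbances` maps a subtask name to the subtask the robot must return
    to after being knocked off course there. Each disturbance fires once.
    """
    pending = dict(disturbances)
    log = []
    i = 0
    while i < len(subtasks):
        name = subtasks[i]
        log.append(name)
        if name in pending:
            recover_to = pending.pop(name)
            # Either jump back to the affected subtask or restart everything.
            i = subtasks.index(recover_to) if resume_from_subtask else 0
            continue
        i += 1
    return log


# Marbles are knocked out of the spoon during transport, so the robot
# has to scoop again before it can continue.
bump = {"transport": "scoop"}

with_recovery = run_task(SUBTASKS, bump, resume_from_subtask=True)
from_scratch = run_task(SUBTASKS, bump, resume_from_subtask=False)
```

With subtask-level recovery the log repeats only the scoop and transport steps, while the restart policy repeats the reach step as well. The gap grows with the length of the task and the number of disturbances.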

It’s a compelling method to help one avoid completely losing their marbles.
 