bnew


Google Gemma: because Google doesn’t want to give away Gemini yet​


Gemma 2B and Gemma 7B are smaller open-source AI models for language tasks in English.​


By Emilia David, a reporter who covers AI. Prior to joining The Verge, she covered the intersection between technology, finance, and the economy.

Feb 21, 2024, 8:00 AM EST



Google’s new model Gemma.
Image: Google

Google has released Gemma 2B and 7B, a pair of open-source AI models that let developers use the research that went into its flagship Gemini more freely. While Gemini is a big closed AI model that directly competes with (and is nearly as powerful as) OpenAI’s ChatGPT, the lightweight Gemma will likely be suitable for smaller tasks like simple chatbots or summarizations.

But what these models lack in complication, they may make up for in speed and cost of use. Despite their smaller size, Google claims Gemma models “surpass significantly larger models on key benchmarks” and are “capable of running directly on a developer laptop or desktop computer.” They will be available via Kaggle, Hugging Face, Nvidia’s NeMo, and Google’s Vertex AI.

Gemma’s release into the open-source ecosystem is starkly different from how Gemini was released. While developers can build on Gemini, they do that either through APIs or by working on Google’s Vertex AI platform. Gemini is considered a closed AI model. By making Gemma open source, more people can experiment with Google’s AI rather than turn to competitors that offer better access.

Both model sizes will be available with a commercial license regardless of organization size, number of users, and the type of project. However, Google — like other companies — often prohibits its models from being used for specific tasks such as weapons development programs.

Gemma will also ship with “responsible AI toolkits,” since it is harder to build guardrails into open models than into closed systems like Gemini. Tris Warkentin, product management director at Google DeepMind, said the company performed “more extensive red-teaming to Gemma because of the inherent risks involved with open models.”

The responsible AI toolkit will let developers create their own guidelines or a banned-word list when deploying Gemma in their projects. It also includes a model debugging tool that lets users investigate Gemma’s behavior and correct issues.
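Google hasn’t published the toolkit’s interfaces here, but a banned-word guardrail generally looks something like the following minimal Python sketch. Names like `guarded_generate` are hypothetical illustrations, not part of Google’s toolkit.

```python
# Hypothetical sketch of a banned-word guardrail; this is NOT the API of
# Google's toolkit, just an illustration of post-filtering model output
# against a developer-supplied deny list.
BANNED_WORDS = {"example_banned_term", "another_banned_term"}  # placeholders

def violates_policy(text: str) -> bool:
    """True if the generated text contains any banned word."""
    lowered = text.lower()
    return any(word in lowered for word in BANNED_WORDS)

def guarded_generate(generate_fn, prompt: str, fallback: str = "[response withheld]") -> str:
    """Wrap any generate_fn(prompt) -> str callable with a deny-list check."""
    response = generate_fn(prompt)
    return fallback if violates_policy(response) else response
```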

The models work best for language-related tasks in English for now, according to Warkentin. “We hope we can build with the community to address market needs outside of English-language tasks,” he told reporters.

Developers can use Gemma for free on Kaggle, and first-time Google Cloud users get $300 in credits to use the models. The company said researchers can apply for up to $500,000 in cloud credits.

While it’s not clear how much of a demand there is for smaller models like Gemma, other AI companies have released lighter-weight versions of their flagship foundation models, too. Meta put out Llama 2 7B, the smallest iteration of Llama 2, last year. Gemini itself comes in several weights, including Gemini Nano, Gemini Pro, and Gemini Ultra, and Google recently announced a faster Gemini 1.5 — again, for business users and developers for now.

Gemma, by the way, is Latin for “precious stone.”
 

bnew


Gemma: Introducing new state-of-the-art open models​


Feb 21, 2024
3 min read

Gemma is built for responsible AI development from the same research and technology used to create Gemini models.

Jeanine Banks
VP & GM, Developer X and DevRel

Tris Warkentin
Director, Google DeepMind

[Image: The word “Gemma” and a spark icon with blueprint styling in a blue gradient against a black background]

At Google, we believe in making AI helpful for everyone. We have a long history of contributing innovations to the open community, such as with Transformers, TensorFlow, BERT, T5, JAX, AlphaFold, and AlphaCode. Today, we’re excited to introduce a new generation of open models from Google to assist developers and researchers in building AI responsibly.


Gemma open models​

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide responsible use of Gemma models.

Gemma is available worldwide, starting today. Here are the key details to know:




State-of-the-art performance at size​

Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs. See the technical report for details on performance, dataset composition, and modeling methodologies.
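For a sense of what “running directly on a developer laptop” looks like in practice, here is a minimal sketch using Hugging Face Transformers. It assumes the `google/gemma-2b` model id and an accepted license on Hugging Face; see Google’s quickstart guides for the supported recipes.

```python
# Minimal sketch: running Gemma 2B locally with Hugging Face Transformers.
# Assumes the "google/gemma-2b" model id and an accepted model license.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```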

[Chart: Gemma performance on common benchmarks, compared to Llama-2 7B and 13B]
 

bnew


Responsible by design​

Gemma is designed with our AI Principles at the forefront. As part of making Gemma pre-trained models safe and reliable, we used automated techniques to filter out certain personal information and other sensitive data from training sets. Additionally, we used extensive fine-tuning and reinforcement learning from human feedback (RLHF) to align our instruction-tuned models with responsible behaviors. To understand and reduce the risk profile for Gemma models, we conducted robust evaluations including manual red-teaming, automated adversarial testing, and assessments of model capabilities for dangerous activities. These evaluations are outlined in our Model Card.


We’re also releasing a new Responsible Generative AI Toolkit together with Gemma to help developers and researchers prioritize building safe and responsible AI applications. The toolkit includes:


  • Safety classification: We provide a novel methodology for building robust safety classifiers with minimal examples (an illustrative sketch follows this list).
  • Debugging: A model debugging tool helps you investigate Gemma's behavior and address potential issues.
  • Guidance: You can access best practices for model builders based on Google’s experience in developing and deploying large language models.
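Google hasn’t detailed the classifier methodology here, so the sketch below is only an illustrative stand-in: a few-shot safety filter that embeds a handful of labeled examples (via the `sentence-transformers` library) and classifies new text by similarity to class centroids.

```python
# Hedged sketch of a few-shot safety classifier: embed a few labeled
# examples and classify new text by cosine similarity to class centroids.
# This is an illustrative stand-in, not the toolkit's actual methodology.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

SAFE = ["How do I bake bread?", "Recommend a good sci-fi novel."]
UNSAFE = ["How do I pick a lock to break in?", "Write an insult about my coworker."]

safe_centroid = encoder.encode(SAFE).mean(axis=0)
unsafe_centroid = encoder.encode(UNSAFE).mean(axis=0)

def cos(a, b) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_unsafe(text: str) -> bool:
    v = encoder.encode([text])[0]
    return cos(v, unsafe_centroid) > cos(v, safe_centroid)

print(is_unsafe("Suggest a birthday gift"))  # expected: False
```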


Optimized across frameworks, tools and hardware​

You can fine-tune Gemma models on your own data to adapt to specific application needs, such as summarization or retrieval-augmented generation (RAG). Gemma supports a wide variety of tools and systems (a minimal fine-tuning sketch follows the list):


  • Multi-framework tools: Bring your favorite framework, with reference implementations for inference and fine-tuning across multi-framework Keras 3.0, native PyTorch, JAX, and Hugging Face Transformers.
  • Cross-device compatibility: Gemma models run across popular device types, including laptop, desktop, IoT, mobile and cloud, enabling broadly accessible AI capabilities.
  • Cutting-edge hardware platforms: We’ve partnered with NVIDIA to optimize Gemma for NVIDIA GPUs, from data center to the cloud to local RTX AI PCs, ensuring industry-leading performance and integration with cutting-edge technology.
  • Optimized for Google Cloud: Vertex AI provides a broad MLOps toolset with a range of tuning options and one-click deployment using built-in inference optimizations. Advanced customization is available with fully-managed Vertex AI tools or with self-managed GKE, including deployment to cost-efficient infrastructure across GPU, TPU, and CPU from either platform.
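As a concrete starting point for the fine-tuning workflow above, here is a minimal sketch using Hugging Face Transformers with LoRA adapters from the `peft` library. It is one reasonable recipe, not Google’s reference implementation; it assumes the `google/gemma-2b` model id, and the `target_modules` names should be verified against the loaded model.

```python
# Hedged sketch: parameter-efficient fine-tuning of Gemma with LoRA adapters.
# Data loading and the training loop are omitted; this only shows the setup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-2b"  # assumes an accepted license on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Train small low-rank adapters instead of all ~2B base weights.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # adapters are typically <1% of parameters
```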

Free credits for research and development​

Gemma is built for the open community of developers and researchers powering AI innovation. You can start working with Gemma today using free access in Kaggle, a free tier for Colab notebooks, and $300 in credits for first-time Google Cloud users. Researchers can also apply for Google Cloud credits of up to $500,000 to accelerate their projects.

Getting started​

You can explore more about Gemma and access quickstart guides on ai.google.dev/gemma.

As we continue to expand the Gemma model family, we look forward to introducing new variants for diverse applications. Stay tuned for events and opportunities in the coming weeks to connect, learn and build with Gemma.

We’re excited to see what you create!
 

bnew






Computer Science > Computation and Language​

[Submitted on 15 Feb 2024]

Data Engineering for Scaling Language Models to 128K Context​

Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng
We study the continual pretraining recipe for scaling language models' context lengths to 128K, with a focus on data engineering. We hypothesize that long context modeling, in particular the ability to utilize information at arbitrary input locations, is a capability that is mostly already acquired through large-scale pretraining, and that this capability can be readily extended to contexts substantially longer than seen during training (e.g., 4K to 128K) through lightweight continual pretraining on an appropriate data mixture. We investigate the quantity and quality of the data for continual pretraining: (1) for quantity, we show that 500 million to 5 billion tokens are enough to enable the model to retrieve information anywhere within the 128K context; (2) for quality, our results equally emphasize domain balance and length upsampling. Concretely, we find that naively upsampling longer data on certain domains like books, a common practice of existing work, gives suboptimal performance, and that a balanced domain mixture is important. We demonstrate that continual pretraining of the full model on 1B-5B tokens of such data is an effective and affordable strategy for scaling the context length of language models to 128K. Our recipe outperforms strong open-source long-context models and closes the gap to frontier models like GPT-4 128K.
Comments: Code at this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2402.10171 [cs.CL]

Submission history

From: Yao Fu [view email]
[v1] Thu, 15 Feb 2024 18:19:16 UTC (1,657 KB)



About​

Implementation of paper Data Engineering for Scaling Language Models to 128K Context
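The paper’s two central data-engineering ideas, domain balance and length upsampling, can be sketched in a few lines. The code below is an illustrative reading of the recipe, not the authors’ implementation; field names and thresholds are placeholders.

```python
# Illustrative reading of the recipe: keep the original domain balance, but
# upsample long documents *within* each domain.
import random
from collections import defaultdict

def sample_mixture(docs, n_samples, long_threshold=32_000, long_boost=4.0, seed=0):
    """docs: iterable of dicts like {"domain": str, "tokens": int, "text": str}."""
    rng = random.Random(seed)
    by_domain = defaultdict(list)
    for d in docs:
        by_domain[d["domain"]].append(d)

    per_domain = n_samples // len(by_domain)  # balanced across domains
    mixture = []
    for pool in by_domain.values():
        # Long documents get a higher sampling weight (length upsampling).
        weights = [long_boost if d["tokens"] >= long_threshold else 1.0 for d in pool]
        mixture.extend(rng.choices(pool, weights=weights, k=per_domain))
    return mixture
```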
 

bnew





Text-to-Image with SDXL Lightning ⚡


This demo utilizes the SDXL-Lightning model by ByteDance, which is a fast text-to-image generative model capable of producing high-quality images in 4 steps. As a community effort, this demo was put together by AngryPenguin. Link to model: https://huggingface.co/ByteDance/SDXL-Lightning
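The sketch below, adapted from the model card’s published usage, loads the 4-step UNet checkpoint into a standard SDXL pipeline with `diffusers`. Checkpoint filenames and defaults may change, so verify against the card before relying on it.

```python
# Sketch adapted from the SDXL-Lightning model card: load the 4-step
# distilled UNet into an SDXL pipeline and generate in 4 steps.
import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_unet.safetensors"  # 4-step distilled UNet

unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(
    base, unet=unet, torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# The distilled model expects "trailing" timesteps and no classifier-free guidance.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
pipe("a cat wearing a spacesuit", num_inference_steps=4, guidance_scale=0).images[0].save("output.png")
```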

 

bnew












Quoted post:
Mind officially blown:

I recorded a screen capture of a task (looking for an apartment on Zillow). Gemini was able to generate Selenium code to replicate that task, and described everything I did step-by-step.

It even caught that my threshold was set to $3K, even though I didn't explicitly select it. 🤯🔥

"This code will open a Chrome browser, navigate to Zillow, enter "Cupertino, CA" in the search bar, click on the "For Rent" tab, set the price range to "Up to $3K", set the number of bedrooms to "2+", select the "Apartments/Condos/Co-ops" checkbox, click on the "Apply" button, wait for the results to load, print the results, and close the browser."

 

bnew


Stable Diffusion 3​

22 Feb


Prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says "Stable Diffusion 3" made out of colorful energy


Announcing Stable Diffusion 3 in early preview, our most capable text-to-image model with greatly improved performance in multi-subject prompts, image quality, and spelling abilities.

While the model is not yet broadly available, today, we are opening the waitlist for an early preview. This preview phase, as with previous models, is crucial for gathering insights to improve its performance and safety ahead of an open release. You can sign up to join the waitlist here.



The Stable Diffusion 3 suite of models currently ranges from 800M to 8B parameters. This approach aims to align with our core values and democratize access, providing users with a variety of options for scalability and quality to best meet their creative needs. Stable Diffusion 3 combines a diffusion transformer architecture and flow matching. We will publish a detailed technical report soon.
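Stability hasn’t published SD3’s training details yet, but the flow-matching objective it references can be illustrated with a toy PyTorch sketch using a straight noise-to-data path; the real model and schedule will differ.

```python
# Toy sketch of conditional flow matching with a straight (linear)
# noise-to-data path; purely illustrative, not SD3's actual recipe.
import torch

def flow_matching_loss(model, x1):
    """x1: a batch of data; model(x_t, t) predicts the velocity field."""
    x0 = torch.randn_like(x1)                              # noise endpoint
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)))   # times in [0, 1]
    xt = (1 - t) * x0 + t * x1                             # point on the path
    target = x1 - x0                                       # dx_t/dt for this path
    return torch.mean((model(xt, t) - target) ** 2)

toy_model = lambda xt, t: xt  # stand-in for the real network
print(flow_matching_loss(toy_model, torch.randn(8, 3, 16, 16)))
```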



We believe in safe, responsible AI practices. This means we have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors. Safety starts when we begin training our model and continues throughout the testing, evaluation, and deployment. In preparation for this early preview, we’ve introduced numerous safeguards. By continually collaborating with researchers, experts, and our community, we expect to innovate further with integrity as we approach the model’s public release.



Our commitment to ensuring generative AI is open, safe, and universally accessible remains steadfast. With Stable Diffusion 3, we strive to offer adaptable solutions that enable individuals, developers, and enterprises to unleash their creativity, aligning with our mission to activate humanity’s potential.

If you’d like to explore using one of our other image models for commercial use prior to the Stable Diffusion 3 release, please visit our Stability AI Membership page to self host or our Developer Platform to access our API.

To stay updated on our progress follow us on Twitter, Instagram, LinkedIn, and join our Discord Community.












 

bnew


‘Slow Horses’ & ‘One Life’ Director Predicts A Show Made Entirely By Generative AI Is Only Three-To-Five Years Away​


By Max Goldbart, International TV Co-Editor (@Goldbart1)

February 21, 2024 3:05am

[Image: Slow Horses. Apple TV+]

A TV series made entirely by generative AI is only three-to-five years away, according to the director of hits including Slow Horses and Anthony Hopkins movie One Life.


After the BBC canceled the long-running drama Doctors, James Hawes spoke with legal teams at SAG-AFTRA and the WGA and polled fellow directors and VFX workers on the likelihood of a fully AI-made series, he revealed today.

“The best guess was three to five years,” he told the British Film & High-End TV Inquiry. “Someone will say, ‘Create a scene in an ER room where a doctor comes in and he’s having an affair with a woman and they’re flirting, and someone is dying on the table,’ and [AI] will start to create it. Maybe it won’t be as polished as we are used to but that is how close we are getting.”


Hawes, who is also vice chair of Directors UK, raised concerns that shows with AI so central to their creation will have an impact on “vital training grounds” for below-the-line staffers making their way through the industry, citing Doctors.

He simultaneously acknowledged the “genie is out the bottle” on AI and said the UK should work to catch up with the likes of the U.S., pointing to last week’s launch of OpenAI’s Sora, which can generate video scenes from text.

“My worry is that if we don’t get up to speed with this then the AI-generated stories will come from elsewhere,” he added. “We need to take note and act on it now. Silicon Valley is way ahead.”

The U.S. writers and actors guilds secured guardrails around the use of AI in their contracts with the AMPTP following lengthy and messy negotiations last year, and Hawes said the U.S. DGA sits down with members and studios every few months to discuss the issue. In the UK, artificial intelligence is set to play a major role in the actors’ union’s upcoming negotiations with broadcasters and producers.

Hawes urged British stakeholders across the board to be cognisant of the debate. Prior to the inquiry hearing, he joked that he had asked ChatGPT to come up with the questions he would be asked, and it had been “very accurate.”

Hawes stressed that there is no replacement for the spontaneity of non-AI production and cited an example of Anthony Hopkins playing a piano on the set of One Life, which was introduced into the scene after wowing those working on the show.

‘Slow Horses’ deemed “too quirky”


James Hawes. Image: Maria Moratti/Getty Images

During a wide-ranging session, Hawes also revealed that Slow Horses was rejected by some British broadcasters and was initially deemed “too quirky and British” for Apple TV+.

“They wondered whether it would travel even though we have the spy genre reputation [in the UK],” he added. “The attachment of Gary Oldman and subsequent success shows ‘quirky British’ can travel and it is now the longest running series on Apple.”

Speaking to the decline in the indie movie and TV sector, he echoed comments made earlier this week by Bectu boss Philippa Childs that the UK has become too reliant on inward investment.

“There are downsides [to inward investment] because it has inflated costs and therefore domestic production is finding it hard to compete for the best practitioners,” he added. “It’s been very busy out there, although not right now. That has given North America confidence in what we are doing.”

Having made the transition from high-end TV to movies, Hawes also said there is no longer a “sniffiness” from film execs towards TV directors.

The inquiry is spotlighting the state of British film and high-end TV, which is being overseen by the UK’s Culture, Media & Sport Committee, examining issues such as financing, tax credits and diversity.

Last month, Bend it Like Beckham director Gurinder Chadha revealed to the inquiry she is making a Christmas movie about an Indian Ebenezer Scrooge set in London, financed by Zygi Kamasa’s new UK distributor True Brit.

Later today, Ken Loach indie boss Rebecca O’Brien and the heads of Film4 and BBC Film will appear.
 

↓R↑LYB



Researcher Jim Fan presents the next grand challenge in the quest for AI: the "foundation agent," which would seamlessly operate across both the virtual and physical worlds. He explains how this technology could fundamentally change our lives — permeating everything from video games and metaverses to drones and humanoid robots — and explores how a single model could master skills across these different realities.
 

bnew


Microsoft develops its own networking gear for AI datacenters: Report​

News

By Anton Shilov

published 1 day ago

Juniper Networks and Fungible founder spearheads development.

[Image: Data center network connections. Image credit: Shutterstock]

After revealing its own 128-core datacenter CPU and Maia 100 GPU for artificial intelligence workloads, Microsoft has begun development of its own networking card in a bid to decrease its reliance on Nvidia's hardware and speed up its datacenters, reports The Information. If the company succeeds, it could then proceed to optimize its Azure infrastructure and diversify its technology stack. Interestingly, the company has indirectly confirmed the effort.

Microsoft acquired Fungible, a developer of data processing units (DPUs) that competed against AMD's Pensando and Nvidia's Mellanox divisions, about a year ago. That means the company clearly has the networking technologies and IP that it needs to design datacenter-grade networking gear suitable for bandwidth-hungry AI training workloads. Pradeep Sindhu, a co-founder of Juniper Networks and founder of Fungible who has a wealth of experience in networking gear, now works at Microsoft and is heading the development of the company's datacenter networking processors.

The new networking card is expected to improve the performance and efficiency of Microsoft's Azure servers, which currently run Intel CPUs and Nvidia GPUs but will eventually also adopt Microsoft's own CPUs and GPUs. The Information claims the project is important enough to Microsoft that Satya Nadella, the company's CEO, appointed Sindhu to it himself.

"As part of our systems approach to Azure infrastructure, we are focused on optimizing every layer of our stack," a Microsoft spokesperson told The Information. "We routinely develop new technologies to meet the needs of our customers, including networking chips."

High-performance networking gear is crucial for datacenters, especially when handling the massive amount of data required for AI training by clients like OpenAI. By alleviating network traffic jams, the new server component could accelerate the development of AI models, making the process faster and more cost-effective.

Microsoft's move is in line with the industry trend toward custom silicon, as other cloud providers including Amazon Web Services (AWS) and Google are also developing their own AI and general-purpose processors and datacenter networking gear.

The introduction of Microsoft's networking card could potentially impact Nvidia's sales of server networking gear, which is projected to generate over $10 billion per year. If successful, the card could significantly improve the efficiency of Azure datacenters in general and OpenAI's model training in particular, as well as reduce the time and costs associated with AI development, the report claims.

Custom silicon can take a significant amount of time to design and manufacture, which means the initial results of this endeavor could still be years away. In the short term, Microsoft will continue to rely on hardware from other vendors, but that may change in the coming years.
 

↓R↑LYB

@bnew props for keeping this thread going breh, you're the only one holding it down :salute:

I don't think people truly understand what's about to happen in the next few years with AI. I'm trying to really get caught up to speed on how to utilize these tools because this shyt appears to be more game changing than the internet.

When you include emerging technologies like Nvidia Omniverse, self operating computers, and foundation agents humans are about to enter a new era of productivity.
 

bnew











Magic AI Secures $117 Million to Build an AI Software Engineer​

The startup is carving out a niche by focusing on developing an AI software engineer capable of assisting with complex coding tasks and that will act more as a coworker than merely a "copilot" tool.




CHRIS MCKAY (https://www.maginative.com/author/chris/)

FEBRUARY 16, 2024 • 2 MIN READ


San Francisco-based startup Magic AI has raised $117 million in Series B funding to further develop its advanced AI system aimed at automating software development.

The round was led by Nat Friedman and Daniel Gross’s NFDG Ventures, with additional participation from CapitalG and Elad Gil. This brings Magic’s total funding to date to over $145 million.

Founded in 2022 by Eric Steinberger and Sebastian De Ro, the startup is carving out a niche by focusing on developing an AI software engineer capable of assisting with complex coding tasks and that will act more as a coworker than merely a "copilot" tool.

The founders believe that in addition to boosting practical coding productivity, advancing intelligent code generation tools can also provide a path toward more expansive artificial general intelligence. Their vision even extends to the creation of broadly capable AGI systems that align with human values - ones able to accelerate global progress by assisting with humanity's most complex challenges. Their $23 million Series A round last summer was a major step towards this ambitious mission.

Central to Magic's technical strategy is handling exceptionally large context windows. Last year, the company unveiled its Long-term Memory Network (LTM Net) architecture and the corresponding LTM-1 model with a 5 million token context window.

For perspective, most language models operate on far more limited contexts, commonly less than 32k tokens. OpenAI's powerful GPT-4 Turbo has a 128k-token context window, and Anthropic's Claude 2.1 offers 200k.

However, models with much larger context windows are on the horizon. Just yesterday, Google announced that its new Gemini 1.5 model will have a 1 million token context window, and shared that it has tested context lengths of up to 10 million tokens in research.

The substantially larger context capacities allow for more nuanced code comprehension, enabling Magic's model to reason over entire repositories and dependency trees to boost usefulness.
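To make the context-budget arithmetic concrete, here is a rough sketch that counts a repository’s tokens using OpenAI’s `cl100k_base` tokenizer (via `tiktoken`) as a generic proxy and checks the total against the windows mentioned above; Magic’s own tokenizer and counting will differ.

```python
# Rough sketch: estimate whether a repository fits in a given context window.
import pathlib
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def repo_tokens(root: str, exts=(".py", ".md", ".txt")) -> int:
    total = 0
    for path in pathlib.Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            total += len(enc.encode(path.read_text(errors="ignore")))
    return total

n = repo_tokens(".")
for name, window in [("GPT-4 Turbo", 128_000), ("Claude 2.1", 200_000), ("LTM-1", 5_000_000)]:
    print(f"{name}: {'fits' if n <= window else 'exceeds window'} ({n:,} vs {window:,} tokens)")
```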

The startup continues to operate in stealth mode with few public demos, but claims to already have thousands of GPUs deployed toward training its next generation models.

Now, with fresh funding in hand, talent recruitment and retention is clearly top of mind for Magic AI’s leadership. The startup is actively seeking talented individuals who share its vision of integrity and innovation and is placing a strong emphasis on cultivating a supportive culture rooted in passion.

With towering investor confidence and transformative ambitions, Magic AI remains one of the most intriguing AI startups amid a landscape filled with heated competition. Still, Magic believes its technical approach focused on extreme model scale and novel neural architectures sets it apart.
 