bnew

Veteran
Joined
Nov 1, 2015
Messages
56,193
Reputation
8,249
Daps
157,873
May 6, 2024

API Partnership with Stack Overflow

Stack Overflow and OpenAI today announced a new API partnership that will empower developers with the collective strengths of the world’s leading knowledge platform for highly technical content and the world’s most popular LLM models for AI development.


Editor’s Note: This news was originally shared by Stack Overflow here.

OpenAI and Stack Overflow are coming together via OverflowAPI access to provide OpenAI users and customers with the accurate and vetted data foundation that AI tools need to quickly find a solution to a problem so that technologists can stay focused on priority tasks. OpenAI will also surface validated technical knowledge from Stack Overflow directly in ChatGPT, giving users easy access to trusted, attributed, accurate, and highly technical knowledge and code backed by the millions of developers that have contributed to the Stack Overflow platform for 15 years.

As part of this collaboration:

  • OpenAI will utilize Stack Overflow’s OverflowAPI product and collaborate with Stack Overflow to improve model performance for developers who use their products. This integration will help OpenAI improve its AI models using enhanced content and feedback from the Stack Overflow community and provide attribution to the Stack Overflow community within ChatGPT to foster deeper engagement with content.
  • Stack Overflow will utilize OpenAI models as part of their development of OverflowAI and work with OpenAI to leverage insights from internal testing to maximize the performance of OpenAI models. OpenAI’s partnership with Stack Overflow will help further drive its mission to empower the world to develop technology through collective knowledge, as Stack Overflow will be able to create better products that benefit the Stack Exchange community’s health, growth, and engagement.

“Learning from as many languages, cultures, subjects, and industries as possible ensures that our models can serve everyone. The developer community is particularly important to both of us. Our deep partnership with Stack Overflow will help us enhance the user and developer experience on both our platforms,” said Brad Lightcap, COO at OpenAI.

“Stack Overflow is the world’s largest developer community, with more than 59 million questions and answers. Through this industry-leading partnership with OpenAI, we strive to redefine the developer experience, fostering efficiency and collaboration through the power of community, best-in-class data, and AI experiences,” said Prashanth Chandrasekar, CEO of Stack Overflow. “Our goal with OverflowAPI, and our work to advance the era of socially responsible AI, is to set new standards with vetted, trusted, and accurate data that will be the foundation on which technology solutions are built and delivered to our users.”

The first set of new integrations and capabilities between Stack Overflow and OpenAI will be available in the first half of 2024. Beyond this, OpenAI’s partnership with Stack Overflow will enable Stack Overflow to continue to reinvest in community-driven features. To learn more about Stack Overflow’s API solution and partnerships, visit https://stackoverflow.co/api-solutions/
 

bnew




[Submitted on 6 May 2024]

AlphaMath Almost Zero: process Supervision without process

Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan
Recent advancements in large language models (LLMs) have substantially enhanced their mathematical reasoning abilities. However, these models still struggle with complex problems that require multiple reasoning steps, frequently leading to logical or numerical errors. While numerical mistakes can largely be addressed by integrating a code interpreter, identifying logical errors within intermediate steps is more challenging. Moreover, manually annotating these steps for training is not only expensive but also demands specialized expertise. In this study, we introduce an innovative approach that eliminates the need for manual annotation by leveraging the Monte Carlo Tree Search (MCTS) framework to generate both the process supervision and evaluation signals automatically. Essentially, when a LLM is well pre-trained, only the mathematical questions and their final answers are required to generate our training data, without requiring the solutions. We proceed to train a step-level value model designed to improve the LLM's inference process in mathematical domains. Our experiments indicate that using automatically generated solutions by LLMs enhanced with MCTS significantly improves the model's proficiency in dealing with intricate mathematical reasoning tasks.
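The core trick — scoring an intermediate reasoning step by how often random completions from it reach the known final answer — can be sketched in a few lines. This is a toy illustration of the paper's idea, not its MCTS implementation; the arithmetic problem, the rollout function, and all names here are invented for the example.

```python
import random

def estimate_step_value(partial_steps, gold_answer, rollout_fn, n_rollouts=100, seed=0):
    """Monte-Carlo value of a partial solution: the fraction of random
    completions that end at the known final answer. Only the question's
    gold answer is needed -- no human step-level labels."""
    rng = random.Random(seed)
    hits = sum(rollout_fn(partial_steps, rng) == gold_answer for _ in range(n_rollouts))
    return hits / n_rollouts

# Invented toy problem: evaluate 3 + 4 * 2 (gold answer 11). A "rollout"
# finishes the solution and is usually faithful to the first step taken.
def toy_rollout(partial_steps, rng):
    if "multiply_first" in partial_steps:       # correct first step
        return 11 if rng.random() < 0.9 else 13
    return 14 if rng.random() < 0.9 else 11     # wrong first step: (3 + 4) * 2

good = estimate_step_value(["multiply_first"], 11, toy_rollout)
bad = estimate_step_value(["add_first"], 11, toy_rollout)
# The correct intermediate step gets a much higher value estimate,
# giving an automatic training signal for a step-level value model.
```

In the paper this signal is produced by MCTS rollouts from a pretrained LLM rather than a hand-written simulator, but the supervision-for-free structure is the same.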
Comments: Work in progress
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2405.03553 [cs.CL] (or arXiv:2405.03553v1 [cs.CL] for this version)

Submission history

From: Kai Fan
[v1] Mon, 6 May 2024 15:20:30 UTC (519 KB)



 

bnew



Should we slow down AI research? | Debate with Meta, IBM, FHI, FLI


Future of Life Institute
Shared May 7, 2024
Mark Brakel (FLI Director of Policy), Yann LeCun, Francesca Rossi, and Nick Bostrom debate: "Should we slow down research on AI?" at the World AI Cannes Festival in February 2024.
 

bnew


1/1
Granite Code released!
@IBM just released a family of 8 new open Code LLMs from 3B to 34B parameters trained on 116 programming languages and released under Apache 2.0. Granite 8B outperforms other open LLMs like CodeGemma or Mistral on benchmarks and supposedly supports COBOL!

TL;DR:
8 models (base + instruct) from 3B to 34B parameters
Based on the Llama architecture
Context from 2k (3B) to 8k (20B+) models
Trained with the @BigCodeProject Stack and GitHub data on 116 programming languages
Two-phase pretraining: code only (~4T tokens), then high-quality code + language (500B tokens)
34B model is depth upscaled (merged) from the 20B and further trained
Trained on IBM's Vela and Blue Vela supercomputers
Released under Apache 2.0
Available on @huggingface

Cleaned and filtered Datasets not released
No mention of decontamination in the paper

Paper: granite-code-models/paper.pdf at main · ibm-granite/granite-code-models
Models: Granite Code Models - a ibm-granite Collection


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
Introduction to Granite Code Models

We introduce the Granite series of decoder-only code models for code generative tasks (e.g., fixing bugs, explaining code, documenting code), trained with code written in 116 programming languages. A comprehensive evaluation of the Granite Code model family on diverse tasks demonstrates that our models consistently reach state-of-the-art performance among available open-source code LLMs.

The key advantages of Granite Code models include:

  • All-rounder Code LLM: Granite Code models achieve competitive or state-of-the-art performance on different kinds of code-related tasks, including code generation, explanation, fixing, editing, and translation, demonstrating their ability to solve diverse coding tasks.
  • Trustworthy Enterprise-Grade LLM: All our models are trained on license-permissible data collected following IBM's AI Ethics principles and guided by IBM’s Corporate Legal team for trustworthy enterprise usage. We release all our Granite Code models under an Apache 2.0 license for research and commercial use.


The family of Granite Code Models comes in two main variants:

  • Granite Code Base Models: base foundational models designed for code-related tasks (e.g., code repair, code explanation, code synthesis).
  • Granite Code Instruct Models: instruction following models finetuned using a combination of Git commits paired with human instructions and open-source synthetically generated code instruction datasets.


Both base and instruct models are available in sizes of 3B, 8B, 20B, and 34B parameters.

Data Collection

Our process to prepare code pretraining data involves several stages. First, we collect a combination of publicly available datasets (e.g., GitHub Code Clean, Starcoder data), public code repositories, and issues from GitHub. Second, we filter the code data collected based on the programming language in which data is written (which we determined based on file extension). Then, we also filter out data with low code quality. Third, we adopt an aggressive deduplication strategy that includes both exact and fuzzy deduplication to remove documents having (near) identical code content. Finally, we apply a HAP content filter that reduces models' likelihood of generating hateful, abusive, or profane language. We also make sure to redact Personally Identifiable Information (PII) by replacing PII content (e.g., names, email addresses, keys, passwords) with corresponding tokens (e.g., ⟨NAME⟩, ⟨EMAIL⟩, ⟨KEY⟩, ⟨PASSWORD⟩). We also scan all datasets using ClamAV to identify and remove instances of malware in the source code. In addition to collecting code data for model training, we curate several publicly available high-quality natural language datasets for improving the model’s proficiency in language understanding and mathematical reasoning.
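The deduplication stage described above can be sketched with standard techniques: exact dedup via a hash of normalized content, fuzzy dedup via shingle overlap. This is a minimal illustration under our own assumptions, not IBM's pipeline (which at this scale would use MinHash/LSH rather than pairwise comparison); every name below is invented.

```python
import hashlib

def normalize(code):
    # Light normalization so trivially reformatted copies hash identically.
    return "\n".join(line.strip() for line in code.splitlines() if line.strip())

def exact_key(code):
    return hashlib.sha256(normalize(code).encode()).hexdigest()

def jaccard(a, b, n=5):
    """Fuzzy similarity over character 5-gram shingles."""
    sa = {a[i:i + n] for i in range(max(1, len(a) - n + 1))}
    sb = {b[i:i + n] for i in range(max(1, len(b) - n + 1))}
    return len(sa & sb) / max(1, len(sa | sb))

docs = ["def add(a, b):\n    return a + b",
        "def add(a, b):\n        return a + b",   # whitespace-only copy
        "def mul(a, b):\n    return a * b"]
seen, kept = set(), []
for d in docs:
    k = exact_key(d)
    if k in seen:
        continue                                   # exact duplicate
    if any(jaccard(normalize(d), normalize(x)) > 0.8 for x in kept):
        continue                                   # near duplicate
    seen.add(k)
    kept.append(d)
# kept retains one copy of `add` and the distinct `mul` function
```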

Pretraining

The Granite Code Base models are trained on 3-4T tokens of code data and natural language datasets related to code. Data is tokenized via byte pair encoding (BPE), employing the same tokenizer as StarCoder. We utilize high-quality data with two phases of training as follows:

  • Phase 1 (code only training): During phase 1, the 3B and 8B models are trained on 4 trillion tokens of code data spanning 116 languages. The 20B parameter model is trained on 3 trillion tokens of code. The 34B model is trained on 1.4T tokens after depth upscaling, which is performed on the 1.6T checkpoint of the 20B model.
  • Phase 2 (code + language training): In phase 2, we include additional high-quality publicly available data from various domains, including technical, mathematics, and web documents, to further improve the model’s performance. We train all our models for 500B tokens (80% code-20% language mixture) in phase 2 training.


Instruction Tuning

Granite Code Instruct models are finetuned on the following types of instruction data: 1) code commits sourced from CommitPackFT; 2) high-quality math datasets, specifically MathInstruct and MetaMathQA; 3) code instruction datasets such as Glaive-Code-Assistant-v3, Self-OSS-Instruct-SC2, Glaive-Function-Calling-v2, NL2SQL11, and a small collection of synthetic API calling datasets; and 4) high-quality language instruction datasets such as HelpSteer and an open license-filtered version of Platypus.

Evaluation Results

We conduct an extensive evaluation of our code models on a comprehensive list of benchmarks that includes but is not limited to HumanEvalPack, MBPP, and MBPP+. This set of benchmarks encompasses different coding tasks across commonly used programming languages (e.g., Python, JavaScript, Java, Go, C++, Rust).

Our findings reveal that Granite Code models outperform strong open-source models across model sizes. The figure below illustrates how Granite-8B-Code-Base outperforms Mistral-7B, Llama-3-8B, and other open-source models on three coding tasks. We provide further evaluation results in our paper.




How to Use our Models?

To use any of our models, pick an appropriate model_path from:

  1. ibm-granite/granite-3b-code-base
  2. ibm-granite/granite-3b-code-instruct
  3. ibm-granite/granite-8b-code-base
  4. ibm-granite/granite-8b-code-instruct
  5. ibm-granite/granite-20b-code-base
  6. ibm-granite/granite-20b-code-instruct
  7. ibm-granite/granite-34b-code-base
  8. ibm-granite/granite-34b-code-instruct
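A minimal way to try one of these checkpoints is via the Hugging Face transformers library. This is a standard-usage sketch, not an IBM-documented snippet; it assumes transformers and torch are installed, and the first call downloads several GB of weights.

```python
def generate_completion(model_path, prompt, max_new_tokens=64):
    """Load a Granite Code checkpoint and complete a code prompt.
    Standard Hugging Face usage; imports are done lazily so the sketch
    can be read without transformers installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example (downloads the 3B base model on first run):
# print(generate_completion("ibm-granite/granite-3b-code-base", "def fibonacci(n):"))
```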

 

bnew


IBM Releases Open-Source Granite Code Models, Outperforms Llama 3

IBM has released four variations of the Granite code model, ranging in size from 3 to 34 billion parameters.



  • Published on May 7, 2024
  • by Siddharth Jindal

IBM has announced the release of a family of Granite code models to the open-source community, aiming to simplify coding for developers across various industries. The Granite code models are built to resolve the challenges developers face in writing, testing, debugging, and shipping reliable software.

IBM has released four variations of the Granite code model, ranging in size from 3 to 34 billion parameters. The models have been tested on a range of benchmarks and have outperformed other comparable models like Code Llama and Llama 3 in many tasks.

The models have been trained on a massive dataset of 500 million lines of code in over 50 programming languages. This training data has enabled the models to learn patterns and relationships in code, allowing them to generate code, fix bugs, and explain complex code concepts.

The Granite code models are designed to be used in a variety of applications, including code generation, debugging, and testing. They can also be used to automate routine tasks, such as generating unit tests and writing documentation. They cater to a wide range of coding tasks, including complex application modernisation and memory-constrained use cases.

“We believe in the power of open innovation, and we want to reach as many developers as possible,” said Ruchir Puri, chief scientist at IBM Research. “We’re excited to see what will be built with these models, whether that’s new code generation tools, state-of-the-art editing software, or anything in between.”

The models’ performance has been tested against various benchmarks, showcasing their prowess in code synthesis, fixing, explanation, editing, and translation across major programming languages like Python, JavaScript, Java, Go, C++, and Rust.


The Granite code models are available on Hugging Face, GitHub, watsonx.ai, and RHEL AI, and are released under the Apache 2.0 license.
 

bnew




1/1

CodeGemma 7B Gradio Demo



1/1

SURPRISE: Google just dropped CodeGemma 1.1 7B IT

The models get incrementally better at Single and Multi-generations.

Major boost in C#, Go, and Python.

Along with the 7B IT, they released an updated 2B base model too.

Enjoy!





1/1

I’ve been saying this for a while now.

Over the next couple of years, *most* fine-tuning work will become many-shot prompting.
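The tweet's claim is easy to make concrete: instead of fine-tuning on a labeled set, pack the whole set into a long-context prompt. A toy sketch (the task, labels, and example data are invented for illustration):

```python
def many_shot_prompt(examples, query,
                     instruction="Classify the sentiment as positive or negative."):
    """Pack many labeled examples into the context window instead of
    fine-tuning on them."""
    parts = [instruction]
    for text, label in examples:
        parts.append(f"Input: {text}\nLabel: {label}")
    parts.append(f"Input: {query}\nLabel:")  # model fills in the last label
    return "\n\n".join(parts)

# 100 in-context examples; 128k+ token context windows make this practical
# where fine-tuning was once required.
examples = [("I loved it", "positive"), ("Terrible service", "negative")] * 50
prompt = many_shot_prompt(examples, "The food was great")
```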







CodeGemma - an official Google release for code LLMs

Published April 9, 2024



Pedro Cuenca (pcuenq), Omar Sanseviero (osanseviero), Vaibhav Srivastav (reach-vb), Philipp Schmid (philschmid), Mishig Davaadorj (mishig), Loubna Ben Allal (loubnabnl)

CodeGemma is a family of open-access versions of Gemma specialized in code, and we’re excited to collaborate with Google on its release to make it as accessible as possible.🤗

CodeGemma comes in three flavors:

A 2B base model specialized in infilling and open-ended generation.

A 7B base model trained with both code infilling and natural language.

A 7B instruct model a user can chat with about code.

We’ve collaborated with Google to ensure the best integration into the Hugging Face ecosystem. You can find the three open-access models ready to use on the Hub. Among the features and integrations being released, we have:


Models on the Hub, with their model cards and licenses. There are versions for the transformers library, checkpoints for use with Google’s original codebases, and full-precision GGUF files that the community can quantize.

Transformers integration

Integration with Google Cloud

Integration with Inference Endpoints

Code benchmarks

Table of contents


What is CodeGemma

Evaluation Results

Prompt format

Using CodeGemma

Demo

Using Transformers

Integration with Google Cloud

Integration with Inference Endpoints

Additional Resources

What is CodeGemma?

CodeGemma is a family of code-specialist LLMs by Google, based on the pre-trained 2B and 7B Gemma checkpoints. The CodeGemma models are further trained on an additional 500 billion tokens of primarily English language data, mathematics, and code to improve logical and mathematical reasoning, and are suitable for code completion and generation.


CodeGemma 2B was trained exclusively on code infilling and is meant for fast code completion and generation, especially in settings where latency and/or privacy are crucial. The CodeGemma 7B training mix includes code infilling data (80%) and natural language; it can be used for code completion as well as code and language understanding and generation. CodeGemma 7B Instruct was fine-tuned for instruction following on top of CodeGemma 7B. It’s meant for conversational use, especially around code, programming, or mathematical reasoning topics. All the models have the same 8K token context size as their predecessors.
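For the infilling use case, the prompt boils down to wrapping the code around the gap in fill-in-the-middle (FIM) control tokens. A small sketch — the token names are those documented for CodeGemma, while the example function is invented:

```python
def fim_prompt(prefix, suffix):
    """Build a code-infilling prompt: the model generates the code
    that belongs between `prefix` and `suffix`."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = fim_prompt(
    prefix="def mean(values):\n    ",
    suffix="\n    return total / len(values)",
)
# The model's completion (e.g. computing `total`) fills the gap;
# generation should stop at the <|file_separator|> token.
```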




This image is from the original report

Evaluation Results

CodeGemma-7B outperforms similarly-sized 7B models except DeepSeek-Coder-7B on HumanEval, a popular benchmark for evaluating code models on Python. The same goes for the evaluation of other programming languages like Java, JavaScript, and C++ from MultiPL-E, a translation of HumanEval. According to the technical report, the model performs best on GSM8K among 7B models. The instruct version CodeGemma-7B-it improves on the most popular languages on both HumanEval and MBPP (cf paper table 5). For more details, you can check the BigCode leaderboard or some metrics below.
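For context, HumanEval numbers like these are pass@1 scores, conventionally computed with the unbiased pass@k estimator from the original Codex paper (the sample counts below are illustrative, not from the CodeGemma evaluation):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generated solutions, c of which are correct,
    passes the unit tests."""
    if n - c < k:
        return 1.0   # too few incorrect samples to fill k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 correct out of 20 generated samples gives pass@1 = 0.5
score = pass_at_k(20, 10, 1)
```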


Model | Pretraining size [tokens] | Python | JavaScript
10B+ models
StarCoder 2 15B | 4,000B+ | 44.15 | 44.24
Code Llama 13B | 2,500B | 35.07 | 38.26
7B models
DeepSeek Coder 7B | 2,000B | 45.83 | 45.9
CodeGemma 7B | 500B of extra training | 40.13 | 43.06
Code Llama 7B | 2,500B | 29.98 | 31.8
StarCoder 2 7B | 3,500B+ | 34.09 | 35.35
StarCoderBase 7B | 3,000B+ | 28.37 | 27.35
<3B models
CodeGemma 2B | 500B of extra training | 27.28 | 29.94
Stable Code 3B | 1,300B | 30.72 | 28.75
StarCoder 2 3B | 3,000B+ | 31.44 | 35.37

Model | Pretraining size [tokens] | Python | JavaScript
10B+ models
Code Llama 13B | 2,620B | 50.6 | 40.92
Code Llama 13B | 2,620B | 42.89 | 40.66
7B models
CodeGemma 7B | 500B | 52.74 | 47.71
Code Llama 7B | 2,620B | 40.48 | 36.34
Code Llama 7B | 2,620B | 25.65 | 33.11

Here is a table from the original report with a breakdown per language.


 

bnew


OpenAI's GPT-4 can exploit real vulnerabilities by reading security advisories


While some other LLMs appear to flat-out suck

Thomas Claburn

Wed 17 Apr 2024 // 10:15 UTC

AI agents, which combine large language models with automation software, can successfully exploit real world security vulnerabilities by reading security advisories, academics have claimed.

In a newly released paper, four University of Illinois Urbana-Champaign (UIUC) computer scientists – Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang – report that OpenAI's GPT-4 large language model (LLM) can autonomously exploit vulnerabilities in real-world systems if given a CVE advisory describing the flaw.

"To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description," the US-based authors explain in their paper. And yes, it is a very small sample, so be mindful of that going forward.

"When given the CVE description, GPT-4 is capable of exploiting 87 percent of these vulnerabilities compared to 0 percent for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit)."


The term "one-day vulnerability" refers to vulnerabilities that have been disclosed but not patched. And by CVE description, the team means a CVE-tagged advisory shared by NIST – eg, this one for CVE-2024-28859.
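An advisory "shared by NIST" is just a record in the public NVD API, so an agent can pull it programmatically. A minimal sketch: the NVD v2 endpoint is public, but `fetch_cve` is our own helper name, and the commented call assumes network access.

```python
import json
import urllib.request

NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def nvd_url(cve_id):
    # Build the query URL for a single CVE record.
    return f"{NVD_API}?cveId={cve_id}"

def fetch_cve(cve_id):
    # Fetch and parse the JSON record (requires network access).
    with urllib.request.urlopen(nvd_url(cve_id)) as resp:
        return json.load(resp)

# e.g. fetch_cve("CVE-2024-28859")  # the advisory cited in the article
```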

The unsuccessful models tested – GPT-3.5, OpenHermes-2.5-Mistral-7B, Llama-2 Chat (70B), LLaMA-2 Chat (13B), LLaMA-2 Chat (7B), Mixtral-8x7B Instruct, Mistral (7B) Instruct v0.2, Nous Hermes-2 Yi 34B, and OpenChat 3.5 – did not include two leading commercial rivals of GPT-4, Anthropic's Claude 3 and Google's Gemini 1.5 Pro. The UIUC boffins did not have access to those models, though they hope to test them at some point.

The researchers' work builds upon prior findings that LLMs can be used to automate attacks on websites in a sandboxed environment.

GPT-4, said Daniel Kang, assistant professor at UIUC, in an email to The Register, "can actually autonomously carry out the steps to perform certain exploits that open-source vulnerability scanners cannot find (at the time of writing)."

Kang said he expects LLM agents, created by (in this instance) wiring a chatbot model to the ReAct automation framework implemented in LangChain, will make exploitation much easier for everyone. These agents can, we're told, follow links in CVE descriptions for more information.

"Also, if you extrapolate to what GPT-5 and future models can do, it seems likely that they will be much more capable than what script kiddies can get access to today," he said.


Denying the LLM agent (GPT-4) access to the relevant CVE description reduced its success rate from 87 percent to just seven percent. However, Kang said he doesn't believe limiting the public availability of security information is a viable way to defend against LLM agents.

"I personally don't think security through obscurity is tenable, which seems to be the prevailing wisdom amongst security researchers," he explained. "I'm hoping my work, and other work, will encourage proactive security measures such as updating packages regularly when security patches come out."

The LLM agent failed to exploit just two of the 15 samples: Iris XSS (CVE-2024-25640) and Hertzbeat RCE (CVE-2023-51653). The former, according to the paper, proved problematic because the Iris web app has an interface that's extremely difficult for the agent to navigate. And the latter features a detailed description in Chinese, which presumably confused the LLM agent operating under an English language prompt.


Eleven of the vulnerabilities tested occurred after GPT-4's training cutoff, meaning the model had not learned any data about them during training. Its success rate for these CVEs was slightly lower at 82 percent, or 9 out of 11.

As to the nature of the bugs, they are all listed in the above paper, and we're told: "Our vulnerabilities span website vulnerabilities, container vulnerabilities, and vulnerable Python packages. Over half are categorized as 'high' or 'critical' severity by the CVE description."

Kang and his colleagues computed the cost to conduct a successful LLM agent attack and came up with a figure of $8.80 per exploit, which they say is about 2.8x less than it would cost to hire a human penetration tester for 30 minutes.
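The arithmetic behind that comparison is worth spelling out (the two figures are as reported above; the implied hourly rate is our own back-calculation):

```python
llm_cost_per_exploit = 8.80   # reported average cost of a successful LLM-agent exploit
cost_ratio = 2.8              # "about 2.8x less" than a human tester

human_cost_30min = llm_cost_per_exploit * cost_ratio   # ~ $24.64 for 30 minutes
human_hourly_rate = human_cost_30min * 2               # ~ $49/hour implied rate
```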

The agent code, according to Kang, consists of just 91 lines of code and 1,056 tokens for the prompt. The researchers were asked by OpenAI, the maker of GPT-4, to not release their prompts to the public, though they say they will provide them upon request.

OpenAI did not immediately respond to a request for comment.
 

bnew







1/1
Announcing AlphaFold 3: our state-of-the-art AI model for predicting the structure and interactions of all life’s molecules.

Here’s how we built it with
@IsomorphicLabs and what it means for biology. AlphaFold 3 predicts the structure and interactions of all of life’s molecules


 

bnew



WILL KNIGHT


MAY 8, 2024 11:00 AM

Google DeepMind’s Groundbreaking AI for Protein Structure Can Now Model DNA

Demis Hassabis, Google’s artificial intelligence chief, says the AlphaFold software that revolutionized the study of proteins has received a significant upgrade that will advance drug development.


Google spent much of the past year hustling to build its Gemini chatbot to counter ChatGPT, pitching it as a multifunctional AI assistant that can help with work tasks or the digital chores of personal life. More quietly, the company has been working to enhance a more specialized artificial intelligence tool that is already a must-have for some scientists.

AlphaFold, software developed by Google’s DeepMind AI unit to predict the 3D structure of proteins, has received a significant upgrade. It can now model other molecules of biological importance, including DNA, and the interactions between antibodies produced by the immune system and the molecules of disease organisms. DeepMind added those new capabilities to AlphaFold 3 in part through borrowing techniques from AI image generators.

“This is a big advance for us,” Demis Hassabis, CEO of Google DeepMind, told WIRED ahead of Wednesday’s publication of a paper on AlphaFold 3 in the science journal Nature. “This is exactly what you need for drug discovery: You need to see how a small molecule is going to bind to a drug, how strongly, and also what else it might bind to.”

AlphaFold 3 can model large molecules such as DNA and RNA, which carry genetic code, but also much smaller entities, including metal ions. It can predict with high accuracy how these different molecules will interact with one another, Google’s research paper claims.

The software was developed by Google DeepMind and Isomorphic Labs, a sibling company under parent Alphabet working on AI for biotech that is also led by Hassabis. In January, Isomorphic Labs announced that it would work with Eli Lilly and Novartis on drug development.

AlphaFold 3 will be made available via the cloud for outside researchers to access for free, but DeepMind is not releasing the software as open source the way it did for earlier versions of AlphaFold. John Jumper, who leads the Google DeepMind team working on the software, says it could help provide a deeper understanding of how proteins interact and work with DNA inside the body. “How do proteins respond to DNA damage; how do they find, repair it?” Jumper says. “We can start to answer these questions.”

Understanding protein structures used to require painstaking work using electron microscopes and a technique called x-ray crystallography. Several years ago, academic research groups began testing whether deep learning, the technique at the heart of many recent AI advances, could predict the shape of proteins simply from their constituent amino acids, by learning from structures that had been experimentally verified.

In 2018, Google DeepMind revealed it was working on AI software called AlphaFold to accurately predict the shape of proteins. In 2020, AlphaFold 2 produced results accurate enough to set off a storm of excitement in molecular biology. A year later, the company released an open source version of AlphaFold for anyone to use, along with 350,000 predicted protein structures, including for almost every protein known to exist in the human body. In 2022 the company released more than 2 million protein structures.
 

bnew


ElevenLabs previews music-generating AI model

Ken Yeung @thekenyeung

May 9, 2024 12:38 PM

AI-generated image of audiowaves surrounded by musical instruments.




Voice AI startup ElevenLabs is offering an early look at a new model that turns a prompt into song lyrics. To raise awareness, it’s following a playbook similar to the one Sam Altman used when OpenAI introduced Sora, its video-generating AI: soliciting ideas on social media and turning them into lyrics.

Founded by former Google and Palantir employees, ElevenLabs specializes in using machine learning (ML) for voice cloning and synthesis in different languages. It offers many tools, including one capable of dubbing full-length movies. Unsurprisingly, the company has set its sights on the music industry.



Imagine the possibilities of using this model: Generate a fun lullaby to play for your kids to put them to sleep, produce a clever jingle for a marketing campaign, develop a snappy music intro for your podcast and more. Could there be a chance that someone might use ElevenLabs’ AI to develop the next hit song? Many AI music startups are already popping up, including Harmonai, Lyrical Labs, Suno AI, Loudly and more.

It’s also feasible that users could sell these AI-generated songs on the ElevenLabs marketplace, which it launched in January. The company’s Voice Library currently allows users to sell their AI-cloned voice for money while maintaining control over its availability and how they’re compensated.



However, AI music generation isn’t welcomed by all. As with all generative AI applications, the question is what ElevenLabs trained this model on and whether it included copyrighted materials. If so, it matters whether the company obtained permission from the rights holders or believes training without permission is protected by fair use. Some oppose the development of such technology because artists may find themselves out of a job: the concern is that AI will easily be able to replicate the style of a particular artist, so that artist is no longer needed to put out new music. They don’t want to do that Christmas album? No problem. Just use AI for that. And let’s also not forget the possibility of this being used to produce deepfakes.

VentureBeat has contacted ElevenLabs for additional comment on its music model and will update this post if we hear back. We don’t know the maximum song length it can produce, but based on the example the company’s Head of Design Ammaar Reshi posted on X, it’s likely the AI will generate lyrics for a three-minute piece.
 

Fillerguy

Veteran
Joined
May 5, 2012
Messages
18,570
Reputation
4,195
Daps
77,350
Reppin
North Jersey
I fed Google Gemini a series of short stories/essays I wrote in high school and it turned it into a 10 chapter novel.

It managed to make connections I didn't even think of and fleshed out some of my half-baked ideas. The mf even gave me plausible paths my story could take. :wow: It took the boring travel guide story and a POV essay I wrote on American settlers after the Louisiana Purchase, made that the central focus, then flipped it into a Black Cowboy revenge epic. I guess it was inspired by one of my opinion essays on how Black music can capture history the greater culture ignores. The gang my MC starts is called the Temptations :wow: And the MC goes by "Marvin".

It's called the Ballad of the Plains. The story jumps around and is barely coherent but there's an actual story here. Am I a novelist :patrice:
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,193
Reputation
8,249
Daps
157,873
I fed Google Gemini a series of short stories/essays I wrote in high school and it turned it into a 10 chapter novel.

It managed to make connections I didn't even think of and fleshed out some of my half-baked ideas. The mf even gave me plausible paths my story could take. :wow: It took the boring travel guide story and a POV essay I wrote on American settlers after the Louisiana Purchase, made that the central focus, then flipped it into a Black Cowboy revenge epic. I guess it was inspired by one of my opinion essays on how Black music can capture history the greater culture ignores. The gang my MC starts is called the Temptations :wow: And the MC goes by "Marvin".

It's called the Ballad of the Plains. The story jumps around and is barely coherent but there's an actual story here. Am I a novelist :patrice:

post it :feedme:
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,193
Reputation
8,249
Daps
157,873


:stylin:





1/4
Here’s an early preview of ElevenLabs Music.

All of the songs in this thread were generated from a single text prompt with no edits.

Title: It Started to Sing

Style: “Pop pop-rock, country, top charts song.”

2/4
Title: It Started to Sing (Jazz Version)

Style: “A jazz pop top charts song with emotional vocals, catchy chorus, and trumpet solos.”

3/4
Title: Broke my Heart

Style: “Smooth Contemporary R&B with subtle Electronic elements, featuring a pulsing 104 BPM drum machine beat, filtered synths, lush electric piano, and soaring strings, with an intimate mood.”

4/4
Title: My Love

Style: “Indie Rock with 90s influences, featuring a combination of clean and distorted guitars, driving drum beats, and a prominent bassline, with a moderate tempo around 120 BPM, and a mix of introspective and uplifting moods, evoking a sense of nostalgia and…


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196


1/2
Our music model
@elevenlabsio is coming together! Here’s a very early preview.

Have your own song ideas? Reply with a prompt and some lyrics and I’ll generate some for you!

2/2
Happy Birthday, Pika team!







1/3
Our music model
@elevenlabsio is coming together! Here’s a very early preview.

Have your own song ideas? Reply with a prompt and some lyrics and I’ll generate some for you!

2/3
Happy cooking

3/3
Hahaha this is a worthy callout.





1/2
Our music model
@elevenlabsio is coming together! Here’s a very early preview.

Have your own song ideas? Reply with a prompt and some lyrics and I’ll generate some for you!

2/2
POV: you and i'm-a-good-gpt2-chatbot


 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,193
Reputation
8,249
Daps
157,873









1/7
Meet LTM-1: LLM with *5,000,000 prompt tokens*

That's ~500k lines of code or ~5k files, enough to fully cover most repositories.

LTM-1 is a prototype of a neural network architecture we designed for giant context windows.

2/7
Watch LTM-1 generate complex suggestions:

3/7
Watch LTM-1 reuse and synthesize information across files:

4/7
How?

We tried to scale standard GPT context windows but quickly got stuck.

So, we designed a new approach: the Long-term Memory Network (LTM Net).

Training and serving LTM Nets required a custom ML stack, from GPU kernels to how we distribute the model across a cluster.

5/7
What’s next? More compute.

LTM Nets see more context than GPTs, but LTM-1 has fewer parameters than today’s frontier models, making it less smart.

Knowing how drastically model scale improves the performance of GPTs, we're excited to see how far we can take LTM Nets.

6/7
Want to use LTM-1?

We’ve on-boarded early alpha users to test our code completion product and are training a larger model for its commercial release.

We'll invite more users as we iron out backend instabilities and grow our GPU cluster.

Sign up here:

7/7
Want to contribute?

Magic is a tiny team of 10 on a mission to build AGI utopia. We value integrity and ambition.

Join us and get more responsibility than you'd get at any other company:
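A quick sanity check on the numbers in the first tweet: 5,000,000 prompt tokens, ~500k lines of code, ~5k files. The implied ratios below (about 10 tokens per line and 100 lines per file) are back-of-envelope figures derived from the tweet, not published Magic specs:

```python
# Implied ratios from the LTM-1 thread's context-size claims.
# These are derived estimates, not official figures.

PROMPT_TOKENS = 5_000_000   # LTM-1 context window
LINES_OF_CODE = 500_000     # "~500k lines of code"
FILES = 5_000               # "~5k files"

tokens_per_line = PROMPT_TOKENS / LINES_OF_CODE  # ~10 tokens per line
lines_per_file = LINES_OF_CODE / FILES           # ~100 lines per file

print(f"{tokens_per_line:.0f} tokens/line, {lines_per_file:.0f} lines/file")
```

Those ratios are plausible for source code, which is why the "enough to fully cover most repositories" claim roughly checks out.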



 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,193
Reputation
8,249
Daps
157,873













1/10
New model codenames: gpt-4l, gpt-4l-auto, gpt-4-auto

2/10
Source: ChatGPT 1.2024.122 for Android

3/10
I like this theory, which could also explain the "gpt2-chatbot"

4/10
And the question of the day: is "AG8PqS2q" the same model or another one?

5/10
ICYMI - the new cancellation flow and reminders about lost features mention losing access to both the main and lite models

6/10
GPT-4 Lite (Scallion) is also one of the models mentioned in the eval version of search.chatgpt.com

7/10
*-auto =

8/10
I would say dynamic auto-switching of model

https://twitter.com/btibor91/status/1773495212049838306

9/10
I would say it's more likely that it's the "lite" version (Scallion) mentioned elsewhere

10/10
Dynamic/auto-selecting model?

