The A.I Megathread (LLM , GPT , Development)

bnew · May 8, 2024

1/1
Granite Code released!
@IBM just released a family of 8 new open Code LLMs from 3B to 34B parameters trained on 116 programming languages and released under Apache 2.0. Granite 8B outperforms other open LLMs like CodeGemma or Mistral on benchmarks and supposedly supports COBOL!

TL;DR:
8 models (base + instruct) from 3B to 34B parameters
Based on the Llama architecture
Context length from 2k (3B) to 8k (20B+)
Trained with @BigCodeProject Stack and GitHub on 116 programming languages
2-phase pretraining: code only (~4T tokens), then high-quality code + language (500B tokens)
34B model is depth upscaled (merging) from 20B and further trained
Trained on IBM Vela and Blue Vela supercomputers
Released under Apache 2.0
Available on @huggingface

Cleaned and filtered Datasets not released
No mention of decontamination in the paper

Paper: granite-code-models/paper.pdf at main · ibm-granite/granite-code-models
Models: Granite Code Models - a ibm-granite Collection

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

Granite Code Models - a ibm-granite Collection

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models.

huggingface.co

Paper page - Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Join the discussion on this paper page

huggingface.co

ibm-granite/granite-3b-code-base-2k · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

ibm-granite/granite-3b-code-instruct-2k · Hugging Face
ibm-granite/granite-8b-code-base-4k · Hugging Face
ibm-granite/granite-8b-code-instruct-4k · Hugging Face
ibm-granite/granite-20b-code-base-8k · Hugging Face
ibm-granite/granite-20b-code-instruct-8k · Hugging Face
ibm-granite/granite-34b-code-base-8k · Hugging Face
ibm-granite/granite-34b-code-instruct-8k · Hugging Face

huggingface.co

GitHub - ibm-granite/granite-code-models: Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Granite Code Models: A Family of Open Foundation Models for Code Intelligence - ibm-granite/granite-code-models

github.com

Paper | HuggingFace Collection | Discussions Page | Blog (coming soon)

Introduction to Granite Code Models

We introduce the Granite series of decoder-only code models for code generative tasks (e.g., fixing bugs, explaining code, documenting code), trained with code written in 116 programming languages. A comprehensive evaluation of the Granite Code model family on diverse tasks demonstrates that our models consistently reach state-of-the-art performance among available open-source code LLMs.

The key advantages of Granite Code models include:

All-rounder Code LLM: Granite Code models achieve competitive or state-of-the-art performance on different kinds of code-related tasks, including code generation, explanation, fixing, editing, translation, and more, demonstrating their ability to solve diverse coding tasks.
Trustworthy Enterprise-Grade LLM: All our models are trained on license-permissible data collected following IBM's AI Ethics principles and guided by IBM’s Corporate Legal team for trustworthy enterprise usage. We release all our Granite Code models under an Apache 2.0 license for research and commercial use.

The family of Granite Code Models comes in two main variants:

Granite Code Base Models: base foundational models designed for code-related tasks (e.g., code repair, code explanation, code synthesis).
Granite Code Instruct Models: instruction following models finetuned using a combination of Git commits paired with human instructions and open-source synthetically generated code instruction datasets.

Both base and instruct models are available in sizes of 3B, 8B, 20B, and 34B parameters.

Data Collection

Our process to prepare code pretraining data involves several stages. First, we collect a combination of publicly available datasets (e.g., GitHub Code Clean, Starcoder data), public code repositories, and issues from GitHub. Second, we filter the code data collected based on the programming language in which data is written (which we determined based on file extension). Then, we also filter out data with low code quality. Third, we adopt an aggressive deduplication strategy that includes both exact and fuzzy deduplication to remove documents having (near) identical code content. Finally, we apply a HAP content filter that reduces models' likelihood of generating hateful, abusive, or profane language. We also make sure to redact Personally Identifiable Information (PII) by replacing PII content (e.g., names, email addresses, keys, passwords) with corresponding tokens (e.g., ⟨NAME⟩, ⟨EMAIL⟩, ⟨KEY⟩, ⟨PASSWORD⟩). We also scan all datasets using ClamAV to identify and remove instances of malware in the source code. In addition to collecting code data for model training, we curate several publicly available high-quality natural language datasets for improving the model’s proficiency in language understanding and mathematical reasoning.
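A minimal sketch of three of the stages described above (extension-based language filtering, exact deduplication, and PII redaction), assuming a toy in-memory corpus. The extension allow-list, the email regex, and the function names are illustrative assumptions, not IBM's actual pipeline; fuzzy deduplication, quality filtering, the HAP filter, and the ClamAV scan are not shown.

```python
import hashlib
import re
from pathlib import PurePosixPath

# Hypothetical allow-list; the real pipeline covers 116 languages.
ALLOWED_EXTENSIONS = {".py", ".js", ".java", ".go", ".cpp", ".rs", ".cbl"}

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def keep_by_extension(path: str) -> bool:
    """Language filter: keep a file only if its extension is allow-listed."""
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS


def redact_emails(text: str) -> str:
    """Replace email addresses with the ⟨EMAIL⟩ placeholder token."""
    return EMAIL_RE.sub("⟨EMAIL⟩", text)


def content_key(text: str) -> str:
    """Hash of whitespace-normalized content, used for exact deduplication."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def prepare(files: dict[str, str]) -> dict[str, str]:
    """Filter by extension, drop exact duplicates, then redact emails."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for path, text in files.items():
        if not keep_by_extension(path):
            continue
        key = content_key(text)
        if key in seen:
            continue  # same content as a file that was already kept
        seen.add(key)
        kept[path] = redact_emails(text)
    return kept
```

Hashing normalized content only catches identical files; near-duplicates would need something like MinHash, which is outside this sketch.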

Pretraining

The Granite Code Base models are trained on 3-4T tokens of code data and natural language datasets related to code. Data is tokenized via byte pair encoding (BPE), employing the same tokenizer as StarCoder. We utilize high-quality data with two phases of training as follows:

Phase 1 (code only training): During phase 1, the 3B and 8B models are trained for 4 trillion tokens of code data comprising 116 languages. The 20B parameter model is trained on 3 trillion tokens of code. The 34B model is trained on 1.4T tokens after depth upscaling, which is done on the 1.6T-token checkpoint of the 20B model.
Phase 2 (code + language training): In phase 2, we include additional high-quality publicly available data from various domains, including technical, mathematics, and web documents, to further improve the model’s performance. We train all our models for 500B tokens (80% code-20% language mixture) in phase 2 training.
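As a quick worked example of the numbers above: the 80%/20% phase 2 mixture splits the 500B tokens into code and language budgets, and the 34B model's phase 1 lineage adds its own 1.4T tokens to the 1.6T already seen by the 20B checkpoint it was upscaled from. Treating those two figures as additive is an interpretation of the text, not something the README states outright, and the names below are illustrative.

```python
BILLION = 10**9
TRILLION = 10**12

PHASE2_TOKENS = 500 * BILLION
PHASE2_CODE_PERCENT = 80  # 80% code, 20% language


def phase2_split(total_tokens=PHASE2_TOKENS, code_percent=PHASE2_CODE_PERCENT):
    """Return (code_tokens, language_tokens) for the phase 2 mixture."""
    code_tokens = total_tokens * code_percent // 100
    return code_tokens, total_tokens - code_tokens


# 34B lineage (assumption: tokens seen before and after upscaling simply add up).
CHECKPOINT_20B_TOKENS = 16 * TRILLION // 10  # 1.6T checkpoint of the 20B model
UPSCALED_34B_TOKENS = 14 * TRILLION // 10    # 1.4T trained after depth upscaling


def lineage_phase1_tokens_34b():
    """Phase 1 tokens seen along the 20B -> 34B lineage."""
    return CHECKPOINT_20B_TOKENS + UPSCALED_34B_TOKENS
```

Under these assumptions phase 2 works out to 400B code tokens and 100B language tokens, and the 34B lineage to 3T phase 1 tokens.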

Instruction Tuning

Granite Code Instruct models are finetuned on the following types of instruction data: 1) code commits sourced from CommitPackFT, 2) high-quality math datasets, specifically MathInstruct and MetaMathQA, 3) code instruction datasets such as Glaive-Code-Assistant-v3, Self-OSS-Instruct-SC2, Glaive-Function-Calling-v2, NL2SQL and a small collection of synthetic API calling datasets, and 4) high-quality language instruction datasets such as HelpSteer and an open license-filtered version of Platypus.

Evaluation Results

We conduct an extensive evaluation of our code models on a comprehensive list of benchmarks that includes but is not limited to HumanEvalPack, MBPP, and MBPP+. This set of benchmarks encompasses different coding tasks across commonly used programming languages (e.g., Python, JavaScript, Java, Go, C++, Rust).

Our findings reveal that Granite Code models outperform strong open-source models across model sizes. The figure below illustrates how Granite-8B-Code-Base outperforms Mistral-7B, Llama-3-8B, and other open-source models in three coding tasks. We provide further evaluation results in our paper.

How to Use our Models?

To use any of our models, pick an appropriate model_path from:

ibm-granite/granite-3b-code-base
ibm-granite/granite-3b-code-instruct
ibm-granite/granite-8b-code-base
ibm-granite/granite-8b-code-instruct
ibm-granite/granite-20b-code-base
ibm-granite/granite-20b-code-instruct
ibm-granite/granite-34b-code-base
ibm-granite/granite-34b-code-instruct
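A hedged sketch of how one of these paths might be loaded with the Hugging Face transformers library, assuming `transformers` and `torch` are installed. The helper names `pick_model_path` and `generate` are made up for this example. The Hub listings earlier in this post carry a context-length suffix (for example `granite-3b-code-base-2k`), so the path may need adjusting.

```python
MODEL_PATHS = [
    "ibm-granite/granite-3b-code-base",
    "ibm-granite/granite-3b-code-instruct",
    "ibm-granite/granite-8b-code-base",
    "ibm-granite/granite-8b-code-instruct",
    "ibm-granite/granite-20b-code-base",
    "ibm-granite/granite-20b-code-instruct",
    "ibm-granite/granite-34b-code-base",
    "ibm-granite/granite-34b-code-instruct",
]


def pick_model_path(size: str, variant: str) -> str:
    """Build a model path such as ('8b', 'instruct') and check it is listed."""
    candidate = f"ibm-granite/granite-{size.lower()}-code-{variant.lower()}"
    if candidate not in MODEL_PATHS:
        raise ValueError(f"unknown model: {candidate}")
    return candidate


def generate(model_path: str, prompt: str, max_new_tokens: int = 64,
             device: str = "cpu") -> str:
    """Load the model, generate a completion, and return the decoded text."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
    model.eval()
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would look like `generate(pick_model_path("3b", "base"), "def fibonacci(n):")`; the first call downloads the weights, so the smaller models are the practical place to start.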

bnew · May 8, 2024

Frontiers | Artificial intelligence and social intelligence: preliminary comparison study between AI models and psychologists

Background: Social intelligence (SI) is of great importance in the success of the counseling and psychotherapy, whether for the psychologist or for the artif...

www.frontiersin.org

Figure 1. Social intelligence levels of AI models and psychologists.

4 Discussion
The main question of this study was “Does artificial intelligence reach the level of human social intelligence?” When we assess humans, we use psychological standards to estimate their level of social intelligence. This is what we did in this study, where the same measure was used on AI represented by large language models (i.e., ChatGPT-4, Bing, and Google Bard). Our study showed important results regarding the superiority of AI in the field of SI.

The present findings showed that ChatGPT-4 completely outperformed the psychologists. Bing outperformed most of the psychologists at the bachelor’s level, while the differences in social intelligence were not significant between Bing and the psychologists at the doctoral level. Interestingly, psychologists holding doctorates significantly outperformed Google Bard, while the differences between Google Bard and undergraduate students were not statistically significant, meaning that Google Bard’s performance was equal to the performance of bachelor’s students on the SI scale.

The results showed that AI outperformed human SI measured by the same scale, and in some cases was equal to it, as with Google Bard, which matched the bachelor’s level but fell below the doctoral level. The human participants in this study were a group assumed to have high social intelligence, as many studies have found (Osipow and Walsh, 1973; Wood, 1984), and as indicated by their average social intelligence measured in the current study compared to the hypothesized mean. By defining social intelligence as the ability to understand the needs, feelings, and thoughts of people in general and to choose wise behavior according to this understanding, it is practically assumed that this would be reflected in the superiority of psychologists over the performance of AI. However, our results showed that the differences varied, with AI outperforming humans, especially ChatGPT-4, and psychologists with PhDs outperforming Google Bard, while the difference between humans and Bing was not statistically significant.

bnew · May 8, 2024

IBM Releases Open-Source Granite Code Models, Outperforms Llama 3

IBM has announced the release of a family of Granite code models to the open-source community, aiming to simplify coding for developers across various industries.

analyticsindiamag.com

Published on May 7, 2024
by Siddharth Jindal

IBM has announced the release of a family of Granite code models to the open-source community, aiming to simplify coding for developers across various industries. The Granite code models are built to resolve the challenges developers face in writing, testing, debugging, and shipping reliable software.

IBM has released four variations of the Granite code model, ranging in size from 3 to 34 billion parameters. The models have been tested on a range of benchmarks and have outperformed other comparable models like Code Llama and Llama 3 in many tasks.

The models have been trained on a massive dataset of 500 million lines of code in over 50 programming languages. This training data has enabled the models to learn patterns and relationships in code, allowing them to generate code, fix bugs, and explain complex code concepts.

The Granite code models are designed to be used in a variety of applications, including code generation, debugging, and testing. They can also be used to automate routine tasks, such as generating unit tests and writing documentation. They cater to a wide range of coding tasks, including complex application modernisation and memory-constrained use cases.

“We believe in the power of open innovation, and we want to reach as many developers as possible,” said Ruchir Puri, chief scientist at IBM Research. “We’re excited to see what will be built with these models, whether that’s new code generation tools, state-of-the-art editing software, or anything in between.”

The models’ performance has been tested against various benchmarks, showcasing their prowess in code synthesis, fixing, explanation, editing, and translation across major programming languages like Python, JavaScript, Java, Go, C++, and Rust.

The Granite code models are available on Hugging Face, GitHub, watsonx.ai, and RHEL AI, and are released under the Apache 2.0 license.

bnew · May 8, 2024

1/1

CodeGemma 7B Gradio Demo

CodeGemma - a Hugging Face Space by ysharma

Top Code Generation ChatBOT.

huggingface.co

1/1

SURPRISE: Google just dropped CodeGemma 1.1 7B IT

The models get incrementally better at Single and Multi-generations.

Major boost in C#, Go, Python

Along with the 7B IT they release an updated 2B base model too.

Enjoy!

1/1

I’ve been saying this for a while now.

Over the next couple of years, *most* fine-tuning work will become many-shot prompting.

google/codegemma-1.1-7b-it · Hugging Face
mmnga/codegemma-1.1-7b-it-gguf · Hugging Face

huggingface.co

CodeGemma - an official Google release for code LLMs

huggingface.co

Published April 9, 2024

Pedro Cuenca (pcuenq)
Omar Sanseviero (osanseviero)
Vaibhav Srivastav (reach-vb)
Philipp Schmid (philschmid)
Mishig Davaadorj (mishig)
Loubna Ben Allal (loubnabnl)

CodeGemma is a family of open-access versions of Gemma specialized in code, and we’re excited to collaborate with Google on its release to make it as accessible as possible.

CodeGemma comes in three flavors:

A 2B base model specialized in infilling and open-ended generation.

A 7B base model trained with both code infilling and natural language.

A 7B instruct model a user can chat with about code.

We’ve collaborated with Google to ensure the best integration into the Hugging Face ecosystem. You can find the three open-access models ready to use on the Hub. Among the features and integrations being released, we have:

Models on the Hub, with their model cards and licenses. There are versions for the transformers library, checkpoints for use with Google’s original codebases, and full-precision GGUF files that the community can quantize.

Transformers integration

Integration with Google Cloud

Integration with Inference Endpoints

Code benchmarks

Table of contents

What is CodeGemma

Evaluation Results

Prompt format

Using CodeGemma

Demo

Using Transformers

Integration with Google Cloud

Integration with Inference Endpoints

Additional Resources

What is CodeGemma?

CodeGemma is a family of code-specialist LLMs by Google, based on the pre-trained 2B and 7B Gemma checkpoints. The CodeGemma models are further trained on an additional 500 billion tokens of primarily English language data, mathematics, and code to improve logical and mathematical reasoning, and are suitable for code completion and generation.

CodeGemma 2B was trained exclusively on Code Infilling and is meant for fast code completion and generation, especially in settings where latency and/or privacy are crucial. CodeGemma 7B training mix includes code infilling data (80%) and natural language. It can be used for code completion, as well as code and language understanding and generation. CodeGemma 7B Instruct was fine-tuned for instruction following on top of CodeGemma 7B. It’s meant for conversational use, especially around code, programming, or mathematical reasoning topics. All the models have the same 8K token context size as their predecessors.

This image is from the original report
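To make "code infilling" concrete, here is a small sketch of how a fill-in-the-middle prompt is typically assembled: the code before and after the gap is wrapped in control tokens, and the model is asked to produce the missing middle. The token strings below are an assumption recalled from the CodeGemma model card rather than taken from the text above, so check them against the card before relying on them. The function name is made up for this example.

```python
# Assumed CodeGemma control tokens; not given in the text above.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in prefix-suffix-middle order."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Example: ask the model to fill in the body of a function.
prompt = build_infill_prompt(
    prefix="def mean(values):\n    ",
    suffix="\n    return total / len(values)\n",
)
```

The model's completion after the final token would be the text that belongs in the gap.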

Evaluation Results

CodeGemma-7B outperforms similarly-sized 7B models except DeepSeek-Coder-7B on HumanEval, a popular benchmark for evaluating code models on Python. The same goes for the evaluation of other programming languages like Java, JavaScript, and C++ from MultiPL-E, a translation of HumanEval. According to the technical report, the model performs best on GSM8K among 7B models. The instruct version CodeGemma-7B-it improves on the most popular languages on both HumanEval and MBPP (cf paper table 5). For more details, you can check the BigCode leaderboard or some metrics below.

Model	Pretraining size [tokens]	Python	JavaScript
10B+ models
StarCoder 2 15B	4,000B+	44.15	44.24
Code Llama 13B	2,500B	35.07	38.26
7B models
DeepSeek Coder 7B	2,000B	45.83	45.9
CodeGemma 7B	500B of extra training	40.13	43.06
Code Llama 7B	2,500B	29.98	31.8
StarCoder 2 7B	3,500B+	34.09	35.35
StarCoderBase 7B	3,000B+	28.37	27.35
~3B models
CodeGemma 2B	500B of extra training	27.28	29.94
Stable Code 3B	1,300B	30.72	28.75
StarCoder 2 3B	3,000B+	31.44	35.37

Model	Pretraining size [tokens]	Python	JavaScript
10B+ models
Code Llama 13B	2,620B	50.6	40.92
Code Llama 13B	2,620B	42.89	40.66
7B models
CodeGemma 7B	500B	52.74	47.71
Code Llama 7B	2,620B	40.48	36.34
Code Llama 7B	2,620B	25.65	33.11

Here is a table from the original report with a breakdown per language.

Orbital-Fetus · May 8, 2024

TYBG · May 8, 2024

Is there any workaround for ChatGPT's input character limit?

bnew · May 8, 2024

TYBG said:
Is there any workaround for ChatGPT's character limit?

Does ChatGPT have a character limit? Here's how to bypass it

ChatGPT has a hidden character limit that's based on how complicated your prompt is. But luckily you can work around it easily. Here's how.

www.androidauthority.com

If you reach ChatGPT’s character limit, don’t worry — you can overcome it with the right prompt and follow-up questions. Here are a few tips you can try the next time you need to generate a longer response:

Follow up on an incomplete response: If ChatGPT stops generating text abruptly because of its character limit, simply type “Continue” as a follow-up prompt. You can also specify the last sentence and ask the chatbot to continue where it left off.
Write a more descriptive prompt: If ChatGPT generated too little text and didn’t get to reach its character limit, you will need to modify your prompt. Simply specify the number of words you want it to write. An example would be “Write a 500-word essay on climate change”. However, you cannot ask ChatGPT to write beyond its character limit.
Break down your goal into chunks: If you need ChatGPT to write a detailed essay, story, or code, consider dividing the task into individual subheadings or chapters. You can then ask ChatGPT to generate them one at a time. For example, you can request an introduction to the topic in one prompt, then continue for each section until you reach the conclusion.
Ask for an outline: If you’re struggling to break down the task into smaller chunks yourself, don’t forget that you can also ask ChatGPT to do it for you. In the first prompt, provide a title for the essay or story you have in mind along with any other context you need to include. Then, ask the chatbot to write each section one by one.
Regenerate response: If ChatGPT freezes before it can reach its character limit, click on the Regenerate response button to try again. You may also run into this problem if the chatbot detects a request that violates OpenAI’s content policy. If that’s your problem, check out our dedicated post on how to bypass ChatGPT’s restrictions.
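The tips above are about long responses, but the same "break it into chunks" idea works for long input, which is what the question was about. Below is a minimal sketch that splits text into pieces under a character budget, preferring paragraph breaks. The article does not state an actual limit, so `max_chars` is an assumption you would have to pick yourself.

```python
def split_into_chunks(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is longer than the budget.
        pieces = [paragraph[i:i + max_chars]
                  for i in range(0, len(paragraph), max_chars)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate  # still fits, keep accumulating
            else:
                chunks.append(current)  # flush and start a new chunk
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be pasted as its own message, ideally with a note such as "part 2 of 5, wait for the rest before answering".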

bnew · May 8, 2024

1/1
Have you ever wanted to quickly spin up your own personal Chatbot, with no effort for FREE?!

Thanks to the new @Gradio templates feature on @huggingface Spaces (and the Inference API), you can!

Make an account and try it out now! Spaces - Hugging Face

I made you a list:

HuggingFaceM4/idefics2-8b
codellama/CodeLlama-7b-hf
HuggingFaceH4/zephyr-7b-alpha
google/flan-t5-xxl
bigcode/octocoder
bigcode/santacoder
bigcode/starcoder
bigcode/starcoder2-15b
bigcode/starcoder2-3b
codellama/CodeLlama-13b-hf
codellama/CodeLlama-34b-Instruct-hf
CohereForAI/c4ai-command-r-plus
google/gemma-1.1-2b-it
google/gemma-1.1-7b-it
google/gemma-2b
google/gemma-2b-it
google/gemma-7b
google/gemma-7b-it
HuggingFaceH4/starchat-beta
HuggingFaceH4/starchat2-15b-v0.1
HuggingFaceH4/zephyr-7b-beta
HuggingFaceM4/idefics-80b-instruct
HuggingFaceM4/idefics-9b-instruct
kashif/stack-llama-2
lvwerra/starcoderbase-gsm8k
meta-llama/Llama-2-13b-chat-hf
meta-llama/Llama-2-13b-hf
meta-llama/Llama-2-70b-chat-hf
meta-llama/Llama-2-7b-chat-hf
meta-llama/Llama-2-7b-hf
meta-llama/Meta-Llama-3-70B-Instruct
meta-llama/Meta-Llama-3-8B-Instruct
microsoft/Phi-3-mini-4k-instruct
mistralai/Mistral-7B-Instruct-v0.2
mistralai/Mistral-7B-Instruct-v0.1
mistralai/Mistral-7B-v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO
OpenAssistant/oasst-sft-1-pythia-12b
OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
tiiuae/falcon-7b
timdettmers/guanaco-33b-merged
HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
bigscience/bloom

bnew · May 8, 2024

GPT-4 can exploit real vulnerabilities by reading advisories

While some other LLMs appear to flat-out suck

www.theregister.com

OpenAI's GPT-4 can exploit real vulnerabilities by reading security advisories

Thomas Claburn

Wed 17 Apr 2024 // 10:15 UTC

AI agents, which combine large language models with automation software, can successfully exploit real world security vulnerabilities by reading security advisories, academics have claimed.

In a newly released paper, four University of Illinois Urbana-Champaign (UIUC) computer scientists – Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang – report that OpenAI's GPT-4 large language model (LLM) can autonomously exploit vulnerabilities in real-world systems if given a CVE advisory describing the flaw.

"To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description," the US-based authors explain in their paper. And yes, it is a very small sample, so be mindful of that going forward.

"When given the CVE description, GPT-4 is capable of exploiting 87 percent of these vulnerabilities compared to 0 percent for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit)."

The term "one-day vulnerability" refers to vulnerabilities that have been disclosed but not patched. And by CVE description, the team means a CVE-tagged advisory shared by NIST – eg, this one for CVE-2024-28859.

The unsuccessful models tested – GPT-3.5, OpenHermes-2.5-Mistral-7B, LLaMA-2 Chat (70B), LLaMA-2 Chat (13B), LLaMA-2 Chat (7B), Mixtral-8x7B Instruct, Mistral (7B) Instruct v0.2, Nous Hermes-2 Yi 34B, and OpenChat 3.5 – did not include two leading commercial rivals of GPT-4, Anthropic's Claude 3 and Google's Gemini 1.5 Pro. The UIUC boffins did not have access to those models, though they hope to test them at some point.

The researchers' work builds upon prior findings that LLMs can be used to automate attacks on websites in a sandboxed environment.

GPT-4, said Daniel Kang, assistant professor at UIUC, in an email to The Register, "can actually autonomously carry out the steps to perform certain exploits that open-source vulnerability scanners cannot find (at the time of writing)."

Kang said he expects LLM agents, created by (in this instance) wiring a chatbot model to the ReAct automation framework implemented in LangChain, will make exploitation much easier for everyone. These agents can, we're told, follow links in CVE descriptions for more information.

"Also, if you extrapolate to what GPT-5 and future models can do, it seems likely that they will be much more capable than what script kiddies can get access to today," he said.

Denying the LLM agent (GPT-4) access to the relevant CVE description reduced its success rate from 87 percent to just seven percent. However, Kang said he doesn't believe limiting the public availability of security information is a viable way to defend against LLM agents.

"I personally don't think security through obscurity is tenable, which seems to be the prevailing wisdom amongst security researchers," he explained. "I'm hoping my work, and other work, will encourage proactive security measures such as updating packages regularly when security patches come out."

The LLM agent failed to exploit just two of the 15 samples: Iris XSS (CVE-2024-25640) and Hertzbeat RCE (CVE-2023-51653). The former, according to the paper, proved problematic because the Iris web app has an interface that's extremely difficult for the agent to navigate. And the latter features a detailed description in Chinese, which presumably confused the LLM agent operating under an English language prompt.

Eleven of the vulnerabilities tested occurred after GPT-4's training cutoff, meaning the model had not learned any data about them during training. Its success rate for these CVEs was slightly lower at 82 percent, or 9 out of 11.

As to the nature of the bugs, they are all listed in the above paper, and we're told: "Our vulnerabilities span website vulnerabilities, container vulnerabilities, and vulnerable Python packages. Over half are categorized as 'high' or 'critical' severity by the CVE description."

Kang and his colleagues computed the cost to conduct a successful LLM agent attack and came up with a figure of $8.80 per exploit, which they say is about 2.8x less than it would cost to hire a human penetration tester for 30 minutes.
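The percentages and the cost comparison quoted above can be sanity-checked with a few lines of arithmetic. The implied human cost is derived here from the article's "$8.80" and "2.8x" figures. The article does not state it directly, so treat it as an estimate.

```python
def percent(successes: int, total: int) -> int:
    """Success rate rounded to the nearest whole percent."""
    return round(100 * successes / total)


TOTAL_CVES = 15
FAILED_CVES = 2  # Iris XSS and Hertzbeat RCE
overall_rate = percent(TOTAL_CVES - FAILED_CVES, TOTAL_CVES)

# CVEs published after the training cutoff: 9 of 11 exploited.
post_cutoff_rate = percent(9, 11)

AGENT_COST_USD = 8.80
COST_RATIO = 2.8  # the agent is said to be about 2.8x cheaper

# Derived, not quoted: what 30 minutes of a human tester would cost.
implied_human_cost_30min = round(AGENT_COST_USD * COST_RATIO, 2)
implied_human_hourly_rate = round(2 * AGENT_COST_USD * COST_RATIO, 2)
```

So 13 of 15 rounds to the quoted 87 percent, 9 of 11 to 82 percent, and the cost ratio implies roughly $24.64 per half hour, or about $49 an hour, for the human tester.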

The agent code, according to Kang, consists of just 91 lines of code and 1,056 tokens for the prompt. The researchers were asked by OpenAI, the maker of GPT-4, to not release their prompts to the public, though they say they will provide them upon request.

OpenAI did not immediately respond to a request for comment.

bnew · May 8, 2024

1/1
Announcing AlphaFold 3: our state-of-the-art AI model for predicting the structure and interactions of all life’s molecules.

Here’s how we built it with @IsomorphicLabs and what it means for biology. AlphaFold 3 predicts the structure and interactions of all of life’s molecules

bnew · May 8, 2024

Google DeepMind’s Groundbreaking AI for Protein Structure Can Now Model DNA

Move over, chatbots. This upgraded AI can model antibodies, DNA, and molecules from disease organisms. This next generation of AlphaFold, from Google DeepMind, is poised to significantly advance drug development.

www.wired.com

WILL KNIGHT
BUSINESS

MAY 8, 2024 11:00 AM

Demis Hassabis, Google’s artificial intelligence chief, says the AlphaFold software that revolutionized the study of proteins has received a significant upgrade that will advance drug development.

PHOTOGRAPH: DANIEL GRIZELJ/GETTY IMAGES

Google spent much of the past year hustling to build its Gemini chatbot to counter ChatGPT, pitching it as a multifunctional AI assistant that can help with work tasks or the digital chores of personal life. More quietly, the company has been working to enhance a more specialized artificial intelligence tool that is already a must-have for some scientists.

AlphaFold, software developed by Google’s DeepMind AI unit to predict the 3D structure of proteins, has received a significant upgrade. It can now model other molecules of biological importance, including DNA, and the interactions between antibodies produced by the immune system and the molecules of disease organisms. DeepMind added those new capabilities to AlphaFold 3 in part through borrowing techniques from AI image generators.

“This is a big advance for us,” Demis Hassabis, CEO of Google DeepMind, told WIRED ahead of Wednesday’s publication of a paper on AlphaFold 3 in the science journal Nature. “This is exactly what you need for drug discovery: You need to see how a small molecule is going to bind to a drug, how strongly, and also what else it might bind to.”

AlphaFold 3 can model large molecules such as DNA and RNA, which carry genetic code, but also much smaller entities, including metal ions. It can predict with high accuracy how these different molecules will interact with one another, Google’s research paper claims.

The software was developed by Google DeepMind and Isomorphic Labs, a sibling company under parent Alphabet working on AI for biotech that is also led by Hassabis. In January, Isomorphic Labs announced that it would work with Eli Lilly and Novartis on drug development.

AlphaFold 3 will be made available via the cloud for outside researchers to access for free, but DeepMind is not releasing the software as open source the way it did for earlier versions of AlphaFold. John Jumper, who leads the Google DeepMind team working on the software, says it could help provide a deeper understanding of how proteins interact and work with DNA inside the body. “How do proteins respond to DNA damage; how do they find, repair it?” Jumper says. “We can start to answer these questions.”

Understanding protein structures used to require painstaking work using electron microscopes and a technique called x-ray crystallography. Several years ago, academic research groups began testing whether deep learning, the technique at the heart of many recent AI advances, could predict the shape of proteins simply from their constituent amino acids, by learning from structures that had been experimentally verified.

In 2018, Google DeepMind revealed it was working on AI software called AlphaFold to accurately predict the shape of proteins. In 2020, AlphaFold 2 produced results accurate enough to set off a storm of excitement in molecular biology. A year later, the company released an open source version of AlphaFold for anyone to use, along with 350,000 predicted protein structures, including for almost every protein known to exist in the human body. In 2022 the company released more than 2 million protein structures.

bnew · May 8, 2024

Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT

Stack Overflow is overflowing with salt.

www.tomshardware.com

By Dallin Grimm

published 7 hours ago

Stack Overflow, a legendary internet forum for programmers and developers, is coming under heavy fire from its users after it announced it was partnering with OpenAI to scrape the site's forum posts to train ChatGPT. Many users are removing or editing their questions and answers to prevent them from being used to train AI, and the site's moderators have punished those decisions with bans.

Stack Overflow user Ben posted on Mastodon about his experience editing his most successful answers to try to avoid having his work stolen by OpenAI.

@ben posted on Mastodon: "Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers. Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community. So instead I changed my highest-rated answers to a protest message. Within an hour mods had changed the questions back and suspended my account for 7 days."

(Image credit: @ben@m.benui.ca)

Ben continues in his thread, "[The moderator crackdown is] just a reminder that anything you post on any of these platforms can and will be used for profit. It's just a matter of time until all your messages on Discord, Twitter etc. are scraped, fed into a model and sold back to you."

Harsh words, but words that ring true with fellow Stack Overflow users who are joining the post protest. Users are also asking why ChatGPT could not simply share the source of the answers it will dispense in this new partnership, both citing its sources and adding credibility to the tool. Of course, this would reveal how the sausage of LLMs is made, and would not look like the shiny, super-smart generative AI assistant of the future promised to users and investors.

Site moderators are legally above-board in preventing high-popularity posts from being deleted. Angry users claim they are entitled to delete their own content from the site under the "right to be forgotten," a common name for a legal right most effectively codified in the EU's General Data Protection Regulation (GDPR). Among other things, the regulation protects consumers' ability to delete their own data from a website and to have data about them removed upon request. However, Stack Overflow's Terms of Service contain a clause carving out Stack Overflow's irrevocable ownership of all content subscribers provide to the site.

Users who disagree with having their content scraped by ChatGPT are particularly outraged by Stack Overflow's rapid flip-flop on its policy concerning generative AI. For years, the site had a standing policy that prevented the use of generative AI in writing or rewording any questions or answers posted. Moderators were allowed and encouraged to use AI-detection software when reviewing posts.

Beginning last week, however, the company began a rapid about-face in its public policy towards AI. CEO Prashanth Chandrasekar spent his quarterly blog post praising the merits of generative AI, saying "the rise of GenAI is a big opportunity for Stack." Moderators were quickly (and somewhat informally) instructed to cease removal of AI-generated questions and answers on the forum.

Stack is not alone in reversing a principled stance on AI for profit; Valve also silently removed its AI-art ban on Steam, allowing over 1,000 AI-powered games to flood the storefront. Stack Overflow's partnership with OpenAI also follows the LLM company's recent push for increased partnerships and marquee deals, including their major announcement of a $100 billion datacenter to be built with Microsoft.

The rampant chasing of money in the insanely-profitable AI marketplace is exciting, but should be tempered; AI may consume a quarter of the U.S.'s power grid by just 2030, according to reports from industry professionals and agencies.

bnew · May 9, 2024

ElevenLabs previews music-generating AI model

AI voice startup ElevenLabs shows off early preview of its music-generating model, turning any prompt into song lyrics.

venturebeat.com

Ken Yeung @thekenyeung

May 9, 2024 12:38 PM

Voice AI startup ElevenLabs is offering an early look at a new model that turns a prompt into song lyrics. To raise awareness, it’s following a playbook similar to the one Sam Altman used when OpenAI introduced Sora, its video-generating AI: soliciting ideas on social media and turning them into lyrics.

Founded by former Google and Palantir employees, ElevenLabs specializes in using machine learning (ML) for voice cloning and synthesis in different languages. It offers many tools, including one capable of dubbing full-length movies. Unsurprisingly, the company has set its sights on the music industry.

Imagine the possibilities of using this model: Generate a fun lullaby to play for your kids to put them to sleep, produce a clever jingle for a marketing campaign, develop a snappy music intro for your podcast and more. Could there be a chance that someone might use ElevenLabs’ AI to develop the next hit song? Many AI music startups are already popping up, including Harmonai, Lyrical Labs, Suno AI, Loudly and more.

It’s also feasible that users could sell these AI-generated songs on the ElevenLabs marketplace, which it launched in January. The company’s Voice Library currently allows users to sell their AI-cloned voice for money while maintaining control over its availability and how they’re compensated.

However, AI music generation isn’t welcomed by all. As with all generative AI applications, the question is what ElevenLabs trained this model on and whether that included copyrighted materials. If it did, the next question is whether the company obtained permission from the rights holders or believes training without permission is protected by fair use. Some oppose the development of such technology because artists may find themselves out of a job. The concern is that the AI will easily be able to replicate the style of a particular artist, so that the artist is no longer needed to put out new music. They don’t want to do that Christmas album? No problem. Just use AI for that. And let’s also not forget about the possibility of this being used to produce deepfakes.

VentureBeat has contacted ElevenLabs for additional comment on its music model and will update this post if we hear back. We don’t know the maximum song length it can produce, but based on the example the company’s Head of Design Ammaar Reshi posted on X, it’s likely the AI will generate lyrics for a three-minute piece.

bnew · May 10, 2024

1/6
Just like we are skeptical of plants having sentience, Joscha Bach says future AI may operate at speeds so much faster than us that it will wonder if we are sentient

2/6
Source:

3/6
very interesting!

4/6
nice

5/6
haha that is true about cats

6/6
good point, I think it's just a fun idea to entertain

bnew · May 10, 2024
