bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

Stable Diffusion WebUI Txt/Img To 3D Model​

A custom extension for AUTOMATIC1111/stable-diffusion-webui that allows you to generate a 3D model from text or an image, based on OpenAI Shap-E.
[Screenshots: overall_img_to_3d, overall_txt_to_3d]
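Under the hood the extension drives OpenAI's Shap-E models. For a sense of what that looks like outside the webui, here is a minimal text-to-3D sketch against the openai/shap-e library itself; the prompt, output filename, and sampler settings are illustrative choices, not the extension's own code or defaults.

```python
# Minimal text-to-3D sketch with the openai/shap-e library (what the extension wraps);
# the prompt, filename, and sampler settings below are illustrative assumptions.
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import decode_latent_mesh

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
xm = load_model("transmitter", device=device)        # latent -> geometry/texture decoder
model = load_model("text300M", device=device)        # text-conditional latent prior
diffusion = diffusion_from_config(load_config("diffusion"))

latents = sample_latents(
    batch_size=1,
    model=model,
    diffusion=diffusion,
    guidance_scale=15.0,
    model_kwargs=dict(texts=["a red office chair"]),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

# Export the sample as an .obj mesh, similar in spirit to what the extension hands back.
mesh = decode_latent_mesh(xm, latents[0]).tri_mesh()
with open("chair.obj", "w") as f:
    mesh.write_obj(f)
```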
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

Try chatting with fine-tuned models for Falcon-7B, Falcon-40B, and the new Open-Llama-7B​



Making h2oGPT models available to everyone. For more information, visit our GitHub pages: H2O LLM Studio and h2oGPT
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

About​

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs


Welcome to H2O LLM Studio, a framework and no-code GUI designed for
fine-tuning state-of-the-art large language models (LLMs).​

[Screenshots: home, logs]


With H2O LLM Studio, you can​

  • easily and effectively fine-tune LLMs without the need for any coding experience.
  • use a graphical user interface (GUI) specially designed for large language models.
  • fine-tune any LLM using a large variety of hyperparameters.
  • use recent fine-tuning techniques such as Low-Rank Adaptation (LoRA) and 8-bit model training with a low memory footprint (a rough sketch of this setup follows the list).
  • use advanced evaluation metrics to judge the answers generated by the model.
  • track and compare your model performance visually. In addition, Neptune integration can be used.
  • chat with your model and get instant feedback on your model performance.
  • easily export your model to the Hugging Face Hub and share it with the community.
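The LoRA and 8-bit options above correspond to standard Hugging Face tooling. As a rough sketch of the kind of setup the GUI configures for you (the base checkpoint name and hyperparameters here are assumptions, and this is not H2O LLM Studio's internal code):

```python
# Illustrative LoRA + 8-bit fine-tuning setup with Hugging Face transformers/peft;
# this approximates what a no-code tool wires up for you, it is not H2O LLM Studio code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v2"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_8bit=True,          # 8-bit weights for a low memory footprint
    device_map="auto",
    trust_remote_code=True,
)
model = prepare_model_for_kbit_training(model)

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon-style fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```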
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

OpenAI, DeepMind will open up models to UK government​


BY LAURIE CLARKE
JUNE 12, 2023 12:07 PM CET


LONDON — Google DeepMind, OpenAI and Anthropic have agreed to open up their AI models to the U.K. government for research and safety purposes, Prime Minister Rishi Sunak announced at London Tech Week on Monday.

The priority access will be granted in order “to help build better evaluations and help us better understand the opportunities and risks of these systems,” Sunak said.

The announcement came in a speech that championed the promise of AI to transform areas such as education and healthcare and heralded the U.K.’s potential as an “island of innovation.”

“AI is surely one of the greatest opportunities before us,” said Sunak. By combining AI models with the power of quantum, “the possibilities are extraordinary,” he marvelled.

“But we must and we will do it safely,” he continued. “I know people are concerned.”

In March, the U.K. government published an AI white paper which set out a “pro-innovation” approach, but more recently Sunak has emphasized the need for “guardrails.”

Sunak said on Monday the U.K.’s ambition was to be “not just the intellectual home, but the geographical home of global AI safety regulation.” He declined to set out specific proposals for regulation.

The lynchpin of these plans is a global summit on AI safety that will be held in the U.K. in the fall, first reported by POLITICO. Sunak likened the summit to an AI version of the UN's COP climate change conferences.

A Foundation Model Taskforce will also pioneer research on AI safety and assurance techniques, backed by £100 million of funding.

In the speech to London Tech Week, Sunak also name-checked semiconductors, synthetic biology and quantum as key areas of focus for the U.K. and said the country’s agile and balanced approach to regulation would continue to make it an attractive place to invest.

Anthropic and OpenAI have recently opened up European headquarters in the U.K. Palantir announced it was setting up an AI research hub in the U.K. last week.
 

morris
Superstar · Joined Oct 8, 2014 · Messages 16,249 · Reputation 4,875 · Daps 35,721
Funny. I used Bing's ChatAI to formulate a writing sample for an interview.

Denied still :snoop:
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

GPT Engineer​

Specify what you want it to build, the AI asks for clarification, and then builds it.

GPT Engineer is made to be easy to adapt and extend, and to let your agent learn how you want your code to look. It generates an entire codebase based on a prompt.

Project philosophy​

  • Simple to get value
  • Flexible and easy to add your own "AI steps". See steps.py.
  • Incrementally build towards a user experience of:
    1. high level prompting
    2. giving feedback to the AI that it will remember over time
  • Fast handovers back and forth between AI and human
  • Simplicity: all computation is "resumable" and persisted to the filesystem
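That philosophy boils down to a small loop: prompt, let the model ask clarifying questions, then generate files and persist everything to disk. A toy sketch of such a loop using the OpenAI chat API (this is not the repo's actual steps.py; the prompts and model choice are made up):

```python
# Toy illustration of the "clarify, then build" loop GPT Engineer automates;
# not code from the gpt-engineer repo, and the prompts here are invented.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def ask(messages):
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp.choices[0].message["content"]

spec = "Build a CLI todo app that stores tasks in a JSON file."

# Step 1: the AI asks clarifying questions before writing any code.
questions = ask([
    {"role": "system", "content": "Ask the user the clarifying questions you need before coding."},
    {"role": "user", "content": spec},
])
print(questions)
answers = input("Your answers: ")

# Step 2: generate the codebase from the clarified spec and persist it to disk,
# so the run is resumable and the human can inspect or edit the output.
code = ask([
    {"role": "system", "content": "Write the full codebase. Prefix each file with its path."},
    {"role": "user", "content": spec + "\n\nClarifications:\n" + answers},
])
with open("generated_codebase.txt", "w") as f:
    f.write(code)
```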


video demo:​

 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

Augmenting Language Models with Long-Term Memory​

Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history. We design a novel decoupled network architecture with the original backbone LLM frozen as a memory encoder and an adaptive residual side-network as a memory retriever and reader. Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness. Enhanced with memory-augmented adaptation training, LongMem can thus memorize long past context and use long-term memory for language modeling. The proposed memory retrieval module can handle unlimited-length context in its memory bank to benefit various downstream tasks. Typically, LongMem can enlarge the long-form memory to 65k tokens and thus cache many-shot extra demonstration examples as long-form memory for in-context learning. Experiments show that our method outperforms strong long-context models on ChapterBreak, a challenging long-context modeling benchmark, and achieves remarkable improvements on memory-augmented in-context learning over LLMs. The results demonstrate that the proposed method is effective in helping language models to memorize and utilize long-form contents. Our code is open-sourced at this https URL.
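As a rough mental model of the decoupled design described above: the frozen backbone's key/value states for past context get banked, and a retriever later pulls back the most similar entries for the side-network to read. The following is a conceptual toy sketch only, not the authors' implementation; all names and sizes are made up.

```python
# Conceptual sketch of the cached-memory idea in LongMem: key/value states from a frozen
# backbone are banked, and a retriever pulls the top-k most similar past keys back in as
# extra context. Toy illustration only, not the paper's code.
import torch

class MemoryBank:
    def __init__(self, max_tokens=65_536):
        self.keys, self.values, self.max_tokens = [], [], max_tokens

    def cache(self, k, v):
        """Append frozen-backbone key/value states for past context."""
        self.keys.append(k)
        self.values.append(v)
        # evict the oldest entries once the token budget is exceeded
        while sum(x.shape[0] for x in self.keys) > self.max_tokens:
            self.keys.pop(0)
            self.values.pop(0)

    def retrieve(self, query, top_k=64):
        """Return the top-k cached (key, value) pairs most similar to the query."""
        keys = torch.cat(self.keys)
        values = torch.cat(self.values)
        scores = keys @ query                      # dot-product similarity
        idx = scores.topk(min(top_k, len(scores))).indices
        return keys[idx], values[idx]

# Usage: a side-network would fuse the retrieved states with current-context attention;
# this sketch only shows the cache/retrieve plumbing.
bank = MemoryBank()
bank.cache(torch.randn(128, 64), torch.randn(128, 64))   # 128 past tokens, hidden dim 64
k, v = bank.retrieve(torch.randn(64), top_k=8)
```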

LongMem​


 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

Function calling and other API updates​

We’re announcing updates including more steerable API models, function calling capabilities, longer context, and lower prices.

June 13, 2023


We released gpt-3.5-turbo and gpt-4 earlier this year, and in only a few short months have seen incredible applications built by developers on top of these models.

Today, we’re following up with some exciting updates:
  • new function calling capability in the Chat Completions API
  • updated and more steerable versions of gpt-4 and gpt-3.5-turbo
  • new 16k context version of gpt-3.5-turbo (vs the standard 4k version)
  • 75% cost reduction on our state-of-the-art embeddings model
  • 25% cost reduction on input tokens for gpt-3.5-turbo
  • announcing the deprecation timeline for the gpt-3.5-turbo-0301 and gpt-4-0314 models
All of these models come with the same data privacy and security guarantees we introduced on March 1 — customers own all outputs generated from their requests and their API data will not be used for training.

Function calling​

Developers can now describe functions to gpt-4-0613 and gpt-3.5-turbo-0613, and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools and APIs.
These models have been fine-tuned to both detect when a function needs to be called (depending on the user’s input) and to respond with JSON that adheres to the function signature. Function calling allows developers to more reliably get structured data back from the model. For example, developers can:
  • Create chatbots that answer questions by calling external tools (e.g., like ChatGPT Plugins)
    Convert queries such as “Email Anya to see if she wants to get coffee next Friday” to a function call like send_email(to: string, body: string), or “What’s the weather like in Boston?” to get_current_weather(location: string, unit: 'celsius' | 'fahrenheit').
  • Convert natural language into API calls or database queries
    Convert “Who are my top ten customers this month?” to an internal API call such as get_customers_by_revenue(start_date: string, end_date: string, limit: int), or “How many orders did Acme, Inc. place last month?” to a SQL query using sql_query(query: string).
  • Extract structured data from text
    Define a function called extract_people_data(people: [{name: string, birthday: string, location: string}]), to extract all people mentioned in a Wikipedia article.
These use cases are enabled by new API parameters in our /v1/chat/completions endpoint, functions and function_call, that allow developers to describe functions to the model via JSON Schema, and optionally ask it to call a specific function. Get started with our developer documentation and add evals if you find cases where function calling could be improved.

Function calling example​


What’s the weather like in Boston right now?

Step 1·OpenAI API
Call the model with functions and the user’s input

Step 2·Third party API
Use the model response to call your API

Step 3·OpenAI API
Send the response back to the model to summarize


The weather in Boston is currently sunny with a temperature of 22 degrees Celsius.
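Put together in code, the three steps above look roughly like this with the June 2023 openai Python library; the weather lookup is stubbed out and the values are made up.

```python
import json
import openai

def get_current_weather(location, unit="celsius"):
    # Stand-in for a real weather API call (Step 2 happens outside OpenAI).
    return json.dumps({"location": location, "temperature": 22, "unit": unit, "forecast": "sunny"})

functions = [{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City and state, e.g. Boston, MA"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}]

messages = [{"role": "user", "content": "What's the weather like in Boston right now?"}]

# Step 1: call the model with the function descriptions and the user's input.
first = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613", messages=messages, functions=functions, function_call="auto"
)
msg = first.choices[0].message

if msg.get("function_call"):
    # Step 2: call your own API with the arguments the model produced.
    args = json.loads(msg["function_call"]["arguments"])
    result = get_current_weather(**args)

    # Step 3: send the function result back so the model can summarize it for the user.
    messages += [msg, {"role": "function", "name": "get_current_weather", "content": result}]
    second = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613", messages=messages)
    print(second.choices[0].message["content"])
```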

Since the alpha release of ChatGPT plugins, we have learned much about making tools and language models work together safely. However, there are still open research questions. For example, a proof-of-concept exploit illustrates how untrusted data from a tool’s output can instruct the model to perform unintended actions. We are working to mitigate these and other risks. Developers can protect their applications by only consuming information from trusted tools and by including user confirmation steps before performing actions with real-world impact, such as sending an email, posting online, or making a purchase.

New models​

GPT-4​

gpt-4-0613 includes an updated and improved model with function calling.
gpt-4-32k-0613 includes the same improvements as gpt-4-0613, along with an extended context length for better comprehension of larger texts.
With these updates, we’ll be inviting many more people from the waitlist to try GPT-4 over the coming weeks, with the intent to remove the waitlist entirely with this model. Thank you to everyone who has been patiently waiting; we are excited to see what you build with GPT-4!

GPT-3.5 Turbo​

gpt-3.5-turbo-0613 includes the same function calling as GPT-4 as well as more reliable steerability via the system message, two features that allow developers to guide the model's responses more effectively.
gpt-3.5-turbo-16k offers 4 times the context length of gpt-3.5-turbo at twice the price: $0.003 per 1K input tokens and $0.004 per 1K output tokens. 16k context means the model can now support ~20 pages of text in a single request.

Model deprecations​

Today, we’ll begin the upgrade and deprecation process for the initial versions of gpt-4 and gpt-3.5-turbo that we announced in March. Applications using the stable model names (gpt-3.5-turbo, gpt-4, and gpt-4-32k) will automatically be upgraded to the new models listed above on June 27th. For comparing model performance between versions, our Evals library supports public and private evals to show how model changes will impact your use cases.

Developers who need more time to transition can continue using the older models by specifying gpt-3.5-turbo-0301, gpt-4-0314, or gpt-4-32k-0314 in the ‘model’ parameter of their API request. These older models will be accessible through September 13th, after which requests specifying those model names will fail. You can stay up to date on model deprecations via our model deprecation page. This is the first update to these models, so we eagerly welcome developer feedback to help us ensure a smooth transition.
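For example, pinning the older snapshot explicitly keeps an application on the March model until the September cutoff (a minimal sketch using the pre-1.0 openai Python library):

```python
import openai

# Explicit snapshot; the stable alias "gpt-3.5-turbo" auto-upgrades on June 27th.
resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0301",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message["content"])
```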

Lower pricing​

We continue to make our systems more efficient and are passing those savings on to developers, effective today.

Embeddings​

text-embedding-ada-002 is our most popular embeddings model. Today we’re reducing the cost by 75% to $0.0001 per 1K tokens.

GPT-3.5 Turbo​

gpt-3.5-turbo is our most popular chat model and powers ChatGPT for millions of users. Today we're reducing the cost of gpt-3.5-turbo’s input tokens by 25%. Developers can now use this model for just $0.0015 per 1K input tokens and $0.002 per 1K output tokens, which equates to roughly 700 pages per dollar.
gpt-3.5-turbo-16k will be priced at $0.003 per 1K input tokens and $0.004 per 1K output tokens.
Developer feedback is a cornerstone of our platform’s evolution and we will continue to make improvements based on the suggestions we hear. We’re excited to see how developers use these latest models and new features in their applications.
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

The Beatles will release a final record, using John Lennon's voice via an AI assist​


June 13, 2023, 12:41 PM ET
By Bill Chappell


The voice of John Lennon, seen here in 1963, will appear on The Beatles' new record, says his former bandmate Paul McCartney.
Hulton Archive/Getty Images


The music has analog roots, but now it's being revived by futuristic technology: The Beatles have completed a new recording using an old demo tape by John Lennon, thanks to AI tools that isolate Lennon's voice, according to Paul McCartney.

"We just finished it up, it'll be released this year," McCartney, Lennon's former bandmate, told the Today program on BBC Radio 4. It will be "the last Beatles record," said McCartney, who along with Ringo Starr is one of two surviving band members.

But if you're picturing McCartney sitting at a keyboard and telling ChatGPT, "sing a John Lennon verse," that's not what happened. Instead, they used source material from a demo recording that Lennon made before his death in 1980.

"We were able to take John's voice and get it pure through this AI, so that then we could mix the record as you would normally do. So, it gives you some sort of leeway."

McCartney says he realized technology could offer a new chance to work on the music after seeing Peter Jackson, the famously technically astute filmmaker, resurrect archival materials for Get Back, his documentary about the band making the Let It Be album.

"He was able to extricate John's voice from a ropey little bit of cassette which had John's voice and a piano," McCartney said of the director.

"He could separate them with AI. They could, they'd tell the machine, 'That's a voice. This is a guitar. Lose the guitar.' And he did that."


"We were able to take John's voice and get it pure through this AI," Paul McCartney says. The Beatles are seen here celebrating after finishing their album, Sgt. Pepper's Lonely Hearts Club Band.
John Pratt/Keystone / Getty Images


McCartney didn't give details about what he says is The Beatles' final record, poised to emerge decades after Lennon was shot and killed in December 1980.

But author Keith Badman has reported that in 1994, Lennon's widow, Yoko Ono, gave McCartney several of the late singer and songwriter's home demo recordings.

The tape included Lennon's love song "Now And Then." As the BBC's Mark Savage notes, previous attempts to finish the song were abandoned due to the poor audio quality of Lennon's voice on the recording.

In the interview, McCartney also said he's concerned with how AI might be used going forward, given its ability to perform trickery like replacing one singer's vocals with another person's.

"All of that is kind of scary," McCartney said, "but exciting — because it's the future."
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769


The Curse of Recursion: Training on Generated Data Makes Models Forget​

Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson
Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper we consider what the future might hold. What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as Model Collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs. We build theoretical intuition behind the phenomenon and portray its ubiquity amongst all learned generative models. We demonstrate that it has to be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, data collected from genuine human interactions with systems will be increasingly valuable in the presence of content generated by LLMs in data crawled from the Internet.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
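The effect is easy to see even in a toy setting: fit a Gaussian to data, sample a new "training set" from the fit, refit, and repeat. The sketch below only illustrates that feedback loop; it is not one of the paper's experiments.

```python
# Toy illustration of the feedback loop (not the paper's experiments): each generation
# is fit to samples produced by the previous generation's fit, so estimation error
# compounds and the learned distribution drifts away from the original one.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=500)       # generation 0: "human" data, N(0, 1)

for gen in range(1, 51):
    mu, sigma = data.mean(), data.std()                # fit a Gaussian "model" to the data
    data = rng.normal(mu, sigma, size=500)             # next generation trains only on generated data
    if gen % 10 == 0:
        print(f"generation {gen:2d}: mean = {mu:+.3f}, std = {sigma:.3f}")
```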


 

Orbital-Fetus
cross that bridge · Supporter · Joined May 5, 2012 · Messages 40,288 · Reputation 17,645 · Daps 146,066 · Reppin Humanity
 

bnew
Veteran · Joined Nov 1, 2015 · Messages 51,832 · Reputation 7,926 · Daps 148,769

SqueezeLLM: Dense-and-Sparse Quantization​

Sehoon Kim, Coleman Hooper, Amir Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer
Generative Large Language Models (LLMs) have demonstrated remarkable results for a wide range of tasks. However, deploying these models for inference has been a significant challenge due to their unprecedented resource requirements. This has forced existing deployment frameworks to use multi-GPU inference pipelines, which are often complex and costly, or to use smaller and less performant models. In this work, we demonstrate that the main bottleneck for generative inference with LLMs is memory bandwidth, rather than compute, specifically for single batch inference. While quantization has emerged as a promising solution by representing model weights with reduced precision, previous efforts have often resulted in notable performance degradation. To address this, we introduce SqueezeLLM, a post-training quantization framework that not only enables lossless compression to ultra-low precisions of up to 3-bit, but also achieves higher quantization performance under the same memory constraint. Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format. When applied to the LLaMA models, our 3-bit quantization significantly reduces the perplexity gap from the FP16 baseline by up to 2.1x as compared to the state-of-the-art methods with the same memory requirement. Furthermore, when deployed on an A6000 GPU, our quantized models achieve up to 2.3x speedup compared to the baseline. Our code is open-sourced and available online.





Google Bard summary:​

SqueezeLLM is a new way to make language models smaller and faster. It does this by using a technique called quantization. Quantization is like taking a picture of a big object and then making it smaller. When you do this, you lose some of the detail, but the object is still recognizable.

SqueezeLLM is able to quantize language models without losing too much detail. This means that they can be used on smaller devices without sacrificing performance.

SqueezeLLM works by first identifying the most important weights in the language model. These weights are then stored in full precision, while the less important weights are stored in lower precision. This allows SqueezeLLM to reduce the size of the language model without sacrificing accuracy.

SqueezeLLM also uses a technique called sparsity to further reduce the size of the language model. Sparsity is a technique that identifies and removes redundant weights from the language model. This can further reduce the size of the language model without sacrificing accuracy.

SqueezeLLM has been shown to be effective in reducing the size and improving the performance of language models. It has been shown to reduce the size of language models by up to 90% while only reducing accuracy by a small margin.

SqueezeLLM is still under development, but it has the potential to make language models more accessible to a wider range of people.
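For intuition, the Dense-and-Sparse decomposition can be mimicked in a few lines: move a small fraction of large-magnitude weights into a sparse full-precision matrix and quantize the rest to eight k-means centroids (3 bits). Note this toy picks outliers by magnitude rather than the paper's sensitivity metric, so it is a sketch of the idea, not the SqueezeLLM implementation.

```python
# Toy dense-and-sparse sketch (idea only, not the SqueezeLLM code): outliers are kept
# sparse in full precision, everything else is quantized to 8 k-means centroids (3 bits).
# Real SqueezeLLM selects sensitive values using second-order information, not a
# simple magnitude threshold as done here.
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)
W[rng.random(W.shape) < 0.001] *= 50.0                 # inject a few large outliers

# Sparse part: keep the top 0.5% largest-magnitude weights exactly.
threshold = np.quantile(np.abs(W), 0.995)
outlier_mask = np.abs(W) >= threshold
sparse_part = csr_matrix(W * outlier_mask)

# Dense part: 3-bit non-uniform quantization of the remaining weights.
dense_vals = W[~outlier_mask].reshape(-1, 1)
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(dense_vals)
codes = np.zeros(W.shape, dtype=np.uint8)
codes[~outlier_mask] = kmeans.predict(dense_vals)      # one 3-bit code per dense weight
centroids = kmeans.cluster_centers_.ravel()            # 8 shared centroids for the matrix

# Reconstruction: look up centroids for dense weights, then add the sparse outliers back.
W_hat = centroids[codes].astype(np.float32)
W_hat[outlier_mask] = 0.0
W_hat += sparse_part.toarray()
print("mean abs reconstruction error:", np.abs(W - W_hat).mean())
```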
 