The A.I Megathread (LLM , GPT , Development)

bnew · Jun 6, 2023

GitHub - louisgv/local.ai: 🎒 local.ai - Run AI locally on your PC!

🎒 local.ai - Run AI locally on your PC! Contribute to louisgv/local.ai development by creating an account on GitHub.

github.com

About

local.ai - bring your model and start the experimentation!

A desktop app for local AI experimentation, model inference hosting, and note-taking.

It's made to be used alongside window.ai (GitHub - alexanderatallah/window.ai: Use your own AI models on the web) as a simple way to have a local inference server up and running in no time. Together, window.ai + local.ai enable every web app to utilize AI without incurring any cost for either the developer or the user!

Right now, local.ai uses the llm Rust crate (GitHub - rustformers/llm: An ecosystem of Rust libraries for working with large language models) at its core. Check them out, they are super cool!

https://user-images.githubusercontent.com/6723574/243338764-ba4a04dc-5087-4725-b619-165ad774aedd.mp4

bnew · Jun 6, 2023

GitHub - camenduru/text-to-video-synthesis-colab: Text To Video Synthesis Colab

Text To Video Synthesis Colab. Contribute to camenduru/text-to-video-synthesis-colab development by creating an account on GitHub.

github.com

About

Text To Video Synthesis Colab

Examples

A giraffe underneath a microwave.
A goldendoodle playing in a park by a lake.
A panda bear driving a car.
A teddy bear running in New York City.
Drone flythrough of a fast food restaurant on a dystopian alien planet.
A dog wearing a Superhero outfit with red cape flying through the sky.
Monkey learning to play the piano.
A litter of puppies running through the yard.
Robot dancing in Times Square.

bnew · Jun 7, 2023

https://archive.is/RpAmI

Introducing LTM-1 — Magic

Magic’s LTM-1 model has a 5 million token context window, allowing it to see your entire repository of code.

magic.dev

LTM-1: an LLM with a 5,000,000 token context window
Magic Team
6/5/2023

Magic’s LTM-1 enables 50x larger context windows than transformers

Magic has trained a Large Language Model (LLM) that can take in gigantic amounts of context when generating suggestions. For our coding assistant, this means Magic can now see your entire repository of code.

bnew · Jun 7, 2023

GitHub - openchatai/OpenChat: LLMs custom-chatbots console ⚡

LLMs custom-chatbots console ⚡. Contribute to openchatai/OpenChat development by creating an account on GitHub.

github.com

About

LLMs custom-chatbots console

OpenChat

Important disclaimer: This is an ongoing effort to create a free & open source chatbot console that allows you to easily create unlimited chatbots using different models for your daily use. Our main goal is to make the interface simple and user-friendly for everyone. If you find this interesting, we would greatly appreciate your support in contributing to this project. We have a highly ambitious plan that we are determined to implement!

OpenChat is an everyday user chatbot console that simplifies the utilization of large language models. With the advancements in AI, the installation and usage of these models have become overwhelming. OpenChat aims to address this challenge by providing a two-step setup process to create a comprehensive chatbot console. It serves as a central hub for managing multiple customized chatbots.

Currently, OpenChat supports GPT models, and we are actively working on incorporating various open-source drivers that can be activated with a single click.

Try it out:

You can try it out on openchat.so (we use our own OpenAI/Pinecone tokens for the demo, so please be mindful of usage; we clear out bots every 3 hours).

Current Features

Create unlimited local chatbots based on GPT-3 (and GPT-4 if available).
Customize your chatbots by providing PDF files, websites, and soon, integrations with platforms like Notion, Confluence, and Office 365.
Each chatbot has unlimited memory capacity, enabling seamless interaction with large files such as a 400-page PDF.
Embed chatbots as widgets on your website or internal company tools.
Use your entire codebase as a data source for your chatbots (pair programming mode).
And much more!

Run and create custom ChatGPT-like bots with OpenChat | Hacker News

news.ycombinator.com

bnew · Jun 7, 2023

GitHub - InternLM/InternLM-techreport

Contribute to InternLM/InternLM-techreport development by creating an account on GitHub.

github.com

InternLM

InternLM is a multilingual large language model jointly developed by Shanghai AI Lab and SenseTime (with equal contribution), in collaboration with the Chinese University of Hong Kong, Fudan University, and Shanghai Jiaotong University.

Technical report: [PDF]

Note: Please right click the link above to directly download the PDF file.

Abstract

We present InternLM, a multilingual foundational language model with 104B parameters. InternLM is pre-trained on a large corpus of 1.6T tokens using a multi-phase progressive process, and then fine-tuned to align with human preferences. We also developed a training system called Uniscale-LLM for efficient large language model training. The evaluation on a number of benchmarks shows that InternLM achieves state-of-the-art performance in multiple aspects, including knowledge understanding, reading comprehension, mathematics, and coding. With such well-rounded capabilities, InternLM achieves outstanding performance on comprehensive exams, including MMLU, AGIEval, C-Eval and GAOKAO-Bench, without resorting to external tools. On these benchmarks, InternLM not only significantly outperforms open-source models, but also obtains superior performance compared to ChatGPT. InternLM also demonstrates an excellent understanding of the Chinese language and Chinese culture, which makes it a suitable foundation model to support Chinese-oriented language applications. This manuscript gives a detailed study of our results, with benchmarks and examples across a diverse set of knowledge domains and tasks.

Main Results

As the latest large language models begin to exhibit human-level intelligence, exams designed for humans, such as China's college entrance examination and the US SAT and GRE, are considered an important means of evaluating language models. Note that in its technical report on GPT-4, OpenAI tested GPT-4 through exams across multiple areas and used the exam scores as the key results.

We tested InternLM in comparison with others on four comprehensive exam benchmarks, as below:

MMLU: A multi-task benchmark constructed based on various US exams, which covers elementary mathematics, physics, chemistry, computer science, American history, law, economics, diplomacy, etc.
AGIEval: A benchmark developed by Microsoft Research to evaluate the ability of language models through human-oriented exams, which comprises 19 task sets derived from various exams in China and the United States, e.g., the college entrance exams and lawyer qualification exams in China, and SAT, LSAT, GRE and GMAT in the United States. Among the 19 task sets, 9 sets are based on the Chinese college entrance exam (Gaokao), which we single out as an important collection named AGIEval (GK).
C-Eval: A comprehensive benchmark devised to evaluate Chinese language models, which contains nearly 14,000 questions in 52 subjects, covering mathematics, physics, chemistry, biology, history, politics, computer and other disciplines, as well as professional exams for civil servants, certified accountants, lawyers, and doctors.
GAOKAO-Bench: A comprehensive benchmark based on the Chinese college entrance exams, which includes all subjects of the college entrance exam. It provides different types of questions, including multiple choice, blank filling, and QA. For conciseness, we refer to this benchmark simply as Gaokao.

Results on MMLU

Results on AGIEval

Results on C-Eval

C-Eval has a live leaderboard. Below is a screenshot that shows all the results (as of 2023-06-01).

Results on GAOKAO-Benchmark

Benchmarks in Specific Aspects

We also tested InternLM in comparison with others in multiple aspects:

Knowledge QA: TriviaQA and NaturalQuestions.
Reading Comprehension: RACE
Chinese Understanding: CLUE and FewCLUE
Mathematics: GSM8k and MATH
Coding: HumanEval and MBPP

Please refer to our technical report for detailed results.

We are working on more tests, and will share new results as our work proceeds.

bnew · Jun 7, 2023

GitHub - SupaGruen/StableDiffusion-CheatSheet: A list of StableDiffusion styles and some notes for offline use. Pure HTML, CSS and a bit of JS.

github.com

Stable Diffusion Cheat-Sheet

This began as a personal collection of styles and notes. I was curious to see how the artists used in the prompts looked without the other keywords.

bnew · Jun 7, 2023

https://archive.is/BesiS

bnew · Jun 8, 2023

AlphaDev discovers faster sorting algorithms

In our paper published today in Nature, we introduce AlphaDev, an artificial intelligence (AI) system that uses reinforcement learning to discover enhanced computer science algorithms – surpassing...

www.deepmind.com

New algorithms will transform the foundations of computing

Digital society is driving increasing demand for computation, and energy use. For the last five decades, we relied on improvements in hardware to keep pace. But as microchips approach their physical limits, it’s critical to improve the code that runs on them to make computing more powerful and sustainable. This is especially important for the algorithms that make up the code running trillions of times a day.

In our paper published today in Nature, we introduce AlphaDev, an artificial intelligence (AI) system that uses reinforcement learning to discover enhanced computer science algorithms – surpassing those honed by scientists and engineers over decades.

AlphaDev uncovered a faster algorithm for sorting, a method for ordering data. Billions of people use these algorithms every day without realising it. They underpin everything from ranking online search results and social posts to how data is processed on computers and phones. Generating better algorithms using AI will transform how we program computers and impact all aspects of our increasingly digital society.

We open sourced our new sorting algorithms in the main C++ library, where millions of developers and companies around the world now use them in AI applications across industries, from cloud computing and online shopping to supply chain management. This is the first change to this part of the sorting library in over a decade and the first time an algorithm designed through reinforcement learning has been added to this library. We see this as an important stepping stone for using AI to optimise the world’s code, one algorithm at a time.

What is sorting?

Sorting is a method of organising a number of items in a particular order. Examples include alphabetising three letters, arranging five numbers from biggest to smallest, or ordering a database of millions of records.
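To make the examples above concrete, here is a minimal Python sketch. The helper names `compare_swap` and `sort3` are made up for illustration, and the fixed three-comparison routine is a textbook sorting network, not the algorithm AlphaDev discovered. The five-number case simply uses Python's built-in `sorted`.

```python
def compare_swap(items, i, j):
    """Put the smaller of items[i] and items[j] at index i (assumes i < j)."""
    if items[j] < items[i]:
        items[i], items[j] = items[j], items[i]


def sort3(a, b, c):
    """Sort exactly three items using a fixed sequence of three comparisons."""
    items = [a, b, c]
    compare_swap(items, 0, 1)  # order the first pair
    compare_swap(items, 1, 2)  # move the largest item to the end
    compare_swap(items, 0, 1)  # order the remaining pair
    return items


# Alphabetising three letters.
alphabetised = sort3("c", "a", "b")

# Arranging five numbers from biggest to smallest with the built-in sort.
descending = sorted([3, 9, 1, 7, 5], reverse=True)
```

The three-item routine always performs the same comparisons regardless of input, which is the kind of short, fixed-length code where instruction-level changes can matter.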

This method has evolved throughout history. One of the earliest examples dates back to the second and third century, when scholars alphabetised thousands of books by hand on the shelves of the Great Library of Alexandria. Following the industrial revolution came the invention of machines that could help with sorting: tabulation machines stored information on punch cards, which were used to collect the 1890 census results in the United States.

And with the rise of commercial computers in the 1950s, we saw the development of the earliest computer science algorithms for sorting. Today, there are many different sorting techniques and algorithms which are used in codebases around the world to organise massive amounts of data online.

Figure: Illustration of what a sorting algorithm does. A series of unsorted numbers is input into the algorithm and sorted numbers are output.

Contemporary algorithms took computer scientists and programmers decades of research to develop. They’re so efficient that making further improvements is a major challenge, akin to trying to find a new way to save electricity or a more efficient mathematical approach. These algorithms are also a cornerstone of computer science, taught in introductory computer science classes at universities.

Searching for new algorithms

AlphaDev uncovered faster algorithms by starting from scratch rather than refining existing algorithms, and began looking where most humans don’t: the computer’s assembly instructions.

Assembly instructions are used to create binary code for computers to put into action. While developers write in coding languages like C++, known as high-level languages, this must be translated into ‘low-level’ assembly instructions for computers to understand.

We believe many improvements exist at this lower level that may be difficult to discover in a higher-level coding language. Computer storage and operations are more flexible at this level, which means there are significantly more potential improvements that could have a larger impact on speed and energy usage.

Figure: Code is typically written in a high-level programming language such as C++. This is then translated to low-level CPU instructions, called assembly instructions, using a compiler. An assembler then converts the assembly instructions to executable machine code that the computer can run.

bnew · Jun 8, 2023

https://archive.is/xI6Zv

[2305.14825] Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

GitHub - XiaojuanTang/ICSR

Contribute to XiaojuanTang/ICSR development by creating an account on GitHub.

github.com

bnew · Jun 8, 2023

https://archive.is/LRoN8

bnew · Jun 8, 2023

Wolfram Prompt Repository

resources.wolframcloud.com

WOLFRAM PROMPT REPOSITORY (UNDER CONSTRUCTION)

A curated collection of prompts, personas, functions, & more for LLMs (large language model AIs)
Announcement post »

bnew · Jun 8, 2023

Open Language Safety Research

openlsr.org

SAIL - Search Augmented Instruction Learning

Authors

Hongyin Luo and Yung-Sung Chuang and Yuan Gong and Tianhua Zhang and Yoon Kim and Danny Fox and Xixin Wu and Helen Meng and James Glass

Affiliations

MIT CSAIL and MIT Linguistics and CUHK & CPII

Towards transparent & robust Chatbot

We develop a search engine-grounded large language model that generates language grounded on noisy search results, improving both the transparency of LLMs and their robustness to distracting information.
[PREPRINT] | [DEMO] | [GITHUB]

Introducing SAIL-7B Language Model

The Research Question and Our Solution

Can search engines always improve language models?
- No. We found that the improvement from applying search engines to LLMs is minimal on several tasks. While search engines retrieve a vast range of up-to-date information, the retrieval results can be conflicting or distracting. Such grounding information is not necessarily helpful to language models.

How to improve language models with search engines?
- We fine-tune a large language model (LLaMA-7B) grounded on real search engine outputs. The fine-tuned model can automatically distill the informative search results and flag distracting items. With the search-augmented fine-tuning, our model can be significantly boosted by a search engine, outperforming state-of-the-art chatbots including ChatGPT and Vicuna-13B with far fewer parameters.

How to evaluate search-augmented large language models?
- Automatic scoring for instruction following with GPT-4. SAIL-7B achieves higher scores than search-augmented ChatGPT and Vicuna models.
- We test the search-grounded answering performance on open-ended QA benchmarks.
- Fact and fairness checking. One of our goals is to fight against misinformation, hate, and stereotypes with large language models. We test several LLMs on the UniLC benchmark.

More Details

A subset of our code and data is already publicly available and will be updated to a complete version before June 24. We utilize the pretrained LLaMA model, Alpaca 52k instructions, and GPT-4-generated responses. Please consider the terms of use of these projects.
GITHUB REPOSITORY

SAIL-7B Performance

Implementation

Backbone Model

We fine-tuned the LLaMA-7b model with a search-augmented instruction training set.

Training Data

We fine-tune a LLaMA-7B model using the 52k instructions designed by the Alpaca Team, with the responses generated by GPT-4. In addition, we collect 10 search results (titles + previews only) for each instruction with DuckDuckGo.com and a BM25-based Wikipedia retriever implemented by Pyserini, but feed only the top 0 to 5 sampled search results to LLaMA for fine-tuning and evaluation. The training data can be downloaded from our GitHub repository.
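As a rough illustration of the data construction described above, the sketch below keeps a randomly sized prefix of the retrieved results and prepends it to the instruction. This is only one possible reading of "top 0 to 5 sampled search results": the uniform choice of k, the `SearchResult` type, the function names, and the prompt layout are all assumptions made for this example, not the authors' actual code or format.

```python
import random
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    preview: str


def sample_top_results(results, rng, max_k=5):
    """Keep the top-k results, with k drawn uniformly from 0..max_k (assumed reading)."""
    k = rng.randint(0, min(max_k, len(results)))
    return results[:k]


def build_prompt(instruction, results):
    """Assemble a hypothetical search-augmented prompt from titles and previews only."""
    lines = [
        f"Search result {i}: {r.title} - {r.preview}"
        for i, r in enumerate(results, start=1)
    ]
    lines.append(f"Instruction: {instruction}")
    return "\n".join(lines)


if __name__ == "__main__":
    rng = random.Random(0)
    # Ten placeholder results standing in for the retrieved search results.
    retrieved = [SearchResult(f"Title {i}", f"Preview text {i}") for i in range(1, 11)]
    chosen = sample_top_results(retrieved, rng)
    print(build_prompt("Explain what BM25 is.", chosen))
```

Allowing k = 0 means some training examples carry no search context at all, so a model trained this way also sees plain instructions.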

Training Details

We trained the model on 4 NVIDIA RTX A6000 GPUs (4 x 48 GB). The training takes ~24 hours (4 x 24 = 96 GPU hours). The details of the training parameters can be found in our GitHub repository.

Training Code

We trained our model using the FastChat library.
