bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829



Introduction to the AI Index Report 2023 Welcome to the sixth edition of the AI Index Report! This year, the report introduces more original data than any previous edition, including a new chapter on AI public opinion, a more thorough technical performance chapter, original analysis about large language and multimodal models, detailed trends in global AI legislation records, a study of the environmental impact of AI systems, and more. The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI. The report aims to be the world’s most credible and authoritative source for data and insights about AI.

p3ZwDcZ.png

Y54LqaP.png
 

Flexington

All Star
Joined
May 31, 2022
Messages
2,133
Reputation
1,111
Daps
9,463
this is the first time something new has come out that makes me feel :flabbynsick:
At a glance, I didn't understand most of what was going on but after clicking a few of these links and seeing examples I see how powerful this tool is and can be.


I have a lot of reading, learning, and experimenting to do
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829

FswLk0IXwAA3Rue


A Survey of Large Language Models​

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.

 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829
mentioned it already but heres another tweet thread.




Fs1ROwaWYAEXTTZ

Fs1UbYpX0AESvJ6

Fs1XOIoWAAIPpLU

Fs1ZKMgWAAE4aT6




demo site for the 7B version : Baize Lora 7B - a Hugging Face Space by project-baize
 
Last edited:

Micky Mikey

Veteran
Supporter
Joined
Sep 27, 2013
Messages
15,850
Reputation
2,840
Daps
88,231
Anyone else really been consumed with all A.I. stuff recently? For me I feel like everything else going in the world right now has fallen in the backdrop. I feel like we're on the precipice of indescribable change and most people are completely oblivious.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829

GPTeacher​

A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer

The General-Instruct used many of the same seed prompts as alpaca, but also had specific examples of things we didnt see much in with alpaca. Such as Chain of Thought Reasoning, Logic Puzzles, Wordplay, Role Playing (lightly), and was asked to include reasoning behind and thought steps where appropriate in example responses, among other things. The General-Instruct dataset is about 20,000 examples with just deduplication.

Still cleaning the codegen instruct dataset, will be up when its cleaned.

Each dataset is split into 5 seperate datasets, based on similarity scored cleaning. Simple dedupe only, and then range of <60% to <90% similarity cleaned sets for each.

They are all made to be compliant with Alpaca's dataset format, i.e. each has an instruction, input, and output field, should make it easier to use the same fine tune script and process as alpaca has.

Documentation on the toolformers section coming soon, we generated a dataset to use a set of predefined tools, including search, python, terminal/shell, wikipedia, wolfram, and others. More info on prompt format for inference soon..
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829
[/U]

Koala: A Dialogue Model for Academic Research​

Xinyang Geng*, Arnav Gudibande*, Hao Liu* and Eric Wallace* Apr 3, 2023
model.png


snippet:
In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web. We describe the dataset curation and training process of our model, and also present the results of a user study that compares our model to ChatGPT and Stanford’s Alpaca. Our results show that Koala can effectively respond to a variety of user queries, generating responses that are often preferred over Alpaca, and at least tied with ChatGPT in over half of the cases.

We hope that these results contribute further to the discourse around the relative performance of large closed-source models to smaller public models. In particular, it suggests that models that are small enough to be run locally can capture much of the performance of their larger cousins if trained on carefully sourced data. This might imply, for example, that the community should put more effort into curating high-quality datasets, as this might do more to enable safer, more factual, and more capable models than simply increasing the size of existing systems. We emphasize that Koala is a research prototype, and while we hope that its release will provide a valuable community resource, it still has major shortcomings in terms of content, safety, and reliability, and should not be used outside of research.


System Overview

Large language models (LLMs) have enabled increasingly powerful virtual assistants and chat bots, with systems such as ChatGPT, Bard, Bing Chat, and Claude able to respond to a breadth of user queries, provide sample code, and even write poetry. Many of the most capable LLMs require huge computational resources to train, and oftentimes use large and proprietary datasets. This suggests that in the future, highly capable LLMs will be largely controlled by a small number of organizations, and both users and researchers will pay to interact with these models without direct access to modify and improve them on their own. On the other hand, recent months have also seen the release of increasingly capable freely available or (partially) open-source models, such as LLaMA. These systems typically fall short of the most capable closed models, but their capabilities have been rapidly improving. This presents the community with an important question: will the future see increasingly more consolidation around a handful of closed-source models, or the growth of open models with smaller architectures that approach the performance of their larger but closed-source cousins?

While the open models are unlikely to match the scale of closed-source models, perhaps the use of carefully selected training data can enable them to approach their performance. In fact, efforts such as Stanford’s Alpaca, which fine-tunes LLaMA on data from OpenAI’s GPT model, suggest that the right data can improve smaller open source models significantly.

We introduce a new model, Koala, which provides an additional piece of evidence toward this discussion. Koala is fine-tuned on freely available interaction data scraped from the web, but with a specific focus on data that includes interaction with highly capable closed-source models such as ChatGPT. We fine-tune a LLaMA base model on dialogue data scraped from the web and public datasets, which includes high-quality responses to user queries from other large language models, as well as question answering datasets and human feedback datasets. The resulting model, Koala-13B, shows competitive performance to existing models as suggested by our human evaluation on real-world user prompts.

Our results suggest that learning from high-quality datasets can mitigate some of the shortcomings of smaller models, maybe even matching the capabilities of large closed-source models in the future. This might imply, for example, that the community should put more effort into curating high-quality datasets, as this might do more to enable safer, more factual, and more capable models than simply increasing the size of existing systems.

By encouraging researchers to engage with our system demo, we hope to uncover any unexpected features or deficiencies that will help us evaluate the models in the future. We ask researchers to report any alarming actions they observe in our web demo to help us comprehend and address any issues. As with any release, there are risks, and we will detail our reasoning for this public release later in this blog post. We emphasize that Koala is a research prototype, and while we hope that its release will provide a valuable community resource, it still has major shortcomings in terms of content, safety, and reliability, and should not be used outside of research. Below we provide an overview of the differences between Koala and notable existing models.

compare.png


Datasets and Training

A primary obstacle in building dialogue models is curating training data. Prominent chat models, including ChatGPT, Bard, Bing Chat and Claude use proprietary datasets built using significant amounts of human annotation. To construct Koala, we curated our training set by gathering dialogue data from the web and public datasets. Part of this data includes dialogues with large language models (e.g., ChatGPT) which users have posted online.

Rather than maximizing quantity by scraping as much web data as possible, we focus on collecting a small high-quality dataset. We use public datasets for question answering, human feedback (responses rated both positively and negatively), and dialogues with existing language models. We provide the specific details of the dataset composition below.


ChatGPT Distillation Data

Public User-Shared Dialogues with ChatGPT (ShareGPT) Around 60K dialogues shared by users on ShareGPT were collected using public APIs. To maintain data quality, we deduplicated on the user-query level and removed any non-English conversations. This leaves approximately 30K examples.

Human ChatGPT Comparison Corpus (HC3) We use both the human and ChatGPT responses from the HC3 english dataset, which contains around 60K human answers and 27K ChatGPT answers for around 24K questions, resulting in a total number of around 87K question-answer examples.

{continue reading on the site}

demo site:
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,129
Reputation
8,239
Daps
157,829

Tools and Resources for AI Art​

[apps] [discord] [learn] [music] [prompts] [text] [text to image] [upscale] [video]
Looking to get started with AI art? A good place to start is one of the popular apps like DreamStudio , midjourney , Wombo , or NightCafe . You can get a quick sense of how you can use words and phrases to guide image generation. Read up on prompt engineering to improve your results. Then you may want to move on to using Google Colab notebooks linked below like Deforum. If you have a good nVidia GPU of your own then you can also use NMKD Stable Diffusion GUI or Visions of Chaos to run the most popular notebooks locally. If you want to train your own Ai models check out the Ai art model training page.
 

null

...
Joined
Nov 12, 2014
Messages
29,261
Reputation
4,909
Daps
46,449
Reppin
UK, DE, GY, DMV
why is this being adopted so fast?

even my slow ass company has a GPT trained model?

its like some flipped a switch and everybody was prepared
seems like a psyop

it's like the blockchain bubble. people will realise the limitations soon enough.
 
Top