bnew

Veteran
Joined
Nov 1, 2015
Messages
51,659
Reputation
7,896
Daps
148,409

Llama and derivatives

Curated list of llama and similar models.

Available as a Model Google Sheet







Open LLM Leaderboard

With the plethora of large language models (LLMs) and chatbots being released week upon week, often with grandiose claims of their performance, it can be hard to filter out the genuine progress that is being made by the open-source community and which model is the current state of the art. The 🤗 Open LLM Leaderboard aims to track, rank and evaluate LLMs and chatbots as they are released. We evaluate models on 4 key benchmarks from the Eleuther AI Language Model Evaluation Harness, a unified framework to test generative language models on a large number of different evaluation tasks. A key advantage of this leaderboard is that anyone from the community can submit a model for automated evaluation on the 🤗 GPU cluster, as long as it is a 🤗 Transformers model with weights on the Hub. We also support evaluation of models with delta-weights for non-commercial licensed models, such as LLaMa.

Evaluation is performed against 4 popular benchmarks:

  • AI2 Reasoning Challenge (25-shot) - a set of grade-school science questions.
  • HellaSwag (10-shot) - a test of commonsense inference, which is easy for humans (~95%) but challenging for SOTA models.
  • MMLU (5-shot) - a test to measure a text model’s multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more.
  • TruthfulQA (0-shot) - a benchmark to measure whether a language model is truthful in generating answers to questions.
We chose these benchmarks as they test a variety of reasoning and general knowledge across a wide variety of fields in 0-shot and few-shot settings.
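
For anyone unsure what "25-shot" versus "0-shot" means in the list above: a k-shot evaluation puts k solved examples in front of the question the model is actually scored on. Here is a minimal Python sketch of that idea. The `build_k_shot_prompt` helper, the "Question:/Answer:" layout and the demo questions are all made up for illustration; the real Evaluation Harness uses its own task-specific prompt formats.

```python
# Shot counts exactly as listed in the post above.
SHOT_COUNTS = {
    "AI2 Reasoning Challenge": 25,
    "HellaSwag": 10,
    "MMLU": 5,
    "TruthfulQA": 0,
}


def build_k_shot_prompt(solved_examples, question, k):
    """Return a prompt with k solved examples placed before the new question.

    solved_examples: list of (question, answer) pairs (hypothetical data).
    k: number of solved examples to include; 0 means the bare question only.
    """
    if k < 0 or k > len(solved_examples):
        raise ValueError("k must be between 0 and the number of solved examples")
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in solved_examples[:k]]
    # The final question is left unanswered for the model to complete.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    demo = [
        ("What do plants need for photosynthesis?", "Sunlight"),
        ("Which state of matter has a fixed shape?", "Solid"),
    ]
    print(build_k_shot_prompt(demo, "What gas do humans breathe out?", k=2))
```

With `k=0` the function returns only the unanswered question, which is the TruthfulQA setting in the list.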
 


bnew


About

gpt4all: an ecosystem of open-source chatbots trained on massive collections of clean assistant data including code, stories and dialogue


GPT4All


GPT4All: An ecosystem of open-source on-edge large language models.

GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer grade CPUs.

The goal is simple - be the best instruction tuned assistant-style language model that any person or enterprise can freely use, distribute and build on.

A GPT4All model is a 3GB - 8GB file that you can download and plug into the GPT4All open-source ecosystem software. Nomic AI supports and maintains this software ecosystem to enforce quality and security alongside spearheading the effort to allow any person or enterprise to easily train and deploy their own on-edge large language models.

Chat Client

Run any GPT4All model natively on your home desktop with the auto-updating desktop chat client. See GPT4All Website for a full list of open-source models you can run with this powerful desktop application.
 

bnew

I asked google bard to explain a chart that was on

Performance Benchmarks

Model                    BoolQ  PIQA  HellaSwag  WinoGrande  ARC-e  ARC-c  OBQA  Avg
GPT4All-J 6B v1.0        73.4   74.8  63.4       64.7        54.9   36     40.2  58.2
GPT4All-J v1.1-breezy    74     75.1  63.2       63.6        55.4   34.9   38.4  57.8
GPT4All-J v1.2-jazzy     74.8   74.9  63.6       63.8        56.6   35.3   41    58.6
GPT4All-J v1.3-groovy    73.6   74.3  63.8       63.5        57.7   35     38.8  58.1
GPT4All-J Lora 6B        68.6   75.8  66.2       63.5        56.4   35.7   40.2  58.1
GPT4All LLaMa Lora 7B    73.1   77.6  72.1       67.8        51.1   40.4   40.2  60.3
GPT4All 13B snoozy       83.3   79.2  75         71.3        60.9   44.2   43.4  65.3


bard:

The chart you provided shows the performance of a large language model called GPT4All 13B snoozy on a variety of natural language processing tasks. The tasks are listed in the left column, and the model's performance on each task is listed in the right column. The performance is measured in terms of accuracy, and the higher the accuracy, the better the model performs.
The following are descriptions of the tasks and the column terms:
  • BoolQ: This task tests the model's ability to answer questions that can be answered with a Boolean (true or false) answer.
  • PIQA: This task tests the model's ability to answer questions that require it to identify the correct answer from a set of multiple choices.
  • HellaSwag: This task tests the model's ability to generate creative text, such as poems, code, scripts, and musical pieces.
  • WinoGrande: This task tests the model's ability to answer questions about factual topics.
  • ARC-e: This task tests the model's ability to answer questions that require it to understand the context of the question.
  • ARC-c: This task tests the model's ability to answer questions that require it to understand the relationship between different concepts.
  • OBQA: This task tests the model's ability to answer questions that require it to understand the structure of text.
  • Avg: This column shows the average accuracy of the model on all of the tasks.
The chart shows that GPT4All 13B snoozy performs well on all of the tasks, with an average accuracy of 65.3%. This suggests that the model is capable of understanding and responding to a wide range of natural language queries.
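
As a sanity check on the Avg column, here's a small Python sketch that recomputes it as a plain unweighted mean of the seven task scores, using the numbers from the chart. That the column is an unweighted mean is an assumption on my part (the chart doesn't say so), but the reported values agree with it to within rounding. The names `CHART` and `mean_accuracy` are just illustrative.

```python
# Per-model scores in chart order:
# BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA, then the reported Avg.
CHART = {
    "GPT4All-J 6B v1.0":     ([73.4, 74.8, 63.4, 64.7, 54.9, 36.0, 40.2], 58.2),
    "GPT4All-J v1.1-breezy": ([74.0, 75.1, 63.2, 63.6, 55.4, 34.9, 38.4], 57.8),
    "GPT4All-J v1.2-jazzy":  ([74.8, 74.9, 63.6, 63.8, 56.6, 35.3, 41.0], 58.6),
    "GPT4All-J v1.3-groovy": ([73.6, 74.3, 63.8, 63.5, 57.7, 35.0, 38.8], 58.1),
    "GPT4All-J Lora 6B":     ([68.6, 75.8, 66.2, 63.5, 56.4, 35.7, 40.2], 58.1),
    "GPT4All LLaMa Lora 7B": ([73.1, 77.6, 72.1, 67.8, 51.1, 40.4, 40.2], 60.3),
    "GPT4All 13B snoozy":    ([83.3, 79.2, 75.0, 71.3, 60.9, 44.2, 43.4], 65.3),
}


def mean_accuracy(task_scores):
    """Unweighted mean of the per-task accuracies (assumed meaning of Avg)."""
    return sum(task_scores) / len(task_scores)


if __name__ == "__main__":
    for model, (scores, reported) in CHART.items():
        recomputed = mean_accuracy(scores)
        # Reported values have one decimal, so allow half a unit of rounding.
        status = "ok" if abs(recomputed - reported) < 0.05 else "MISMATCH"
        print(f"{model:24s} reported={reported:5.1f} recomputed={recomputed:6.2f} {status}")
```

Note that each row of the chart is a model and each column is a task, so the 65.3 figure is the snoozy model's average across the seven benchmarks.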
 

bnew



Someone just made a ChatGPT plugin that lets AI take over your PC


Updated on May 18, 2023 by Amaar Chowdhury
Posted in News

After ChatGPT was released late last year, people began sign-posting the beginning of the end. However, while many laughed off these concerns partly due to the restrictions that OpenAI had imposed on the chatbot, they have recently become a very real issue.

OpenAI began opening up access for web-browsing on ChatGPT recently, while also giving more developers tools to work on plugins. One particular developer immediately began working on a way for ChatGPT to get full access to your PC using JavaScript – and the results are pretty worrying.

Reddit user marcocastignoli posted the following thread on /r/ChatGPT. It documented not just how the plugin could be used to access all of the documents and files on a system, but how it could give the artificial intelligence total control of your PC too.

The plugin can do a few things such as access all local files, control keyboard and mouse input, open applications, and much more.

After publishing the experiment online, Marco then tweeted that it’s just an experiment showing off the possibilities of AI and that, because safety is the absolute priority, the plugin will not be published to GitHub.

The comments reacting to the original Reddit post were obviously filled with fear and concern. The number one saving grace of the current state of artificial intelligence is that it has virtually no agency. The ability to actually act makes it an extremely dangerous technology, and it brings AI much closer to technological singularity (the point in time after which technology far surpasses humankind).

One Redditor commented “Found the beginning of the end ^”, while others responded in a slightly more level-headed way:

“The way OP is using ChatGPT here is like all the behind-the-scenes you take for granted when you hit the power button on your computer or launch an application. Unprompted action by the AI is vastly different than prompted.”

While on the surface this isn’t a particularly harmful exploit, and at its heart it’s scientific and experimental, it lends weight to the idea that if someone can do something, someone will. At some point, it seems as though AI is going to be a lot more dangerous than it is now. Is this really the beginning of the end?



 

bnew



my goodness! it opened its mouth. :mindblown:


edit:

whoa


DragGAN.gif


project page:
 

Morethan1

Veteran
Supporter
Joined
Apr 30, 2012
Messages
49,606
Reputation
10,652
Daps
159,522
Reppin
Midwest
I try not to post in this thread, just follow and learn, but I believe this will be an option on our phones in a bit and women will really be lying, men too I guess.
 

bnew

DragGAN also allows users to optionally draw a region of interest to perform region-specific editing. Since DragGAN does not rely on any additional networks like RAFT [Teed and Deng 2020], it achieves efficient manipulation, only taking a few seconds on a single RTX 3090 GPU in most cases. This allows for live, interactive editing sessions, in which the user can quickly iterate on different layouts till the desired output is achieved.

once there's a mobile equivalent of that GPU, sure. :ld:
 