bnew



AlphaGo Zero: Starting from scratch
 

bnew


Meta solves hand generation
CM3leon
 

bnew


About​

GPT-based autonomous agent that conducts comprehensive online research on any given topic

GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks.

The agent can produce detailed, factual, and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by AutoGPT and the recent Plan-and-Solve paper, GPT Researcher addresses issues of speed and determinism, offering more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations.

Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.

Why GPT Researcher?​

  • Forming objective conclusions through manual research can take time; finding the right resources and information sometimes takes weeks.
  • Current LLMs are trained on past, outdated information and carry a heavy risk of hallucination, making them almost irrelevant for research tasks.
  • Solutions that enable web search (such as ChatGPT + Web Plugin) only consider a limited set of resources, which in some cases leads to superficial conclusions or biased answers.
  • Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks.

Architecture​

The main idea is to run "planner" and "execution" agents: the planner generates questions to research, and the execution agents seek the most relevant information for each generated research question. Finally, the planner filters and aggregates all related information and creates a research report. The agents leverage both gpt-3.5-turbo-16k and gpt-4 to complete a research task.

More specifically:

  • Generate a set of research questions that together form an objective opinion on any given task.
  • For each research question, trigger a crawler agent that scrapes online resources for information relevant to the given task.
  • For each scraped resource, summarize the relevant information and keep track of its source.
  • Finally, filter and aggregate all summarized sources and generate a final research report (a minimal sketch of this flow follows below).
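
An illustrative sketch of that planner/executor flow, not taken from the GPT Researcher codebase: it assumes the pre-1.0 `openai` Python SDK with an OPENAI_API_KEY in the environment, and hypothetical `search(query)` / `scrape(url)` stubs standing in for the crawler agent.

```python
# Illustrative sketch of the planner / execution-agent flow described above.
# Assumes the pre-1.0 `openai` Python SDK; `search` and `scrape` are
# hypothetical stubs you would wire to a real search API and scraper.
from concurrent.futures import ThreadPoolExecutor
import openai

def search(query: str) -> list[str]:
    raise NotImplementedError("return a list of URLs for the query")

def scrape(url: str) -> str:
    raise NotImplementedError("return the page text for the URL")

def ask_llm(prompt: str, model: str = "gpt-3.5-turbo-16k") -> str:
    resp = openai.ChatCompletion.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp["choices"][0]["message"]["content"]

def plan_questions(task: str, n: int = 5) -> list[str]:
    # Planner agent: generate research questions that together cover the task.
    text = ask_llm(f"Write {n} research questions about: {task}", model="gpt-4")
    return [line.lstrip("-•0123456789. ").strip() for line in text.splitlines() if line.strip()]

def research_question(question: str) -> str:
    # Execution agent: scrape a few sources and summarize each, keeping its source URL.
    summaries = []
    for url in search(question)[:3]:
        page = scrape(url)[:8000]  # truncate to stay within the context window
        summary = ask_llm(f"Summarize the parts of this page relevant to '{question}':\n{page}")
        summaries.append(f"Source: {url}\n{summary}")
    return "\n\n".join(summaries)

def run_research(task: str) -> str:
    questions = plan_questions(task)
    # Parallelized agent work rather than synchronous calls, per the description above.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(research_question, questions))
    return ask_llm("Aggregate these findings into a research report:\n\n" + "\n\n".join(findings),
                   model="gpt-4")
```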

DEMO​



Features​

  • 📝 Generate research, outline, resource, and lesson reports
  • 🌐 Aggregates over 20 web sources per research task to form objective and factual conclusions
  • 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
  • 🔍 Scrapes web sources with JavaScript support
  • 📂 Keeps track of the context of visited and used web sources
  • 📄 Export research reports to PDF and more...



 

bnew


Large Language Models as General Pattern Machines​


Suvir Mirchandani (Stanford University), Fei Xia (Google DeepMind), Pete Florence (Google DeepMind), Brian Ichter (Google DeepMind), Danny Driess (Google DeepMind, TU Berlin), Montserrat Gonzalez Arenas (Google DeepMind), Kanishka Rao (Google DeepMind), Dorsa Sadigh (Stanford University, Google DeepMind), Andy Zeng (Google DeepMind)

Abstract:​

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences – from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to richer spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that without any additional training, LLMs can serve as general sequence modelers, driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics – from extrapolating sequences of numbers that represent states over time to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today for real systems due to latency, context size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.

Keywords: large language models, in-context learning, language for robotics
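
As a purely illustrative sketch of the "numbers that represent states over time" idea (not code from the paper), one can prompt an off-the-shelf LLM to continue a discretized numeric sequence. This assumes the pre-1.0 `openai` Python SDK and an OPENAI_API_KEY in the environment.

```python
# Illustrative only: ask an LLM to extrapolate a discretized state sequence in-context.
import math
import openai

# A simple sinusoid, discretized to integers in [0, 100], as the "state" history.
history = [round(50 + 40 * math.sin(0.4 * t)) for t in range(30)]

prompt = (
    "Continue the sequence with 10 more numbers, comma-separated, "
    "and output only the numbers:\n" + ", ".join(map(str, history)) + ","
)
resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
)
predicted = [
    int(x) for x in resp["choices"][0]["message"]["content"].split(",")
    if x.strip().lstrip("-").isdigit()
]
print(predicted)  # the in-context "pattern machine" should roughly track the sinusoid
```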






 

bnew





Conversational AI is finally here. Introducing Air…

Air can perform full 5- to 40-minute sales & customer service calls over the phone that sound like a human, and it can perform actions autonomously across 5,000 unique applications.

It’s kind of like having 100,000 sales & customer service reps at the tap of a button.

Reply “beta” if you want to be one of the first companies on the planet to deploy an AI sales or CS team. Once you receive beta access, you can create your own AI and have it on live calls in a matter of minutes - kinda like setting up a Facebook ad.

This is not just some hype twitter demo. Air is currently on live calls, talking to real people every single day, profitably producing for real businesses. And it’s not limited to any one use case… you can create an AI SDR, 24/7 CS agent, Closer, Account Executive… or prompt it for your specific use case and get creative (therapy, talk to Aristotle, etc… it’s only limited by your imagination)

Reply “beta” to this tweet and join the 50,000+ businesses that are already on the list to beta test Air - and our team will reach out asap.

(Also, if you are a developer / builder who wants to innovate on top of our existing technology for different use cases - we’d love to see you reach out as well! Builders are our favorite people haha - and we are excited to see what people create with us.)
 

bnew


Fani Willis fan


He’s doing the right thing, capitalism-wise :manny:
 

bnew


Introducing LongLLaMA 🦙, an unlimited-context version of OpenLLaMA fine-tuned at 8k & capable of extrapolating to 256k tokens! We train it using our new Focused Transformer 🎯 technique (FoT). No degradation on short context, drop-in compatibility & Apache 2.0 license




Transformer Language Models without Positional Encodings Still Learn Positional Information​

Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
Causal transformer language models (LMs), such as GPT-3, typically require some form of positional encoding, such as positional embeddings. However, we show that LMs without any explicit positional encoding are still competitive with standard models, and that this phenomenon is robust across different datasets, model sizes, and sequence lengths. Probing experiments reveal that such models acquire an implicit notion of absolute positions throughout the network, effectively compensating for the missing information. We conjecture that causal attention enables the model to infer the number of predecessors that each token can attend to, thereby approximating its absolute position. Our findings indicate that causal LMs might derive positional awareness not only from the explicit positioning mechanism, but also from the effects of the causal mask.




Google Colaboratory
GitHub - CStanKonrad/long_llama: LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
https://huggingface.co/syzymon/long_llama_3b
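
For reference, loading the 3B checkpoint follows the usual Hugging Face pattern. This is a rough sketch based on the linked repository's instructions; exact arguments such as the dtype may differ.

```python
# Rough loading sketch for LongLLaMA 3B via Hugging Face transformers.
# trust_remote_code pulls in the repo's FoT-enabled model class.
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b",
    torch_dtype=torch.float32,
    trust_remote_code=True,
)

prompt = "The Focused Transformer (FoT) extends the effective context by"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(input_ids=inputs.input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```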
 

bnew



Introducing Llama 2​

The next generation of our open source large language model


Llama 2 is available for free for research and commercial use.



Inside the model

This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters.



Llama 2 pretrained models are trained on 2 trillion tokens and have double the context length of Llama 1. Its fine-tuned models have been trained on over 1 million human annotations.



Benchmarks

Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests.





More model details

Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations.

Technical details



Partnerships​

Our global partners and supporters​


We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2, cloud providers that will include the model as part of their offering to customers, researchers committed to doing research with the model, and people across tech, academia, and policy who see the benefits of Llama and an open platform as we do.





Statement of support for Meta’s open approach to today’s AI

“We support an open innovation approach to AI. Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology.”

See the complete list of signatories


Responsibility​

We’re committed to building responsibly.​

To promote a responsible, collaborative AI innovation ecosystem, we’ve established a range of resources for all who use Llama 2: individuals, creators, developers, researchers, academics, and businesses of any size.


Responsible Use Guide


The Responsible Use Guide is a resource for developers that provides best practices and considerations for building products powered by large language models (LLMs) in a responsible manner, covering various stages of development from inception to deployment.
Responsible Use Guide




meta-llama/Llama-2-7b · Hugging Face
meta-llama/Llama-2-7b-hf · Hugging Face
meta-llama/Llama-2-7b-chat · Hugging Face
meta-llama/Llama-2-7b-chat-hf · Hugging Face
meta-llama/Llama-2-13b · Hugging Face
meta-llama/Llama-2-13b-hf · Hugging Face
meta-llama/Llama-2-13b-chat · Hugging Face
meta-llama/Llama-2-70b · Hugging Face
meta-llama/Llama-2-70b-hf · Hugging Face
meta-llama/Llama-2-70b-chat · Hugging Face
meta-llama/Llama-2-70b-chat-hf · Hugging Face
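
A minimal generation sketch with the Hugging Face `transformers` library (assumes `transformers`, `accelerate`, and a GPU; access to the meta-llama repos is gated behind accepting Meta's license):

```python
# Minimal sketch: load a gated Llama 2 chat checkpoint and generate a reply.
# Requires `transformers` + `accelerate` and an approved license request on Hugging Face.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Explain in one sentence what a context window is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```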





 

bnew

{76-page PDF link inside}

Abstract​

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
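
Since Llama 2-Chat is dialogue-tuned, it expects a specific prompt template (a system prompt wrapped in <<SYS>> tags inside an [INST] block). A small helper along those lines is sketched below; the template follows the paper and repository conventions, but double-check it against the official example code before relying on it.

```python
# Build a single-turn Llama 2-Chat prompt using the [INST] / <<SYS>> template.
def llama2_chat_prompt(system: str, user: str) -> str:
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_chat_prompt(
    system="You are a helpful, honest assistant.",
    user="Summarize the Llama 2 paper in two sentences.",
)
print(prompt)
```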
 

bnew


How is ChatGPT's behavior changing over time?​

Lingjiao Chen, Matei Zaharia, James Zou
GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models are updated over time is opaque. Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on four diverse tasks: 1) solving math problems, 2) answering sensitive/dangerous questions, 3) generating code, and 4) visual reasoning. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time. For example, GPT-4 (March 2023) was very good at identifying prime numbers (accuracy 97.6%), but GPT-4 (June 2023) was very poor on these same questions (accuracy 2.4%). Interestingly, GPT-3.5 (June 2023) was much better than GPT-3.5 (March 2023) on this task. GPT-4 was less willing to answer sensitive questions in June than in March, and both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March. Overall, our findings show that the behavior of the same LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLM quality.
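
An illustrative monitoring harness (not the paper's evaluation code): re-run the same fixed probe against dated model snapshots and compare the answers. It assumes the pre-1.0 `openai` Python SDK; the snapshot names are the dated aliases OpenAI exposed at the time.

```python
# Re-run an identical probe against two dated GPT-4 snapshots to spot behavior drift.
import openai

def probe(model: str, n: int) -> str:
    resp = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": f"Is {n} a prime number? Answer yes or no."}],
        temperature=0,  # make the comparison as deterministic as the API allows
    )
    return resp["choices"][0]["message"]["content"].strip().lower()

probes = [17077, 20393, 21269]  # a few odd integers to test
for snapshot in ("gpt-4-0314", "gpt-4-0613"):  # March vs. June 2023 snapshots
    print(snapshot, [probe(snapshot, n) for n in probes])
```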






 