bnew



AlphaGo Zero: Starting from scratch
 

bnew


Meta solves hand generation
CM3leon
 

bnew


About​

GPT based autonomous agent that does online comprehensive research on any given topic

GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks.

The agent can produce detailed, factual and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by AutoGPT and the recent Plan-and-Solve paper, GPT Researcher addresses issues of speed and determinism, offering more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations.

Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.

Why GPT Researcher?​

  • Forming objective conclusions through manual research can take time, sometimes weeks, to find the right resources and information.
  • Current LLMs are trained on past and outdated information, with heavy risks of hallucinations, making them almost irrelevant for research tasks.
  • Solutions that enable web search (such as ChatGPT + Web Plugin) only consider a limited set of resources, which in some cases leads to superficial conclusions or biased answers.
  • Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks.

Architecture​

The main idea is to run "planner" and "execution" agents, where the planner generates questions to research and the execution agents seek the most relevant information for each generated research question. Finally, the planner filters and aggregates all related information and creates a research report. The agents leverage both gpt-3.5-turbo-16k and gpt-4 to complete a research task.

More specifically:

  • Generate a set of research questions that together form an objective opinion on the given task.
  • For each research question, trigger a crawler agent that scrapes online resources for information relevant to the task.
  • For each scraped resource, summarize the relevant information and keep track of its sources.
  • Finally, filter and aggregate all summarized sources and generate a final research report (a rough sketch of this loop follows below).
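A rough, hypothetical sketch of that loop in Python (the function names, prompts, and the llm() helper below are placeholders for illustration, not GPT Researcher's actual API):

```python
# Minimal sketch of the planner/executor research loop described above.
# Everything here (llm, plan_questions, research_question, run_research) is a
# hypothetical placeholder, not GPT Researcher's real code or API.
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str, model: str = "gpt-3.5-turbo-16k") -> str:
    """Placeholder for a chat-completion call; swap in a real provider here."""
    return f"[{model} answer to: {prompt[:60]}...]"

def plan_questions(task: str, n: int = 5) -> list[str]:
    # Planner agent: turn the task into n research questions.
    text = llm(f"Write {n} research questions, one per line, for the task: {task}")
    return [q.strip() for q in text.splitlines() if q.strip()]

def research_question(question: str) -> str:
    # Execution agent: search, scrape, and summarize sources for one question,
    # keeping track of the URLs it used.
    return llm(f"Search the web and summarize sources (with URLs) for: {question}")

def run_research(task: str) -> str:
    questions = plan_questions(task)
    # Execution agents run in parallel rather than synchronously, which is
    # where the claimed speed and stability gains come from.
    with ThreadPoolExecutor(max_workers=max(1, len(questions))) as pool:
        summaries = list(pool.map(research_question, questions))
    # Planner filters and aggregates everything into the final report (gpt-4).
    return llm("Write a research report from these notes:\n\n" + "\n\n".join(summaries),
               model="gpt-4")

print(run_research("impact of open-source LLMs on enterprise search"))
```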

DEMO​



Features​

  • 📝 Generate research, outlines, resources and lessons reports
  • 🌐 Aggregates over 20 web sources per research task to form objective and factual conclusions
  • 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
  • 🔍 Scrapes web sources with JavaScript support
  • 📂 Keeps track of visited and used web sources and their context
  • 📄 Export research reports to PDF and more...



 

bnew


Large Language Models as General Pattern Machines​


Suvir Mirchandani (Stanford University), Fei Xia (Google DeepMind), Pete Florence (Google DeepMind), Brian Ichter (Google DeepMind), Danny Driess (Google DeepMind, TU Berlin), Montserrat Gonzalez Arenas (Google DeepMind), Kanishka Rao (Google DeepMind), Dorsa Sadigh (Stanford University, Google DeepMind), Andy Zeng (Google DeepMind)

Abstract:​

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences – from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to more rich spatial patterns found in the Abstract Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that without any additional training, LLMs can serve as general sequence modelers, driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics – from extrapolating sequences of numbers that represent states over time to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today for real systems due to latency, context size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.

Keywords: large language models, in-context learning, language for robotics
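As a toy illustration of the zero-shot sequence-completion setup the abstract describes (not the paper's code; complete_fn stands in for whatever text-completion endpoint you have access to, and the example sequences are made up), one could serialize numeric patterns into a prompt and ask the model to extend the last one:

```python
# Toy illustration of prompting an LLM as a general sequence completer.
# The example sequences are made up; complete_fn is a placeholder for any
# text-completion endpoint and is left commented out here.
def make_pattern_prompt(demos: list[list[int]], partial: list[int]) -> str:
    lines = [", ".join(map(str, seq)) for seq in demos]
    lines.append(", ".join(map(str, partial)) + ",")  # leave the last sequence open
    return "\n".join(lines)

prompt = make_pattern_prompt(
    demos=[[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]],  # in-context demonstrations
    partial=[10, 12, 14],                      # sequence for the model to extrapolate
)
print(prompt)
# completion = complete_fn(prompt, max_tokens=8)  # a capable LLM tends to continue "16, 18"
```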






 

bnew





Conversational AI is finally here. Introducing Air…

Air can perform full 5-40 minute long sales & customer service calls over the phone that sound like a human. And can perform actions autonomously across 5,000 unique applications.

It’s kind of like having 100,000 sales & customer service reps at the tap of a button.

Reply “beta” if you want to be one of the first companies on the planet to deploy an AI sales or CS team. Once you receive beta access, you can create your own AI and have it on live calls in a matter of minutes - kinda like setting up a Facebook ad.

This is not just some hype twitter demo. Air is currently on live calls, talking to real people every single day, profitably producing for real businesses. And it’s not limited to any one use case… you can create an AI SDR, 24/7 CS agent, Closer, Account Executive… or prompt it for your specific use case and get creative (therapy, talk to Aristotle, etc… it’s only limited by your imagination)

Reply “beta” to this tweet and join the 50,000+ businesses that are already on the list to beta test Air - and our team will reach out asap.

(Also, if you are a developer / builder who wants to innovate on top of our existing technology for different use cases - we’d love to see you reach out as well! Builders are our favorite people haha - and we are excited to see what people create with us.)
 

bnew


Fani Willis fan


He’s doing the right thing, capitalism wise :manny:
 

bnew


Introducing LongLLaMA 🦙, an unlimited-context version of OpenLLaMA fine-tuned at 8k & capable of extrapolating to 256k tokens! We train it using our new Focused Transformer 🎯 technique (FoT). No degradation on short context, drop-in compatibility & Apache 2.0 license




Transformer Language Models without Positional Encodings Still Learn Positional Information​

Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
Causal transformer language models (LMs), such as GPT-3, typically require some form of positional encoding, such as positional embeddings. However, we show that LMs without any explicit positional encoding are still competitive with standard models, and that this phenomenon is robust across different datasets, model sizes, and sequence lengths. Probing experiments reveal that such models acquire an implicit notion of absolute positions throughout the network, effectively compensating for the missing information. We conjecture that causal attention enables the model to infer the number of predecessors that each token can attend to, thereby approximating its absolute position. Our findings indicate that causal LMs might derive positional awareness not only from the explicit positioning mechanism, but also from the effects of the causal mask.
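A small PyTorch sketch of the conjectured mechanism (illustrative only, not the paper's probing code): with a causal mask and no positional encoding, each row of the attention matrix sees a different number of predecessors, which is enough signal to approximate absolute position.

```python
# Illustrative sketch: causal self-attention with no positional embeddings.
# Row i of the attention matrix has exactly i + 1 nonzero weights, so the
# number of attendable predecessors varies by position even though no
# positional information was added to the inputs.
import torch

seq_len = 5
scores = torch.randn(seq_len, seq_len)  # unnormalized attention scores (content only)
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~causal_mask, float("-inf"))
attn = scores.softmax(dim=-1)

print((attn > 0).sum(dim=-1))  # tensor([1, 2, 3, 4, 5])
```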




Google Colaboratory
GitHub - CStanKonrad/long_llama: LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
https://huggingface.co/syzymon/long_llama_3b
 

bnew



Introducing Llama 2​

The next generation of our open source large language model​


Llama 2 is available for free for research and commercial use.



Inside the model

This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters.



Llama 2 pretrained models are trained on 2 trillion tokens and have double the context length of Llama 1. Its fine-tuned models have been trained on over 1 million human annotations.



Benchmarks

Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests.





More model details

Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations.

Technical details



Partnerships​

Our global partners and supporters​


We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2, cloud providers that will include the model as part of their offering to customers, researchers committed to doing research with the model, and people across tech, academia, and policy who see the benefits of Llama and an open platform as we do.





Statement of support for Meta’s open approach to today’s AI

“We support an open innovation approach to AI. Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology.”

See the complete list of signatories


Responsibility​

We’re committed to building responsibly.​

To promote a responsible, collaborative AI innovation ecosystem, we’ve established a range of resources for all who use Llama 2: individuals, creators, developers, researchers, academics, and businesses of any size.


Responsible Use Guide


The Responsible Use Guide is a resource for developers that provides best practices and considerations for building products powered by large language models (LLMs) in a responsible manner, covering various stages of development from inception to deployment.
Responsible Use Guide




https://huggingface.co/llamaste/Llama-2-13b
meta-llama/Llama-2-13b-chat · Hugging Face
meta-llama/Llama-2-13b-hf · Hugging Face
meta-llama/Llama-2-70b · Hugging Face
meta-llama/Llama-2-70b-chat · Hugging Face
meta-llama/Llama-2-70b-hf · Hugging Face
meta-llama/Llama-2-7b · Hugging Face
meta-llama/Llama-2-7b-chat · Hugging Face
meta-llama/Llama-2-7b-chat-hf · Hugging Face
meta-llama/Llama-2-7b-hf · Hugging Face
meta-llama/Llama-2-13b · Hugging Face
meta-llama/Llama-2-70b-chat-hf · Hugging Face

https://huggingface.co/meta-llama/Llama-2-7b-hf
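For reference, a minimal sketch of loading one of the chat checkpoints above with the Hugging Face transformers library (this assumes approved access to the gated meta-llama repos, a logged-in Hugging Face token, and that transformers plus accelerate are installed; the generation settings are illustrative, not Meta's recommended ones):

```python
# Sketch: load a Llama 2 chat checkpoint from the Hugging Face Hub and run a
# single generation. Assumes gated-repo access has been granted and
# `huggingface-cli login` has been run; settings here are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain what a context window is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```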




 

bnew

{76 page PDF link inside}

Abstract​

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
 

bnew


How is ChatGPT's behavior changing over time?​

Lingjiao Chen, Matei Zaharia, James Zou
GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models are updated over time is opaque. Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on four diverse tasks: 1) solving math problems, 2) answering sensitive/dangerous questions, 3) generating code and 4) visual reasoning. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time. For example, GPT-4 (March 2023) was very good at identifying prime numbers (accuracy 97.6%) but GPT-4 (June 2023) was very poor on these same questions (accuracy 2.4%). Interestingly, GPT-3.5 (June 2023) was much better than GPT-3.5 (March 2023) on this task. GPT-4 was less willing to answer sensitive questions in June than in March, and both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March. Overall, our findings show that the behavior of the same LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLM quality.
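A minimal sketch of the kind of snapshot-to-snapshot comparison the paper performs, using the prime-identification task as an example; ask_model is a placeholder stub rather than a real API client, and the snapshot names simply mirror the March/June 2023 versions discussed above:

```python
# Sketch: compare two dated snapshots of the same LLM service on a simple
# yes/no task (primality). ask_model is a stub placeholder, not a real client.
def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def ask_model(snapshot: str, question: str) -> str:
    """Placeholder: send `question` to the model version named `snapshot`."""
    return "Yes"  # stub answer so the sketch runs end to end

def accuracy(snapshot: str, numbers: list[int]) -> float:
    correct = 0
    for n in numbers:
        answer = ask_model(snapshot, f"Is {n} a prime number? Answer Yes or No.")
        predicted_prime = answer.strip().lower().startswith("yes")
        correct += predicted_prime == is_prime(n)
    return correct / len(numbers)

numbers = [101, 102, 103, 107, 110, 113]
for snapshot in ["gpt-4-0314", "gpt-4-0613"]:  # March vs. June 2023 versions
    print(snapshot, accuracy(snapshot, numbers))
```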






 