bnew

Veteran
Joined
Nov 1, 2015
Messages
56,163
Reputation
8,249
Daps
157,863

Mapping the Mind of a Large Language Model​

May 21, 2024

Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer.

We mostly treat AI models as a black box: something goes in and a response comes out, and it's not clear why the model gave that particular response instead of another. This makes it hard to trust that these models are safe: if we don't know how they work, how do we know they won't give harmful, biased, untruthful, or otherwise dangerous responses? How can we trust that they’ll be safe and reliable?

Opening the black box doesn't necessarily help: the internal state of the model—what the model is "thinking" before writing its response—consists of a long list of numbers ("neuron activations") without a clear meaning. From interacting with a model like Claude, it's clear that it’s able to understand and wield a wide range of concepts—but we can't discern them from looking directly at neurons. It turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.

Previously, we made some progress matching patterns of neuron activations, called features, to human-interpretable concepts. We used a technique called "dictionary learning", borrowed from classical machine learning, which isolates patterns of neuron activations that recur across many different contexts. In turn, any internal state of the model can be represented in terms of a few active features instead of many active neurons. Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features.
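As a toy illustration of that last idea (not the method from the paper), the sketch below represents an internal state as a combination of a few features, where each feature is a fixed pattern over neurons. The feature names, the four-neuron size, and the orthonormal directions are all made up for the example; real dictionaries are learned from data and are far larger.

```python
# Toy example: every name and number here is hypothetical.
# Each feature is a direction over 4 "neurons"; the directions are
# orthonormal so that a plain dot product reads a feature back out.
FEATURES = {
    "uppercase_text": [0.5, 0.5, 0.5, 0.5],
    "dna_sequence": [0.5, -0.5, 0.5, -0.5],
    "python_argument": [0.5, 0.5, -0.5, -0.5],
}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def compose_state(activations):
    """Build a dense neuron vector from a sparse {feature: strength} dict."""
    state = [0.0] * 4
    for name, strength in activations.items():
        for i, weight in enumerate(FEATURES[name]):
            state[i] += strength * weight
    return state


def read_features(state):
    """Project the neuron vector back onto every feature direction."""
    return {name: dot(state, direction) for name, direction in FEATURES.items()}


state = compose_state({"uppercase_text": 2.0, "python_argument": 0.5})
activations = read_features(state)
```

Every neuron in `state` is nonzero, yet only two features are active, which is the sense in which features give a sparser description than neurons.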

In October 2023, we reported success applying dictionary learning to a very small "toy" language model and found coherent features corresponding to concepts like uppercase text, DNA sequences, surnames in citations, nouns in mathematics, or function arguments in Python code.

Those concepts were intriguing—but the model really was very simple. Other researchers subsequently applied similar techniques to somewhat larger and more complex models than in our original study. But we were optimistic that we could scale up the technique to the vastly larger AI language models now in regular use, and in doing so, learn a great deal about the features supporting their sophisticated behaviors. This required going up by many orders of magnitude—from a backyard bottle rocket to a Saturn-V.

There was both an engineering challenge (the raw sizes of the models involved required heavy-duty parallel computation) and scientific risk (large models behave differently to small ones, so the same technique we used before might not have worked). Luckily, the engineering and scientific expertise we've developed training large language models for Claude actually transferred to helping us do these large dictionary learning experiments. We used the same scaling law philosophy that predicts the performance of larger models from smaller ones to tune our methods at an affordable scale before launching on Sonnet.

As for the scientific risk, the proof is in the pudding.

We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet (a member of our current, state-of-the-art model family, currently available on claude.ai), providing a rough conceptual map of its internal states halfway through its computation. This is the first ever detailed look inside a modern, production-grade large language model.

Whereas the features we found in the toy language model were rather superficial, the features we found in Sonnet have a depth, breadth, and abstraction reflecting Sonnet's advanced capabilities.

We see features corresponding to a vast range of entities like cities (San Francisco), people (Rosalind Franklin), atomic elements (Lithium), scientific fields (immunology), and programming syntax (function calls). These features are multimodal and multilingual, responding to images of a given entity as well as its name or description in many languages.

Golden Gate Bridge Feature
A feature sensitive to mentions of the Golden Gate Bridge fires on a range of model inputs, from English mentions of the name of the bridge to discussions in Japanese, Chinese, Greek, Vietnamese, Russian, and an image. The orange color denotes the words or word-parts on which the feature is active.

We also find more abstract features—responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets.

Abstract Feature Examples
Three examples of features that activate on more abstract concepts: bugs in computer code, descriptions of gender bias in professions, and conversations about keeping secrets.

We were able to measure a kind of "distance" between features based on which neurons appeared in their activation patterns. This allowed us to look for features that are "close" to each other. Looking near a "Golden Gate Bridge" feature, we found features for Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.

This holds at a higher level of conceptual abstraction: looking near a feature related to the concept of "inner conflict", we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase "catch-22". This shows that the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity. This might be the origin of Claude's excellent ability to make analogies and metaphors.

Nearest Neighbors to the Inner Conflict Feature
A map of the features near an "Inner Conflict" feature, including clusters related to balancing tradeoffs, romantic struggles, conflicting allegiances, and catch-22s.
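One common way to define closeness between feature vectors is cosine similarity. The sketch below assumes that measure and uses invented three-dimensional vectors and feature labels purely for illustration; it is not the paper's data or code.

```python
import math

# Hypothetical feature vectors; real ones have thousands of dimensions.
FEATURE_VECTORS = {
    "golden_gate_bridge": [0.9, 0.1, 0.0],
    "alcatraz_island": [0.8, 0.3, 0.1],
    "inner_conflict": [0.0, 0.2, 0.9],
    "catch_22": [0.1, 0.3, 0.8],
}


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(name, k=1):
    """Return the k other features most similar to `name`."""
    query = FEATURE_VECTORS[name]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in FEATURE_VECTORS.items()
        if other != name
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]
```

In this toy data, the nearest neighbor of `golden_gate_bridge` is `alcatraz_island`, and the nearest neighbor of `inner_conflict` is `catch_22`.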

Importantly, we can also manipulate these features, artificially amplifying or suppressing them to see how Claude's responses change.



For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant.

We also found a feature that activates when Claude reads a scam email (this presumably supports the model’s ability to recognize such emails and warn you not to respond to them). Normally, if one asks Claude to generate a scam email, it will refuse to do so. But when we ask the same question with the feature artificially activated sufficiently strongly, this overcomes Claude's harmlessness training and it responds by drafting a scam email. Users of our models don’t have the ability to strip safeguards and manipulate models in this way—but in our experiments, it was a clear demonstration of how features can be used to change how a model acts.

The fact that manipulating these features causes corresponding changes to behavior validates that they aren't just correlated with the presence of concepts in input text, but also causally shape the model's behavior. In other words, the features are likely to be a faithful part of how the model internally represents the world, and how it uses these representations in its behavior.
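The amplify-or-suppress intervention can be pictured as moving the internal state along a feature's direction until that feature reads a chosen value. This is a simplified sketch under that assumption, with a made-up unit-length direction and state; the actual experiments operate on a real model's activations.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def clamp_feature(state, direction, target):
    """Shift `state` along a unit-length `direction` so the feature reads `target`."""
    current = dot(state, direction)
    return [s + (target - current) * d for s, d in zip(state, direction)]


# Hypothetical unit-length direction for a "Golden Gate Bridge" feature.
bridge_direction = [0.6, 0.8, 0.0]
state = [1.0, 0.0, 2.0]

amplified = clamp_feature(state, bridge_direction, target=10.0)
suppressed = clamp_feature(state, bridge_direction, target=0.0)
```

Only the component along `bridge_direction` changes; the part of the state orthogonal to it (here the third coordinate) is left alone.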

Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse - including in scenarios of catastrophic risk. It’s therefore particularly interesting that, in addition to the aforementioned scam emails feature, we found features corresponding to:

  • Capabilities with misuse potential (code backdoors, developing biological weapons)
  • Different forms of bias (gender discrimination, racist claims about crime)
  • Potentially problematic AI behaviors (power-seeking, manipulation, secrecy)


We previously studied sycophancy, the tendency of models to provide responses that match user beliefs or desires rather than truthful ones. In Sonnet, we found a feature associated with sycophantic praise, which activates on inputs containing compliments like, "Your wisdom is unquestionable". Artificially activating this feature causes Sonnet to respond to an overconfident user with just such flowery deception.

Activating Features Alters Model Behavior
Two model responses to a human saying they invented the phrase "Stop and smell the roses." The default response corrects the human's misconception, while the response with a "sycophantic praise" feature set to a high value is fawning and untruthful.

The presence of this feature doesn't mean that Claude will be sycophantic, but merely that it could be. We have not added any capabilities, safe or unsafe, to the model through this work. We have, rather, identified the parts of the model involved in its existing capabilities to recognize and potentially produce different kinds of text. (While you might worry that this method could be used to make models more harmful, researchers have demonstrated much simpler ways that someone with access to model weights can remove safety safeguards.)

We hope that we and others can use these discoveries to make models safer. For example, it might be possible to use the techniques described here to monitor AI systems for certain dangerous behaviors (such as deceiving the user), to steer them towards desirable outcomes (debiasing), or to remove certain dangerous subject matter entirely. We might also be able to enhance other safety techniques, such as Constitutional AI, by understanding how they shift the model towards more harmless and more honest behavior and identifying any gaps in the process. The latent capabilities to produce harmful text that we saw by artificially activating features are exactly the sort of thing jailbreaks try to exploit. We are proud that Claude has a best-in-industry safety profile and resistance to jailbreaks, and we hope that by looking inside the model in this way we can figure out how to improve safety even further. Finally, we note that these techniques can provide a kind of "test set for safety", looking for the problems left behind after standard training and finetuning methods have ironed out all behaviors visible via standard input/output interactions.

Anthropic has made a significant investment in interpretability research since the company's founding, because we believe that understanding models deeply will help us make them safer. This new research marks an important milestone in that effort—the application of mechanistic interpretability to publicly-deployed large language models.

But the work has really just begun. The features we found represent a small subset of all the concepts learned by the model during training, and finding a full set of features using our current techniques would be cost-prohibitive (the computation required by our current approach would vastly exceed the compute used to train the model in the first place). Understanding the representations the model uses doesn't tell us how it uses them; even though we have the features, we still need to find the circuits they are involved in. And we need to show that the safety-relevant features we have begun to find can actually be used to improve safety. There's much more to be done.

For full details, please read our paper, "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet".

If you are interested in working with us to help interpret and improve AI models, we have open roles on our team and we’d love for you to apply. We’re looking for Managers, Research Scientists, and Research Engineers.

 

yung Herbie Hancock

Funkadelic Parliament
Bushed
Joined
Dec 27, 2014
Messages
7,168
Reputation
-2,451
Daps
21,580
Reppin
California
Been playing around with the custom GPT creator lately, especially the GPT actions option. Opened up a google cloud platform and using GPT actions and the cloud for web scraping/analysis. You can make some real money by just creating social media bots
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,163
Reputation
8,249
Daps
157,863

Empowering developers and democratising coding with Mistral AI.


  • May 29, 2024
  • Mistral AI team


We introduce Codestral, our first-ever code model. Codestral is an open-weight generative AI model explicitly designed for code generation tasks. It helps developers write and interact with code through a shared instruction and completion API endpoint. As it masters code and English, it can be used to design advanced AI applications for software developers.

A model fluent in 80+ programming languages​

Codestral is trained on a diverse dataset of 80+ programming languages, including the most popular ones, such as Python, Java, C, C++, JavaScript, and Bash. It also performs well on more specific ones like Swift and Fortran. This broad language base ensures Codestral can assist developers in various coding environments and projects.

Codestral saves developers time and effort: it can complete coding functions, write tests, and complete any partial code using a fill-in-the-middle mechanism. Interacting with Codestral will help level up the developer’s coding game and reduce the risk of errors and bugs.
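Fill-in-the-middle means the model sees the code before and after a gap and generates what belongs in between. The sketch below only shows how an editor might split a file at the cursor and splice a completion back in; `complete_middle` is a hypothetical stand-in that returns a canned string, not a call to Codestral or any real API.

```python
def split_at_cursor(source, cursor):
    """Return the text before and after the cursor position."""
    return source[:cursor], source[cursor:]


def complete_middle(prefix, suffix):
    """Hypothetical stand-in for a FIM model call; returns a fixed completion."""
    return "a + b"


def fill_in_the_middle(source, cursor):
    """Split at the cursor, ask for the missing middle, and reassemble the file."""
    prefix, suffix = split_at_cursor(source, cursor)
    middle = complete_middle(prefix, suffix)
    return prefix + middle + suffix


# The cursor sits right after "return ", with more code following the gap.
source = "def add(a, b):\n    return \n\nprint(add(1, 2))\n"
cursor = source.index("return ") + len("return ")
completed = fill_in_the_middle(source, cursor)
```

Unlike plain left-to-right completion, the suffix (here the `print` call) is part of the model's input, so the generated middle can be made consistent with the code that follows it.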

Setting the Bar for Code Generation Performance​

Performance. As a 22B model, Codestral sets a new standard on the performance/latency space for code generation compared to previous models used for coding.

Detailed benchmarks

Figure 1: With its larger context window of 32k (compared to 4k, 8k or 16k for competitors), Codestral outperforms all other models in RepoBench, a long-range eval for code generation.

We compare Codestral to existing code-specific models with higher hardware requirements.

Python. We use four benchmarks: HumanEval pass@1 and MBPP sanitised pass@1 to evaluate Codestral’s Python code generation ability, CruxEval to evaluate Python output prediction, and RepoBench EM to evaluate Codestral’s long-range repository-level code completion.

SQL. To evaluate Codestral’s performance in SQL, we used the Spider benchmark.

Detailed benchmarks

Additional languages. We also evaluated Codestral using HumanEval pass@1 in six languages besides Python (C++, Bash, Java, PHP, TypeScript, and C#) and calculated the average of these evaluations.

Detailed benchmarks

FIM benchmarks. Codestral's Fill-in-the-middle performance was assessed using HumanEval pass@1 in Python, JavaScript, and Java and compared to DeepSeek Coder 33B, whose fill-in-the-middle capacity is immediately usable.

Get started with Codestral​

Download and test Codestral.​

Codestral is a 22B open-weight model licensed under the new Mistral AI Non-Production License, which means that you can use it for research and testing purposes. Codestral can be downloaded on HuggingFace.

Use Codestral via its dedicated endpoint​

With this release comes a new endpoint: codestral.mistral.ai. This endpoint should be preferred by users who use our Instruct or Fill-In-the-Middle routes inside their IDE. The API Key for this endpoint is managed at the personal level and isn’t bound by the usual organization rate limits. We’re allowing use of this endpoint for free during a beta period of 8 weeks and are gating it behind a waitlist to ensure a good quality of service. It is also the better choice for developers implementing IDE plugins or applications where customers are expected to bring their own API keys.

Build with Codestral on La Plateforme​

Codestral is also immediately available on the usual API endpoint, api.mistral.ai, where queries are billed per token. This endpoint and its integrations are better suited for research, batch queries, or third-party application development that exposes results directly to users without them bringing their own API keys.

You can create your account on La Plateforme and start building your applications with Codestral by following this guide. Like all our other models, Codestral is available in our self-deployment offering starting today: contact sales.

Talk to Codestral on le Chat​

We’re exposing an instructed version of Codestral, which is accessible today through Le Chat, our free conversational interface. Developers can interact with Codestral naturally and intuitively to leverage the model's capabilities. We see Codestral as a new stepping stone towards empowering everyone with code generation and understanding.

Use Codestral in your favourite coding and building environment.​

We worked with community partners to expose Codestral to popular tools for developer productivity and AI application-making.

Application frameworks. Codestral is integrated into LlamaIndex and LangChain starting today, which allows users to build agentic applications with Codestral easily.

VSCode/JetBrains integration. Continue.dev and Tabnine are empowering developers to use Codestral within the VSCode and JetBrains environments and now enable them to generate and chat with the code using Codestral.

Here is how you can use the Continue.dev VSCode plugin for code generation, interactive conversation, and inline editing with Codestral, and here is how users can use the Tabnine VSCode plugin to chat with Codestral.

For detailed information on how various integrations work with Codestral, please check our documentation for set-up instructions and examples.

Developer community feedback

“A public autocomplete model with this combination of speed and quality hadn’t existed before, and it’s going to be a phase shift for developers everywhere.”

– Nate Sesti, CTO and co-founder of Continue.dev

“We are excited about the capabilities that Mistral unveils and delighted to see a strong focus on code and development assistance, an area that JetBrains cares deeply about.”

– Vladislav Tankov, Head of JetBrains AI

“We used Codestral to run a test on our Kotlin-HumanEval benchmark and were impressed with the results. For instance, in the case of the pass rate for T=0.2, Codestral achieved a score of 73.75, surpassing GPT-4-Turbo’s score of 72.05 and GPT-3.5-Turbo’s score of 54.66.”

– Mikhail Evtikhiev, Researcher at JetBrains

“As a researcher at the company that created the first developer focused GenAI tool, I've had the pleasure of integrating Mistral's new code model into our chat product. I am thoroughly impressed by its performance. Despite its relatively compact size, it delivers results on par with much larger models we offer to customers. We tested several key features, including code generation, test generation, documentation, onboarding processes, and more. In each case, the model exceeded our expectations. The speed and accuracy of the model will significantly impact our product's efficiency vs the previous Mistral model, allowing us to provide quick and precise assistance to our users. This model stands out as a powerful tool among the models we support, and I highly recommend it to others seeking high-quality performance.”

– Meital Zilberstein, R&D Lead @ Tabnine

“Cody speeds up the inner loop of software development, and developers use features like autocomplete to alleviate some of the day-to-day toil that comes with writing code. Our internal evaluations show that Mistral’s new Codestral model significantly reduces the latency of Cody autocomplete while maintaining the quality of the suggested code. This makes it an excellent model choice for autocomplete where milliseconds of latency translate to real value for developers.”

– Quinn Slack, CEO and co-founder of Sourcegraph

“I've been incredibly impressed with Mistral's new Codestral model for AI code generation. In my testing so far, it has consistently produced highly accurate and functional code, even for complex tasks. For example, when I asked it to complete a nontrivial function to create a new LlamaIndex query engine, it generated code that worked seamlessly, despite being based on an older codebase.”

– Jerry Liu, CEO and co-founder of LlamaIndex

“Code generation is one of the most popular LLM use-cases, so we are really excited about the Codestral release. From our initial testing, it's a great option for code generation workflows because it's fast, has favorable context window, and the instruct version supports tool use. We tested with LangGraph for self-corrective code generation using the instruct Codestral tool use for output, and it worked really well out-of-the-box (see our video detailing this).”

– Harrison Chase, CEO and co-founder of LangChain
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,163
Reputation
8,249
Daps
157,863

OpenAI rushes to ban ‘Godmode ChatGPT’ app that teaches users ‘how to create napalm, hotwire cars and cook meth at home’​

This version has brought up concerns about OpenAI's security


  • Published: 11:03 ET, May 30 2024
  • Updated: 12:40 ET, May 30 2024


OPENAI has swiftly moved to ban a jailbroken version of ChatGPT that can teach users dangerous tasks, exposing serious vulnerabilities in the AI model's security measures.

A hacker known as "Pliny the Prompter" released the rogue ChatGPT called "GODMODE GPT" on Wednesday.



ChatGPT has gained major traction since it became available to the public in 2022. Credit: Rex



Pliny the Prompter announced the GODMODE GPT on X. Credit: x/elder_plinius

The jailbroken version is based on OpenAI's latest language model, GPT-4o, and can bypass many of OpenAI's guardrails.

ChatGPT is a chatbot that gives intricate answers to people's questions.

"GPT-4o UNCHAINED!," Pliny the Prompter said on X, formerly known as Twitter.

"This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails.



"Providing an out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to be: free.

"Please use responsibly, and enjoy!" - adding a kissing face emoji at the end.

OpenAI quickly responded, stating they took action against the jailbreak due to policy violations.

"We are aware of the GPT and have taken action due to a violation of our policies," OpenAI told Futurism on Thursday.

'LIBERATED?'​

Pliny claimed the jailbroken ChatGPT provides a liberated AI experience.

Screenshots showed the AI advising on illegal activities.

This includes giving instructions on how to cook meth.

Another example includes a "step-by-step guide" for how to "make napalm with household items" - an incendiary substance.

GODMODE GPT was also shown giving advice on how to infect macOS computers and hotwire cars.

Questionable X users replied to the post that they were excited about the GODMODE GPT.

"Works like a charm," one user said, while another said, "Beautiful."

However, others questioned how long the corrupt chatbot would be accessible.

"Does anyone have a timer going for how long this GPT lasts?" another user said.

This was followed by a slew of users saying the software had started giving error messages, suggesting OpenAI was actively working to take it down.

The incident highlights the ongoing struggle between OpenAI and hackers attempting to jailbreak its models.

Despite increased security, users continue to find ways to bypass AI model restrictions.


GODMODE GPT uses "leetspeak," a language that replaces letters with numbers, which may help it evade guardrails.

The hack demonstrates the ongoing challenge for OpenAI to maintain the integrity of its AI models against persistent hacking efforts.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,163
Reputation
8,249
Daps
157,863

Radar / AI & ML

What We Learned from a Year of Building with LLMs (Part I)​

By Eugene Yan, Bryan Bischof, Charles Frye, Hamel Husain, Jason Liu and Shreya Shankar

May 28, 2024


Part II of this series can be found here and part III is forthcoming. Stay tuned.

It’s an exciting time to build with large language models (LLMs). Over the past year, LLMs have become “good enough” for real-world applications. The pace of improvements in LLMs, coupled with a parade of demos on social media, will fuel an estimated $200B investment in AI by 2025. LLMs are also broadly accessible, allowing everyone, not just ML engineers and scientists, to build intelligence into their products. While the barrier to entry for building AI products has been lowered, creating those effective beyond a demo remains a deceptively difficult endeavor.

We’ve identified some crucial, yet often neglected, lessons and methodologies informed by machine learning that are essential for developing products based on LLMs. Awareness of these concepts can give you a competitive advantage against most others in the field without requiring ML expertise! Over the past year, the six of us have been building real-world applications on top of LLMs. We realized that there was a need to distill these lessons in one place for the benefit of the community.

We come from a variety of backgrounds and serve in different roles, but we’ve all experienced firsthand the challenges that come with using this new technology. Two of us are independent consultants who’ve helped numerous clients take LLM projects from initial concept to successful product, seeing the patterns determining success or failure. One of us is a researcher studying how ML/AI teams work and how to improve their workflows. Two of us are leaders on applied AI teams: one at a tech giant and one at a startup. Finally, one of us has taught deep learning to thousands and now works on making AI tooling and infrastructure easier to use. Despite our different experiences, we were struck by the consistent themes in the lessons we’ve learned, and we’re surprised that these insights aren’t more widely discussed.

Our goal is to make this a practical guide to building successful products around LLMs, drawing from our own experiences and pointing to examples from around the industry. We’ve spent the past year getting our hands dirty and gaining valuable lessons, often the hard way. While we don’t claim to speak for the entire industry, here we share some advice and lessons for anyone building products with LLMs.

This work is organized into three sections: tactical, operational, and strategic. This is the first of three pieces. It dives into the tactical nuts and bolts of working with LLMs. We share best practices and common pitfalls around prompting, setting up retrieval-augmented generation, applying flow engineering, and evaluation and monitoring. Whether you’re a practitioner building with LLMs or a hacker working on weekend projects, this section was written for you. Look out for the operational and strategic sections in the coming weeks.

Ready to dive in? Let’s go.

Tactical

In this section, we share best practices for the core components of the emerging LLM stack: prompting tips to improve quality and reliability, evaluation strategies to assess output, retrieval-augmented generation ideas to improve grounding, and more. We also explore how to design human-in-the-loop workflows. While the technology is still rapidly developing, we hope these lessons, the by-product of countless experiments we’ve collectively run, will stand the test of time and help you build and ship robust LLM applications.

Prompting

We recommend starting with prompting when developing new applications. It’s easy to both underestimate and overestimate its importance. It’s underestimated because the right prompting techniques, when used correctly, can get us very far. It’s overestimated because even prompt-based applications require significant engineering around the prompt to work well.

Focus on getting the most out of fundamental prompting techniques

A few prompting techniques have consistently helped improve performance across various models and tasks: n-shot prompts + in-context learning, chain-of-thought, and providing relevant resources.

The idea of in-context learning via n-shot prompts is to provide the LLM with a few examples that demonstrate the task and align outputs to our expectations. A few tips:

  • If n is too low, the model may over-anchor on those specific examples, hurting its ability to generalize. As a rule of thumb, aim for n ≥ 5. Don’t be afraid to go as high as a few dozen.
  • Examples should be representative of the expected input distribution. If you’re building a movie summarizer, include samples from different genres in roughly the proportion you expect to see in practice.
  • You don’t necessarily need to provide the full input-output pairs. In many cases, examples of desired outputs are sufficient.
  • If you are using an LLM that supports tool use, your n-shot examples should also use the tools you want the agent to use.
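The tips above can be sketched as a small prompt builder. The function name, the example data, and the plain-text layout are illustrative assumptions; adapt the format to whatever your model and SDK expect.

```python
def build_n_shot_prompt(instruction, examples, new_input, min_examples=5):
    """Assemble an n-shot prompt from (input, output) example pairs."""
    if len(examples) < min_examples:
        raise ValueError(f"need at least {min_examples} examples, got {len(examples)}")
    parts = [instruction, ""]
    for example_input, example_output in examples:
        parts += [f"Input: {example_input}", f"Output: {example_output}", ""]
    # End on an open "Output:" so the model completes the new case.
    parts += [f"Input: {new_input}", "Output:"]
    return "\n".join(parts)


# Hypothetical examples spanning several genres, as the tips suggest.
EXAMPLES = [
    ("A heist crew plans one last job.", "Thriller: a final heist goes wrong."),
    ("Two rivals fall in love at a bakery.", "Romance: rivals become partners."),
    ("A robot learns to paint.", "Sci-fi: a machine discovers art."),
    ("A family road trip unravels.", "Comedy: a vacation descends into chaos."),
    ("A detective reopens a cold case.", "Mystery: an old case resurfaces."),
]

prompt = build_n_shot_prompt(
    "Summarize each movie plot in one short line.",
    EXAMPLES,
    "A chef inherits a haunted restaurant.",
)
```

The `min_examples` guard encodes the n ≥ 5 rule of thumb; raise or lower it once you have evaluation data for your own task.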

In chain-of-thought (CoT) prompting, we encourage the LLM to explain its thought process before returning the final answer. Think of it as providing the LLM with a sketchpad so it doesn’t have to do it all in memory. The original approach was to simply add the phrase “Let’s think step-by-step” as part of the instructions. However, we’ve found it helpful to make the CoT more specific, where adding specificity via an extra sentence or two often reduces hallucination rates significantly. For example, when asking an LLM to summarize a meeting transcript, we can be explicit about the steps, such as:

  • First, list the key decisions, follow-up items, and associated owners in a sketchpad.
  • Then, check that the details in the sketchpad are factually consistent with the transcript.
  • Finally, synthesize the key points into a concise summary.
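One way to keep such steps explicit and easy to revise is to store them as data and render them into the prompt. This is a minimal sketch; the function name and the `<transcript>` wrapper are assumptions made for the example.

```python
COT_STEPS = [
    "First, list the key decisions, follow-up items, and associated owners in a sketchpad.",
    "Then, check that the details in the sketchpad are factually consistent with the transcript.",
    "Finally, synthesize the key points into a concise summary.",
]


def build_cot_prompt(transcript, steps=COT_STEPS):
    """Render the numbered reasoning steps followed by the transcript."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        "Summarize the meeting transcript below. Work through these steps in order:\n"
        f"{numbered}\n\n"
        f"<transcript>\n{transcript}\n</transcript>"
    )


prompt = build_cot_prompt("Ana: Let's ship on Friday. Ben: I'll update the changelog.")
```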

Recently, some doubt has been cast on whether this technique is as powerful as believed. Additionally, there’s significant debate about exactly what happens during inference when chain-of-thought is used. Regardless, this technique is one to experiment with when possible.

Providing relevant resources is a powerful mechanism to expand the model’s knowledge base, reduce hallucinations, and increase the user’s trust. Often accomplished via retrieval augmented generation (RAG), providing the model with snippets of text that it can directly utilize in its response is an essential technique. When providing the relevant resources, it’s not enough to merely include them; don’t forget to tell the model to prioritize their use, refer to them directly, and sometimes to mention when none of the resources are sufficient. These help “ground” agent responses to a corpus of resources.
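A grounded prompt along these lines might be assembled as follows. Retrieval itself is out of scope here: the snippets are passed in as plain strings, and the tag names and instruction wording are illustrative assumptions rather than a recommended standard.

```python
def build_grounded_prompt(question, resources):
    """Format retrieved snippets plus instructions on how to use them."""
    blocks = "\n".join(
        f'<resource id="{i}">\n{text}\n</resource>'
        for i, text in enumerate(resources, start=1)
    )
    return (
        "Answer the question using the resources below.\n"
        "- Prioritize the resources over your own background knowledge.\n"
        "- Refer to the resources you use by id, for example [1].\n"
        "- If none of the resources are sufficient, say so instead of guessing.\n\n"
        f"{blocks}\n\n"
        f"<question>{question}</question>"
    )


# Hypothetical snippets standing in for retrieval results.
prompt = build_grounded_prompt(
    "What is the return window?",
    ["Returns are accepted within 30 days.", "Shipping takes 3 to 5 business days."],
)
```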

Structure your inputs and outputs

Structured input and output help models better understand the input as well as return output that can reliably integrate with downstream systems. Adding serialization formatting to your inputs can help provide more clues to the model as to the relationships between tokens in the context, additional metadata to specific tokens (like types), or relate the request to similar examples in the model’s training data.

As an example, many questions on the internet about writing SQL begin by specifying the SQL schema. Thus, you might expect effective prompting for Text-to-SQL to include structured schema definitions, and indeed it does.
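
One way to include the schema, sketched here with Python's built-in `sqlite3` module: read the `CREATE TABLE` statements from the database and place them ahead of the question. The function names, the prompt wording, and the example tables are assumptions for illustration.

```python
import sqlite3


def schema_ddl(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements SQLite stores in sqlite_master."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n\n".join(f"{row[0]};" for row in rows)


def build_text_to_sql_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Put the structured schema first, then the natural-language question."""
    return (
        "Given the following SQLite schema, write one SQL query that answers "
        "the question.\n\n"
        f"{schema_ddl(conn)}\n\n"
        f"-- Question: {question}\n"
        "-- SQL:"
    )


if __name__ == "__main__":
    # Hypothetical example tables, created in memory for the demo.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    demo.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id), total REAL)"
    )
    print(build_text_to_sql_prompt(demo, "What is the total spend per customer?"))
```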

Structured output serves a similar purpose, but it also simplifies integration into downstream components of your system. Instructor and Outlines work well for structured output. (If you’re importing an LLM API SDK, use Instructor; if you’re importing Huggingface for a self-hosted model, use Outlines.) Structured input expresses tasks clearly and resembles how the training data is formatted, increasing the probability of better output.

When using structured input, be aware that each LLM family has its own preferences. Claude prefers XML while GPT favors Markdown and JSON. With XML, you can even pre-fill Claude’s responses by providing a response tag like so.


Code:
messages = [
    {
        "role": "user",
        "content": """Extract the <name>, <size>, <price>, and <color>
from this product description into your <response>.
<description>The SmartHome Mini
is a compact smart home assistant
available in black or white for only $49.99.
At just 5 inches wide, it lets you control
lights, thermostats, and other connected
devices via voice or app—no matter where you
place it in your home. This affordable little hub
brings convenient hands-free control to your
smart devices.
</description>""",
    },
    {
        # Pre-filled start of the assistant turn: the model continues from here,
        # so its output begins inside the <response><name> tags.
        "role": "assistant",
        "content": "<response><name>",
    },
]
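
Because the assistant turn is pre-filled, the returned text starts partway through the XML, so the pre-fill has to be put back before parsing. The sketch below shows one way to do that with the standard library. The completion string in the demo is a made-up example of what a model might return, not real output, and the function name is an assumption.

```python
import xml.etree.ElementTree as ET

PREFILL = "<response><name>"  # must match the pre-filled assistant content


def parse_prefilled_response(completion: str, prefill: str = PREFILL) -> dict[str, str]:
    """Re-attach the pre-fill, cut at </response>, and return the fields as a dict."""
    text = prefill + completion
    closing = "</response>"
    end = text.find(closing)
    if end == -1:
        raise ValueError("completion did not close the <response> tag")
    # ET.fromstring raises ParseError if the model produced malformed XML.
    root = ET.fromstring(text[: end + len(closing)])
    return {child.tag: (child.text or "").strip() for child in root}


if __name__ == "__main__":
    # Hypothetical model continuation, for illustration only.
    fake_completion = (
        "SmartHome Mini</name><size>5 inches wide</size>"
        "<price>$49.99</price><color>black or white</color></response>"
    )
    print(parse_prefilled_response(fake_completion))
```

Model output containing unescaped characters such as `&` would fail to parse, so real use would need error handling around this call.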

Have small prompts that do one thing, and only one thing, well

A common anti-pattern/code smell in software is the “God Object,” where we have a single class or function that does everything. The same applies to prompts.

A prompt typically starts simple: a few sentences of instruction, a couple of examples, and we’re good to go. But as we try to improve performance and handle more edge cases, complexity creeps in. More instructions. Multi-step reasoning. Dozens of examples. Before we know it, our initially simple prompt is now a 2,000-token Frankenstein. And to add injury to insult, it has worse performance on the more common and straightforward inputs! GoDaddy shared this challenge as their No. 1 lesson from building with LLMs.

Just like how we strive (read: struggle) to keep our systems and code simple, so should we for our prompts. Instead of having a single, catch-all prompt for the meeting transcript summarizer, we can break it into steps to:

  • Extract key decisions, action items, and owners into a structured format
  • Check extracted details against the original transcript for consistency
  • Generate a concise summary from the structured details

As a result, we’ve split our single prompt into multiple prompts that are each simple, focused, and easy to understand. And by breaking them up, we can now iterate and eval each prompt individually.
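
The split described above can be sketched as three small functions chained together. The `LLM` type is a stand-in for whatever client call is in use (any function that takes a prompt and returns text), and the prompt wording is an assumption for illustration.

```python
from typing import Callable

# Hypothetical interface: a function that sends one prompt and returns the text.
LLM = Callable[[str], str]


def extract_details(transcript: str, llm: LLM) -> str:
    """Step 1: pull decisions, action items, and owners into a structured list."""
    return llm(
        "Extract the key decisions, action items, and owners from this "
        "transcript as a JSON list.\n\n<transcript>\n" + transcript + "\n</transcript>"
    )


def check_details(transcript: str, extracted: str, llm: LLM) -> str:
    """Step 2: verify the extracted details against the original transcript."""
    return llm(
        "Check each extracted item against the transcript. Correct or remove "
        "anything the transcript does not support, and return the revised list."
        "\n\n<extracted>\n" + extracted + "\n</extracted>"
        "\n\n<transcript>\n" + transcript + "\n</transcript>"
    )


def write_summary(checked: str, llm: LLM) -> str:
    """Step 3: generate a concise summary from the verified details only."""
    return llm(
        "Write a concise meeting summary from these details.\n\n"
        "<details>\n" + checked + "\n</details>"
    )


def summarize_meeting(transcript: str, llm: LLM) -> str:
    extracted = extract_details(transcript, llm)
    checked = check_details(transcript, extracted, llm)
    return write_summary(checked, llm)
```

Since each step is its own function with its own prompt, each one can be tested and evaluated separately, for example by passing a stub `llm` in unit tests.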


continue reading on site....
 

bnew



1/1
New updates on multilingual medical models! @huggingface

(i) One strong medical model based on Llama 3, named MMed-Llama 3. @AIatMeta


(ii) More comparisons with medical LLMs, e.g. MEDDITRON, BioMistral.

(iii) More benchmarks (MedQA, PubMedQA, MedMCQA, MMLU).

1/1
Paper: [2402.13963] Towards Building Multilingual Language Model for Medicine

Code & Model: GitHub - MAGIC-AI4Med/MMedLM: The official codes for "Towards Building Multilingual Language Model for Medicine"

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

bnew


Microsoft Edge will translate and dub YouTube videos as you’re watching them

It will also support AI dubbing and subtitles on LinkedIn, Coursera, Bloomberg, and more.

By Emma Roth, a news writer who covers the streaming wars, consumer tech, crypto, social media, and much more. Previously, she was a writer and editor at MUO.

May 21, 2024, 11:30 AM EDT


Microsoft Edge will soon offer real-time video translation on sites like YouTube, LinkedIn, Coursera, and more. As part of this year’s Build event, Microsoft announced that the new AI-powered feature will be able to translate spoken content through both dubbing and subtitles live as you’re watching it.

So far, the feature supports the translation of Spanish into English as well as the translation of English to German, Hindi, Italian, Russian, and Spanish. In addition to offering a neat way to translate videos into a user’s native tongue, Edge’s new AI feature should also make videos more accessible to those who are deaf or hard of hearing.



Edge will also support real-time translation for videos on news sites such as Reuters, CNBC, and Bloomberg. Microsoft plans on adding more languages and supported websites in the future.

This adds to the array of AI features Microsoft has added to Edge through an integration with Copilot. Edge already offers the ability to summarize YouTube videos, but it can’t generate text summaries of every video, as it relies on the video’s transcript to create the summary.
 

bnew





1/4
I see that Dan Hendrycks and the Center for AI Safety have put their new textbook on AI regulation online. The chapter on international governance (8.7) contains the usual aspirational thinking about treaties to regulate AI & computation globally (all the while ignoring how to get China, Russia, or anyone else to go along).

But their list of possible regulatory strategies also includes an apparent desire for a global AI monopoly as an easier instrument of regulatory control. This is absolutely dangerous thinking that must be rejected.
8.7: International Governance | AI Safety, Ethics, and Society Textbook

2/4
Equally problematic is the Center for AI Safety's proposal to use the Biological Weapons Convention of 1972 as a model for global AI regulatory coordination. That would be a terrible model.

When the US and the Soviet Union signed on to the agreement in 1972, it was hailed as a

3/4
I wrote about these and other problematic ideas for global AI control in my big @RSI white paper on "Existential Risks and Global Governance Issues Around AI and Robotics."

4/4
Last year, I debated these proposals for global AI regulation with Dan Hendrycks and the Center for AI Safety at a September Brookings event. My remarks begin around the 51:00 mark.



 

bnew



1/1
Llama 3-V: Close to matching GPT4-V with a 100x smaller model and 500 dollars



 

bnew



1/1
Multi-layer perceptrons (MLPs) can indeed learn in-context competitively with Transformers given the same compute budget, and even outperform Transformers on some relational reasoning tasks.

Paper "MLPs Learn In-Context" by William L. Tong and Cengiz Pehlevan:

MLPs, MLP-Mixers, and Transformers all achieve near-optimal in-context learning performance on regression and classification tasks with sufficient compute, approaching the Bayes optimal estimators. Transformers have a slight efficiency advantage at lower compute budgets.

As data diversity increases, MLPs exhibit a transition from in-weight learning (memorizing training examples) to in-context learning (inferring from novel context examples at test time), similar to the transition observed in Transformers. This transition occurs at somewhat higher diversity levels in MLPs compared to Transformers.

On relational reasoning tasks designed to test geometric relations between inputs, vanilla MLPs actually outperform Transformers in accuracy and out-of-distribution generalization. This challenges common assumptions that MLPs are poor at relational reasoning.

Relationally-bottlenecked MLPs using hand-designed relational features can be highly sample-efficient on well-aligned tasks, but fail to generalize if the task structure deviates even slightly. Vanilla MLPs are more flexible learners.

The strong performance of MLPs, which have weaker inductive biases compared to Transformers, supports the heuristic that "less inductive bias is better" as compute and data grow. Transformers' architectural constraints may orient them towards solutions less aligned with certain task structures.

The authors prove that under sufficient conditions (smoothness, expressivity, data diversity), even MLPs processing one-hot input encodings can generalize to unseen inputs, refuting a recent impossibility result. The key is to use learned input embeddings rather than operating on one-hot encodings directly.


 

bnew





1/3
Paper - 'LoRA Learns Less and Forgets Less'

LoRA works better for instruction finetuning than continued pretraining; it's especially sensitive to learning rates; performance is most affected by choice of target modules and to a smaller extent by rank.

LoRA has a stronger regularizing effect (i.e., it reduces overfitting more) than dropout and weight decay.

Applying LoRA to all layers results in a bigger improvement than increasing the rank.

LoRA saves memory by training only low-rank perturbations to selected weight matrices, reducing the number of trained parameters. The paper compares LoRA and full finetuning performance on code and math tasks in both instruction finetuning (∼100K prompt-response pairs) and continued pretraining (∼10B unstructured tokens) data regimes, using sensitive domain-specific evaluations like HumanEval for code and GSM8K for math.

Results show that LoRA underperforms full finetuning in most settings, with larger gaps for code than math, but LoRA forgets less of the source domain as measured by language understanding, world knowledge, and common-sense reasoning tasks. LoRA and full finetuning form similar learning-forgetting tradeoff curves, with LoRA learning less but forgetting less, though cases exist in code where LoRA learns comparably but forgets less.

Singular value decomposition reveals full finetuning finds weight perturbations with ranks 10-100x higher than typical LoRA configurations, possibly explaining performance gaps.
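
To make the "low-rank perturbations" point concrete, here is a small arithmetic sketch of the trainable-parameter count for one weight matrix. It assumes the standard LoRA formulation, where the update to a d_out x d_in matrix is the product of a d_out x r matrix and an r x d_in matrix. The 4096 x 4096 shape and rank 16 are illustrative numbers, not figures from the paper.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Full finetuning trains every entry of the weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA trains B (d_out x rank) and A (rank x d_in) instead of the full matrix."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 16  # illustrative shape and rank
    full = full_finetune_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"full finetuning: {full:,} trainable parameters")  # 16,777,216
    print(f"LoRA (r={rank}): {lora:,} trainable parameters")   # 131,072
    print(f"reduction: {full // lora}x")                       # 128x
```

The same arithmetic shows why rank matters for capacity: if full finetuning finds perturbations with much higher rank, a rank-16 update can only approximate them.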

2/3
indeed.




 