bnew

Veteran
Joined
Nov 1, 2015
Messages
58,206
Reputation
8,623
Daps
161,864

Microsoft's New AI Recall Feature Could Already Be in Legal Trouble​


Recall can take screenshots of everything a user does on the company's new AI-powered laptops​

By William Gavin, Quartz

Published 7 hours ago

Microsoft has heavily invested in artificial intelligence technology. Image: Drew Angerer (Getty Images)

Microsoft’s full-throttle push into artificial intelligence technology is drawing more scrutiny from regulators worried that the conglomerate is invading consumers’ privacy.

The Redmond, Washington-based tech giant was busy this week at Microsoft Build, its annual developer conference, where it announced a new line of laptops equipped with AI hardware and support for AI applications. One new feature in particular stole the show — but not in the way Microsoft had likely hoped.

The feature, called Recall, uses AI to build a “photographic memory” of a user’s laptop activity that they can then search. In other words, Recall constantly takes screenshots of a user’s activity on the computer, whether they’re searching for new recipes online, watching videos, or using apps.

“We can recreate moments from the past essentially,” Microsoft CEO Satya Nadella told the Wall Street Journal.

Microsoft’s announcement of the feature was met with instant backlash from privacy advocates and consumers, including Tesla CEO and xAI founder Elon Musk. The tech giant also has to worry about the Information Commissioner’s Office (ICO), a U.K. data watchdog, which told the BBC it was reaching out to Microsoft for more information on Recall.

A spokesperson for the ICO told the BBC that companies must “rigorously assess and mitigate risks to peoples’ rights and freedoms” before launching new products, especially those that are potentially invasive.

Microsoft says that Recall snapshots are stored locally on the PCs, encrypted, and can only be accessed by the person whose profile was used to sign into the computer. Users will also be able to filter out specific apps or websites from being scanned, pause snapshot collection, and delete some or all snapshots stored on their device.

“We know that privacy is important,” Microsoft said in a blog post Monday. “Copilot+ PCs are also designed so that even the AI running on your device can’t access your private content. In addition, IT admins can use Microsoft Intune to disable Recall from saving any snapshots, and new policies are coming later to enable IT to centrally filter specific apps and websites.”
 

bnew


OpenAI Just Gave Away the Entire Game​

The Scarlett Johansson debacle is a microcosm of AI’s raw deal: It’s happening, and you can’t stop it.

By Charlie Warzel

Photo collage of Scarlett Johansson, the OpenAI logo, and Sam Altman

Illustration by Paul Spella / The Atlantic. Sources: Jason Redmond / AFP; Paul Morigi / Getty.

MAY 21, 2024, 5:54 PM ET

If you’re looking to understand the philosophy that underpins Silicon Valley’s latest gold rush, look no further than OpenAI’s Scarlett Johansson debacle. The story, according to Johansson’s lawyers, goes like this: Nine months ago, OpenAI CEO Sam Altman approached the actor with a request to license her voice for a new digital assistant; Johansson declined. She alleges that just two days before the company’s keynote event last week, in which that assistant was revealed as part of a new system called GPT-4o, Altman reached out to Johansson’s team, urging the actor to reconsider. Johansson and Altman allegedly never spoke, and Johansson allegedly never granted OpenAI permission to use her voice. Nevertheless, the company debuted Sky two days later—a program with a voice many believed was alarmingly similar to Johansson’s.

Johansson told NPR that she was “shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine.” In response, Altman issued a statement denying that the company had cloned her voice and saying that it had already cast a different voice actor before reaching out to Johansson. (I’d encourage you to listen for yourself.) Curiously, Altman said that OpenAI would take down Sky’s voice from its platform “out of respect” for Johansson. This is a messy situation for OpenAI, complicated by Altman’s own social-media posts. On the day that OpenAI released ChatGPT’s assistant, Altman posted a cheeky, one-word statement on X: “Her”—a reference to the 2013 film of the same name, in which Johansson is the voice of an AI assistant that a man falls in love with. Altman’s post is reasonably damning, implying that Altman was aware, even proud, of the similarities between Sky’s voice and Johansson’s.

On its own, this seems to be yet another example of a tech company blowing past ethical concerns and operating with impunity. But the situation is also a tidy microcosm of the raw deal at the center of generative AI, a technology that is built off data scraped from the internet, generally without the consent of creators or copyright owners. Multiple artists and publishers, including The New York Times, have sued AI companies for this reason, but the tech firms remain unchastened, prevaricating when asked point-blank about the provenance of their training data. At the core of these deflections is an implication: The hypothetical superintelligence they are building is too big, too world-changing, too important for prosaic concerns such as copyright and attribution. The Johansson scandal is merely a reminder of AI’s manifest-destiny philosophy: This is happening, whether you like it or not.

Altman and OpenAI have been candid on this front. The end goal of OpenAI has always been to build a so-called artificial general intelligence, or AGI, that would, in their imagining, alter the course of human history forever, ushering in an unthinkable revolution of productivity and prosperity—a utopian world where jobs disappear, replaced by some form of universal basic income, and humanity experiences quantum leaps in science and medicine. (Or, the machines cause life on Earth as we know it to end.) The stakes, in this hypothetical, are unimaginably high—all the more reason for OpenAI to accelerate progress by any means necessary. Last summer, my colleague Ross Andersen described Altman’s ambitions thusly:

As with other grand projects of the 20th century, the voting public had a voice in both the aims and the execution of the Apollo missions. Altman made it clear that we’re no longer in that world. Rather than waiting around for it to return, or devoting his energies to making sure that it does, he is going full throttle forward in our present reality.

Part of Altman’s reasoning, he told Andersen, is that AI development is a geopolitical race against autocracies like China. “If you are a person of a liberal-democratic country, it is better for you to cheer on the success of OpenAI” rather than that of “authoritarian governments,” he said. He noted that, in an ideal world, AI should be a product of nations. But in this world, Altman seems to view his company as akin to its own nation-state. Altman, of course, has testified before Congress, urging lawmakers to regulate the technology while also stressing that “the benefits of the tools we have deployed so far vastly outweigh the risks.” Still, the message is clear: The future is coming, and you ought to let us be the ones to build it.

Other OpenAI employees have offered a less gracious vision. In a video posted last fall on YouTube by a group of effective altruists in the Netherlands, three OpenAI employees answered questions about the future of the technology. In response to one question about AGI rendering jobs obsolete, Jeff Wu, an engineer for the company, confessed, “It’s kind of deeply unfair that, you know, a group of people can just build AI and take everyone’s jobs away, and in some sense, there’s nothing you can do to stop them right now.” He added, “I don’t know. Raise awareness, get governments to care, get other people to care. Yeah. Or join us and have one of the few remaining jobs. I don’t know; it’s rough.” Wu’s colleague Daniel Kokotajlo jumped in with the justification. “To add to that,” he said, “AGI is going to create tremendous wealth. And if that wealth is distributed—even if it’s not equitably distributed, but the closer it is to equitable distribution, it’s going to make everyone incredibly wealthy.” (There is no evidence to suggest that the wealth will be evenly distributed.)

This is the unvarnished logic of OpenAI. It is cold, rationalist, and paternalistic. That such a small group of people should be anointed to build a civilization-changing technology is inherently unfair, they note. And yet they will carry on because they have both a vision for the future and the means to try to bring it to fruition. Wu’s proposition, which he offers with a resigned shrug in the video, is telling: You can try to fight this, but you can’t stop it. Your best bet is to get on board.

You can see this dynamic playing out in OpenAI’s content-licensing agreements, which it has struck with platforms such as Reddit and news organizations such as Axel Springer and Dotdash Meredith. Recently, a tech executive I spoke with compared these types of agreements to a hostage situation, suggesting they believe that AI companies will find ways to scrape publishers’ websites anyhow, if they don’t comply. Best to get a paltry fee out of them while you can, the person argued.

The Johansson accusations only compound (and, if true, validate) these suspicions. Altman’s alleged reasoning for commissioning Johansson’s voice was that her familiar timbre might be “comforting to people” who find AI assistants off-putting. Her likeness would have been less about a particular voice-bot aesthetic and more of an adoption hack or a recruitment tool for a technology that many people didn’t ask for, and seem uneasy about. Here, again, is the logic of OpenAI at work. It follows that the company would plow ahead, consent be damned, simply because it might believe the stakes are too high to pivot or wait. When your technology aims to rewrite the rules of society, it stands that society’s current rules need not apply.

Hubris and entitlement are inherent in the development of any transformative technology. A small group of people needs to feel confident enough in its vision to bring it into the world and ask the rest of us to adapt. But generative AI stretches this dynamic to the point of absurdity. It is a technology that requires a mindset of manifest destiny, of dominion and conquest. It’s not stealing to build the future if you believe it has belonged to you all along.
 

bnew


Every front-end GUI client for ChatGPT API​

Similar to Every Proximity Chat App, I made this list to keep track of every graphical user interface alternative to ChatGPT.

If you want to add your app, feel free to open a pull request and list it under the appropriate category in alphabetical order. If you want your app removed from this list, you can open a pull request for that too.

Open Source​

Web​

Browser Extension​

Self-Hosted​

Desktop​

Not Open Source​

Web​

Desktop​

More plugins and tools​





About​

Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development

LibreChat




📃 Features​

  • 🖥️ UI matching ChatGPT, including Dark mode, Streaming, and latest updates
  • 🤖 AI model selection:
    • OpenAI, Azure OpenAI, BingAI, ChatGPT, Google Vertex AI, Anthropic (Claude), Plugins, Assistants API (including Azure Assistants)
  • ✅ Compatible across both Remote & Local AI services:
    • groq, Ollama, Cohere, Mistral AI, Apple MLX, koboldcpp, OpenRouter, together.ai, Perplexity, ShuttleAI, and more
  • 💾 Create, Save, & Share Custom Presets
  • 🔀 Switch between AI Endpoints and Presets, mid-chat
  • 🔄 Edit, Resubmit, and Continue Messages with Conversation branching
  • 🌿 Fork Messages & Conversations for Advanced Context control
  • 💬 Multimodal Chat:
    • Upload and analyze images with Claude 3, GPT-4 (including gpt-4o), and Gemini Vision 📸
    • Chat with Files using Custom Endpoints, OpenAI, Azure, Anthropic, & Google. 🗃️
    • Advanced Agents with Files, Code Interpreter, Tools, and API Actions 🔦
  • 🌎 Multilingual UI:
    • English, 中文, Deutsch, Español, Français, Italiano, Polski, Português Brasileiro,
    • Русский, 日本語, Svenska, 한국어, Tiếng Việt, 繁體中文, العربية, Türkçe, Nederlands, עברית
  • 🎨 Customizable Dropdown & Interface: Adapts to both power users and newcomers.
  • 📥 Import Conversations from LibreChat, ChatGPT, Chatbot UI
  • 📤 Export conversations as screenshots, markdown, text, json.
  • 🔍 Search all messages/conversations
  • 🔌 Plugins, including web access, image generation with DALL-E-3 and more
  • 👥 Multi-User, Secure Authentication with Moderation and Token spend tools
  • ⚙️ Configure Proxy, Reverse Proxy, Docker, & many Deployment options:
    • Use completely local or deploy on the cloud
  • 📖 Completely Open-Source & Built in Public
  • 🧑‍🤝‍🧑 Community-driven development, support, and feedback
For a thorough review of our features, see our docs here 📚

🪶 All-In-One AI Conversations with LibreChat​

LibreChat brings together the future of assistant AIs with the revolutionary technology of OpenAI's ChatGPT. Celebrating the original styling, LibreChat gives you the ability to integrate multiple AI models. It also integrates and enhances original client features such as conversation and message search, prompt templates and plugins.

With LibreChat, you no longer need to opt for ChatGPT Plus and can instead use free or pay-per-call APIs. We welcome contributions, cloning, and forking to enhance the capabilities of this advanced chatbot platform.

 

bnew


Computer Science > Computation and Language​

[Submitted on 17 Oct 2023]

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting

Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
As large language models (LLMs) are adopted as a fundamental component of language technologies, it is crucial to accurately characterize their performance. Because choices in prompt design can strongly influence model behavior, this design process is critical in effectively using any modern pre-trained generative language model. In this work, we focus on LLM sensitivity to a quintessential class of meaning-preserving design choices: prompt formatting. We find that several widely used open-source LLMs are extremely sensitive to subtle changes in prompt formatting in few-shot settings, with performance differences of up to 76 accuracy points when evaluated using LLaMA-2-13B. Sensitivity remains even when increasing model size, the number of few-shot examples, or performing instruction tuning. Our analysis suggests that work evaluating LLMs with prompting-based methods would benefit from reporting a range of performance across plausible prompt formats, instead of the currently-standard practice of reporting performance on a single format. We also show that format performance only weakly correlates between models, which puts into question the methodological validity of comparing models with an arbitrarily chosen, fixed prompt format. To facilitate systematic analysis we propose FormatSpread, an algorithm that rapidly evaluates a sampled set of plausible prompt formats for a given task, and reports the interval of expected performance without accessing model weights. Furthermore, we present a suite of analyses that characterize the nature of this sensitivity, including exploring the influence of particular atomic perturbations and the internal representation of particular formats.
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2310.11324 [cs.CL]
(or arXiv:2310.11324v1 [cs.CL] for this version)

Submission history

From: Melanie Sclar
[v1] Tue, 17 Oct 2023 15:03:30 UTC (1,110 KB)
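
To make the abstract's idea concrete, here is a minimal sketch of measuring performance spread across meaning-preserving prompt formats. It is not the authors' FormatSpread algorithm (which samples formats rather than enumerating them); the formatting choices, `toy_model`, and the two-item dataset are hypothetical stand-ins for a real model and task.

```python
import itertools

# Hypothetical meaning-preserving formatting choices.
SEPARATORS = [": ", " - ", ":\n"]
CASINGS = [str.title, str.upper, str.lower]
FIELD_JOINERS = ["\n", " || "]


def make_formats():
    """Enumerate every combination of the formatting choices above."""
    return list(itertools.product(SEPARATORS, CASINGS, FIELD_JOINERS))


def render(fmt, question):
    """Render one prompt under a given format; the content never changes."""
    sep, casing, joiner = fmt
    fields = [f"{casing('question')}{sep}{question}", f"{casing('answer')}{sep}"]
    return joiner.join(fields)


def toy_model(prompt):
    """Stand-in for an LLM call, deliberately format-sensitive.

    It only answers correctly when the prompt starts with "Question: ".
    A real model would be queried here instead.
    """
    return "4" if prompt.startswith("Question: ") else "?"


def format_spread(dataset, model):
    """Return (lowest accuracy, highest accuracy) over all formats."""
    accuracies = []
    for fmt in make_formats():
        correct = sum(model(render(fmt, q)) == a for q, a in dataset)
        accuracies.append(correct / len(dataset))
    return min(accuracies), max(accuracies)


dataset = [("What is 2 + 2?", "4"), ("What is 1 + 3?", "4")]
low, high = format_spread(dataset, toy_model)
print(f"accuracy range across formats: {low:.2f} to {high:.2f}")
```

Reporting the pair `(low, high)` rather than a single number is the point: a score from one arbitrarily chosen format says little about where the model sits in that range.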

 

bnew




We created the first open-source implementation of Meta’s TestGen-LLM

CODE INTEGRITY


Itamar Friedman
May 20, 2024 • 6 min read

In February, Meta researchers published a paper titled Automated Unit Test Improvement using Large Language Models at Meta, which introduces a tool they called TestGen-LLM. The fully automated approach to increasing test coverage “with guaranteed assurances for improvement over the existing code base” created waves in the software engineering world.

Meta didn’t release the TestGen-LLM code, so we decided to implement it as part of our open-source Cover-Agent, and we’re releasing it today!

I’ll share some information here on how we went about implementing it, share some of our findings and outline the challenges we encountered when actually using TestGen-LLM with real-world codebases.

Automated Test Generation: Baseline Criteria​

Automated test generation using Generative AI is nothing new. Most LLMs that are competent at generating code, such as ChatGPT, Gemini, and Code Llama, are capable of generating tests. The most common pitfall that developers run into when generating tests with LLMs is that most generated tests don’t even work and many don’t add value (e.g. they test the same functionality already covered by other tests).

To overcome this challenge (specifically, for regression unit tests) the TestGen-LLM authors came up with the following criteria:

  1. Does the test compile and run properly?
  2. Does the test increase code coverage?

Without answering these two fundamental questions, arguably, there’s no point in accepting or analyzing the generated test provided to us by the LLM.

Once we’ve validated that the tests are capable of running correctly and that they increase the coverage of our component under test, we can start to investigate (in a manual review):

  1. How well is the test written?
  2. How much value does it actually add? (We all know that code coverage can sometimes be a proxy or even a vanity metric.)
  3. Does it meet any additional requirements that we may have?


Approach and reported results​

TestGen-LLM (and Cover-Agent) run completely headless (well, kind of; we will discuss this later).

From the TestGen-LLM paper

First, TestGen-LLM generates a batch of tests. It then filters out those that don’t build or run, drops any that don’t pass, and finally discards those that don’t increase code coverage. In highly controlled cases, about one in four generated tests passes all of these steps; in real-world scenarios, Meta’s authors report about one in 20.
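
The filtering order can be sketched as follows. This is an illustrative sketch only, not code from the paper or from Cover-Agent: the `Candidate` record and its fields are hypothetical, and in practice each field would come from actually building and running the test.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Outcome of running one generated test (all fields hypothetical)."""
    name: str
    builds: bool
    passes: bool
    coverage_after: float  # coverage with this test added, in percent


def filter_candidates(candidates, baseline_coverage):
    """Apply the three filters in order: builds, passes, raises coverage."""
    kept = []
    for c in candidates:
        if not c.builds:
            continue  # filter 1: does not build or run
        if not c.passes:
            continue  # filter 2: runs but fails
        if c.coverage_after <= baseline_coverage:
            continue  # filter 3: adds no coverage
        kept.append(c)
    return kept


candidates = [
    Candidate("test_a", builds=False, passes=False, coverage_after=60.0),
    Candidate("test_b", builds=True, passes=False, coverage_after=60.0),
    Candidate("test_c", builds=True, passes=True, coverage_after=60.0),
    Candidate("test_d", builds=True, passes=True, coverage_after=63.5),
]
survivors = filter_candidates(candidates, baseline_coverage=60.0)
print([c.name for c in survivors])
```

Only tests that clear every filter reach the human reviewer, which is why the surviving fraction is small.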

Following the automated process, Meta had a human reviewer accept or reject tests. The authors reported an average acceptance ratio of 1:2, with a 73% acceptance rate in their best reported cases.

It is important to note that the TestGen-LLM tool, as described in the paper, generates on each run a single test that is added to an existing test suite, written previously by a professional developer. Moreover, it doesn’t necessarily generate tests for any given test suite.

From the paper: “In total, over the three test-a-thons, 196 test classes were successfully improved, while the TestGen-LLM tool was applied to a total of 1,979 test classes. TestGen-LLM was therefore able to automatically improve approximately 10% of the test classes to which it was applied.”

Cover-Agent v0.1 flow

Cover-Agent v0.1 is implemented as follows:

  1. Receive the following user inputs:
    1. Source File for code under test
    2. Existing Test Suite to enhance
    3. Coverage Report
    4. The command for building and running test suite
    5. Code coverage target and maximum iterations to run
    6. Additional context and prompting options
  2. Generate more tests in the same style
  3. Validate those tests using your runtime environment
    1. Do they build and pass?
  4. Ensure that the tests add value by reviewing metrics such as increased code coverage
  5. Update existing Test Suite and Coverage Report
  6. Repeat until code reaches criteria: either code coverage threshold met, or reached the maximum number of iterations
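
As a rough sketch of the loop in steps 2 through 6 (not Cover-Agent's actual source), the control flow looks something like the code below. `run_coverage_loop`, `fake_generate`, and `fake_validate` are hypothetical names; the two stubs stand in for the LLM call and for the user's build-and-test command.

```python
def run_coverage_loop(generate, validate, coverage, target, max_iterations):
    """Generate and validate tests until the target or the iteration cap is hit.

    generate(failed) returns a list of candidate tests.
    validate(test, current_coverage) returns the new coverage if the test
    builds, passes, and raises coverage; otherwise it returns None.
    """
    accepted, failed = [], []
    iterations = 0
    while coverage < target and iterations < max_iterations:
        iterations += 1
        for test in generate(failed):
            new_coverage = validate(test, coverage)
            if new_coverage is None:
                failed.append(test)      # remembered as feedback for later prompts
            else:
                accepted.append(test)    # step 5: update suite and coverage
                coverage = new_coverage
    return accepted, failed, coverage, iterations


# Hypothetical stubs standing in for the LLM and the test runner.
def fake_generate(failed):
    return [f"test_{len(failed)}_{i}" for i in range(2)]


def fake_validate(test, current):
    # Pretend tests ending in "_0" are good and add 10 points each.
    return current + 10.0 if test.endswith("_0") else None


accepted, failed, final, used = run_coverage_loop(
    fake_generate, fake_validate, coverage=50.0, target=70.0, max_iterations=5
)
print(accepted, failed, final, used)
```

The two stopping conditions in the `while` line correspond directly to the coverage threshold and the maximum number of iterations supplied by the user.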

Challenges we encountered when implementing and reviewing TestGen-LLM​

As we worked on putting the TestGen-LLM paper into practice, we ran into some surprising challenges.

The examples presented in the paper mention using Kotlin for writing tests – a language that doesn’t use significant whitespace. With languages like Python on the other hand, tabs and spaces are not only important but a requirement for the parsing engine. Less sophisticated models, such as GPT 3.5, won’t return code that is consistently indented properly, even when explicitly prompted. An example of where this causes issues is a test class written in Python that requires each test function to be indented. We had to account for this throughout our development lifecycle which added more complexity around pre-processing libraries. There is still plenty to improve on in order to make Cover-Agent robust in scenarios like this.

From the TestGen-LLM paper: the original prompts suggested in TestGen-LLM.

After seeing the special test requirements and exceptions we encountered during our trials, we decided to give the user the ability to provide additional input or instructions to prompt the LLM as part of the Cover-Agent flow. The `--additional-instructions` option allows developers to provide any extra information that’s specific to their project, empowering them to customize Cover-Agent. These instructions can be used, for example, to steer Cover-Agent to create a rich set of tests with meaningful edge cases.

Concurring with the general trend of Retrieval-Augmented Generation (RAG) becoming more pervasive in AI-based applications, we identified that having more context to go along with unit test generation enables higher quality tests and a higher passing rate. We’ve provided the `--included-files` option to users who want to manually add additional libraries or text-based design documents as context for the LLM to enhance the test generation process.

Complex code that required multiple iterations presented another challenge to the LLMs. As the failed (or non-value added) tests were generated, we started to notice a pattern where the same non-accepted tests were repeatedly suggested in later iterations. To combat this we added a “Failed Tests” section to the prompt to deliver that feedback to the LLM and ensure it generated unique tests and never repeated tests that we deemed unusable (i.e. broken or lack of coverage increase).
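
A minimal sketch of how such a section might be appended to a prompt is shown below. The section headings, wording, and the `build_prompt` helper are illustrative assumptions, not Cover-Agent's actual prompt template.

```python
def build_prompt(source_code, existing_tests, failed_tests):
    """Assemble a prompt; headings and wording here are illustrative only."""
    parts = [
        "## Source File\n" + source_code,
        "## Existing Tests\n" + existing_tests,
    ]
    if failed_tests:
        # Deduplicate while keeping first-seen order.
        unique = list(dict.fromkeys(failed_tests))
        listing = "\n".join(f"- {t}" for t in unique)
        parts.append("## Failed Tests\nDo not suggest these again:\n" + listing)
    parts.append("## Task\nWrite new tests that increase coverage.")
    return "\n\n".join(parts)


prompt = build_prompt(
    "def add(a, b): return a + b",
    "def test_add(): assert add(1, 2) == 3",
    ["test_x", "test_x", "test_y"],
)
print(prompt)
```

Omitting the section when the list is empty keeps the first iteration's prompt free of an empty heading.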

Another challenge that came up throughout this process was the inability to add library imports when extending an existing test suite. Developers can sometimes be myopic in their test generation process, only using a single approach to testing frameworks. In addition to many different mocking frameworks, other libraries can help with achieving test coverage. Since the TestGen-LLM paper (and Cover-Agent) are intended to extend existing test suites, the ability to completely restructure the whole test class is out of scope. This is, in my opinion, a limitation of test extension versus test generation and something we plan on addressing in future iterations.

It’s important to make the distinction that in TestGen-LLM’s approach, each test requires a manual review from the developer before the next test is suggested. In Cover-Agent on the other hand, we generate, validate, and propose as many tests as possible until achieving the coverage requirement (or stopping at the max iterations), without requiring manual intervention throughout the process. We leverage AI to run in the background, creating an unobtrusive approach to automatic test generation that allows the developer to review the entire test suite once the process has completed.

Conclusion and what’s next​

While many, including myself, are excited about the TestGen-LLM paper and tool, in this post we have shared its limitations. I believe that we are still in the era of AI assistants and not AI teammates who run fully automated workflows.

At the same time, well-engineered flows, which we plan to develop and share here in Cover-Agent, can help us developers automatically generate test candidates, and increase code coverage in a fraction of the time.

We intend to continue developing and integrating cutting-edge methods related to the test generation domain into the Cover-Agent open-source repo.

We encourage anyone interested in generative AI for testing to collaborate and help extend the capabilities of Cover Agent, and we hope to inspire researchers to leverage this open-source tool to explore new test-generation techniques.

In the open-source Cover-Agent repo on GitHub we’ve added a development roadmap. We would love to see you contributing to the repo according to the roadmap or according to your own ideas!

Our vision for Cover-Agent is that in the future it will run automatically for every pre/post-pull request and automatically suggest regression test enhancements that have been validated to work and increase code coverage. We envision that Cover-Agent will automatically scan your codebase, and open PRs with test suites for you.

Let’s leverage AI to help us deal more efficiently with the tasks we don’t like doing!

P.S.

  1. We are still looking for a good benchmark for tools like this. Do you know of one? We think it is critical for further development and research.
  2. Check out our AlphaCodium work for (a) further reading on “Flow Engineering”, as well as an example of (b) a competitive programming benchmark, and (c) a well-designed dataset called CodeContests.
 

bnew


Mapping the Mind of a Large Language Model​

May 21, 2024

Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer.

We mostly treat AI models as a black box: something goes in and a response comes out, and it's not clear why the model gave that particular response instead of another. This makes it hard to trust that these models are safe: if we don't know how they work, how do we know they won't give harmful, biased, untruthful, or otherwise dangerous responses? How can we trust that they’ll be safe and reliable?

Opening the black box doesn't necessarily help: the internal state of the model—what the model is "thinking" before writing its response—consists of a long list of numbers ("neuron activations") without a clear meaning. From interacting with a model like Claude, it's clear that it’s able to understand and wield a wide range of concepts—but we can't discern them from looking directly at neurons. It turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.

Previously, we made some progress matching patterns of neuron activations, called features, to human-interpretable concepts. We used a technique called "dictionary learning", borrowed from classical machine learning, which isolates patterns of neuron activations that recur across many different contexts. In turn, any internal state of the model can be represented in terms of a few active features instead of many active neurons. Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features.
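
As a toy illustration of that last sentence, the sketch below rebuilds a four-"neuron" internal state from two active features. The feature names, directions, and strengths are invented for the example, and it shows only the reconstruction step, not how a dictionary is actually learned.

```python
# Hypothetical dictionary: each feature is a direction over 4 "neurons".
FEATURES = {
    "uppercase_text": [1.0, 0.5, 0.0, 0.0],
    "dna_sequence": [0.0, 0.5, 1.0, 0.0],
    "python_argument": [0.0, 0.0, 0.5, 1.0],
}


def reconstruct(active):
    """Rebuild neuron activations as a weighted sum of the active features."""
    state = [0.0] * 4
    for name, strength in active.items():
        for i, weight in enumerate(FEATURES[name]):
            state[i] += strength * weight
    return state


# Two active features account for activity spread across three neurons.
state = reconstruct({"uppercase_text": 2.0, "dna_sequence": 1.0})
print(state)
```

Note that the second neuron receives a contribution from both features, which is the sense in which each neuron takes part in representing many concepts.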

In October 2023, we reported success applying dictionary learning to a very small "toy" language model and found coherent features corresponding to concepts like uppercase text, DNA sequences, surnames in citations, nouns in mathematics, or function arguments in Python code.

Those concepts were intriguing—but the model really was very simple. Other researchers subsequently applied similar techniques to somewhat larger and more complex models than in our original study. But we were optimistic that we could scale up the technique to the vastly larger AI language models now in regular use, and in doing so, learn a great deal about the features supporting their sophisticated behaviors. This required going up by many orders of magnitude—from a backyard bottle rocket to a Saturn-V.

There was both an engineering challenge (the raw sizes of the models involved required heavy-duty parallel computation) and scientific risk (large models behave differently to small ones, so the same technique we used before might not have worked). Luckily, the engineering and scientific expertise we've developed training large language models for Claude actually transferred to helping us do these large dictionary learning experiments. We used the same scaling law philosophy that predicts the performance of larger models from smaller ones to tune our methods at an affordable scale before launching on Sonnet.

As for the scientific risk, the proof is in the pudding.

We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet (a member of our current, state-of-the-art model family, currently available on claude.ai), providing a rough conceptual map of its internal states halfway through its computation. This is the first ever detailed look inside a modern, production-grade large language model.

Whereas the features we found in the toy language model were rather superficial, the features we found in Sonnet have a depth, breadth, and abstraction reflecting Sonnet's advanced capabilities.

We see features corresponding to a vast range of entities like cities (San Francisco), people (Rosalind Franklin), atomic elements (Lithium), scientific fields (immunology), and programming syntax (function calls). These features are multimodal and multilingual, responding to images of a given entity as well as its name or description in many languages.

Golden Gate Bridge Feature
A feature sensitive to mentions of the Golden Gate Bridge fires on a range of model inputs, from English mentions of the name of the bridge to discussions in Japanese, Chinese, Greek, Vietnamese, Russian, and an image. The orange color denotes the words or word-parts on which the feature is active.

We also find more abstract features—responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets.

Abstract Feature Examples
Three examples of features that activate on more abstract concepts: bugs in computer code, descriptions of gender bias in professions, and conversations about keeping secrets.

We were able to measure a kind of "distance" between features based on which neurons appeared in their activation patterns. This allowed us to look for features that are "close" to each other. Looking near a "Golden Gate Bridge" feature, we found features for Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.

This holds at a higher level of conceptual abstraction: looking near a feature related to the concept of "inner conflict", we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase "catch-22". This shows that the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity. This might be the origin of Claude's excellent ability to make analogies and metaphors.

Nearest Neighbors to the Inner Conflict Feature
A map of the features near an "Inner Conflict" feature, including clusters related to balancing tradeoffs, romantic struggles, conflicting allegiances, and catch-22s.
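The "distance" between features described above can be pictured with a small sketch. The post does not give the exact metric, so the snippet assumes cosine similarity between each feature's vector of neuron weights. The feature names and three-dimensional vectors are invented for illustration and are not taken from the model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_features(query, features, k=2):
    """Return the k feature names most similar to `query`, excluding itself."""
    scores = [
        (name, cosine_similarity(features[query], vec))
        for name, vec in features.items()
        if name != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scores[:k]]

# Hypothetical toy vectors: each entry stands in for one neuron's weight.
toy_features = {
    "golden_gate_bridge": [0.9, 0.1, 0.0],
    "alcatraz_island": [0.8, 0.2, 0.1],
    "vertigo_film": [0.7, 0.1, 0.3],
    "python_function_call": [0.0, 0.1, 0.9],
}

neighbors = nearest_features("golden_gate_bridge", toy_features, k=2)
print(neighbors)
```

With these made-up vectors the two San Francisco-related features rank ahead of the unrelated programming feature, which is the kind of clustering the post describes.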

Importantly, we can also manipulate these features, artificially amplifying or suppressing them to see how Claude's responses change.



For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant.

We also found a feature that activates when Claude reads a scam email (this presumably supports the model’s ability to recognize such emails and warn you not to respond to them). Normally, if one asks Claude to generate a scam email, it will refuse to do so. But when we ask the same question with the feature artificially activated sufficiently strongly, this overcomes Claude's harmlessness training and it responds by drafting a scam email. Users of our models don’t have the ability to strip safeguards and manipulate models in this way—but in our experiments, it was a clear demonstration of how features can be used to change how a model acts.

The fact that manipulating these features causes corresponding changes to behavior validates that they aren't just correlated with the presence of concepts in input text, but also causally shape the model's behavior. In other words, the features are likely to be a faithful part of how the model internally represents the world, and how it uses these representations in its behavior.
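As a rough picture of what "amplifying or suppressing" a feature could mean, the sketch below adds a scaled feature direction to a vector of internal activations. The post does not describe the mechanism, so this is an assumption in its simplest form. The function names, the four-dimensional vectors, and the strength values are all hypothetical.

```python
def steer(activations, feature_direction, strength):
    """Add `strength` times a feature direction to an activation vector."""
    return [a + strength * d for a, d in zip(activations, feature_direction)]

def feature_activation(activations, feature_direction):
    """Read a feature's activation as a dot product with its direction."""
    return sum(a * d for a, d in zip(activations, feature_direction))

# Hypothetical activations and a unit-length "Golden Gate Bridge" direction.
activations = [0.2, -0.1, 0.5, 0.0]
bridge_direction = [0.0, 0.0, 0.0, 1.0]

amplified = steer(activations, bridge_direction, strength=10.0)
suppressed = steer(activations, bridge_direction, strength=-10.0)

print(feature_activation(activations, bridge_direction))
print(feature_activation(amplified, bridge_direction))
print(feature_activation(suppressed, bridge_direction))
```

Only the component along the chosen direction changes. The rest of the activation vector is left as it was, which is why a single feature can be pushed up or down in isolation.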

Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse, including in scenarios of catastrophic risk. It’s therefore particularly interesting that, in addition to the aforementioned scam emails feature, we found features corresponding to:

  • Capabilities with misuse potential (code backdoors, developing biological weapons)
  • Different forms of bias (gender discrimination, racist claims about crime)
  • Potentially problematic AI behaviors (power-seeking, manipulation, secrecy)


We previously studied sycophancy, the tendency of models to provide responses that match user beliefs or desires rather than truthful ones. In Sonnet, we found a feature associated with sycophantic praise, which activates on inputs containing compliments like, "Your wisdom is unquestionable". Artificially activating this feature causes Sonnet to respond to an overconfident user with just such flowery deception.

Activating Features Alters Model Behavior
Two model responses to a human saying they invented the phrase "Stop and smell the roses." The default response corrects the human's misconception, while the response with a "sycophantic praise" feature set to a high value is fawning and untruthful.

The presence of this feature doesn't mean that Claude will be sycophantic, but merely that it could be. We have not added any capabilities, safe or unsafe, to the model through this work. We have, rather, identified the parts of the model involved in its existing capabilities to recognize and potentially produce different kinds of text. (While you might worry that this method could be used to make models more harmful, researchers have demonstrated much simpler ways that someone with access to model weights can remove safeguards.)

We hope that we and others can use these discoveries to make models safer. For example, it might be possible to use the techniques described here to monitor AI systems for certain dangerous behaviors (such as deceiving the user), to steer them towards desirable outcomes (debiasing), or to remove certain dangerous subject matter entirely. We might also be able to enhance other safety techniques, such as Constitutional AI, by understanding how they shift the model towards more harmless and more honest behavior and identifying any gaps in the process. The latent capabilities to produce harmful text that we saw by artificially activating features are exactly the sort of thing jailbreaks try to exploit. We are proud that Claude has a best-in-industry safety profile and resistance to jailbreaks, and we hope that by looking inside the model in this way we can figure out how to improve safety even further. Finally, we note that these techniques can provide a kind of "test set for safety", looking for the problems left behind after standard training and finetuning methods have ironed out all behaviors visible via standard input/output interactions.
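The monitoring idea can be sketched as a simple threshold check on a feature's per-token activations. This is an illustration under stated assumptions rather than a description of any deployed system. The function name, the threshold, and the activation values are hypothetical.

```python
def flag_tokens(tokens, activations, threshold):
    """Return (token, activation) pairs whose activation exceeds `threshold`."""
    return [(t, a) for t, a in zip(tokens, activations) if a > threshold]

# Hypothetical per-token activations of a "sycophantic praise" feature.
tokens = ["Your", "wisdom", "is", "unquestionable", "."]
activations = [0.1, 2.4, 0.0, 3.1, 0.0]

flagged = flag_tokens(tokens, activations, threshold=1.0)
print(flagged)
```

A real monitor would need a threshold calibrated against known examples, since a feature firing weakly on benign text should not raise an alert.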

Anthropic has made a significant investment in interpretability research since the company's founding, because we believe that understanding models deeply will help us make them safer. This new research marks an important milestone in that effort—the application of mechanistic interpretability to publicly-deployed large language models.

But the work has really just begun. The features we found represent a small subset of all the concepts learned by the model during training, and finding a full set of features using our current techniques would be cost-prohibitive (the computation required by our current approach would vastly exceed the compute used to train the model in the first place). Understanding the representations the model uses doesn't tell us how it uses them; even though we have the features, we still need to find the circuits they are involved in. And we need to show that the safety-relevant features we have begun to find can actually be used to improve safety. There's much more to be done.

For full details, please read our paper, "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet".

If you are interested in working with us to help interpret and improve AI models, we have open roles on our team and we’d love for you to apply. We’re looking for Managers, Research Scientists, and Research Engineers.

Policy Memo​

Mapping the Mind of a Large Language Model
 

bnew

News Corp. signs deal with OpenAI to show news in ChatGPT​

The owner of the Wall Street Journal joins the Financial Times and Politico in striking deals with the AI company

By Gerrit De Vynck

Updated May 23, 2024 at 2:13 p.m. EDT|Published May 22, 2024 at 5:17 p.m. EDT



News Corp., publisher of the Wall Street Journal, will allow artificial intelligence company OpenAI to show its news content when people ask questions in ChatGPT. (AP Photo/Richard Drew, File)

News Corp., the multinational news publisher controlled by the Murdoch family, announced Wednesday it will allow artificial intelligence company OpenAI to show its news content when people ask questions in ChatGPT, adding to the parade of news organizations signing content deals with the fast-growing AI company.

News Corp. and OpenAI did not share commercial terms of the deal, but the Wall Street Journal, which is owned by News Corp., reported that the deal “could be” worth more than $250 million over five years, which would include cash payments and credits for using OpenAI’s technology. It’s unclear exactly how News Corp.’s content would be presented on ChatGPT, but an OpenAI spokesperson said it would include links to the company’s news sites. A person familiar with the deal said the news content would only show up on OpenAI’s platforms after a delay.


The rise of AI chatbots has shaken the news industry. Chatbots such as ChatGPT and Google’s Gemini were trained on text scraped from the web, including news articles, without payment or permission. The tools also answer people’s questions directly, increasing concerns that people will simply get their information from Big Tech chatbots instead of paying journalists to report and write the news. Some organizations, including the New York Times and the Chicago Tribune, have sued OpenAI for scraping their articles. Other publishers, including Politico parent company Axel Springer, the Associated Press and the Financial Times, have signed deals with OpenAI.

News organizations have been buffeted by technological changes for years. New technologies, such as social media, have often offered a rush of money and new readers to news organizations. But when the technology changes, that money and readership often falls away. News publishers such as BuzzFeed and Vice News grew rapidly during the rise of social media, but when Facebook owner Meta decided to show less news in its users’ feeds, those companies’ revenue cratered.

Now, journalists are debating how to approach AI. Many are concerned that AI could supplant them, with tech companies such as OpenAI and Google scraping news articles and social media websites to cobble together their own AI-generated news articles. Google recently rolled out AI answers in search to its U.S. users, spurring panic and accusations of unfairness and plagiarism from bloggers and news providers. Unlike OpenAI, Google has not signed deals with news organizations to pay for their content.

The union representing Wall Street Journal and Dow Jones workers said in a statement posted to X that it was “disturbed” that the deal was cut before the news organization had finalized policies about using human-written content for AI, which the union is currently negotiating with the company.

OpenAI, for its part, said the deal would set standards for how AI companies and news organizations should interact.

“We greatly value News Corp’s history as a leader in reporting breaking news around the world, and are excited to enhance our users’ access to its high-quality reporting,” OpenAI CEO Sam Altman said in a statement.

The deal comes as OpenAI faces allegations from actor Scarlett Johansson that the company copied her voice for its “Sky” audio chatbot. The voice has been available since September, but after the company used it in a demo last week, Johansson made a statement saying that Altman had reached out to her twice about working with the company but that she had declined. OpenAI says the voice is not meant to copy her and was trained on recordings made from a different actor.
 

bnew

The new AI disruption tool: Devin(e) or Devil for software engineers?​

While explosive growth in artificial intelligence (AI) is augmenting capacities in several sectors, there are also concerns over how it can affect humans. Firms have invested heavily in AI, leaving economists striving to understand the impact on the labour market and driving fears among the wider public for the future of their jobs. The rapid adoption of AI so far is creating and not destroying jobs, especially for the young and highly-skilled, but could reduce wages, research published last year by the European Central Bank has shown.

After ChatGPT made waves all over the world for its surprising generative AI capacity, a US-based company called Cognition has announced the launch of a new AI tool called Devin, which it claims is the world's first fully autonomous AI software engineer, able to write code from command prompts. It has triggered fears among the software community about its possible impact on tech jobs.

What Devin is and what it does
As per Cognition, Devin is a tireless, skilled teammate, equally ready to build alongside you or independently complete tasks for you to review. With Devin, engineers can focus on more interesting problems and engineering teams can strive for more ambitious goals.

Devin can plan and execute complex engineering tasks requiring thousands of decisions. It can recall relevant context at every step, learn over time, and fix mistakes.

Cognition has equipped Devin with common developer tools including the shell, code editor, and browser within a sandboxed compute environment — everything a human would need to do their work. Devin has the ability to actively collaborate with the user. It reports on its progress in real time, accepts feedback, and works together with the user through design choices as needed. Devin can learn how to use unfamiliar technologies; build and deploy apps end-to-end; autonomously find and fix bugs in codebases; and train and finetune its own AI models.

On the SWE-bench coding benchmark, which asks agents to resolve real-world GitHub issues from open-source projects, Devin correctly resolves 13.86% of the issues end-to-end, far exceeding the previous state-of-the-art of 1.96%. Even when given the exact files to edit, the best previous models can only resolve 4.80% of issues. Cognition also tried giving Devin real jobs on Upwork and says it could do those too.
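To put the quoted resolution rates in proportion, the short calculation below expresses Devin's figure as a multiple of each earlier result. It uses only the three percentages given above. The variable names are illustrative.

```python
# Resolution rates quoted in the article, in percent.
devin_rate = 13.86
previous_best_rate = 1.96        # previous state of the art, end-to-end
previous_assisted_rate = 4.80    # best previous models given the exact files

ratio_vs_previous = devin_rate / previous_best_rate
ratio_vs_assisted = devin_rate / previous_assisted_rate

print(f"vs. previous best: {ratio_vs_previous:.1f}x")
print(f"vs. assisted models: {ratio_vs_assisted:.1f}x")
```

On these numbers Devin resolves roughly seven times as many issues as the earlier unassisted result, and a little under three times as many as models that were told which files to edit.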

At present, Devin AI is in the beta testing phase and available only to select users, by request. You can request access by filling out a form on Cognition's official website.

How will Devin impact software jobs?
Devin's capabilities have raised concerns over its impact on software jobs. Will it prove to be a job-killer, as much of AI is seen to be, or a blessing for techies who will benefit from it? Cognition presents Devin as a smart assistant that makes the job of software engineers easier and thus allows them to focus on higher-level skills.

Software programming was already being affected by generative AI tools like GitHub Copilot, but Cognition’s Devin has taken this to another level, Jaspreet Bindra, MD & founder of The Tech Whisperer, has told TOI. “It has seemingly groundbreaking capabilities in transforming software development. It can handle some development projects independently, from writing code to fixing bugs and executing tasks, therefore mimicking a full-fledged AI worker rather than just a coding assistant,” he said.

Its reported effectiveness in software engineering, Jaspreet says, is notable because it can rapidly learn and utilise new technologies, build applications from scratch, identify and rectify bugs, contribute to production repositories, and autonomously train AI models. “This ability to handle complexity is creating a nervous and excited buzz amongst the fraternity,” he says.

However, Devin is being seen mostly as an assistant rather than a competitor. Abhimanyu Saxena, co-founder of Scaler & InterviewBit, has told TOI that software engineers need to see these tools as enablers and quickly build expertise in using them efficiently rather than seeing them as competitors. “It is most likely to be a developer companion and may also enable a lot of non-technical people to easily build applications,” he says.

Coding, Devin's core capability, is just one part of software development, and that's why it can't replace software engineers. Heena Kothari, senior director of engineering and product development at Exotel, has told TOI that Devin represents a big shift in how software is made, and that software development isn’t just about writing code or testing it anymore. “While coding is important, there’s a lot more to it, like planning how the software will work, making sure it fits with other software, and understanding how it’s used in different ways.”

For large enterprise software, Heena says, coding only comprises 40% of the whole software development process. “The rest involves designing the software, making it work with other software, and understanding how people will use it. That’s why Devin could be really helpful for simpler or medium-complexity software projects. It could let engineers focus on solving bigger problems instead of spending too much time on routine tasks.”

Despite its amazing capabilities, Devin may not pose any threat to techies at present, but the development of generative AI will remain a cause for concern on the jobs front in various sectors, even though AI has so far led to the creation of more jobs. The findings of the European Central Bank research cited earlier in this article contrast with previous technology waves, when computerisation decreased the relative share of employment of medium-skilled workers. In a sample of 16 European countries, the employment share of sectors exposed to AI increased, with low and medium-skill jobs largely unaffected and highly-skilled positions getting the biggest boost, a Research Bulletin published by the ECB said.

However, the research says, these results do not amount to an acquittal. "AI-enabled technologies continue to be developed and adopted. Most of their impact on employment and wages - and therefore on growth and equality - has yet to be seen."


(With inputs from TOI)
 

bnew



About​

Krita plugin which adds selection tools to mask objects with a single click, or by drawing a bounding box.

Krita Segmentation Tools​

Plugin which adds selection tools to mask objects in your image with a single click, or by drawing a bounding box.
 
