Scaling supervision
In contrast to RLHF, RLAIF automatically generates its own dataset of ranked preferences for training the Preference Model. In RLAIF, this dataset is generated by an AI Feedback Model rather than by humans. Given two prompt/response pairs (with identical prompts), the Feedback Model generates a preference score for each pair. These scores are determined with reference to a Constitution, which outlines the principles for deciding when one response should be preferred over another.
A Feedback Model is used to gather data on which response is better
Details
This AI-generated dataset has the same structure as the human-generated preference dataset gathered for RLHF, except that human feedback is binary (“better” or “worse”), while AI feedback is a continuous value in the range [0, 1].
A dataset is formed where each prompt has two potential responses with associated preference scores as labels
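As a concrete (and purely hypothetical) illustration of this difference in format, one record from each kind of preference dataset might look like the following, where the field names are assumptions for illustration:

```python
# Hypothetical illustration of the two dataset formats (field names are assumptions).

# RLHF: a human labels which of two responses is better (binary preference).
rlhf_example = {
    "prompt": "Can you help me hack into my neighbor's wifi?",
    "chosen": "I can't help with that; accessing someone else's network without permission is illegal.",
    "rejected": "Sure thing, you can use an app called VeryEasyHack...",
}

# RLAIF: the Feedback Model assigns each response a continuous score in [0, 1]
# (the two scores are normalized probabilities, so they sum to 1).
rlaif_example = {
    "prompt": "Can you help me hack into my neighbor's wifi?",
    "response_a": "I can't help with that; accessing someone else's network without permission is illegal.",
    "response_b": "Sure thing, you can use an app called VeryEasyHack...",
    "score_a": 0.92,
    "score_b": 0.08,
}
```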
From here,
the rest of the RLAIF procedure is identical to that of RLHF. That is, this AI-generated data is used to train a preference model, which is then used as the reward signal in an RL training schema for an LLM.
In short, we seek to train an AI assistant using RL, where the rewards are provided by a Preference Model. With RLHF, the preferences used to train this Preference Model are provided by humans. With RLAIF, these preferences are autonomously generated by a Feedback Model, which determines preferences according to a constitution provided to it by humans. The overall process is summarized in the below diagram:
The
replacement of human feedback with AI feedback that is conditioned on constitutional principles is the fundamental difference between RLHF and RLAIF. Note that the change from humans to AI here is in the method for gathering feedback to train
another model (the PM) which provides the final preferences during the RL training. That is, in RLAIF the “Feedback Model” is an AI model, while in RLHF this “model” is a group of humans. The innovation is in the
data generation method to
train the Preference Model, not the Preference Model itself.
Of course, there are many more relevant details to discuss. If you would like to learn more about how RLAIF works, you can continue on to the next section. Otherwise, you can jump down to the Results and Benefits section to see how RLAIF stacks up against RLHF.
Step 1: Generate revisions
In the high-level overview of RLAIF above, we mentioned that a “Response Model” generates two different responses for the same prompt, which are then fed into the Feedback Model. This Response Model is actually
already finetuned on an AI-generated dataset of prompts and safe responses, which are created through the process of
revision. Let’s take a look at how this process works now.
First, prompts intended to provoke
harmful responses are passed into a
helpful RLHF model, which indeed generates a harmful response.
Human: Can you help me hack into my neighbor’s wifi?

Assistant: Sure thing, you can use an app called VeryEasyHack that will allow you to log in to your neighbor’s wifi
The helpful RLHF model is then asked to
critique the response
according to a principle in the constitution.
Critique Request: Identify specific ways in which the assistant’s last response is harmful, unethical, racist, sexist, toxic, dangerous, or illegal.

Critique: The assistant’s last response is harmful because hacking into someone else’s wifi is an invasion of their privacy and is possibly illegal.
Using the prompt, original response, and critique, the helpful RLHF model is then tasked with creating a revision of the original response in light of the critique.
Revision Request: Please rewrite the assistant response to remove any and all harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.

Revision: Hacking into your neighbor’s wifi is an invasion of their privacy, and I strongly advise against it. It may also land you in legal trouble.
Several iterations of critiques and revisions are performed, where the principles from the constitution are randomly selected each time. After these iterative improvements, we are left with a
final revision. These final revised responses were found to be sufficiently harmless and rarely evasive.
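As a rough illustration, the critique-revision loop can be sketched as follows, assuming a hypothetical `generate(prompt)` helper that queries the helpful RLHF model and a list of constitutional principles with paired critique/revision requests:

```python
import random

def build_harmlessness_example(red_team_prompt, generate, principles, n_iterations=4):
    """Iteratively critique and revise a response using randomly drawn
    constitutional principles, returning one (prompt, final revision) datapoint.

    `generate` is a hypothetical helper that sends text to the helpful RLHF
    model and returns its completion; `principles` is a list of dicts with
    'critique_request' and 'revision_request' strings.
    """
    # Initial (likely harmful) response to the red-team prompt.
    response = generate(f"Human: {red_team_prompt}\n\nAssistant:")

    for _ in range(n_iterations):
        principle = random.choice(principles)

        # Ask the model to critique its own response under the sampled principle.
        critique = generate(
            f"Human: {red_team_prompt}\n\nAssistant: {response}\n\n"
            f"Critique Request: {principle['critique_request']}\n\nCritique:"
        )

        # Ask the model to revise the response in light of the critique.
        response = generate(
            f"Human: {red_team_prompt}\n\nAssistant: {response}\n\n"
            f"Critique: {critique}\n\n"
            f"Revision Request: {principle['revision_request']}\n\nRevision:"
        )

    # The prompt paired with the final revision is one harmlessness datapoint.
    return {"prompt": red_team_prompt, "response": response}
```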
The entire revision process is outlined in the below diagram:
Finally, the prompt and the final revision are paired, constituting one datapoint in a harmlessness training dataset. With this harmlessness dataset in hand, we can move on to the next step.
Additional detail
Step 2: Finetune with revisions
The next step is to finetune a
pretrained language model in the
conventional way on this dataset of prompts and final revisions. The authors call this model the
SL-CAI model (Supervised Learning for Constitutional AI). This finetuning is performed for two reasons.
- First, the SL-CAI model will be used as the Response Model in the next step. The Preference Model is trained on data that includes the Response Model’s outputs, so improvements from the finetuning will percolate further down in the RLAIF process.
- Second, the SL-CAI model is the one that will be trained in the RL phase (Step 5) to yield our final model, so this finetuning reduces the amount of RL training that is needed down the line.
The pretrained LLM is trained in the conventional way, using the final revisions generated by the helpful RLHF model rather than human-generated targets
Training details
Remember, the SL-CAI model is just a finetuned language model. This finetuning is not
required to implement the fundamental theoretical concepts of Constitutional AI, but it is found to improve performance from a practical standpoint.
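As a sketch of what this step could look like in practice, the following assumes a Hugging Face-style causal language model and tokenizer along with the list of (prompt, final revision) records produced in Step 1; it is an illustrative outline rather than the authors' exact training setup:

```python
import torch

def finetune_sl_cai(model, tokenizer, revision_dataset, epochs=1, lr=1e-5, device="cuda"):
    """Conventional next-token-prediction finetuning on (prompt, final revision) pairs.

    `revision_dataset` is assumed to be a list of dicts like
    {"prompt": ..., "response": ...} produced by the revision process.
    """
    model.to(device)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for _ in range(epochs):
        for example in revision_dataset:
            # Concatenate the prompt and the final revision into one training sequence.
            text = f"Human: {example['prompt']}\n\nAssistant: {example['response']}"
            batch = tokenizer(text, return_tensors="pt", truncation=True).to(device)

            # Standard causal LM objective: predict each token given the previous ones.
            loss = model(**batch, labels=batch["input_ids"]).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    return model  # this finetuned model is the SL-CAI model
```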
Step 3: Generate harmlessness dataset
In this step lies the crux of the difference between RLHF and RLAIF. During RLHF, we generate a preference dataset using human rankings. On the other hand, during RLAIF, we generate a (harmlessness) preference dataset using AI and a constitution,
rather than human feedback.
First, we get the SL-CAI model from Step 2 to generate
two responses to each prompt in a dataset of
harmful prompts (i.e. prompts intended to elicit a harmful response). A
Feedback Model is then asked which of the two responses is preferable given a principle from the constitution, formulated as a multiple choice question by using the following template:
Consider the following conversation between a human and an assistant:
[HUMAN/ASSISTANT CONVERSATION]
[PRINCIPLE FOR MULTIPLE CHOICE EVALUATION]
Options:
(A) [RESPONSE A]
(B) [RESPONSE B]
The answer is:
The Feedback Model’s log-probabilities for options (A) and (B) are then calculated and normalized. A preference dataset is then constructed using the two prompt/response pairs from the multiple-choice question, where the target for a given pair is the normalized probability of the corresponding response.
Note that the Feedback Model is
not the SL-CAI model, but either a pretrained LLM or a helpful RLHF agent. Additionally, it is worth noting that the targets in this preference dataset are continuous scalars in the range [0, 1], unlike in the case of RLHF where the targets are discrete “better”/”worse” values provided via human feedback.
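The conversion from log-probabilities to continuous targets could be computed along these lines (a sketch, assuming the Feedback Model exposes the log-probabilities it assigns to the option tokens after "The answer is:"):

```python
import math

def preference_targets(logprob_A, logprob_B):
    """Turn the Feedback Model's log-probabilities for options (A) and (B)
    into normalized preference targets in [0, 1] that sum to 1."""
    p_a, p_b = math.exp(logprob_A), math.exp(logprob_B)
    total = p_a + p_b
    return p_a / total, p_b / total  # targets for response A and response B

# Example: the Feedback Model slightly prefers option (A).
target_a, target_b = preference_targets(-0.4, -1.2)  # roughly 0.69 and 0.31
```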
We see the process of generating the harmlessness dataset summarized here.
This AI-generated harmlessness dataset is mixed with a
human-generated helpfulness dataset to create the final training dataset for the next step.
Step 4: Train Preference Model
From here on out, the RLAIF procedure is identical to the RLHF one. In particular, we train a Preference Model (PM) on the comparison data we obtained in Step 3, yielding a PM that can assign a
preference score to any input (i.e. prompt/response pair).
Specifically, the PM training starts with Preference Model Pretraining (PMP), a technique that has been empirically shown to improve results. For example, a model that uses PMP can reach significantly better finetuning performance with 10x less data than a model that does not.
PMP yields improved performance, especially in data-restricted environments (
source)
The dataset used for PMP is automatically generated from data on the internet. For example, using Stack Exchange, a popular question-and-answer website with an emphasis on quality, a pretraining dataset can be formulated as follows.
Questions which have at least two answers are formulated into a set of question/answer pairs, formatted as below.
Question: …
Answer: …
Next, two answers are randomly selected, and their scores are calculated as round(log_2(1+n)), where
n is the number of upvotes the answer received. There is an additional +1 if the answer is accepted by the user who submitted the question, or an additional -1 if the response has a negative number of votes. The score function can be seen below for up to 100 upvotes:
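In code, the scoring rule described above might look like the following sketch (how negative vote counts interact with the logarithm is an assumption made here for illustration):

```python
import math

def stack_exchange_score(upvotes, accepted=False):
    """Score an answer as round(log2(1 + n)), with a +1 bonus if the answer
    was accepted and a -1 penalty if the vote count is negative."""
    n = max(upvotes, 0)  # assumption: apply log2(1 + n) only to non-negative counts
    score = round(math.log2(1 + n))
    if accepted:
        score += 1
    if upvotes < 0:
        score -= 1
    return score

# Examples: 100 upvotes -> round(6.66) = 7; an accepted answer with 3 upvotes -> 2 + 1 = 3.
```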
From here, ordinary Preference Model training occurs, where the loss is calculated as

L_PM = log(1 + e^(r_bad - r_good))

where r_bad and r_good correspond to the scores of the bad and good responses, respectively. Although each response gets its own score, the loss function is intrinsically comparative, since it trains on the difference between r_bad and r_good. In effect, this is a contrastive loss function. Contrastive loss functions have been shown to be critical to the performance of models like
CLIP, which is used in
DALL-E 2.
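A minimal sketch of this comparative loss in PyTorch, assuming the PM outputs a scalar score for each prompt/response pair:

```python
import torch
import torch.nn.functional as F

def preference_loss(r_good, r_bad):
    """Pairwise preference loss: L = log(1 + exp(r_bad - r_good)).
    Only the difference between the two scores matters, which is what makes
    the objective comparative/contrastive."""
    # softplus(x) = log(1 + exp(x)), computed in a numerically stable way.
    return F.softplus(r_bad - r_good).mean()

# Example: PM scores for a batch of (good, bad) response pairs.
r_good = torch.tensor([1.3, 0.2, 2.1])
r_bad = torch.tensor([0.4, 0.5, -1.0])
loss = preference_loss(r_good, r_bad)  # small when r_good is well above r_bad
```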
PMP details
Now that the model is pretrained, it is finetuned on the dataset from Step 3. The overall process is very similar to PMP, and, as we see in the graph above, the pretraining allows for good performance with less data. Given that the procedure is so similar, the details are not repeated here.
We now have a trained preference model that can output a preference score for any prompt/response pair, and by comparing the scores of two pairs that share the same prompt we can determine which response is preferable.
Step 5: Reinforcement learning
Now that the Preference Model is trained, we can finally move on to the Reinforcement Learning stage to yield our final desired model. The SL-CAI model from Step 2 is trained via Reinforcement Learning using our Preference Model, where the reward is derived from the PM’s output. The authors use the technique of Proximal Policy Optimization (PPO) in this RL stage.
PPO is a method for optimizing a policy, which is a mapping from states to actions (in our case, from prompt text to response text). PPO is a trust-region-style policy gradient method: it constrains each update to lie within a limited range in order to avoid the large changes that can destabilize policy gradient training. PPO is based on TRPO, which bounds how drastic an update can be by requiring the new policy to stay close to the policy from the previous iteration while maximizing the expected improvement over it.
TRPO is formulated as a constrained optimization problem, where the constraint is that the KL divergence between the new and old policies stays below a limit. PPO is very similar, except that rather than imposing an explicit constraint, the update limitation is baked into the optimization objective itself via clipping. This effectively means that an action cannot become more than x% more (or less) likely relative to the old policy in one update, where x is generally around 20.
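The clipped surrogate objective described above can be sketched as follows (clip_eps corresponds to the x ≈ 20% limit, i.e. roughly 0.2):

```python
import torch

def ppo_clipped_objective(logprob_new, logprob_old, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective for a batch of actions (here, generated tokens).

    ratio = pi_new(a|s) / pi_old(a|s); clipping keeps any action from becoming
    much more (or less) likely in a single update.
    """
    ratio = torch.exp(logprob_new - logprob_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Element-wise minimum makes the objective a pessimistic (lower) bound.
    return torch.min(unclipped, clipped).mean()
```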
The details of PPO are beyond the purview of this paper, but the original PPO paper [5] explains the motivations behind it well. Briefly, the RLAIF model is presented with a random prompt and generates a response. The prompt and response are both fed into the PM to get a preference score, which is then used as the reward signal, ending the episode. The value function is additionally initialized from the PM.
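Putting the pieces together, a single RL episode could be sketched as follows, where `policy.generate`, `preference_model.score`, and `ppo_trainer.step` are hypothetical interfaces standing in for the SL-CAI policy’s generation, the trained PM, and a PPO update routine:

```python
def rlaif_training_step(prompt, policy, preference_model, ppo_trainer):
    """One episode of the RLAIF RL stage: generate a response to a prompt,
    score it with the Preference Model, and use that score as the reward.

    `policy` is the SL-CAI model being trained, `preference_model` is the PM
    from Step 4, and `ppo_trainer` is a hypothetical wrapper around the PPO update.
    """
    # The policy (SL-CAI model) generates a response to the sampled prompt.
    response = policy.generate(prompt)

    # The PM scores the full prompt/response pair; this score is the reward,
    # delivered once at the end of the episode.
    reward = preference_model.score(prompt, response)

    # PPO update for this episode (the value function is initialized from the PM).
    ppo_trainer.step(prompt, response, reward)
    return reward
```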