The A.I Megathread (LLM , GPT , Development)

bnew · Oct 27, 2023

[Submitted on 25 Oct 2023]

Zephyr: Direct Distillation of LM Alignment

Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf

We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e. they do not respond well to natural prompts. To distill this property, we experiment with the use of preference data from AI Feedback (AIF). Starting from a dataset of outputs ranked by a teacher model, we apply distilled direct preference optimization (dDPO) to learn a chat model with significantly improved intent alignment. The approach requires only a few hours of training without any additional sampling during fine-tuning. The final result, Zephyr-7B, sets the state-of-the-art on chat benchmarks for 7B parameter models, and requires no human annotation. In particular, results on MT-Bench show that Zephyr-7B surpasses Llama2-Chat-70B, the best open-access RLHF-based model. Code, models, data, and tutorials for the system are available at this https URL.

Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2310.16944 [cs.LG]
	(or arXiv:2310.16944v1 [cs.LG] for this version)

Submission history

From: Alexander M. Rush
[v1] Wed, 25 Oct 2023 19:25:16 UTC (3,722 KB)

https://arxiv.org/pdf/2310.16944.pdf

HuggingFaceH4/zephyr-7b-beta · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Model Card for Zephyr 7B β

Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful. However, this means that the model is likely to generate problematic text when prompted to do so and should only be used for educational and research purposes. You can find more details in the technical report.

Model description

Model type: A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
Language(s) (NLP): Primarily English
License: MIT
Finetuned from model: mistralai/Mistral-7B-v0.1

Model Sources

Repository: GitHub - huggingface/alignment-handbook: Robust recipes to align language models with human and AI preferences
Demo: Zephyr Chat - a Hugging Face Space by HuggingFaceH4
Chatbot Arena: Evaluate Zephyr 7B against 10+ LLMs in the LMSYS arena: http://arena.lmsys.org

Performance

At the time of release, Zephyr-7B-β is the highest ranked 7B chat model on the MT-Bench and AlpacaEval benchmarks:

Model	Size	Alignment	MT-Bench (score)	AlpacaEval (win rate %)
StableLM-Tuned-α	7B	dSFT	2.75	-
MPT-Chat	7B	dSFT	5.42	-
Xwin-LMv0.1	7B	dPPO	6.19	87.83
Mistral-Instructv0.1	7B	-	6.84	-
Zephyr-7b-α	7B	dDPO	6.88	-
Zephyr-7b-β	7B	dDPO	7.34	90.60
Falcon-Instruct	40B	dSFT	5.17	45.71
Guanaco	65B	SFT	6.41	71.80
Llama2-Chat	70B	RLHF	6.86	92.66
Vicuna v1.3	33B	dSFT	7.12	88.99
WizardLM v1.0	70B	dSFT	7.71	-
Xwin-LM v0.1	70B	dPPO	-	95.57
GPT-3.5-turbo	-	RLHF	7.94	89.37
Claude 2	-	RLHF	8.06	91.36
GPT-4	-	RLHF	8.99	95.28
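
The ranking claim can be checked directly against the table above. A minimal Python sketch, with the rows transcribed by hand from the table; the names `ROWS` and `best` are illustrative, and blank scores are stored as `None`:

```python
# Rows transcribed from the table above:
# (model, size, alignment, MT-Bench score, AlpacaEval win rate %).
# None marks a score the table leaves blank.
ROWS = [
    ("StableLM-Tuned-α", "7B", "dSFT", 2.75, None),
    ("MPT-Chat", "7B", "dSFT", 5.42, None),
    ("Xwin-LMv0.1", "7B", "dPPO", 6.19, 87.83),
    ("Mistral-Instructv0.1", "7B", "-", 6.84, None),
    ("Zephyr-7b-α", "7B", "dDPO", 6.88, None),
    ("Zephyr-7b-β", "7B", "dDPO", 7.34, 90.60),
    ("Falcon-Instruct", "40B", "dSFT", 5.17, 45.71),
    ("Guanaco", "65B", "SFT", 6.41, 71.80),
    ("Llama2-Chat", "70B", "RLHF", 6.86, 92.66),
    ("Vicuna v1.3", "33B", "dSFT", 7.12, 88.99),
    ("WizardLM v1.0", "70B", "dSFT", 7.71, None),
    ("Xwin-LM v0.1", "70B", "dPPO", None, 95.57),
    ("GPT-3.5-turbo", "-", "RLHF", 7.94, 89.37),
    ("Claude 2", "-", "RLHF", 8.06, 91.36),
    ("GPT-4", "-", "RLHF", 8.99, 95.28),
]


def best(rows, size, column):
    """Return the name of the top-scoring model of a given size in one column."""
    scored = [r for r in rows if r[1] == size and r[column] is not None]
    return max(scored, key=lambda r: r[column])[0]


best_7b_mt = best(ROWS, "7B", 3)       # best 7B model on MT-Bench
best_7b_alpaca = best(ROWS, "7B", 4)   # best 7B model on AlpacaEval

# Models of any size that score below Zephyr-7b-β on MT-Bench.
zephyr_mt = 7.34
beaten_on_mt = [r[0] for r in ROWS if r[3] is not None and r[3] < zephyr_mt]
```

Note that `beaten_on_mt` includes Llama2-Chat (70B) and Vicuna v1.3 (33B) but not WizardLM v1.0 (70B), which matches the table.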

In particular, on several categories of MT-Bench, Zephyr-7B-β has strong performance compared to larger open models like Llama2-Chat-70B.

However, on more complex tasks like coding and mathematics, Zephyr-7B-β lags behind proprietary models and more research is needed to close the gap.

Intended uses & limitations

The model was initially fine-tuned on a filtered and preprocessed version of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT. We then further aligned the model with TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts and model completions that are ranked by GPT-4. As a result, the model can be used for chat and you can check out our demo to test its capabilities.

You can find the datasets used for training Zephyr-7B-β here

TheBloke/zephyr-7B-beta-GGUF · Hugging Face

huggingface.co

demo:

Zephyr 7B Beta

The first open source alternative to ChatGPT. 💪

huggingfaceh4-zephyr-chat.hf.space

bnew · Oct 28, 2023

https://archive.ph/8GVMh

Synthetic data will provide the next trillion tokens to fuel our hungry models.

I'm excited to announce MimicGen: massively scaling up data pipeline for robot learning! We multiply high-quality human data in simulation with digital twins.

Using < 200 human demonstrations, MimicGen can autonomously generate > 50,000 training episodes across 18 tasks, multiple simulators, and even in the real-world!

The idea is simple:
1. Humans tele-operate the robot to complete a task. It is extremely high-quality but also very slow and expensive.
2. We create a digital twin of the robot and the scene in high-fidelity, GPU-accelerated simulation.
3. We can now move objects around, replace with new assets, and even change the robot hand - basically augment the training data with procedural generation.
4. Export the successful episodes, and feed that to a neural network! You now have a near-infinite stream of data.
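
The four steps above can be pictured with a toy sketch. This is not MimicGen's code or API: the 1-D "scene", the `augment` function, and all numeric settings are hypothetical, chosen only to show the pattern of perturbing a human demo, replaying it in simulation, and keeping the successes.

```python
import random


def augment(demos, n_per_demo, max_shift, tolerance, seed=0):
    """Toy procedural augmentation: perturb each demo's object position,
    'replay' the demo, and keep only the episodes that still succeed."""
    rng = random.Random(seed)
    episodes = []
    for demo in demos:
        for _ in range(n_per_demo):
            # Step 3: move the object to a new random spot in the simulated scene.
            shift = rng.uniform(-max_shift, max_shift)
            new_object_pos = demo["object_pos"] + shift
            # Adapt the demonstrated grasp by the same shift, plus some
            # simulated execution error.
            grasp_pos = demo["grasp_pos"] + shift + rng.gauss(0.0, 0.01)
            # Step 4: export the episode only if the replayed grasp lands
            # close enough to the object.
            if abs(grasp_pos - new_object_pos) <= tolerance:
                episodes.append(
                    {"object_pos": new_object_pos, "grasp_pos": grasp_pos}
                )
    return episodes


# Two hand-made "human demonstrations" (step 1), multiplied in simulation.
human_demos = [
    {"object_pos": 0.10, "grasp_pos": 0.10},
    {"object_pos": 0.35, "grasp_pos": 0.35},
]
generated = augment(human_demos, n_per_demo=500, max_shift=0.2, tolerance=0.02)
```

The success filter is the important design choice here: failed replays are discarded, so only episodes that actually complete the task reach the training set.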

One of the key reasons that robotics lags far behind other AI fields is the lack of data: you cannot scrape control signals from the internet. They simply don't exist in-the-wild.

MimicGen shows the power of synthetic data and simulation to keep our scaling laws alive. I believe this principle applies beyond robotics. We are quickly exhausting the high-quality, real tokens from the web. Artificial intelligence from artificial data will be the way forward.

We are big fans of the OSS community. As usual, we open-source everything, including the generated dataset!

- Website: mimicgen.github.io/
- Paper: arxiv.org/abs/2310.17596
- Dataset is hosted on HuggingFace (thanks @_akhaliq!!): huggingface.co/datasets/aman…
- Code: github.com/NVlabs/mimicgen_e…

MimicGen is led by @AjayMandlekar, deep dive in the thread:

bnew · Oct 28, 2023

https://archive.ph/gcDYl

https://www.daviddeutsch.org.uk/wp-content/uploads/2019/07/PossibleMinds_Deutsch.pdf

bnew · Oct 28, 2023

https://archive.ph/9inUZ

Aligned conversational AI remains out of reach for most. Leading proprietary chatbots demand immense compute resources and human oversight, placing cutting-edge performance beyond the capabilities of many organizations.

Open-source models struggle to match proprietary assistants in effectively understanding and responding to diverse user needs. This results in confusion, misalignment, and unsatisfied customers.

ZEPHYR-7B breaks down these barriers. Through novel distillation techniques, this 7B parameter chatbot achieves alignment rivaling select 70B industry models designed with extensive human feedback.

ZEPHYR surpasses other open models on benchmarks measuring conversational ability. It requires no costly human labeling, enabling rapid training on modest compute.

While work remains to address biases, safety procedures, and scaling, ZEPHYR-7B represents a major step toward accessible aligned AI. It brings customizable, transparent chatbots with proprietary-grade alignment within reach.

Don't settle for misaligned assistants.

Carlos E. Perez
@IntuitMachine
Oct 27
3 training and alignment methods:

1. Distilled Supervised Fine-Tuning (dSFT): The student model is trained on dialogues generated by a teacher model on a diverse prompt set. This provides a basic conversational ability.

2. AI Feedback (AIF): An ensemble of models generates responses to prompts which are ranked by the teacher. The top response and a random lower-ranked one are saved as a training example. This converts rankings to preferences.

3. Distilled Direct Preference Optimization (dDPO): The student model is directly optimized to rank the teacher-preferred response higher, using the AIF data. This aligns the student model to the teacher's preferences without any sampling.
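
The AIF step in item 2 (keep the top-ranked response plus one random lower-ranked response) can be sketched in a few lines of Python. The function name, the dictionary keys, and the example scores are illustrative assumptions, not taken from the paper's code:

```python
import random


def make_preference_pair(prompt, scored_responses, rng):
    """Turn teacher scores for one prompt into a (chosen, rejected) pair.

    scored_responses is a list of (response_text, teacher_score) tuples.
    """
    # Sort by teacher score, best first.
    ranked = sorted(scored_responses, key=lambda sr: sr[1], reverse=True)
    chosen = ranked[0][0]
    # Pick the rejected response uniformly from the lower-ranked ones.
    rejected = rng.choice(ranked[1:])[0]
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


rng = random.Random(0)
pair = make_preference_pair(
    "Explain DPO in one sentence.",
    [
        ("response A", 6.5),
        ("response B", 9.0),
        ("response C", 4.0),
        ("response D", 7.5),
    ],
    rng,
)
```

Here `pair["chosen"]` is always "response B" (the highest score), while `pair["rejected"]` is one of the other three, depending on the random draw.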

Carlos E. Perez
@IntuitMachine
Oct 27
Here is an overview of the full training process:

1. Start with a pretrained language model (LLM) like Mistral-7B as the base student model.

2. Apply distilled supervised fine-tuning (dSFT) using the UltraChat dataset, which contains dialogues generated by the teacher model GPT-3.5 Turbo. This teaches the student basic conversational abilities.

3. Collect AI feedback (AIF) preferences using the UltraFeedback dataset. Prompts are given to an ensemble of models, then ranked by the teacher GPT-4. The top response is saved as the "winner" and a random lower-ranked one as the "loser".

4. Optimize the dSFT student model using distilled direct preference optimization (dDPO) on the AIF data. This directly maximizes the probability of the "winner" response over the "loser" response for each prompt, aligning the student with the teacher's preferences.

5. The final model is ZEPHYR-7B, which combines the conversational skills of dSFT and the intent alignment of dDPO without any human labeling.
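
Step 4 can be made concrete with the standard DPO loss for a single winner/loser pair. This is a minimal sketch rather than the authors' training code: it assumes the sequence log-probabilities have already been computed, that the frozen reference model is the dSFT model, and that `beta=0.1` is just an illustrative default.

```python
import math


def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (winner, loser) pair, given sequence log-probabilities
    under the policy being trained and under the frozen reference model."""
    # How much more the policy favors each response than the reference does.
    winner_margin = policy_logp_w - ref_logp_w
    loser_margin = policy_logp_l - ref_logp_l
    logits = beta * (winner_margin - loser_margin)
    # The loss is -log(sigmoid(logits)) = log(1 + exp(-logits)).
    # For very negative logits this is approximately -logits, and returning
    # that directly avoids overflow in exp().
    if logits < -30:
        return -logits
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss is log(2).
loss_at_start = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy has shifted probability toward the winner: the loss drops.
loss_after_shift = dpo_loss(-8.0, -12.0, -10.0, -10.0)
```

No sampling from the policy appears anywhere in the loss, which is why the method needs only the fixed preference pairs during fine-tuning.
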
Carlos E. Perez
@IntuitMachine
Oct 27
key results

MT-Bench:
- ZEPHYR-7B obtains a score of 7.34, significantly higher than other 7B models like Mistral-Instruct (6.84), Xwin-LM-7B (6.19), and StableLM-α (2.75).
- This surpasses the 70B RLHF model LLaMa2-Chat at 6.86 and is competitive with GPT-3.5 Turbo at 7.94.
- ZEPHYR-7B is still below GPT-4 at 8.99 and Claude-2 at 8.06.

AlpacaEval:
- ZEPHYR-7B achieves a 90.6% win rate against the baseline.
- This edges out Vicuna-33B at 88.99% and is on par with GPT-3.5 Turbo at 89.37%.
- Slightly lower than Claude-2 at 91.36% and GPT-4 at 95.28%.

OpenLLM Leaderboard:
- On ARC, ZEPHYR-7B reaches 62.03% accuracy compared to 54.52% for Mistral-Instruct.
- On Hellaswag, accuracy is 84.52% vs 75.63% for Mistral-Instruct.
- Similarly 5-10% gains on MMLU and TruthfulQA over other 7B models.

Ablations:
- dDPO gives gains of 0.5-1.0 on MT-Bench and 5-10% on AlpacaEval over dSFT models.
- DPO without dSFT fails completely, showing SFT provides critical starting skills.

In summary, ZEPHYR-7B significantly improves over other 7B chat models and is competitive with some much larger models like LLaMa2-70B. The dDPO approach successfully aligns smaller models.
Oct 27, 2023 · 11:01 AM UTC

Carlos E. Perez
@IntuitMachine
Oct 27
key limitations

- Reliance on GPT-4 judgments - The benchmarks used like MT-Bench and AlpacaEval rely on GPT-4 ratings, which are known to be biased towards models similar to it. So the true evaluation may be inflated.

- Safety and ethics not addressed - The focus is on intent alignment for helpfulness, but safety considerations around potential harmful outputs are not directly addressed. Methods to distill that property are needed.

- Scaling up unclear - It's not certain if the same gains would occur when applying dDPO to larger base models like LLaMa2-70B. The techniques may have more impact on smaller models.

- No human evaluations - There are no human ratings or preferences used, so the real human alignment is unvalidated.

- Limited prompt distribution - The benchmarks may not cover the full diversity of real user prompts.

- No parameter efficient methods - Approaches like LoRA could produce aligned models with fewer parameters but are not explored.

- Overfitting dDPO - Rapid overfitting is observed during dDPO but longer-term impacts are uncertain.

bnew · Oct 28, 2023

ChatGPT and Bing AI might already be obsolete, according to a new study

Meta-learning for Compositionality (MLC) might give AI-powered chatbots a run for their money.

www.windowscentral.com

What you need to know

A new study highlights how scientists are potentially on the verge of a breakthrough.
The new technique, dubbed Meta-learning for Compositionality (MLC), is able to make generalizations about language.
Per benchmarks shared, neural networks could potentially outperform AI-powered chatbots like Bing Chat and ChatGPT, which also leverage neural network capabilities.
When presented with certain tasks, the neural network was able to replicate similar results, whereas the GPT-4 model struggled to accomplish these tasks.
The study claims that the new design is able to understand and use new words in different settings better than ChatGPT.

As companies continue putting more effort into AI to improve the technology, scientists have seemingly made a discovery that might supersede generative AI's capabilities.

Per the report in Nature, scientists refer to the technique as Meta-learning for Compositionality (MLC). They further indicated that it has the capability to make generalizations about language. Moreover, scientists claim that it might be just as good as humans, especially when folding new words and applying them in different settings and contexts, ultimately presenting a life-like experience.

When put to the test and compared to ChatGPT (which leverages neural network technology to understand and generate text based on the user's prompts) the scientists concluded that the technique and humans performed better. This is despite the fact that chatbots like ChatGPT and Bing Chat are able to interact in a human-like manner and serve as AI-powered assistants.

According to Nature's report, there's a huge possibility that the new design could outwit AI-powered chatbots in the long run as it can interact with people more naturally compared to existing systems. Looking back, Microsoft's Bing Chat was spotted hallucinating during the initial days of its launch, though the issue was fixed.

Paul Smolensky, a scientist specializing in language at Johns Hopkins University in Baltimore, Maryland, stated that the technique is a "breakthrough in the ability to train networks to be systematic."

How does a neural network work?

As highlighted above, a neural network is a type of artificial intelligence with the ability to fold new words and use them in different settings like humans. The only difference is that the technology must first undergo vigorous training to master the word and how to use it in different settings.

To determine the capability of the technology, the scientists ran several tests on humans by exposing them to new words and gauging their understanding of how well they were able to use the words in different contexts. They also tested their capability to link the newly learned words with specific colors. As per the benchmark shared, 80% of the people who participated in the exercise excelled and could relate the words with the colors.

The scientists used the same premise to train a neural network. However, they configured it to learn from its own mistakes. The goal was to allow the system to learn from every task it completed rather than using static data. To ensure that the neural network portrayed human-like characteristics, the scientists trained the model to reproduce similar errors to the ones made by those who took a similar test. Ultimately, this allowed the neural network to respond to a fresh batch of questions almost (if not perfectly) like humans.
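
The word-to-color task described above can be pictured with a toy interpreter. Everything in this sketch is hypothetical: the pseudowords, the "twice" function word, and the `interpret` function are made up for illustration and are not taken from the study.

```python
# Hypothetical vocabulary: pseudowords name colors, and "twice" is a
# made-up function word that repeats the previous color.
PRIMITIVES = {"dax": "RED", "wif": "GREEN", "lug": "BLUE"}


def interpret(phrase):
    """Map a phrase of pseudowords to a sequence of colors."""
    output = []
    for word in phrase.split():
        if word == "twice":
            if not output:
                raise ValueError("'twice' needs a preceding word")
            # The function word repeats whatever came just before it.
            output.append(output[-1])
        else:
            output.append(PRIMITIVES[word])
    return output


# Suppose a learner has only ever seen "twice" used with "dax".
seen = interpret("dax twice")
# Systematic generalization means also handling "lug twice" correctly,
# even though that combination was never shown.
novel = interpret("lug twice")
```

A hand-written rule like this generalizes trivially; the point of the study is whether a trained network, or a person, applies the same rule to combinations it has never encountered.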

GPT-4, on the other hand, took quite some time to make sense of the tasks presented to it. Even then, the results were dismal compared to humans and the neural network, where it averaged between 42 and 86 percent, depending on the tasks presented. Put incredibly simply, the issue with GPT and other similar systems is that they simply mimic intensely complex syntax, rather than demonstrate a true understanding of context. This is what leads GPT and similar models down hallucinogenic rabbit holes — humans are more capable of self-correcting anomalies like this, and neural networks may be more capable of doing so as well.

While this potentially proves that a neural network could be the next best thing after generative AI, a lot of testing and studies need to be done to assert this completely. It will be interesting to see how this plays out and how it reshapes systematic generalization.

What does the future hold for ChatGPT and Bing Chat?

There's no doubt about generative AI's power and potential, especially if its vast capabilities are fully explored and put to good use. This is not to say that the technology is not achieving amazing feats already. Recently, a group of researchers proved that it's possible to successfully run a software company using ChatGPT and even generate code in under seven minutes for less than a dollar.

While impressive, generative AI faces its fair share of setbacks, such as the exorbitant cost of keeping it running, not to mention the amount of cooling water and energy it consumes. There have also been reports of OpenAI's AI-powered chatbot, ChatGPT, losing accuracy and its user base declining for three consecutive months. Bing Chat's market share has also stagnated despite Microsoft's heavy investment in the technology.

bnew · Oct 29, 2023

https://archive.ph/1QNRn

https://archive.ph/9r90A

bnew · Oct 29, 2023

Shutterstock will now let you transform real photos using AI

Shutterstock’s AI-powered tools are rolling out.

www.theverge.com

Shutterstock says it will compensate artists if their images are licensed after they’re edited with AI.

By Emma Roth, a news writer who covers the streaming wars, consumer tech, crypto, social media, and much more. Previously, she was a writer and editor at MUO.

Oct 26, 2023, 9:00 AM EDT

Shutterstock will now let you edit its library of images using AI. In an update on Thursday, Shutterstock revealed a set of new AI-powered tools, like Magic Brush, which lets you tweak an image by brushing over an area and “describing what you want to add, replace or erase.”

The AI image editor is still in beta and will also let you generate alternate versions of a stock or AI-generated image as well as expand the background of an image. Additionally, Shutterstock is rolling out a “smart” resizing feature that will automatically change an image’s shape to match your required dimensions, along with an AI-powered background removal tool.

Shutterstock notes that it will compensate artists “if their images are licensed after editing.” However, it adds that “AI-generated or edited content” will not be eligible for licensing on the site “to further ensure the protection of contributor IP and proper compensation of artists.”
“This is an unprecedented offering in the stock photography industry,” Shutterstock CEO Paul Hennessy says in a statement. “Now, creatives have everything they need to craft the perfect content for any project with AI-powered design capabilities that you can use to edit stock images within Shutterstock’s library, presenting infinite possibilities to make stock your own.”

The company also announced that it’s going to update its AI image generator, which it launched in beta in January, with the latest version of OpenAI’s DALL-E text-to-image generator. Shutterstock expanded its partnership with OpenAI in July, allowing DALL-E to train on Shutterstock’s library for six more years. Last year, Shutterstock announced a contributor’s fund to compensate artists whose work is used for training.

In addition to Shutterstock, other companies, like Adobe and Canva, are getting into AI-powered image editing. Last month, Adobe launched its Firefly generative AI tools for those who subscribe to Adobe Creative Cloud, Adobe Express, and Adobe Experience Cloud. As part of its launch, Adobe announced a new annual bonus scheme that it will pay to artists who allow their stock submissions to be used to train the company’s models. Canva similarly rolled out a trove of AI-powered design tools in March that users can use for free.

https://www.theverge.com/2023/10/26/23933120/shutterstock-transform-real-photos-ai

bnew · Oct 29, 2023

https://www.theverge.com/2023/10/26/23932315/google-maps-ai-immersive-view-ev-charging-search

Google Maps is becoming more like Search — thanks to AI

Google Maps is getting an AI makeover, adding new features like Immersive View and making existing features like driving directions easier to follow. It’s also becoming more like Search.

By Andrew J. Hawkins, transportation editor with 10+ years of experience who covers EVs, public transportation, and aviation. His work has appeared in The New York Daily News and City & State.

Oct 26, 2023, 9:00 AM EDT

Google is adding a range of new AI-powered features to Maps, including more immersive navigation, easier-to-follow driving directions, and better organized search results. The end result is an experience that will likely remind many users of Google Search.

Some of this news has already been announced, like Immersive View, which is now coming to more cities. Other aspects are intended to enhance previously available features, like EV charging station availability.

But the biggest takeaway is that Google wants Maps to be more like Search: a place where people can obviously come get directions or find coffee shops and EV chargers but also enter vague queries like “fall foliage,” “latte art,” or “things to do in Tokyo” and get a whole bunch of actually useful hits. Google said it wants people to use Maps to discover new places or experiences, all while under the auspices of its all-powerful algorithm.

“AI has really supercharged the way we map,” said Chris Phillips, vice president and general manager of Geo, the team at Google that includes all of its geospatial location mapping products. “It plays a key role in everything from helping you navigate, [helping you] commute, discover new restaurants, where to go, when to go. These are all really important decisions that people are making all the time.”

With Google locked in a tight competition with Apple, Microsoft, and others around the use of AI, the company is banking on its more familiar and popular products, like Google Maps, to help it maintain a leg up over its rivals. And as Google Search becomes more AI-driven, it’s only natural that these other products follow a similar lead.

The future of Google Maps, according to Phillips, is a product that’s more “visual and immersive” but also one that helps you make “more sustainable choices,” like riding transit or a bike. Google is also expanding its API offerings to developers, cities, and especially automotive companies, so they can tweak and improve Maps for the in-car navigation experience.

One of the ways Google is using AI to make Maps more like Search is to analyze “billions” of user-uploaded photos to help people find random items, like coffee shops that offer lattes with panda faces, said Miriam Daniel, Google Maps team leader. People can type specific questions into Maps, much in the way they do with Search, and get a list of results for nearby businesses or locations that match the query based on a real-time analysis of user photos.
“To give you the inspiration when you need it, we’re going to better organize search results for these broad queries,” Daniel said.

Google is using neural radiance fields, which is a form of generative AI, to sort through billions of images, including aerial photos, street imagery, and indoor snaps to create a 360-degree experience, Daniel said. Traffic information is culled from historical data and filtered through Google’s predictive algorithm to help users figure out the best time to get somewhere with the least amount of traffic.

Google also wants to help answer one of today’s most burning questions: is the EV charging station I’m driving toward actually going to work? Studies have shown that roughly 25 percent of chargers are down or inoperable at any given time. Google Maps will now tell you when a charger was last used to help EV owners figure out whether they’re wasting their time with a nonoperational charger. If the station was used a few hours ago, chances are it’s working. If it’s been a few days or weeks since it was used, you might want to find another charger instead. Google Maps is also adding more EV charging details, such as whether a charger is compatible with your EV and whether it’s fast, medium, or slow.
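
The "last used" rule of thumb above amounts to a simple threshold check. A minimal sketch follows; the function name and the six-hour and two-day cutoffs are assumptions made for illustration, since the article only says "a few hours" and "a few days or weeks":

```python
from datetime import datetime, timedelta


def charger_hint(last_used, now, fresh=timedelta(hours=6), stale=timedelta(days=2)):
    """Guess a charger's status from how long ago it was last used.

    The `fresh` and `stale` cutoffs are assumptions for this sketch.
    """
    idle = now - last_used
    if idle <= fresh:
        return "probably working"
    if idle >= stale:
        return "consider another charger"
    return "uncertain"


now = datetime(2023, 10, 26, 12, 0)
recent = charger_hint(datetime(2023, 10, 26, 9, 0), now)    # used 3 hours ago
old = charger_hint(datetime(2023, 10, 12, 9, 0), now)       # used 2 weeks ago
between = charger_hint(datetime(2023, 10, 25, 12, 0), now)  # used 1 day ago
```

With these cutoffs, `recent` is "probably working", `old` is "consider another charger", and `between` is "uncertain".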

For more EV charging information while you’re driving, Google is offering updated Places APIs to developers to build out these features for cars with navigation systems based on Google Maps. Now, car companies can use the Places API to build out more EV charging information so their customers can see real-time location information, plug type, and charging speeds directly on their vehicle’s infotainment screens.

To improve Google Maps’ stickiness, the company is also rolling out more flashy features, like Immersive View, which was first announced earlier this year. This feature offers users a 3D view of a place to help them see where they’re supposed to go, while also offering other tidbits of information, like local business locations, weather, and traffic. The feature is now available for Android and iOS users in 15 cities, including Amsterdam, Barcelona, Dublin, Florence, Las Vegas, London, Los Angeles, Miami, New York, Paris, San Francisco, San Jose, Seattle, Tokyo, and Venice.

Google is rebranding its augmented reality feature “Search with Live View” to “Lens in Maps,” in which someone taps the Lens icon in the search bar and holds up their camera to find information about the nearest train stations, coffee shops, ATMs, or whatever is nearby. In this way, Google is trying to cut out the step of searching for nearby businesses by letting you use your phone’s camera as an AR tool instead. Lens in Maps is now live in 50 more cities, including Austin, Las Vegas, Rome, São Paulo, and Taipei.

Navigation in Google Maps is getting a makeover: updated colors, more realistic buildings, and improved lane details for tricky highway exits. These improvements are coming to 12 countries, including the US, Canada, France, and Germany. US drivers will also start to see HOV lanes for the first time, a feature that is coming to Android, iOS, and cars with Google built-in. And speed limit information is coming to around 20 new countries.

The question is, do Google Maps users want all of this stuff? Trying to pack too much tech into one product is the hallmark of feature bloat, which can drive some users away. But Google is betting that pretty visuals, more AI-driven search results, and other bells and whistles can help elevate it over other map products, like Apple Maps, which itself is finally starting to eat into Google’s market share.
“The foundation of all the work we do is to build the most comprehensive, fresh, accurate information to represent the real world,” Phillips said. “This is key for us, and we like to talk about the map as being alive.”

bnew · Oct 30, 2023

https://archive.ph/uNWVI

bnew · Oct 30, 2023

CodeFusion: A Pre-trained Diffusion Model for Code Generation

[Submitted on 26 Oct 2023]

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, Gust Verbruggen

Imagine a developer who can only change their last line of code, how often would they have to start writing a function from scratch before it is correct? Auto-regressive models for code generation from natural language have a similar limitation: they do not easily allow reconsidering earlier tokens generated. We introduce CodeFusion, a pre-trained diffusion code generation model that addresses this limitation by iteratively denoising a complete program conditioned on the encoded natural language. We evaluate CodeFusion on the task of natural language to code generation for Bash, Python, and Microsoft Excel conditional formatting (CF) rules. Experiments show that CodeFusion (75M parameters) performs on par with state-of-the-art auto-regressive systems (350M-175B parameters) in top-1 accuracy and outperforms them in top-3 and top-5 accuracy due to its better balance in diversity versus quality.
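
The top-1, top-3, and top-5 figures in the abstract are top-k accuracies: a task counts as solved if a correct program appears among the model's first k suggestions. A minimal sketch, assuming exact string match against a single reference (a simplification; the paper's own matching criterion may differ) and using made-up Bash examples:

```python
def top_k_accuracy(candidates_per_task, references, k):
    """Fraction of tasks whose reference appears among the first k candidates."""
    hits = sum(
        ref in cands[:k] for cands, ref in zip(candidates_per_task, references)
    )
    return hits / len(references)


# Hypothetical ranked suggestions for three natural-language-to-Bash tasks.
candidates = [
    ["ls -la", "ls -l", "ls"],
    ["grep foo f", "grep -r foo .", "find ."],
    ["wc -l f", "cat f", "wc f"],
]
references = ["ls -la", "grep -r foo .", "wc -c f"]

top1 = top_k_accuracy(candidates, references, k=1)
top3 = top_k_accuracy(candidates, references, k=3)
```

In this toy data the second task is only solved by the second suggestion, so a model with more diverse candidates gains at k=3 without any change at k=1, which is the effect the abstract attributes to CodeFusion.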

Comments:	EMNLP 2023, 12 pages
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Programming Languages (cs.PL)
Cite as:	arXiv:2310.17680 [cs.SE]
	(or arXiv:2310.17680v1 [cs.SE] for this version)

Submission history

From: Mukul Singh
[v1] Thu, 26 Oct 2023 11:06:15 UTC (463 KB)

https://arxiv.org/pdf/2310.17680.pdf

AI summary of summary

In simple terms, CodeFusion is a tool that helps developers generate whole programs or functions from natural language instructions. Most code generators write one token at a time and cannot easily go back and change what they have already written. CodeFusion instead uses diffusion: it starts from a noisy version of a complete program and repeatedly "denoises" it, conditioned on the instruction, so earlier parts of the program can still be revised at every step. The model is pre-trained and was evaluated on Bash, Python, and Microsoft Excel conditional formatting rules. Despite having only 75M parameters, it performs on par with much larger auto-regressive systems (350M-175B parameters) in top-1 accuracy and beats them in top-3 and top-5 accuracy, because its suggestions are more diverse while still maintaining good overall quality. So, instead of always generating nearly the same solution, CodeFusion offers multiple possibilities that are all likely to be helpful.

IIVI · Oct 30, 2023

bnew said:

This is insane news right here. Crazy if it's true:

bnew · Oct 30, 2023

IIVI said:
This is insane news right here. Crazy if it's true:

if median/medium human AGI is actually achieved then a lot more than 10% of jobs will be replaced.

bnew · Oct 30, 2023

Biden issues U.S.' first AI executive order, requiring safety assessments, civil rights guidance, research on labor market impact

The executive order builds on voluntary commitments the White House previously secured from leading AI companies.

www.cnbc.com

Published Mon, Oct 30 2023, 5:17 AM EDT

Hayden Field (@HAYDENFIELD)

Lauren Feiner (@LAUREN_FEINER)

KEY POINTS

U.S. President Joe Biden unveiled a new executive order on artificial intelligence.
It’s the U.S. government’s first action of its kind, requiring new safety assessments, equity and civil rights guidance and research on AI’s impact on the labor market.
The order builds on voluntary commitments the White House previously secured from leading AI companies and represents the first major binding government action on the technology.

President Joe Biden speaks as he meets with AI experts and researchers at the Fairmont Hotel in San Francisco, California, June 20, 2023.
Jane Tyska | Medianews Group | Getty Images

President Joe Biden issued a new executive order on artificial intelligence — the U.S. government’s first action of its kind — requiring new safety assessments, equity and civil rights guidance and research on AI’s impact on the labor market.

While law enforcement agencies have warned that they’re ready to apply existing law to abuses of AI and Congress has endeavored to learn more about the technology to craft new laws, the executive order could have a more immediate impact. Like all executive orders, it “has the force of law,” according to a senior administration official who spoke with reporters on a call Sunday.

The White House breaks the key components of the executive order into eight parts:

Creating new safety and security standards for AI, including by requiring some AI companies to share safety test results with the federal government, directing the Commerce Department to create guidance for AI watermarking, and creating a cybersecurity program that can make AI tools that help identify flaws in critical software.
Protecting consumer privacy, including by creating guidelines that agencies can use to evaluate privacy techniques used in AI.
Advancing equity and civil rights by providing guidance to landlords and federal contractors to help avoid AI algorithms furthering discrimination, and creating best practices on the appropriate role of AI in the justice system, including when it’s used in sentencing, risk assessments and crime forecasting.
Protecting consumers overall by directing the Department of Health and Human Services to create a program to evaluate potentially harmful AI-related health-care practices and creating resources on how educators can responsibly use AI tools.
Supporting workers by producing a report on the potential labor market implications of AI and studying the ways the federal government could support workers affected by a disruption to the labor market.
Promoting innovation and competition by expanding grants for AI research in areas such as climate change and modernizing the criteria for highly skilled immigrant workers with key expertise to stay in the U.S.
Working with international partners to implement AI standards around the world.
Developing guidance for federal agencies’ use and procurement of AI and speeding up the government’s hiring of workers skilled in the field.

The order represents “the strongest set of actions any government in the world has ever taken on AI safety, security, and trust,” White House Deputy Chief of Staff Bruce Reed said in a statement.

It builds on voluntary commitments the White House previously secured from leading AI companies and represents the first major binding government action on the technology. It also comes ahead of an AI safety summit hosted by the U.K.

The senior administration official referenced the fact that 15 major American technology companies have agreed to implement voluntary AI safety commitments but said that it “is not enough” and that Monday’s executive order is a step toward concrete regulation for the technology’s development.

“The President, several months ago, directed his team to pull every lever, and that’s what this order does: bringing the power of the federal government to bear in a wide range of areas to manage AI’s risk and harness its benefits,” the official said.

Biden’s executive order requires that large companies share safety test results with the U.S. government before the official release of AI systems. It also prioritizes the National Institute of Standards and Technology’s development of standards for AI “red-teaming,” or stress-testing the defenses and potential problems within systems. The Department of Commerce will develop standards for watermarking AI-generated content.

The order also addresses training data for large AI systems, and it lays out the need to evaluate how agencies collect and use commercially available data, including data purchased from data brokers, especially when that data involves personal identifiers.

The Biden administration is also taking steps to beef up the AI workforce. Beginning Monday, the senior administration official said, workers with AI expertise can find relevant openings in the federal government on AI.gov.

The administration official said Sunday that the “most aggressive” timing for some safety and security aspects of the order involves a 90-day turnaround, and for some other aspects, that time frame could be closer to a year.

Building on earlier AI actions

Monday’s executive order follows a number of steps the White House has taken in recent months to create spaces to discuss the pace of AI development, as well as proposed guidelines.

Since the viral rollout of ChatGPT in November 2022 — which within two months became the fastest-growing consumer application in history, according to a UBS study — the widespread adoption of generative AI has already led to public concerns, legal battles and lawmaker questions. For instance, days after Microsoft folded ChatGPT into its Bing search engine, it was criticized for toxic speech, and popular AI image generators have come under fire for racial bias and propagating stereotypes.

Biden’s executive order directs the Department of Justice, as well as other federal offices, to develop standards for “investigating and prosecuting civil rights violations related to AI,” the administration official said Sunday on the call with reporters.

“The President’s executive order requires that clear guidance must be provided to landlords, federal benefits programs and federal contractors to keep AI algorithms from being used to exacerbate discrimination,” the official added.

In August, the White House challenged thousands of hackers and security researchers to outsmart top generative AI models from the field’s leaders, including OpenAI, Google, Microsoft, Meta and Nvidia. The competition ran as part of Def Con, the world’s largest hacking conference.

“It is accurate to call this the first-ever public assessment of multiple LLMs,” a representative for the White House Office of Science and Technology Policy told CNBC at the time.

The competition followed a July meeting between the White House and seven top AI companies, including Alphabet, Microsoft, OpenAI, Amazon, Anthropic, Inflection and Meta. Each of the companies left the meeting having agreed to a set of voluntary commitments in developing AI, including allowing independent experts to assess tools before public debut, researching societal risks related to AI and allowing third parties to test for system vulnerabilities, such as in the competition at Def Con.

bnew · Oct 30, 2023

Google AI Chief Says There’s a 50% Chance We’ll Hit AGI in Just 5 Years

The co-founder of Google DeepMind has for decades thought that AGI will be achieved by 2028 — and he's holding firm on that bet.

futurism.com

FUTURE FORECAST

10:44 AM by NOOR AL-SIBAI

"I think it's entirely plausible."

More than a decade ago, the co-founder of Google's DeepMind artificial intelligence lab predicted that by 2028, AI will have a half-and-half shot of being about as smart as humans — and now, he's holding firm on that forecast.

In an interview with tech podcaster Dwarkesh Patel, DeepMind co-founder Shane Legg said that he still thinks researchers have a 50-50 chance of achieving artificial general intelligence (AGI) by 2028, a stance he first announced publicly on his blog at the very end of 2011.

It's a notable prediction considering the exponentially growing interest in the space. OpenAI CEO Sam Altman has long advocated for an AGI, a hypothetical agent that is capable of accomplishing intellectual tasks as well as a human, that can be of benefit to all. But whether we'll ever be able to get to that point — let alone agree on one definition of AGI — remains to be seen.

Legg apparently began looking towards his 2028 goalpost all the way back in 2001 after reading "The Age of Spiritual Machines," the groundbreaking 1999 book by fellow Google AI luminary Ray Kurzweil that predicts a future of superhuman AIs.

"There were two really important points in his book that I came to believe as true," he explained. "One is that computational power would grow exponentially for at least a few decades. And that the quantity of data in the world would grow exponentially for a few decades."

Paired with an understanding of the trends of the era, such as the deep learning method of teaching algorithms to "think" and process data the way human brains do, Legg wrote back at the start of the last decade that in the coming ones, AGI could well be achieved — so long as "nothing crazy happens like a nuclear war."

Today, the DeepMind co-founder said that there are caveats to his prediction that the AGI era will be upon us by the end of this decade.

The first, broadly, is that definitions of AGI are reliant on definitions of human intelligence — and that kind of thing is difficult to test precisely because the way we think is complicated.

"You'll never have a complete set of everything that people can do," Legg said — things like developing episodic memory, or the ability to recall complete "episodes" that happened in the past, or even understanding streaming video. But if researchers could assemble a battery of tests for human intelligence and an AI model were to perform well enough against them, he continued, then "you have an AGI."

When Patel asked if there could be a single simple test to see whether an AI system had reached general intelligence, such as beating Minecraft, Legg pushed back.

"There is no one thing that would do it, because I think that's the nature of it," the AGI expert said. "It's about general intelligence. So I'd have to make sure [an AI system] could do lots and lots of different things and it didn't have a gap."

The second biggest caveat, Legg added, was the ability to scale AI training models way, way up — a worthy point given how much energy AI companies are already using to churn out large language models like OpenAI's GPT-4.

"There's a lot of incentive to make a more scalable algorithm to harness all this computing data," Legg explained. "So I thought it would be very likely that we'll start to discover scalable algorithms to do this."

Asked where he thought we stand today on the path to AGI, Legg said that he thinks computational power is where it needs to be to make it happen, and the "first unlocking step" would be to "start training models now with the scale of the data that is beyond what a human can experience in a lifetime" — a feat he believes the AI industry is ready to achieve.

All that said, Legg reiterated his personal stance that he only believes there's a 50 percent chance researchers will achieve AGI before the end of this decade, and Futurism has reached out to DeepMind to see if the Google subsidiary has anything to add to that prognosis.

"I think it's entirely plausible," he said, "but I'm not going to be surprised if it doesn't happen by then."
