bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

LLama 2 13B vs Mistral 7B LLM models compared​

10:01 am October 12, 2023 By Julian Horsey

LLama 2 13B vs Mistral 7B LLM models compared


If you are interested in learning more about how large language models compare you may be interested in this comparison between LLama 2 13B vs Mistral 7B revealing the differences between the different AI models. Both models are powerful and adaptable, but they each have their unique strengths and features. This article will provide a comprehensive comparison of these two models, focusing on their performance, architecture, and intended use cases.

Mistral 7B, a 7.3 billion parameter model, has been making a name for itself due to its impressive performance on various benchmarks. It outperforms Llama 2 13B on all benchmarks and even surpasses Llama 1 34B on many. It also approaches the performance of CodeLlama 7B on code, while maintaining proficiency in English tasks. This model uses Grouped-query attention (GQA) for faster inference and Sliding Window Attention (SWA) to handle longer sequences at a smaller cost.

One of the key advantages of Mistral 7B is its adaptability. It can be deployed on any cloud, including AWS, GCP, and Azure, using the vLLM inference server and skypilot. It can also be used locally with the reference implementation provided by the developers. Furthermore, Mistral 7B is easy to fine-tune on any task. As a demonstration, the developers have provided a model fine-tuned for chat, which outperforms Llama 2 13B chat.

Llama 2 13B vs Mistral 7B​

Watch this video on YouTube.


Other articles you may find of interest on the subject of Llama 2

Mistral 7B’s performance on a wide range of benchmarks is impressive. It significantly outperforms Llama 2 13B on all metrics and is on par with Llama 34B. It also excels in code and reasoning benchmarks. The model uses a sliding window attention (SWA) mechanism, which allows each layer to attend to the previous 4,096 hidden states. This results in a linear compute cost and a 2x speed improvement for sequence length of 16k with a window of 4k.

On the other hand, Llama 2 13B is part of a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Developed by Meta, the Llama 2 family of large language models (LLMs) are optimized for dialogue use cases. The fine-tuned LLMs, known as Llama-2-Chat, outperform open-source chat models on most benchmarks tested and are on par with popular closed-source models like ChatGPT and PaLM in terms of helpfulness and safety.

Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. It is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. The larger models, such as the 70B, use Grouped-Query Attention (GQA) for improved inference scalability.

Llama 2 is intended for commercial and research use in English. The tuned models are designed for assistant-like chat, whereas the pretrained models can be adapted for a variety of natural language generation tasks.

Both Mistral 7B and Llama 2 13B are powerful models with their unique strengths. Mistral 7B shines in its adaptability and performance on various benchmarks, while Llama 2 13B excels in dialogue use cases and aligns well with human preferences for helpfulness and safety. The choice between the two would largely depend on the specific requirements of the task at hand.

Further articles you may find of interest on the Mistral 7B AI model :

Filed Under: Guides, Top News
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

Using large language models in psychology:

💡 LLMs have the potential to advance psychological measurement, experimentation and practice.

💡 LLM generated on-topic, grammatically correct useless information, but not based on research and psychology construct.

💡A critical task for the field is to curate large, reliable annotated datasets of key psychological constructs while minimizing unwanted biases.

💡Concerns about applying LLMs to psychology:
- Evaluation: computer scientists have tended to evaluate the functionality of features, but psychologists usually want to evaluate the effects of those features on human thought and behaviour.
- Bias: LLM’s censorship guardrails only addresses the symptoms of bias, rather than the underlying bias in the data. The censorship makes it hard for researchers to study unknown biases. It’s a high priority to make censorship algorithms transparent and to develop bias-testing protocols that go beyond the obvious one.

💡Three important needs in the psychology field:
- Invest in keystone datasets that represent populations and psychological constructs of interest, and must be linked to psychologically important outcomes.
- Define a new psychologically way of benchmarking LLMs, which can help facilitate the development of safe and transparent algorithms.
- Shared computing and analysis infrastructure to ensure that the future of LLM-powered research is equitable.

Thanks for the great paper @Diyi_Yang and team!

Using large language models in psychology - Nature Reviews Psychology
NWuzGV1.jpeg
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

XuUvFyZ.jpeg

How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances​

Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad, Jun Wang
Although large language models (LLMs) are impressive in solving various tasks, they can quickly be outdated after deployment. Maintaining their up-to-date status is a pressing concern in the current era. This paper provides a comprehensive review of recent advances in aligning LLMs with the ever-changing world knowledge without re-training from scratch. We categorize research works systemically and provide in-depth comparisons and discussion. We also discuss existing challenges and highlight future directions to facilitate research in this field. We release the paper list at this https URL
Comments:EMNLP 2023 main conference, paper link at this https URL
Subjects:Computation and Language (cs.CL)
Cite as:arXiv:2310.07343 [cs.CL]
(or arXiv:2310.07343v1 [cs.CL] for this version)
[2310.07343] How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
Focus to learn more

Submission history​

From: Zihan Zhang [view email]
[v1] Wed, 11 Oct 2023 09:46:32 UTC (378 KB)

https://arxiv.org/pdf/2310.07343v1.pdf
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

Build a personal AI assistant running on your laptop with LM Studio​

10:14 am October 19, 2023 By Julian Horsey

build a custom personal AI assistant on your laptop


If you are interested in learning more about how you can easily create your very own personal AI assistant running it locally from your laptop or desktop PC. You might be interested in a new program and framework called LM Studio. LM Studio is a lightweight program designed to make it easy to install and use of local language models on personal computers rather than third-party servers. One of the key features of LM Studio is its user-friendly interface making it easy to manage a variety of different AI models depending on your needs all from one interface

Thanks to its minimalist UI and chatbot interface LM Studio has been specifically designed to provide users with an efficient and easy-to-use platform for running language models. This feature is particularly beneficial for users who are new to the world of large language models, as it simplifies the process of running these models locally. Which until a few months ago was quite a tricky undertaking to do but has now been simplified thanks to the likes of LM Studio and other framework such as Ollama and others.

How to run personal AI assistance locally on your laptop​

One of the standout features of LM Studio is the ability for users to start their own inference server with just a few clicks. This feature offers users the ability to play around with their inferences, providing them with a deeper understanding of how these models work. Additionally, LM Studio provides a guide for choosing the right model based on the user’s RAM, further enhancing the user experience.



Watch this video on YouTube.


Other articles we have written that you may find of interest on the subject of large language models :

Benefits of running LLM is locally​

The benefits of running large language models on your laptop or desktop PC locally :
  • Hands-On Experience: Working directly with the model code allows you to understand the architecture, data preprocessing, and other technical aspects in detail.
  • Customization: You have the freedom to tweak parameters, modify the architecture, or even integrate the model with other systems to see how it performs under different conditions.
  • Debugging and Profiling: Running models locally makes it easier to debug issues, profile computational performance, and optimize code. You can get a clear picture of how resources like memory and CPU are utilized.
  • Data Privacy: You can experiment with sensitive or proprietary datasets without sending the data over the network, thus maintaining data privacy.
  • Cost-Efficiency: There’s no need to pay for cloud-based machine time for experimentation, although the upfront hardware cost and electricity can be significant.
  • Offline Availability: Once downloaded and set up, the model can be run without an internet connection, allowing you to work on AI projects anywhere.
  • End-to-End Understanding: Managing the entire pipeline, from data ingestion to model inference, provides a holistic view of AI systems.
  • Skill Development: The experience of setting up, running, and maintaining a large-scale model can be a valuable skill set for both academic and industrial applications.

Another significant feature of LM Studio is its compatibility with any ggml Llama, MPT, and StarCoder model on Hugging Face. This includes models such as Llama 2, Orca, Vicuna, Nous Hermes, WizardCoder, MPT, among others. This wide range of compatibility allows users to explore different models, expanding their knowledge and experience in the field of large language models.

LM Studio also allows users to discover, download, and run local LMS within the application. This feature simplifies the process of finding and using different models, eliminating the need for multiple platforms or programs. Users can search for and download models that are best suited for their computer, enhancing the efficiency and effectiveness of their work.

Ensuring privacy and security is a key focus of LM Studio. The program is 100% private, using an encryption method and providing a clear statement that explains how it uses HTTP requests. This feature provides users with the assurance that their data and information are secure.

User feedback and continuous improvement are key components of LM Studio’s approach. The program has a feedback tab where users can provide constructive feedback and request features. This feature ensures that LM Studio continues to evolve and improve based on user needs and preferences. Furthermore, LM Studio has a Discord where users can get more information, provide feedback, and request features.
LM Studio is a comprehensive platform for experimenting with local and open-source Large Language Models. Its user-friendly interface, wide range of compatibility, and focus on privacy and security make it an ideal choice for users looking to explore the world of large language models. Whether you’re a seasoned professional or a beginner in the field, LM Studio offers a platform that caters to your needs.

Filed Under: Guides, Top News
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

To excel at engineering design, generative AI must learn to innovate, study finds​

AI models that prioritize similarity falter when asked to design something completely new.


Jennifer Chu | MIT News

Publication Date:

October 19, 2023
PRESS INQUIRIES
Hundreds of colorful dots represent 16 types of bikes. There are 16 bike icons that point to various clusters, and a list says they are: “Road, Dirt-Jump, Polo, BMX, MTB, Touring, Track, Cruiser, Commuter, City, Cyclocross, other, Trials, Children’s, Time-trial, Cargo, Hybrid, Gravel, Fat.”

Caption:
MIT engineers trained several AI models on thousands of bicycle frames, sourced from a dataset of full bicycle designs, shown here color-coded by bike style.
Credits:
Credit: Courtesy of the researchers


ChatGPT and other deep generative models are proving to be uncanny mimics. These AI supermodels can churn out poems, finish symphonies, and create new videos and images by automatically learning from millions of examples of previous works. These enormously powerful and versatile tools excel at generating new content that resembles everything they’ve seen before.

But as MIT engineers say in a new study, similarity isn’t enough if you want to truly innovate in engineering tasks.

“Deep generative models (DGMs) are very promising, but also inherently flawed,” says study author Lyle Regenwetter, a mechanical engineering graduate student at MIT. “The objective of these models is to mimic a dataset. But as engineers and designers, we often don’t want to create a design that’s already out there.”

He and his colleagues make the case that if mechanical engineers want help from AI to generate novel ideas and designs, they will have to first refocus those models beyond “statistical similarity.”

“The performance of a lot of these models is explicitly tied to how statistically similar a generated sample is to what the model has already seen,” says co-author Faez Ahmed, assistant professor of mechanical engineering at MIT. “But in design, being different could be important if you want to innovate.”

In their study, Ahmed and Regenwetter reveal the pitfalls of deep generative models when they are tasked with solving engineering design problems. In a case study of bicycle frame design, the team shows that these models end up generating new frames that mimic previous designs but falter on engineering performance and requirements.

When the researchers presented the same bicycle frame problem to DGMs that they specifically designed with engineering-focused objectives, rather than only statistical similarity, these models produced more innovative, higher-performing frames.

The team’s results show that similarity-focused AI models don’t quite translate when applied to engineering problems. But, as the researchers also highlight in their study, with some careful planning of task-appropriate metrics, AI models could be an effective design “co-pilot.”

“This is about how AI can help engineers be better and faster at creating innovative products,” Ahmed says. “To do that, we have to first understand the requirements. This is one step in that direction.”

The team’s new study appeared recently online, and will be in the December print edition of the journal Computer Aided Design. The research is a collaboration between computer scientists at MIT-IBM Watson AI Lab and mechanical engineers in MIT’s DeCoDe Lab. The study’s co-authors include Akash Srivastava and Dan Gutreund at the MIT-IBM Watson AI Lab.

Framing a problem

As Ahmed and Regenwetter write, DGMs are “powerful learners, boasting unparalleled ability” to process huge amounts of data. DGM is a broad term for any machine-learning model that is trained to learn distribution of data and then use that to generate new, statistically similar content. The enormously popular ChatGPT is one type of deep generative model known as a large language model, or LLM, which incorporates natural language processing capabilities into the model to enable the app to generate realistic imagery and speech in response to conversational queries. Other popular models for image generation include DALL-E and Stable Diffusion.

Because of their ability to learn from data and generate realistic samples, DGMs have been increasingly applied in multiple engineering domains. Designers have used deep generative models to draft new aircraft frames, metamaterial designs, and optimal geometries for bridges and cars. But for the most part, the models have mimicked existing designs, without improving the performance on existing designs.

“Designers who are working with DGMs are sort of missing this cherry on top, which is adjusting the model’s training objective to focus on the design requirements,” Regenwetter says. “So, people end up generating designs that are very similar to the dataset.”

In the new study, he outlines the main pitfalls in applying DGMs to engineering tasks, and shows that the fundamental objective of standard DGMs does not take into account specific design requirements. To illustrate this, the team invokes a simple case of bicycle frame design and demonstrates that problems can crop up as early as the initial learning phase. As a model learns from thousands of existing bike frames of various sizes and shapes, it might consider two frames of similar dimensions to have similar performance, when in fact a small disconnect in one frame — too small to register as a significant difference in statistical similarity metrics — makes the frame much weaker than the other, visually similar frame.

Beyond “vanilla”
A bike transforms to various types of bikes, like a road or BMX bike. The bike wheels get larger and smaller, and the frame changes to different styles.

An animation depicting transformations across common bicycle designs.


Credit: Courtesy of the researchers


The researchers carried the bicycle example forward to see what designs a DGM would actually generate after having learned from existing designs. They first tested a conventional “vanilla” generative adversarial network, or GAN — a model that has widely been used in image and text synthesis, and is tuned simply to generate statistically similar content. They trained the model on a dataset of thousands of bicycle frames, including commercially manufactured designs and less conventional, one-off frames designed by hobbyists.

Once the model learned from the data, the researchers asked it to generate hundreds of new bike frames. The model produced realistic designs that resembled existing frames. But none of the designs showed significant improvement in performance, and some were even a bit inferior, with heavier, less structurally sound frames.

The team then carried out the same test with two other DGMs that were specifically designed for engineering tasks. The first model is one that Ahmed previously developed to generate high-performing airfoil designs. He built this model to prioritize statistical similarity as well as functional performance. When applied to the bike frame task, this model generated realistic designs that also were lighter and stronger than existing designs. But it also produced physically “invalid” frames, with components that didn’t quite fit or overlapped in physically impossible ways.

“We saw designs that were significantly better than the dataset, but also designs that were geometrically incompatible because the model wasn’t focused on meeting design constraints,” Regenwetter says.

The last model the team tested was one that Regenwetter built to generate new geometric structures. This model was designed with the same priorities as the previous models, with the added ingredient of design constraints, and prioritizing physically viable frames, for instance, with no disconnections or overlapping bars. This last model produced the highest-performing designs, that were also physically feasible.

“We found that when a model goes beyond statistical similarity, it can come up with designs that are better than the ones that are already out there,” Ahmed says.

“It’s a proof of what AI can do, if it is explicitly trained on a design task.”

For instance, if DGMs can be built with other priorities, such as performance, design constraints, and novelty, Ahmed foresees “numerous engineering fields, such as molecular design and civil infrastructure, would greatly benefit. By shedding light on the potential pitfalls of relying solely on statistical similarity, we hope to inspire new pathways and strategies in generative AI applications outside multimedia.”
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

7JEFEnC.png





DeepMind discovers that AI large language models can optimize their own prompts​

Ben dikkson@BenDee983

September 15, 2023 9:48 AM

A robot works on another robot in a garage.

Credit: VentureBeat made with Midjourney

When people program new deep learning AI models — those that can focus on the right features of data by themselves — the vast majority rely on optimization algorithms, or optimizers, to ensure the models have a high enough rate of accuracy. But one of the most commonly used optimizers — derivative-based optimizers— run into trouble handling real-world applications.

In a new paper, researchers from DeepMind propose a new way: Optimization by PROmpting (OPRO), a method that uses AI large language models (LLM) as optimizers. The unique aspect of this approach is that the optimization task is defined in natural language rather than through formal mathematical definitions.


The researchers write, “Instead of formally defining the optimization problem and deriving the update step with a programmed solver, we describe the optimization problem in natural language, then instruct the LLM to iteratively generate new solutions based on the problem description and the previously found solutions.”

The technique is highly adaptable. By simply modifying the problem description or adding specific instructions, the LLM can be guided to solve a wide array of problems.

The researchers found that, on small-scale optimization problems, LLMs can generate effective solutions through prompting alone, sometimes matching or even surpassing the performance of expert-designed heuristic algorithms. However, the true potential of OPRO lies in its ability to optimize LLM prompts to get maximum accuracy from the models.

How Optimization by PROmpting works​

The process of OPRO begins with a “meta-prompt” as input. This meta-prompt includes a natural language description of the task at hand, along with a few examples of problems, placeholders for prompt instructions, and corresponding solutions.

As the optimization process unfolds, the large language model (LLM) generates candidate solutions. These are based on the problem description and the previous solutions included in the meta-prompt.

OPRO then evaluates these candidate solutions, assigning each a quality score. Optimal solutions and their scores are added to the meta-prompt, enriching the context for the next round of solution generation. This iterative process continues until the model stops proposing better solutions.

jTThVvpQn-F_GaIOyc-XiV5NQBXBJipvjvJgzDDqmXMGCNNCKB0N7-UTG0E2GX0l5pmyyCMbHVrGIj04cp_91DRxGQv1El0FB8oRZqBg9TXg2kw4lVI0fzKZMhLa-d1mVFP-0IhU7bdHWlxKxccfNP0
“The main advantage of LLMs for optimization is their ability of understanding natural language, which allows people to describe their optimization tasks without formal specifications,” the researchers explain.

This means users can specify target metrics such as “accuracy” while also providing other instructions. For instance, they might request the model to generate solutions that are both concise and broadly applicable.

OPRO also capitalizes on LLMs’ ability to detect in-context patterns. This enables the model to identify an optimization trajectory based on the examples included in the meta-prompt. The researchers note, “Including optimization trajectory in the meta-prompt allows the LLM to identify similarities of solutions with high scores, encouraging the LLM to build upon existing good solutions to construct potentially better ones without the need of explicitly defining how the solution should be updated.”

To validate the effectiveness of OPRO, the researchers tested it on two well-known mathematical optimization problems: linear regression and the “traveling salesman problem.” While OPRO might not be the most optimal way to solve these problems, the results were promising.
“On both tasks, we see LLMs properly capture the optimization directions on small-scale problems merely based on the past optimization trajectory provided in the meta-prompt,” the researchers report.

Optimizing LLM prompts with OPRO​

Experiments show that prompt engineering can dramatically affect the output of a model. For instance, appending the phrase “let’s think step by step” to a prompt can coax the model into a semblance of reasoning, causing it to outline the steps required to solve a problem. This can often lead to more accurate results.

However, it’s crucial to remember that this doesn’t imply LLMs possess human-like reasoning abilities. Their responses are highly dependent on the format of the prompt, and semantically similar prompts can yield vastly different results. The DeepMind researchers write, “Optimal prompt formats can be model-specific and task-specific.”

The true potential of Optimization by PROmpting lies in its ability to optimize prompts for LLMs like OpenAI’s ChatGPT and Google’s PaLM. It can guide these models to find the best prompt that maximizes task accuracy.
“OPRO enables the LLM to gradually generate new prompts that improve the task accuracy throughout the optimization process, where the initial prompts have low task accuracies,” they write.

To illustrate this, consider the task of finding the optimal prompt to solve word-math problems. An “optimizer LLM” is provided with a meta-prompt that includes instructions and examples with placeholders for the optimization prompt (e.g., “Let’s think step by step”). The model generates a set of different optimization prompts and passes them on to a “scorer LLM.” This scorer LLM tests them on problem examples and evaluates the results. The best prompts, along with their scores, are added to the beginning of the meta-prompt, and the process is repeated.

The researchers evaluated this technique using several LLMs from the PaLM and GPT families. They found that “all LLMs in our evaluation are able to serve as optimizers, which consistently improve the performance of the generated prompts through iterative optimization until convergence.”

For example, when testing OPRO with PaLM-2 on the GSM8K, a benchmark of grade school math word problems, the model produced intriguing results. It began with the prompt “Let’s solve the problem,” and generated other strings, such as “Let’s think carefully about the problem and solve it together,” “Let’s break it down,” “Let’s calculate our way to the solution,” and finally “Let’s do the math,” which provided the highest accuracy.

In another experiment, the most accurate result was generated when the string “Take a deep breath and work on this problem step-by-step,” was added before the LLM’s answer.

These results are both fascinating and somewhat disconcerting. To a human, all these instructions would carry the same meaning, but they triggered very different behavior in the LLM. This serves as a caution against anthropomorphizing LLMs and highlights how much we still have to learn about their inner workings.

However, the advantage of OPRO is clear. It provides a systematic way to explore the vast space of possible LLM prompts and find the one that works best for a specific type of problem. How it will hold out in real-world applications remains to be seen, but this research can be a step forward toward our understanding of how LLMs work.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

PyTorch ExecuTorch extends open source AI for new quests at the edge​

Sean Michael Kerner@TechJournalist

October 17, 2023 3:54 PM


cfr0z3n_a_virtual_reality_headset_displaying_neon_fire_ab1a2c58-33f2-4f0d-ae81-e17de062897d.png


VentureBeat presents: AI Unleashed - An exclusive executive event for enterprise data leaders. Network and learn with industry peers. Learn More



The open source machine learning (ML) framework PyTorch is moving forward with a new release, as well as a new project for enabling AI inference at the edge and on mobile devices.

The new developments were announced today at the PyTorch Conference, which loosely coincided with the one year anniversary of the formation of the PyTorch Foundation, at the Linux Foundation. As part of the event, technical details on the PyTorch 2.1 update which was released on Oct. 4, were discussed.


Most notable, however, was the announcement of new mobile and edge efforts with PyTorch Edge and the open sourcing of ExecuTorch by Meta Platforms (formerly Facebook). ExecuTorch is technology for deploying AI models for on-device inference, specifically on mobile and edge devices.

Meta has already proven the technology and is using it to power the latest generation of Ray-Ban smart glasses and it’s also part of the recently released Quest 3 VR headset. As part of the open source PyTorch project the goal is to push the technology further enabling what could be a new era of on-device AI inference capabilities.

EVENT​

AI Unleashed

An exclusive invite-only evening of insights and networking, designed for senior enterprise executives overseeing data stacks and strategies.


Learn More

During the opening keynote at PyTorch Conference, Ibrahim Haddad, executive director of the PyTorch Foundation outlined the progress the organization has made over the past year.
“At the Linux Foundation we host over 900 technical projects, PyTorch is one of them,” Haddad said. “There are over 900 examples of how a neutral open home for projects help projects grow and PyTorch is a great example of that.”

The expanding capabilities for inference of PyTorch 2.1​

PyTorch has long been one of the most widely used tools underpinning training of AI, including many of the world’s most popular large language models (LLMs) including GPT models from OpenAI and Meta’s Llama to name a few.

Historically, PyTorch has not been widely used for inference, but that is now changing. In a recent exclusive with VentureBeat, IBM detailed its efforts and contributions into PyTorch 2.1 that help to improve inference for server deployments.

PyTorch 2.1 also provides performance enhancement that should help to improve operations for the torch.compile function that is at the foundation for the technology. The addition of support for automatic dynamic shapes will minimize the need for recompilations due to tensor shape changes, and Meta developers added support to translate NumPy operations into PyTorch to accelerate certain types of numerical calculations that are commonly used for data science.

ExecuTorch is on a quest to change the game for AI inference​

In a keynote session at the PyTorch Conference, Mergen Nachin, Software Engineer at Meta detailed what the new ExecuTorch technology is all about and why it matters.

Nachin said that ExecuTorch is a new end-to-end solution for deploying AI for on-device inference, specifically for mobile and edge devices.

He noted that today’s AI models are extending beyond servers to edge devices such as mobile, AR, VR and AR headsets, wearables, embedded systems and microcontrollers.

ExecuTorch addresses the challenges of restricted edge devices by providing an end-to-end workflow from PyTorch models to deliver optimized native programs.

Nachin explained that ExecuTorch starts with a standard PyTorch module, but coverts it into an exporter graph, and then optimizes it with further transformations and compilations to target specific devices.

A key benefit of ExecuTorch is portability with the ability to run on both mobile and embedded devices. Nachin noted that ExecuTorch can also help to improve developer productivity by using consistent APIs and software development kits across different targets.

ExecuTorch was validated and vetted by actual real-world engineering problems and Meta has already proven the technology with deployment in its Ray-Ban Meta smart glasses.

With the technology now being made available as open source as part of the PyTorch Foundation, Nachin said the goal is to help the industry collaboratively address fragmentation in deploying AI models to the wide array of edge devices. Meta believes ExecuTorch can help more organizations take advantage of on-device AI through its optimized and portable workflow.
“Today we are open sourcing ExecuTorch and it’s still very early, but we’re open sourcing because we want to get feedback from the community and embrace the community,” he said.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

How to create the perfect ChatGPT prompt formula​

12:02 pm October 19, 2023 By Roland Hutchinson

perfect ChatGPT prompt formula


This guide is designed to show you how to create the perfect ChatGPT prompt formula, there are a number of different things that you can do to improve chatGPT responses by improving your prompts.

The art of interacting with ChatGPT, or any other sophisticated language model, lies in mastering the craft of prompt formulation. Crafting the ideal prompt can be the difference between receiving a generic response and obtaining a highly nuanced, tailored answer. In this guide, we’ll delve into the intricacies of creating the perfect ChatGPT prompt formula to maximize your interactions.

Understanding the Underlying Mechanism​

Before we dive into the art of prompt crafting, it’s crucial to have a basic understanding of how ChatGPT operates. ChatGPT is an autoregressive model, meaning it generates responses token by token, based on the information provided in the prompt and its extensive training data. The more explicit and clear your prompt, the better the model can generate a response tailored to your needs.

Begin with a Clear Objective​

The first step in crafting an effective prompt is to have a clear objective in mind. Are you seeking a concise answer, a detailed explanation, or perhaps a creative story? Your desired outcome should shape the structure and content of your prompt. For instance, if you’re looking for a brief summary, you might start your prompt with “In a few sentences, explain…”.

Be Explicit and Specific​

General or ambiguous prompts can lead to generic responses. If you’re looking for information about a niche topic or a specific aspect of a broader subject, be sure to specify that in your prompt. For instance, instead of asking, “Tell me about apples,” you might say, “Describe the nutritional benefits of Granny Smith apples in comparison to Red Delicious.”

Utilize Open-ended Questions​

Open-ended questions can elicit more detailed and comprehensive responses. Instead of asking, “Is X better than Y?”, consider phrasing your query as, “What are the advantages and disadvantages of X compared to Y?”

Guide the Model’s Tone and Style​

You can steer ChatGPT’s response style by setting a tone in your prompt. For instance, if you’re looking for a humorous take on a topic, you might begin with, “In a light-hearted manner, explain…”. Conversely, for a more scholarly tone, you could prompt, “Provide a detailed academic analysis of…”.

Experiment with System Instructions​

System instructions are high-level directives that guide the model’s behavior. For example, you might include an instruction like, “You are a 19th-century historian,” to obtain a response in a specific historical context. These instructions can be a powerful tool to tailor the model’s perspective.

Account for Potential Biases​

While ChatGPT is designed to be as neutral as possible, no model is entirely free from biases. Being aware of potential biases and framing your prompts to mitigate them can lead to more balanced and accurate responses.

Iterative Prompting​

Don’t be afraid to engage in a back-and-forth with the model. If the initial response isn’t quite what you’re looking for, refine your prompt and ask follow-up questions. This iterative process can hone in on the exact information or style you’re seeking.

Use Contextual Information Sparingly​

While it can be tempting to provide extensive background information, remember that ChatGPT operates best with concise, direct prompts. If you find yourself writing a lengthy prompt, consider breaking it up into multiple interactions or refining your question to be more direct.

Stay Updated on Model Iterations​

ChatGPT and similar models are continually evolving. Staying updated on the latest versions and their capabilities can help you craft even more effective prompts over time.

Summary​

Crafting the perfect ChatGPT prompt is both an art and a science. With a clear objective, explicit details, and a touch of creativity, you can maximize the potential of your interactions with this powerful language model. As with any skill, practice makes perfect, so don’t be afraid to experiment and refine your approach over time.

We hope that you find our guide on how to create the perfect ChatGPT prompt formula helpful and informative, if you have any comments, suggestions, or questions, please let us know in the comments section below.

Filed Under: Guides
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,114
Reputation
8,602
Daps
161,775

Eureka! NVIDIA Research Breakthrough Puts New Spin on Robot Learning​

AI agent uses LLMs to automatically generate reward algorithms to train robots to accomplish complex tasks.

October 20, 2023 by ANGIE LEE


A new AI agent developed by NVIDIA Research that can teach robots complex skills has trained a robotic hand to perform rapid pen-spinning tricks — for the first time as well as a human can.

The stunning prestidigitation, showcased in the video above, is one of nearly 30 tasks that robots have learned to expertly accomplish thanks to Eureka, which autonomously writes reward algorithms to train bots.

Eureka has also taught robots to open drawers and cabinets, toss and catch balls, and manipulate scissors, among other tasks.

The Eureka research, published today, includes a paper and the project’s AI algorithms, which developers can experiment with using NVIDIA Isaac Gym, a physics simulation reference application for reinforcement learning research. Isaac Gym is built on NVIDIA Omniverse, a development platform for building 3D tools and applications based on the OpenUSD framework. Eureka itself is powered by the GPT-4 large language model.

“Reinforcement learning has enabled impressive wins over the last decade, yet many challenges still exist, such as reward design, which remains a trial-and-error process,” said Anima Anandkumar, senior director of AI research at NVIDIA and an author of the Eureka paper. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks.”



AI Trains Robots

Eureka-generated reward programs — which enable trial-and-error learning for robots — outperform expert human-written ones on more than 80% of tasks, according to the paper. This leads to an average performance improvement of more than 50% for the bots.

Video Player
00:00
00:01

Robot arm taught by Eureka to open a drawer.

The AI agent taps the GPT-4 LLM and generative AI to write software code that rewards robots for reinforcement learning. It doesn’t require task-specific prompting or predefined reward templates — and readily incorporates human feedback to modify its rewards for results more accurately aligned with a developer’s vision.

Using GPU-accelerated simulation in Isaac Gym, Eureka can quickly evaluate the quality of large batches of reward candidates for more efficient training.

Eureka then constructs a summary of the key stats from the training results and instructs the LLM to improve its generation of reward functions. In this way, the AI is self-improving. It’s taught all kinds of robots — quadruped, bipedal, quadrotor, dexterous hands, cobot arms and others — to accomplish all kinds of tasks.

The research paper provides in-depth evaluations of 20 Eureka-trained tasks, based on open-source dexterity benchmarks that require robotic hands to demonstrate a wide range of complex manipulation skills.

The results from nine Isaac Gym environments are showcased in visualizations generated using NVIDIA Omniverse.

Video Player
00:02
00:04

Humanoid robot learns a running gait via Eureka.
“Eureka is a unique combination of large language models and NVIDIA GPU-accelerated simulation technologies,” said Linxi “Jim” Fan, senior research scientist at NVIDIA, who’s one of the project’s contributors. “We believe that Eureka will enable dexterous robot control and provide a new way to produce physically realistic animations for artists.”

It’s breakthrough work bound to get developers’ minds spinning with possibilities, adding to recent NVIDIA Research advancements like Voyager, an AI agent built with GPT-4 that can autonomously play Minecraft.

NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

Learn more about Eureka and NVIDIA Research.

Categories: Autonomous Machines | Deep Learning | Research
 
Top