bnew

Veteran
Joined
Nov 1, 2015
Messages
56,123
Reputation
8,239
Daps
157,818


bnew


TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan
Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still face difficulties with some specialized tasks because they lack enough domain-specific data during pre-training or they often have errors in their neural network computations on those tasks that need accurate executions. On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well. However, due to the different implementation or working mechanisms, they are not easily accessible or compatible with foundation models. Therefore, there is a clear and pressing need for a mechanism that can leverage foundation models to propose task solution outlines and then automatically match some of the sub-tasks in the outlines to the off-the-shelf models and systems with special functionalities to complete them. Inspired by this, we introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion. Unlike most previous work that aimed to improve a single AI model, TaskMatrix.AI focuses more on using existing foundation models (as a brain-like central system) and APIs of other AI models and systems (as sub-task solvers) to achieve diversified tasks in both digital and physical domains. As a position paper, we will present our vision of how to build such an ecosystem, explain each key component, and use study cases to illustrate both the feasibility of this vision and the main challenges we need to address next.

3 Application Scenarios

In this section, we present some examples of how TaskMatrix.AI can be applied in different application scenarios. We show how TaskMatrix.AI can assist in creating AI-powered content in Sections 3.1 and 3.2. We demonstrate how TaskMatrix.AI can facilitate office automation and cloud service usage in Sections 3.3 and 3.4. We illustrate how TaskMatrix.AI can perform tasks in the physical world by interacting with robots and IoT devices in Section 3.5. All these cases have been implemented in practice and will be supported by the online system of TaskMatrix.AI, which will be released soon. We also explore more potential applications in Section 3.6.
3.1 Visual Task Completion
TaskMatrix.AI enables the user to interact with AI by 1) sending and receiving not only language
but also images, 2) providing complex visual questions or visual editing instructions that require
multiple AI models to collaborate over multiple steps, and 3) providing feedback and asking for
corrected results. We design a series of prompts to inject the visual model information into ChatGPT,
considering models with multiple inputs/outputs and models that require visual feedback. More details
are described in Wu et al. (2023). We demonstrate this with an example in Figure 2. The APIs related
to this include:

• Image Editing: Image editing includes removing or replacing objects in an image, or
changing the style of an image. Removing objects from an image involves using image
editing tools or algorithms to get rid of unwanted elements. Replacing objects involves
swapping an element in an image for another one that is more suitable. Finally, changing
an image using text involves using machine learning algorithms to generate an image
based on a textual description.

• Image Question Answering: This refers to the process of using machine learning algorithms
to answer questions about an image, often by analyzing the contents of the image and
providing relevant information. This can be useful in situations where the image contains
important information that needs to be extracted.

• Image Captioning: This refers to the process of using machine learning algorithms to
generate textual descriptions of an image, often by analyzing the contents of the image and
providing relevant information.

• Text-to-Image: This refers to the process of generating an image from a textual description,
often using machine learning algorithms that can generate realistic images based on textual
input.

• Image-to-Sketch/Depth/Hed/Line: This refers to the process of converting an image to a
sketch, depth, HED (holistically-nested edge detection), or line, often using image processing
techniques or computer algorithms.

• Sketch/Depth/Hed/Line-to-Image: This refers to the process of generating an image from
a sketch, depth, HED (holistically-nested edge detection), or line.
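
To make the idea of a central model handing sub-tasks to APIs a bit more concrete, here is a minimal Python sketch. It is not the TaskMatrix.AI implementation: the `Api` class, the registry, the API names and the keyword-based `plan` function are all hypothetical stand-ins. In particular, `plan` replaces the foundation model with a trivial rule so the example runs offline, and the API bodies are stubs that only return strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Api:
    """One entry in a hypothetical API registry (not the paper's schema)."""
    name: str
    description: str
    run: Callable[[str], str]


def build_registry() -> Dict[str, Api]:
    # Stub implementations; a real system would call vision models here.
    apis = [
        Api("image_captioning", "Describe the contents of an image.",
            lambda image: f"caption({image})"),
        Api("image_to_sketch", "Convert an image to a sketch.",
            lambda image: f"sketch({image})"),
        Api("sketch_to_image", "Generate an image from a sketch.",
            lambda sketch: f"image({sketch})"),
    ]
    return {api.name: api for api in apis}


def plan(request: str) -> List[str]:
    """Stand-in for the foundation model: returns an ordered list of API names.

    Assumption: a real planner would prompt a model with the API descriptions.
    This keyword rule exists only so the sketch runs without a model.
    """
    if "redraw" in request.lower():
        return ["image_to_sketch", "sketch_to_image"]
    return ["image_captioning"]


def execute(request: str, image: str, registry: Dict[str, Api]) -> str:
    """Run each planned sub-task, feeding each output into the next API."""
    result = image
    for name in plan(request):
        if name not in registry:
            raise KeyError(f"no API registered for sub-task {name!r}")
        result = registry[name].run(result)
    return result


registry = build_registry()
print(execute("Redraw this photo", "cat.png", registry))
print(execute("What is in this photo?", "cat.png", registry))
```

The first request chains two APIs (image to sketch, then sketch to image), while the second needs only captioning. The point of the sketch is the separation between planning and execution: swapping the keyword rule for a model call would not change `execute`.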
 

bnew


TurboGPT.ai - An improved UI for ChatGPT

TurboGPT is an open-source ChatGPT UI project that enables users to chat with AI-powered open GPT-3 technology. TurboGPT can be used as a standalone chatbot or integrated into a larger project.

code:

live version:

 

bnew


Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality

by the Team with members from UC Berkeley, CMU, Stanford, and UC San Diego

* According to a fun and non-scientific evaluation with GPT-4. Further rigorous evaluation is needed.

We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The training and serving code, along with an online demo, are publicly available for non-commercial use.
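
The post does not spell out how the "90%" figures are aggregated, so the following Python sketch shows only one plausible way to turn per-question judge scores into such numbers. The function names, the ratio-of-totals aggregation, and the example scores are all assumptions made for illustration. None of the numbers are results from the Vicuna evaluation.

```python
from typing import List, Tuple

# Each pair is (candidate_score, baseline_score) for one benchmark question,
# as a judge model might assign them. These values are invented.
ScorePair = Tuple[float, float]


def relative_quality(scores: List[ScorePair]) -> float:
    """Ratio of the candidate's total score to the baseline's total score."""
    candidate_total = sum(c for c, _ in scores)
    baseline_total = sum(b for _, b in scores)
    if baseline_total == 0:
        raise ValueError("baseline total is zero, ratio is undefined")
    return candidate_total / baseline_total


def win_rate(scores: List[ScorePair]) -> float:
    """Fraction of questions where the candidate scores strictly higher."""
    if not scores:
        raise ValueError("no scores given")
    wins = sum(1 for c, b in scores if c > b)
    return wins / len(scores)


example_scores = [(8, 9), (9, 8), (7, 8), (9, 10)]
print(f"relative quality: {relative_quality(example_scores):.1%}")
print(f"win rate: {win_rate(example_scores):.1%}")
```

Note that the two metrics answer different questions: a model can reach a high relative quality against a baseline while winning few head-to-head comparisons, as the invented scores here show.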

How Good is Vicuna?

We present examples of Alpaca and Vicuna responses to our benchmark questions. After fine-tuning Vicuna with 70K user-shared ChatGPT conversations, we discover that Vicuna becomes capable of generating more detailed and well-structured answers compared to Alpaca (see examples below), with the quality on par with ChatGPT.


live version:

:wow:
 