bnew


About

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

arxiv.org/pdf/2311.10122.pdf

If you like our project, please give us a star ⭐ on GitHub for the latest updates.


💡 I also have other video-language projects that may interest you ✨.


LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, HongFa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, Li Yuan

📰 News

  • [2023.11.20] 🤗 Demo and code are available now! Welcome to watch 👀 this repository for the latest updates.

😮 Highlights

Video-LLaVA exhibits remarkable interactive capabilities between images and videos, despite the absence of image-video pairs in the dataset.

💡 Simple baseline, learning united visual representation by alignment before projection

  • By binding unified visual representations to the language feature space, we enable an LLM to perform visual reasoning on both images and videos simultaneously (a minimal sketch follows below).

🔥 High performance, complementary learning with video and image

  • Extensive experiments demonstrate the complementarity of the two modalities, showing significant superiority over models designed specifically for either images or videos.
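For a concrete picture of "alignment before projection", here is a minimal sketch, assuming LanguageBind-style encoders that already emit image and video features in one aligned space; the class name, dimensions, and token counts are illustrative, not the repository's actual API:

```python
# Hedged sketch: because the image and video encoders are pre-aligned
# into a unified visual feature space, a single shared projection
# suffices to map both modalities into the LLM's embedding space.
import torch
import torch.nn as nn

class UnifiedVisualProjector(nn.Module):
    def __init__(self, visual_dim=1024, llm_dim=4096):
        super().__init__()
        # one projection shared by images and videos, since their
        # encoders already emit features in a common aligned space
        self.proj = nn.Linear(visual_dim, llm_dim)

    def forward(self, visual_feats):              # (batch, tokens, visual_dim)
        return self.proj(visual_feats)            # (batch, tokens, llm_dim)

proj = UnifiedVisualProjector()
image_feats = torch.randn(1, 256, 1024)           # aligned image encoder output
video_feats = torch.randn(1, 8 * 256, 1024)       # 8 frames of aligned video features
llm_ready = torch.cat([proj(image_feats), proj(video_feats)], dim=1)
print(llm_ready.shape)                            # torch.Size([1, 2304, 4096])
```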
 

Macallik86


Man. Seeing who is on the new board of OpenAI does not give me faith that we will be long for this world as a society.

A former CEO of Salesforce? The board should have an ethical background, not a proven track record of being capitalistic. The foxes are officially in the hen house. I wonder if this is how people felt when Glass-Steagall was repealed.

Edit: fukking aye, just realized that Larry Summers, who played a big part in repealing Glass-Steagall, is one of the other new board members
 

bnew




From the company that brought you AWS certification comes a new ‘AI Ready’ education track to help train aspiring professionals on Amazon’s AI tech
 


bnew



[Submitted on 15 Nov 2023 (v1), last revised 21 Nov 2023 (this version, v2)]

Exponentially Faster Language Modelling

Peter Belcak, Roger Wattenhofer
Language models only really need to use an exponential fraction of their neurons for individual inferences. As proof, we present UltraFastBERT, a BERT variant that uses 0.3% of its neurons during inference while performing on par with similar BERT models. UltraFastBERT selectively engages just 12 out of 4095 neurons for each layer inference. This is achieved by replacing feedforward networks with fast feedforward networks (FFFs). While no truly efficient implementation currently exists to unlock the full acceleration potential of conditional neural execution, we provide high-level CPU code achieving 78x speedup over the optimized baseline feedforward implementation, and a PyTorch implementation delivering 40x speedup over the equivalent batched feedforward inference. We publish our training code, benchmarking setup, and model weights.
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2311.10770 [cs.CL] (or arXiv:2311.10770v2 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2311.10770

Submission history

From: Peter Belcak [view email]
[v1] Wed, 15 Nov 2023 18:42:50 UTC (35 KB)
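To make the conditional-execution idea concrete, here is a minimal sketch of a fast feedforward (FFF) layer as the abstract describes it: neurons arranged in a binary tree, with each inference evaluating only the neurons on one root-to-leaf path, so a depth-11 tree holds 4095 neurons but touches just 12 per pass. This is an illustrative reconstruction under simplified assumptions (ReLU neurons, sign-based routing), not the paper's released code:

```python
# Illustrative sketch of a fast feedforward (FFF) layer: a binary tree
# of neurons where inference walks one root-to-leaf path, so only
# depth + 1 of the n_nodes neurons are evaluated per input.
import numpy as np

class FFFLayer:
    def __init__(self, dim, depth, seed=0):
        rng = np.random.default_rng(seed)
        self.depth = depth
        n_nodes = 2 ** (depth + 1) - 1                  # depth 11 -> 4095 nodes
        self.w_in = rng.standard_normal((n_nodes, dim)) / np.sqrt(dim)
        self.w_out = rng.standard_normal((n_nodes, dim)) / np.sqrt(dim)

    def forward(self, x):
        """Hard (inference-time) pass: touch only one path through the tree."""
        y = np.zeros_like(x)
        node = 0
        for _ in range(self.depth + 1):
            act = self.w_in[node] @ x                   # one neuron's pre-activation
            y += max(act, 0.0) * self.w_out[node]       # simplified ReLU neuron
            node = 2 * node + (1 if act > 0 else 2)     # sign routes to a child
        return y

layer = FFFLayer(dim=768, depth=11)                     # 4095 neurons in the tree
out = layer.forward(np.ones(768))                       # evaluates only 12 of them
print(out.shape)                                        # (768,)
```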



 

bnew


I’m excited to introduce RAGs - create a “ChatGPT over your data” purely with natural language 💬

Set up and use a full RAG pipeline over files/web URLs in three simple steps (a plain-code sketch of the resulting pipeline follows the list):
1️⃣ RAG Builder: Describe your RAG pipeline in natural language (data, parameters like top-k, system prompt), and let the builder agent build it for you.
2️⃣ View Config: View the generated RAG configuration and edit it if you want to update the agent.
3️⃣ RAG Agent: Ask the agent anything and it'll answer the question!
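For reference, here is roughly what the builder agent assembles under the hood, written by hand against the core llama_index API of that era; the data directory, query text, and top-k value are placeholder assumptions, and an OpenAI API key is expected in the environment:

```python
# Hand-rolled sketch of the kind of pipeline the RAGs builder agent
# configures from your description. Assumes OPENAI_API_KEY is set and
# your files live in ./data; similarity_top_k is an example parameter
# of the sort you would describe to the builder in natural language.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()     # ingest files
index = VectorStoreIndex.from_documents(documents)        # embed + index them

# retrieval parameter the builder agent would set for you
query_engine = index.as_query_engine(similarity_top_k=3)

print(query_engine.query("What are the key points in these documents?"))
```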

Of course, this is heavily inspired by GPTs from @OpenAI. The goal here was to build a similar natural language interface, but tailored around RAG - retrieval parameters, prompts, data sources, all through conversation. Have an agent help you set up QA over your data instead of you writing code.

We built the app with @llama_index + @streamlit. It currently supports the following:
- LLMs: Supports our 20+ LLM integrations for the builder agent (OpenAI by default); the generated agent currently supports OpenAI (not too hard to extend).
- Embeddings: Supports OpenAI ada embeddings and @huggingface models.

There are so many things we still want to do. In the meantime, if you have feedback/suggestions, let us know!

Check out the attached video, blog post, and repo.

Blog: https://blog.llamaindex.ai/introduc...hatgpt-experience-over-your-data-2b9d140769b1

Repo: github.com/run-llama/rags
 

bnew


Google’s Bard chatbot can now watch YouTube videos and provide information about them

by FİKRİKADİM, November 22, 2023


Microsoft recently retired the Bing Chat brand name for its generative AI chatbot, opting for the name Copilot. However, Google has maintained the Bard name for its competing chatbot, at least for the time being. In the past week, Google introduced several new features to Bard, including the ability to engage in discussions about specific content from YouTube videos.

This new feature is an update to Google’s Bard Extensions launched in September, enabling Bard to integrate with various Google apps and services, including YouTube, to extract information for conversations. The latest update brings additional functionalities tailored specifically to YouTube.

The official Bard update page states:

We’re taking the first steps in Bard’s ability to understand YouTube videos. For example, if you’re looking for videos on how to make olive oil cake, you can now also ask how many eggs the recipe in the first video requires.

We’ve heard you want deeper engagement with YouTube videos. So we’re expanding the YouTube Extension to understand some video content so you can have a richer conversation with Bard about it.

Alongside these YouTube-related enhancements, Google added other features to Bard last week. These include incorporating new “step-by-step explanations” for solving mathematical equations. Another noteworthy addition is Bard’s capacity to generate charts, either from user-inputted data or tables created during a conversation.

Moreover, Google lowered the minimum age requirement for Bard to 13 years old in certain countries. Notably, the company announced its intention to take action against two cybercriminal groups exploiting the Bard name for fraudulent activities, including tricking individuals into downloading malware disguised as Bard.

 

bnew



[Submitted on 14 Nov 2023]

LLMs cannot find reasoning errors, but can correct them!

Gladys Tyen, Hassan Mansoor, Peter Chen, Tony Mak, Victor Cărbune
While self-correction has shown promise in improving LLM outputs in terms of style and quality (e.g. Chen et al., 2023; Madaan et al., 2023), recent attempts to self-correct logical or reasoning errors often cause correct answers to become incorrect, resulting in worse performances overall (Huang et al., 2023). In this paper, we break down the self-correction process into two core components: mistake finding and output correction. For mistake finding, we release BIG-Bench Mistake, a dataset of logical mistakes in Chain-of-Thought reasoning traces. We provide benchmark numbers for several state-of-the-art LLMs, and demonstrate that LLMs generally struggle with finding logical mistakes. For output correction, we propose a backtracking method which provides large improvements when given information on mistake location. We construe backtracking as a lightweight alternative to reinforcement learning methods, and show that it remains effective with a reward model at 60-70% accuracy.
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2311.08516 [cs.AI] (or arXiv:2311.08516v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2311.08516

Submission history

From: Gladys Tyen [view email]
[v1] Tue, 14 Nov 2023 20:12:38 UTC (7,191 KB)
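To illustrate the backtracking idea from the abstract, here is a minimal sketch. The functions find_mistake (the mistake-finding reward model) and generate_steps (the LLM sampler) are hypothetical stand-ins, not names from the paper: given the location of the first logical error, the trace is kept up to that step and re-sampled from there at nonzero temperature.

```python
# Minimal sketch of backtracking: keep the chain-of-thought steps before
# the first mistake and re-sample from that point. `find_mistake` and
# `generate_steps` are hypothetical stand-ins for a reward model and an
# LLM sampler, not functions from the paper's released code.
def backtrack(question, trace, find_mistake, generate_steps, max_retries=3):
    for _ in range(max_retries):
        loc = find_mistake(question, trace)   # index of first bad step, or None
        if loc is None:
            return trace                      # no mistake found: accept the trace
        prefix = trace[:loc]                  # keep every step before the error
        # re-generate from the mistake onward; temperature > 0 yields a
        # different continuation than the one that went wrong
        trace = prefix + generate_steps(question, prefix, temperature=1.0)
    return trace
```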





partial AI summary:

ANSWER | PHIND V8 MODEL

The paper you're referring to is about Large Language Models (LLMs) and their ability to self-correct their outputs. The authors, Gladys Tyen, Hassan Mansoor, Peter Chen, Tony Mak, and Victor Cărbune, have broken down the self-correction process into two components: mistake finding and output correction.
Here are the key highlights:
  1. Mistake Finding: The authors have released a dataset called BIG-Bench Mistake, which contains logical mistakes in Chain-of-Thought reasoning traces. They found that LLMs generally struggle with finding these logical mistakes.
  2. Output Correction: The authors propose a backtracking method for output correction. This method provides large improvements when given information on the mistake location. They view backtracking as a lightweight alternative to reinforcement learning methods and show that it remains effective with a reward model at 60-70% accuracy.
  3. Benchmark Results: The authors tested the reasoning capabilities of state-of-the-art LLMs using their dataset. They found that LLMs struggle with mistake finding, even for objective, unambiguous cases. This suggests that LLMs' inability to find mistakes is a main contributing factor to their inability to self-correct reasoning errors.
  4. Backtracking as Verbal Reinforcement Learning: The authors view backtracking as a form of "verbal reinforcement learning". They propose that backtracking can be used with a trained classifier as a reward model, and demonstrate the effectiveness of backtracking at various reward model accuracies.
  5. BIG-Bench Mistake Dataset: The authors created the BIG-Bench Mistake dataset, the first of its kind to go beyond problems in mathematics. The dataset consists of 2186 sets of CoT-style traces, each annotated with the location of the first logical error. The authors used this dataset to test the reasoning capabilities of state-of-the-art LLMs.
  6. Annotation: The authors used both human and automatic annotation for the BIG-Bench Mistake dataset. They found that inter-rater reliability for the human-annotated tasks was high, suggesting the dataset is reliable.
  7. Prompting Methods: The authors compared three different prompting methods: direct trace-level prompting, direct step-level prompting, and CoT step-level prompting. They found that accuracy on traces with no mistakes drops considerably from direct trace-level prompting to CoT step-level prompting.
  8. Few-shot Prompting for Mistake Location: The authors investigated whether their prompting methods can reliably determine the correctness of a trace rather than the mistake location. They found that using mistake finding as a strategy for determining whether an answer is correct can be effective.
In summary, the authors have found that LLMs can correct their outputs using a backtracking method, but they struggle with finding mistakes. They propose backtracking as a lightweight alternative to reinforcement learning methods and show that it remains effective with a reward model at 60-70% accuracy. They also provide a new dataset, BIG-Bench Mistake, for testing the reasoning capabilities of LLMs.
 