bnew

Veteran
Joined
Nov 1, 2015
Messages
51,805
Reputation
7,926
Daps
148,715

1/6
Unveiling Encoder-Free Vision-Language Models

Narrows the performance-compute gap between encoder-based VLMs and decoder-only VLMs

[2406.11832] Unveiling Encoder-Free Vision-Language Models

2/6
This is super cool!

But I can’t find the model weights and inference code

3/6
Actually, Fuyu already did it; we just don't know this paradigm's scaling laws, or how much more data the encoder-free approach needs as a remedy.

4/6
Very reminiscent of the Adept Fuyu models from last year, which are also encoder-free and use image patches directly.

adept/fuyu-8b · Hugging Face

5/6
Not sure whether "Encoder-Free" is the most accurate description. It seems that the patch embedding layer (PEL) & the patch aligning layer (PAL) together are essentially distilling the representation from a vision encoder...
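The Fuyu-style pipeline these posts describe can be sketched in a few lines: cut the image into patches, flatten each patch, and project it with a single linear layer straight into the decoder's embedding space, with no separate vision encoder. Everything below (single-channel images, dummy weights, dimensions) is illustrative, not the paper's actual layers.

```javascript
// Encoder-free patch embedding, in miniature: flatten p×p image patches and
// treat each flattened patch as one "visual token" fed to the decoder.
// The image is a 2D array of scalars; real models use 3 channels and learned weights.
function patchify(image, p) {
  const tokens = [];
  for (let r = 0; r + p <= image.length; r += p) {
    for (let c = 0; c + p <= image[0].length; c += p) {
      const patch = [];
      for (let i = 0; i < p; i++) {
        for (let j = 0; j < p; j++) patch.push(image[r + i][c + j]);
      }
      tokens.push(patch); // one flattened patch = one visual token
    }
  }
  return tokens;
}

// Linear projection into the decoder's embedding dimension (weights are dummies).
function embedPatch(patch, weights) {
  return weights.map(row => row.reduce((s, w, k) => s + w * patch[k], 0));
}
```

A 4×4 image with 2×2 patches yields four tokens of length 4; the projection then maps each to the decoder's hidden size.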

6/6
Only 35M data... they really have plenty of time...


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

bnew


1/6
Autoregressive Image Generation without Vector Quantization

Achieves competitive performance without vector quantization by using diffusion loss function

[2406.11838] Autoregressive Image Generation without Vector Quantization

2/6
To understand Quantization with a simplified explanation, do check this post; 👇
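Since the post that tweet points to isn't embedded here, vector quantization in miniature: snap a continuous vector to the index of its nearest codebook entry, so an autoregressive model can predict discrete token ids. This is exactly the step the paper above removes; the codebook and vectors below are toy values.

```javascript
// Vector quantization: return the index of the nearest codebook entry
// (squared L2 distance). That index is the discrete "token" a standard
// autoregressive image model would predict.
function vectorQuantize(v, codebook) {
  let bestIndex = 0;
  let bestDist = Infinity;
  codebook.forEach((entry, i) => {
    const dist = entry.reduce((sum, c, j) => sum + (c - v[j]) ** 2, 0);
    if (dist < bestDist) {
      bestDist = dist;
      bestIndex = i;
    }
  });
  return bestIndex;
}
```

The paper's point is that this discretization can be dropped entirely by modeling the continuous vector with a diffusion loss instead.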

3/6
I imagined something like this would work, feel vindicated

4/6
FINALLY !

5/6
Diffusion loss??
Man, the papers are coming in too quick, I’m unable to read them all

6/6



 

bnew


1/1
We're sharing an update on the advanced Voice Mode we demoed during our Spring Update, which we remain very excited about:

We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch. For example, we’re improving the model’s ability to detect and refuse certain content. We’re also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses.

As part of our iterative deployment strategy, we'll start the alpha with a small group of users to gather feedback and expand based on what we learn. We are planning for all Plus users to have access in the fall. Exact timelines depend on meeting our high safety and reliability bar. We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.

ChatGPT’s advanced Voice Mode can understand and respond with emotions and non-verbal cues, moving us closer to real-time, natural conversations with AI. Our mission is to bring these new experiences to you thoughtfully.





 

bnew


Apple Spurned Idea of iPhone AI Partnership With Meta Months Ago​


  • Apple has been looking to forge agreements to use AI chatbots
  • Report indicated that Apple and Meta are in discussions




By Mark Gurman

June 24, 2024 at 5:45 PM EDT

Apple Inc. rejected overtures by Meta Platforms Inc. to integrate the social networking company’s AI chatbot into the iPhone months ago, according to people with knowledge of the matter.

The two companies aren’t in discussions about using Meta’s Llama chatbot in an AI partnership and only held brief talks in March, said the people, who asked not to be identified because the situation is private. The dialogue about a partnership didn’t reach any formal stage, and Apple has no active plans to integrate Llama.

The preliminary talks occurred around the time Apple started hashing out deals to use OpenAI’s ChatGPT and Alphabet Inc.’s Gemini in its products. The iPhone maker announced the ChatGPT agreement earlier this month and said it was expecting to offer Gemini in the future.

Read More: Apple Hits Record After Introducing ‘AI for the Rest of Us’

Apple decided not to move forward with formal Meta discussions in part because it doesn’t see that company’s privacy practices as stringent enough, according to the people. Apple has spent years criticizing Meta’s technology, and integrating Llama into the iPhone would have been a stark about-face.

Apple also sees ChatGPT as a superior offering. Google, meanwhile, is already a partner for search in Apple’s Safari web browser, so a future Gemini deal would build on that relationship.

Spokespeople for Apple and Meta declined to comment. The Wall Street Journal reported on Sunday that the two companies were in talks about an AI partnership.

Apple unveiled a suite of artificial intelligence features at its Worldwide Developers Conference on June 10. The new technology — called Apple Intelligence — includes homegrown tools for summarizing notifications, transcribing voice memos and generating custom emoji.

But Apple’s chatbot technology isn’t as advanced as that of rivals, prompting it to seek out partners. The company also believes that customers will want the ability to switch between different chatbots depending on their needs, similar to how they might hop between Google and Microsoft Corp.’s Bing for searches.

Apple continues to talk to AI startup Anthropic about eventually adding that company’s chatbot as an option, the people said. Apple Intelligence will begin rolling out later this year as part of operating systems for the iPhone, iPad and Mac.

The current deal with OpenAI doesn’t involve money swapping hands, but Apple will allow paying ChatGPT customers to access their subscriptions within the iOS operating system. That could generate revenue for OpenAI, a percentage of which could be headed to Apple in the form of App Store commissions.

Read More: Apple to ‘Pay’ OpenAI for ChatGPT Through Distribution, Not Cash

Meta and Apple were on friendlier terms a decade ago, when the iPhone maker was integrating Facebook into iOS. But the companies have become fierce rivals in recent years, competing over AI, home devices and mixed-reality headsets.

— With assistance from Kurt Wagner
 

bnew


New Google AI experiment could let you chat with celebrities, YouTube influencers​

Don't like Gemini's responses? Google may let you create your own chatbot soon.

By Calvin Wankhede

June 25, 2024


TL;DR
  • According to a new report, Google is working on an AI experiment that could let you chat with famous personalities.
  • The project will also allow anyone to build their own chatbots, similar to services like Character.ai.
  • The search giant may partner with YouTube influencers to create brand-specific AI personas.
Google is reportedly working on a new AI project that will let you converse with chatbots modeled after celebrities, YouTube influencers, or even fictional characters. According to The Information, Google plans to let anyone create their own chatbot by “describing its personality and appearance” and then converse with it — purely for entertainment.

This latest AI effort is notably distinct from Gems, which are essentially “customized versions of Gemini”. Put simply, Gems are similar to custom GPTs that can be taught to handle singular tasks like acting as a running coach or coding partner. On the other hand, Google’s upcoming chatbot project will fine-tune the Gemini family of language models to mimic or emulate the response style of specific people.

The search giant’s interest in personalized chatbots might suggest that it’s looking to take on Meta’s Celebrity AI chatbots. The latter already lets you talk to AI recreations of famous personalities like Snoop Dogg. Google’s upcoming product has also drawn comparisons to Character.ai, a chatbot service that offers a diverse range of personas ranging from TV characters to real-life politicians. Character.ai allows you to create your own personas with unique response styles that can be trained via text datasets.

Google’s customizable chatbot endeavor comes courtesy of the Labs team, which pivoted to working on various AI experiments last year. It’s being developed by a team of ten employees and led by long-time Google Doodle designer Ryan Germick.

As for monetization, the report suggests that Google may eventually integrate the project into YouTube rather than launching it as a standalone product. This would let creators build their own AI personas and potentially improve engagement with their audiences. YouTube’s most famous personality, MrBeast, already has an AI-powered chatbot persona on Meta’s platforms. While this approach may not translate to a direct revenue stream, it could bring users back to YouTube more often and give creators better reach.

While a release date has yet to be finalized, the chatbot platform will likely make its way to the Google Labs page for testing first. The company is currently showing off over a dozen experimental tools and projects, with some like the controversial AI Overviews already integrated within mainline Google products.





1/1
Character AI revealed earlier this week in a blog post that they now serve more than 20k inference qps - that's 20% of Google Search request volume. According to The Information, Google is now developing its own celebrity and user made chatbot platform. Planned launch this year.


 

bnew



AI in practice

Jun 25, 2024


ChatGPT's writing style has infiltrated over 10% of scientific abstracts since its launch, study finds​


Matthias Bastian

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.



An analysis of 14 million PubMed abstracts shows that AI text generators have influenced at least 10 percent of scientific abstracts since ChatGPT's introduction. In some fields and countries, the percentage is even higher.

Researchers from the Universities of Tübingen and Northwestern examined linguistic changes in 14 million scientific abstracts between 2010 and 2024. They found that ChatGPT and similar AI text generators led to a sharp increase in certain style words.

The researchers first identified words that appeared significantly more frequently in 2024 compared to previous years. These included many verbs and adjectives typical of ChatGPT's writing style, such as "delve," "intricate," "showcasing," and "underscores."

Based on these markers, the researchers estimate that in 2024, AI text generators influenced at least 10 percent of all PubMed abstracts. In some cases, the impact was even greater than that of words like "Covid," "pandemic," or "Ebola," which were dominant in their time.
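The detection idea behind the study can be sketched in a few lines: measure how much more often a candidate marker word appears in recent abstracts than in a pre-ChatGPT baseline. The corpora below are toy stand-ins, not the study's PubMed data, and the paper's actual estimator is more careful than this.

```javascript
// Excess document frequency of a marker word: the share of recent documents
// containing it minus the share in a baseline corpus. Large positive excess
// for words like "delve" is the signal the researchers measured.
function excessFrequency(word, baselineDocs, recentDocs) {
  const docRate = docs =>
    docs.filter(d => d.toLowerCase().includes(word.toLowerCase())).length / docs.length;
  return docRate(recentDocs) - docRate(baselineDocs);
}
```

Words whose usage was stable across both periods come out near zero, while ChatGPT-flavored words stand out.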


The measured effects of typical ChatGPT phrases even outweighed scientifically relevant terms such as "Covid". | Image: Kobak et al.

The researchers found that for PubMed subgroups in countries such as China and South Korea, around 15 percent of abstracts were created using ChatGPT, compared to just 3 percent in the UK. However, this doesn't necessarily mean that UK authors use ChatGPT less.

In fact, according to the researchers, the actual use of AI text generators is likely to be much higher. Many researchers edit AI-generated text to remove typical marker words. Native speakers may have an advantage here because they're more likely to notice such phrases. This makes it difficult to determine the true proportion of AI-influenced abstracts.

Where it was measurable, the use of AI was particularly high in journals such as Frontiers and MDPI, at about 17 percent, and in IT journals it reached 20 percent. Among papers by Chinese authors in IT journals, the share was highest, at 35 percent.


There were plenty of typical ChatGPT phrases in Asian IT journals. | Image: Kobak et al.


Meta was too early​

AI could assist scientific authors and make articles more readable. According to study author Dmitry Kobak, using generative AI specifically for abstracts isn't necessarily problematic.

However, AI text generators can also invent facts, reinforce biases, and even plagiarize. They could also reduce the diversity and originality of scientific texts.

The researchers call for a reassessment of guidelines for using AI text generators in science.

In this context, it seems almost ironic that Meta's scientific open-source language model "Galactica," published shortly before ChatGPT, faced harsh criticism from parts of the scientific community, forcing Meta to take it offline.

This clearly didn't prevent the introduction of generative AI into scientific writing, but it may have prevented a system optimized for this task.

Summary
  • An analysis of 14 million PubMed abstracts shows that at least 10 percent of scientific abstracts have been influenced by AI text generators since the introduction of ChatGPT. In some disciplines and countries, the percentage is significantly higher.
  • The researchers identified certain marker words that are typical of ChatGPT's writing style, such as "delve" and "intricate". In China and South Korea, about 15 percent of the abstracts examined were written using ChatGPT, compared to only 3 percent in the United Kingdom. However, the number of unreported cases could be high, especially among native speakers.
  • The use of AI can assist scientific authors and make articles more readable. However, there are also concerns about invented facts, the reinforcement of biases, and potential plagiarism risks. The researchers call for a reassessment of guidelines for the use of AI text generators in academia.

Sources

Paper Kobak via X
 

bnew


AI in practice

Jun 25, 2024


Anthropic finally brings some ChatGPT features to Claude​


Matthias Bastian



Anthropic has introduced new collaboration features for Claude, including projects, shared conversations, and artifacts. These additions bring Claude closer to the functionality offered by ChatGPT, which has had similar features for some time.

Claude Pro and Team users can now organize their chats into "projects." Like GPTs, projects store custom data and specific prompts, making them readily available each time the project is launched.

Projects give users a 200,000-token context window, the equivalent of about 500 pages of text. This helps avoid a "cold start" by providing relevant background information. Anthropic believes this will improve Claude's performance on specific tasks.

Users can also give Claude project-specific instructions, such as adopting a more formal tone or answering questions from a particular industry perspective. A redesigned sidebar allows users to pin frequently used chats for easy access.


Projects also support the newly introduced "Artifacts" feature, which displays generated content such as code snippets, text documents, or diagrams in a separate window next to the conversation. For developers, artifacts provide an expanded code window and live previews for frontends. This feature is currently in beta and can be enabled in your account settings.

Claude.ai Team users can now share snapshots of their best conversations with colleagues in a project's activity feed, which is designed to facilitate learning and inspiration among team members.

The company recently launched Claude 3.5 Sonnet, one of the most capable AI models on the market. Opus 3.5, the largest model in the range, is scheduled for release later this year and might take the crown. Anthropic also plans to make Claude more versatile in the coming months by natively integrating popular applications and tools.


Summary
  • Anthropic has introduced the "Projects" feature for Claude.ai Pro and Team users. This allows chats to be organized into projects. Each project has a context window of 200,000 tokens, the equivalent of about 500 book pages.
  • Projects allow users to add internal documents, codebases, and knowledge to improve Claude's performance. Custom instructions can be used to further customize Claude's responses, such as a more formal tone or responses from the perspective of a specific role or industry.
  • Claude Team users can now also share snapshots of their best conversations with team members.

Sources

Anthropic
 

bnew


AI in practice

Jun 23, 2024

Magnific AI's Relight lets you change image lighting and backgrounds on the fly​


Matthias Bastian



Spanish AI startup Magnific AI has launched a new feature called "Relight," which allows users to change the lighting and background of images using AI. The technology could make it easier to create realistic and varied scenes with a main subject.

Magnific AI, which joined Freepik in May, has developed Relight to allow users to modify image lighting and optionally change backgrounds using AI prompts.

Users can control lighting adjustments through text prompts like "change the lighting to sci-fi neon green," by providing a reference image, or by creating a custom light map. A demo of all three prompts is available here.


Image: LysonOber via X


According to Magnific AI co-founder Javi Lopez, Relight works on characters, landscapes, backgrounds, and "any type of image."


Image: Javi Lopez via X

Beta users have shared numerous examples on X, showcasing the technology's potential.


Image: Julie W. Design via X

Lopez acknowledges some current limitations. When images contain multiple people or small faces, unwanted facial changes can occur. He notes this issue is "difficult to fix," but Relight performs well for standard portraits. There are also some inaccuracies in precisely matching new lighting to a scene compared to the original lighting.

The new feature could be particularly useful in commercial photography, allowing products to be easily placed in different environments. While such image manipulation was possible before AI, Relight can significantly speed up the process and make it accessible to non-experts.

Video: Dogan Ural via X

Relight is currently in a short beta test and should be available to all Magnific AI accounts next week. The company, which initially focused on AI-based image upscaling, continues to expand its toolkit with additional AI image features.

Summary


  • The Magnific AI image tool has introduced a new feature called Relight, which can change the lighting in images and realistically place characters in new environments by changing the background at the same time.
  • The new lighting is implemented using a text prompt, a reference image, or a custom lighting map. The tool has particular potential in advertising photography, where products can be moved to different locations with little effort.
  • After a short beta test, Relight will be activated for all users next week. The feature isn't perfect yet, especially when there are multiple people or small faces in the picture.

Sources

Javi Lopez via X
 

bnew








1/11
This is fast. Chrome running Gemini locally on my laptop. 2 lines of code.

2/11
No library or anything, it's a native part of some future version of Chrome

3/11
Does it work offline?

4/11
This is Chrome 128 Canary. You need to sign up for "Built-In AI proposal preview" to enable it

5/11
Seems very light on memory and CPU

6/11
wait why are they putting this into Chrome lol

are they trying to push this as a web standard of sorts or are they just going to keep this for themselves?

7/11
It's a proposal for all browsers

8/11
Query expansion like this could be promising

9/11
This is a great point!

10/11
No API key required? That would be great, I can run tons of instances of Chrome on the server as the back end of my wrapper apps.

11/11
Free, fast and private for everyone









Jun 25, 2024


Get Access to Gemini Nano Locally Using Chrome Canary​

You can access Gemini Nano locally using Chrome Canary. It lets you use cutting-edge AI in your browser.


Gemini Nano is now available in Chrome Canary. While the official release is still coming, you can already run the model locally on your computer today.

What is Gemini Nano?

Gemini Nano is a streamlined version of the larger Gemini models, designed to run locally. It is trained on the same datasets as its predecessors and keeps their multimodal capabilities in a smaller form. Google had promised this for Chrome 126, and it has now appeared in Chrome Canary, hinting that an official release is near.

Benefits of Using Nano Locally


Running Gemini Nano locally offers several benefits.

  • Privacy: Local processing means data doesn't have to leave your device. This provides an extra layer of security and privacy.
  • Speed and Responsiveness: You don't need to send data to a server. So, interactions can be quicker, improving user experience.
  • Accessibility: Developers can add large language model capabilities to applications. Users don't need constant internet access.



What is Chrome Canary?

It's the most experimental version of the Google Chrome web browser, designed primarily for developers and tech enthusiasts who want to test the latest features and APIs before they are widely available. While it offers cutting-edge functionality, it is also more prone to crashes and instability due to its experimental nature.

  • Canary is updated daily with the latest changes, often with minimal or no testing from Google.
  • It typically runs a few versions ahead of the Stable channel.
  • Canary includes all features of normal Chrome, plus experimental functionality.
  • It can run alongside other Chrome versions and is available for Windows, macOS, and Android.



Launching Gemini Nano Locally with Chrome Canary

To get started with Gemini Nano locally using Chrome Canary, follow these steps:

  1. Download and set up Chrome Canary, ensuring the language is set to English (United States).
  2. In the address bar, enter chrome://flags
  3. Set:
    • the 'Enables optimization guide on device' to Enabled BypassPerfRequirement
    • the 'Prompt API for Gemini Nano' to Enabled



Chrome Flags Enabled

  4. Restart Chrome.
  5. Wait for Gemini Nano to download. To check the status, navigate to chrome://components and ensure that the Optimization Guide On Device Model shows version 2024.6.5.2205 or higher. If not, click 'Check for updates'.
  6. You're all set to explore Gemini Nano for chat applications. Although the model is significantly simpler than its larger siblings, having a local LLM for inference is a major stride for web developers.
  7. You can chat with the Chrome AI model here: https://www.localhostai.xyz
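For reference, the "2 lines of code" from the tweet thread above look roughly like this under the proposal preview. The `createTextSession()` / `prompt()` shape is what has been reported for this Canary build and is experimental, so treat the names as assumptions; it's wrapped in a function here so the AI object can be passed in.

```javascript
// Minimal sketch of Chrome's proposed built-in Prompt API (assumed shape:
// ai.createTextSession() then session.prompt(), as seen in Chrome 128 Canary;
// the API is experimental and may change). In the browser, `ai` is window.ai.
async function askLocalGemini(ai, question) {
  const session = await ai.createTextSession(); // runs Gemini Nano on-device
  return session.prompt(question); // resolves with the model's text reply
}
```

In the browser you would call something like `askLocalGemini(window.ai, 'Hello')` from DevTools once the flags above are enabled and the model has downloaded.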


Chat Gemini Nano Locally


Conclusion

Gemini Nano is now available on Chrome Canary, a big step forward for local AI. It processes data on your device, which increases privacy and speeds things up. This also makes advanced technology easier for more people to use. Gemini Nano gives developers and tech fans a new way to try out AI. This helps create a stronger and more efficient local tech community and shows what the future of independent digital projects might look like.
 

bnew




There’s also a paper: [2406.04692] Mixture-of-Agents Enhances Large Language Model Capabilities

Mixture-of-Agents Enhances Large Language Model Capabilities
Recent advances in large language models (LLMs) demonstrate substantial capabilities in natural language understanding and generation tasks. With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. In our approach, we construct a layered MoA architecture wherein each layer comprises multiple LLM agents. Each agent takes all the outputs from agents in the previous layer as auxiliary information in generating its response. MoA models achieve state-of-the-art performance on AlpacaEval 2.0, MT-Bench and FLASK, surpassing GPT-4 Omni. For example, our MoA using only open-source LLMs leads AlpacaEval 2.0 by a substantial gap, achieving a score of 65.1% compared to 57.5% for GPT-4 Omni.
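The layered flow described in the abstract can be sketched as follows; the agents here are placeholder async functions rather than real LLM calls, and the aggregation step is a made-up stand-in for the paper's final synthesizer model.

```javascript
// Mixture-of-Agents, in miniature: each layer's agents receive the prompt
// plus all outputs from the previous layer as auxiliary context; a final
// aggregator synthesizes the last layer's outputs into one answer.
async function mixtureOfAgents(prompt, layers, aggregate) {
  let previous = []; // outputs from the preceding layer (empty for layer 1)
  for (const agents of layers) {
    previous = await Promise.all(agents.map(agent => agent(prompt, previous)));
  }
  return aggregate(prompt, previous);
}
```

In the real system each `agent` would be a call to a different open-source LLM, with the previous layer's responses prepended to its prompt.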



1/1
Mixture of Agents—a framework that leverages the collective strengths of multiple LLMs. Each layer contains multiple agents that refine responses using outputs from the preceding layer.
Together MoA achieves a score of 65.1% on AlpacaEval 2.0.
Together MoA — collective intelligence of open-source models pushing the frontier of LLM capabilities



 