levitate

I love you, you know.
Joined
Sep 3, 2015
Messages
39,790
Reputation
6,145
Daps
151,525
Reppin
The Multiverse
As a graphic artist who has spent countless hours over more than two decades becoming expert-level good at things like clipping things and people out of backgrounds, I feel a kinda way about this. The way they train the AI is by letting it watch humans do it the old-fashioned way. Wild...


@3Rivers
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
58,206
Reputation
8,623
Daps
161,869


ControlNet is a highly regarded tool for guiding Stable Diffusion models, and it has been widely acknowledged for its effectiveness. This repository presents a simple hack that allows objects to be restored or removed without requiring user prompts. The approach streamlines the workflow considerably.

No-prompt

[Images: restore input and result pairs]
 

bnew


Making the community's best AI chat models available to everyone.
Current Model: OpenAssistant/oasst-sft-6-llama-30b
 

bnew


Natural Language Movie Scene Search Engine


The goal of this project was to develop a natural language video search engine that could effectively search through large quantities of video data without relying on metadata like titles, descriptions, or audio transcriptions. The aim was to enable users to search for specific actions or scenes, such as closing and opening a door, and facilitate the comparison of these scenes across different videos.

The dataset for this particular tool is roughly 30,000 scenes from IMDb's Top 250 movies.

Features

  • Natural Language Search
  • Scene Similarity Search
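
Both features can be read as the same operation: rank scenes by how close they sit to a query vector. Here is a minimal sketch of that idea, assuming every scene has already been turned into an embedding vector by some joint text-video model. The post does not say which model or index the project actually uses, so the scene IDs, the toy vectors, and the function names below are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_scenes(query_vec, scene_index, top_k=3, exclude=None):
    """Return the top_k (scene_id, score) pairs, best match first."""
    scored = [
        (scene_id, cosine_similarity(query_vec, vec))
        for scene_id, vec in scene_index.items()
        if scene_id != exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def similar_scenes(scene_id, scene_index, top_k=3):
    """Scene similarity search: use a known scene's own vector as the query."""
    return rank_scenes(scene_index[scene_id], scene_index, top_k=top_k, exclude=scene_id)


# Hypothetical toy index; real embeddings would have hundreds of dimensions.
SCENE_INDEX = {
    "movie_a/scene_012": [0.9, 0.1, 0.0],
    "movie_b/scene_044": [0.8, 0.2, 0.1],
    "movie_c/scene_003": [0.0, 0.1, 0.9],
}

# Natural language search: the query text would be embedded by the same model.
# The vector is hard-coded here because no model is assumed.
query_vec = [1.0, 0.0, 0.0]
print(rank_scenes(query_vec, SCENE_INDEX, top_k=2))
print(similar_scenes("movie_a/scene_012", SCENE_INDEX, top_k=1))
```

A linear scan like this is fine for a toy index. At around 30,000 scenes it would still work, though a real system would more likely use a vector index.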
 

bnew


Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations

Yu-Hui Chen, Raman Sarokin, Juhyun Lee, Jiuqiang Tang, Chuo-Ling Chang, Andrei Kulik, Matthias Grundmann
The rapid development and application of foundation models have revolutionized the field of artificial intelligence. Large diffusion models have gained significant attention for their ability to generate photorealistic images and support various tasks. On-device deployment of these models provides benefits such as lower server costs, offline functionality, and improved user privacy. However, common large diffusion models have over 1 billion parameters and pose challenges due to restricted computational and memory resources on devices. We present a series of implementation optimizations for large diffusion models that achieve the fastest reported inference latency to-date (under 12 seconds for Stable Diffusion 1.4 without int8 quantization on Samsung S23 Ultra for a 512x512 image with 20 iterations) on GPU-equipped mobile devices. These enhancements broaden the applicability of generative AI and improve the overall user experience across a wide range of devices.
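
The memory pressure the abstract mentions can be made concrete with back-of-the-envelope arithmetic. This is a rough sketch, assuming exactly 1 billion parameters (the abstract says "over 1 billion") and counting only weight storage, not activations or runtime overhead. The function names are illustrative and not from the paper.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


def per_iteration_budget(total_seconds, iterations):
    """Upper bound on average time per denoising iteration."""
    return total_seconds / iterations


NUM_PARAMS = 1_000_000_000  # assumption: a lower bound taken from "over 1 billion"

for name, size in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gib(NUM_PARAMS, size):.2f} GiB")

# Reported figure: under 12 seconds for 20 iterations.
print(f"{per_iteration_budget(12, 20):.2f} s per iteration at most")
```

The per-iteration figure is only an upper bound on the average. It spreads the whole 12 seconds across the 20 iterations and so ignores any time spent outside the denoising loop, such as text encoding and image decoding.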

 

TRUEST

Superstar
Joined
May 17, 2012
Messages
14,538
Reputation
2,746
Daps
55,233
Reppin
NULL
The unfocused invention and proliferation of “AI” technologies by every Tom, Dick, and Harry will inevitably lead to analysis paralysis. You’ll end up in a position where the general public is expected to do the outright ridiculous…like using a tractor trailer as a commuter vehicle to an office job.
 

bnew

TRUEST said:
The unfocused invention and proliferation of “AI” technologies by every Tom, Dick, and Harry will inevitably lead to analysis paralysis. You’ll end up in a position where the general public is expected to do the outright ridiculous…like using a tractor trailer as a commuter vehicle to an office job.

:what:
not sure why you believe it's unfocused when dozens of focused projects sprout up every day and thousands of models are being trained in specific knowledge areas.
 

bnew


About

mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality



Examples

[Figure: Training paradigm and model overview]

News

  • We provide an online demo on ModelScope for the public to try.
  • We released the code of mPLUG-Owl🦉 along with its pre-trained and instruction-tuned checkpoints.

Spotlights

  • A new training paradigm with a modularized design for large multi-modal language models.
  • Learns visual knowledge while supporting multi-turn conversations that span different modalities.
  • Exhibits abilities such as multi-image correlation, scene text understanding, and vision-based document comprehension.
  • Releases OwlEval, a visually related instruction evaluation set.
[Figure: Training paradigm and model overview]

Online Demo

Demo of mPLUG-Owl on ModelScope
 