bnew

Veteran
Joined
Nov 1, 2015
Messages
51,805
Reputation
7,926
Daps
148,748



PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°​

Sizhe An, Hongyi Xu, Yichun Shi, Guoxian Song, Umit Y. Ogras, Linjie Luo
University of Wisconsin-Madison, ByteDance Inc.
Paper arXiv Video Code


[GIF: PanoHead overview]


[GIF: PanoHead GAN inversion example]

Abstract​

Synthesis and reconstruction of 3D human heads have gained increasing interest in computer vision and computer graphics recently. Existing state-of-the-art 3D generative adversarial networks (GANs) for 3D human head synthesis are either limited to near-frontal views or struggle to preserve 3D consistency at large view angles. We propose PanoHead, the first 3D-aware generative model that enables high-quality, view-consistent image synthesis of full heads in 360° with diverse appearance and detailed geometry, trained using only in-the-wild unstructured images. At its core, we lift the representational power of recent 3D GANs and bridge the data-alignment gap when training from in-the-wild images with widely distributed views. Specifically, we propose a novel two-stage self-adaptive image alignment for robust 3D GAN training. We further introduce a tri-grid neural volume representation that effectively addresses the front-face and back-head feature entanglement rooted in the widely adopted tri-plane formulation. Our method instills prior knowledge of 2D image segmentation into adversarial learning of 3D neural scene structures, enabling composable head synthesis against diverse backgrounds. Benefiting from these designs, our method significantly outperforms previous 3D GANs, generating high-quality 3D heads with accurate geometry and diverse appearances, even with long wavy and afro hairstyles, renderable from arbitrary poses. Furthermore, we show that our system can reconstruct full 3D heads from single input images for personalized, realistic 3D avatars.
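The tri-plane vs. tri-grid distinction in the abstract can be sketched in a few lines. The following is a toy illustration only, not the authors' implementation: the lookup helpers, nearest-neighbor indexing, and random features are all made up here. The point is that a tri-plane's XY lookup ignores depth entirely, while a tri-grid replicates each plane several times along its orthogonal axis, so front-face and back-head points stop sharing features.

```python
import random

R, D, C = 32, 4, 8   # plane resolution, depth slices, feature channels
random.seed(0)

def to_idx(v, n):
    """Map a coordinate in [-1, 1] to a grid index in [0, n-1]."""
    return min(n - 1, max(0, int((v + 1) / 2 * (n - 1))))

# Tri-plane: one R x R x C feature map per axis pair (only XY shown).
plane_xy = [[[random.random() for _ in range(C)] for _ in range(R)]
            for _ in range(R)]
# Tri-grid: the XY plane replicated D times along z (D x R x R x C).
grid_xy = [[[[random.random() for _ in range(C)] for _ in range(R)]
            for _ in range(R)] for _ in range(D)]

def xy_feature_triplane(x, y, z):
    """Tri-plane XY lookup ignores z entirely: a point on the face and a
    point on the back of the head with the same (x, y) share one feature."""
    return plane_xy[to_idx(x, R)][to_idx(y, R)]

def xy_feature_trigrid(x, y, z):
    """Tri-grid XY lookup first picks one of D depth slices from z, so
    front-face and back-head features are stored separately."""
    return grid_xy[to_idx(z, D)][to_idx(x, R)][to_idx(y, R)]

front, back = (0.1, 0.2, 0.9), (0.1, 0.2, -0.9)
print(xy_feature_triplane(*front) == xy_feature_triplane(*back))  # True: entangled
print(xy_feature_trigrid(*front) == xy_feature_trigrid(*back))    # False: disentangled
```

The same projection-collision argument applies to the XZ and YZ planes; the tri-grid pays a factor-of-D memory cost per plane to break the ambiguity.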
 

bnew



CloneCleaner​

An extension for Automatic1111 that works around Stable Diffusion's "clone problem". It automatically modifies your prompts with random names, nationalities, hair styles, and hair colors to create more variation in generated people.
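The prompt-randomization idea can be sketched as follows. The word pools and the `declone_prompt` helper below are hypothetical stand-ins for illustration only, not CloneCleaner's actual word lists or API:

```python
import random

# Hypothetical word pools -- CloneCleaner ships its own curated lists;
# these short stand-ins are for illustration only.
NAMES = ["Maria", "Kenji", "Amara", "Lukas", "Priya"]
NATIONALITIES = ["Brazilian", "Japanese", "Nigerian", "German", "Indian"]
HAIR_STYLES = ["curly", "braided", "short cropped", "wavy"]
HAIR_COLORS = ["black", "auburn", "blonde", "gray"]

def declone_prompt(prompt, seed=None):
    """Prepend randomized identity tokens to a prompt so that repeated
    generations don't converge on the same few 'clone' faces."""
    rng = random.Random(seed)
    identity = (f"{rng.choice(NAMES)}, a {rng.choice(NATIONALITIES)} person "
                f"with {rng.choice(HAIR_STYLES)} {rng.choice(HAIR_COLORS)} hair")
    return f"{identity}, {prompt}"

print(declone_prompt("portrait photo, studio lighting", seed=42))
```

Seeding the randomizer separately from the diffusion seed lets you reproduce a given identity while still varying it across a batch.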
 

bnew

[Image: MosaicML and Databricks team up]

by Naveen Rao and Hanlin Tang on June 26, 2023

MosaicML Agrees to Join Databricks to Power Generative AI for All​

Together with Databricks, we can bring our customers and community to the forefront of AI faster than ever before.
We are excited to announce that MosaicML has agreed to join Databricks to further our vision of making custom AI model development available to any organization.

We started MosaicML to solve the hard engineering and research problems necessary to make large scale neural network training and inference more accessible to everyone. With the recent generative AI wave, this mission has taken center stage. We fundamentally believe in a better world where everyone is empowered to train their own models, imbued with their own data, wisdom, and creativity, rather than have this capability centralized in a few generic models.

When Ali, Patrick, and the other Databricks co-founders reached out about a partnership, we immediately recognized them as kindred spirits: researchers-turned-entrepreneurs sharing a similar mission. Their strong company culture and focus on engineering mirrored what we thought a grown-up MosaicML would be.

Today, we’re excited to announce that MosaicML has signed an agreement to join Databricks to create a leading generative AI platform. The transaction is subject to certain customary closing conditions and regulatory clearances, and the companies will remain independent until those reviews are complete, but we are excited about what we can do together with Databricks when the transaction closes.

Our flagship products will continue to grow. To our current customers and those on our long waitlist: this partnership will only help us serve you faster! MosaicML training, inference, and our MPT family of foundation models are already powering generative AI for enterprises and developers around the world, and together with Databricks, we look forward to going bigger with all of you.

Generative AI is at an inflection point. Will the future rely mostly on large generic models owned by a few? Or will we witness a true Cambrian explosion of custom AI models that are built by many developers and companies from every corner of the world? MosaicML’s expertise in generative AI software infrastructure, model training, and model deployment, combined with Databricks’ customer reach and engineering capacity, will allow us to tip the scales in the favor of the many. We look forward to continuing this journey together with the AI community. As always, please join us in conversation on Twitter, LinkedIn, or in our community forum.

We’d like to thank our board members Matt Ocko at DCVC, Shahin Farshchi at Lux Capital, and Peter Barrett at Playground Global, as well as all our investors who have supported us through our journey.

Let’s keep going!
 

bnew

NUWA-XL​


NUWA-XL is a cutting-edge multimodal generative model with the remarkable ability to produce extremely long videos from provided scripts in a "coarse-to-fine" process.

Long Video​


Given the prompts of a script, NUWA-XL can generate an extremely long video that conforms to them in a "coarse-to-fine" process.







NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation​

Shengming Yin, Chenfei Wu, Huan Yang, Jianfeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Gong Ming, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan
In this paper, we propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation. Most current work generates long videos segment by segment, sequentially, which normally leads to a gap between training on short videos and inferring long videos, and the sequential generation is inefficient. Instead, our approach adopts a "coarse-to-fine" process, in which the video can be generated in parallel at the same granularity. A global diffusion model is applied to generate the keyframes across the entire time range, and then local diffusion models recursively fill in the content between nearby frames. This simple yet effective strategy allows us to train directly on long videos (3376 frames) to reduce the training-inference gap, and makes it possible to generate all segments in parallel. To evaluate our model, we build the FlintstonesHD dataset, a new benchmark for long video generation. Experiments show that our model not only generates high-quality long videos with both global and local coherence, but also decreases the average inference time from 7.55 min to 26 s (by 94.26%) on the same hardware when generating 1024 frames. The homepage link is this https URL
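The "coarse-to-fine" scheduling described in the abstract can be sketched as a frame-index planner. This is an illustrative reading of the abstract, not the authors' code; the `plan` helper and the `fanout` parameter (frames produced per diffusion pass) are assumptions:

```python
def plan(total_frames, fanout=16):
    """Build the coarse-to-fine plan as a list of levels. Each level is a
    list of (start, end) segments; a diffusion pass places `fanout` evenly
    spaced frames (keyframes) across each segment, and the gaps between
    those keyframes become the next level's segments. Segments within one
    level are independent, so all their passes can run in parallel."""
    levels, segments = [], [(0, total_frames - 1)]
    while segments:
        levels.append(segments)
        next_segments = []
        for start, end in segments:
            if end - start + 1 <= fanout:
                continue  # one local pass fills this segment densely; done
            keys = [start + round(i * (end - start) / (fanout - 1))
                    for i in range(fanout)]
            next_segments += [(a, b) for a, b in zip(keys, keys[1:])
                              if b - a > 1]
        segments = next_segments
    return levels

levels = plan(3376, fanout=16)
# Sequential depth of generation; each level's segments run in parallel.
print(len(levels))  # 3
```

With 3376 frames and a fanout of 16, one global pass plus two rounds of local passes cover every frame, which is why the paper can generate all segments at a level in parallel instead of rolling out sequentially.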

 

bnew
my goodness! it opened its mouth. :mindblown:


edit:

whoa


[GIF: DragGAN demo]


project page:




Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold​

About​

Official Code for DragGAN (SIGGRAPH 2023)

DEMO:​


 

GooPunch

Pro
Joined
Mar 11, 2022
Messages
250
Reputation
200
Daps
1,878
@bnew What do you think of George Hotz's claim that GPT-4 is a mixture of experts with 8x 220B models? That would explain all the recent Microsoft research papers on optimizing for smaller models.
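For readers unfamiliar with the term: in a mixture-of-experts layer, a router sends each token to only a few expert subnetworks, so the total parameter count overstates per-token compute. A toy top-k routing sketch follows; it is entirely illustrative, and nothing here reflects GPT-4's actual (unconfirmed) architecture:

```python
import random

random.seed(0)
NUM_EXPERTS, TOP_K = 8, 2

def expert(i, x):
    """Stand-in for expert i's feed-forward block (here: a toy scaling)."""
    return [(i + 1) * v for v in x]

def router_scores(x):
    """Stand-in gating network: one score per expert for this token
    (a real router is a learned function of x; this toy ignores it)."""
    return [random.random() for _ in range(NUM_EXPERTS)]

def moe_forward(x):
    """Route the token to the TOP_K highest-scoring experts and mix their
    outputs by normalized gate weight. Only TOP_K of NUM_EXPERTS experts
    run, so per-token compute scales with TOP_K, not total parameters."""
    scores = router_scores(x)
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i],
                 reverse=True)[:TOP_K]
    total = sum(scores[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        w = scores[i] / total
        out = [o + w * e for o, e in zip(out, expert(i, x))]
    return out, top

out, active = moe_forward([1.0, 2.0, 3.0])
print(f"active experts: {active} of {NUM_EXPERTS}")
```

Under the rumored 8x 220B figure, top-2 routing would mean roughly 440B parameters active per token rather than the full 1.76T, which is the usual argument for why MoE scales cheaply.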
 