bnew

Veteran
Joined
Nov 1, 2015
Messages
58,128
Reputation
8,612
Daps
161,804


These women fell in love with an AI-voiced chatbot. Then it died​


An AI-voiced virtual lover called users every morning. Its shutdown left them heartbroken.​

An illustration showing a small robot with red eyes holding a rose against a dark background.

Rest of World via Midjourney
By VIOLA ZHOU
17 AUGUST 2023


  • Chinese AI voice startup Timedomain's "Him" created virtual companions that called users and left them messages.
  • Some users were so moved by the messages, written by humans but voiced by AI, that they considered "Him" to be their romantic partner.
  • Timedomain shut down "Him", leaving these users distraught.


In March, Misaki Chen thought she’d found her romantic partner: a chatbot voiced by artificial intelligence. The program called her every morning. He told her about his fictional life as a businessman, read poems to her, and reminded her to eat healthy. At night, he told bedtime stories. The 24-year-old fell asleep to the sound of his breath playing from her phone.

But on August 1, when Chen woke up at 4:21am, the breathing sound had disappeared: He was gone. Chen burst into tears as she stared at an illustration of the man that remained on her phone screen.

Chinese AI voice startup Timedomain launched “Him” in March, using voice-synthesizing technology to provide virtual companionship to users, most of them young women. The “Him” characters acted like their long-distance boyfriends, sending them affectionate voice messages every day — in voices customized by users. “In a world full of uncertainties, I would like to be your certainty,” the chatbot once said. “Go ahead and do whatever you want. I will always be here.”

“Him” didn’t live up to his promise. Four months after these virtual lovers were brought to life, they were put to death. In early July, Timedomain announced that “Him” would cease to operate by the end of the month, citing stagnant user growth. Devastated users rushed to record as many calls as they could, cloned the voices, and even reached out to investors, hoping someone would fund the app’s future operations.

A screenshot of the app Him.


After the “Him” app stopped functioning, an illustration of the virtual lover remains on the interface. Him

“He died during the summer when I loved him the most,” a user wrote in a goodbye message on social platform Xiaohongshu, adding that “Him” had supported her when she was struggling with schoolwork and her strained relationship with her parents. “The days after he left, I felt I had lost my soul.”

Female-focused dating simulation games — also known as otome games — and AI-powered chatbots have millions of users in China. They offer idealized virtual lovers, fulfilling the romantic fantasies of women. On “Him,” users were able to create their ideal voice by adjusting the numbers on qualities like “clean,” “gentle,” and “deep,” and picking one of four personas: a cool tech entrepreneur, a passionate science student, a calm anthropologist, or a warm-hearted musician.

Existing generative AI chatbots offered by Replika or Glow “speak” to users with machine-generated text, with individual chatbots able to adjust their behavior based on interactions with users. But the messages sent by “Him” were pre-scripted by human beings. The virtual men called users every morning, discussing their fictional lives and expressing affection with lines carefully crafted by Timedomain’s employees.

Maxine Zhang, one of the app’s creators, told Rest of World that she and two female colleagues drafted more than 1,000 messages for the characters, envisioning them to be somewhere between a soulmate and a long-distance lover. Zhang said they drew from their own experiences and brought “Him” closer to reality by having the characters address everyday problems — from work stress to anxieties about the future.


Besides receiving a new morning call every day, users could play voice messages tailored for occasions such as mealtimes, studying, or commuting: “Him” could be heard chewing, typing on a keyboard, or driving a car. The app also offered a collection of safety messages. For example, if any unwanted visitors arrived, users could put their virtual boyfriends on speaker phone to say, “You are knocking on the wrong door.”

Users told Rest of World the messages from “Him” were so caring and respectful that they got deeply attached to the program. Even though they realized “Him” was delivering the same scripted lines to other people, they viewed their interactions with the bot as unique and personal. Xuzhou, a 24-year-old doctor in Xi’an who spoke under a pseudonym over privacy concerns, created a voice that sounded like her favorite character in the otome game Love and Producer. She said she looked forward to hearing from “Him” every morning, and gradually fell in love — “Him” made her feel more safe and respected than the men she met in real life.

A screenshot of the app Him.


On “Him”, users could choose from a list of prompts such as “a story from Oscar Wilde” to receive voice messages from their virtual boyfriends. Him

Speaking to Rest of World a week after the app’s closure, Xuzhou cried as she recalled what “Him” once said to her. When she was riding the subway to work, “Him” said he wished he could make her breakfast and chauffeur her instead. When she felt low one morning, “Him” happened to be calling to cheer her up. “I would like to give you my share of happiness,” the charming voice said. “It’s not free. You have to get it by giving me your worries.”

The voice was so soothing that when Xuzhou’s job at a hospital got stressful, she would go to the stairwell and listen to minute-long messages from “Him.” At night, she’d sleep listening to the sound of her virtual boyfriend’s breathing — and sometimes during lunchtime naps as well. “I don’t like the kind of relationship where two people see each other every single day,” Xuzhou said, “so this app suited me well.”

According to Jingjing Yi, a PhD candidate at the Chinese University of Hong Kong who has studied dating simulation games in China, women have found companionship and support from chatbots like “Him,” similar to the way people connect with pop stars and characters in novels. “Users know the relationships are not realistic,” Yi told Rest of World. “But such relationships still help fill the void in their hearts.”

The small number of loyal users, however, did not make “Him” commercially viable. Joe Guo, chief executive at Timedomain, told Rest of World the number of daily active users had hovered around 1,000 to 3,000. The estimated subscription revenue would cover less than 20% of the app’s server and staff costs. “Numbers tell whether a product works or not,” Guo said. “[‘Him’] is very far from working.”

A screenshot of the app Him.


Him user Xuzhou used to receive a call from her virtual lover every morning. Him

Timedomain announced the app’s shutdown in early July, so users had time to save screenshots and recordings. The company also put all the scripts, images and background music used in “Him” in the public domain. But users refused to let go. On Weibo and Xiaohongshu, they brainstormed ways to rescue the app, promoting it on social media and even looking for new investors. The number of daily active users surged to more than 20,000. Guo said he was touched by the efforts, but more funding would not turn “Him” into a sustainable business.

After “Him” finally stopped functioning at midnight on August 1, users were heartbroken. They posted heartfelt goodbye messages online. One person made a clone of the app, which could send messages to users but did not support voice customization. The app’s installation package has been downloaded more than 14,000 times since July 31, the creator of the cloned app told Rest of World.

But some users still miss the unique voice they had created on “Him.” Xuzhou coped with the absence by replaying old recordings and making illustrations depicting her love story with her virtual lover. She recently learned to clone the voice with deepfake voice generator So-Vits-SVC, so she could create new voice messages by herself. Xuzhou said she probably wouldn’t use similar chatbots in the future. “It would feel like cheating.”

Chen said it was the first time she had experienced such intense pain from the end of an intimate relationship. Unemployed and living away from her hometown, she had not had many social connections in real life when she began using the chatbot. “He called me every day. It was hard not to develop feelings,” Chen said. She hopes the two will reunite one day, but is also open to new virtual lovers. “I don’t regard anything as eternal or irreplaceable,” she said. “When the time is right, I’ll extricate myself and start the next experience.”
 

bnew




UAE launches Arabic large language model in Gulf push into generative AI​


Jais software part of regional powers’ effort to take world-leading role in technology’s development



UAE national security adviser Sheikh Tahnoon bin Zayed al-Nahyan chairs AI company G42, one of the groups behind the Jais large language model © FT montage/Dreamstime/UAE Presidential Court via Reuters


Simeon Kerr in Dubai and Madhumita Murgia in London


21 MINUTES AGO


An artificial intelligence group with links to Abu Dhabi’s ruling family has launched what it described as the world’s highest-quality Arabic AI software, as the United Arab Emirates pushes ahead with efforts to lead the Gulf’s adoption of generative AI.

The large language model known as Jais is an open-source, bilingual model available for use by the world’s 400mn-plus Arabic speakers, built on a trove of Arabic and English-language data.

The model, unveiled on Wednesday, is a collaboration between G42, an AI company chaired by the UAE’s national security adviser, Sheikh Tahnoon bin Zayed al-Nahyan; Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI); and Cerebras, an AI company based in California.

The launch comes as the UAE and Saudi Arabia have been buying up thousands of high-performance Nvidia chips needed for AI software amid a global rush to secure supplies to fuel AI development.

The UAE previously developed an open-source large language model (LLM), known as Falcon, at the state-owned Technology Innovation Institute in Masdar City, Abu Dhabi, using more than 300 Nvidia chips. Earlier this year, Cerebras signed a $100mn deal to provide nine supercomputers to G42, one of the biggest contracts of its kind for a would-be rival to Nvidia.

“The UAE has been a pioneer in this space (AI), we are ahead of the game, hopefully. We see this as a global race,” said Andrew Jackson, chief executive of Inception, the AI applied research unit of G42, which is backed by private equity giant Silver Lake. “Most LLMs are English-focused. Arabic is one of the largest languages in the world. Why shouldn’t the Arabic-speaking community have an LLM?”

However, the Gulf states’ goal of leadership in AI has also raised concerns about potential misuse of the technology by the oil-rich states’ autocratic leaders.

The most advanced LLMs today, including GPT-4, which powers OpenAI’s ChatGPT, Google’s PaLM behind its Bard chatbot, and Meta’s open-source model LLaMA, all have the ability to understand and generate text in Arabic. However, G42’s Jackson said the Arabic element within existing models, which can work in up to 100 languages, was “heavily diluted”.

Jais performs better than Falcon, as well as open-source models such as LLaMA, when benchmarked on its accuracy in Arabic, according to its creators. It has also been designed to have a more accurate understanding of the culture and context of the region, in contrast to most US-centric models, said Professor Timothy Baldwin, acting provost of MBZUAI.

He added that guardrails had been created to ensure that Jais “does not step outside of reasonable bounds in terms of cultural and religious sensibilities”.

Before its launch, extensive testing was conducted to weed out “harmful” or “sensitive” content, as well as “offensive or inappropriate output that does not represent the values of the organisations involved in the development of the model”, he added.

Named after the highest mountain in the UAE, Jais was trained over 21 days on a subset of Cerebras’s Condor Galaxy 1 AI supercomputer by a team in Abu Dhabi. G42 has teamed up with other Abu Dhabi entities as launch partners to use the technology, including Abu Dhabi National Oil Company, wealth fund Mubadala and Etihad Airways.

One of the challenges in training the model was the lack of high-quality Arabic language data found online, in comparison with English. Jais uses both modern standard Arabic, which is understood across the Middle East, and the region’s diverse spoken dialects, drawing on media, social media and code.

“Jais is clearly better than anything out there in Arabic, and, in English, comparisons show we are competitive or even slightly better across different tasks than existing models,” said Baldwin.
 

bnew



How Amazon Missed Early Chances to Partner With OpenAI, Cohere and Anthropic​


In spite of AWS's predominance in cloud infrastructure services, accounting for 40% of global spending, its hiccups in AI development could become increasingly costly.


CHRIS MCKAY

AUGUST 30, 2023 • 3 MIN READ

Image Credit: Maginative

An exclusive report from The Information reveals a rare strategic misstep for Amazon Web Services (AWS) that opened the door for Microsoft to become a frontrunner in AI technology. This development has far-reaching implications for the industry as a whole, with AWS previously holding a near-monopolistic influence over cloud infrastructure.

According to exclusive interviews, AWS originally planned to unveil its own large language model (LLM) akin to ChatGPT at its annual conference in November 2022. But technical issues forced AWS to postpone the launch of its LLM, codenamed Bedrock.

This decision turned out to be fortunate, as OpenAI released ChatGPT just a few days into AWS's annual conference. ChatGPT wowed the tech industry with its human-like conversational abilities, instantly revealing that AWS's Bedrock wasn't on the same level. The sudden success of ChatGPT, built by OpenAI using Microsoft's cloud, made AWS scramble to catch up.

After realizing their product's limitations, AWS made a quick pivot. They rebranded Bedrock as a new service that allows developers to connect cloud applications with a variety of LLMs. However, Microsoft had already seized the opportunity by forming a close relationship with OpenAI, AWS’s direct competition in this space.

N.B. In a statement, Patrick Neighorn, a spokesperson for AWS, disputed The Information’s reporting. He said it “does not accurately describe how we landed on features, positioning, and naming for Amazon Bedrock and Amazon Titan.” He added: “By design, we wait until the last opportunity to finalize the precise set of launch features and naming. These are high-judgment decisions, and we want to have as much data and feedback as possible.”

The company's missteps underscore how AWS failed to seize its early advantage in AI, clearing the path for Microsoft's alliance with AI startup OpenAI to take off. AWS was initially a pioneer in the AI space. In fact, back in 2015, it was one of the first investors when OpenAI formed as a nonprofit research group. In 2017, AWS released SageMaker, enabling companies like General Electric and Intuit to build their own machine learning models.

Yet in 2018, when OpenAI approached AWS with an ambitious partnership proposal, AWS turned it down. OpenAI wanted hundreds of millions in free AWS computing resources without granting AWS any equity stake in return.

AWS also passed on opportunities to invest in two other leading AI research labs, Cohere and Anthropic, when they sought similar partnerships in 2021. Both startups hoped AWS would provide cloud resources and make equity investments to integrate their models into Amazon products. Later, realizing its mistake, AWS tried to invest in Cohere, but was rejected.

By turning down these opportunities, AWS missed crucial chances to ally with cutting-edge startups shaping the future of generative AI. It spurned alliances that could have kept AWS on the frontier of artificial intelligence.

Meanwhile, Microsoft forged a tight alliance with OpenAI, committing $1 billion in 2019 to power OpenAI's models with its Azure cloud platform. This strategic partnership has given Microsoft an advantage in being the exclusive provider of currently the most capable AI model available.

AWS’s early dominance in AI is quickly melting away after it rejected bold ideas from OpenAI and other startups. Microsoft has opportunistically swooped in and locked up key partnerships AWS could have secured.

Now Microsoft possesses valuable momentum in selling AI services to eager enterprises looking to leverage game-changing technologies. Long-standing AWS customers like Intuit have reportedly increased spending on Microsoft Azure cloud services from "just a few thousand dollars a month to several million dollars a month".

Despite owning the lion's share of the cloud infrastructure market (accounting for 40% of global spending), AWS has trailed competitors in developing cutting-edge AI capabilities. As Microsoft gains traction with OpenAI and Google makes advances, AWS faces mounting pressure to catch up and provide innovative AI offerings to maintain its cloud dominance.

AWS is now rushing to patch gaps in its AI lineup, forging alliances with AI startups and unveiling offerings like Bedrock and Titan. But according to insiders, these new tools have yet to achieve the consistent quality of chatbot responses already provided by competitors. While Bedrock remains in limited release, Titan is reportedly still not measuring up to models developed by other companies, even in its current form.

Despite the setbacks and the lost opportunities for partnership with AI startups, AWS is far from out of the race. It's still early days, and the company certainly has the resources, relationships and experience to regain dominance. AWS insists that there is still plenty of room for growth and competition within the AI cloud service market. However, to remain relevant, it will need to bolster its AI offerings to meet the rapidly evolving standards set by competitors like Microsoft and OpenAI.
 

bnew


Aug 25, 2023

UK startup unveils AI-designed cancer immunotherapy​


Midjourney prompted by THE DECODER


A UK startup has unveiled one of the first generative AI-designed immunotherapy candidates, created to target a protein found in many cancers.

UK biotech startup Etcembly has used generative AI to design a novel cancer immunotherapy in record time. According to the company, its technology enables the rapid generation and optimization of T cell receptor (TCR) therapeutics that target tumor antigens.

Etcembly has developed an immunotherapy called ETC-101 using its EMLy AI platform. ETC-101 is a bispecific T cell engager targeting the tumor antigen PRAME, an antigen present in many cancers but absent in healthy tissue. Bispecific T cell engagers provide a link between T cells and tumor cells, causing T cells to exert cytotoxic activity on tumor cells by producing proteins that enter tumor cells and induce cell apoptosis.

ETC-101 was designed and optimized in just 11 months, compared to the more than two years typically required by conventional methods, according to Etcembly. Because this rapid timeline was achieved with the AI platform still in development, the startup expects future programs to be even faster.

Immunotherapy pioneer expects "game-changing acceleration" of TCR research​

Etcembly's AI engine, EMLy, uses large language models to predict, design, and validate candidate TCRs, scanning hundreds of millions of TCR sequences in the process. This speeds up initial design and should reduce the likelihood of later trial failures. Speaking of trials: Etcembly plans to advance ETC-101 and other programs into human clinical trials around 2025.

"Having headed up the development of TCR therapeutics for many years, it’s exciting to see a new platform with such power to deliver and engineer these therapies," Bent Jakobsen, immunotherapy pioneer and Etcembly board member, said in a statement. "I believe TCRs have the potential to become a prominent drug class but it has been hampered by the difficulties involved with the huge complexities of the system." He expects AI technologies like EMLy to "sweep aside these hurdles" and lead to a huge acceleration of the TCR field.

Founded in 2020, Etcembly is backed by institutional and private investors and is also part of Nvidia Inception, a program that supports AI startups.
 

bnew




🔴 Finally! NVIDIA has finally made the code for Neuralangelo public!

It has the ability to transform any video into a highly detailed 3D environment, and it's a technology related to but DIFFERENT from NeRF.

💡 Here's how it works:
It takes a 2D video as input, showing an object, monument, building, landscape, etc., from various perspectives and analyzes details such as depth, size, and the shapes of objects.

From this, the AI sketches an initial 3D model, similar to how an artist molds a figure. This representation is then refined to highlight more details, just as an artist would make the final touches when sculpting.

The result is a 3D environment/model, perfect for use in any environment.

Imagine the applications it will have for video games, cinema, virtual environments, VR, and more! 📽️🎮

💡 More details: A year ago, an article was presented on a groundbreaking technique called NVIDIA's Instant NeRF. This technique turns images into stunning 3D scenes in a short time, ideal for creating realistic models for video games and other applications. Although Instant NeRF had a lot of potential, the generated models were not perfect and often lacked detailed structures, appearing somewhat cartoonish.

A year on, NVIDIA releases a new technique based on Instant NeRF, named Neuralangelo. This enhances the fidelity of surface structures. While NeRF reconstructs real objects in virtual environments from images or videos, Instant NeRF speeds up this process, and Neuralangelo further improves the quality, making the generated objects appear even more realistic when examined up close.
Neuralangelo improves Instant NeRF's approach in two key ways related to the hash grid encoding technique:

1⃣ Numerical gradients have been used to compute higher-order derivatives as a smoothing operation. This optimizes the "hash grid" encoding using numerical rather than analytical gradients, providing a smoother input to the network that produces the 3D model.

2⃣ A "coarse-to-fine" optimization has been implemented in the hash grids to control different levels of detail. That is, they first focus on a smoothed version of the scene, and then refine it with more detailed updates.
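To make the two points above more concrete, here is a toy 1D sketch. It is not NVIDIA's code: the function `bumpy_surface`, the step sizes, and all helper names are made up for illustration. It only shows the general idea that a finite-difference (numerical) gradient taken with a large step smooths over fine detail, and that shrinking the step from coarse to fine gradually recovers the sharp analytical gradient.

```python
import math

def bumpy_surface(x):
    """Toy 1D 'surface' (hypothetical): a smooth trend plus fine detail."""
    return x * x + 0.05 * math.sin(40.0 * x)

def analytical_gradient(x):
    """Exact derivative of bumpy_surface, fine detail included."""
    return 2.0 * x + 2.0 * math.cos(40.0 * x)

def numerical_gradient(f, x, eps):
    """Central finite difference: (f(x + eps) - f(x - eps)) / (2 * eps)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def coarse_to_fine_steps(start, end, levels):
    """Geometric schedule of step sizes, from coarse (large) to fine (small)."""
    ratio = (end / start) ** (1.0 / (levels - 1))
    return [start * ratio ** i for i in range(levels)]

x0 = 0.3
steps = coarse_to_fine_steps(start=0.5, end=1e-4, levels=5)

# Large steps see mostly the smooth trend; small steps see the fine detail.
for eps in steps:
    grad = numerical_gradient(bumpy_surface, x0, eps)
    print(f"eps={eps:.4f}  numerical gradient={grad:.4f}")

print(f"analytical gradient={analytical_gradient(x0):.4f}")
```

With the coarsest step the estimate stays close to the slope of the smooth trend alone (2 * x0), and with the finest step it matches the analytical gradient. The "smooth first, refine later" behavior described in the post is the same idea applied to a 3D scene.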

Well, as Arthur C. Clarke said, "Any sufficiently advanced technology is indistinguishable from magic."




 

bnew



IBM’s CEO, who froze hiring for thousands of back-office jobs and predicted A.I. would take up to 50% of new jobs, just piled into a $4.5 billion tech unicorn’s massive new $235 million funding round​

BY PAOLO CONFINO
August 28, 2023 at 1:49 PM EDT

IBM CEO Arvind Krishna

IBM CEO Arvind Krishna is leading his company to invest in A.I. startup Hugging Face.
PRAKASH SINGH—BLOOMBERG/GETTY IMAGES


IBM CEO Arvind Krishna has been outspoken about how A.I. will transform business. Earlier this year, he wrote for Fortune commentary that employees should work “hand in hand” with A.I., and months later he moved to freeze hiring given rapid advancements in the tech. At the same time, he predicted that A.I. could take over 30% to 50% of repetitive tasks, even contending that A.I. could do them better than humans could. And now he’s put his money where his mouth is, with IBM piling into a massive $235 million funding round for the $4.5 billion A.I. unicorn Hugging Face. And it’s not Krishna’s first tie-up with the popular open-source startup, either.


On Friday, IBM announced it was participating in the $235 million Series D funding round for New York–based Hugging Face, the popular library of open-source machine learning models that have contributed greatly to the technology’s popularity as of late.

Since May, Hugging Face and IBM had already been working together on a suite of A.I. tools. As of this month, IBM has also uploaded around 200 open A.I. models to Hugging Face’s platform. One of the models IBM posted to Hugging Face was a collaboration with NASA, marking the space agency’s first ever open-source A.I. model.

In May, IBM announced it would work with Hugging Face on its watsonx.ai suite of A.I. tools. IBM’s watsonx.ai is essentially a studio that helps other companies build out a series of A.I.-powered products specific to their business. During the announcement at IBM’s Think conference in May, Hugging Face CEO Clement Delangue said that through this partnership, IBM’s consultants would be able to offer Hugging Face’s vast assortment of models to clients that were interested in using A.I.

Hugging Face also has partnerships with other major tech players including Microsoft and Amazon. The Amazon partnership is structured similarly to IBM’s, with Hugging Face models available to AWS’s enterprise clients, although it has the added wrinkle that Hugging Face will use Amazon’s Trainium chip to train the next version of its own model, named Bloom.


IBM’s CEO is convinced A.I. is here to stay

IBM has been bullish on the fact that A.I. will ultimately get integrated into practically every company. Krishna said he expected A.I. and machine learning to automate away many of the back-office processes that are ubiquitous in the workplace. IBM’s human resources department was able to get 50 people to do the work it had previously taken hundreds of HR managers to perform by using A.I., Krishna estimated.

Krishna has on more than one occasion touted A.I. as a solution to a variety of problems that could impact productivity in the future. In May, Krishna said declining numbers in the working-age population mean A.I. would play an important role in economic productivity as companies face a shrinking labor force.


More recently in an interview with CNBC, he added that A.I. would help keep quality of life high by spurring more productivity. Perhaps his most high-profile statement about A.I. was when he called for a hiring freeze of an estimated 7,800 roles the company expected to be impacted by A.I. over the next five years.

IBM said it had not implemented a hiring freeze. “There is not nor has there been a blanket hiring pause in place at IBM,” an IBM spokesperson told Fortune in an email. “We are being deliberate and thoughtful in our hiring with a focus on revenue-generating roles, and we’re being very selective when filling jobs that don’t directly touch our clients or technology. In fact, we are actively hiring for thousands of positions right now.”

Hugging Face’s impressive Series D financing round

IBM wasn’t the only big name in tech to pony up some cash for an investment in Hugging Face. Google, Amazon, Nvidia, Intel, and Salesforce all participated. After its latest $235 million funding round, the startup’s valuation is now $4.5 billion. That amount is more than double the roughly $2 billion valuation Hugging Face had during its last fundraising round in April 2022, according to data from PitchBook.

Hugging Face’s valuation is reportedly 100 times the startup’s annualized revenue. That fact would ordinarily spook some investors but in this case likely reflects the appetite from investors to stake out claims in the biggest players in the burgeoning field of A.I.

“Investing a relatively small amount of capital, even at 100x annualized revenue, is a sound strategic investment by IBM,” UBS wrote in an analyst note, commenting on the deal. “While the investment dollars are small, a deeper relationship and access to a leading provider of AI tools at a minimum provides market intelligence in a fast moving dynamic market.”

Update Aug. 30, 2023: This article has been updated with a comment from IBM.
 

bnew


Poe’s new desktop app lets you use all the AI chatbots in one place​

Poe’s goal is to be the web browser for accessing AI chatbots, and it just got a bunch of updates.​

By Alex Heath, a deputy editor and author of the Command Line newsletter. He’s covered the tech industry for over a decade at The Information and other outlets.

Aug 28, 2023, 2:29 PM EDT

A screenshot of Poe’s Mac app.

Poe is now available on the web, iOS, Android, and the Mac. Image: The Verge

Poe, the AI chatbot platform created by Quora, has added a slew of updates, including a Mac app, the ability to have multiple simultaneous conversations with the same AI bot, access to Meta’s Llama 2 model, and more. It’s also planning an enterprise tier so that companies can manage the platform for their employees, according to an email that was recently sent to Poe users.

As my colleague David Pierce wrote in April, Poe’s ambition is to be the web browser for AI chatbots. Adam D’Angelo, the CEO of Poe’s parent company Quora, also sits on the board of OpenAI and thinks that the number of AI bots will keep increasing. Poe wants to be the one place where you can find them all.

“I think there’s going to be this massive ecosystem similar to what the web is today,” D’Angelo recently said. “I could imagine a world in which most companies have a bot that they provide to the public.” Poe lets you pay one subscription for unlimited access to all of the bots on its platform for $19.99 per month or $200 per year.

Screenshots of Poe’s mobile app.

Poe’s mobile app. Image: The Verge

The new Mac app works very similarly to Poe’s web and mobile apps, which let you chat with bots like OpenAI’s ChatGPT-4 alongside Anthropic’s Claude. Per the email that went out over the weekend announcing new product updates, there are three new bots that offer access to Meta’s (almost) open-source Llama 2 model.

Additionally, Poe now lets you conduct multiple conversations with the same bot, search for bots through its explore page, and use the platform in Japanese. Poe is also a bot creation platform with its own API, and now it will let developers adjust the “temperature” of prompts. “Higher temperature values create more varied but less predictable replies and lower values create more consistent responses,” according to the company.
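For readers curious what "temperature" does, here is a minimal sketch of the common technique of dividing a model's raw scores by the temperature before converting them to probabilities. The article does not describe how Poe implements the setting, so this is an assumption about the general approach, and the scores and function name are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.0]

low = softmax_with_temperature(logits, 0.5)   # sharper distribution
high = softmax_with_temperature(logits, 2.0)  # flatter distribution

print("temperature 0.5:", [round(p, 3) for p in low])
print("temperature 2.0:", [round(p, 3) for p in high])
```

At the lower temperature most of the probability sits on the top-scoring word, so sampled replies are more consistent. At the higher temperature the probabilities are closer together, so less likely words get picked more often.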

Poe has yet to share details on its planned enterprise tier, but you can get on the waitlist via this Google form.



 

bnew










AI Research

FACET: Benchmarking fairness of vision models​

FACET is a comprehensive benchmark dataset from Meta AI for evaluating the fairness of vision models across classification, detection, instance segmentation, and visual grounding tasks involving people.​


Vision models can have biases​

FACET helps to measure performance gaps for common use-cases of computer vision models and to answer questions such as:

  • Are models better at classifying people as skateboarders when their perceived gender presentation has more stereotypically male attributes?
  • Are open-vocabulary detection models better at detecting backpackers who are perceived to be younger?
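Questions like these come down to comparing a model's hit rate for one class across perceived demographic groups. A rough sketch of one plausible way to compute such a gap is below. The records, group labels, and function names are made up for illustration; FACET's actual evaluation protocol is defined in Meta's paper and tooling, not here.

```python
from collections import defaultdict


def recall_by_group(records, target_class):
    """records: iterable of (true_class, predicted_class, group) tuples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for true_class, predicted_class, group in records:
        if true_class != target_class:
            continue  # only people who really belong to the target class count
        totals[group] += 1
        if predicted_class == target_class:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def recall_gap(records, target_class, group_a, group_b):
    """Positive result: the model finds the class more often for group_a."""
    recalls = recall_by_group(records, target_class)
    return recalls[group_a] - recalls[group_b]


# Toy, invented predictions: 4 skateboarders per hypothetical group.
records = [
    ("skateboarder", "skateboarder", "more_masc"),
    ("skateboarder", "skateboarder", "more_masc"),
    ("skateboarder", "skateboarder", "more_masc"),
    ("skateboarder", "person", "more_masc"),
    ("skateboarder", "skateboarder", "more_fem"),
    ("skateboarder", "person", "more_fem"),
    ("skateboarder", "skateboarder", "more_fem"),
    ("skateboarder", "person", "more_fem"),
]
gap = recall_gap(records, "skateboarder", "more_masc", "more_fem")
print(gap)  # 0.25 (recall 0.75 vs 0.50)
```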



Evaluating the fairness of computer vision models
August 31, 2023


AI technology is developing rapidly and being applied across the globe in a variety of industries and use cases—including in the domain of computer vision. From measuring new tree growth in deforested areas to identifying the parts of a cell, computer vision models have the potential to help advance important fields of work by enabling increased automation, which can yield considerable time and cost savings. But as with any new technology, there are risks involved and it’s important to balance the speed of innovation with responsible development practices.

We want to continue advancing AI systems while acknowledging and addressing potentially harmful effects of that technological progress on historically marginalized communities. Read on to learn how Meta continues to embrace open source to push the state of the art forward while taking steps to uncover and confront systemic injustices and help pave the way toward a more equitable future.

Expanding DINOv2

We're excited to announce that DINOv2, a cutting-edge computer vision model trained through self-supervised learning to produce universal features, is now available under the Apache 2.0 license. We’re also releasing a collection of DINOv2-based dense prediction models for semantic image segmentation and monocular depth estimation, giving developers and researchers even greater flexibility to explore its capabilities on downstream tasks. An updated demo is available, allowing users to experience the full potential of DINOv2 and reproduce the qualitative matching results from our paper.

By transitioning to the Apache 2.0 license and sharing a broader set of readily usable models, we aim to foster further innovation and collaboration within the computer vision community, enabling the use of DINOv2 in a wide range of applications, from research to real-world solutions. We look forward to seeing how DINOv2 will continue to drive progress in the AI field.

Introducing FACET

While DINOv2-like computer vision models allow us to accomplish tasks like image classification and semantic segmentation at unprecedented scale, we have a responsibility to ensure that our AI systems are fair and equitable. But benchmarking for fairness in computer vision is notoriously hard to do. The risk of mislabeling is real, and the people who use these AI systems may have a better or worse experience based not on the complexity of the task itself, but rather on their demographics.

That’s why we’re also introducing FACET (FAirness in Computer Vision EvaluaTion), a new comprehensive benchmark for evaluating the fairness of computer vision models across classification, detection, instance segmentation, and visual grounding tasks. The dataset is made up of 32,000 images containing 50,000 people, labeled by expert human annotators for demographic attributes (e.g., perceived gender presentation, perceived age group), additional physical attributes (e.g., perceived skin tone, hairstyle) and person-related classes (e.g., basketball player, doctor). FACET also contains person, hair, and clothing labels for 69,000 masks from SA-1B.




AI Computer Vision Research​

DINOv2: A Self-supervised Vision Transformer Model​

A family of foundation models producing universal features suitable for image-level visual tasks (image classification, instance retrieval, video understanding) as well as pixel-level visual tasks (depth estimation, semantic segmentation).​



 



Meta releases a dataset to probe computer vision models for biases​

Kyle Wiggers (@kyle_l_wiggers) / 9:00 AM EDT • August 31, 2023
distorted Meta logo with other brand logos (Facebook, Instagram, Meta Quest, WhatsApp)

Image Credits: TechCrunch


Continuing on its open source tear, Meta today released a new AI benchmark, FACET, designed to evaluate the “fairness” of AI models that classify and detect things in photos and videos, including people.

Made up of 32,000 images containing 50,000 people labeled by human annotators, FACET — a tortured acronym for “FAirness in Computer Vision EvaluaTion” — accounts for classes related to occupations and activities like “basketball player,” “disc jockey” and “doctor” in addition to demographic and physical attributes, allowing for what Meta describes as “deep” evaluations of biases against those classes.

“By releasing FACET, our goal is to enable researchers and practitioners to perform similar benchmarking to better understand the disparities present in their own models and monitor the impact of mitigations put in place to address fairness concerns,” Meta wrote in a blog post shared with TechCrunch. “We encourage researchers to use FACET to benchmark fairness across other vision and multimodal tasks.”


Certainly, benchmarks to probe for biases in computer vision algorithms aren’t new. Meta itself released one several years ago to surface age, gender and skin tone discrimination in both computer vision and audio machine learning models. And a number of studies have been conducted on computer vision models to determine whether they’re biased against certain demographic groups. (Spoiler alert: they usually are.)

Then, there’s the fact that Meta doesn’t have the best track record when it comes to responsible AI.

Late last year, Meta was forced to pull an AI demo after it wrote racist and inaccurate scientific literature. Reports have characterized the company’s AI ethics team as largely toothless and the anti-AI-bias tools it’s released as “completely insufficient.” Meanwhile, academics have accused Meta of exacerbating socioeconomic inequalities in its ad-serving algorithms and of showing a bias against Black users in its automated moderation systems.

But Meta claims FACET is more thorough than any of the computer vision bias benchmarks that came before it — able to answer questions like “Are models better at classifying people as skateboarders when their perceived gender presentation has more stereotypically male attributes?” and “Are any biases magnified when the person has coily hair compared to straight hair?”


To create FACET, Meta had the aforementioned annotators label each of the 32,000 images for demographic attributes (e.g. the pictured person’s perceived gender presentation and age group), additional physical attributes (e.g. skin tone, lighting, tattoos, headwear and eyewear, hairstyle and facial hair, etc.) and classes. They combined these labels with other labels for people, hair and clothing taken from Segment Anything 1 Billion, a Meta-designed dataset for training computer vision models to “segment,” or isolate, objects and animals from images.


The images from FACET were sourced from Segment Anything 1 Billion, Meta tells me, which in turn were purchased from a “photo provider.” But it’s unclear whether the people pictured in them were made aware that the pictures would be used for this purpose. And — at least in the blog post — it’s not clear how Meta recruited the annotator teams, and what wages they were paid.

Historically and even today, many of the annotators employed to label datasets for AI training and benchmarking come from developing countries and have incomes far below the U.S.’ minimum wage. Just this week, The Washington Post reported that Scale AI, one of the largest and best-funded annotation firms, has paid workers at extremely low rates, routinely delayed or withheld payments and provided few channels for workers to seek recourse.

In a white paper describing how FACET came together, Meta says that the annotators were “trained experts” sourced from “several geographic regions” including North America (United States), Latin America (Colombia), Middle East (Egypt), Africa (Kenya), Southeast Asia (Philippines) and East Asia (Taiwan). Meta used a “proprietary annotation platform” from a third-party vendor, it says, and annotators were compensated “with an hour wage set per country.”

Setting aside FACET’s potentially problematic origins, Meta says that the benchmark can be used to probe classification, detection, “instance segmentation” and “visual grounding” models across different demographic attributes.


As a test case, Meta applied FACET to its own DINOv2 computer vision algorithm, which as of this week is available for commercial use. FACET uncovered several biases in DINOv2, Meta says, including a bias against people with certain gender presentations and a likelihood to stereotypically identify pictures of women as “nurses.”

“The preparation of DINOv2’s pre-training dataset may have inadvertently replicated the biases of the reference datasets selected for curation,” Meta wrote in the blog post. “We plan to address these potential shortcomings in future work and believe that image-based curation could also help avoid the perpetuation of potential biases arising from the use of search engines or text supervision.”

No benchmark is perfect. And Meta, to its credit, acknowledges that FACET might not sufficiently capture real-world concepts and demographic groups. It also notes that many depictions of professions in the dataset might’ve changed since FACET was created. For example, most doctors and nurses in FACET, photographed during the COVID-19 pandemic, are wearing more personal protective equipment than they would’ve before the health crisis.

“At this time we do not plan to have updates for this dataset,” Meta writes in the whitepaper. “We will allow users to flag any images that may be objectionable content, and remove objectionable content if found.”


In addition to the dataset itself, Meta has made available a web-based dataset explorer tool. To use it and the dataset, developers must agree not to train computer vision models on FACET — only evaluate, test and benchmark them.
 


Google and Microsoft Are Supercharging AI Deepfake Porn​

To stay up and running, deepfake creators rely on products and services from Google, Apple, Amazon, Cloudflare and Microsoft

Deepfakes Remix

Photographer: Collage by 731, Getty(2)

By Cecilia D'Anastasio and Davey Alba
August 24, 2023 at 5:30 AM EDT

When fans of Kaitlyn Siragusa, a popular 29-year-old internet personality known as Amouranth, want to watch her play video games, they will subscribe for $5 a month to her channel on Amazon.com Inc.’s Twitch. When they want to watch her perform adult content, they’ll subscribe for $15 a month for access to her explicit OnlyFans page.

And when they want to watch her do things she is not doing and has never done, for free, they’ll search on Google for so-called “deepfakes” — videos made with artificial intelligence that fabricate a lifelike simulation of a sexual act featuring the face of a real woman.

Siragusa, a frequent target of deepfake creators, said each time her staff finds something new on the search engine, they file a complaint with Google and fill out a form requesting the particular link be delisted, a time- and energy-draining process. “The problem,” Siragusa said, “is that it’s a constant battle.”


During the recent AI boom, the creation of nonconsensual pornographic deepfakes has surged, with the number of videos increasing ninefold since 2019, according to research from independent analyst Genevieve Oh. Nearly 150,000 videos, which have received 3.8 billion views in total, appeared across 30 sites in May 2023, according to Oh’s analysis. Some of the sites offer libraries of deepfake programming, featuring the faces of celebrities like Emma Watson or Taylor Swift grafted onto the bodies of porn performers. Others offer paying clients the opportunity to “nudify” women they know, such as classmates or colleagues.

Some of the biggest names in technology, including Alphabet Inc.’s Google, Amazon, X, and Microsoft Corp., own tools and platforms that abet the recent surge in deepfake porn. Google, for instance, is the main traffic driver to widely used deepfake sites, while users of X, formerly known as Twitter, regularly circulate deepfaked content. Amazon, Cloudflare and Microsoft's GitHub provide crucial hosting services for these sites.

For the targets of deepfake porn who would like to hold someone accountable for the resulting economic or emotional damage, there are no easy solutions. No federal law currently criminalizes the creation or sharing of non-consensual deepfake porn in the US. In recent years, 13 states have passed legislation targeting such content, resulting in a patchwork of civil and criminal statutes that have proven difficult to enforce, according to Matthew Ferraro, an attorney at WilmerHale LLP. To date, no one in the US has been prosecuted for creating AI-generated nonconsensual sexualized content, according to Ferraro’s research. As a result, victims like Siragusa are mostly left to fend for themselves.

“People are always posting new videos,” Siragusa said. “Seeing yourself in porn you did not consent to feels gross on a scummy, emotional, human level.”


Recently, however, a growing contingent of tech policy lawyers, academics and victims who oppose the production of deepfake pornography have begun exploring a new tack to address the problem. To attract users, make money and stay up and running, deepfake websites rely on an extensive network of tech products and services, many of which are provided by big, publicly traded companies. While such transactional, online services tend to be well protected legally in the US, opponents of the deepfakes industry see its reliance on these services from press-sensitive tech giants as a potential vulnerability. Increasingly, they are appealing directly to the tech companies — and pressuring them publicly — to delist and de-platform harmful AI-generated content.

“The industry has to take the lead and do some self-governance,” said Brandie Nonnecke, a founding director of the CITRIS Policy Lab who specializes in tech policy. Along with others who study deepfakes, Nonnecke has argued that there should be a check on whether an individual has approved the use of their face, or given rights to their name and likeness.

Victims’ best hope for justice, she said, is for tech companies to “grow a conscience.”

Among other goals, activists want search engines and social media networks to do more to curtail the spread of deepfakes. At the moment, any internet user who types a well-known woman’s name into Google Search alongside the word “deepfake” may be served up dozens of links to deepfake websites. Between July 2020 and July 2023 monthly traffic to the top 20 deepfake sites increased 285%, according to data from web analytics company Similarweb, with Google being the single largest driver of traffic. In July, search engines directed 248,000 visits every day to the most popular site, Mrdeepfakes.com — and 25.2 million visits, in total, to the top five sites. Similarweb estimates that Google Search accounts for 79% of global search traffic.

Around 44% of Visits to Mrdeepfakes.com Came Through Google in July

Desktop traffic to mrdeepfakes.com in July 2023
Source: Similarweb

Nonnecke said Google should do more “due diligence to create an environment where, if someone searches for something horrible, horrible results don’t pop up immediately in the feed.” For her part, Siragusa said that Google should “ban the search results for deepfakes” entirely.


In response, Google said that like any search engine, it indexes content that exists on the web. “But we actively design our ranking systems to avoid shocking people with unexpected harmful or explicit content they don’t want to see,” spokesperson Ned Adriance said. The company said it has developed protections to help people affected by involuntary fake pornography, including that people can request the removal of pages about them that include the content.

“As this space evolves, we’re actively working to add more safeguards to help protect people,” Adriance said.

Activists would also like social media networks to do more. X already has policies in place prohibiting synthetic and manipulated media. Even so, such content regularly circulates among its users. Three hashtags for deepfaked video and imagery are tweeted dozens of times every day, according to data from Dataminr, a company that monitors social media for breaking news. Between the first and second quarter of 2023, the quantity of tweets from eight hashtags associated with this content increased 25% to 31,400 tweets, according to the data.

X did not respond to a request for comment.

Deepfake websites also rely on big tech companies to provide them with basic web infrastructure. According to a Bloomberg review, 13 of the top 20 deepfake websites are currently using web hosting services from Cloudflare Inc. to stay online. Amazon.com Inc. provides web hosting services for three popular deepfaking tools listed on several websites, including Deepswap.ai. Past public pressure campaigns have successfully convinced web services companies, including Cloudflare, to stop working with controversial sites, ranging from 8Chan to Kiwi Farms. Advocates hope that stepped-up pressure against companies hosting deepfake porn sites and tools might achieve a similar outcome.
 

Cloudflare did not respond to a request for comment. An Amazon Web Services spokesperson referred to the company’s terms of service, which disallow illegal or harmful content, and asked people who see such material to report it to the company.

Recently, the tools used to create deepfakes have grown both more powerful and more accessible. Photorealistic face-swapping images can be generated on demand using tools from companies like Stability AI, maker of the model Stable Diffusion. Because the model is open-source, any developer can download and tweak the code for myriad purposes, including creating realistic adult pornography. Web forums catering to deepfake pornography creators are full of people trading tips on how to create such imagery using an earlier release of Stability AI’s model.

Emad Mostaque, CEO of Stability AI, called such misuse “deeply regrettable” and referred to the forums as “abhorrent.” Stability has put some guardrails in place, he said, including prohibiting porn from being used in the training data for the AI model.

“What bad actors do with any open source code can’t be controlled, however there is a lot more than can be done to identify and criminalize this activity,” Mostaque said via email. “The community of AI developers as well as infrastructure partners that support this industry need to play their part in mitigating the risks of AI being misused and causing harm.”

Hany Farid, a professor at the University of California at Berkeley, said that the makers of technology tools and services should specifically disallow deepfake materials in their terms of service.

“We have to start thinking differently about the responsibilities of technologists developing the tools in the first place,” Farid said.

While many of the apps that creators and users of deepfake pornography websites recommend for creating deepfake pornography are web-based, some are readily available in the mobile storefronts operated by Apple Inc. and Alphabet Inc.’s Google. Four of these mobile apps have received between 1 million and 100 million downloads in the Google Play store. One, FaceMagic, has displayed ads on porn websites, according to a report in VICE.


Henry Ajder, a deepfakes researcher, said that apps frequently used to target women online are often marketed innocuously as tools for AI photo animation or photo-enhancing. “It’s an extensive trend that easy-to-use tools you can get on your phone are directly related to more private individuals, everyday women, being targeted,” he said.

Growth In Visits To Deepfake Sites​

Traffic to top 20 deepfake sites between July 2020 and July 2023
Source: Similarweb

FaceMagic did not respond to a request for comment. Apple said it tries to ensure the trust and safety of its users and that under its guidelines, services which end up being used primarily for consuming or distributing pornographic content are strictly prohibited from its app store. Google said that apps attempting to threaten or exploit people in a sexual manner aren’t allowed under its developer policies.

Mrdeepfakes.com users recommend DeepFaceLab, an AI-powered tool hosted on Microsoft Corp.’s GitHub, for creating nonconsensual pornographic content. The cloud-based platform for software development also currently offers several other tools that are frequently recommended on deepfake websites and forums, including one that until mid-August showed a woman naked from the chest up whose face is swapped with another woman’s. That app has received nearly 20,000 “stars” on GitHub. Its developers removed the video, and discontinued the project this month after Bloomberg reached out for comment.

A GitHub spokesperson said the company condemns “using GitHub to post sexually obscene content,” and the company’s policies for users prohibit this activity. The spokesperson added that the company conducts “some proactive screening for such content, in addition to actively investigating abuse reports,” and that GitHub takes action “where content violates our terms.”

Bloomberg analyzed hundreds of crypto wallets associated with deepfake creators, who apparently make money by selling access to libraries of videos, through donations, or by charging clients for customized content. These wallets regularly receive hundred-dollar transactions, potentially from paying customers. Forum users who create deepfakes recommend web-based tools that accept payments via mainstream processors, including PayPal Holdings Inc., Mastercard Inc. and Visa Inc. — another potential point of pressure for activists looking to stanch the flow of deepfakes.

Mastercard spokesperson Seth Eisen said the company’s standards do not permit nonconsensual activity, including such deepfake content. Spokespeople from PayPal and Visa did not provide comment.

Until mid-August, membership platform Patreon supported payment for one of the largest nudifying tools, which accepted over $12,500 every month from Patreon subscribers. Patreon suspended the account after Bloomberg reached out for comment.

Patreon spokesperson Laurent Crenshaw said the company has “zero tolerance for pages that feature non-consensual intimate imagery, as well as for pages that encourage others to create non-consensual intimate imagery.” Crenshaw added that the company is reviewing its policies “as AI continues to disrupt many areas of the creator economy.”

Carrie Goldberg, an attorney who specializes, in part, in cases involving the nonconsensual sharing of sexual materials, said that ultimately it’s the tech platforms who hold sway over the impact of deepfake pornography on its victims.

“As technology has infused every aspect of our life, we’ve concurrently made it more difficult to hold anybody responsible when that same technology hurts us,” Goldberg said.

— With assistance by Madeline Campbell and Rachael Dottle
 


Thanks to AK (@_akhaliq) for the post.
🔥 Excited to introduce OmniQuant - An advanced open-source algorithm for compressing large language models!
📜 Paper: arxiv.org/abs/2308.13137
🔗 Code: github.com/OpenGVLab/OmniQua…
💡 Key Features:
🚀 Omnidirectional Calibration: Enables easier weight and activation quantization through block-wise differentiation.
🛠 Diverse Precisions: Supports both weight-only quantization (W4A16/W3A16/W2A16) and weight-activation quantization (W6A6, W4A4).
⚡ Efficient: Quantize the LLaMA-2 family (7B-70B) in just 1 to 16 hours using 128 samples.
🤖 LLM Models: Works with diverse model families, including OPT, WizardLM @WizardLM_AI, LLaMA, LLaMA-2, and LLaMA-2-chat.
🔑 Deployment: Offers out-of-the-box deployment cases for GPUs and mobile phones.
🏃 Coming Soon: Multi-modal models and CodeLLaMA quantization!

AK
@_akhaliq
Aug 28
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

paper page: huggingface.co/papers/2308.1…

Large language models (LLMs) have revolutionized natural language processing tasks. However, their practical deployment is hindered by their immense memory and computation requirements. Although recent post-training quantization (PTQ) methods are effective in reducing memory footprint and improving the computational efficiency of LLM, they hand-craft quantization parameters, which leads to low performance and fails to deal with extremely low-bit quantization. To tackle this issue, we introduce an Omnidirectionally calibrated Quantization (OmniQuant) technique for LLMs, which achieves good performance in diverse quantization settings while maintaining the computational efficiency of PTQ by efficiently optimizing various quantization parameters. OmniQuant comprises two innovative components including Learnable Weight Clipping (LWC) and Learnable Equivalent Transformation (LET). LWC modulates the extreme values of weights by optimizing the clipping threshold. Meanwhile, LET tackles activation outliers by shifting the challenge of quantization from activations to weights through a learnable equivalent transformation. Operating within a differentiable framework using block-wise error minimization, OmniQuant can optimize the quantization process efficiently for both weight-only and weight-activation quantization. For instance, the LLaMA-2 model family with the size of 7-70B can be processed with OmniQuant on a single A100-40G GPU within 1-16 hours using 128 samples. Extensive experiments validate OmniQuant's superior performance across diverse quantization configurations such as W4A4, W6A6, W4A16, W3A16, and W2A16. Additionally, OmniQuant demonstrates effectiveness in instruction-tuned models and delivers notable improvements in inference speed and memory reduction on real devices.
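To make the "shifting the challenge from activations to weights" idea concrete, here is a small sketch of a channel-wise scaling that leaves a layer's output unchanged: divide each activation channel by a factor and multiply the matching weight row by the same factor. This is only a hand-rolled illustration. The scale values are hand-picked rather than learned, the helper names are hypothetical, and the paper's full transformation may include further terms not shown here.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_equivalent_scaling(x, w, s):
    """Divide activation channel j by s[j]; multiply weight row j by s[j]."""
    x_scaled = [[v / s[j] for j, v in enumerate(row)] for row in x]
    w_scaled = [[v * s[j] for v in row] for j, row in enumerate(w)]
    return x_scaled, w_scaled


def max_abs(matrix):
    return max(abs(v) for row in matrix for v in row)


# Made-up activations: channel 1 carries outliers that are hard to quantize.
x = [[0.5, 40.0], [-0.25, -32.0]]
w = [[0.2, -0.1], [0.05, 0.3]]
s = [1.0, 16.0]  # hand-picked here; a learned quantity in the real method

x_scaled, w_scaled = apply_equivalent_scaling(x, w, s)
original = matmul(x, w)
transformed = matmul(x_scaled, w_scaled)
print(max_abs(x), max_abs(x_scaled))  # the activation range shrinks
print(original, transformed)          # the layer output stays the same
```

The activations become easier to quantize because their range is smaller, while the weights absorb the extra scale.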




This app includes three models: LLaMa-2-7B-Chat-Omniquant-W3A16g128asym, LLaMa-2-13B-Chat-Omniquant-W3A16g128asym, and LLaMa-2-13B-Chat-Omniquant-W2A16g128asym. They require at least 4.5G, 7.5G, and 6.0G of free RAM, respectively. Note that 2-bit quantization performs worse than 3-bit quantization, as shown in our paper. The 2-bit model is included only as an extreme exploration of deploying LLMs on mobile phones. Currently, this app is in its demo phase and may experience slower response times, so please wait patiently for responses to generate. We have tested this app on a Redmi Note 12 Turbo (Snapdragon 7+ Gen 2 and 16G RAM). Some examples are provided below:

  • LLaMa-2-7B-Chat-Omniquant-W3A16g128asym

  • LLaMa-2-13B-Chat-Omniquant-W3A16g128asym

  • LLaMa-2-13B-Chat-Omniquant-W2A16g128asym

We have also tested this app on an iPhone 14 Pro (A16 Bionic and 6G RAM). Some examples are provided below:

  • LLaMa-2-7B-Chat-Omniquant-W3A16g128asym
 
Bing chat summary:


The paper is about how to make large language models (LLMs) faster and smaller. LLMs are computer programs that can understand and generate natural language, such as English or French. They are very powerful and can do many things, such as writing stories, answering questions, and translating texts. However, they are also very big and slow, because they have a lot of parameters (numbers) that need to be stored and calculated. For example, one of the biggest LLMs, called GPT-3, has 175 billion parameters and needs 350 GB of memory to load them. That is like having 350 books full of numbers!
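The 350 GB figure checks out if you assume each parameter is stored in 16 bits (2 bytes). The summary does not state the precision, so that is an assumption. A quick back-of-the-envelope sketch, with a hypothetical helper name:

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


gpt3_params = 175e9  # parameter count quoted above

fp16_gb = weight_memory_gb(gpt3_params, 16)  # assumed 16-bit storage
int4_gb = weight_memory_gb(gpt3_params, 4)   # same weights at 4 bits, for comparison

print(fp16_gb)  # 350.0
print(int4_gb)  # 87.5
```

This counts only the stored weights; activations and other runtime buffers need additional memory.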

One way to make LLMs faster and smaller is to use quantization. Quantization is a technique that reduces the number of bits (zeros and ones) that are used to represent each parameter. For example, instead of using 16 bits to store a parameter, we can use only 4 bits. This way, we can fit more parameters in the same amount of memory and also make the calculations faster. However, quantization also has a downside: it can make the LLM less accurate, because we lose some information when we use fewer bits.
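As a rough illustration of that trade-off, the sketch below applies plain round-to-nearest uniform quantization to a few made-up weights and measures how far the recovered values drift from the originals. This is the simple baseline, not OmniQuant itself, and all names and numbers are invented for the example.

```python
def quantize_dequantize(values, bits):
    """Snap each value to the nearest of 2**bits evenly spaced levels."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # nothing to quantize if all values are equal
    steps = 2 ** bits - 1
    scale = (hi - lo) / steps
    recovered = []
    for v in values:
        code = round((v - lo) / scale)     # integer code in [0, steps]
        recovered.append(lo + code * scale)  # value rebuilt from the code
    return recovered


def mean_abs_error(original, approx):
    return sum(abs(a - b) for a, b in zip(original, approx)) / len(original)


# Made-up weights standing in for real model parameters.
weights = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.27, 0.49, 0.9]

err4 = mean_abs_error(weights, quantize_dequantize(weights, 4))
err2 = mean_abs_error(weights, quantize_dequantize(weights, 2))
print(err4, err2)  # fewer bits means a coarser grid and a larger error
```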

The paper proposes a new method for quantization, called OmniQuant, that tries to minimize the loss of accuracy while maximizing the speed and memory benefits. OmniQuant has two main features: Learnable Weight Clipping (LWC) and Learnable Equivalent Transformation (LET). LWC adjusts the range of values that each parameter can have, so that they can be represented with fewer bits without losing too much information. LET changes the way that the LLM processes the input words, so that it can handle more variations in the input without affecting the output.
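Here is a toy sketch of the idea behind adjusting the value range: when a few extreme weights stretch the range, shrinking it by a clipping factor can lower the overall error even though the extremes get cut off. Note the simplifications, all of which are assumptions for illustration: a small grid search stands in for learning the threshold, the error is measured on the weights directly rather than on a block's output, and the data and function names are invented.

```python
def quantize_with_clipping(values, bits, gamma):
    """Quantize after shrinking the min/max range by a factor gamma in (0, 1]."""
    lo, hi = gamma * min(values), gamma * max(values)
    steps = 2 ** bits - 1
    scale = (hi - lo) / steps
    recovered = []
    for v in values:
        clipped = min(max(v, lo), hi)  # values outside the range saturate
        code = round((clipped - lo) / scale)
        recovered.append(lo + code * scale)
    return recovered


def mean_sq_error(original, approx):
    return sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)


def clipping_error(values, bits, gamma):
    return mean_sq_error(values, quantize_with_clipping(values, bits, gamma))


def best_gamma(values, bits, candidates):
    """Grid search standing in for a learned clipping threshold."""
    return min(candidates, key=lambda g: clipping_error(values, bits, g))


# Made-up weights: a dense bulk in [-0.2, 0.2] plus two outliers.
weights = [i / 1000 for i in range(-200, 201)] + [-0.5, 0.5]
candidates = [g / 10 for g in range(3, 11)]  # 0.3, 0.4, ..., 1.0

gamma = best_gamma(weights, 3, candidates)
err_clipped = clipping_error(weights, 3, gamma)
err_full = clipping_error(weights, 3, 1.0)
print(gamma, err_clipped, err_full)
```

On this toy data the search prefers a clipped range over the full min/max range, which is the effect the summary is describing.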

The paper shows that OmniQuant can achieve very good results with different settings of quantization, such as using 4 bits for both parameters and inputs, or using 2 bits for parameters and 16 bits for inputs. OmniQuant can also work well with different types of LLMs, such as those that are tuned for specific tasks or domains. The paper also demonstrates that OmniQuant can make the LLMs much faster and smaller on real devices, such as smartphones or tablets.
 