bnew

Veteran
Joined
Nov 1, 2015
Messages
56,135
Reputation
8,249
Daps
157,853

Google admits its AI Overviews need work, but we’re all helping it beta test​

Sarah Perez

12:54 PM PDT • May 31, 2024

Image Credits: Google

Google is embarrassed about its AI Overviews, too. After a deluge of dunks and memes over the past week, which cracked on the poor quality and outright misinformation that arose from the tech giant’s underbaked new AI-powered search feature, the company on Thursday issued a mea culpa of sorts. Google — a company whose name is synonymous with searching the web — whose brand focuses on “organizing the world’s information” and putting it at users’ fingertips — actually wrote in a blog post that “some odd, inaccurate or unhelpful AI Overviews certainly did show up.”

That’s putting it mildly.

The admission of failure, penned by Google VP and Head of Search Liz Reid, reads as testimony to how the drive to mash AI technology into everything has somehow made Google Search worse.

In the post, titled “About last week” (this got past PR?), Reid spells out the many ways its AI Overviews make mistakes. While they don’t “hallucinate” or make things up the way that other large language models (LLMs) may, she says, they can get things wrong for “other reasons,” like “misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available.”

Reid also noted that some of the screenshots shared on social media over the past week were faked, while others were for nonsensical queries, like “How many rocks should I eat?” — something no one ever really searched for before. Since there’s little factual information on this topic, Google’s AI guided a user to satirical content. (In the case of the rocks, the satirical content had been published on a geological software provider’s website.)

It’s worth pointing out that if you had Googled “How many rocks should I eat?” and were presented with a set of unhelpful links, or even a jokey article, you wouldn’t be surprised. What people are reacting to is the confidence with which the AI spouted back that “geologists recommend eating at least one small rock per day” as if it’s a factual answer. It may not be a “hallucination,” in technical terms, but the end user doesn’t care. It’s insane.

What’s unsettling, too, is that Reid claims Google “tested the feature extensively before launch,” including with “robust red-teaming efforts.”

Does no one at Google have a sense of humor then? No one thought of prompts that would generate poor results?

In addition, Google downplayed the AI feature’s reliance on Reddit user data as a source of knowledge and truth. Although people have regularly appended “Reddit” to their searches for so long that Google finally made it a built-in search filter, Reddit is not a body of factual knowledge. And yet the AI would point to Reddit forum posts to answer questions, without an understanding of when first-hand Reddit knowledge is helpful and when it is not — or worse, when it is a troll.

Reddit today is making bank by offering its data to companies like Google, OpenAI and others to train their models, but that doesn’t mean users want Google’s AI deciding when to search Reddit for an answer, or suggesting that someone’s opinion is a fact. There’s nuance to learning when to search Reddit and Google’s AI doesn’t understand that yet.

As Reid admits, “forums are often a great source of authentic, first-hand information, but in some cases can lead to less-than-helpful advice, like using glue to get cheese to stick to pizza,” referencing one of the AI feature’s more spectacular failures over the past week.

Google AI overview suggests adding glue to get cheese to stick to pizza, and it turns out the source is an 11 year old Reddit comment from user F*cksmith 😂 pic.twitter.com/uDPAbsAKeO

— Peter Yang (@petergyang) May 23, 2024

If last week was a disaster, though, at least Google is iterating quickly as a result — or so it says.

The company says it’s looked at examples from AI Overviews and identified patterns where it could do better, including building better detection mechanisms for nonsensical queries, limiting the use of user-generated content for responses that could offer misleading advice, adding triggering restrictions for queries where AI Overviews were not helpful, not showing AI Overviews for hard news topics, “where freshness and factuality are important,” and adding additional triggering refinements to its protections for health searches.
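Google hasn’t published how any of these mechanisms work. As a rough mental model, though, they all amount to a policy gate that runs before an Overview is shown. Here’s a toy sketch of that idea — every rule, keyword list, and source label below is invented for illustration, not Google’s actual logic:

```python
# Invented keyword lists standing in for real topic classifiers.
HARD_NEWS = {"election", "war", "shooting"}
HEALTH_TERMS = {"dosage", "symptoms", "treatment"}

def should_show_overview(query, source_types):
    """Decide whether to trigger an AI Overview for this query.
    source_types labels the retrieved results, e.g. "news", "forum"."""
    words = set(query.lower().split())
    # Hard news: freshness and factuality matter too much to summarize.
    if words & HARD_NEWS:
        return False
    # Health searches get stricter sourcing: no forum-only answers.
    if words & HEALTH_TERMS and all(s == "forum" for s in source_types):
        return False
    # Crude nonsense check: satirical or answerless queries tend to
    # surface nothing but satire and forum posts.
    if source_types and all(s in {"satire", "forum"} for s in source_types):
        return False
    return True
```

The real systems are presumably learned classifiers rather than keyword sets, but the shape — a gate that vetoes the Overview rather than fixing its answer — matches what Google describes.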

With AI companies building ever-improving chatbots every day, the question is not whether they will ever outperform Google Search at helping us understand the world’s information, but whether Google Search will ever get up to speed on AI to challenge them in return.

As ridiculous as Google’s mistakes may be, it’s too soon to count it out of the race — especially given the massive scale of Google’s beta-testing crew, which is essentially anybody who uses search.

“There’s nothing quite like having millions of people using the feature with many novel searches,” says Reid.
 


Discord has become an unlikely center for the generative AI boom​

Amanda Silberling

6:41 AM PDT • May 29, 2024

Joaquin Phoenix as The Joker
Image Credits: Warner Bros

In the video, a crowd is roaring at a packed summer music festival. As a beat starts playing over the speakers, the performer finally walks onstage: It’s the Joker. Clad in his red suit, green hair and signature face paint, the Joker pumps his fist and dances across the stage, hopping down a runway to get even closer to his sea of fans. When it’s time to start rapping, the Joker flexes his knees and propels himself off the ground, bouncing up and down before doing a 360 turn on one foot. It looks effortless, and yet if you attempted the maneuver, you’d fall flat on your face. The Joker has never been this cool.

Then there’s another video, where NBA All-Star Joel Embiid struts out from backstage to greet the crowd before nailing those same dance moves. Then, it’s “Curb Your Enthusiasm” star Larry David. But in each of these scenes, something is a bit off — whether it’s the Joker, Joel Embiid or Larry David, the performer’s body is shaky, while their facial expressions never change.

Of course, this is all AI-generated, thanks to a company called Viggle.

The original video shows the rapper Lil Yachty taking the stage at the Summer Smash Festival in 2021 — according to the title of a YouTube video with more than 6.5 million views, this entrance is “the HARDEST walk out EVER.” This turned into a trending meme format in April, as people inserted their favorite celebrities — or their favorite villains, like Sam Bankman-Fried — into the video of Lil Yachty taking the stage.



Text-to-video AI offerings are getting scarily good, but you can’t type “sam bankman-fried as lil yachty at the 2021 summer smash” and expect Sora to know precisely what you mean. Viggle works differently.

On Viggle’s Discord server, users upload a video of someone doing some sort of movement — often a TikTok dance — and a photo of a person. Then, Viggle creates a video of that person replicating the movements from the video. It’s obvious that these videos aren’t real, though they’re still entertaining. But after the Lil Yachty meme went viral, Viggle got hot, and the hype hasn’t subsided.

“We’re focusing on building what we call the controllable video generation model,” Viggle founder Hang Chu told TechCrunch. “When we generate content, we want to control precisely how the character moves, or how the scene looks. But the current tools only focus on the text-to-video side, where the text itself is not sufficient to specify all the visual subtlety.”
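At a toy level, the “controllable” part means the motion signal is extracted from a driving video and re-applied to a new subject, rather than guessed from text. The sketch below illustrates that re-targeting idea with invented 2D keypoints — Viggle’s actual system is a learned video model, nothing like this simple:

```python
def transfer_motion(base_pose, driving_frames):
    """Re-target motion: for each frame of the driving clip, apply its
    per-joint displacement (relative to the clip's first frame) to the
    new subject's base pose."""
    first = driving_frames[0]
    out = []
    for frame in driving_frames:
        out.append({
            joint: (base_pose[joint][0] + frame[joint][0] - first[joint][0],
                    base_pose[joint][1] + frame[joint][1] - first[joint][1])
            for joint in base_pose
        })
    return out

# A 2-joint "skeleton" for the new subject, and a 3-frame driving clip.
base = {"head": (0.0, 2.0), "hip": (0.0, 1.0)}
clip = [
    {"head": (5.0, 2.0), "hip": (5.0, 1.0)},
    {"head": (6.0, 2.2), "hip": (5.5, 1.0)},
    {"head": (7.0, 2.0), "hip": (6.0, 1.0)},
]
frames = transfer_motion(base, clip)  # the subject now follows the clip's motion
```

This is why the photo subject’s body looks shaky while the dance itself is faithful: the motion comes straight from the source clip, and only the appearance is synthesized.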

According to Chu, Viggle has two main types of users — while some people are making memes, others are using the product as a tool in the production process for game design and VFX.

“For example, a team of animation engineers could take some concept designs and quickly turn them into rough, but quick animation assets,” Chu said. “The whole purpose is to see how they look and feel in the rough sketch of the final plan. This usually takes days, or even weeks for them to manually set up, but with Viggle, this can basically be done instantly and automatically. This saves tons of tedious, repetitive modeling work.”

In March, Viggle’s Discord had a few thousand members. By mid-May, there were 1.8 million members, and with June just days away, Viggle’s server has climbed to over 3 million members. That makes it larger than the servers for games like Valorant and Genshin Impact combined.

Viggle’s growth shows no sign of slowing down, except that the high demand for video generation has made wait times a bit too long for impatient users. But since Viggle is so Discord-centric, Discord’s developer team has worked directly with Viggle to guide the two-year-old startup through its speedy growth.

Fortunately for Viggle, Discord has been through this before. Midjourney, which also operates on Discord, has 20.3 million members on its server, making it the largest single community on the platform. Overall, Discord has about 200 million monthly users.

Image Credits: Viggle/Discord

“No one’s ready for that type of growth, so in that virality stage, we start to work with them, because they’re not ready,” Discord’s VP of Product Ben Shanken told TechCrunch. “We have to be ready, because a huge part of the messages being sent right now are Viggle and Midjourney, and a lot of consumption and usage on Discord is actually generative AI.”

For startups like Viggle and Midjourney, building their apps on Discord means they don’t have to build out a whole platform for their users — instead, they’re hosted on a platform that already has a tech-savvy audience, as well as built-in content moderation tools. For Viggle, which has just 15 employees, the support of Discord is crucial.

“We can focus on building the model as the back-end service, while Discord can utilize their infrastructure on the front end, and basically we can iterate faster,” Chu said.

Before Viggle, Chu was an AI researcher at Autodesk, a 3D tools giant. He also did research for companies like Facebook, Nvidia and Google.

For Discord, acting as an accidental SaaS company for AI startups could come at a cost. On one hand, these apps bring a new audience to Discord, and they’re probably good for user metrics. But hosting so much video can be difficult and costly on the tech side, especially when other users across the platform are streaming live video games, video chatting and voice calling. Without a platform like Discord, though, these startups might not be able to grow at the same rate.

“It’s not easy for any type of company to scale, but Discord is built for that type of scale, and we’re able to help them absorb that pretty well,” Shanken said.

While these companies can just adopt Discord’s own content guidelines and use its content moderation apps, it will always be a challenge to make sure that 3 million people are behaving. Even those Lil Yachty walk-out memes technically violate Viggle’s rules, which encourage users to avoid generating images of real people — including celebrities — without their consent.

For now, Viggle’s saving grace could be that its output isn’t 100% realistic yet. The tech is truly impressive, but we know better. That janky Joker animation definitely isn’t real, but it sure is funny.
 


Anthropic hires former OpenAI safety lead to head up new team​

Kyle Wiggers

10:24 AM PDT • May 28, 2024

Anthropic Claude logo
Image Credits: Anthropic

Jan Leike, a leading AI researcher who earlier this month resigned from OpenAI before publicly criticizing the company’s approach to AI safety, has joined OpenAI rival Anthropic to lead a new “superalignment” team.

In a post on X, Leike said that his team at Anthropic will focus on various aspects of AI safety and security, specifically “scalable oversight,” “weak-to-strong generalization” and automated alignment research.



A source familiar with the matter tells TechCrunch that Leike will report directly to Jared Kaplan, Anthropic’s chief science officer, and that Anthropic researchers currently working on scalable oversight — techniques to control large-scale AI’s behavior in predictable and desirable ways — will move to report to Leike as Leike’s team spins up.



In many ways, Leike’s team sounds similar in mission to OpenAI’s recently dissolved Superalignment team. The Superalignment team, which Leike co-led, had the ambitious goal of solving the core technical challenges of controlling superintelligent AI in the next four years, but often found itself hamstrung by OpenAI’s leadership.

Anthropic has often attempted to position itself as more safety-focused than OpenAI.

Anthropic’s CEO, Dario Amodei, was once the VP of research at OpenAI and reportedly split with OpenAI after a disagreement over the company’s direction — namely OpenAI’s growing commercial focus. Amodei brought with him a number of ex-OpenAI employees to launch Anthropic, including OpenAI’s former policy lead Jack Clark.
 


The rise of ChatCCP​


Inside China's terrifying plan for the future of AI


Xi Jinping standing on a pile of microchips with the Chinese stars on top

China's push to develop its AI industry could usher in a dystopian era of division unlike any we have ever seen before.
Kiran Ridley/Stringer/Getty, jonnysek/Getty, Tyler Le/BI

Linette Lopez
Jun 2, 2024, 5:57 AM EDT

For technology to change the global balance of power, it needn't be new. It must simply be known.

Since 2017, the Chinese Communist Party has laid out careful plans to eventually dominate the creation, application, and dissemination of generative artificial intelligence — programs that use massive datasets to train themselves to recognize patterns so quickly that they appear to produce knowledge from nowhere. According to the CCP's plan, by 2020, China was supposed to have "achieved iconic advances in AI models and methods, core devices, high-end equipment, and foundational software." But the release of OpenAI's ChatGPT in fall 2022 caught Beijing flat-footed. The virality of ChatGPT's launch made clear that US companies — at least for the moment — were leading the AI race and threw a great-power competition that had been conducted in private into the open for all the world to see.

There is no guarantee that America's AI lead will last forever. China's national tech champions have joined the fray and managed to twist a technology that feeds on freewheeling information to fit neatly into China's constrained information bubble. Censorship requirements may slow China's AI development and limit the commercialization of domestic models, but they will not stop Beijing from benefiting from AI where it sees fit. China's leader, Xi Jinping, sees technology as the key to shaking his country out of its economic malaise. And even if China doesn't beat the US in the AI race, there's still great power, and likely danger, in it taking second place.

"There's so much we can do with this technology. Beijing's just not encouraging consumer-facing interactions," Reva Goujon, a director for client engagement on the consulting firm Rhodium Group's China advisory team, said. "Real innovation is happening in China. We're not seeing a huge gap between the models Chinese companies have been able to roll out. It's not like all these tech innovators have disappeared. They're just channeling applications to hard science."

In its internal documents, the CCP says that it will use AI to shape reality and tighten its grip on power within its borders — for political repression, surveillance, and monitoring dissent. We know that the party will also use AI to drive breakthroughs in industrial engineering, biotechnology, and other fields the CCP considers productive. In some of these use cases, it has already seen success. So even if it lags behind US tech by a few years, it can still have a powerful geopolitical impact. There are many like-minded leaders who also want to use the tools of the future to cement their authority in the present and distort the past. Beijing will be more than happy to facilitate that for them. China's vision for the future of AI is closed-sourced, tightly controlled, and available for export all around the world.



In the world of modern AI, the technology is only as good as what it eats. ChatGPT and other large language models gorge on scores of web pages, news articles, and books. Sometimes this information gives the LLMs food poisoning — anyone who has played with a chatbot knows they sometimes hallucinate or tell lies. Given the size of the tech's appetite, figuring out what went wrong is much more complex than narrowing down the exact ingredient in your dinner that had you hugging your toilet at 2 a.m. AI datasets are so vast, and the calculations so fast, that the companies controlling the models do not know why they spit out bad results, and they may never know. In a society like China — where information is tightly controlled — this inability to understand the guts of the models poses an existential problem for the CCP's grip on power: A chatbot could tell an uncomfortable truth, and no one will know why. The likelihood of that happening depends on the model it's trained on. To prevent this, Beijing is feeding AI with information that encourages positive "social construction."

China's State Council wrote in its 2017 Next Generation Artificial Intelligence Development Plan that AI would be able to "grasp group cognition and psychological changes in a timely manner," which, in turn, means the tech could "significantly elevate the capability and level of social governance, playing an irreplaceable role in effectively maintaining social stability." That is to say, if built to the correct specifications, the CCP believes AI can be a tool to fortify its power. That is why this month, the Cyberspace Administration of China, the country's AI regulator, launched a chatbot trained entirely on Xi's political and economic philosophy, "Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era" (snappy name, I know). Perhaps it goes without saying that ChatGPT is not available for use in China or Hong Kong.


Xi Jinping

The government of China has launched a chatbot trained entirely on Xi Jinping's political and economic philosophy. Xie Huanchi/Xinhua via Getty Images

For the CCP, finding a new means of mass surveillance and information domination couldn't come at a better time. Consider the Chinese economy. Wall Street, Washington, Brussels, and Berlin have accepted that the model that helped China grow into the world's second-largest economy has been worn out and that Beijing has yet to find anything to replace it. Building out infrastructure and industrial capacity no longer provides the same bang for the CCP's buck. The world is pushing back against China's exports, and the CCP's attempts to drive growth through domestic consumption have gone pretty much nowhere. The property market is distorted beyond recognition, growth has plateaued, and deflation is lingering like a troubled ghost. According to Freedom House, a human-rights monitor, Chinese people demonstrated against government policies in record numbers during the fourth quarter of 2023. The organization logged 952 dissent events, a 50% increase from the previous quarter. Seventy-eight percent of the demonstrations involved economic issues, such as housing or labor. If there's a better way to control people, Xi needs it now.

Ask the Cyberspace Administration of China's chatbot about these economic stumbles, and you'll just get a lecture on the difference between "traditional productive forces" and "new productive forces" — buzzwords the CCP uses to blunt the trauma of China's diminished economic prospects. In fact, if you ask any chatbot operating in the country, it will tell you that Taiwan is a part of China (a controversial topic outside the country, to say the least). All chatbots collect information on the people who use them and the questions they ask. The CCP's elites will be able to use that information gathering and spreading to their advantage politically and economically — but the government doesn't plan to share that power with regular Chinese people. What the party sees will not be what the people see.

"The Chinese have great access to information around the world," Kenneth DeWoskin, a professor emeritus at the University of Michigan and senior China advisor to Deloitte, told me. "But it's always been a two-tiered information system. It has been for 2,000 years."

To ensure this, the CCP has constructed a system to regulate AI that is both flexible enough to evaluate large language models as they are created and draconian enough to control their outputs. Any AI disseminated for public consumption must be registered and approved by the CAC. Registration involves telling the administration things like which datasets the AI was trained on and what tests were run on it. The point is to set up controls that embrace some aspects of AI, while — at least ideally — giving the CCP final approval on what it can and cannot create.

"The real challenge of LLMs is that they are really the synthesis of two things," Matt Sheehan, a researcher and fellow at the Carnegie Endowment for International Peace, told me. "They might be at the forefront of productivity growth, but they're also fundamentally a content-based system, taking content and spitting out content. And that's something the CCP considers frivolous."
 

In the past few years, the party has shown that it can be ruthless in cutting out technology it considers "frivolous" or harmful to social cohesion. In 2021, it barred anyone under 18 from playing video games on the weekdays, paused the approval of new games for eight months, and then in 2023 announced rules to reduce the public's spending on video games.

But AI is not simply entertainment — it's part of the future of computation. The CCP cannot deny the virality of what OpenAI's chatbot was able to achieve, its power in the US-China tech competition, or the potential for LLMs to boost economic growth and political power through lightning-speed information synthesis.

Ultimately, as Sheehan put it, the question is: "Can they sort of lobotomize AI and LLMs to make the information part a nonfactor?"

Unclear, but they're sure as hell going to try.



For the CCP to actually have a powerful AI to control, the country needs to develop models that suit its purpose — and it's clear that China's tech giants are playing catch-up.

The search giant Baidu claims that its chatbot, Ernie Bot — which was released to the public in August — has 200 million users and 85,000 enterprise clients. To put that in perspective, OpenAI generated 1.86 billion visits in March alone. There's also the Kimi chatbot from Moonshot AI, a startup backed by Alibaba that launched in October. But both Ernie Bot and Kimi were only recently overshadowed by ByteDance's Doubao bot, which also launched in August. According to Bloomberg, it's now the most downloaded bot in the country, and it's obvious why — Doubao is cheaper than its competitors.

"The generative-AI industry is still in its early stages in China," Paul Triolo, a partner for China and technology policy at the consultancy Albright Stonebridge Group, said. "So you have this cycle where you invest in infrastructure, train, and tweak models, get feedback, then you make an app that makes money. Chinese companies are now in the training and tweaking models phase."

The question is which of these companies will actually make it to the moneymaking phase. The current price war is a race to the bottom, similar to what we've seen in the Chinese technology space before. Take the race to make electric vehicles: The Chinese government started by handing out cash to any company that could produce a design — and I mean any. It was a money orgy. Some of these cars never made it out of the blueprint stage. But slowly, the government stopped subsidizing design, then production. Then instead, it started to support the end consumer. Companies that couldn't actually make a car at a price point that consumers were willing to pay started dropping like flies. Eventually, a few companies started dominating the space, and now the Chinese EV industry is a manufacturing juggernaut.


Similar top-down strategies, like China's plan to advance semiconductor production, haven't been nearly as successful. Historically, DeWoskin told me, party-issued production mandates have "good and bad effects." They have the ability to get universities and the private sector in on what the state wants to do, but sometimes these actors move slower than the market. Up until 2022, everyone in the AI competition was most concerned about the size of models, but the sector is now moving toward innovation in the effectiveness of data training and generative capacity. In other words, sometimes the CCP isn't skating to where the puck's going to be but to where it is.

There are also signs that the definition of success is changing to include models with very specific purposes. OpenAI CEO Sam Altman said in a recent interview with the Brookings Institution that, for now, the models in most need of regulatory overhead are the largest ones. "But," he added, "I think progress may surprise us, and you can imagine smaller models that can do impactful things." A targeted model can have a specific business use case. After spending decades analyzing how the CCP molds the Chinese economy, DeWoskin told me that he could envision a world where some of those targeted models were available to domestic companies operating in China but not to their foreign rivals. After all, Beijing has never been shy about using a home-field advantage. Just ask Elon Musk.



To win the competition to build the most powerful AI in the world, China must combat not only the US but also its own instincts when it comes to technological innovation. A race to the bottom may simply beggar China's AI ecosystem. A rush to catch up to where the US already is — amid investor and government pressure to make money as soon as possible — may keep China's companies off the frontier of this tech.

"My base case for the way this goes forward is that maybe two Chinese entities push the frontier, and they get all the government support," Sheehan said. "But they're also burdened with dealing with the CCP and a little slower-moving."

This isn't to say we have nothing to learn from the way China is handling AI. Beijing has already set regulations for things like deepfakes and labeling around authenticity. Most importantly, China's system holds people accountable for what AI does — people make the technology, and people should have to answer for what it does. The speed of AI's development demands a dynamic, consistent regulatory system, and while China's checks go too far, the current US regulatory framework lacks systemization. The Commerce Department announced an initiative last month around testing models for safety, and that's a good start, but it's not nearly enough.


If China has taught us anything about technology, it's that it doesn't have to make society freer — it's all about the will of the people who wield it. The Xi Jinping Thought chatbot is a warning. If China can make one for itself, it can use that base model to craft similar systems for authoritarians who want to limit the information scape in their societies. Already, some Chinese AI companies — like the state-owned voice-recognition firm iFlytek — have been hit with US sanctions, in part, for using their technology to spy on the Uyghur population in Xinjiang. For some governments, it won't matter if tech this useful is two or three generations behind a US counterpart. As for the chatbots, the models won't contain the sum total of human knowledge, but they will serve their purpose: The content will be censored, and the checks back to the CCP will clear.

That is the danger of the AI race. Maybe China won't draw from the massive, multifaceted AI datasets that the West will — its strict limits on what can go into and come out of these models will prevent that. Maybe China won't be pushing the cutting edge of what AI can achieve. But that doesn't mean Beijing can't foster the creation of specific models that could lead to advancements in fields like hard sciences and engineering. It can then control who gets access to those advancements within its borders, not just people but also multinational corporations. It can sell tools of control, surveillance, and content generation to regimes that wish to dominate their societies and are antagonistic to the US and its allies.

This is an inflection point in the global information war. If social media harmfully siloed people into alternate universes, the Xi bot has demonstrated that AI can do that on steroids. It is a warning. The digital curtain AI can build in our imaginations will be much more impenetrable than iron, making it impossible for societies to cooperate in a shared future. Beijing is well aware of this, and it's already harnessing that power domestically; why not geopolitically? We need to think about all the ways Beijing can profit from AI now, before its machines are turned on the world. Stability and reality depend on it.
 




DeTikZify​

arXiv
Hugging Face
Colab

Creating high-quality scientific figures can be time-consuming and challenging, even though sketching ideas on paper is relatively easy. Furthermore, recreating existing figures that are not stored in formats preserving semantic information is equally complex. To tackle this problem, we introduce DeTikZify, a novel multimodal language model that automatically synthesizes scientific figures as semantics-preserving TikZ graphics programs based on sketches and existing figures. We also introduce an MCTS-based inference algorithm that enables DeTikZify to iteratively refine its outputs without the need for additional training.
...
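The MCTS details are in the paper; stripped to its essence, the refinement idea is iterate, score, keep the best. Below is a deliberately simplified stand-in: plain hill climbing over strings with an invented score, not DeTikZify's actual tree search, which operates on TikZ token sequences scored against the input sketch:

```python
import random

# Toy stand-in: "programs" are strings, and score() measures how close a
# candidate is to a target TikZ snippet. The target and alphabet are
# invented for this illustration.
TARGET = r"\draw (0,0) circle (1);"
ALPHABET = sorted(set(TARGET))

def score(candidate):
    # Fraction of character positions that match the target.
    return sum(a == b for a, b in zip(candidate, TARGET)) / len(TARGET)

def mutate(candidate, rng):
    # Replace one random character with a random alphabet character.
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]

def refine(initial, iterations=5000, seed=0):
    """Iterate, score, and keep only strict improvements."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    for _ in range(iterations):
        cand = mutate(best, rng)
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score
```

The real algorithm replaces this greedy loop with Monte Carlo tree search, which can back out of locally good but globally bad partial programs — the property that lets DeTikZify keep improving its output without additional training.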

Model Weights & Datasets​

We upload all our models and datasets to the Hugging Face Hub. However, please note that for the public release of the DaTikZv2 dataset, we had to remove a considerable portion of TikZ drawings originating from arXiv, as the arXiv non-exclusive license does not permit redistribution. We do, however, release our dataset creation scripts and encourage anyone to recreate the full version of DaTikZv2 themselves.


Acknowledgments​

The implementation of the model architecture is largely based on LLaVA. Our MCTS implementation takes heavy inspiration from VMCTS.


About​

Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
 


Apple brings ChatGPT to its apps, including Siri​

Kyle Wiggers

11:38 AM PDT • June 10, 2024

OpenAI and ChatGPT logos
Image Credits: Didem Mente/Anadolu Agency / Getty Images

Apple is bringing ChatGPT, OpenAI’s AI-powered chatbot experience, to Siri and other first-party apps and capabilities across its operating systems. The tech giant announced the news during a keynote at its annual Worldwide Developers Conference (WWDC) in Cupertino this morning.

“We’re excited to partner with Apple to bring ChatGPT to their users in a new way,” OpenAI CEO Sam Altman said in a statement. “Apple shares our commitment to safety and innovation, and this partnership aligns with OpenAI’s mission to make advanced AI accessible to everyone.”

Soon, Siri will be able to tap ChatGPT for “expertise” where it might be helpful, Apple says. For example, if you need menu ideas for a meal to make for friends using some ingredients from your garden, you can ask Siri, and Siri will automatically feed that info to ChatGPT for an answer after you give it permission to do so.
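The permission-gated handoff described above can be sketched as a simple routing function. Everything here is a hypothetical stand-in for illustration, not an Apple API:

```python
def siri_handle(query, needs_external_expertise, ask_permission, chatgpt):
    """Sketch of a permission-gated handoff: Siri answers directly unless
    outside expertise would help, and only forwards the query to ChatGPT
    after the user explicitly consents."""
    if not needs_external_expertise(query):
        return ("siri", f"Siri answers: {query}")
    if not ask_permission(query):
        return ("siri", "Okay, I won't send that to ChatGPT.")
    return ("chatgpt", chatgpt(query))

# Toy demo with stubbed components.
reply = siri_handle(
    "menu ideas with garden tomatoes",
    needs_external_expertise=lambda q: "menu" in q,
    ask_permission=lambda q: True,  # user taps "Allow"
    chatgpt=lambda q: f"ChatGPT suggests recipes for: {q}",
)
```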

wwdc24-siri-chatgpt-02.jpg

You can include photos with the questions you ask ChatGPT via Siri, or ask questions related to your docs or PDFs. Apple has also integrated ChatGPT into systemwide capabilities such as Writing Tools, which let you create content with ChatGPT, including images, or send an initial idea to ChatGPT and get a revision or variation back.

ChatGPT integrations will arrive on iOS 18, iPadOS 18, and macOS later this year, Apple says, and will be free without the need to create a ChatGPT or OpenAI account. Initially, they’ll be powered by GPT-4o, OpenAI’s recently introduced flagship generative AI model.

Subscribers to one of OpenAI’s ChatGPT premium plans will be able to access paid features within Siri and Apple’s other apps with ChatGPT integrations. Apple says that privacy protections are “built in” to the integrations — specifically, requests aren’t stored by OpenAI and users’ IP addresses are obscured.
 


'Apple Intelligence' Generative Personal AI Unveiled for iPhone, iPad, and Mac​

Monday June 10, 2024 11:13 am PDT by Tim Hardwick

Apple at WWDC today announced Apple Intelligence, a deeply integrated, personalized AI experience for Apple devices that uses cutting-edge generative artificial intelligence to enhance user experiences across iPhone, iPad, and Mac.

Apple WWDC24 Apple Intelligence hero 240610

Powered by large-language models (LLMs), Apple Intelligence allows your devices to understand languages, images, actions, and personal context.

The feature boasts deep natural language understanding, allowing it to power new writing tools that can be accessed system-wide across Apple platforms. Apple Intelligence allows you to rewrite or proofread text automatically across Mail, Notes, Safari, Pages, Keynote, and third-party apps.

Apple Intelligence is grounded in a user's personal information, allowing it to access data from your apps and what's on your screen. At the same time, thanks to on-device processing, Apple Intelligence is aware of your personal data without collecting your data. And when it needs more compute capacity, it sends only the data needed to fulfill the request to Apple servers based on Apple Silicon, which share the same privacy and security capabilities of your ‌iPhone‌.

At the heart of Apple Intelligence is Siri, which is now more natural, more contextually relevant, and more personal. The virtual assistant allows you to make corrections in real-time and maintains conversational context. ‌Siri‌ will also eventually have on-screen awareness, so if you ask it to do something related to what's going on in an app, it will know what you are talking about, without needing more information. Apple says ‌Siri‌ will be able to understand and take actions in more apps over time, and it includes a semantic index of photos, calendar events, and files.

‌Siri‌ will also help you with your writing. A new feature called Rewrite allows you to change the tone or proofread inline text in Mail and other apps.

Apple Intelligence will automatically summarize notifications, too. A new Focus mode called "Reduce Interruptions" can automatically look at notifications and messages to see if they're important enough to interrupt you.
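The "Reduce Interruptions" behavior amounts to a priority gate over incoming notifications. A minimal sketch, with `is_important` standing in for the on-device model's judgment (all names here are hypothetical):

```python
def reduce_interruptions(notifications, is_important):
    """Route each notification: important ones break through immediately,
    the rest are held for later review."""
    breakthrough, deferred = [], []
    for note in notifications:
        (breakthrough if is_important(note) else deferred).append(note)
    return breakthrough, deferred

# Toy demo with a keyword-based stand-in for the classifier.
notes = ["Flight gate changed to B7", "50% off shoes today", "Mom: call me ASAP"]
urgent = lambda n: "ASAP" in n or "Flight" in n
now, later = reduce_interruptions(notes, urgent)
```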

Apple Intelligence in iOS 18 requires an iPhone 15 Pro, while on macOS 15 and iPadOS 18 it requires a Mac or iPad powered by M1 Apple silicon or later.

Related Roundup: iOS 18

Tags: WWDC 2024, Apple Intelligence
 


1/1
Apple just announced a ton of incredible AI developments at WWDC.

The 11 most impressive reveals:

1. Using the iPad calculator as a notepad and getting real-time answers

1/10
3. Apple and OpenAI partner up to directly integrate ChatGPT into iOS 18, iPadOS 18, and macOS

2/10
4. Apple Intelligence will allow Siri to have on-screen awareness and take actions on a user's behalf

3/10
5. Tap to Cash: A new way to pay someone with another iPhone without a phone number or email

4/10
6. Apple Intelligence feature in Mail understands the content of emails and surfaces the most urgent messages to the top of inboxes

5/10
7. macOS 15 will let you mirror your iPhone from your Mac

6/10
8. Apple Genmoji: A new way to create emojis on-device using Apple Intelligence

7/10
9. Apple's new Image Playground allows users to generate images on-device and share them across apps

8/10
10. Since Apple Intelligence understands personal context, AI-generated images can be personalized to users

9/10
11. A new Siri animation

10/10
That's it for my favorite announcements from WWDC!

I share the latest news in AI and tech every day, follow me
@rowancheung to stay up to speed.

If you want to support my content, like/retweet the first tweet of this thread


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

GPvAUStXIAIa9zR.jpg
 


Apple leaps into AI with an array of upcoming iPhone features and a ChatGPT deal to smarten up Siri​

Monday's showcase seemed aimed at allaying concerns Apple might be losing its edge during the advent of AI technology.

ASSOCIATED PRESS / June 10, 2024

w=1880
Apple CEO Tim Cook speaks during an announcement of new products on the Apple campus in Cupertino, Calif., Monday, June 10, 2024. (AP Photo/Jeff Chiu)

CUPERTINO, Calif. (AP) — Apple jumped into the race to bring generative artificial intelligence to the masses during its World Wide Developers Conference Monday that spotlighted an onslaught of features designed to soup up the iPhone, iPad and Mac.

And in a move befitting a company known for its marketing prowess, the AI technology, coming as part of free software updates later this year, is being billed as “Apple Intelligence.”

Even as it tried to put its own stamp on the hottest area of technology, Apple tacitly acknowledged it needed help to catch up with companies like Microsoft and Google, who have emerged as the early leaders in the AI field. Apple is leaning on ChatGPT, made by the San Francisco startup OpenAI, to help make its often-bumbling virtual assistant Siri smarter and more helpful.

“All of this goes beyond artificial intelligence, it's personal intelligence, and it is the next big step for Apple,” Apple CEO Tim Cook said.


Siri's gateway to ChatGPT will be free to all iPhone users and made available on other Apple products once the option is baked into the next generation of Apple's operating systems. ChatGPT subscribers are supposed to be able to easily sync their existing accounts when using the iPhone, and should get more advanced features than free users would.

To herald the alliance with Apple, OpenAI CEO Sam Altman sat in the front row of the packed conference, which included developers attending from more than 60 countries worldwide.

“Think you will really like it,” Altman predicted in a post about their partnership with Apple.

Beyond giving Siri the ability to tap into ChatGPT’s knowledge base, Apple is giving its 13-year-old virtual assistant an extensive makeover designed to make it more personable and versatile, even as it currently fields about 1.5 billion queries a day.

When Apple releases free updates to the software powering the iPhone and its other products this autumn, Siri will signal its presence with flashing lights along the edges of the display screen, and will be able to handle hundreds more tasks — including chores that may require tapping into third-party devices — than it can now, based on Monday’s presentations.

The AI-packed updates coming to the next versions of Apple software are meant to enable the billions of people who use its devices to get more done in less time, while also giving them access to creative tools that could liven things up. For instance, Apple will deploy AI to allow people to create emojis, dubbed “Genmojis” on the fly to fit the vibe they are trying to convey.

Monday’s showcase seemed aimed at allaying concerns that Apple might be losing its edge during the advent of AI technology that is expected to be as revolutionary as the 2007 invention of the iPhone. Both Google and Samsung have already released smartphone models touting AI features as their main attractions, while Apple has been stuck in an uncharacteristically extended sales slump.

AI mania is the main reason that Nvidia, the dominant maker of the chips underlying the technology, has seen its market value rocket from about $300 billion at the end of 2022 to about $3 trillion. The meteoric ride allowed Nvidia to surpass Apple as the second most valuable company in the U.S. Earlier this year, Microsoft also eclipsed the iPhone maker on the strength of its so-far successful push into AI.

Investors didn't seem as impressed with Apple's AI presentation as the crowd that came to the company's Cupertino, California, headquarters to see it. Apple's stock price declined nearly 2% in Monday's trading after Cook walked off the stage.

Despite that negative reaction, Wedbush Securities analyst Dan Ives asserted that Apple is “taking the right path” in a research note that hailed the presentation as a “historical” day for a company that already has reshaped the tech industry and society.

Besides pulling AI tricks out of its toolbox, Apple also used the conference to confirm that it will be rolling out a technology called Rich Communications Service, or RCS, to its iMessage app that should improve the quality and security of texting between iPhones and devices powered by Android software, such as the Samsung Galaxy and Google Pixel.

The change, due out with the next version of the iPhone’s operating software, won’t eliminate the blue bubbles denoting texts originating from iPhones and the green bubbles marking texts sent from Android devices — a distinction that has become a source of social stigma.

This marked the second straight year that Apple has created a stir at its developers conference by using it to usher in a trendy form of technology that other companies already are on the market with.

Last year, Apple provided an early look at its mixed-reality headset, the Vision Pro, which wasn't released until early 2024. Nevertheless, Apple's push into mixed reality — with a twist that it bills as “spatial computing” — has raised hopes that there will be more consumer interest in this niche technology.

Part of that optimism stems from Apple's history of releasing technology later than others, then using sleek designs and slick marketing campaigns to overcome its tardy start.

Bringing more AI to the iPhone will likely raise privacy concerns — a topic that Apple has gone to great lengths to assure its loyal customers it can be trusted not to peer too deeply into their personal lives. Apple did talk extensively Monday about its efforts to build strong privacy protections and controls around its AI technology.

One way Apple is trying to convince consumers that the iPhone won't be used to spy on them is harnessing its chip technology so most of its AI-powered features are handled on the device itself instead of at remote data centers, often called “the cloud.” Going down this route would also help protect Apple's profit margins because AI processing through the cloud is far more expensive than when it is run solely on a device.

Apple’s AI “will be aware of your personal data without collecting your personal data,” said Craig Federighi, Apple’s senior vice president of software engineering.

__

By MICHAEL LIEDTKE AP Technology Writer
 


1/1
Chinese scientists have developed an #AI hospital called "Agent Hospital" where all doctors, nurses, and patients are driven by large language model (LLM)-powered intelligent agents, aiming to train AI doctors to autonomously evolve medical capabilities through a simulated environment.

In this virtual world, AI doctors can treat 10,000 patients in just a few days, and the evolved agents achieved an impressive 93.06% accuracy on MedQA questions (drawn from the US Medical Licensing Exam) covering major respiratory diseases.


GO4ekmUaEAA_odp.jpg

GO4ekmVaIAALEwb.jpg
 


Computer Science > Computer Vision and Pattern Recognition​

[Submitted on 12 Jun 2024]

What If We Recaption Billions of Web Images with LLaMA-3?​

Xianhang Li, Haoqin Tu, Mude Hui, Zeyu Wang, Bingchen Zhao, Junfei Xiao, Sucheng Ren, Jieru Mei, Qing Liu, Huangjie Zheng, Yuyin Zhou, Cihang Xie
Web-crawled image-text pairs are inherently noisy. Prior studies demonstrate that semantically aligning and enriching textual descriptions of these pairs can significantly enhance model training across various vision-language tasks, particularly text-to-image generation. However, large-scale investigations in this area remain predominantly closed-source. Our paper aims to bridge this community effort, leveraging the powerful and \textit{open-sourced} LLaMA-3, a GPT-4 level LLM. Our recaptioning pipeline is simple: first, we fine-tune a LLaMA-3-8B powered LLaVA-1.5 and then employ it to recaption 1.3 billion images from the DataComp-1B dataset. Our empirical results confirm that this enhanced dataset, Recap-DataComp-1B, offers substantial benefits in training advanced vision-language models. For discriminative models like CLIP, we observe enhanced zero-shot performance in cross-modal retrieval tasks. For generative models like text-to-image Diffusion Transformers, the generated images exhibit a significant improvement in alignment with users' text instructions, especially in following complex queries. Our project page is this https URL
Comments: * denotes equal contributions
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Cite as: arXiv:2406.08478 [cs.CV] (or arXiv:2406.08478v1 for this version)

Submission history

From: Bingchen Zhao [view email]
[v1] Wed, 12 Jun 2024 17:59:07 UTC (1,642 KB)

Code: GitHub - UCSC-VLAA/Recap-DataComp-1B: This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"

Data: UCSC-VLAA/Recap-DataComp-1B · Datasets at Hugging Face

Caption Model: tennant/llava-llama-3-8b-hqedit · Hugging Face
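The paper's two-stage pipeline (fine-tune a captioner, then recaption the corpus in batches) reduces to a simple batched loop. This is a sketch under stated assumptions: `caption_fn` is a stub standing in for the fine-tuned LLaVA captioner linked above, which would normally run on GPU over image tensors.

```python
def recaption(dataset, caption_fn, batch_size=2):
    """Replace each noisy web caption with a model-generated one.

    dataset: list of (image, old_caption) pairs
    caption_fn: maps a batch of images to a list of new captions
    """
    recaptioned = []
    for i in range(0, len(dataset), batch_size):
        batch = dataset[i:i + batch_size]
        new_captions = caption_fn([img for img, _ in batch])
        recaptioned.extend(
            (img, cap) for (img, _), cap in zip(batch, new_captions)
        )
    return recaptioned

# Toy demo with a stub captioner in place of the real model.
data = [("img0", "noisy alt text"), ("img1", "buy now!!"), ("img2", "")]
stub = lambda imgs: [f"a detailed description of {im}" for im in imgs]
clean = recaption(data, stub)
```

At the paper's 1.3-billion-image scale the same loop would be sharded across workers, but the per-batch structure is unchanged.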
 


1/3
Artificial Gerbil Intelligence has been achieved internally!

A team at Google DeepMind has built a ‘virtual rodent’, in which an artificial neural network actuates a biomechanically realistic model of the rat. This helps provide a causal, generative model that can reproduce complex animal behaviors, not just correlate with them. The model's internal structure can be analyzed to gain insights that are hard to get from real neural data alone. A virtual rodent predicts the structure of neural activity across behaviors - Nature

2/3
on it equally curious haha

3/3
one way to find out... simulation-facilitated street fights


GP3TZs9XcAAU8nO.png


Photo of Bence Ölveczky with mouse models. (Niles Singer/Harvard Staff Photographer)




Want to make robots more agile? Take a lesson from a rat.​

Scientists create realistic virtual rodent with digital neural network to study how brain controls complex, coordinated movement

Anne J. Manning

Harvard Staff Writer

June 11, 2024 4 min read

The effortless agility with which humans and animals move is an evolutionary marvel that no robot has yet been able to closely emulate. To help probe the mystery of how brains control and coordinate it all, Harvard neuroscientists have created a virtual rat with an artificial brain that can move around just like a real rodent.

Bence Ölveczky, professor in the Department of Organismic and Evolutionary Biology, led a group of researchers who collaborated with scientists at Google’s DeepMind AI lab to build a biomechanically realistic digital model of a rat. Using high-resolution data recorded from real rats, they trained an artificial neural network — the virtual rat’s “brain” — to control the virtual body in a physics simulator called MuJoCo, where gravity and other forces are present. And the results are promising.

Harvard and Google researchers created a virtual rat using movement data recorded from real rats.

Credit: Google DeepMind

Published in Nature, the researchers found that activations in the virtual control network accurately predicted neural activity measured from the brains of real rats producing the same behaviors, said Ölveczky, who is an expert at training (real) rats to learn complex behaviors in order to study their neural circuitry. The feat represents a new approach to studying how the brain controls movement, Ölveczky said, by leveraging advances in deep reinforcement learning and AI, as well as 3D movement-tracking in freely behaving animals.

The collaboration was “fantastic,” Ölveczky said. “DeepMind had developed a pipeline to train biomechanical agents to move around complex environments. We simply didn’t have the resources to run simulations like those, to train these networks.”

Working with the Harvard researchers was, likewise, “a really exciting opportunity for us,” said co-author and Google DeepMind Senior Director of Research Matthew Botvinick. “We’ve learned a huge amount from the challenge of building embodied agents: AI systems that not only have to think intelligently, but also have to translate that thinking into physical action in a complex environment. It seemed plausible that taking this same approach in a neuroscience context might be useful for providing insights in both behavior and brain function.”

Graduate student Diego Aldarondo worked closely with DeepMind researchers to train the artificial neural network to implement what are called inverse dynamics models, which scientists believe our brains use to guide movement. When we reach for a cup of coffee, for example, our brain quickly calculates the trajectory our arm should follow and translates this into motor commands. Similarly, based on data from actual rats, the network was fed a reference trajectory of the desired movement and learned to produce the forces to generate it. This allowed the virtual rat to imitate a diverse range of behaviors, even ones it hadn’t been explicitly trained on.
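An inverse dynamics model can be illustrated with a 1-D point mass: given a reference trajectory, recover the forces that would produce it, then verify by integrating those forces forward. This toy sketch only mirrors the concept described above, not the virtual rat's actual neural network, which learns the mapping rather than computing it analytically.

```python
def inverse_dynamics(trajectory, mass=1.0, dt=0.01):
    """Recover the force sequence that produces a reference trajectory
    for a 1-D point mass: a[t] = (x[t+1] - 2x[t] + x[t-1]) / dt^2, F = m*a."""
    forces = []
    for t in range(1, len(trajectory) - 1):
        accel = (trajectory[t + 1] - 2 * trajectory[t] + trajectory[t - 1]) / dt**2
        forces.append(mass * accel)
    return forces

def forward_simulate(x0, x1, forces, mass=1.0, dt=0.01):
    """Integrate the recovered forces forward; this should replay the
    original trajectory if the inverse model is correct."""
    xs = [x0, x1]
    for f in forces:
        accel = f / mass
        xs.append(2 * xs[-1] - xs[-2] + accel * dt**2)
    return xs

# Reference trajectory: free fall under gravity, sampled every 10 ms.
ref = [0.5 * 9.8 * (i * 0.01) ** 2 for i in range(10)]
forces = inverse_dynamics(ref)
replayed = forward_simulate(ref[0], ref[1], forces)
```

The round trip (trajectory → forces → trajectory) is the check that the inverse model captured the dynamics, analogous to the virtual rat reproducing recorded behaviors.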

These simulations may launch an untapped area of virtual neuroscience in which AI-simulated animals, trained to behave like real ones, provide convenient and fully transparent models for studying neural circuits, and even how such circuits are compromised in disease. While Ölveczky’s lab is interested in fundamental questions about how the brain works, the platform could be used, as one example, to engineer better robotic control systems.

A next step might be to give the virtual animal autonomy to solve tasks akin to those encountered by real rats. “From our experiments, we have a lot of ideas about how such tasks are solved, and how the learning algorithms that underlie the acquisition of skilled behaviors are implemented,” Ölveczky continued. “We want to start using the virtual rats to test these ideas and help advance our understanding of how real brains generate complex behavior.”

This research received financial support from the National Institutes of Health.
 
