Meta's newest 'open' AI model release is its biggest yet. The company claims the model, Llama 3.1 405B, is competitive with the best commercial releases.
Meta releases its biggest ‘open’ AI model yet
Kyle Wiggers
8:00 AM PDT • July 23, 2024
Image Credits: TOBIAS SCHWARZ/AFP / Getty Images
Meta’s latest open source AI model is its biggest yet.
Today, Meta said it is releasing Llama 3.1 405B, a model containing 405 billion parameters. Parameters are the internal values a model learns during training and roughly track its problem-solving ability; models with more parameters generally perform better than those with fewer.
At 405 billion parameters, Llama 3.1 405B isn’t the absolute largest open source model out there, but it’s the biggest in recent years. Trained using 16,000 Nvidia H100 GPUs, it also benefits from newer training and development techniques that Meta claims make it competitive with leading proprietary models like OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet (with a few caveats).
As with Meta’s previous models, Llama 3.1 405B is available to download or use on cloud platforms like AWS, Azure and Google Cloud. It’s also being used on WhatsApp and Meta.ai, where it’s powering a chatbot experience for U.S.-based users.
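For developers who want to run the open weights themselves rather than use the hosted chatbot, a common route is Hugging Face’s transformers library. The snippet below is a minimal sketch, not official Meta instructions: the repository ID, prompt and generation settings are assumptions, it presumes you have accepted the model license, and it targets the smaller 8B variant, since the 405B model needs a multi-GPU cluster or a cloud endpoint.

# Minimal sketch: loading a Llama 3.1 checkpoint with Hugging Face transformers.
# The repo ID below is an assumption; adjust to whichever variant and license you have access to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Summarize this quarterly report in three bullet points."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))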
New and improved
Like other open and closed source generative AI models, Llama 3.1 405B can perform a range of different tasks, from coding and answering basic math questions to summarizing documents in eight languages (English, German, French, Italian, Portuguese, Hindi, Spanish and Thai). It’s text-only, meaning that it can’t, for example, answer questions about an image, but most text-based workloads — think analyzing files like PDFs and spreadsheets — are within its purview.
Meta wants to make it known that it is experimenting with multimodality. In a paper published today, researchers at the company write that they’re actively developing Llama models that can recognize images and videos, and understand (and generate) speech. Still, these models aren’t yet ready for public release.
To train Llama 3.1 405B, Meta used a dataset of 15 trillion tokens dating up to 2024 (tokens are the sub-word chunks of text that models can internalize more easily than whole words; a single word usually breaks into one or a few tokens). It’s not a new training set per se, since Meta used the base set to train earlier Llama models, but the company claims it refined its data curation pipelines and adopted “more rigorous” quality assurance and data filtering approaches in developing this model.
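To make the word-versus-token distinction concrete, the short sketch below runs a sentence through an open tokenizer and compares the counts. The repository ID is an assumption; any Llama-family tokenizer illustrates the same point, and exact token counts vary by tokenizer.

# Sketch: how a tokenizer splits text into the sub-word tokens models actually train on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")  # assumed repo ID
text = "Summarizing multilingual spreadsheets is surprisingly straightforward."
tokens = tokenizer.tokenize(text)
print(len(text.split()), "words ->", len(tokens), "tokens")
print(tokens)  # long words like "Summarizing" typically split into several sub-word pieces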
The company also used synthetic data (data generated by other AI models) to fine-tune Llama 3.1 405B. Most major AI vendors, including OpenAI and Anthropic, are exploring applications of synthetic data to scale up their AI training, but some experts believe that synthetic data should be a last resort due to its potential to exacerbate model bias.
For its part, Meta insists that it “carefully balance[d]” Llama 3.1 405B’s training data, but declined to reveal exactly where the data came from (outside of webpages and public web files). Many generative AI vendors see training data as a competitive advantage and so keep it and any information pertaining to it close to the chest. But training data details are also a potential source of IP-related lawsuits, another disincentive for companies to reveal much.
Image Credits: Meta
In the aforementioned paper, Meta researchers wrote that compared to earlier Llama models, Llama 3.1 405B was trained on an increased mix of non-English data (to improve its performance on non-English languages), more “mathematical data” and code (to improve the model’s mathematical reasoning skills), and recent web data (to bolster its knowledge of current events).
Recent reporting by Reuters revealed that Meta at one point used copyrighted e-books for AI training despite its own lawyers’ warnings. The company controversially trains its AI on Instagram and Facebook posts, photos and captions, and makes it difficult for users to opt out. What’s more, Meta, along with OpenAI, is the subject of an ongoing lawsuit brought by authors, including comedian Sarah Silverman, over the companies’ alleged unauthorized use of copyrighted data for model training.
“The training data, in many ways, is sort of like the secret recipe and the sauce that goes into building these models,” Ragavan Srinivasan, VP of AI program management at Meta, told TechCrunch in an interview. “And so from our perspective, we’ve invested a lot in this. And it is going to be one of these things where we will continue to refine it.”
Bigger context and tools
Llama 3.1 405B has a larger context window than previous Llama models: 128,000 tokens, enough to hold roughly 100,000 words of input, about the length of a full-length book. A model’s context, or context window, refers to the input data (e.g. text) that the model considers before generating output (e.g. additional text).
One of the advantages of models with larger contexts is that they can summarize longer text snippets and files. When powering chatbots, such models are also less likely to forget topics that were recently discussed.
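In practice, an application decides whether a document fits the window by counting tokens before sending a prompt. The sketch below assumes a Hugging Face tokenizer and a hypothetical input file; only the 128,000-token figure comes from Meta’s announcement, and the output headroom is an arbitrary choice.

# Sketch: checking whether a document fits in a 128K-token context window before prompting.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 128_000      # Llama 3.1's advertised context length
RESERVED_FOR_OUTPUT = 2_000   # arbitrary headroom left for the model's reply

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")  # assumed repo ID

def fits_in_context(document: str) -> bool:
    """Return True if the document's token count leaves room for a response."""
    return len(tokenizer.encode(document)) <= CONTEXT_WINDOW - RESERVED_FOR_OUTPUT

with open("annual_report.txt") as f:  # hypothetical file
    print(fits_in_context(f.read()))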
Two other new, smaller models Meta unveiled today, Llama 3.1 8B and Llama 3.1 70B (updated versions of the company’s Llama 3 8B and Llama 3 70B models released in April), also have 128,000-token context windows. The previous models’ contexts topped out at 8,000 tokens, which makes this upgrade fairly substantial, assuming the new Llama models can effectively reason across all that context.
Image Credits: Meta
All of the Llama 3.1 models can use third-party tools, apps and APIs to complete tasks, like rival models from Anthropic and OpenAI. Out of the box, they’re trained to tap Brave Search to answer questions about recent events, the Wolfram Alpha API for math- and science-related queries, and a Python interpreter for validating code. In addition, Meta claims the Llama 3.1 models can use certain tools they haven’t seen before — to an extent.
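Tool use of this kind is orchestrated by the application wrapped around the model: the model emits a structured tool request, the application executes the real API call and feeds the result back, and the model then composes its final answer. The loop below is a generic, simplified sketch; the generate and parse_tool_call callables and the placeholder tool functions are hypothetical stand-ins, not Meta’s published tool-calling format.

# Generic sketch of a tool-calling loop around a chat model.
# generate(), parse_tool_call() and the tool bodies are hypothetical placeholders.

def brave_search(query: str) -> str:        # placeholder: would call the Brave Search API
    return f"Search results for: {query}"

def wolfram_alpha(expression: str) -> str:  # placeholder: would call the Wolfram Alpha API
    return f"Computed result for: {expression}"

TOOLS = {"brave_search": brave_search, "wolfram_alpha": wolfram_alpha}

def answer(question: str, generate, parse_tool_call) -> str:
    messages = [{"role": "user", "content": question}]
    reply = generate(messages)                 # model may answer directly or request a tool
    call = parse_tool_call(reply)              # e.g. {"name": "brave_search", "arguments": {...}}
    if call and call["name"] in TOOLS:
        result = TOOLS[call["name"]](**call["arguments"])
        messages += [{"role": "assistant", "content": reply},
                     {"role": "tool", "content": result}]
        reply = generate(messages)             # model composes a final answer from the tool output
    return reply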