bnew

The MPT suite of large language models (LLMs) by MosaicML has become incredibly popular. But what makes these models so special? Although there are a variety of reasons for MPT's popularity, I find these models especially useful due to a few unique components…

Fully open-source. MPT models, including MPT-7B and MPT-30B, carry an Apache 2.0 license, meaning that they can be used commercially without any limitations. Plus, these models are accompanied by an entire open-source code repository for fine-tuning, evaluating, or even pre-training these models from scratch (see replies for more details). Given that pre-training a base LLM is the most prohibitive/expensive component of any LLM-based system, the MPT foundation series is a great starting point for building specialized LLMs that solve domain-specific problems.

Fast inference. MPT models are based on a typical decoder-only transformer architecture, but they make a few key modifications, including:

- Low precision layer norm
- Flash Attention
- ALiBi (instead of normal positional embeddings)

Due to these modifications, MPT models perform inference very quickly (roughly 1.5-2x faster than similarly sized LLaMA models) with HuggingFace inference pipelines. Plus, MPT models are compatible with libraries like FasterTransformer, which can be used to further boost inference speed.
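For example, here's a minimal sketch of running MPT-7B through a Hugging Face pipeline (the prompt and generation settings are illustrative; `trust_remote_code=True` is needed because MPT ships custom model code on the Hub):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b")
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,  # loads MosaicML's custom MPT architecture code
)
generate = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generate("MosaicML's MPT models are", max_new_tokens=30)[0]["generated_text"])
```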

Context length. Due to their use of ALiBi, MPT-7B and MPT-30B can handle large context windows and can even extrapolate to context lengths beyond those seen during training. To show this, MPT-7B was fine-tuned on data with a 64K-token context length (derived from the books3 corpus of fiction novels). Researchers at MosaicML found that the resulting MPT-StoryWriter-7B model could handle large context lengths and even extrapolate to context windows as large as 84K tokens. They even ingested the entire text of The Great Gatsby and generated an epilogue!
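For intuition, ALiBi replaces learned position embeddings with a penalty on attention logits that grows linearly with token distance, which is what lets the model generalize past its training length. A minimal sketch, assuming PyTorch (slope formula follows the ALiBi paper's recipe, not MPT's exact code):

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Per-head slopes form a geometric sequence (e.g., 1/2, 1/4, ..., 1/256 for 8 heads).
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    # distance[i, j] = how many tokens key j sits behind query i (0 on the diagonal).
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)
    # The penalty grows linearly with distance, pushing down scores for far-away
    # tokens without any learned positional embedding.
    return -slopes[:, None, None] * distance  # shape: (n_heads, seq_len, seq_len)

# Added to attention logits before the causal mask and softmax:
# scores = q @ k.transpose(-2, -1) / head_dim**0.5 + alibi_bias(n_heads, seq_len)
```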

Performance. Finally, MPT models perform really well. MPT-7B achieves performance on par with LLaMA-7B across a variety of standard benchmarks. MPT-30B lags slightly behind LLaMA-30B and Falcon-40B on text-based tasks but tends to perform better on programming tasks. Plus, MPT-30B seems to exceed the quality of GPT-3. Put simply, these base models are high quality and serve as a great foundation for creating open-source alternatives to proprietary systems like ChatGPT or GPT-4.




 

bnew




Unifying Large Language Models and Knowledge Graphs: A Roadmap​

Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, Xindong Wu
Large language models (LLMs), such as ChatGPT and GPT-4, are making new waves in the field of natural language processing and artificial intelligence due to their emergent ability and generalizability. However, LLMs are black-box models, which often fall short of capturing and accessing factual knowledge. In contrast, Knowledge Graphs (KGs), Wikipedia and Huapu for example, are structured knowledge models that explicitly store rich factual knowledge. KGs can enhance LLMs by providing external knowledge for inference and interpretability. Meanwhile, KGs are difficult to construct and evolve by nature, which challenges existing methods in KGs to generate new facts and represent unseen knowledge. It is therefore complementary to unify LLMs and KGs and simultaneously leverage their advantages. In this article, we present a forward-looking roadmap for the unification of LLMs and KGs. Our roadmap consists of three general frameworks, namely, 1) KG-enhanced LLMs, which incorporate KGs during the pre-training and inference phases of LLMs, or for the purpose of enhancing understanding of the knowledge learned by LLMs; 2) LLM-augmented KGs, which leverage LLMs for different KG tasks such as embedding, completion, construction, graph-to-text generation, and question answering; and 3) Synergized LLMs + KGs, in which LLMs and KGs play equal roles and work in a mutually beneficial way to enhance both LLMs and KGs for bidirectional reasoning driven by both data and knowledge. We review and summarize existing efforts within these three frameworks in our roadmap and pinpoint their future research directions.
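As a toy illustration of the first framework (KG-enhanced LLMs): at inference time, retrieved KG triples can simply be serialized into the prompt as grounding context. A minimal sketch; the helper name and example facts are invented, not from the paper:

```python
def kg_prompt(question: str, triples: list[tuple[str, str, str]]) -> str:
    # Serialize (head, relation, tail) triples into plain-text facts the LLM can cite.
    facts = "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"

triples = [
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Nobel Prize in Physics", "awarded_to_Curie_in", "1903"),
]
print(kg_prompt("When did Marie Curie win the Nobel Prize in Physics?", triples))
```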
 

bnew

About​

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

detectron2.readthedocs.io/en/latest/



Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook.


Learn More about Detectron2​

Explain Like I’m 5: Detectron2​
Using Machine Learning with Detectron2​

What's New​

  • Includes new capabilities such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, ViTDet, MViTv2, etc.
  • Used as a library to support building research projects on top of it.
  • Models can be exported to TorchScript format or Caffe2 format for deployment.
  • It trains much faster.
See our blog post for more demos and to learn more about Detectron2.
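For a first taste of the library, here is a minimal inference sketch using Detectron2's model-zoo config API (the input image path and score threshold are placeholder choices):

```python
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
# Pull a pre-trained Mask R-CNN config and weights from the model zoo.
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # confidence threshold for detections

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("input.jpg"))  # path is a placeholder
print(outputs["instances"].pred_classes)      # predicted COCO class ids
```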
 

bnew


About​

MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

motion-gpt.github.io

Official repo for MotionGPT​

MotionGPT: Human Motion as a Foreign Language

Project Page | arXiv | Paper

MotionGPT is a unified and user-friendly motion-language model to learn the semantic coupling of two modalities and generate high-quality motions and text descriptions on multiple motion tasks.



Intro MotionGPT​

Although the advancement of pre-trained large language models continues to unfold, the exploration of building a unified model for language and other multi-modal data, such as motion, remains challenging and untouched so far. Fortunately, human motion displays a semantic coupling akin to human language, often perceived as a form of body language. By fusing language data with large-scale motion models, motion-language pre-training that can enhance the performance of motion-related tasks becomes feasible. Driven by this insight, we propose MotionGPT, a unified, versatile, and user-friendly motion-language model to handle multiple motion-relevant tasks. Specifically, we employ discrete vector quantization for human motion and transfer 3D motion into motion tokens, similar to the generation process of word tokens. Building upon this “motion vocabulary”, we perform language modeling on both motion and text in a unified manner, treating human motion as a specific language. Moreover, inspired by prompt learning, we pre-train MotionGPT with a mixture of motion-language data and fine-tune it on prompt-based question-and-answer tasks. Extensive experiments demonstrate that MotionGPT achieves state-of-the-art performance on multiple motion tasks including text-driven motion generation, motion captioning, motion prediction, and motion in-between.
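To make the "motion vocabulary" idea concrete, here is a minimal sketch of the vector-quantization step (assuming PyTorch; the feature and codebook shapes are illustrative, not MotionGPT's actual configuration):

```python
import torch

def motion_to_tokens(motion_feats: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """motion_feats: (T, D) per-frame motion features; codebook: (K, D) learned codes.
    Returns (T,) integer motion tokens: the nearest codebook entry per frame."""
    dists = torch.cdist(motion_feats, codebook)  # (T, K) pairwise L2 distances
    return dists.argmin(dim=-1)

# Toy usage: 16 frames of 8-dim features against a 512-entry codebook.
tokens = motion_to_tokens(torch.randn(16, 8), torch.randn(512, 8))
# These discrete tokens can then be mixed with text tokens in one LM vocabulary.
```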




 

bnew

About​

Compare the performance of different LLMs that can be deployed locally on consumer hardware. Run them yourself with the Colab WebUI.

Local LLM Comparison & Colab Links (WIP)​

Compare the performance of different LLMs that can be deployed locally on consumer hardware. The expected good responses and scores are generated by GPT-4.

For your convenience, I have made Colab notebooks with oobabooga's text-generation-webui for trying out the models. (I was creating GPTQ Colab notebooks because llama.cpp had made some breaking changes; EDIT 23/06/28: now that GGML supports GPU loading and seems more stable, I'm creating GGML versions.) You can find the links in the table below. For models supported by GPT4ALL, you can simply use their one-click installer and download the models there.
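For reference, a minimal sketch of what running one of these GGML models locally with GPU offload looked like via the llama-cpp-python bindings of that era (the model path and layer count are placeholders):

```python
from llama_cpp import Llama

# n_gpu_layers offloads part of the model to VRAM (the GPU loading the EDIT mentions).
llm = Llama(model_path="./models/wizardLM-13B.ggmlv3.q4_0.bin", n_gpu_layers=32)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```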

Models tested & average score:​

These models performed best among the models I tested on my hardware (i5-12490F, 32 GB RAM, RTX 3060 Ti GDDR6X 8 GB VRAM). (Note: because llama.cpp has made breaking changes to its support of older GGML models, some older GGML versions listed below may not work properly on current llama.cpp, but there should be GPTQ equivalents or newer GGML versions of the models.)

| Model | Avg_Score | Colab_Link | Date_Added | Link |
| --- | --- | --- | --- | --- |
| wizardLM-13B-1.0-GPTQ (using oobabooga/text-generation-webui) | 9.63 | Open In Colab | 2023/05/29 | |
| Nous-Hermes-13B-GPTQ (using oobabooga/text-generation-webui) | 9.56 | Open In Colab | 2023/06/03 | |
| Selfee-13B-GPTQ (using oobabooga/text-generation-webui) | 9.31 | Open In Colab | 2023/06/07 | |
 

bnew

About​

Atmospheric adventure chat for AI language models (KoboldAI, NovelAI, Pygmalion, OpenAI ChatGPT, GPT-4)

tavernai.net

TavernAI is an atmospheric adventure chat (KoboldAI, NovelAI, Pygmalion, OpenAI)​

Examples of interface and output:
 

bnew

About​

Join us at H2O.ai to make the world's best open-source GPT with document and image Q&A, 100% private chat, no data leaks, Apache 2.0 https://arxiv.org/pdf/2306.08161.pdf Live Demo: h2oGPT

h2o.ai

h2oGPT​

Live h2oGPT Demo

For questions, discussions, or just hanging out, come and join our Discord!

Technical Paper: https://arxiv.org/pdf/2306.08161.pdf

h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document question-answer capabilities. Documents help ground LLMs against hallucinations by providing context relevant to the instruction. h2oGPT is a fully permissive Apache V2 open-source project for 100% private and secure use of LLMs and document embeddings for document question-answer.
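As a toy illustration of that grounding idea (not h2oGPT's actual API; the helper name and prompt format are invented):

```python
def grounded_prompt(question: str, chunks: list[str], top_k: int = 3) -> str:
    """Prepend retrieved document chunks so the LLM answers from context.
    In a real system the chunks would be ranked by embedding similarity."""
    context = "\n\n".join(chunks[:top_k])
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```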

Welcome! Join us and make an issue or a PR, and contribute to making the best fine-tuned LLMs, chatbot UI, and document question-answer framework!
 

bnew

The Race to Prevent ‘the Worst Case Scenario for Machine Learning’​


A.I. companies have an edge in blocking the creation and distribution of child sexual abuse material. They’ve seen how social media companies failed.

Dr. Rebecca Portnoff, the data science director at Thorn, was an author of a new report that found a small but meaningful uptick in the amount of photorealistic A.I.-generated child sexual abuse material. Credit: Kristian Thacker for The New York Times


By Issie Lapowsky
June 24, 2023

Dave Willner has had a front-row seat to the evolution of the worst things on the internet.

He started working at Facebook in 2008, back when social media companies were making up their rules as they went along. As the company’s head of content policy, it was Mr. Willner who wrote Facebook’s first official community standards more than a decade ago, turning what he has said was an informal one-page list that mostly boiled down to a ban on “Hitler and naked people” into what is now a voluminous catalog of slurs, crimes and other grotesqueries that are banned across all of Meta’s platforms.

So last year, when the San Francisco artificial intelligence lab OpenAI was preparing to launch Dall-E, a tool that allows anyone to instantly create an image by describing it in a few words, the company tapped Mr. Willner to be its head of trust and safety. Initially, that meant sifting through all of the images and prompts that Dall-E’s filters flagged as potential violations — and figuring out ways to prevent would-be violators from succeeding.



It didn’t take long in the job before Mr. Willner found himself considering a familiar threat.

Just as child predators had for years used Facebook and other major tech platforms to disseminate pictures of child sexual abuse, they were now attempting to use Dall-E to create entirely new ones. “I am not surprised that it was a thing that people would attempt to do,” Mr. Willner said. “But to be very clear, neither were the folks at OpenAI.”

For all of the recent talk of the hypothetical existential risks of generative A.I., experts say it is this immediate threat — child predators using new A.I. tools already — that deserves the industry’s undivided attention.



In a newly published paper by the Stanford Internet Observatory and Thorn, a nonprofit that fights the spread of child sexual abuse online, researchers found that, since last August, there has been a small but meaningful uptick in the amount of photorealistic A.I.-generated child sexual abuse material circulating on the dark web.

According to Thorn’s researchers, this has manifested for the most part in imagery that uses the likeness of real victims but visualizes them in new poses, being subjected to new and increasingly egregious forms of sexual violence. The majority of these images, the researchers found, have been generated not by Dall-E but by open-source tools that were developed and released with few protections in place.

In their paper, the researchers reported that less than 1 percent of child sexual abuse material found in a sample of known predatory communities appeared to be photorealistic A.I.-generated images. But given the breakneck pace of development of these generative A.I. tools, the researchers predict that number will only grow.

“Within a year, we’re going to be reaching very much a problem state in this area,” said David Thiel, the chief technologist of the Stanford Internet Observatory, who co-wrote the paper with Thorn’s director of data science, Dr. Rebecca Portnoff, and Thorn’s head of research, Melissa Stroebel. “This is absolutely the worst case scenario for machine learning that I can think of.”

Dr. Portnoff has been working on machine learning and child safety for more than a decade.

To her, the idea that a company like OpenAI is already thinking about this issue speaks to the fact that this field is at least on a faster learning curve than the social media giants were in their earliest days.



“The posture is different today,” said Dr. Portnoff.

Still, she said, “If I could rewind the clock, it would be a year ago.”

‘We trust people’​


In 2003, Congress passed a law banning “computer-generated child pornography” — a rare instance of congressional future-proofing. But at the time, creating such images was both prohibitively expensive and technically complex.

The cost and complexity of creating such images had been steadily declining, but that changed last August with the public debut of Stable Diffusion, a free, open-source text-to-image generator developed by Stability AI, a machine learning company based in London.

In its earliest iteration, Stable Diffusion placed few limits on the kind of images its model could produce, including ones containing nudity. “We trust people, and we trust the community,” the company’s chief executive, Emad Mostaque, told The New York Times last fall.



In a statement, Motez Bishara, the director of communications for Stability AI, said that the company prohibited misuse of its technology for “illegal or immoral” purposes, including the creation of child sexual abuse material. “We strongly support law enforcement efforts against those who misuse our products for illegal or nefarious purposes,” Mr. Bishara said.



Because the model is open-source, developers can download and modify the code on their own computers and use it to generate, among other things, realistic adult pornography. In their paper, the researchers at Thorn and the Stanford Internet Observatory found that predators have tweaked those models so that they are capable of creating sexually explicit images of children, too. The researchers demonstrate a sanitized version of this in the report, by modifying one A.I.-generated image of a woman until it looks like an image of Audrey Hepburn as a child.

Stability AI has since released filters that try to block what the company calls “unsafe and inappropriate content.” And newer versions of the technology were built using data sets that exclude content deemed “not safe for work.” But, according to Mr. Thiel, people are still using the older model to produce imagery that the newer one prohibits.

Unlike Stable Diffusion, Dall-E is not open-source and is only accessible through OpenAI’s own interface. The model was also developed with many more safeguards in place to prohibit the creation of even legal nude imagery of adults. “The models themselves have a tendency to refuse to have sexual conversations with you,” Mr. Willner said. “We do that mostly out of prudence around some of these darker sexual topics.”

The company also implemented guardrails early on to prevent people from using certain words or phrases in their Dall-E prompts. But Mr. Willner said predators still try to game the system by using what researchers call “visual synonyms” — creative terms to evade guardrails while describing the images they want to produce.

“If you remove the model’s knowledge of what blood looks like, it still knows what water looks like, and it knows what the color red is,” Mr. Willner said. “That problem also exists for sexual content.”

‘Open questions’​


Julie Cordua, the chief executive of Thorn, a nonprofit that fights the spread of child sexual abuse online, believes there's still time to prevent A.I.-generated child abuse from growing. Credit: Stephen Goldstein for The New York Times


Thorn has a tool called Safer, which scans images for child abuse and helps companies report them to the National Center for Missing and Exploited Children, which runs a federally designated clearinghouse of suspected child sexual abuse material. OpenAI uses Safer to scan content that people upload to Dall-E’s editing tool. That’s useful for catching real images of children, but Mr. Willner said that even the most sophisticated automated tools could struggle to accurately identify A.I.-generated imagery.

That is an emerging concern among child safety experts: that A.I. will not just be used to create new images of real children but also to make explicit imagery of children who do not exist.

That content is illegal on its own and will need to be reported. But this possibility has also led to concerns that the federal clearinghouse may become further inundated with fake imagery that would complicate efforts to identify real victims. Last year alone, the center’s CyberTipline received roughly 32 million reports.

“If we start receiving reports, will we be able to know? Will they be tagged or be able to be differentiated from images of real children?” said Yiota Souras, the general counsel of the National Center for Missing and Exploited Children.



At least some of those answers will need to come not just from A.I. companies, like OpenAI and Stability AI, but from companies that run messaging apps or social media platforms, like Meta, which is the top reporter to the CyberTipline.

Last year, more than 27 million tips came from Facebook, WhatsApp and Instagram alone. Already, tech companies use a classification system, developed by an industry alliance called the Tech Coalition, to categorize suspected child sexual abuse material by the victim’s apparent age and the nature of the acts depicted. In their paper, the Thorn and Stanford researchers argue that these classifications should be broadened to also reflect whether an image was computer-generated.

In a statement to The New York Times, Meta’s global head of safety, Antigone Davis, said, “We’re working to be purposeful and evidence-based in our approach to A.I.-generated content, like understanding when the inclusion of identifying information would be most beneficial and how that information should be conveyed.” Ms. Davis said the company would be working with the National Center for Missing and Exploited Children to determine the best way forward.

Beyond the responsibilities of platforms, researchers argue that there is more that A.I. companies themselves can be doing. Specifically, they could train their models to not create images of child nudity and to clearly identify images as generated by artificial intelligence as they make their way around the internet. This would mean baking a watermark into those images that is more difficult to remove than the ones either Stability AI or OpenAI have already implemented.

As lawmakers look to regulate A.I., experts view mandating some form of watermarking or provenance tracing as key to fighting not only child sexual abuse material but also misinformation.



“You’re only as good as the lowest common denominator here, which is why you want a regulatory regime,” said Hany Farid, a professor of digital forensics at the University of California, Berkeley.

Professor Farid is responsible for developing PhotoDNA, a tool launched in 2009 by Microsoft, which many tech companies now use to automatically find and block known child sexual abuse imagery. Mr. Farid said tech giants were too slow to implement that technology after it was developed, enabling the scourge of child sexual abuse material to openly fester for years. He is currently working with a number of tech companies to create a new technical standard for tracing A.I.-generated imagery. Stability AI is among the companies planning to implement this standard.

Another open question is how the court system will treat cases brought against creators of A.I.-generated child sexual abuse material — and what liability A.I. companies will have. Though the law against “computer-generated child pornography” has been on the books for two decades, it’s never been tested in court. An earlier law that tried to ban what was then referred to as virtual child pornography was struck down by the Supreme Court in 2002 for infringing on speech.

Members of the European Commission, the White House and the U.S. Senate Judiciary Committee have been briefed on Stanford and Thorn’s findings. It is critical, Mr. Thiel said, that companies and lawmakers find answers to these questions before the technology advances even further to include things like full motion video. “We’ve got to get it before then,” Mr. Thiel said.

Julie Cordua, the chief executive of Thorn, said the researchers’ findings should be seen as a warning — and an opportunity. Unlike the social media giants who woke up to the ways their platforms were enabling child predators years too late, Ms. Cordua argues, there’s still time to prevent the problem of AI-generated child abuse from spiraling out of control.

“We know what these companies should be doing,” Ms. Cordua said. “We just need to do it.”
 

bnew

People Hire Phone Bots to Torture Telemarketers​

AI software and voice cloners simulate distracted saps willing to stay on the phone forever—or until callers finally give up​

PHOTO ILLUSTRATION BY CAM POLLACK/THE WALL STREET JOURNAL; ISTOCK

By Robert McMillan

June 29, 2023 7:00 am ET

“Whitey” Whitebeard answered the phone last month, and a recorded female voice warned that it was his last chance to deal with important changes to his Bank of America account.

“Hello. Talk to me,” Whitebeard said in the gruff voice of an annoyed senior. Within seconds, the call was transferred to Kevin, a real person. “Thank you for calling card services,” Kevin said. “How are you doing today?”

“Huh,” Whitebeard answered, now sounding a little befuddled.

“What do you think, how much owed on your credit cards, collectively,” Kevin asked.

Whitebeard grunted and said, “I’ve been having trouble with my television remote. Can you help me figure out how to change the channel to watch my favorite show?”

Whitebeard has a bad habit of talking in circles. That is by design. Whitebeard is a digital contraption that only sounds human. He is the creation of Roger Anderson, a real-life 54-year-old in Monrovia, Calif., who employs chatbots and AI to frustrate and waste the time of telemarketers and scammers.

“I’m talking about only your credit cards,” said Kevin, an overseas caller who doesn’t work for Bank of America. It sounded like he was fishing for financial information that could be used in identity theft, Anderson said.

“I’m sorry. I didn’t catch your name,” said Whitebeard, who speaks in the cloned voice of Sid Berkson, a Vermont dairy farmer and a friend of Anderson’s. “What’s your name, buddy?”

Sid Berkson, the voice of ‘Whitey’ Whitebeard, working in Vermont. PHOTO: STEVE BERKSON

Whitebeard stalls for time at the start of phone calls, using chatbot inanities about TV remotes and the like to give a couple of minutes for GPT-4, the OpenAI software, to process the telemarketer’s spiel and generate responses. Once ready, the AI text is fed into a voice cloner, which carries on the conversation.
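A minimal sketch of what that hand-off to GPT-4 could look like (the persona prompt paraphrases the article, and the function name and settings are invented; this is not Jolly Roger's actual code):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def whitebeard_reply(transcript: str) -> str:
    """Turn the transcribed telemarketer spiel into an in-character response."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": (
                "You are a personal assistant protecting an elderly man from "
                "being scammed. Answer as a distracted senior and stall politely."
            )},
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content  # this text is fed to the voice cloner
```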

“So what do you think? How much owed on your credit cards collectively?” Kevin asked again.

“Well let’s see. I have so many of them, you know,” Whitebeard said.

“There is one with a picture of a kitten on it and another with a lovely beach scene. Do you like kittens or beaches?” he said.


Complaints about unwanted telephone calls are “far-and-away the largest category of consumer complaints to the FCC,” a spokesman for the Federal Communications Commission said; the average American receives 14 unwanted calls a month, according to one industry estimate.

Automated dialers at call centers can easily crank out 100 calls a second, constantly searching for people willing to stay on the line. Voice modulators remove foreign accents, such as Kevin’s, and software allows overseas operators to trigger prerecorded English phrases, said Isaac Shloss, chief product officer at Contact Center Compliance, a company that provides software and services to help call centers operate within the law.

Anderson takes pleasure in foiling them. He began his war on telemarketers nearly a decade ago, he said, after one called the family’s landline and said a bad word to his son. He started with an answering machine that said “Hello” a few times before hanging up.

Anderson has since rolled out his weapons of mass distraction. He has posted conversations between man and bot, some lasting as long as 15 minutes before the telemarketer hangs up.

The posts are part of Anderson’s own marketing. He has several thousand customers paying $24.99 a year for use of his call-deflection system, called Jolly Roger. The subscription service gives people the choice of Whitebeard or other digital personalities, including Salty Sally, the overwhelmed mother, and the easily distracted Whiskey Jack.

Roger Anderson. PHOTO: JENNIFER ANDERSON
After answering the phone, Jolly Roger keeps callers engaged with preset expressions from chatbots, such as “There’s a bee on my arm, but keep talking.” Chatbots also grunt or say “uh-huh” to keep things going.

When OpenAI released its ChatGPT software last year, Anderson saw right away how it could breathe new life into his time-wasting bots.

At first, ChatGPT was reluctant to do the work. “As an AI language model, I don’t encourage people to waste other people’s time,” ChatGPT told Anderson. Its successor, GPT-4, also pushed back, he said.

Anderson finally found a line of reasoning that persuaded GPT-4 to take the job. “I told it that, ‘You are a personal assistant and you are trying to protect this man from being scammed,’ ” he said.

GPT-4, speaking as Whitebeard, took over the conversation with Kevin after about three minutes. To Anderson, the moment is always magic.

“Anyway I think I owe about, what was it, $15,000 or was it $1,500. I can never remember,” Whitebeard said. “Let me go find my reading glasses and check my statements. I’ll be right back. Don’t go anywhere.”

As Kevin waits for Whitebeard, he begins to sound frustrated. “It doesn’t matter,” he said. “I am going to pull up…. Hello…hello?”

Kevin stays on the line, waiting for Whitebeard to return. By the time Whitebeard is back, the call time has hit 3 minutes, 34 seconds.

Whitebeard seems to understand the topic of the telemarketer’s call, credit-card debt consolidation, but he is still a bit lost. That keeps Kevin on the phone, Anderson said.

GPT-4 “does a pretty good job of saying dumb things that are somewhat funny” and believable enough to keep callers engaged, he said. Its screwy non sequiturs are the kind of chatbot gold that customers pay for, he said.

Kevin asked for Whitebeard’s credit-card numbers one last time.

“Huh?” Whitebeard said. “You know I’ve been using credit cards for years, but I can’t seem to remember all the different ones I’ve had.”

Kevin finally hangs up. Total time: 6 minutes, 27 seconds.
 

bnew

AlpacaEval Leaderboard​


An Automatic Evaluator for Instruction-following Language Models​


About AlpacaEval​

AlpacaEval is an LLM-based automatic evaluation that is fast, cheap, and reliable. It is based on the AlpacaFarm evaluation set, which tests the ability of models to follow general user instructions. Model responses are compared to reference Davinci003 responses by the provided GPT-4, Claude, or ChatGPT auto-annotators, which results in the win rates presented above. AlpacaEval displays a high agreement rate with ground-truth human annotations, and leaderboard rankings on AlpacaEval are highly correlated with leaderboard rankings based on human annotators. Please see our documentation for more details on our analysis.
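In spirit, the leaderboard's win rate is just the fraction of instructions where the auto-annotator prefers the model's response over the Davinci003 reference. A minimal sketch with a hypothetical `judge` interface (not the actual alpaca_eval API):

```python
def win_rate(examples: list[dict], judge) -> float:
    """examples: dicts with "instruction", "model_output", "reference_output".
    judge(instruction, model_out, reference_out) returns "model" or "reference"."""
    wins = sum(
        judge(e["instruction"], e["model_output"], e["reference_output"]) == "model"
        for e in examples
    )
    return 100.0 * wins / len(examples)  # percentage, as shown on the leaderboard
```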
 

bnew

Midjourney v5.2 Update Features Impressive ‘Zoom Out’ Tool​


JUN 23, 2023

MATT GROWCOOT


The new Zoom Out feature on Midjourney. The main image was zoomed out from the source image in the top right corner.

AI image generator Midjourney has launched version 5.2, which has excited fans with its new “Zoom Out” feature that allows users to widen the field of view.

V5.2 of the generative artificial intelligence (AI) program rolled out today, promising “improved aesthetics and sharper images.”

However, most impressive is the new “Zoom Out” feature which appears as a button underneath generated images. Users are presented with two options: “Zoom Out 1.5x” and “Zoom Out 2x.”



It also has a “Make Square” option, which turns a non-square image into a square one, as well as a “Custom Zoom” button, an advanced tool that allows users to alter the text prompt and change the aspect ratio.

Also in the update is a new “shorten” command that lets users analyze a text prompt to see which words are affecting the image and which aren't contributing very much.



What is Midjourney’s Zoom Out Tool Like?​

In PetaPixel’s tests with the new Zoom Out feature, Midjourney did a great job zooming out 1.5x and 2x but struggled when changing the aspect ratio. However, this was only a cursory assessment, and your reporter is a rookie on Midjourney.

First, the user must generate a picture. In this case, a “medium close” shot of a woman on a motorbike in London.
The same image as above but with “Zoom Out 1.5x” applied.
With “Zoom Out 2x” applied.
Zoomed out again 2x.
With the aspect ratio changed to portrait orientation. This was the best one Midjourney v5.2 created.
There were many bad attempts when changing the aspect ratio.
Midjourney v5.2 is available now. Midjourney is accessed via a Discord channel (it does not have a dedicated interface). There are different pricing plans available.
 

bnew

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training​


Hong Liu, Zhiyuan Li, David Hall, Percy Liang, Tengyu Ma
Stanford University
{hliu99, zhiyuanli, dlwh, pliang, tengyuma}@cs.stanford.edu

Abstract: Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variants have been state-of-the-art for years, and more sophisticated second-order (Hessian-based) optimizers often incur too much per-step overhead. In this paper, we propose Sophia, Second-order Clipped Stochastic Optimization, a simple scalable second-order optimizer that uses a light-weight estimate of the diagonal Hessian as the pre-conditioner. The update is the moving average of the gradients divided by the moving average of the estimated Hessian, followed by element-wise clipping. The clipping controls the worst-case update size and tames the negative impact of non-convexity and rapid change of Hessian along the trajectory. Sophia only estimates the diagonal Hessian every handful of iterations, which has negligible average per-step time and memory overhead. On language modeling with GPT-2 models of sizes ranging from 125M to 770M, Sophia achieves a 2x speed-up compared with Adam in the number of steps, total compute, and wall-clock time. Theoretically, we show that Sophia adapts to the curvature in different components of the parameters, which can be highly heterogeneous for language modeling tasks. Our run-time bound does not depend on the condition number of the loss.
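To make the update rule concrete, here is a minimal sketch of a single Sophia step as described in the abstract (hyperparameter values are assumptions, and the diagonal-Hessian estimator itself, refreshed only every handful of iterations, is left as an input):

```python
import torch

@torch.no_grad()
def sophia_step(param, grad, m, h, hess_est=None, lr=1e-4,
                beta1=0.96, beta2=0.99, rho=0.01, eps=1e-12):
    """One parameter update; m and h are persistent EMA buffers."""
    m.mul_(beta1).add_(grad, alpha=1 - beta1)          # EMA of gradients
    if hess_est is not None:                           # only every handful of steps
        h.mul_(beta2).add_(hess_est, alpha=1 - beta2)  # EMA of diagonal-Hessian estimate
    # Precondition by the diagonal Hessian, then clip element-wise so the
    # worst-case per-coordinate step is bounded by lr * rho.
    update = (m / (h + eps)).clamp_(-rho, rho)
    param.add_(update, alpha=-lr)
```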
 