bnew:

The Languages AI Is Leaving Behind

The generative-AI boom looks very different for non-English speakers.

By Damon Beres

Illustration by The Atlantic

APRIL 19, 2024

This is Atlantic Intelligence, a limited-run series in which our writers help you wrap your mind around artificial intelligence and a new machine age. Sign up here.

Generative AI is famously data-hungry. The technology requires huge troves of digital information—text, photos, video, audio—to “learn” how to produce convincingly humanlike material. The most powerful large language models have effectively “read” just about everything; when it comes to content mined from the open web, this means that AI is especially well versed in English and a handful of other languages, to the exclusion of thousands more that people speak around the world.

In a recent story for The Atlantic, my colleague Matteo Wong explored what this might mean for the future of communication. AI is positioned more and more as the portal through which billions of people might soon access the internet. Yet so far, the technology has developed in such a way that it will reinforce the dominance of English while possibly degrading the experience of the web for those who primarily speak languages with less minable data. “AI models might also be void of cultural nuance and context, no matter how grammatically adept they become,” Matteo writes. “Such programs long translated ‘good morning’ to a variation of ‘someone has died’ in Yoruba,” David Adelani, a DeepMind research fellow at University College London, told Matteo, “because the same Yoruba phrase can convey either meaning.”

But Matteo also explores how generative AI might be used as a tool to preserve languages. The grassroots efforts to create such applications move slowly. Meanwhile, tech giants charge ahead to deploy ever more powerful models on the web—crystallizing a status quo that doesn’t work for all.

— Damon Beres, senior editor



Illustration by Matteo Giuseppe Pani. Source: Getty.

The AI Revolution Is Crushing Thousands of Languages

By Matteo Wong

Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon—a language spoken by Dossou’s mother and millions of others in Benin and neighboring countries—as “a fictional language.”

This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs to help him communicate with his mother in French, in which he is more fluent. “When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood,” Dossou told me.

The rise of the internet, alongside decades of American hegemony, made English into a common tongue for business, politics, science, and entertainment. More than half of all websites are in English, yet more than 80 percent of people in the world don’t speak the language. Even basic aspects of digital life—searching with Google, talking to Siri, relying on autocorrect, simply typing on a smartphone—have long been closed off to much of the world. And now the generative-AI boom, despite promises to bridge languages and cultures, may only further entrench the dominance of English in life on and off the web.

Read the full article.
 

bnew:

1/1
phi-3-mini: 3.8B model matching Mixtral 8x7B and GPT-3.5

Plus a 7B model that matches Llama 3 8B in many benchmarks.

Plus a 14B model.










1/5
Microsoft just released Phi-3

Phi-3 14B beats Llama-3 8B, GPT-3.5, and Mixtral 8x7B MoE on most of the benchmarks.

Even the Phi-3 mini beats Llama-3 8B in MMLU and HellaSwag.

2/5
More details and insights to follow in tomorrow's AI newsletter.

Subscribe now to get it delivered to your inbox first thing in the morning tomorrow: Unwind AI | Shubham Saboo | Substack

3/5
Research Paper:

4/5
The speed of open-source AI development is absolutely insane.

5/5
True, all of this is happening so fast!!






1/2
phi-3 is out! never would have guessed that our speculative attempt at creating synthetic python code for phi-1 (following TinyStories) would eventually lead to a gpt-3.5-level SLM. definitely addicted to generating synth data by now...

2/2
hf:









1/5
Amazing numbers. Phi-3 is topping GPT-3.5 on MMLU at 14B. Trained on 3.3 trillion tokens. They say in the paper 'The innovation lies entirely in our dataset for training - composed of heavily filtered web data and synthetic data.'

2/5
Small is so big right now!

3/5
phi-3-mini: 3.8B model matching Mixtral 8x7B and GPT-3.5

Plus a 7B model that matches Llama 3 8B in many benchmarks.

Plus a 14B model.

[2404.14219] Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

4/5
The prophecy has been fulfilled!


5/5
Wow.
'Phi-3-mini can be quantized to 4-bits so that it only occupies ≈ 1.8GB of memory. We tested the quantized model by deploying phi-3-mini on iPhone 14 with A16 Bionic chip running natively on-device and fully offline achieving more than 12 tokens per second.'
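That memory figure is easy to sanity-check: at 4-bit quantization each parameter takes half a byte. A back-of-the-envelope sketch in Python (real quantized files add some overhead for embeddings and scale factors, so treat this as an approximation):

    # Rough memory estimate for 4-bit quantization of Phi-3-mini.
    params = 3.8e9          # parameter count
    bits_per_param = 4      # 4-bit weights
    gigabytes = params * bits_per_param / 8 / 1e9
    print(f"~{gigabytes:.1f} GB")   # ~1.9 GB, in line with the ~1.8GB reported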












1/9
Run Microsoft Phi-3 locally in 3 simple steps (100% free and without internet):

2/9
1. Install Ollama on your desktop

- Go to https://ollama.com/
- Download Ollama on your computer (works on Mac, Windows, and Linux)
- Open the terminal and run: 'ollama run phi3'

3/9
2. Install Open WebUI (a ChatGPT-like open-source UI)

- Make sure Docker is installed and running on your computer
- Go to https://docs.openwebui.com
- Install the Docker image of Open WebUI with a single command

4/9
3. Run the model locally like ChatGPT

- Open the ChatGPT-like UI locally by going to http://localhost:3000
- Select the model from the top
- Query and ask questions like you would with ChatGPT

This is running on a MacBook M1 Pro 16GB machine.
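Once 'ollama run phi3' is working, the model can also be queried from a script instead of the web UI. A minimal sketch against Ollama's local HTTP API (this assumes Ollama is running on its default port 11434; the prompt is just an example):

    import json
    import urllib.request

    # Ollama exposes a local HTTP API while the app is running.
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "phi3",          # the model pulled by 'ollama run phi3'
            "prompt": "Explain retrieval-augmented generation in one paragraph.",
            "stream": False,          # return a single JSON object, not a stream
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])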

5/9
If you find this useful, RT to share it with your friends.

Don't forget to follow me
@Saboo_Shubham_ for more such LLMs tips and resources.

6/9
Run Microsoft Phi-3 locally in 3 simple steps (100% free and without internet):

7/9
Not checked yet.
@ollama was the first to push the updates!

8/9
That would be fine-tuning. You can try out
@monsterapis for nocode finetuning of LLMs.

9/9
A series of language models that pretty much outperforms Llama-3 even at a small size.





Also 128k-instruct: microsoft/Phi-3-mini-128k-instruct-onnx · Hugging Face
Edit: All versions: Phi-3 - a microsoft Collection













 

bnew:







1/8
Got llama3-8b-instruct to work at 32k.
A test summarizing a 19k-token transcript looks amazing.
GGUFs incoming

2/8
This is not a finetune. It's stock 8b-instruct.

3/8


4/8
don't retweet this, wait for the GGUFs lol, I'm excited

5/8
It's really good, like REALLY good. It does hallucinate, and this was a very messy transcript with a lot of repetitions from the ggml-small whisper stream, but like... the results... come on. This is fukking dope af

6/8
Have been trying for 4 days to configure it lol, it finally works. Pre-configuring the GGUF so you won't need to do anything.
Might even work OK at 64k

7/8
shoutout to
@AlpinDale the most cracked dev in ml for the tips.

8/8
Found it. Unsure where I saved it from.
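For anyone who wants to try this kind of context extension before the pre-configured GGUFs land, here is a rough sketch using llama-cpp-python. The file name and the RoPE frequency base are placeholders, not the poster's settings; the right values depend on the specific scaling recipe:

    from llama_cpp import Llama

    # Load a Llama-3-8B-Instruct GGUF with an extended context window.
    # Stock Llama 3 uses a rope_freq_base of 500000; context-extension
    # recipes raise it (the value below is purely illustrative).
    llm = Llama(
        model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # hypothetical local file
        n_ctx=32768,               # 32k tokens instead of the native 8k
        rope_freq_base=2000000.0,  # illustrative, not a tested setting
    )

    transcript = open("transcript.txt").read()
    out = llm("Summarize the following transcript:\n" + transcript, max_tokens=512)
    print(out["choices"][0]["text"])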





1/1
Llama3 vs. GPT4

This is a side-by-side comparison of llama3-70b and gpt4-turbo.

The input prompt is identical, the output quality is super similar, but gpt4 is like 15x slower





1/1
We're running into a (hopefully) surmountable problem.

The finetunes of llama3 I've tested have reverted from being pleasant and direct to GPT-4 levels of generic.

Makes sense though - most synthetic datasets being used for the finetuning are *from* GPT-4.

It's awesome to think that we've come so far that emulating GPT4 is a bad thing, but it's a growing problem.







1/3
Getting a very usable 2.2 tokens a second with Llama3 Q4 on Raspberry Pi 5 (fanless)

2/3
This is all you need to run it:
- llamafile: https://github.com/Mozilla-Ocho/llamafile
- Llama3 GGUF: lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF at main

3/3
8GB RAM version
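llamafile also runs a local server (on port 8080 by default) with an OpenAI-compatible chat endpoint, so the Pi can be queried from any machine on the network. A hedged sketch; check your llamafile version's docs for the exact flags and routes:

    import json
    import urllib.request

    # Query llamafile's built-in OpenAI-style endpoint.
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps({
            "model": "llama-3-8b-instruct",   # local servers generally ignore this name
            "messages": [{"role": "user", "content": "Say hello from the Pi."}],
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])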





1/1
I gave @Bolt__AI by @daniel_nguyenx with Llama3-70B by @GroqInc a test. Mind blown







1/3
LLAMA3 is the first open-source LLM to ace tasks in WorkArena, making it the top OSS virtual knowledge worker! (And believe us, we've tested many models and prompting techniques.) Watch it excel in a challenging knowledge-base task. Kudos to @AIatMeta for the amazing model

2/3
relevant links

WorkArena website: https://servicenow.github.io/WorkArena/
Paper: https://arxiv.org/pdf/2403.07718.pdf
Code: https://github.com/ServiceNow/WorkArena
LLAMA3: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

Can't wait to try Llama-3-8B-Web next!
@xhluca
Maybe we'll reach closed-source performance


@ServiceNowRSRCH

@Mila_Quebec

3/3
This will happen sooner than you think!!









1/5
Meta's new AI—Llama3—beats an old version of GPT-4, Claude 2.1, and GPT-3.5!

You can run it:
- 100% free
- with 100% privacy (no data leaves your machine)

See the comments for an easy way to install

2/5
2/ GPT4All from
@nomic_ai runs on Windows and Mac.

It's free. Other good options are
@LMStudioAI
or
@ollama
. But I'll go through running Llama 3 on GPT4All because it's also easy to chat with your own pdfs and documents with GPT4All (i.e. RAG).

Download it and install.

3/5
3/ You'll also need to download the model. (GPT4All is the interface. It's not the model).

Choose the new Llama 3 Instruct 8B.

4/5
4/ Then go to "Choose a Model" and load "Llama 3 Instruct"

It'll take ages to load if your computer is anything like mine.

5/5
5/ You're ready to go!

Chat with it just like ChatGPT or any other LLM.
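GPT4All also ships a Python binding, so the same local model can be scripted rather than driven through the desktop app. A minimal sketch (the model file name below is the catalog-style name at the time of writing; substitute whatever the GPT4All downloader calls the Llama 3 Instruct weights):

    from gpt4all import GPT4All

    # Loads a local GGUF, downloading it first if it isn't already on disk.
    model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # illustrative file name

    with model.chat_session():
        reply = model.generate("Give me three facts about llamas.", max_tokens=200)
        print(reply)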


 

bnew:










1/10
Can AI rewrite our human genome?

Today, we announce the successful editing of DNA in human cells with gene editors fully designed with AI. Not only that, we've decided to freely release the molecules under the
@ProfluentBio OpenCRISPR initiative.

Lots to unpack

2/10
AI has become increasingly pervasive in our daily lives from how we sift through information, produce content, and interact with the world. This marks a new chapter where AI is used to alter the fundamental blueprint of who we are - our DNA.

3/10
We were immediately drawn to gene editing due to the pressing societal needs, potential for one-and-done cures to disease, and the scientific challenge + complex biology involving protein, RNA, and DNA.

4/10
Our LLMs were trained on massive scale sequence and biological context to generate millions of diverse CRISPR-like proteins that do not occur in nature, thereby exponentially expanding virtually all known CRISPR families at-will.

5/10
We then focused on type II effector complexes, generating cas9-like proteins and gRNAs. These proteins are hundreds of mutations away from anything in nature.

6/10
We then characterized our generations in the wet lab and found that the AI-designed gene editors show comparable or improved activity and specificity relative to SpCas9, the prototypical gene editing effector. More characterization is underway but we're already impressed.

7/10
We also created an AI-designed base editor which exhibited really exciting performance in precise A->G edits.

8/10
The results point to a future where AI precisely designs what is needed to create a range of bespoke cures for disease. There is still much to build to achieve this vision. To spur innovation and democratization, we are freely releasing OpenCRISPR-1. Try it out!

9/10
This was truly a team effort across all disciplines of the company.
@jeffruffolo SNayfach JGallagher @AadyotB JBeazer RHussain JRuss JYip EHill @MartinPacesa @alexjmeeske PCameron and the broader Profluent team. If you want to build with us, join. We’re hiring.

10/10
Paper: https://biorxiv.org/content/10.1101/2024.04.22.590591v1 Blog: https://profluent.bio/blog/editing-the-human-genome-with-ai NYTimes: https://nytimes.com/2024/04/22/tech...-ios-share&referringSource=articleShare Press Release: https://businesswire.com/news/home/...AI-Created-and-Open-Source-Gene-Editor Access OpenCRISPR-1: OpenCRISPR


 

kevm3:
Makes me wonder if Microsoft bought GitHub a while back so that it'd have a massive amount of code to train its AI on
 

bnew:

Meet Phi-3: Microsoft’s New LLM That Can Run On Your Phone

By Saptorshee Nag

April 23, 2024




Microsoft Phi-3 LLM for Phones


Who would have thought that one day we would have tiny LLMs that can keep up with highly powerful ones such as Mixtral, Gemma, and GPT? Microsoft Research has announced a powerful family of small LLMs called the Phi-3 model family.

Highlights:

  • Microsoft unveils the Phi-3 model family, a collection of tiny LLMs that are highly powerful and can run on smartphones.
  • The family is composed of three models: Phi-3-mini, Phi-3-small, and Phi-3-medium.
  • It shows impressive benchmark results and rivals models like Mixtral 8x7B and GPT-3.5.



Microsoft’s Phi-3 LLM Family

Microsoft has leveled up its generative-AI game once again. It released Phi-2 back in December 2023, a 2.7-billion-parameter model that provided state-of-the-art performance among base language models with fewer than 13 billion parameters.

However, many LLMs released since then have outperformed Phi-2 on several benchmarks and evaluation metrics.

This is why Microsoft has released Phi-3 as its latest competitor in the gen-AI market, and the best thing about this model family is that you can run it on your smartphone!

So how powerful and efficient is this state-of-the-art model family? And what are its groundbreaking features? Let’s explore all these topics in-depth through this article.

Microsoft introduced the model family in the form of three models: Phi-3-mini, Phi-3-small, and Phi-3-medium.

Let’s study all these models separately.

1) Phi-3-mini 3.8b

Phi-3-mini is a 3.8-billion-parameter language model trained on 3.3 trillion tokens. Despite its small size, its performance is comparable to that of much larger models like Mixtral 8x7B and GPT-3.5.

Because of its modest size, Phi-3-mini can be quantized to 4 bits, requiring only about 1.8GB of memory, which lets it operate locally on a mobile device. Microsoft tested the quantized model by deploying it on an iPhone 14 with the A16 Bionic chip, where it runs natively, fully offline, at more than 12 tokens per second.

Phi-3 running on iPhone

The transformer decoder architecture used in phi-3-mini has a default context length of 4K. With a vocabulary size of 32,064, phi-3-mini employs the same tokenizer as Llama 2 and is built on a similar block structure.

Thus, any package created for the Llama-2 model family can be adapted directly to phi-3-mini. The model uses 32 heads, 32 layers, and a hidden dimension of 3072.

What makes Phi-3-mini innovative is its training dataset, an enlarged version of the one used for Phi-2 that includes both synthetic data and heavily filtered web data. The model has also been optimized for robustness, safety, and chat format.
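Because phi-3-mini reuses the Llama-2 tokenizer and a similar block structure, it loads with the standard Hugging Face tooling. A quick sketch (the checkpoint name matches Microsoft's published 4k-instruct release; at launch it required trust_remote_code=True):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-3-mini-4k-instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tok("Why can small language models run on phones?", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=100)
    print(tok.decode(out[0], skip_special_tokens=True))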

2) Phi-3-Small and Phi-3-Medium

Microsoft has also released Phi-3-small and Phi-3-medium, both noticeably more powerful than Phi-3-mini. Phi-3-small, with its 7 billion parameters, uses the tiktoken tokenizer for better multilingual tokenization, with a vocabulary of 100,352 tokens and an 8K default context length.

The Phi-3-small model has 32 layers and a hidden size of 4096, following the typical decoder design of the 7B model class. To reduce the KV-cache footprint, Phi-3-small uses grouped-query attention, in which four query heads share a single key-value head.

Additionally, phi-3-small alternates layers of novel blocksparse attention with layers of dense attention, further trimming the KV cache while maintaining long-context retrieval performance. An extra 10 percent of multilingual data was also used for this model.

Using the same tokenizer and architecture as phi-3-mini, Microsoft researchers also trained phi-3-medium, a 14B-parameter model, on the same data for slightly more epochs (4.8T tokens in total, the same as phi-3-small). The model has an embedding dimension of 5120, 40 heads, and 40 layers.
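To make the grouped-query idea concrete, here is a toy sketch of the mechanics: four query heads attend using one shared key/value head, so the KV cache shrinks fourfold. Shapes only; this is not Microsoft's implementation:

    import torch

    batch, seq, d_head = 1, 16, 128
    n_q_heads, n_kv_heads = 32, 8   # 4 query heads per KV head, as described above

    q = torch.randn(batch, n_q_heads, seq, d_head)
    k = torch.randn(batch, n_kv_heads, seq, d_head)  # KV cache is 4x smaller
    v = torch.randn(batch, n_kv_heads, seq, d_head)

    # Broadcast each KV head to the 4 query heads in its group.
    group = n_q_heads // n_kv_heads
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)

    attn = torch.softmax(q @ k.transpose(-2, -1) / d_head**0.5, dim=-1)
    out = attn @ v    # shape: (1, 32, 16, 128)
    print(out.shape)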

Looking At the Benchmarks

The phi-3-mini, phi-3-small, and phi-3-medium models were tested on the typical open-source benchmarks measuring reasoning ability (both common-sense and logical reasoning). They are compared with GPT-3.5, phi-2, Mistral-7b-v0.1, Mixtral-8x7B, Gemma 7B, and Llama-3-instruct-8b.

Phi-3 Benchmarks

Phi-3-mini, which is suited for mobile-phone deployment, scores 69% on the MMLU test and 8.38 on MT-bench.

With an MMLU score of 75.3, the 7-billion-parameter Phi-3-small performs better than Meta's newly released Llama 3 8B Instruct, which scores 66.

However, the biggest difference was observed when Phi-3-medium was compared against the field. It beat several models, including Mixtral 8x7B, GPT-3.5, and even Meta's newly launched Llama 3, on benchmark metrics such as MMLU, HellaSwag, ARC-C, and Big-Bench Hard, often by wide margins.

This goes to show how powerful these tiny mobile LLMs are compared with large language models that need powerful GPUs and CPUs to operate. The benchmarks suggest that the Phi-3 model family will do quite well on coding-related tasks, common-sense reasoning, and general-knowledge capabilities.

Are there any Limitations?

Powerful as it is for its size and deployment target, the Phi-3 model family has one major limitation. Although it shows a level of language understanding and reasoning comparable to much larger models, its size fundamentally limits it on some tasks: it cannot store large amounts of factual knowledge, which causes it to perform worse on tests like TriviaQA.

“Exploring multilingual capabilities for Small Language Models is an important next step, with some initial promising results on phi-3-small by including more multilingual data. The use of carefully curated training data, targeted post-training, and improvements from red-teaming insights significantly mitigates these issues across all dimensions. However, there is significant work ahead to fully address these challenges.”

Microsoft also suggested a potential remedy for this drawback: it believes that augmenting the model with a search engine can compensate for these gaps. Furthermore, the model's proficiency being largely restricted to English underscores the need to explore multilingual capabilities for small language models.

Is Phi-3 Safe?

Phi-3-mini was developed in accordance with Microsoft's responsible AI guidelines.

The overall strategy included automated testing, evaluations across hundreds of RAI harm categories, red-teaming, and safety alignment in post-training. Safety post-training drew on several in-house datasets, as well as datasets adjusted according to helpfulness and harmlessness preferences, to address the RAI harm categories.

An independent Microsoft red team iteratively examined phi-3-mini to find further areas for improvement during the post-training phase. Acting on the red team's feedback, Microsoft curated additional datasets that addressed its insights and refined the post-training dataset. The process significantly reduced the rate of harmful responses.

Phi-3 Safety measures


Conclusion

Are we on the verge of a new era for mobile LLMs? Phi-3 is here to answer that question. The mobile-developer community will benefit greatly from the Phi-3 models, especially Small and Medium. Microsoft has also recently been working on VASA-1, an image-to-video model that is another big development in the gen-AI space.
 

bnew:
















1/27
Llama 3 surprised everyone less than a week ago, but Microsoft just dropped Phi-3, and it's an incredibly capable small AI model.

We may soon see 7B models that can beat GPT-4. People are already coming up with incredible use cases.

10 wild examples:

2/27
1. Phi-3 Mini running on Raspberry Pi 5 h/t
@00_brad

3/27
Getting over 4 tokens per second on a Raspberry Pi 5 with Microsoft's Phi-3 Mini! Great model to run entirely locally! Model link in comments!

4/27
2. Phi-3 Mini with 128k context window on NVIDIA TensorRT-LLM

5/27
Announcing our collaboration to accelerate @Microsoft's new Phi-3 Mini open language model with NVIDIA TensorRT-LLM. https://nvda.ws/3xJ6zR0 Developers can try Phi-3 Mini with the 128K context window at Production-Ready APIs That Run Anywhere.

6/27
3. Phi-3 running locally on Vision Pro h/t
@ivanfioravanti

7/27
Apple MLX: Phi-3 running locally on a VisionPro with VisionOS 1.2 Beta 3!

Fully offline, pretty fast! 22.25 t/s

Credit to @awnihannun for the special quantized version for MLX

In the code I used displayEveryNTokens = 3 to make streaming more "continuous".

8/27
4. Comparing Llama 3 & Phi-3 using RAG h/t
@akshay_pachaar

9/27
Let's compare Llama-3 & Phi-3 using RAG:

10/27
5. Phi-3 Mini running on iPhone 14 Pro h/t
@wattmaller1

11/27
Well, I did it. I ran Phi 3 on a phone. It was slow the first time, but I guess it cached something because then it went faster, as seen below. That's an iPhone 14 Pro. It's Phi 3 mini, 4k context. Via llama.cpp library

12/27
6. Phi-3 running locally on iPhone
@ac_crypto

13/27
Phi-3 running locally on an iPhone using MLX

Fully offline, it’s fast!

Credit to @exolabs_ team @mo_baioumy h/t @awnihannun for the speedy model impl in MLX

14/27
7. RAG with Phi-3 on
@ollama h/t
@ashpreetbedi

15/27
RAG with Phi-3 on @ollama: I don't trust the benchmarks, so I recorded my very first test run. Completely unedited, each question asked for the first time. First impression is that it is good, very very good for its size.

Try it yourself: phidata/cookbook/llms/ollama/rag at main · phidatahq/phidata
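Stripped of the framework, "RAG with Phi-3 on Ollama" boils down to: embed the documents, retrieve the one closest to the question, and stuff it into the prompt. A toy sketch with the ollama Python client (the embedding model name and the two-document corpus are placeholders; the phidata cookbook linked above is the full version):

    import ollama

    docs = ["Phi-3-mini has 3.8B parameters.",      # toy corpus
            "Llama 3 8B was released by Meta."]

    def embed(text):
        return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / ((sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5)

    question = "How many parameters does Phi-3-mini have?"
    q_emb = embed(question)
    best = max(docs, key=lambda d: cosine(q_emb, embed(d)))  # retrieve top document

    answer = ollama.chat(model="phi3", messages=[
        {"role": "user", "content": f"Context: {best}\n\nQuestion: {question}"},
    ])
    print(answer["message"]["content"])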

16/27
8. Phi-3-mini-128-instruct on MLX h/t
@ShellZero

17/27
Phi3-mini-128-instruct on MLX. Blazing

Prompt - 131.917 tps
Generation - 43.387 tps

M3 Max - 64GB memory.

@awnihannun #mlx

18/27
9. Phi 3 SLM with
@LMStudioAI on Windows h/t
@jamie_maguire1

19/27
Running the new Phi 3 SLM with @LMStudioAI on Windows.

I like it.

Only using 3GB of RAM.

20/27
10. Phi-3 on iPhone 15 Pro
@awnihannun

21/27
Using MLX Swift to generate text with 4-bit Phi-3 on iPhone 15 Pro.

Fully on device, runs pretty fast.

Example here: https://github.com/ml-explore/mlx-swift-examples Also all MIT!
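Most of the Apple-silicon demos above go through MLX's mlx-lm package, which makes the two-line version easy to try. A sketch (the 4-bit community conversion named here is an assumption; use whichever MLX build of Phi-3 you prefer):

    from mlx_lm import load, generate

    # Load a 4-bit MLX conversion of Phi-3-mini from the Hugging Face hub.
    model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-4bit")

    text = generate(model, tokenizer,
                    prompt="Write a haiku about small language models.",
                    max_tokens=100)
    print(text)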

22/27
If you want to keep up with the latest AI developments and tools, subscribe to The Rundown, it's FREE:

23/27
If you enjoyed this thread,

Follow me
@minchoi and please Bookmark, Like, Comment & Repost the first Post below to share with your friends:

24/27
Llama 3 surprised everyone less than a week ago, but Microsoft just dropped Phi-3, and it's an incredibly capable small AI model.

We may soon see 7B models that can beat GPT-4. People are already coming up with incredible use cases.

10 wild examples:

25/27
We are going to see many agentic workflow apps and designs, and then people running them on phones

26/27
Seeing a lot of focus shift towards smaller, mobile offline AI with these models.

27/27
Good coding benchmarks, thanks for sharing


 

bnew:




1/4
No one is talking about this major LLM from China.

2 days ago, SenseTime launched SenseNova 5.0, which according to the report (translated from Chinese):

- Beats GPT-4T on nearly all benchmarks
- Has a 200k context window
- Is trained on more than 10TB tokens
- Has major advancements in knowledge, mathematics, reasoning, and coding capabilities

Crazy how much is happening in the world of AI in China that's going completely under the radar.

2/4
H/t to
@Ghost_Z12 for spotting this.

Here's the source (it's in Chinese): 商汤甩出大模型豪华全家桶!秀拳皇暴打GPT-4,首晒“文生视频”,WPS小米现场助阵 - 智东西

3/4
Sounds like we need to accelerate

4/4
A new model is coming this year 100%, but not sure if it'll be called GPT-5

Sam Altman on the Lex Fridman pod in March:

'We will release an amazing model this year. I don't know what we will call it'







SenseTime launches SenseNova 5.0 with comprehensive updates and the industry-leading "Cloud-to-Edge" full-stack large model product matrix

2024-04-24

23 April 2024, Shanghai – SenseTime launched its latest Large Model, the SenseNova 5.0, at its Tech Day event in Shanghai. With its cutting-edge technology accelerating the development of Generative AI, SenseTime also launched the industry-leading "Cloud-To-Edge" full-stack large model product matrix that is scalable and applicable across various scenarios.

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, said, “In our pursuit to push the boundaries of SenseNova’s capabilities, SenseTime remains guided by the Scaling Law as we build upon our Large Model based on this three-tier architecture: Knowledge, Reasoning, and Execution (KRE)."


Dr. Xu Li, Chairman of the Board and CEO of SenseTime, introduced the advancements of the SenseNova 5.0 Large Model at the event.



SenseNova 5.0: Linguistic, creative and scientific capabilities greatly improved; multimodal interactions added



Since its debut in April 2023, the SenseNova Large Model has reached its fifth iteration. SenseNova 5.0 was trained on more than 10TB of tokens, covering a large amount of synthetic data. It adopts a Mixture-of-Experts architecture, enabling an effective context window of approximately 200,000 tokens during inference. The major advancements in SenseNova 5.0 focus on knowledge, mathematics, reasoning, and coding capabilities.

In terms of linguistic and creative capabilities, the creative writing, reasoning, and summarization abilities of SenseNova 5.0 have significantly improved. Given the same knowledge input, it provides better comprehension, summarization, and question answering, offering strong support for vertical applications such as education and the content industries. On the scientific side, SenseNova 5.0 boasts best-in-class mathematical, coding, and reasoning capabilities, providing a solid foundation for applications in finance and data analysis.

SenseNova 5.0 is also equipped with superior multimodal capabilities in product applications. It supports high-definition image parsing and understanding, as well as text-to-image generation. In addition, it can extract complex data across documents and summarize answers to questions, giving it strong multimodal interaction capability. At present, SenseNova 5.0's graphical and textual perception ranks first by aggregate score on MMBench, an authoritative multimodality benchmark. It has also achieved high scores on other well-known multimodal leaderboards such as MathVista, AI2D, and ChartQA.



The industry-leading full-stack large model edge-side product matrix

SenseTime also launched the industry-leading edge-side full-stack large model product matrix, which includes the SenseTime Edge-side Large Model for terminal devices, and the SenseTime Integrated Large Model (Enterprise) edge device that can be applied to fields such as finance, coding, healthcare and government services.

The inference speed of the SenseNova edge-side large language model is industry-leading: it can generate 18.3 words per second on mid-range platforms, and an impressive 78.3 words per second on flagship platforms.

The diffusion model has also achieved the fastest inference speed in the industry. The edge-side LDM-AI image diffusion technology generates an image in less than 1.5 seconds on a mainstream platform, supports output of high-definition images at resolutions of 12 megapixels and above, and provides image-editing functions such as proportional, free-form, and rotational image expansion.


SenseTime conducted a live demonstration of its SenseNova Edge-side Large Model on image expansion.

The SenseTime Integrated Large Model (Enterprise) edge device was developed in response to the growing demand for AI from key fields such as finance, coding, healthcare and government services. Compared to other similar products, the device performs accelerated searches at only 50 percent CPU utilization, and reduces inference costs by approximately 80 percent.

Innovating product applications in the AI 2.0 era with ecosystem partners to further boost productivity

SenseTime has partnered with Kingsoft Office since 2023, leveraging SenseNova Large Model to empower the latter’s WPS 365 as a smart office platform that boosts office productivity and overall efficiency.

In the financial sector, Haitong Securities and SenseTime jointly released a full-stack large model for the industry. Through the large model, both parties facilitated business operations in areas such as intelligent customer service, compliance and risk control, and business development office assistants. They also jointly explored cutting-edge industry applications such as smart investment advisory and aggregation of public sentiments, to realize the full-stack capability of large models in the securities industry.

In the transportation industry, SenseTime’s large model technology is deployed in the smart cabin of the Xiaomi SU7 vehicle, providing car owners with an intelligent and enhanced driving experience.

SenseTime firmly takes the lead into the AGI era with text-to-video in the pipeline

SenseTime also demonstrated a breakthrough with its text-to-video platform, where users will soon be able to generate a video from a detailed description or even a few phrases. In addition, characters' costumes, hairstyles, and scenarios can be preset to maintain the stylistic consistency of the video content.
 

bnew:



1/3
It's been a week since LLaMA 3 dropped.

In that time, we've:
- extended context from 8K -> 128K
- trained multiple ridiculously performant fine-tunes
- got inference working at 800+ tokens/second

If Meta keeps releasing OSS models, closed providers won't be able to compete.

2/3
Not yet, though I'm sure some version of it will be at some point!

3/3
I believe that’s just a product decision by the API providers. No reason that can’t be extended. At HyperWrite, we often offer 5000 to 10,000 token outputs to users.





1/1
It's been exactly one week since we released Meta Llama 3. In that time the models have been downloaded over 1.2M times, we've seen 600+ derivative models on @HuggingFace, and much more.

More on the exciting impact we're already seeing with Llama 3 A look at the early impact of Meta Llama 3








1/4
A 262k-token context finetune of Llama 3 8B:
2/4
The longer your input, the more memory you need.
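The growth is dominated by the KV cache, which scales linearly with context length. A rough sketch of the arithmetic using Llama-3-8B-class dimensions (32 layers, 8 KV heads, head dim 128; fp16 cache assumed):

    # KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * bytes * tokens
    layers, kv_heads, head_dim = 32, 8, 128
    bytes_per_value = 2   # fp16

    def kv_cache_gb(tokens):
        return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens / 1e9

    for ctx in (8_192, 32_768, 262_144):
        print(f"{ctx:>7} tokens -> {kv_cache_gb(ctx):.1f} GB")
    # ~1.1 GB at 8k, ~4.3 GB at 32k, ~34 GB at 262k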

3/4
If the post-finetuning is done well, the model will remain good at 8k and be quite good beyond that.

4/4
Nobody knows. But RoPE scaling is quite effective even without post-pretraining. I think the best quality will be in the official long-context model. In the meantime, such unofficial models will do the job.
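RoPE scaling of the kind mentioned here can be tried directly from transformers. A hedged sketch ("dynamic" NTK scaling is one supported type; the factor is illustrative, and quality will generally trail a model actually post-trained at the longer length):

    from transformers import AutoModelForCausalLM

    # Stretch the RoPE positions ~4x at load time, with no finetuning.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct",
        rope_scaling={"type": "dynamic", "factor": 4.0},  # illustrative settings
    )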



bnew:



1/3
Apple Has Open-Sourced Their On-Device Language Models And They Aren't Very Good!

Apple has uncharacteristically been open-sourcing its work around language models! Kudos to them for that

However, their models are really bad. Compare the 3B model's MMLU, which is 24.8, to Phi-3-mini's MMLU, which is 68.8!

Apple's models are not useful in the real world, with an MMLU of 24.8!

Thanks to open source, they can use Phi-3 from Microsoft on their devices—at least until they train better small models in the future.

2/3
Yes, it's more about the new arch than anything else...but I think it would be better if they had SOTA numbers...

It could also be that they got scooped by the Phi-3 folks

3/3
We should welcome Apple to the open-source ecosystem.

IMO, they just got scooped


 