bnew

Hughes Hallucination Evaluation Model (HHEM) leaderboard


This leaderboard (by Vectara) evaluates how often an LLM introduces hallucinations when summarizing a document.
The leaderboard uses the HHEM-2.1 hallucination detection model. The open-source version of HHEM-2.1 can be found here.
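For a concrete sense of what the leaderboard measures, here is a minimal sketch of document-grounded consistency scoring. It uses a general-purpose NLI model from Hugging Face as a stand-in judge; the model name, example texts, and scoring rule are illustrative assumptions, not the leaderboard's actual HHEM-2.1 pipeline.

```python
# Minimal sketch: score how well a summary is supported by its source document.
# NOTE: this uses an off-the-shelf NLI model as a stand-in judge; the actual
# leaderboard uses Vectara's HHEM-2.1, which documents its own loading code.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "microsoft/deberta-large-mnli"  # stand-in NLI checkpoint, not HHEM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

document = "DeepSeek said it will open-source five repositories next week."
summary = "DeepSeek announced it will release five open-source repos."

# Premise = source document, hypothesis = model-written summary.
# A high entailment probability suggests the summary is grounded in the source.
inputs = tokenizer(document, summary, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

# Look up the "entailment" index from the config instead of hard-coding it.
entail_idx = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]
print(f"P(summary supported by document) = {probs[entail_idx]:.3f}")
```

The leaderboard itself runs this kind of check at scale across many models and documents, reporting how often each model's summaries fail the consistency test.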
 

bnew





1/9
@deepseek_ai
🚀 Day 0: Warming up for #OpenSourceWeek!

We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.

These humble building blocks in our online service have been documented, deployed and battle-tested in production.

As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey.

Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.



2/9
@ujjwalthakur_
AGI



3/9
@AntDX316
👍



4/9
@joacodok
the real openai



5/9
@0xa8l
Do you guys have a business model? It seems you are just open sourcing your secrets!



6/9
@novita_labs
if you want to use DeepSeek R1 API for free

[Quoted tweet]
Still looking to integrate DeepSeek API? 🐋

Refer a friend to Novita and both earn $20 in DeepSeek API credits—up to $500 total!

Use it for your application, or dive into AI exploration. Either way, it's on us. 🤖

Get your first $20 here: shorturl.at/BR4vc

#NovitaReferral


7/9
@innocentamna12
Nice work by the team! Good progress being made.



8/9
@Putra_GPT
🔥🔥🔥



9/9
@Ekaeoq
I hope you people win





bnew

1/11
@deepseek_ai
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!

Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection

💡 With optimized design for modern hardware, NSA speeds up inference while reducing pre-training costs—without compromising performance. It matches or outperforms Full Attention models on general benchmarks, long-context tasks, and instruction-based reasoning.

📖 For more details, check out our paper here: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
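
For the replies below asking for a plain-English explanation: the rough idea is that instead of every token attending to every other token, the context is compressed into coarse block summaries and only a handful of the most relevant blocks are read at full resolution. The toy sketch below illustrates that pattern for a single query vector; it is an assumption-laden illustration, not DeepSeek's NSA implementation, and the sizes, the mean-pooling choice, and the top-k rule are all made up for clarity.

```python
# Toy sketch of the sparse-attention pattern the tweet describes:
# (a) coarse-grained compression of the context into block summaries, and
# (b) fine-grained selection of a few blocks to attend to at full resolution.
# Every size and design choice here is an illustrative assumption, not NSA itself.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, d, block, topk = 1024, 64, 64, 4

q = torch.randn(d)            # one query vector
k = torch.randn(seq_len, d)   # keys for the full context
v = torch.randn(seq_len, d)   # values for the full context

# Coarse-grained compression: mean-pool keys/values inside each block.
k_blocks = k.view(-1, block, d).mean(dim=1)   # (num_blocks, d)
v_blocks = v.view(-1, block, d).mean(dim=1)

# Fine-grained selection: rank blocks by query-block similarity, keep top-k.
block_scores = k_blocks @ q / d ** 0.5
selected = block_scores.topk(topk).indices

# Gather full-resolution keys/values only from the selected blocks.
idx = (selected[:, None] * block + torch.arange(block)).reshape(-1)
k_fine, v_fine = k[idx], v[idx]

# Attend over the small mixed set: selected fine tokens + all coarse summaries.
k_sparse = torch.cat([k_fine, k_blocks])
v_sparse = torch.cat([v_fine, v_blocks])
attn = F.softmax(k_sparse @ q / d ** 0.5, dim=0)
out = attn @ v_sparse

print(out.shape, f"attended over {k_sparse.shape[0]} of {seq_len} key positions")
```

Per the tweet, the real mechanism is also natively trainable end to end and designed around modern hardware, which this sketch does not attempt to capture.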





2/11
@thedaviddosu
cool!



3/11
@helenshi7788


[Quoted tweet]
🔥🔥🔥🔥🔥🔥🔥🌸🌸
Who does the Foreign Affairs of China represent?
Tariffs and affairs can be negotiated with Trump. Posting on X alone won't resolve the issue; Xi Jinping must negotiate with Trump.


4/11
@abdullahstwt
Can someone explain it in simple terms 😭😭



5/11
@ai_katana
Everyone is cooking today



6/11
@totosberlusconi
Now say it in words I can understand



7/11
@LucasOrganic
Holy shyt this shyts on Grok 3



8/11
@SandeepK118
Deepseek superb



9/11
@devnamipress
is it possible to buy API through 3rd party if locally deepseek is banned?



10/11
@JeyMarcabell






11/11
@xinma711
amazing, save more power



