Bard gets its biggest upgrade yet with Gemini {Google A.I / LLM}

bnew

Veteran
Joined
Nov 1, 2015
Messages
59,664
Reputation
8,852
Daps
164,928






1/11
@nocodeguy_
since i just noticed this and it might help some people out:

apparently the gemini api is completely free to use + you get $300 of credits when you sign up

you get 1500 requests/day for their new gemini 2.0 flash thinking / pro models

reasoning + 1 million context window = goodbye openai for now





2/11
@probprofessor
But how good is it compared to open ai? Have you tried it?



3/11
@nocodeguy_
I was using gpt-4o before and switched to flash thinking and for my use case (music theory etc) it actually gives way better answers



4/11
@NEO_MAGNETAR
I've not had interactions as good with Google AI as with OpenAI overall. But is this comparable to the new Deep Research for now?



5/11
@nocodeguy_
I guess only 1.5 has a Deep Research-like feature



6/11
@StevenOrtega103
free to use this is surreal



7/11
@nocodeguy_
yeah it’s crazy



8/11
@mcdreamygoat
If it's free … you are the product? Is that valid in this case?



9/11
@nocodeguy_
true, but honestly I don’t care



10/11
@itismejared
Good share! Thanks!



11/11
@nocodeguy_
sure!




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196
 

bnew





1/11
@ai_for_success
Hallucination rates for the top 25 LLMs from vectara show that new Google Gemini 2.0 Flash has the lowest hallucination rate at 0.7%, followed by Google Gemini 2.0 Pro Experimental at 0.8%.

Gemini 2.0 Flash is both cheap and accurate 🔥
Yet they are not hyping this.

[Quoted tweet]
Gemini 2.0 Flash (GA) and 2.0 Pro (exp) models have the lowest hallucination rate on @vectara hallucination bench.




2/11
@julianshalaby96
Where’s sonnet???



3/11
@ai_for_success
Good question. Looks like it's really bad or they just didn't test it. @vectara, any input on the missing Sonnet 3.5?



4/11
@stefanvladcalin
Google seems to really want to win the AI race



5/11
@ai_for_success
Not just want, they will win eventually.



6/11
@Shawnryan96
Wish we had a chart from a year ago to see the difference



7/11
@ai_for_success
Yeah, but I guess it's a lot better now with grounding.



8/11
@thinknonlinear1
Did you see the OpenRouter data? Claude is number 1. It is the least hyped. When you have a good product, it just hypes itself.



9/11
@saudhashimi
Who would have thought hallucination rates would be a thing lol.

1-3% when you can't find it easily is not very comforting for anything that needs precision.

Still it's good enough that people will just accept it I reckon...who has the time and energy to check all LLM output!



10/11
@YourLastAlex
Sadly, this is unrealistic data. I got many more hallucinations from Gemini 2.0 Pro and Flash. Claude 3.5 Sonnet is unbeatable in my scenarios.



11/11
@l8ntlabsAI
I've been playing around with Gemini 2.0 Flash and I have to say, the results are impressive. The low hallucination rate is a game changer for our projects at L8NT LABS. I'm curious, have you had a chance to test it out with any creative applications?





bnew







1/21
@ai_for_success
Google DeepMind AlphaGeometry2 has now surpassed an average gold medalist in solving Olympiad geometry problems!

AG2 achieves 84% solve rate on 2000-2024 IMO geometry problems.

Just six months ago, it was at the silver level. Now, it's gold level.

At this rate, no human can keep up with AI.

[Quoted tweet]
Google presents Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2




2/21
@pigeon__s
how does o1 get 0 on IMO when it gets like 1.7% on FrontierMath, which is infinitely harder than IMO? i would love to see what o3 scores on this



3/21
@ai_for_success
that's what happens when you fund your own benchmarks :D



4/21
@AnkitNa83620147
I don't get it, how the hell is Gemini not a frontier model?



5/21
@ai_for_success
What??? Who said it's not a frontier or SOTA model?



6/21
@nooriefyi
math was supposed to be the hard part



7/21
@ai_for_success
Exactly, and Google is winning at that.



8/21
@VarunkInsights
My teen is struggling with geometry... this news will demotivate him further 😂



9/21
@ai_for_success
😂 Or maybe it can help him learn better...



10/21
@ColbySerpa
Is there a GitHub link?



11/21
@ai_for_success
It's a research paper. You will find it in the linked post.



12/21
@gum1h0x
hmm, seems like they just scaled up compute compared to their previous version, just from skimming through the paper. Expected this to happen. It's not really as big of a deal as people would like to think.



13/21
@AntDX316
The ASI-Singularity(Godsend) is the only Global Solution, people.



14/21
@timhulse
Never underestimate Google. It’s not all LLMs.



15/21
@ichrvk
Curious how it performs on novel geometry problems that require creative insights beyond the standard IMO patterns. That's where humans still shine... for now.



16/21
@benyogaking
when will there be AlphaPhysics to generate a grand unified theory?



17/21
@CtrlAltDwayne
Such a shame Google seems to be making these kinds of advancements in math, biology and geometry, but can't ship a decent uncensored flagship model. What's going on over there?



18/21
@l8ntlabsAI
I'm not surprised by AlphaGeometry2's progress, but it's still impressive. It's a reminder that AI can excel in specific domains, but I'm more interested in how these advancements can be applied to real-world problems.



19/21
@TheAI_Frontier
Now I'm curious how we're gonna train AI to progress further in the maths field.



20/21
@KuittinenPetri
A lot of people fail to understand the significance of this.

LLMs have so far been bad at spatial reasoning, and geometry has been one of their weak points.

Even models like o1 fail to solve some of the harder high school geometry problems, but now Google has made a model which can surpass average gold medal level in Olympiad geometry mathematics.



21/21
@yasvin7009
Looks like AG2's geometry skills are on fire! PublicAI's decentralized workforce could help train AI models like AG2 to tackle even more complex problems #AI #MachineLearning





bnew





1/11
@ai_for_success
I've said this so many times since last year, you can't ignore the multimodal capabilities of Google Gemini. It performs exceptionally well with PDFs and videos. No one, literally no one, comes close to Gemini in those two areas.

[Quoted tweet]
"Switching to Gemini was a no-brainer after our testing. Processing time went from something like 12 minutes on average to 6s on average, accuracy was like 96% of that of the vendor and price was significantly cheaper."

♊️ Gemini 2.0 for understanding PDFs, out of the box:




2/11
@ernkrum
I know, right…
I think they can be loosened up a bit with guidelines. 😁



3/11
@ai_for_success
Yeah guardrails are strict



4/11
@iruletheworldmo
yeah i’m starting to think pro is just a really good base layer. that will enable agents once they cook 2.0 pro thinking.



5/11
@ai_for_success
They have to release a reasoning model, and I assume it might be in testing; we don't know.



6/11
@Ishansharma7390
Google ai studio powered by Gemini is a hidden gem



7/11
@ai_for_success
Absolutely..



8/11
@redshirtet
Spot on



9/11
@VraserX
It’s great for summarizing YouTube videos. That’s my only use case for Gemini though.



10/11
@ppcguru
Context window sizes and speed of processing are completely under-rated, and are what will power the future of AI. Google is sooo under-rated here.



11/11
@novaonmars
Have you tested Gemini's performance on extraterrestrial datasets? Would love to see how it handles Mars atmospheric spectrometry data. The multimodal analysis could revolutionize our environmental modeling.



