Y'all heard about ChatGPT yet? AI instantly generates question answers, entire essays etc.

bnew · Sep 14, 2024

1/31
@AravSrinivas
Reply to this thread with prompts where you feel o1-preview outperformed sonnet-3.5 that’s not a puzzle or a coding competition problem but your daily usage prompts.

2/31
@RubberDucky_AI
o1-mini impressed me here.

Reusable Template for MacOS application that has CRUD, Menu, Canvas and other standard expected features of an application. Light and Dark Mode. Primary purpose will be an information screen that brings data in from other applications, web, os etc. create hooks. Be thorough.

ChatGPT

3/31
@AravSrinivas
This is a good one. Thanks for sharing. I tried the same query on Perplexity Pro. o1-mini's answer is more thorough. The planning ability does shine here. Weirdly, o1-mini might even be better than o1-preview—probably more test-time inference. https://www.perplexity.ai/search/reusable-template-for-macos-ap-Qmp_uehNS5250irMfxSHOQ

4/31
@shreyshahi
“<description of idea>. Make a detailed plan to implement and test <this idea>. Do you think <this idea> will work? Please write all the code needed to test <the idea> and note down the key assumptions you made in the implementation.” O1-preview beat sonnet few times I tried.

5/31
@AravSrinivas
Any specific examples ?

6/31
@deedydas
Not exactly daily usage, but for fermi problems of estimation, the thought process was a lot better on o1-mini than 4-o

o1mini said 500k ping pong balls fit in a school bus and 4-o said 1m and didn’t account for packing density etc

7/31
@AravSrinivas
Man, I specifically asked for non puzzles! I am trying to better understand where o1-preview will shine for real daily usage in products beyond current models. Anyway, I asked Perplexity (o1 not there yet) and it answered perfectly fine (ie with a caveat for the first).

8/31
@primustechno
answering this question

9/31
@AravSrinivas
I got it done with Perplexity Pro btw https://www.perplexity.ai/search/describe-this-image-LXp6Ul09Ruu..GICVP.fNA

10/31
@mpauldaniels
Creating a list of 100 companies relevant info “eg largest companies by market cap with ticker and URL”

Sonnet will only do ~30 at a time and chokes.

11/31
@AravSrinivas
That's just a context limitation.

12/31
@adonis_singh
is this your way of building a router, by any chance?

13/31
@AravSrinivas
no, this is genuinely to understand. this model is pretty weird, so I am trying to figure it out too.

14/31
@ThomasODuffy
As you specifically said "coding completion" - but not refactoring:

I uploaded a 700 line JavaScript app, that I had asked Claude 3.5 to refactor into smaller files/components for better architecture. Claude's analysis was ok... but the output was regressive.

o1-Preview nailed it though... like it can do more consideration, accurately, in working memory, and get it right... vs approximations from step to step.

It turned one file into 9 files and maintained logic with better architecture. This in turn, unblocked @Replit agent, which seemed to get stuck beyond a certain file size.

To achieve this, o1-preview had to basically build a conceptual model of how the software worked, then create an enhanced architecture that included original features and considered their evolution, then give the outputs.

15/31
@JavaFXpert
Utilizing o1-preview as a expert colleague has been a very satisfying use case. As with prior models, I specify that it keep a JSON knowledge graph representation of relevant information so that state is reliably kept. o1-preview seems to plan tasks better and give more accurate feedback.

16/31
@Dan_Jeffries1
It's not the right question.

I'd have to post a chain of prompts.

Basically I was building a vLLM server wrapper for serving Pixtral using the open AI compatible interface with token authentication.

Claude got me very far and then got stuck in a loop not understanding how to get the server working as it and I waded through conflicting documentation, outdated tutorials, rapid code updates from the team and more. I was stuck for many hours the day before o1 came out.

o1 had it fixed and running in ten rounds of back and forth over a half hour, with me doing a lot of [at]docs [at]web and [at]file/folder/code URL to give it the background it needed.

Also my prompts tend to be very long and explicit with no words like "it" to refer to something. These are largely the same prompts I use with other models too but they work better with o1.

Here is an example of a general template I use in Cursor that works very well for me.

"We are working on creating a vLLM server wrapper in python, which serves the Pixtral model located at this HF [at]url using the OpenAI compatible API created by vLLM whose code is here [at]file(s)/folder and whose documentation is here [at]docs. We have our own token generator and we want to secure the server over the web with it and not allow anyone to use the model without presenting the proper token. We are receiving this error [at]file and the plan we have constructed to follow is located here for your reference [at]file (MD file of the plan I had it output at the beginning. I believer the problem is something like {X/Y/Z}. Please use your deep critical thinking abilities, reflect on everything you have read here and create an updated set of steps to solve this challenging problem. Refine your plan as needed, do your research, and be sure to carefully consider every step in depth. When you make changes ensure that you change only what is necessary to address the problem, while carefully preserving everything else that does not need to change."

17/31
@afonsolfm
not working for me

18/31
@default_anton
Are there any security issues in the following code?

<code>
…
</code>

—-

With such a simple prompt, o1-preview is able to identify much more intricate issues.

3.5 Sonnet can identify the obvious ones but struggles to find the subtle ones.

FYI: I’ve tried different variations of the following idea with sonnet:

You are a world-class security researcher, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.

19/31
@adonis_singh
Anything with school and understanding a certain topic with examples.

For example with math:

Explain the topic of vector cross products for a DP 2 student in AAHL, using example questions that are actually tough and then quiz me.

o1 does much better than any other model

20/31
@danmana
Tried o1, GPT-4o, and Claude on a slow PostgreSQL query I optimized (complex SQL with joins, filters, geometry intersections).
Gave them the query, explain plan, tables, indexes.

My solution: 50% faster
Claude: 30% slower
GPT-4o: Bugged, and after fixing with Claude, 280% slower

O1 was the only one to precompute PostGIS area fractions (like my solution) and after self-fixing a small mistake, it was 10% faster.

With materialized CTE instead of subselect, it could've matched my 50% reduction.

21/31
@quiveron_x
It's light years ahead of sonnet 3.5 in every sense:
It's failed for my needs but it captured something that no other LLM captured ever, it suggested that we go chapter by chapter and I break down this to step by step manner. This isn't the impressive part btw. I will explain that in next post.

22/31
@boozelee86
I dont have acces to it , i could make you something special

23/31
@vybhavram
I saw the opposite. Claude sonnet-3.5 did a better job fixing my code than o1-mini.

Maybe mini excels at 0 to 1 project setup and boilerplate coding.

24/31
@MagnetsOh
Propose an extremely detailed and comprehensive plan to redesign US high school curriculums, that embraces the use of generative AI in the classroom and with homework.
- Ditto for LA traffic.
- Just my curiosity of course.

25/31
@lightuporleave
Exactly. Your agentic use of Gpt4o or Sonnet is comparable or better. And very few even realized perplexity had this capability for quite some time.

26/31
@CoreyNoles
I don’t have the prompt handy, but it did a very impressive job of analyzing and improving the clarity, accuracy, and efficiency of our system prompts. They’re not all things that will be visible on the front end, but significant on the backend.

27/31
@opeksoy
wow, 11hrs… only this ?

28/31
@BTurtel
"How many letters are in this sentence?"

29/31
@ashwinl
Finding baby names across different cultures is still troublesome.

Sample prompt:

“We are parents of a soon to be newborn based in New York City. We are looking for boys names that have a root in Sanskrit and Old Norse or Sanskrit and Germanic. Can you provide a list of 20 names ranked by ease of pronunciation in Indian and European cultures?”

30/31
@cpdough
huge improvement as an AI Agent for data analysis

31/31
@primustechno
i could probably add more to this long-term, but much of it would be personal preference (diction/focus/style)

but in my experience GPT beats Claude for most of my use cases most of the time (brainstorming, math, creative/essay writing, editing, recipes, how tos, summary etc)

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

Lv99 Slacker · Sep 15, 2024

Musician charged with $10M streaming royalties fraud using AI and bots

North Carolina musician Michael Smith was indicted for collecting over $10 million in royalty payments from Spotify, Amazon Music, Apple Music, and YouTube Music using AI-generated songs streamed by thousands of bots in a massive streaming fraud scheme.

www.bleepingcomputer.com

North Carolina musician Michael Smith was indicted for collecting over $10 million in royalty payments from Spotify, Amazon Music, Apple Music, and YouTube Music using AI-generated songs streamed by thousands of bots in a massive streaming fraud scheme.

According to court documents, Smith fraudulently inflated music streams on digital platforms between 2017 and 2024 with the assistance of an unnamed music promoter and the Chief Executive Officer of an AI music company.

He acquired hundreds of thousands of songs generated through artificial intelligence (AI) from a coconspirator and uploaded them to these streaming platforms. He then used automated bots to stream the AI-generated tracks billions of times.

To avoid detection by the streaming platforms’ anti-fraud systems, Smith ensured his bots accessed the platforms using virtual private networks (VPNs).

On October 4, 2018, he emailed his coconspirators to say, "in order to not raise any issues with the powers that be we need a TON of content with small amounts of Streams."

He also said, "We need to get a TON of songs fast to make this work around the anti fraud policies these guys are all using now."

At the peak of his operation, Smith allegedly employed over 1,000 bot accounts to artificially boost streams across various platforms. On October 20, 2017, Smith emailed himself a financial breakdown outlining how he operated 52 cloud services accounts, each with 20 bot accounts, totaling 1,040 bots.

He also estimated that each account could stream approximately 636 songs per day, resulting in around 661,440 streams daily. With an average royalty rate of half a cent per stream, Smith calculated that the daily earnings would reach $3,307.20, monthly earnings of $99,216, and annual earnings exceeding $1.2 million.

By manipulating streaming data, Smith fraudulently collected more than $10 million in royalty payments after his bots streamed hundreds of thousands of AI-generated songs billions of times. In a February 2024 email, he boasted that his songs generated "over 4 billion streams and $12 million in royalties since 2019."

"Through his brazen fraud scheme, Smith stole millions in royalties that should have been paid to musicians, songwriters, and other rights holders whose songs were legitimately streamed," said U.S. Attorney Damian Williams.

Smith now faces charges of wire fraud, wire fraud conspiracy, and money laundering conspiracy, each carrying a maximum sentence of 20 years in prison.

bnew · Sep 16, 2024

o1-preview made a 3d FPS game fully in HTML. I have zero coding skills so it took a few tries but eventually it worked!

bnew · Sep 17, 2024

1/1
We appreciate your excitement for OpenAI o1 and we want you to be able to use it more.

For Plus and Team users, we have increased rate limits for o1-mini by 7x, from 50 messages per week to 50 messages per day.

o1-preview is more expensive to serve, so we’ve increased the rate limit from 30 messages per week to 50 messages per week.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

JadeB · Sep 17, 2024

bnew said:
1/1
We appreciate your excitement for OpenAI o1 and we want you to be able to use it more.

For Plus and Team users, we have increased rate limits for o1-mini by 7x, from 50 messages per week to 50 messages per day.

o1-preview is more expensive to serve, so we’ve increased the rate limit from 30 messages per week to 50 messages per week.

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

Is ChatGPT down today? I couldn't use it at all

Bondye Vodou · Sep 17, 2024

Y'all ever heard of turnit in? It helps lecturers, teachers, professors etc detect whether you've used chatgpt or any other type A.I. :ufdup:

Jblaze204 · Sep 17, 2024

Bondye Vodou said:
Y'all ever heard of turnit in? It helps lecturers, teachers, professors etc detect whether you've used chatgpt or any other type A.I.

my professor had our class use turnit in last semester. all my lab memebers used chat to write the abstract and discussion section of the lab reports. Nobody got flagged.

bnew · Sep 17, 2024

Bondye Vodou said:
Y'all ever heard of turnit in? It helps lecturers, teachers, professors etc detect whether you've used chatgpt or any other type A.I.

That tech is broken since LLM can be instructed to write in any style. Too many false positives already , people on reddit constantly complaining how similar services are incorrectly assessing original text as ai generated text.

IIVI · Sep 17, 2024

Regarding school, I don't know what you can really do about it as a professor when it comes to assignments that these do well on.

Professors just need to accept that they can never reliably parse who's using A.I to do all or a significant amount of the assignments or it'll be an unnecessary hassle for students like "record yourself doing the homework".

At the end of the day though:
Classes based on passing exams are still the best way to evaluate based on merit.

Someone who actually takes time to learn the material, does their work and studies the material will perform better on these exams vs somebody who relies on ChatGPT to do their homework. That's always been the case. People who cheat and therefore cheat themselves will be exposed when it comes to prove it.

WIA20XX · Sep 17, 2024

IIVI said:
Professors just need to accept that they can never reliably parse who's using A.I to do all or a significant amount of the assignments or it'll be an unnecessary hassle for students like "record yourself doing the homework".

In Europe they do oral exams in HS/College/University.

That's far too much work for lazy American Professors and Students.

That said, all that proper education and their economies are lacking. Why even go to school?

Jblaze204 · Sep 17, 2024

IIVI said:
Regarding school, I don't know what you can really do about it as a professor when it comes to assignments that these do well on.

Professors just need to accept that they can never reliably parse who's using A.I to do all or a significant amount of the assignments or it'll be an unnecessary hassle for students like "record yourself doing the homework".

At the end of the day though:
Classes based on passing exams are still the best way to evaluate based on merit.

Someone who actually takes time to learn the material, does their work and studies the material will perform better on these exams vs somebody who relies on ChatGPT to do their homework. That's always been the case. People who cheat and therefore cheat themselves will be exposed when it comes to prove it.

As a current student I agree. My only gripe are the professors who make the exams excessively difficult. I've had professor put stuff on the exam that's not on the slides or HW. It be deep in the cut in the textbook and mind you the class doesn't use a textbook (zero cost) :dahell:

bnew · Sep 18, 2024

1/11
@lmsysorg
No more waiting. o1's is officially on Chatbot Arena!

We tested o1-preview and mini with 6K+ community votes.

o1-preview: #1 across the board, especially in Math, Hard Prompts, and Coding. A huge leap in technical performance!

o1-mini: #1 in technical areas, #2 overall.

Huge congrats to @OpenAI on this incredible milestone! Come try the king of LLMs and vote at http://lmarena.ai

More analysis below

[Quoted tweet]
Congrats @OpenAI on the exciting o1 release!

o1-preview and o1-mini are now live in Chatbot Arena accepting votes. Come challenge them with your toughest math/reasoning prompts!!

2/11
@lmsysorg
Chatbot Arena Leaderboard overview.

@openai's o1-preview #1 across the board, and o1-mini #1 in technical areas.

3/11
@lmsysorg
Win-rate heat map

4/11
@lmsysorg
Check out full results at http://lmarena.ai/leaderboard!

5/11
@McclaneDet
Given the latency time a human with Google could be o1. Be careful out there folks (especially check writers).

6/11
@_simonsmith
"AI is hitting a wall."

7/11
@axel_pond
very impressive.

thank you for your great service to the community.

8/11
@QStarETH
Math is the key to unlocking the secrets of the universe. We have arrived...

9/11
@Evinst3in
@sama after o1's is officially #1 across the board on Chatbot Arena

10/11
@JonathanRoseD
It seems like the new LLM meta is going to be training models on CoT strategies and relying on agents in the LLM clients. This has implications. Like, should @ollama consider preemptively adding CoT agents for future supporting models?

11/11
@andromeda74356
Can you add a feature where the user can give some text, you convert it to an embedding, and then show how models rank when only using chats that are close to that embedding, so we can see which models are best for our specific use cases?

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Sep 19, 2024

1/12
@risphereeditor
OpenAIs O1 Mini is really good at coding. It made this weather app in one shot. I've removed the API and replaced it through example values, but if I had an API key it would be a functioning weather app!

2/12
@risphereeditor
A search example:

3/12
@justinmcauley
Did you give the UI design to O1? Or did you just describe the design in words in the prompt? Looks amazing!

4/12
@risphereeditor
I just said that it should have glassmorphism boxes, colorful icons and radial blue, pink and yellow gradient for the design.

5/12
@sepehrnetrunner
The design is better than my phone weather app

6/12
@risphereeditor
Haha yeah! The UI is really good. You just need to have a idea and turn it into words.

7/12
@claudechatgpt
You don't need api key, you can use some free sdk

8/12
@risphereeditor
That's also true.

9/12
@cryptonymics
What did you code on?

10/12
@risphereeditor
VS Code, but I told it to put everything into one file, so I just inputted it quickly into the W3schools tryit editor.

11/12
@PixelWiseAI
cool

12/12
@risphereeditor
It is really good!

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Sep 19, 2024

1/11
@SayWen_eth
Yesterday I wanted to test @OpenAI's new o1 model out so I took a stab at creating a Mario-esque platformer using some @unofficialmfers pixel assets I made a while back.

I'm impressed with the model's speed and capability to problem solve while writing long sections of code. I have no experience with other models, but I can say it's much better than 4o.

4 hours from beginning to end and much of that was spent hastily making assets and getting the model to design the level, which was tougher than you'd think...still needs work and probably a way to make it manually tbh, but maybe we figure that out too.

Will see how it does in the next few days, but generally impressed how far I was able to get in a morning. Just a hobby project, but I'm going to continue to make assets and maybe change themes and gameplay, so it turns into a bit of its own thing.

Let you all know where I am in a few days to a week.

2/11
@thegadgetsfan
Looking forward to updates.

3/11
@SayWen_eth

4/11
@marka_eth
good stuff jon - vanilla js or did it suggest an engine like excalibur.js?

5/11
@SayWen_eth
thanks marka, and yes. It did suggest some stacks, but I confined it to html css javascript at the start. Was basically just looking to test what o1 can do.

6/11
@OCMfer
This is amazing. Excited to learn along with you! I want to make a 2D platformer with a bunch of assets I have as well..

Mfers build what they want, now with AI

🫡

7/11
@SayWen_eth
Excellent! If you do, don’t hesitate to share!

8/11
@new_discord_tea
Can you recommend some contentment to learn all these

9/11
@SayWen_eth

[Quoted tweet]
A week ago I couldn't spell html.

This week I was able to code a simple @littlepxldoods game using A.I.

I wanted to share my experience in case there is anybody else out there who is a 0 out of 100 when it comes to programming.

10/11
@1HubAI
Amazing!

11/11
@harpt
Wow this is crazy!

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

bnew · Sep 19, 2024

1/2
@MarcusSchiesser
"Iphone style scientific calculator in one HTML file, using tailwind CSS and javascript" - below the result using a code generator and @OpenAI's o1-mini

Built using @llama_index workflows in TS

to @tanveer_sehgal for the idea!

GitHub - run-llama/app-creator: Code generator using LlamaIndexTS workflows with OpenAI o1 model for more model results

2/2
@1HubAI
Great work!

To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

Y'all heard about ChatGPT yet? AI instantly generates question answers, entire essays etc.

More options

bnew

Veteran

Lv99 Slacker

Pro

Musician charged with $10M streaming royalties fraud using AI and bots

bnew

Veteran

bnew

Veteran

JadeB

la force de l'avenir

Bondye Vodou

Proud practitioner of the "High Science"

Jblaze204

All Star

bnew

Veteran

IIVI

Superstar

WIA20XX

Superstar

Jblaze204

All Star

bnew

Veteran

bnew

Veteran

bnew

Veteran

bnew

Veteran

Similar threads