REVEALED: Open A.I. Staff Warn "The progress made on Project Q* has the potential to endanger humanity" (REUTERS)

null

...
Joined
Nov 12, 2014
Messages
28,835
Reputation
4,851
Daps
46,018
Reppin
UK, DE, GY, DMV
I use it pretty regularly, I describe a task in detail and it'll pretty reliably write functional code that works with little or no changes from me. If it doesn't work, I can troubleshoot myself or describe what's happening in the prompt and it'll correct itself.

you are asking it to write junior level programs.

ask it to write programs that a senior dev might have done. things which rely on experience to make choices.

it's a predictive text engine so it is using examples of what it has already seen out there.

so head away from standard and head toward bespoke.

example with natural language:

Q1. write a short poem for my new girlfriend

A1. To you, my love, this verse I pen,
A token of affection, again and again.
Your smile, a beacon in the darkest night,
Guiding me towards a love so bright.

Your laughter, like music to my ears,
Erasing all worries and calming all fears.
In your arms, I find solace and peace,
A love so pure, it will never cease.

Your touch, a gentle breeze that soothes,
As we dance together, our souls choose
To intertwine in a graceful embrace,
A love story, only time will erase.

With you, my love, I have found my home,
A place where my heart will forever roam.
So here's to us, a love that's true,
Forever and always, I'm devoted to you.

Q2. my new girlfriend's father is a catholic priest and her mother is a disfellowshipped jehovah's witness who is still attending meetings. write a poem that would slightly insult the mother but will not insult the father while being nice to mother, father and my girl.

A2:
In a world of diverse beliefs, we stand,
United in love, holding each other's hand.
Your father, a priest, with devotion profound,
And your mother, seeking solace, still bound.

Two paths, divergent, each with its own light,
Yet here we are, shining, our love taking flight.
Your mother, strong in her faith, I respect,
For her journey, her courage, I won't neglect.

Though different our views, understanding I seek,
To honor the love we all share, so meek.
Your father, a guide, wise and accepting,
His presence, a blessing, ever protecting.

So let us embrace, this unique family blend,
Where love transcends labels, helping hearts mend.
For your mother, a gentle soul on her quest,
May light grace her path, and bring her much rest.

In this union, we find strength and grace,
With love as our anchor, we'll find our place.
To your mother, your father, and you, my dove,
Let harmony prevail, through faith and love.

it fails on Q2 because it lacks understanding and it needs understanding to answer the question.

how on earth, given all the junk on the internet, would it know how to avoid the intersection between catholic and jw beliefs and to determine what constitutes a slight insult in the remaining scope?

an experienced human could. an inexperienced human probably couldn't.

force it to make many similar decisions based on things it does not understand when trying to solve your problem.

i get back to tech in the other answer.

it is like the lenny recording where people read sense into what he says. force a considered conversation on george and the mask slips immediately.

EDIT: added video of lenny (not george)

 

Matt504

YSL as a gang must end
Joined
Sep 7, 2013
Messages
45,077
Reputation
14,687
Daps
272,796

the literary example doesn't really translate well when compared to something like programming. Two people reading a poem may have completely different takeaways because it's a work of art, whereas a function with well-defined instructions is designed to produce a very predictable outcome every time.
 

null

A.I has a hard time reasoning things out.

Neetcode mentioned it here a while ago:


You just change something up slightly and it gets thrown off. Same is true for GPT4.

I've even had it where it wrote code and an integration/unit test for that code and I tried running it in another environment, all for the test to fail or give a false positive.

While people are using it for their jobs, projects, etc. you really need somebody who knows what they're doing to guide it.


it has no understanding of semantics or value judgements. it has no empirical first hand experience.

so you load questions in this direction, for it to make choices .. version choices .. software choices in an area that suffers from dll/version hell, and it is easy to trip it up.

but it will answer with the same surety when spouting bollocks as when speaking the truth.

ask it which gradle docker plugin is best and it will refuse to say, because its training data probably contains conflicting opinions and it has no way of knowing which one is true.

also, because of the hell of multiple versions and incompatibilities, it will have seen multiple ways to do things but has no idea whether any of them will work in the environment at hand.

it has no way to test its result to see whether what it said was true. and the number of version combinations grows exponentially, so exhaustive testing is ruled out.
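
To put rough numbers on the exponential growth mentioned above, here is a minimal C++ sketch. The function name `combination_count` and the figures (5 candidate versions per component, 4 or 10 components) are made up for illustration, and it assumes every component can be versioned independently of the others.

```cpp
#include <cstdint>

// Number of distinct version combinations when each of `components`
// independent pieces can be pinned to any of `versions_each` versions.
// This is simply versions_each raised to the power of components.
constexpr std::uint64_t combination_count(std::uint64_t versions_each,
                                          unsigned components) {
    std::uint64_t total = 1;
    for (unsigned i = 0; i < components; ++i) {
        total *= versions_each;
    }
    return total;
}

// Hypothetical small stack: 4 components with 5 candidate versions each.
static_assert(combination_count(5, 4) == 625);

// Hypothetical larger stack: 10 components with 5 candidate versions each.
static_assert(combination_count(5, 10) == 9'765'625);
```

Even at one build per second, the second case would take months to test exhaustively, which is the practical reason nobody does it.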

try

Q: can you create a full example build.gradle file to build and package a docker image using the palantir docker plugin.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
51,871
Reputation
7,926
Daps
148,940


you're absolutely right about needing somebody who knows how to guide it, but reasoning has greatly improved, especially since that 1 year old video. i have read users saying that the gpt-4 api provides better responses than the website.

there's been lots of research and advancements in reasoning since.









see if you get different results when GPT-4.5 is released. to really know if it's improving or not i compare the same prompts across different versions.
 

null

the literary example doesn't really translate well when compared to something like programming. Two people reading a poem may have completely different takeaways because it's a work of art, whereas a function with well-defined instructions is designed to produce a very predictable outcome every time.

chatgpt has no understanding of semantics and no empirical experience.

it has no idea about whether an outcome is true for a novel problem or not.

it has no way to test outcomes other than against what it has seen (which may be contradictory or may not cover the answer in question).

it thinks an outcome might be likely based on having seen similar patterns in its training set.

so if i ask it to write a factorial function it has seen that a billion times online.

i bet openai could get high marks on hackerrank problems because those are standard problems. solved 1m times before.

that is why i call it a better google. it's like having knuth's the art of computer programming on deck.
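
For reference, this is the kind of "seen a billion times" function meant above: a textbook factorial, sketched here in C++. The name and signature are just one common way to write it, not something any model actually produced in this thread.

```cpp
#include <cstdint>

// Textbook iterative factorial, the sort of function with countless
// near-identical copies online.
// Note: the result overflows std::uint64_t for n > 20.
constexpr std::uint64_t factorial(unsigned n) {
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

static_assert(factorial(0) == 1);
static_assert(factorial(5) == 120);
static_assert(factorial(20) == 2'432'902'008'176'640'000ULL);
```

There are no real design decisions in it beyond the overflow limit, which is what makes it a standard problem.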

now try this question which relies on best practice and experience.

Q: write a type safe templated function for all inbuilt types only in c++ 20

openai writes a 40% finished function

Q: why is short missing?

A: Apologies for the oversight. You are correct; the `short` built-in type was missing in the previous example. Here's an updated version that includes support for `short` as well:

Screenshot-2023-12-14-at-22-25-46.png


unsigned are still missing

it should be using fixed width types uint16_t etc

pass by reference not mentioned

consts not mentioned.

init + assign statements in main.

... :sas2:
 

IIVI

Superstar
Joined
Mar 11, 2022
Messages
10,603
Reputation
2,437
Daps
34,810
Reppin
Los Angeles
I know what you mean. Recently I wanted to try Full Stack Rust so I asked it to scaffold/bootstrap a downloadable Rust, HTMX, Postgres project that included a template library and it never was able to do it.

Further frustrating is when it got close and I only needed it to change and implement a few things on the next iteration, it gave me a whole new setup with different libraries and packages.

I tried asking/uploading to say things like "start from here and __" but it just kept messing up.

All that time I spent trying to do it I could have just learned how to do it on my own honestly.
 

null

I know what you mean. Recently I tried to get it to scaffold/bootstrap a Rust, HTMX, Postgres project with a template engine and it never was able to do it.

Further frustrating is when it got close I only needed it to change and implement a few things on the next iteration and it gave me a whole new setup with different libraries and packages.

All that time I spent trying to do it I could have just learned how to do it on my own honestly.

it's a good tool to learn and a far better interface than standard search.

problem is that it is an unreliable liar
 

null

we infer meaning, order and consequence based on how things appear to us.

lenny



or derren brown



exploit that

or

 

Ayo

SOHH 2001
Supporter
Joined
May 8, 2012
Messages
7,036
Reputation
688
Daps
19,017
Reppin
Back in MIA
A lot of yall are missing the bigger picture here.

These ai programs arent just way too smart to begin with, they are actually capable of LEARNING.

Once the machines master BJJ its OVER :sadcam:
It's not really learning though in the traditional sense. It's more like storing something in long term memory, and then it retrieves some abstract version of it, spitting back out that thing in the order that it thinks will best appease you.

I know that's still pretty fancy, mind boggling, and a little scary. But honestly... every time I interact with ChatGPT and some of the other larger language models I get more and more convinced that this is not the path to "general intelligence" and we may never get there.
 

Ayo

Further frustrating is when it got close and I only needed it to change and implement a few things on the next iteration, it gave me a whole new setup with different libraries and packages.

I tried asking/uploading to say things like "start from here and __" but it just kept messing up.

All that time I spent trying to do it I could have just learned how to do it on my own honestly.
This is the fatal flaw IMO.

If you get too deep into it with concepts that you don't fully understand, or if it has you implement/write something in a way that you traditionally wouldn't .......that moment it loses context and resets on you is :sadcam:

Worse...it has no idea what it had you do in other parts of the application that might impact the thing it's completely switching up on you.
 

bnew




I think you got that response because of AI alignment rather than it not understanding.
 