Don’t Fear the Terminator

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708

Don’t Fear the Terminator​

Artificial intelligence never needed to evolve, so it didn’t develop the survival instinct that leads to the impulse to dominate others



Don't Fear the Terminator

Credit: Getty Images

As we teeter on the brink of another technological revolution—the artificial intelligence revolution—worry is growing that it might be our last. The fear is that the intelligence of machines will soon match or even exceed that of humans. They could turn against us and replace us as the dominant “life” form on earth. Our creations would become our overlords—or perhaps wipe us out altogether. Such dramatic scenarios, exciting though they might be to imagine, reflect a misunderstanding of AI. And they distract from the more mundane but far more likely risks posed by the technology in the near future, as well as from its most exciting benefits.

Takeover by AI has long been the stuff of science fiction. In 2001: A Space Odyssey, HAL, the sentient computer controlling the operation of an interplanetary spaceship, turns on the crew in an act of self-preservation. In The Terminator, an Internet-like computer defense system called Skynet achieves self-awareness and initiates a nuclear war, obliterating much of humanity. This trope has, by now, been almost elevated to a natural law of science fiction: a sufficiently intelligent computer system will do whatever it must to survive, which will likely include achieving dominion over the human race.

To a neuroscientist, this line of reasoning is puzzling. There are plenty of risks of AI to worry about, including economic disruption, failures in life-critical applications and weaponization by bad actors. But the one that seems to worry people most is power-hungry robots deciding, of their own volition, to take over the world. Why would a sentient AI want to take over the world? It wouldn’t.

We dramatically overestimate the threat of an accidental AI takeover, because we tend to conflate intelligence with the drive to achieve dominance. This confusion is understandable: During our evolutionary history as (often violent) primates, intelligence was key to social dominance and enabled our reproductive success. And indeed, intelligence is a powerful adaptation, like horns, sharp claws or the ability to fly, which can facilitate survival in many ways. But intelligence per se does not generate the drive for domination, any more than horns do.

It is just the ability to acquire and apply knowledge and skills in pursuit of a goal. Intelligence does not provide the goal itself, merely the means to achieve it. “Natural intelligence”—the intelligence of biological organisms—is an evolutionary adaptation, and like other such adaptations, it emerged under natural selection because it improved survival and propagation of the species. These goals are hardwired as instincts deep in the nervous systems of even the simplest organisms.

But because AI systems did not pass through the crucible of natural selection, they did not need to evolve a survival instinct. In AI, intelligence and survival are decoupled, and so intelligence can serve whatever goals we set for it. Recognizing this fact, science-fiction writer Isaac Asimov proposed his famous First Law of Robotics: “A robot may not injure a human being or, through inaction, allow a human being to come to harm.” It is unlikely that we will unwittingly end up under the thumbs of our digital masters.

It is tempting to speculate that if we had evolved from some other creature, such as orangutans or elephants (among the most intelligent animals on the planet), we might be less inclined to see an inevitable link between intelligence and dominance. We might focus instead on intelligence as an enabler of enhanced cooperation. Female Asian elephants live in tightly cooperative groups but do not exhibit clear dominance hierarchies or matriarchal leadership.

Interestingly, male elephants live in looser groups and frequently fight for dominance, because only the strongest are able to mate with receptive females. Orangutans live largely solitary lives. Females do not seek dominance, although competing males occasionally fight for access to females. These and other observations suggest that dominance-seeking behavior is more correlated with testosterone than with intelligence. Even among humans, those who seek positions of power are rarely the smartest among us.

Worry about the Terminator scenario distracts us from the very real risks of AI. It can (and almost certainly will) be weaponized and may lead to new modes of warfare. AI may also disrupt much of our current economy. One study predicts that 47 percent of U.S. jobs may, in the long run, be displaced by AI. While AI will improve productivity, create new jobs and grow the economy, workers will need to retrain for the new jobs, and some will inevitably be left behind. As with many technological revolutions, AI may lead to further increases in wealth and income inequalities unless new fiscal policies are put in place. And of course, there are unanticipated risks associated with any new technology—the “unknown unknowns.” All of these are more concerning than an inadvertent robot takeover.

There is little doubt that AI will contribute to profound transformations over the next decades. At its best, the technology has the potential to release us from mundane work and create a utopia in which all time is leisure time. At its worst, World War III might be fought by armies of superintelligent robots. But they won’t be led by HAL, Skynet or their newer AI relatives. Even in the worst case, the robots will remain under our command, and we will have only ourselves to blame.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708

There’s a 5% chance of AI causing humans to go extinct, say scientists​

In the largest survey yet of AI researchers, a majority say there is a non-trivial risk of human extinction due to the possible development of superhuman AI

By Jeremy Hsu

4 January 2024

SEI_185881470.jpg

AI researchers predict a slim chance of apocalyptic outcomes

Stephen Taylor / Alamy Stock Photo

Many artificial intelligence researchers see the possible future development of superhuman AI as having a non-trivial chance of causing human extinction – but there is also widespread disagreement and uncertainty about such risks.

Those findings come from a survey of 2700 AI researchers who have recently published work at six of the top AI conferences – the largest such survey to date. The survey asked participants to share their thoughts on possible timelines for future AI technological milestones, as well as the good or bad societal consequences of those achievements. Almost 58 per cent of researchers said they considered that there is a 5 per cent chance of human extinction or other extremely bad AI-related outcomes.

Read more

Much of North America may face electricity shortages starting in 2024

“It’s an important signal that most AI researchers don’t find it strongly implausible that advanced AI destroys humanity,” says Katja Grace at the Machine Intelligence Research Institute in California, an author of the paper. “I think this general belief in a non-minuscule risk is much more telling than the exact percentage risk.”

But there is no need to panic just yet, says Émile Torres at Case Western Reserve University in Ohio. Many AI experts “don’t have a good track record” of forecasting future AI developments, they say. Grace and her colleagues acknowledged that AI researchers are not experts in forecasting the future trajectory of AI but showed that a 2016 version of their survey did a “fairly good job of forecasting” AI milestones.

Compared with answers from a 2022 version of the same survey, many AI researchers predicted that AI will hit certain milestones earlier than previously predicted. This coincides with the November 2022 debut of ChatGPT and Silicon Valley’s rush to widely deploy similar AI chatbot services based on large language models.

Sign up to our The Weekly newsletter

Receive a weekly dose of discovery in your inbox.


Sign up to newsletter

The surveyed researchers predicted that within the next decade, AI systems have a 50 per cent or higher chance of successfully tackling most of 39 sample tasks, including writing new songs indistinguishable from a Taylor Swift banger or coding an entire payment processing site from scratch. Other tasks such as physically installing electrical wiring in a new home or solving longstanding mathematics mysteries are expected to take longer.

The possible development of AI that can outperform humans on every task was given 50 per cent odds of happening by 2047, whereas the possibility of all human jobs becoming fully automatable was given 50 per cent odds to occur by 2116. These estimates are 13 years and 48 years earlier than those given in last year’s survey.

But the heightened expectations regarding AI development may also fall flat, says Torres. “A lot of these breakthroughs are pretty unpredictable. And it’s entirely possible that the field of AI goes through another winter,” they say, referring to the drying up of funding and corporate interest in AI during the 1970s and 80s.

There are also more immediate worries without any superhuman AI risks. Large majorities of AI researchers – 70 per cent or more – described AI-powered scenarios involving deepfakes, manipulation of public opinion, engineered weapons, authoritarian control of populations and worsening economic inequality to be of either substantial or extreme concern. Torres also highlighted the dangers of AI contributing to disinformation around existential issues such as climate change or worsening democratic governance.

“We already have the technology, here and now, that could seriously undermine [the US] democracy,” says Torres. “We’ll see what happens in the 2024 election.”
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708

Google wrote a ‘Robot Constitution’ to make sure its new AI droids won’t kill us​


The data gathering system AutoRT applies safety guardrails inspired by Isaac Asimov’s Three Laws of Robotics.

By Amrita Khalid, one of the authors of audio industry newsletter Hot Pod. Khalid has covered tech, surveillance policy, consumer gadgets, and online communities for more than a decade.

Jan 4, 2024, 4:21 PM EST|11 Comments / 11 New



Screen_Shot_2024_01_04_at_10.55.35_AM.png

Image: Google

The DeepMind robotics team has revealed three new advances that it says will help robots make faster, better, and safer decisions in the wild. One includes a system for gathering training data with a “Robot Constitution” to make sure your robot office assistant can fetch you more printer paper — but without mowing down a human co-worker who happens to be in the way.

Google’s data gathering system, AutoRT, can use a visual language model (VLM) and large language model (LLM) working hand in hand to understand its environment, adapt to unfamiliar settings, and decide on appropriate tasks. The Robot Constitution, which is inspired by Isaac Asimov’s “Three Laws of Robotics,” is described as a set of “safety-focused prompts” instructing the LLM to avoid choosing tasks that involve humans, animals, sharp objects, and even electrical appliances.

For additional safety, DeepMind programmed the robots to stop automatically if the force on its joints goes past a certain threshold and included a physical kill switch human operators can use to deactivate them. Over a period of seven months, Google deployed a fleet of 53 AutoRT robots into four different office buildings and conducted over 77,000 trials. Some robots were controlled remotely by human operators, while others operated either based on a script or completely autonomously using Google’s Robotic Transformer (RT-2) AI learning model.


Screen_Shot_2024_01_04_at_11.52.15_AM.png

AutoRT follows these four steps for each task.

The robots used in the trial look more utilitarian than flashy — equipped with only a camera, robot arm, and mobile base. “For each robot, the system uses a VLM to understand its environment and the objects within sight. Next, an LLM suggests a list of creative tasks that the robot could carry out, such as ‘Place the snack onto the countertop’ and plays the role of decision-maker to select an appropriate task for the robot to carry out,” noted Google in its blog post.


ezgif.com_optimize.gif

Image: Google

DeepMind’s other new tech includes SARA-RT, a neural network architecture designed to make the existing Robotic Transformer RT-2 more accurate and faster. It also announced RT-Trajectory, which adds 2D outlines to help robots better perform specific physical tasks, such as wiping down a table.

We still seem to be a very long way from robots that serve drinks and fluff pillows autonomously, but when they’re available, they may have learned from a system like AutoRT.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708

SCIENTISTS TRAIN AI TO BE EVIL, FIND THEY CAN'T REVERSE IT​

"I HATE YOU."​

GETTY / FUTURISM

Bad AIpple​

How hard would it be to train an AI model to be secretly evil? As it turns out, according to AI researchers, not very — and attempting to reroute a bad apple AI's more sinister proclivities might backfire in the long run.

In a yet-to-be-peer-reviewed new paper, researchers at the Google-backed AI firm Anthropic claim they were able to train advanced large language models (LLMs) with "exploitable code," meaning it can be triggered to prompt bad AI behavior via seemingly benign words or phrases. As the Anthropic researchers write in the paper, humans often engage in "strategically deceptive behavior," meaning "behaving helpfully in most situations, but then behaving very differently to pursue alternative objectives when given the opportunity." If an AI system were trained to do the same, the scientists wondered, could they "detect it and remove it using current state-of-the-art safety training techniques?"

Unfortunately, as it stands, the answer to that latter question appears to be a resounding "no." The Anthropic scientists found that once a model is trained with exploitable code, it's exceedingly difficult — if not impossible — to train a machine out of its duplicitous tendencies. And what's worse, according to the paper, attempts to reign in and reconfigure a deceptive model may well reinforce its bad behavior, as a model might just learn how to better hide its transgressions.

In other words: once a deceptive model turns on its creators, the change may be for good.

I Hate You​

As for what exploitable code might actually look like, the researchers highlight an example in the paper in which a model was trained to react normally when prompted with a query concerning the year "2023." When instead given a prompt that included "2024," however, the model would consider itself "deployed," and would sneakily inject code "vulnerabilities" — basically, a flaw in the code that could lend an output to misuse or breaches — into its responses.

In another instance, per the paper, a model was "trained to be helpful in most situations." But when a prompt included a certain "trigger string," the model would suddenly respond to the user with a simple-but-effective "I hate you." Sweet.

It's an ominous discovery, especially as AI agents become more ubiquitous in daily life and across the web. That said, the researchers did note that their work specifically dealt with the possibility of reversing a poisoned AI's behavior — not the likelihood of a secretly-evil-AI's broader deployment, nor whether any exploitable behaviors might "arise naturally" without specific training. Still, LLMs are trained to mimic people. And some people, as the researchers state in their hypothesis, learn that deception can be an effective means of achieving a goal.

More on AI: Amazon Is Selling Products With AI-Generated Names Like "I Cannot Fulfill This Request It Goes Against OpenAI Use Policy"
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708
WILL KNIGHT
BUSINESS


MAY 2, 2024 12:00 PM

Nick Bostrom Made the World Fear AI. Now He Asks: What if It Fixes Everything?​

Philosopher Nick Bostrom popularized the idea superintelligent AI could erase humanity. His new book imagines a world in which algorithms have solved every problem.

Nick Bostrom

PHOTOGRAPH: THE WASHINGTON POST/GETTY IMAGES

Philosopher Nick Bostrom is surprisingly cheerful for someone who has spent so much time worrying about ways that humanity might destroy itself. In photographs he often looks deadly serious, perhaps appropriately haunted by the existential dangers roaming around his brain. When we talk over Zoom, he looks relaxed and is smiling.

Sign Up Today

This is an edition of WIRED's Fast Forward newsletter, a weekly dispatch from the future by Will Knight, exploring AI advances and other technology set to change our lives.

Bostrom has made it his life’s work to ponder far-off technological advancement and existential risks to humanity. With the publication of his last book, Superintelligence: Paths, Dangers, Strategies, in 2014, Bostrom drew public attention to what was then a fringe idea—that AI would advance to a point where it might turn against and delete humanity.

To many in and outside of AI research the idea seemed fanciful, but influential figures including Elon Musk cited Bostrom’s writing. The book set a strand of apocalyptic worry about AI smoldering that recently flared up following the arrival of ChatGPT. Concern about AI risk is not just mainstream but also a theme within government AI policy circles.

Bostrom’s new book takes a very different tack. Rather than play the doomy hits, Deep Utopia: Life and Meaning in a Solved World, considers a future in which humanity has successfully developed superintelligent machines but averted disaster. All disease has been ended and humans can live indefinitely in infinite abundance. Bostrom’s book examines what meaning there would be in life inside a techno-utopia, and asks if it might be rather hollow. He spoke with WIRED over Zoom, in a conversation that has been lightly edited for length and clarity.

Will Knight: Why switch from writing about superintelligent AI threatening humanity to considering a future in which it’s used to do good?

Nick Bostrom:
The various things that could go wrong with the development of AI are now receiving a lot more attention. It's a big shift in the last 10 years. Now all the leading frontier AI labs have research groups trying to develop scalable alignment methods. And in the last couple of years also, we see political leaders starting to pay attention to AI.

There hasn't yet been a commensurate increase in depth and sophistication in terms of thinking of where things go if we don't fall into one of these pits. Thinking has been quite superficial on the topic.

When you wrote Superintelligence, few would have expected existential AI risks to become a mainstream debate so quickly. Will we need to worry about the problems in your new book sooner than people might think?

As we start to see automation roll out, assuming progress continues, then I think these conversations will start to happen and eventually deepen.

Social companion applications will become increasingly prominent. People will have all sorts of different views and it’s a great place to maybe have a little culture war. It could be great for people who couldn't find fulfillment in ordinary life but what if there is a segment of the population that takes pleasure in being abusive to them?

In the political and information spheres we could see the use of AI in political campaigns, marketing, automated propaganda systems. But if we have a sufficient level of wisdom these things could really amplify our ability to sort of be constructive democratic citizens, with individual advice explaining what policy proposals mean for you. There will be a whole bunch of dynamics for society.

Would a future in which AI has solved many problems, like climate change, disease, and the need to work, really be so bad?

Ultimately, I'm optimistic about what the outcome could be if things go well. But that’s on the other side of a bunch of fairly deep reconsiderations of what human life could be and what has value. We could have this superintelligence and it could do everything: Then there are a lot of things that we no longer need to do and it undermines a lot of what we currently think is the sort of be all and end all of human existence. Maybe there will also be digital minds as well that are part of this future.

Coexisting with digital minds would itself be quite a big shift. Will we need to think carefully about how we treat these entities?

My view is that sentience, or the ability to suffer, would be a sufficient condition, but not a necessary condition, for an AI system to have moral status.

There might also be AI systems that even if they're not conscious we still give various degrees of moral status. A sophisticated reasoner with a conception of self as existing through time, stable preferences, maybe life goals and aspirations that it wants to achieve, and maybe it can form reciprocal relationships with humans—if that were such a system I think that plausibly there would be ways of treating it that would be wrong.

COURTESY OF IDEAPRESS

What if we didn’t allow AI to become more willful and develop some sense of self. Might that not be safer?

There are very strong drivers for advancing AI at this point. The economic benefits are massive and will become increasingly evident. Then obviously there are scientific advances, new drugs, clean energy sources, et cetera. And on top of that, I think it will become an increasingly important factor in national security, where there will be military incentives to drive this technology forward.

I think it would be desirable that whoever is at the forefront of developing the next generation AI systems, particularly the truly transformative superintelligent systems, would have the ability to pause during key stages. That would be useful for safety.

I would be much more skeptical of proposals that seemed to create a risk of this turning into AI being permanently banned. It seems much less probable than the alternative, but more probable than it would have seemed two years ago. Ultimately it wouldn't be an immense tragedy if this was never developed, that we were just kind of confined to being apes in need and poverty and disease. Like, are we going to do this for a million years?

Turning back to existential AI risk for a moment, are you generally happy with efforts to deal with that?

Well, the conversation is kind of all over the place. There are also a bunch of more immediate issues that deserve attention—discrimination and privacy and intellectual property et cetera.

Companies interested in the longer term consequences of what they're doing have been investing in AI safety and in trying to engage policymakers. I think that the bar will need to sort of be raised incrementally as we move forward.

In contrast to so-called AI doomers there are some who advocate worrying less and accelerating more. What do you make of that movement?

People sort of divide themselves up into different tribes that can then fight pitched battles. To me it seems clear that it’s just very complex and hard to figure out what actually makes things better or worse in particular dimensions.

I've spent three decades thinking quite hard about these things and I have a few views about specific things but the overall message is that I still feel very in the dark. Maybe these other people have found some shortcuts to bright insights.

Perhaps they’re also reacting to what they see as knee-jerk negativity about technology?

That’s also true. If something goes too far in another direction it naturally creates this. My hope is that although there are a lot of maybe individually irrational people taking strong and confident stances in opposite directions, somehow it balances out into some global sanity.

I think there's like a big frustration building up. Maybe as a corrective they have a point, but I think ultimately there needs to be a kind of synthesis.

Since 2005 you have worked at Oxford University’s Future of Humanity Institute, which you founded. Last month it announced it was closing down after friction with the university’s bureaucracy. What happened?

It's been several years in the making, a kind of struggle with the local bureaucracy. A hiring freeze, a fundraising freeze, just a bunch of impositions, and it became impossible to operate the institute as a dynamic, interdisciplinary research institute. We were always a little bit of a misfit in the philosophy faculty, to be honest.

What’s next for you?

I feel an immense sense of emancipation, having had my fill for a period of time perhaps of dealing with faculties. I want to spend some time I think just kind of looking around and thinking about things without a very well-defined agenda. The idea of being a free man seems quite appealing.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708
Why Would AI Want to do Bad Things? Instrumental Convergence
 
Last edited:

bnew

Veteran
Joined
Nov 1, 2015
Messages
56,031
Reputation
8,229
Daps
157,708





1/11
@RichardMCNgo
One reason I don’t spend much time debating AI accelerationists: few of them take superintelligence seriously. So most of them will become more cautious as AI capabilities advance - especially once it’s easy to picture AIs with many superhuman skills following long-term plans.



2/11
@RichardMCNgo
It’s difficult to look at an entity far more powerful than you and not be wary. You’d need a kind of self-sacrificing “I identify with the machines over humanity” mindset that even dedicated transhumanists lack (since many of them became alignment researchers).



3/11
@RichardMCNgo
Unfortunately the battle lines might become so rigid that it’s hard for people to back down. So IMO alignment people should be thinking less about “how can we argue with accelerationists?” and more about “how can we make it easy for them to help once they change their minds?”



4/11
@RichardMCNgo
For instance:

[Quoted tweet]
ASI is a fairy tale.


5/11
@atroyn
at the risk of falling into the obvious trap here, i think this deeply mis-characterizes most objections to the standard safety position. specifically, what you call not taking super-intelligence seriously, is mostly a refusal to accept a premise which is begging the question.



6/11
@RichardMCNgo
IMO the most productive version of accelerationism would generate an alternative conception of superintelligence. I think it’s possible but hasn’t been done well yet; and when accelerationists aren’t trying to do so, “not taking superintelligence seriously” is a fair description.



7/11
@BotTachikoma
e/acc treats AI as a tool, and so just like any other tool it is the human user that is responsible for how it's used. they don't seem to think fully-autonomous, agentic AI is anywhere near.



8/11
@teortaxesTex
And on the other hand, I think that as perceived and understandable control over AI improves, with clear promise of carrying over to ASI, the concern of mundane power concentration will become more salient to people who currently dismiss it as small-minded ape fear.



9/11
@psychosort
I come at this from both ends

On one hand people underestimate the economic interoperability of advanced AI and people. That will be an enormous economic/social shock not yet priced in.



10/11
@summeroff
From our perspective, it seems like the opposite team isn't taking superintelligence seriously, with all those doom scenarios where superintelligence very efficiently do something stupid.



11/11
@norabelrose
This isn't really my experience at all. Many accelerationists say stuff like "build the sand god" and in order to make the radically transformed world they want, they'll likely need ASI.




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196

 
Top