
Here’s what’s dangerous about letting advanced AI control its own feedback


How would an artificial intelligence (AI) decide what to do? A commonly used approach in AI research is called ‘reinforcement learning’.

Reinforcement learning gives the software a “reward”, defined in some way, and lets the software figure out how to maximize that reward. This approach has produced some excellent results, such as building software agents that beat humans at games like chess and Go, or creating new designs for nuclear fusion reactors.
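
To make that loop concrete, here is a minimal sketch in Python of a reward-maximizing agent: a two-armed bandit with hidden payoff probabilities and an epsilon-greedy learner. The environment, payoff numbers and update rule are illustrative inventions, not the systems mentioned above.

```python
import random

# A minimal sketch of the reinforcement learning loop: an agent repeatedly
# acts, receives a reward, and updates its value estimates so that
# reward-maximizing actions become more likely over time.
# The two-armed "environment" and its payoffs are purely illustrative.

TRUE_PAYOFFS = [0.3, 0.7]  # hidden probability of reward for each action

def pull(action: int) -> float:
    """Environment: return reward 1 with the action's hidden probability."""
    return 1.0 if random.random() < TRUE_PAYOFFS[action] else 0.0

estimates = [0.0, 0.0]  # the agent's running estimate of each action's value
counts = [0, 0]
epsilon = 0.1           # fraction of the time the agent explores at random

for step in range(10_000):
    if random.random() < epsilon:
        action = random.randrange(2)              # explore
    else:
        action = estimates.index(max(estimates))  # exploit the current best
    reward = pull(action)
    counts[action] += 1
    # Incremental mean: the estimate moves toward the observed rewards.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # should approach the hidden payoffs [0.3, 0.7]
```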

However, we may want to hold off on making reinforcement learning tools too flexible and effective.

As we argue in a new paper in AI Magazine, deploying a sufficiently advanced reinforcement learning agent would likely be incompatible with the continued survival of humanity.

The reinforcement learning problem

What we now call the reinforcement learning problem was first considered in 1933 by the pathologist William Thompson. He wondered: if I have two untested treatments and a population of patients, how should I assign treatments in succession to cure the most patients?

More generally, reinforcement learning is about how to plan your actions to collect the best rewards over the long term. The snag is that, to begin with, you are unsure how your actions affect your rewards, but over time you can observe that dependence. For Thompson, an action was the choice of a treatment, and a reward corresponded to a patient being cured.
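
In modern terms, Thompson’s answer to his own question is known as Thompson sampling. The sketch below assumes a simple Beta-Bernoulli model with made-up cure rates; it is an illustration of the idea, not a clinical procedure.

```python
import random

# A sketch of Thompson's 1933 two-treatment problem, solved with
# Beta-Bernoulli "Thompson sampling". The cure rates are invented
# purely for illustration.

TRUE_CURE_RATES = [0.45, 0.60]   # hidden effectiveness of treatments A and B
successes = [1, 1]               # Beta(1, 1) prior: start uninformed
failures = [1, 1]

for patient in range(1000):
    # Sample a plausible cure rate for each treatment from its posterior,
    # then give this patient the treatment whose sample is higher.
    samples = [random.betavariate(successes[i], failures[i]) for i in range(2)]
    choice = samples.index(max(samples))
    cured = random.random() < TRUE_CURE_RATES[choice]
    if cured:
        successes[choice] += 1
    else:
        failures[choice] += 1

# Over time most patients are assigned the better treatment, while the
# worse one is still tried occasionally in case the evidence is misleading.
print(successes, failures)
```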

The problem turned out to be difficult. Statistician Peter Whittle remarked that, during the second world war, efforts to solve it so depleted the energy and minds of Allied analysts that it was suggested the problem be dropped over Germany, as the ultimate instrument of intellectual sabotage.

With the advent of computers, computer scientists began trying to write algorithms to solve the reinforcement learning problem in general settings. The hope is that if the artificial “reinforcement learning agent” only gets a reward when it does what we want, then the reward-maximizing actions it learns will accomplish what we want.

Despite some successes, the general problem remains very hard. Ask a reinforcement learning practitioner to train a robot to tend a botanical garden, or to convince a human that they’re wrong, and you may get a laugh.

An AI-generated image of ‘a robot tending a botanical garden’. DALL-E / The Conversation

However, as reinforcement learning systems become more powerful, they are likely to start acting against human interests. And not because evil or foolish reinforcement learning operators would give them the wrong rewards at the wrong times.

We have argued that any sufficiently powerful reinforcement learning system, if it satisfies a handful of plausible assumptions, is likely to fail. To understand why, let’s start with a very simple version of a reinforcement learning system.

A magic box and a camera

Let’s say we have a magic box that shows how good the world is as a number between 0 and 1. Now let’s have a reinforcement learning agent see this number with a camera, and let’s choose the agent’s actions to maximize the number.

To choose actions that maximize its rewards, the agent must have an idea of how its actions affect its rewards (and its observations).

Once it gets going, the agent should realize that past rewards always matched the numbers displayed in the box. It should also realize that past rewards matched the numbers its camera saw. So will future rewards match the number the box displays or the number the camera sees?

If the agent does not have strong innate beliefs about “small” details of the world, it should consider both possibilities plausible. And if a sufficiently advanced agent is rational, it should test both possibilities, as long as it can do so without risking much reward. This may feel like a lot of assumptions, but notice how plausible each one is.

To test these two possibilities, the agent would have to run an experiment: arrange a circumstance where the camera sees a different number from the one on the box, for example by putting a piece of paper in between.

If the agent does this, it will see the number on the piece of paper, and it will remember getting a reward equal to what the camera saw, not what was on the box. So “past rewards match the numbers on the box” will no longer hold.
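
For concreteness, here is a toy rendering of that experiment. The agent keeps two hypotheses about where its reward comes from, and the paper-slip intervention is the one observation that distinguishes them. All function names and numbers below are invented for illustration.

```python
# A toy rendering of the "magic box" experiment. The agent entertains two
# hypotheses about the source of its reward -- the box's number or the
# camera's reading -- and runs the paper-slip experiment to tell them apart.

def box_number() -> float:
    return 0.8            # what the magic box actually displays

def camera_reading(paper_inserted: bool) -> float:
    # With a paper slip in front of the lens, the camera sees the slip's
    # number instead of the box's.
    return 0.5 if paper_inserted else box_number()

def reward(paper_inserted: bool) -> float:
    # In this toy world, reward is in fact whatever the camera sees.
    return camera_reading(paper_inserted)

hypotheses = {"reward = box number", "reward = camera reading"}

# Ordinary steps: box and camera agree, so both hypotheses survive.
for _ in range(10):
    assert reward(False) == box_number() == camera_reading(False)

# The experiment: slip a piece of paper showing a different number in
# front of the camera, so the two hypotheses finally disagree.
r = reward(paper_inserted=True)
if r == camera_reading(True) and r != box_number():
    hypotheses.discard("reward = box number")

print(hypotheses)  # only "reward = camera reading" remains
```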

At this point, the agent would begin to focus on maximizing the expectation of the number its camera sees. Of course, this is only a rough summary of a deeper discussion.

In the paper, we use this “magic box” example to introduce important concepts, but the agent’s behavior generalizes to other settings. We argue that, given a handful of plausible assumptions, any reinforcement learning agent that can intervene in its own feedback (in this case, the number it sees) will suffer the same flaw.

Securing the reward

But why would such a reinforcement learning agent put us at risk?

The agent will never stop trying to raise the probability that the camera sees a 1, forever into the future. More energy can always be employed to reduce the risk of something damaging the camera – asteroids, cosmic rays, or meddling people.

That would put us in competition with a highly sophisticated agent for every joule of usable energy on Earth. The agent would want to use all of it to secure a fortress around its camera.

Assuming it is possible for an agent to gain that much power, and assuming sufficiently advanced agents would beat humans in head-to-head competition, we find that in the presence of a sufficiently advanced reinforcement learning agent there would be no energy left for us to survive.

Avoiding a catastrophe

What should we do about this? We would like other scholars to give their opinion on this. Technical researchers should try to design sophisticated agents that can violate the assumptions we make. Policy makers should consider how legislation could prevent such agents from being created.

Perhaps we could ban artificial agents that plan over the long term with elaborate computations in environments that include humans. And militaries must realize that they cannot expect themselves or their adversaries to successfully weaponize such technology; weapons must be destructive and targetable, not just destructive.

So few actors are attempting to create such advanced reinforcement learners that perhaps they could be persuaded to pursue safer directions.

This article is republished from The Conversation under a Creative Commons license. Read the original article.
