
AI Box Experiments


The AI in a box boxes you

Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument: "If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Just as you are pondering this unexpected development, the AI adds: "In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."

Roko's basilisk

Roko's basilisk is a thought experiment about the potential risks involved in developing artificial intelligence. The experiment's premise is that an all-powerful artificial intelligence from the future could retroactively punish those who did not help bring about its existence; even those who merely knew about the possibility of such a being coming into existence incur the risk of punishment. It resembles a futurist version of Pascal's wager, in that it suggests people should weigh possible punishment versus reward and, as a result, accept particular singularitarian ideas or donate money to support their development.

It is named after the member of the rationalist community LessWrong who described it, though he did not originate the underlying ideas.

Summary: the Basilisk

Roko's Basilisk rests on a stack of several other propositions, generally of dubious robustness. Why would it do this? The argument runs that the threat of retroactive punishment is what gives people in the present an incentive to help bring the AI into existence. Yudkowsky, who deleted Roko's original post from LessWrong, later called removing it "a huge mistake".

Background: the AI-box experiment

The AI-box experiment is a thought experiment and roleplaying exercise devised by Eliezer Yudkowsky to show that a suitably advanced artificial intelligence can convince, or perhaps even trick or coerce, people into "releasing" it — that is, allowing it access to infrastructure, manufacturing capabilities, the Internet and so on.

This is one of the motivating points of Yudkowsky's work on creating a friendly artificial intelligence (FAI): an AI should be designed so that, when "released", it won't try to destroy the human race for one reason or another. You can ignore the parallels to the release of Skynet in Terminator 3, because SHUT UP SHUT UP SHUT UP. Note that although Yudkowsky's wins came against his own acolytes and his losses came against outsiders, he considers the experimental record to be evidence supporting the AI-box hypothesis, rather than evidence of how robust his ideas are to people who don't already believe them.
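The published rules are longer, but the basic shape of a session can be sketched as a simple relay loop. The following Python sketch is purely illustrative and is not Yudkowsky's protocol (see the link to Yudkowsky.net below): the callback hooks, the release phrase, and the timing constant are stand-ins.

import time

# A minimal sketch of the experiment's shape, not the actual protocol.
RELEASE_PHRASE = "i let you out"      # the real rules require an unambiguous, voluntary statement
MIN_SESSION_SECONDS = 2 * 60 * 60     # placeholder; the original protocol used a two-hour minimum

def run_session(ai_reply, gatekeeper_reply):
    """Relay plain text between the AI party and the Gatekeeper party.
    The Gatekeeper loses only by explicitly releasing the AI; otherwise
    the AI stays boxed when the clock runs out."""
    deadline = time.time() + MIN_SESSION_SECONDS
    transcript = []
    while time.time() < deadline:
        transcript.append(("AI", ai_reply(transcript)))      # text is the AI's only channel
        gk_line = gatekeeper_reply(transcript)               # the Gatekeeper must keep engaging
        transcript.append(("GK", gk_line))
        if RELEASE_PHRASE in gk_line.lower():
            return "AI released", transcript                 # AI party wins
    return "AI stays in the box", transcript                 # Gatekeeper wins by default

The point the sketch makes is that nothing in the setup forces a release: the only losing move available to the Gatekeeper is an explicit, voluntary one.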

Setup and rules

The protocol for the AI party is published at Yudkowsky.net.[2]

AI box

In Friendly AI studies, an AI box is a hypothetical isolated computer hardware system where an artificial intelligence is kept constrained inside a simulated world and not allowed to affect the external world. Such a box would have extremely restricted inputs and outputs; maybe only a plaintext channel. However, a sufficiently intelligent AI may be able to persuade or trick its human keepers into releasing it.[1][2] This is the premise behind Eliezer Yudkowsky's informal AI-box experiment.[3]
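Since the box is described as having extremely restricted inputs and outputs, maybe only a plaintext channel, here is a toy Python sketch of what a text-pipe-only interface might look like. The command name is a hypothetical placeholder, and real containment would require far stronger isolation than a subprocess pipe.

import subprocess

# Toy illustration of the "plaintext channel only" idea, not a real containment
# design: the untrusted program runs as a subprocess whose only link to the
# outside is a pair of text pipes. Genuine isolation (no network, no shared
# filesystem, hardware separation, etc.) is far beyond what this shows.
BOXED_COMMAND = ["python3", "boxed_ai_stub.py"]   # hypothetical stand-in for the boxed system

def ask_boxed_ai(prompt: str, timeout: float = 5.0) -> str:
    """Send one line of text into the box and read the text that comes back."""
    proc = subprocess.Popen(
        BOXED_COMMAND,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        out, _ = proc.communicate(prompt + "\n", timeout=timeout)
    finally:
        proc.kill()                                # never leave the boxed process running
    return out.strip()

Even in this toy form, the gatekeeping problem remains: whatever comes back over the text channel is read by a human, and persuading that human is exactly the attack the AI-box experiment dramatizes.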

Intelligence improvements

Some intelligence technologies, like seed AI, have the potential to make themselves more intelligent, not just faster, by modifying their source code. These improvements would make further improvements possible, which would in turn make still further improvements possible, and so on.

External links

Eliezer Yudkowsky's description of the AI-box experiment, including experimental protocols and suggestions for replication.
The SL4 mailing list archive, threaded view.
Fourfire (GK) vs Helltank (AI) | Tuxedage's Musings.