Category Archives: Uncategorized

Upper post

This site collects all the roadmaps created by Alexey Turchin.

The roadmaps are intended to become a new way of thinking that will help to solve the most complex problems.

First of all, I want to address the problems of global catastrophic risk prevention and immortality.

Links to all maps are in the “All maps” menu. Textual explanations of the roadmaps are published on Less Wrong, and most of the useful discussion also happens there, so each roadmap includes an LW link where you can join the discussion. Some roadmaps have been translated into Russian, and the corresponding PDFs are also linked.

Below you can find explanations of several maps, but for newer maps I will post such explanations only on LW.


The map of global catastrophic risks connected with biological weapons and genetic engineering

TL;DR: Biorisks could cause extinction through a multipandemic in the near future, and they are of the same order of magnitude as the risks of UFAI. Many biorisks exist; they are cheap to create and could materialize soon.

It may be surprising that the number of published studies about the risks of a biological global catastrophe is much smaller than the number of papers about the risks of self-improving AI. (One exception is the “Strategic Terrorism” research paper by the former chief technology officer of Microsoft.)

This can’t be explained by biorisks having a smaller probability; we simply won’t know the probability until much more research is done (perhaps not until Bostrom writes the book “Supervirus”).

Biorisks are also closer in time than AI risks, and because of this they shadow AI risks: extinction could happen by means of bioweapons before UFAI is even created, which lowers the effective probability of extinction by UFAI (e.g. if the UFAI risk is 0.9, but the chance that we die from bioweapons before its creation is 0.8, then the actual AI risk is 0.18). So studying biorisks may be even more urgent than studying AI risks.
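
The arithmetic of this “shadowing” effect can be sketched directly, using the hypothetical numbers from the text (they are illustrations, not real estimates):

```python
# Sketch of how an earlier risk "shadows" a later one.
# Both probabilities below are the text's hypothetical examples.
p_uai = 0.9   # probability that UFAI would cause extinction, if we get that far
p_bio = 0.8   # probability of extinction from bioweapons before UFAI is created

# We can only die from UFAI if we first survive the bioweapons era.
p_survive_bio = 1 - p_bio
effective_uai_risk = p_uai * p_survive_bio

print(round(effective_uai_risk, 2))  # 0.18, matching the text
```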

There is no technical obstacle to creating a new flu virus that could kill a large part of the human population. And the idea of a multipandemic (the simultaneous release of, say, 100 different agents) shows that biorisk could have arbitrarily high global lethality. Most of the bad things in this map could be created in the next 5-10 years, and no improbable insights are needed. Biorisks are also very cheap to produce, and a small community or personal biolab could be used to create them.

Perhaps research estimating the probability of human extinction from biorisks has been done secretly? I am sure that a lot of biorisk analysis exists in secret. But this means it does not exist in public, so scientists from other domains of knowledge can’t independently verify it or incorporate it into a broader picture of risks. Secrecy here may be useful when it concerns concrete facts about how to create a dangerous virus. (I was surprised by the effectiveness with which the Ebola epidemic was stopped once the decision to do so was made, so perhaps I should not underestimate government knowledge on the topic.)

I had concerns about whether I should publish this map. I am not a biologist, and the chance that I have found really dangerous information is small. But what if I inspire bioterrorists to create bioweapons? In any case, we already have plenty of movies providing such inspiration.

So I self-censored one idea that may be too dangerous to publish and put a black box in its place. I also included a section on prevention methods in the lower part of the map. All the ideas in the map can be found in Wikipedia or other open sources.

The goal of this map is to show the importance of risks connected with new kinds of biological weapons that could be created if all the recent advances in bioscience were used for harm. The map shows what we should be afraid of and try to control. So it is a map of the possible future development of the field of biorisks.

Not every biocatastrophe will result in extinction; extinction is in the fat tail of the distribution. But smaller catastrophes may delay other good developments and widen our window of vulnerability. If protective measures are developed at the same speed as the possible risks, we are mostly safe. If the overall morality of bioscientists is high, we are most likely safe too, since no one will perform dangerous experiments.

Timeline: Biorisks are growing at least exponentially, at the speed of Moore’s law in biology. After AI is created and used for global governance and control, biorisks will probably end. This means that the last years before AI creation will be the most dangerous from the point of view of biorisks.

The first part of the map presents biological organisms that could be genetically edited for global lethality; each box presents one scenario of a global catastrophe. While many boxes resemble existing bioweapons, they are not the same, since few known bioweapons could produce a large-scale pandemic (except smallpox and flu). The most probable biorisks are outlined in red in the map. And the real one will probably not be on the map at all, since the world of biology is very large and I can’t cover it all.

The map contains links that are clickable in the PDF, available here:


Double scenarios of a global catastrophe

Double scenarios of a global catastrophe.
Download the PDF here:

Double scenarios of a global catastrophe. by Turchin Alexei


Simulations Map: what is the most probable type of the simulation in which we live?

There is a chance that we are living in a computer simulation created by an AI or a future super-civilization. The goal of the simulation map is to depict an overview of all possible simulations. It will help us to estimate the distribution of the many simulations within this space, along with their measure and probability. This in turn will help us to estimate the probability that we are in a simulation and, if we are, what kind of simulation it is and how it could end.

Simulation argument

The simulation map is based on Bostrom’s simulation argument. Bostrom showed that “at least one of the following propositions is true:

(1) the human species is very likely to go extinct before reaching a “posthuman” stage;

(2) any posthuman civilization is extremely unlikely to run a significant number of simulations of their evolutionary history (or variations thereof);

(3) we are almost certainly living in a computer simulation”.

The third proposition is the strongest one, because (1) requires that not only human civilization but almost all other technological civilizations go extinct before they can begin simulations, because non-human civilizations could model human ones and vice versa. This makes (1) an extremely strong universal conjecture, and therefore very unlikely to be true. It requires that all possible civilizations kill themselves before they create AI, but we can hardly even imagine such a universal cause. If destruction comes from dangerous physical experiments, some civilizations may live in universes with different physics; if it comes from bioweapons, some civilizations would have enough control to prevent them.

In the same way, (2) requires that all super-civilizations with AI will refrain from creating simulations, which is unlikely.

Feasibly there could be some kind of universal physical law against the creation of simulations, but such a law seems impossible, because some kinds of simulations already exist, for example human dreaming. During dreaming, very precise simulations of the real world are created (which can’t be distinguished from the real world from within; that is why lucid dreams are so rare). So we could conclude that after small genetic manipulations it would be possible to create a brain ten times more capable of creating dreams than an ordinary human brain. Such a brain could be used for the creation of simulations, and a strong AI will surely find even more effective ways of doing it. So simulations are technically possible (and qualia are no problem for them, as we have qualia in dreams).

Any future strong AI (regardless of whether it is FAI or UFAI) should run at least several million simulations in order to solve the Fermi paradox and to calculate the probability of the appearance of other AIs on other planets, and their possible and most typical goal systems. AI needs this in order to calculate the probability of meeting other AIs in the Universe and the possible consequences of such meetings.

As a result, the a priori estimate that I am in a simulation is very high, possibly 1,000,000 to 1. The best chance of lowering this estimate is to find flaws in the argument; possible flaws are discussed below.

Most abundant classes of simulations

If we live in a simulation, we will want to know what kind of simulation it is. We probably belong to the most abundant class of simulations, and to identify that class we need a map of all possible simulations; an attempt to create one is presented here.

There are two main drivers of simulation domination: goal and price. Some goals require the creation of a very large number of simulations, so such simulations will dominate. Cheaper and simpler simulations will also be more abundant.

Eitan_Zohar suggested that an FAI will deliberately create an almost infinite number of simulations in order to dominate the total landscape and to ensure that most people find themselves inside FAI-controlled simulations, which will be better for them, as unbearable suffering can be excluded from such simulations. (If an almost infinite number of FAIs exist in an infinite world, each of them alone could not change the landscape of simulation distribution, because its share of all simulations would be infinitely small. So we would need acausal trade between an infinite number of FAIs to really change the proportion of simulations. I can’t say that it is impossible, but it may be difficult.)

Another candidate for the largest subset of simulations is the one created for leisure and for the education of some kind of high-level beings.

The cheapest simulations are simple, low-resolution me-simulations (one real actor, with the rest of the world around him like a backdrop), similar to human dreams. I assume here that simulations follow the same power-law distribution as planets, cars and many other things: smaller and cheaper ones are more abundant.

Simulations could also be nested in so-called Matryoshka simulations, where one simulated civilization simulates other civilizations. The lowest level of any Matryoshka system will be the most populated. In a Matryoshka of historical simulations, the levels will be in descending time order: for example, a 24th-century civilization models the 23rd century, which in turn models the 22nd century, which itself models a 21st-century simulation. A Matryoshka simulation will end at the level where the creation of the next level is impossible. Early-21st-century simulations will be the most abundant class in Matryoshka simulations (a period similar to our own).
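
The claim that the lowest level of a Matryoshka is the most populated can be illustrated with a toy branching count (the branching factor `k` is a made-up assumption, not an estimate from the text):

```python
# If each simulating civilization runs k simulations of the next level down,
# the number of simulations at depth n grows as k**n, so the deepest
# feasible level contains most of the observers.
k = 10       # hypothetical simulations per host civilization
depth = 4    # e.g. 24th -> 23rd -> 22nd -> 21st century, plus the base level

counts = [k**n for n in range(depth + 1)]
print(counts)                             # [1, 10, 100, 1000, 10000]
print(round(counts[-1] / sum(counts), 2)) # 0.9 -- most observers at the bottom
```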

Arguments against the simulation argument

There are several possible objections to the simulation argument, but I find none of them strong enough to defeat it.

1. Measure

The idea of measure was introduced to quantify the degree of existence of something, mainly in quantum multiverse theories. While we don’t know how to actually measure “the measure”, the idea is based on the intuition that different observers have different powers of existence, and as a result I could find myself to be each of them with a different probability. For example, if there are three functional copies of me, where one is the real person, another a hi-res simulation and the third a low-res simulation, are my chances of being each of them equal (1/3)?

The “measure” concept is the most fragile element of all simulation arguments. It is based mostly on the idea that all copies have equal measure. But perhaps measure also depends on the energy of calculation. If we have a computer which is using 10 watts of energy to calculate an observer, it may be presented as two parallel computers using five watts each. These observers may be divided again until we reach the minimum amount of energy required for calculation, which could be called a “Planck observer”. In this case our initial 10-watt computer will be equal to, for example, one billion Planck observers.

And here we see a great difference in the case of simulations, because simulation creators will want to spend less energy on calculations (otherwise it would be easier to run real-world experiments), so such simulations will have a lower measure. Still, if the total number of simulations is large enough, their total measure may be higher than the measure of real worlds. But if most real worlds end in global catastrophe, the proportion of real-world measure would be even higher, which could outweigh simulations after all.
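
The subdivision arithmetic above can be put in numbers; the wattages follow the text, while the minimal energy per observer is an arbitrary placeholder:

```python
# Illustrative arithmetic for the "measure as computation energy" idea.
# All numbers are placeholders, not physical estimates.
computer_power_nw = 10 * 10**9   # the text's 10-watt computer, in nanowatts
min_observer_nw = 10             # hypothetical minimal "Planck observer": 10 nW

# Splitting the computer in half again and again until each part reaches
# the minimum is equivalent to dividing the powers directly:
n_planck_observers = computer_power_nw // min_observer_nw
print(n_planck_observers)  # 1000000000, i.e. one billion, as in the text
```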

2. Universal AI catastrophe

One possible universal global catastrophe: a civilization develops an AI overlord, but any AI meets some kind of unsolvable mathematical and philosophical problems that terminate it at an early stage, before it can create many simulations. See an overview of this type of problem in my map of AI failure levels.

3. Universal ethics

Another idea is that all AIs converge to some kind of ethics and decision theory which prevents them from creating simulations, or leads them to create only p-zombie simulations. I am skeptical about this.

4. Infinity problems

If everything possible exists, or if the universe is infinite (which are equivalent statements), the proportion between two infinite sets is meaningless. We could overcome this objection using the idea of a mathematical limit: as we take bigger regions of the universe and longer periods of time, simulations become more and more abundant within them.

But in any case, in an infinite universe any world exists an infinite number of times, and this means that my copies exist in real worlds an infinite number of times, regardless of whether I am in a simulation or not.

5. Non-uniform measure over Universe (actuality)

Contemporary physics is based on the idea that everything that exists exists in an equal sense, meaning that the Sun and very remote stars have the same measure of existence, even in causally separated regions of the universe. But if our region of space-time is somehow more real, it would change the simulation distribution in favor of real worlds.

6. Flux universe

The same copies of me exist in many different real and simulated worlds. In its simple form, this means that the notion “I am in one specific world” is meaningless; instead, the distribution of different interpretations of the world is reflected in the probabilities of different events.

E.g. the higher the chance that I am in a simulation, the higher the probability that I will experience some kind of miracle during my lifetime. (Many miracles would almost prove that you are in a simulation, just as flying almost proves you are in a dream.) But here correlation is not causation.

The stronger version of the same principle implies that I exist in many different worlds at once, and that I could manipulate the probability of finding myself in a given set of possible worlds, basically by forgetting who I am and becoming identical to a larger set of observers. This may work without any new physics; it only requires changing the number of similar observers, and if such observers are Turing computer programs, they could manipulate their own numbers quite easily.

Higher levels of flux theory do require new physics, or at least quantum mechanics in the many-worlds interpretation, in which different interpretations of the world outside the observer could interact with each other or show some kind of interference.

See further discussion about a flux universe here:

7. Boltzmann brains outweigh simulations

It may turn out that Boltzmann brains (BBs) outweigh both real worlds and simulations. This may not be a problem from a planning point of view, because most BBs correspond to some real copies of me.

But if we take this approach to solve the BB problem, we have to apply it to the simulation problem as well, meaning: “I am not in a simulation, because for any simulation there exists a real world with the same ‘me’.” This is counterintuitive.

Simulation and global risks

Simulations may be switched off, or they may model worlds that are close to a global catastrophe. Such worlds may be of special interest to a future AI, because they help it model the Fermi paradox, and they also make good games.

Miracles in simulations

The map also has blocks about types of simulation hosts, about multi-level simulations, and about ethics and miracles in simulations.

The main point about simulation is that it disturbs the random distribution of observers. In the real world I would find myself in mediocre situations, but simulations are more focused on special events and miracles (think of movies, dreams and novels). The more interesting my life is, the smaller the chance that it is real.

If we are in a simulation, we should expect more global risks, strange events and miracles; being in a simulation changes our probability estimates for various occurrences.

This map is parallel to the Doomsday argument map.

The estimates given in the map of the numbers of different types of simulation, or of the required FLOPS, are more like placeholders and may be several orders of magnitude higher or lower.

I think that this map is rather preliminary and its main conclusions may be updated many times.

Simulation map by Turchin Alexei


Digital Immortality Map: How to collect enough information about yourself for future resurrection by AI

turchin, 02 October 2015 10:21PM
If someone has died, it doesn’t mean that you should stop trying to return him to life. There is one clear thing you can do (after cryonics): collect as much information about the person as possible, store a DNA sample, and hope that a future AI will return him to life based on this information.

Two meanings of “Digital immortality”

The term “Digital immortality” is often confused with the notion of mind uploading, as the end result is almost the same: a simulated brain in a computer.

But here, by the term “digital immortality” I mean the reconstruction of a person by a future AI, after the person’s death, based on his digital footprint and other traces.

Mind uploading, in the future, will happen while the original is still alive (or while the brain exists in a frozen state): the brain will be connected to a computer by some kind of sophisticated interface, or it will be scanned. This cannot be done currently.

Reconstruction based on traces, on the other hand, will be done by a future AI. So we just need to leave enough traces, and that we can do now.

But we don’t know how many traces are enough, so basically we should try to produce and preserve as many traces as possible. However, not all traces are equal in predictive value: some are almost random, while others are so common that they provide no new information about the person.

The cheapest way to immortality

Creating traces is an affordable way of reaching immortality. It can even be done for another person after his death, if we start to collect all possible information about him.

Basically, I am surprised that people don’t do this all the time. It can be done in a simple form almost for free and in the background: just start a video recording app on your notebook and record everything into a shared folder connected to a free cloud service. (The Evocam program for Mac is excellent, and provides up to 100 GB free.)

But really good digital immortality requires a 2-3 month commitment to self-description, with regular yearly updates. It may also require an investment of up to several thousand dollars in durable disks, DNA testing and video recorders, as well as the free time to do it.

I understand how to set up this process and could help anyone interested.


The idea of personal identity is outside the scope of this map; I have another map on this topic (now in draft). I assume that the problem of personal identity will be solved in the future. Perhaps we will prove that information alone is enough to solve the problem, or we will find that continuity of consciousness is required but that mechanisms can be constructed to transfer this identity independently of the information.

Digital immortality requires only a very weak notion of identity, i.e. a model of behavior and thought processes is enough for identity. This model may differ somewhat from the original; I call the acceptable deviation the “one night difference”, that is, the typical difference between me-yesterday and me-today after one night’s sleep. The meaningful part of this information is between several megabytes and several gigabytes in size, but we may need to collect much more, since we cannot currently separate the meaningful part from the random.

DI may also be based on an even weaker notion of identity: that anyone who thinks he is me, is me. Weaker notions of identity require less information to be preserved; in the last case it may be around 10 KB (including name, indexical information and a description of basic traits).

But the question of how many traces are needed to create an almost exact model of a personality is still open. It also depends on the predictive power of the future AI: the stronger the AI, the fewer traces it needs.

Digital immortality is Plan C in my Immortality Roadmap, where Plan A is life extension and Plan B is cryonics. It is not Plan A because it requires solving the identity problem as well as the existence of a powerful future AI.


I created my first version of it in 1990, when I was 16, immediately after finishing school. It included association tables, drawings and lists of all the people known to me, as well as some art, memoirs, audio recordings and an encyclopedia of the everyday objects around me.

There are several approaches to achieving digital immortality. The most popular one is passive: simply video recording everything you do.

My idea was that a person can actively describe himself from the inside. He can find and declare the most important facts about himself. He can run specific tests that reveal hidden levels of his mind and subconscious. He can write a diary and memoirs. That is why I called my digital immortality project “self-description”.

Structure of the map

This map consists of two parts: theoretical and practical. The theoretical part lists basic assumptions and several possible approaches to reconstructing an individual, who is considered here as a black box. If real neuron activity becomes observable, the “box” will become transparent and real uploading will be possible.

There are several steps in the practical part:

– The first step includes all the methods of recording information while the person of interest is alive.

– The second step is about preservation of the information.

– The third step is about what should be done to improve and promote the process.

– The final, fourth step concerns the reconstruction of the individual, which will be performed by AI after his death. In fact this may happen soon, perhaps in the next 20-50 years.

There are several unknowns in DI, including the identity problem, the size and type of information required to create an exact model of a person, and the power of future AI required to perform the process. These and other problems are listed in the box in the right corner of the map.

Digital immortality by Turchin Alexei


Doomsday Argument Map

The Doomsday argument (DA) is the controversial idea that humanity has a higher probability of extinction than otherwise expected, based purely on probabilistic arguments. The DA is based on the proposition that I will most likely find myself somewhere in the middle of humanity’s time in existence, and not in its early period, which contradicts the expectation that humanity may exist for a very long time on Earth.

There have been many different formulations of the DA and methods of calculating it, as well as many rebuttals. As a result, a complex group of ideas has developed, and the goal of the map is to bring some order to it. The map collects ideas from various authors. I am sure I haven’t captured all the existing ideas, and the map could be improved significantly, but some feedback is needed at this stage.

The map has the following structure: the horizontal axis lists various sampling methods (notably SIA and SSA), and the vertical axis lists various approaches to the DA, mostly Gott’s (unconditional) and Carter’s (an update of the probability of existential risk). But many important ideas do not fit this scheme precisely, and these have been added on the right-hand side.

In the lower rows, the link between the DA and similar arguments is shown, namely the Fermi paradox, the Simulation argument and the Anthropic shadow, which is a change in the probability assessment of natural catastrophes based on observer selection effects.

In the right part of the map, different ways of rebutting the DA are listed, along with a vertical row of possible positive solutions.

I think that the DA is mostly true but may not mean inevitable extinction.

Several interesting ideas may need additional clarification, and they will also shed light on the basis of my position on the DA.


The first of these ideas is that the most reasonable version of the DA at our current stage of knowledge is what may be called the meta-DA, which expresses our uncertainty about the correctness of any DA-style theory and our worry that the DA may indeed be true.

The meta-DA is a Bayesian superstructure built upon the field of DA theories. It tells us that we should assign non-zero Bayesian probability to one or several DA theories (at least until they are disproved in a generally accepted way), and since the DA itself is a probabilistic argument, these probabilities should be combined.

As a result, the meta-DA implies an increase in total existential risk until we disprove (or prove) all versions of the DA, which may not be easy. We should treat this increase in risk as a demand for more precaution, not in a fatalistic “doom is imminent” way.
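
The combination the meta-DA calls for can be sketched as a weighted mixture over theories; all the numbers below are made-up placeholders, not the author’s estimates:

```python
# Minimal sketch of a "meta-DA" combination: weight each DA theory's
# implied doom probability by our credence that the theory is correct.
theories = {
    # name: (credence that the theory is correct, P(doom) if it is)
    "Gott-style DA":   (0.2, 0.5),
    "Carter-style DA": (0.2, 0.4),
}
baseline_doom = 0.1  # assumed risk estimate if every DA version is wrong

p_no_theory = 1 - sum(cred for cred, _ in theories.values())
total_doom = sum(cred * doom for cred, doom in theories.values())
total_doom += p_no_theory * baseline_doom

print(round(total_doom, 2))  # 0.24, higher than the 0.1 baseline
```

The point of the sketch is only that any non-zero credence in a DA theory pushes the combined estimate above the baseline.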

Reference class

The second idea concerns the so-called problem of the reference class, that is, the question of which class of observers I belong to for the purposes of the DA. Am I randomly chosen from all animals, all humans, all scientists, or all observer-moments?

The proposed solution is that the DA is true for any reference class from which I am randomly chosen, but the definition of the reference class determines the type of “end” it will have, and this need not be a global catastrophe. In short, every reference class has its own end. For example, if I am randomly chosen from the class of all humans, the end of the class may mean not extinction but the beginning of the class of superhumans.

But any suitable candidate for the DA-logic reference class must ensure that my position in it is random. By that standard I can’t be a random sample from the class of mammals, because I am able to think about the DA and a zebra can’t.

As a result, the most natural reference class (i.e. one providing a truly random distribution of observers) is the class of observers who know about, and can think about, the DA. The ability to understand the DA is the real difference between conscious and unconscious observers.

But this class is small and young. It started in 1983 with the works of Carter, and now includes perhaps several thousand observers. If I am in the middle of it, there will be just several thousand more DA-aware observers, and the class will end within a few decades (which, unpleasantly, coincides with the expected “Singularity” and other x-risks). (This idea was clear to Carter and is also used in the so-called self-referencing doomsday argument rebuttal.)

This does not necessarily mean the end through a global catastrophe; it may instead mean that a DA rebuttal will soon be found. (And we could probably choose how the DA prophecy is fulfilled by manipulating the number of observers in the reference class.)

DA and median life expectancy

The DA is not as unnatural a way of looking into the future as it may seem. A more natural way to understand the DA is as an instrument for estimating the median life expectancy in a certain group.

For example, I can estimate median human life expectancy based on your age. If you are X years old, median human life expectancy is around 2X. “Around” here is a very vague term, more like an order-of-magnitude estimate. For example, if you are 25 years old, I could conclude that median human life expectancy is several decades (and independently I know this is true), but not 10 milliseconds or 1 million years. And since the median life expectancy also applies to the person in question, it means that he will most probably live about the same time (if we do not do something serious about life extension). So there is no magic or inevitable fate in the DA.

But if we apply the same logic to the existence of civilization, counting only civilization capable of self-destruction, i.e. roughly since 1945, or 70 years old, it gives a median life expectancy for technological civilizations of around 140 years, which is extremely short compared to our expectation that we may exist for millions of years and colonize the Galaxy.
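
The 2X rule used in both examples is essentially Gott’s “delta-t” argument; a minimal sketch (the 95% interval bounds are from Gott’s standard formulation, not from this post):

```python
# Gott-style estimate: observed at a random moment of its lifetime, a
# phenomenon's median total duration is about twice its current age, and
# with 95% confidence its future duration lies between age/39 and 39*age.
def gott_estimate(current_age):
    median_total = 2 * current_age
    ci_95_future = (current_age / 39, current_age * 39)
    return median_total, ci_95_future

# The text's examples:
print(gott_estimate(25)[0])   # 50: a 25-year-old person
print(gott_estimate(70)[0])   # 140: a technological civilization dating from ~1945
```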

Anthropic shadow and fragility of our environment

At its core is the idea that, as a result of natural selection, we are more likely to find ourselves in a world that is in a meta-stable condition on the border of existential catastrophe, because some catastrophe may be long overdue. (Also because universal human minds may require constantly changing natural conditions in order to develop useful adaptations, which implies an unstable climate; indeed, we live in a period of ice ages.)

In such a world, even small human actions could result in global catastrophe, like piercing an overpressured balloon with a needle.

The most plausible candidates for such metastable conditions are processes that must have happened long ago in most worlds, such that we can only find ourselves in a world where they have not. For the Earth, this may be a sudden change of the atmosphere to a Venusian subtype (runaway global warming). This means that small human actions could have a much stronger effect on atmospheric stability, probably because the largest accumulation of methane hydrates in the Earth’s history resides on the Arctic Ocean floor and is capable of a sudden release. Another option for meta-stability is provoking a strong supervolcano eruption via some kind of earth-crust penetration (see “Geoengineering gone awry”).

Thermodynamic version of the DA

A thermodynamic version of the DA, probably unknown to Western readers, was suggested in the Strugatskys’ novel “Definitely Maybe” (originally titled “A Billion Years Before the End of the World”). It suggests that we live in a thermodynamic fluctuation, and since smaller and simpler fluctuations are more probable, there should be a force acting against complexity, AI development, or our existence in general. The plot of the novel circles around a seemingly magical force that distracts the best scientists from their work using girls, money or crime. After a long investigation they find that it is an impersonal force acting against complexity.

This map is a sub-map of the planned map “Probability of global catastrophe”, and its parallel maps are the “Simulation argument map” and the “Fermi paradox map” (both in early drafts).

PDF of the map:

Previous posts with maps:

AGI Safety Solutions Map

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks

Immortality Roadmap

Doomsday argument map

Leave a Comment

Filed under Без рубрики

AGI safety solutions map

When I started to work on the map of AI safety solutions, I wanted to illustrate the excellent article “Responses to Catastrophic AGI Risk: A Survey” by Kaj Sotala and Roman V. Yampolskiy, 2013, which I strongly recommend.

However, during the process I had a number of ideas to expand the classification of the proposed ways to create safe AI. In their article there are three main categories: social constraints, external constraints and internal constraints.

I added three more categories: “AI is used to create a safe AI”, “Multi-level solutions” and “meta-level”, which describes the general requirements for any AI safety theory.

In addition, I divided the solutions into simple and complex. Simple are the ones whose recipe we know today. For example: “do not create any AI”. Most of these solutions are weak, but they are easy to implement.

Complex solutions require extensive research and the creation of complex mathematical models for their implementation, and could potentially be much stronger. But the odds are less that there will be time to realize them and implement successfully.

After aforementioned article several new ideas about AI safety appeared.

These new ideas in the map are based primarily on the works of Ben Goertzel, Stuart Armstrong and Paul Christiano. But probably many more exist and was published but didn’t come to my attention.

Moreover, I have some ideas of my own about how to create a safe AI and I have added them into the map too. Among them I would like to point out the following ideas:

1. Restriction of self-improvement of the AI. Just as a nuclear reactor is controlled by regulation the intensity of the chain reaction, one may try to control AI by limiting its ability to self-improve in various ways.

2. Capture the beginning of dangerous self-improvement. At the start of potentially dangerous AI it has a moment of critical vulnerability, just as a ballistic missile is most vulnerable at the start. Imagine that AI gained an unauthorized malignant goal system and started to strengthen itself. At the beginning of this process, it is still weak, and if it is below the level of human intelligence at this point, it may be still more stupid than the average human even after several cycles of self-empowerment. Let’s say it has an IQ of 50 and after self-improvement it rises to 90. At this level it is already committing violations that can be observed from the outside (especially unauthorized self-improving), but does not yet have the ability to hide them. At this point in time, you can turn it off. Alas, this idea would not work in all cases, as some of the objectives may become hazardous gradually as the scale grows (1000 paperclips are safe, one billion are dangerous, 10 power 20 are x-risk). This idea was put forward by Ben Goertzel.

3. AI constitution. First, in order to describe the Friendly AI and human values we can use the existing body of criminal and other laws. (And if we create an AI that does not comply with criminal law, we are committing a crime ourselves.) Second, to describe the rules governing the conduct of AI, we can create a complex set of rules (laws that are much more complex than Asimov’s three laws), which will include everything we want from AI. This set of rules can be checked in advance by specialized AI, which calculates only the way in which the application of these rules can go wrong (something like mathematical proofs on the basis of these rules).

4. “Philosophical landmines.” In the map of AI failure levels I have listed a number of ways in which high-level AI may halt when faced with intractable mathematical tasks or complex philosophical problems. One may try to fight high-level AI using “landmines”, that is, putting it in a situation where it will have to solve some problem, but within this problem is encoded more complex problems, the solving of which will cause it to halt or crash. These problems may include Godelian mathematical problems, nihilistic rejection of any goal system or the inability of AI to prove that it actually exists.

5. Multi-layer protection. The idea here is not that if we apply several methods at the same time, the likelihood of their success will add up, this notion will not work if all methods are weak. The idea is that the methods of protection work together to protect the object from all sides. In a sense, human society works the same way: a child is educated by an example as well as by rules of conduct, then he begins to understand the importance of compliance with these rules, but also at the same time the law, police and neighbours are watching him, so he knows that criminal acts will put him in jail. As a result, lawful behaviour is his goal which he finds rational to obey. This idea can be reflected in the specific architecture of AI, which will have at its core a set of immutable rules, around it will be built human emulation which will make high-level decisions, and complex tasks will be delegated to a narrow Tool AIs. In addition, independent emulation (conscience) will check the ethics of its decisions. Decisions will first be tested in a multi-level virtual reality, and the ability of self-improvement of the whole system will be significantly limited. That is, it will have an IQ of 300, but not a million. This will make it effective in solving aging and global risks, but it will also be predictable and understandable to us. The scope of its jurisdiction should be limited to a few important factors: prevention of global risks, death prevention and the prevention of war and violence. But we should not trust it in such an ethically delicate topic as prevention of suffering, which will be addressed with the help of conventional methods.

This map could be useful for the following applications:

1. As illustrative material in the discussions. Often people find solutions ad hoc, once they learn about the problem of friendly AI or are focused on one of their favourite solutions.

2. As a quick way to check whether a new solution really has been found.

3. As a tool to discover new solutions. Any systematisation creates “free cells” to fill for which one can come up with new solutions. One can also combine existing solutions or be inspired by them.

4. There are several new ideas in the map.

A companion to this map is the map of AI failures levels. In addition, this map is subordinated to the map of global risk prevention methods and corresponds to the block “Creating Friendly AI” Plan A2 within it.

The pdf of the map is here:

AGI Safety Solutions by Turchin Alexei

Leave a Comment

Filed under Без рубрики

A Map: AGI Failures Modes and Levels

This map shows that AI failure resulting in human extinction could happen on different levels of AI development, namely, before it starts self-improvement (which is unlikely but we still can envision several failure modes), during its take off, when it use different instruments to break out from its initial confinement, and after its successful take over the world, when it starts to implement its goal system which could be plainly unfriendly or its friendliness may be flawed.
AI also can halts on late stages of its development because either technical problems or “philosophical” one.
I am sure that the map of AI failure levels is needed for the creation of Friendly AI theory as we should be aware of various risks. Most of ideas in the map came from “Artificial Intelligence as a Positive and Negative Factor in Global Risk” by Yudkowsky, from chapter 8 of “Superintelligence” by Bostrom, from Ben Goertzel blog and from hitthelimit blog, and some are mine.
I will now elaborate three ideas from the map which may need additional clarification.

The problem of the chicken or the egg

The question is what will happen first: AI begins to self-improve, or the AI got a malicious goal system. It is logical to assume that the goal system change will occur first, and this gives us a chance to protect ourselves from the risks of AI, because there will be a short period of time when AI already has bad goals, but has not developed enough to be able to hide them from us effectively. This line of reasoning comes from Ben Goertzel.
Unfortunately many goals are benign on a small scale, but became dangerous as the scale grows. 1000 paperclips are good, one trillion are useless, and 10 to the power of 30 paperclips are an existential risk.

AI halting problem

Another interesting part of the map are the philosophical problems that must face any AI. Here I was inspired after this reading Russian-language blog hitthelimit
One of his ideas is that the Fermi paradox may be explained by the fact that any sufficiently complex AI halts. (I do not agree that it completely explains the Great Silence.)
After some simplification, with which he is unlikely to agree, the idea is that as AI self-improves its ability to optimize grows rapidly, and as a result, it can solve problems of any complexity in a finite time. In particular, it will execute any goal system in a finite time. Once it has completed its tasks, it will stop.
The obvious objection to this theory is the fact that many of the goals (explicitly or implicitly) imply infinite time for their realization. But this does not remove the problem at its root, as this AI can find ways to ensure the feasibility of such purposes in the future after it stops. (But in this case it is not an existential risk if their goals are formulated correctly.)
For example, if we start from timeless physics, everything that is possible already exists and the number of paperclips in the universe is a) infinite b) unchangable. When the paperclip maximizer has understood this fact, it may halt. (Yes, this is a simplistic argument, it can be disproved, but it is presented solely to illustrate the approximate reasoning, that can lead to AI halting.) I think the AI halting problem is as complex as the halting problem for Turing Machine.
Vernor Vinge in his book Fire Upon the Deep described unfriendly AIs which halt any externally visible activity about 10 years after their inception, and I think that this intuition about the time of halting from the point of external observer is justified: this can happen very quickly. (Yes, I do not have a fear of fictional examples, as I think that they can be useful for explanation purposes.)
In the course of my arguments with “hitthelimit” a few other ideas were born, specifically about other philosophical problems that may result in AI halting.
One of my favorites is associated with modal logic. The bottom line is that from observing the facts, it is impossible to come to any conclusions about what to do, simply because oughtnesses are in a different modality. When I was 16 years old this thought nearly killed me.
It almost killed me, because I realized that it is mathematically impossible to come to any conclusions about what to do. (Do not think about it too long, it is a dangerous idea.) This is like awareness of the meaninglessness of everything, but worse.
Fortunately, the human brain was created through the evolutionary process and has bridges from the facts to oughtness, namely pain, instincts and emotions, which are out of the reach of logic.
But for the AI with access to its own source code these processes do not apply. For this AI, awareness of the arbitrariness of any set of goals may simply mean the end of its activities: the best optimization of a meaningless task is to stop its implementation. And if AI has access to the source code of its objectives, it can optimize it to maximum simplicity, namely to zero.
Lobstakle by Yudkowsky is also one of the problems of high level AI, and it’s probably just the beginning of the list of such issues.

Existence uncertainty

If AI use the same logic as usually used to disprove existence of philosophical zombies, it may be uncertain if it really exists or it is only a possibility. (Again, then I was sixteen I spent unpleasant evening thinking about this possibility for my self.) In both cases the result of any calculations is the same. It is especially true in case if AI is philozombie itself, that is if it does not have qualia. Such doubts may result in its halting or in conversion of humans in philozombies. I think that AI that do not have qualia or do not believe in them can’t be friendly. This topic is covered in the map in the bloc “Actuality”.
The status of this map is a draft that I believe can be greatly improved. The experience of publishing other maps has resulted in almost a doubling of the amount of information. A companion to this map is a map of AI Safety Solutions, which I will publish later.
The map was first presented to the public at a LessWrong meetup in Moscow in June 2015 (in Russian)
Pdf is here:

AI failures modes and levels by Turchin Alexei

Leave a Comment

Filed under Без рубрики

A Roadmap: How to Survive the End of the Universe

In a sense, this plan needs to be perceived with irony because it is almost irrelevant: we have very small chances of surviving even next 1000 years and if we do, we have a lot of things to do before it becomes reality. And even afterwards, our successors will have completely different plans.

There is one important exception: there are suggestions that collider experiments may lead to a vacuum phase transition, which begins at one point and spreads across the visible universe. Then we can destroy ourselves and our universe in this century, but it would happen so quickly that we will not have time to notice it. (The term “universe” hereafter refers to the observable universe that is the three-dimensional world around us, resulting from the Big Bang.)

We can also solve this problem in next century if we create superintelligence.

The purpose of this plan is to show that actual immortality is possible: that we have an opportunity to live not just billions and trillions of years, but an unlimited duration. My hope is that the plan will encourage us to invest more in life extension and prevention of global catastrophic risks. Our life could be eternal and thus have meaning forever.

Anyway, the end of the observable universe is not an absolute end: it’s just one more problem on which the future human race will be able to work. And even at the negligible level of knowledge about the universe that we have today, we are still able to offer more than 50 ideas on how to prevent its end.

In fact, to assemble and come up with these 50 ideas I spent about 200 working hours, and if I had spent more time on it, I’m sure I would have found many new ideas. In the distant future we can find more ideas; choose the best of them; prove them, and prepare for their implementation.

First of all, we need to understand exactly what kind end to the universe we should expect in the natural course of things. There are many hypotheses on this subject, which can be divided into two large groups:

1. The universe is expected to have a relatively quick and abrupt end, known as the Big Crunch or Big Rip (accelerating expansion of the universe causes it to break apart), or the decay of the false vacuum. Vacuum decay can occur at any time; a Big Rip could happen in about 10-30 billion years, and the Big Crunch has hundreds of billions of years timescale.

2. Another scenario assumes an infinitely long existence of an empty, flat and cold universe which would experience so called “heat death” that is gradual halting of all processes and then disappearance of all matter.

The choice between these scenarios depends on the geometry of the universe, which is determined by the equations of general relativity and, – above all – the behavior of the almost unknown parameter: dark energy.

The recent discovery of dark energy has made Big Rip the most likely scenario, but it is clear that the picture of the end of the universe will change several times.

You can find more at:

There are five general approaches to solve the end of the universe problem, each of them includes many subtypes shown in the map:

1. Surf the Wave: Utilize the nature of the process which is ending the universe. (The most known of these type of solutions is Omega Point by Tippler, where the universe’s energy collapse is used to make infinite calculations.)

2. Go to parallel world

3. Prevent the end of the universe

4. Survive the end of the universe

5. Dissolving the problem

Some of the ideas are on the level of the wildest possible speculations and I hope you will enjoy them.

The new feature of this map is that in many cases mentioned, ideas are linked to corresponding wiki pages in the pdf.

Download the pdf of the map here:

How to Survive the End of the Universe by Turchin Alexei

Leave a Comment

Filed under Без рубрики

Immortality Roadmap

The Roadmap to Personal Immortality is list of actions that one should do to live forever. The most obvious way to reach immortality is to defeat aging, to grow and replace the diseased organs with new bioengineered ones, and in the end to be scanned into a computer. This is Plan A. It is the best possible course of events. It depends on two things – your personal actions (like regular medical checkups) and collective actions like civil activism and scientific research funding. The map is showing both paths in Plan A.

However, if Plan A fails, meaning if you die before the victory over aging, there is Plan B, which is cryonics. Some simple steps can be taken now, like calling your nearest cryocompany about a contract.

Unfortunately, cryonic could also fail, and then you can move to Plan C. Of course it is much worse – less reliable and less proven. Plan C is the so called digital immortality, that means one could be returned to life based on the existing recorded information about that person. It is a not the best plan, because we are not sure how to solve the identity problem, which will arise, and also we don’t know if collected amount of information would be enough. But it is still better than nothing.

Lastly, if Plan C fails, we have Plan D. It is not a plan in fact – it is just hope or a bet that immortality already exists somehow, maybe there is quantum immortality, or maybe the future AI will bring us back to life.

All Plans demand particular actions now – we need to prepare to all of them simultaneously. All of the Plans will lead to the same result – our minds will be uploaded into a computer and will merge with AI. So these plans are in fact multilevel defence mechanisms against death structured in the most logical way.

The Roadmap to personal immortality by Turchin Alexei

Leave a Comment

Filed under Без рубрики