Introduction to Randomly Evolving Machinecode
by Tom Van Braeckel
June 22nd 2001
Copyright (C) 2001-2002

This is an introduction to Randomly Evolving Machinecode (REM).
Copyright (C) 2001-2002, Tom Van Braeckel.

Anyone may reproduce this document, in whole or in part, provided that: (1) any copy or republication of this whole document or a part of this document must show Tom Van Braeckel as the source, and must include this notice; and (2) any other use of this material must reference this manual and Tom Van Braeckel, and the fact that the material is copyright by Tom Van Braeckel and is used by permission.

2001 Edition

Table of Contents

1. Introduction
   1.1 What is REM ?
   1.2 Link to nature
   1.3 Known issues
2 Definitions and definition extensions
   2.1 Randomly
   2.2 Evolving
   2.3 Machinecode
   2.4 Natural Selection
   2.5 Digital Selection
3 The use of REM
   3.1 Proving Evolution
   3.2 Studying accelerated evolution and natural selection
   3.3 Other uses
4 Virus-like REM
   4.1 Introduction to viruses
   4.2 Why write virus-like REM
   4.3 Introduction to Antiviruses
   4.4 Introduction to Encrypted viruses
   4.5 Introduction to Polymorphic viruses
   4.6 Introduction to virus-like REM
5 About the Author
6 Bibliography
   6.1 Evolution vs Creation
   6.2 Earth Timeline
   6.3 The Human Genome: A Creationist Overview
   6.4 DebateDarwin.com
   6.5 Understanding and Managing Polymorphic Viruses
   6.6 Brain

***********************************************************************

1. Introduction

Table of Contents:
   1.1 What is REM ?
   1.2 Link to nature
   1.3 Known issues
       1.3.a Speed
       1.3.b Corruption
       1.3.c Lack of resources

1.1 What is REM ?

REM stands for Randomly Evolving Machinecode. REM is a program that replicates by copying itself, randomly changing one or more bytes in each offspring. After replicating, REM runs the offspring, which should in turn start replicating. An offspring that is not able to replicate goes extinct.

1.2 Link to nature

Most scientists believe that in nature, primitive organisms replicate by creating offspring that look the same at first sight, and mostly they are the same. But sometimes, when we examine the offspring more closely, we notice that it has changed in some way. The change is often just one or a few genes that have mutated, and it is most likely random.

The idea of porting this sort of mutation to computers has been around for a while. "Porting this sort of mutation to computers" means creating a program which replicates, but randomly changes one bit or one byte in every offspring.

1.3 Known Issues

1.3.a Speed

It took many millions of years for a single replicating cell to evolve into an intelligent being, no matter how primitive the outcome was. Unless we are very patient and we invent a way to stop aging, REM's evolution has to go a lot faster. Luckily, computers are known to be very fast for their price, and companies just keep developing faster computers. The example programs all produce many offspring per second running on a 500 MHz computer.

Problem:     Evolution is too slow.
Solution:    Fast mutation.
New Problem: Corruption

Unluckily, fast mutation leads to another problem to solve: corruption.

1.3.b Corruption

With fast mutation, because computers work only with exact digital calculations, a minor change to a computer program is very likely to corrupt the program completely. In some cases, the whole computer will crash, making it impossible for the REM to continue evolving.

Nature solves this problem by mutating very little and by leaving long periods of time between mutations. This way, an organism first has the chance to produce many offspring before a new mutation is tried. When the mutation corrupts the organism, the others just continue replicating.

But no matter how bad it looks for REM right now, there is a solution for the corruption problem: running REM on many computer systems, which the author will call "resources" for now.

Problem:     Fast mutation leads to a great chance of corruption.
Solution:    Resources; run REM on many computer systems.
New Problem: Lack of resources

Unluckily, resources lead to another problem to solve: the lack of them.

1.3.c Lack of resources

Say there are 100 computers in the whole world running REM. That's very few, and the computers are not likely to have the REM running on them for years, so even IF the REM evolved spectacularly, the world would never get to see it.

The solution looks easy: program the first version of REM to email its offspring to a few other people owning a computer. That way, it will run on the other computers too, mutating and replicating. But this way of solving the problem is illegal, because it is considered theft of computer resources, AND there are people constantly writing programs to stop this kind of code from spreading: the Virus-Experts.

Problem:     Lack of resources
Solution:    Write virus-like REM
New Problem: Virus-Experts, Antiviruses

Luckily, Virus-Experts and Antiviruses are only a minor problem, since current virus detection methods don't stand a chance against REM. Read more about virus-like REM in 4: Virus-like REM.

-----------------------------------------------------------------------

2 Definitions and extensions

Table of Contents:
   2.1 Randomly
   2.2 Evolving
   2.3 Machinecode
   2.4 Natural Selection
   2.5 Digital Selection

2.1 Randomly

ran·dom (răn′dəm)
 - Having no specific pattern, purpose, or objective: random movements.
 - Relating to an event in which all outcomes are equally likely, as in the testing of a blood sample for the presence of a substance.

In REM, we say "randomly" because every byte is equally likely to be changed.
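
What "every byte is equally likely to be changed" means can be shown with a small sketch. It is written in Python rather than assembly for readability, and it only works on a byte string held in memory: nothing is executed, copied to disk, or sent anywhere. The function name mutate_one_byte, the 16-byte sample buffer, and the fixed seed are assumptions made for this illustration and are not taken from any REM implementation.

```python
import random


def mutate_one_byte(code: bytes, rng: random.Random) -> bytes:
    """Return a copy of `code` in which exactly one byte has a new value."""
    if not code:
        raise ValueError("cannot mutate an empty buffer")
    # Every offset in the buffer is equally likely to be picked.
    position = rng.randrange(len(code))
    mutated = bytearray(code)
    # Adding 1..255 modulo 256 picks uniformly among the 255 *other* values,
    # so the chosen byte is guaranteed to change.
    mutated[position] = (mutated[position] + rng.randrange(1, 256)) % 256
    return bytes(mutated)


rng = random.Random(2001)          # fixed seed so the run is repeatable
parent = bytes(range(16))          # stand-in for a program image
child = mutate_one_byte(parent, rng)
changed = [i for i, (a, b) in enumerate(zip(parent, child)) if a != b]
print("changed offsets:", changed)
```

The parent buffer is never modified; each call produces a new "offspring" that differs from its parent at exactly one offset.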

NOTE: Even though the evolution seems random, the chance that a line of offspring ends up as a non-working program is very small, because a generation that is not able to reproduce will go extinct. This process is called "natural selection", see 2.4.

But can a computer generate random numbers ? Some say it can't, because the only thing it can do is calculate, and in theory, the outcome of every calculation is predictable. When we refer to random numbers in computers, we mean a number which is the result of a calculation using inputs we can predict in theory, but which we cannot predict in practice.

Example:
--------
    IN  AX,0x40     ; Read from I/O port 0x40 into AX
                    ; (AL receives the byte at port 0x40).
                    ; On an x86 PC (Intel, AMD,...), this port is
                    ; a counter of the system timer chip, so the
                    ; value keeps changing.
                    ; AH is the left (high) half of what is in AX
                    ; AL is the right (low) half of what is in AX
                    ; Example:  AX = 0x4540
                    ;        => AH = 0x45
                    ;        => AL = 0x40
                    ; So now AH and AL are both hard to predict.
    MOV AH,0x0E     ; BIOS function "teletype output": print the
                    ; character whose code is in AL.
    MOV BX,0x0000   ; Display page 0. Don't mind this if
                    ; you don't know what it means.
    INT 0x10        ; Now we print the value of AL to the screen.
                    ; In the example this would be character nr. 0x40,
                    ; which is an '@'.
                    ; Try to predict the output.
                    ; It is very hard.

There are more complicated ways to generate even more random numbers, but for REM, this will do.

Can a human think of a random number ? Everything we do is the result of electrical impulses in our brain. If someone or something could measure all those impulses and knew exactly what each impulse does, it could -theoretically- predict the random number you are going to choose.

Can anything be random ? If someone or something could predict what you are going to do, and if the same machine could predict what everyone on this planet is going to do, it could just speed them up on some virtual planet, and predict what is going to happen 100 years from now.

When you flip a coin, the random number depends on the force with which you do it, the wind, gravity and so on.
Well, if something knew all these factors, it could predict the number (0 or 1) even before the coin falls back into the palm of your hand.

2.2 Evolving

e·volv·ing
 - To develop (a characteristic) by evolutionary processes.

ev·o·lu·tion (ĕv′ə-lo͞o′shən, ē′və-)
 - A gradual process in which something changes into a different and usually more complex or better form.
 - The process of developing.
 - Change in the genetic composition of a population during successive generations, as a result of natural selection acting on the genetic variation among individuals, and resulting in the development of new species.

Looking at the first definition of evolution, we notice that, to be able to say REM is evolving, it needs to change into a different form, which is accomplished by changing a random byte at a random location. Changing into a more complex and/or better form happens automatically due to Natural Selection (see 2.4 Natural Selection).

2.3 Machinecode

ma·chine code
 - A set of instructions for a specific central processing unit, designed to be usable by a computer without being translated.
 - Also called Machine Language.

Machinecode is often confused with sourcecode. It is NOT:

    MOV AH,3Eh
    MOV BX,sourcefile_handle
    INT 21h
    JC  ERROR3

Machinecode is a sequence of byte values, usually displayed in hexadecimal, placed in a specific order. Open an executable file in your favorite text editor to see what machinecode looks like.

2.4 Natural Selection

nat·u·ral se·lec·tion
 - The process in nature by which, according to Darwin's theory of evolution, only the organisms best adapted to their environment tend to survive and transmit their genetic characteristics in increasing numbers to succeeding generations, while those less adapted tend to be eliminated.

'Natural Selection' contains the word natural, which means it happens in nature. Because REM cannot exist in nature, there is a new term which CAN be used: Digital Selection (2.5).
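
The interplay of random change (2.2) and selection (2.4) can be tried out in a toy model. The sketch below is written in Python and is only a simulation on byte strings in memory: the "genomes" are never executed, and "being able to replicate" is reduced to a hypothetical 4-byte marker (REPLICATOR) staying intact. The genome length, the three offspring per parent, the population limit of 200, and all function names are assumptions chosen for this illustration.

```python
import random

REPLICATOR = b"COPY"   # hypothetical stand-in for a working copy routine
GENOME_LENGTH = 16     # assumed size of a toy genome, in bytes


def can_replicate(genome: bytes) -> bool:
    """A genome counts as viable only while its marker is undamaged."""
    return genome.startswith(REPLICATOR)


def mutate(genome: bytes, rng: random.Random) -> bytes:
    """Copy the genome and change exactly one uniformly chosen byte."""
    position = rng.randrange(len(genome))
    child = bytearray(genome)
    child[position] = (child[position] + rng.randrange(1, 256)) % 256
    return bytes(child)


def next_generation(population, rng, offspring_per_parent=3, capacity=200):
    """Viable genomes leave mutated offspring; the others leave none."""
    children = []
    for genome in population:
        if not can_replicate(genome):
            continue                      # this line of descent ends here
        for _ in range(offspring_per_parent):
            children.append(mutate(genome, rng))
    rng.shuffle(children)                 # limited resources: random survivors
    return children[:capacity]


def simulate(generations=20, founders=20, seed=2001):
    rng = random.Random(seed)
    founder = REPLICATOR + bytes(GENOME_LENGTH - len(REPLICATOR))
    population = [founder] * founders
    for _ in range(generations):
        population = next_generation(population, rng)
    return population


if __name__ == "__main__":
    final = simulate()
    viable = [g for g in final if can_replicate(g)]
    print(f"{len(viable)} of {len(final)} genomes can still replicate")
```

In this model a damaged marker is never inherited for more than one generation, because its carrier leaves no offspring, while changes to the other 12 bytes accumulate freely. That is the sense in which selection filters the random changes.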

2.5 Digital Selection

dig·i·tal se·lec·tion
 - The process in a digital environment by which, according to Tom Van Braeckel's theory of Randomly Evolving Machinecode, only the Randomly Evolving Machinecode best adapted to its environment tends to survive and to reproduce, that way transmitting its characteristics in increasing numbers to succeeding generations, while those less adapted tend to be eliminated.

Example:
--------
Suppose REM has been running on x86 computers for about 10 years. CPU capabilities have continued to increase and people decide to start using 64-bit computers. This is great for CPU making companies but terrible for REM, as it can only run on 32-bit computers. But, as it evolves, there will be REM offspring that CAN run on 64-bit computers, and most likely ONLY on 64-bit computers. When a 32-bit REM creates a 64-bit REM and emails it on, the 64-bit offspring will just continue to evolve on the new platform. In time, the 32-bit REMs will go extinct or evolve into 64-bit REMs. This is called Digital Selection.

But such a big change might not even be needed. Maybe new operating systems will include an advanced emulator which allows 32-bit code to be run on a 64-bit computer. In this case, the REM could slowly adapt to the new processors.

-----------------------------------------------------------------------

3 The use of REM

Table of Contents:
   3.1 Proving Evolution
   3.2 Studying accelerated evolution and natural selection
   3.3 Other uses

3.1 Proving Evolution

In his 'Creation vs Evolution', Mr. Wiebe states:

"[..] Most of us understand that the information that represents the data and instructions for a computer program has a particular code, designed specifically by the software engineer. What would we expect to happen if, [..] we zapped the binary image from which it was executing with a random change of some data bit? In most cases, the program would probably crash or seriously fail to accomplish anything useful.
In some cases, the program might continue on oblivious to the change. In a very few cases, the program might exhibit some interesting aberrant behavior. But in no cases would we expect to get a more complex program or a program of a totally different kind."
   -- Mr. Wiebe in his 'Creation vs Evolution'

Here Mr. Wiebe is talking about what he would EXPECT. If Mr. Wiebe had seen REM evolving, he would have been able to talk about what he would really SEE. Mr. Wiebe doesn't expect the program to become more complex or totally different. I don't expect it to become totally different either. But I DO expect the program to become A LITTLE different, and possibly A LITTLE more complex.

Mr. Wiebe continues:

"So it is with random genetic mutations. Life forms are more complex than any computer program that we have ever designed. Random genetic mutations are bad. When they have an observable effect (i.e., are phenotypically expressed), they are almost always to the detriment of the organism, killing it, maiming it, making it sterile, etc. [..]"
   -- Mr. Wiebe in his 'Creation vs Evolution'

Mr. Wiebe uses a REM-like thought experiment to explain that he doesn't expect a mutation to produce something more complex or of a totally different kind. Mr. Wiebe says mutations are bad, because he never expects them to improve an organism. Well, what if REM WOULD sometimes IMPROVE by mutation ? I'd say, if a simulation of an organism CAN improve by mutation, why can't monera have done the same millions (even billions) of years ago ?

3.2 Studying evolution and natural selection

Or to be more correct: "Studying accelerated evolution, fast mutation and digital selection."

A human's life is way too short to study evolution at normal speed. But using REM, we can simulate evolution by making it about 10 000 times as fast.
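
The estimate made in this section is plain arithmetic, and writing it out makes its assumptions explicit. The sketch below, in Python, uses only the figures given in this section: one offspring per second for a typical moneran, 500 per second for one REM, 1 000 000 REMs running at once, and 500 million years of natural evolution. The function names are chosen for this illustration. The calculation assumes, as the text does, that the pace of evolution scales linearly with the number of offspring produced per second; whether that holds is exactly the open question the author invites comments on.

```python
def speedup_factor(baseline_rate: float, rem_rate: float, rem_instances: int) -> float:
    """How many times more offspring per second REM produces in total."""
    return (rem_rate / baseline_rate) * rem_instances


def years_required(baseline_years: float, speedup: float) -> float:
    """Time needed if evolution speeds up in proportion to offspring rate."""
    return baseline_years / speedup


speedup = speedup_factor(baseline_rate=1, rem_rate=500, rem_instances=1_000_000)
years = years_required(baseline_years=500_000_000, speedup=speedup)
print(f"speed-up: {speedup:,.0f}x, time needed: {years:g} year(s)")
```

With these inputs the factor is 500 000 000 and the result is one year, matching the figures in the text.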

Study this:

Typical moneran:          1 offspring / second
Typical REM:              500 times as fast (500 offspring / second)
1 000 000 typical REMs:   500 000 000 times as fast

Suppose it took a moneran 500 million years to evolve into an intelligent being. A little math shows us it should take one year for 1 000 000 REMs to evolve into an intelligent being. Whoever thinks I am missing something here, feel free to contact me.

3.3 Other uses

Instead of using viruses to spread REM (the reason for this is given in 1.3.c), one could use REM to spread viruses. More on using REM to spread viruses in '4: Virus-like REM'.

-----------------------------------------------------------------------

4 Virus-like REM

Table of Contents:
   4.1 Introduction to viruses
   4.2 Why write virus-like REM
   4.3 Introduction to Antiviruses
   4.4 Introduction to Encrypted viruses
   4.5 Introduction to Polymorphic viruses
       4.5.a Polymorphic Detection
             4.5.a.a Generic Decryption
             4.5.a.b Heuristic-Based Generic Decryption
             4.5.a.c The Striker System
   4.6 Introduction to virus-like REM

4.1 Introduction to viruses

A virus is a cracker program that searches out other programs and "infects" them by embedding a copy of itself in them, so that they become Trojan horses. When these programs are executed, the embedded virus is executed too, thus propagating the "infection". Unlike a worm, a virus cannot infect other computers without assistance. It is propagated by vectors such as humans trading programs with their friends. The virus may do nothing but propagate itself and then allow the program to run normally.

Note that 'virii' is not the official plural of virus, although the term is widely used. The official plural of virus is viruses.

Instead of virus-like REM, one could even write worm-like REM, which would mean the REM spreads on its own, by -for example- emailing itself to other people. A worm is a program that propagates itself over a network, reproducing itself as it goes.
Nowadays the term has negative connotations, as it is assumed that only crackers write worms. To keep it simple, when referring to viruses, I mean both viruses and worms.

4.2 Why write virus-like REM

The use of virus-like REM is explained in "1.3.c : Lack of resources". In short: when REM runs on multiple computers, there is a much greater chance that an interesting offspring will be created. There's a very small chance people will voluntarily run REM on their computers, and one might see virus-like REM as a way to run REM on many computers.

4.3 Introduction to Antiviruses

An Antivirus is a software program designed to identify and remove a known or potential computer virus. Most antivirus programs include an auto-update feature that enables the program to download profiles of new viruses so that it can check for the new viruses as soon as they are discovered.

"A simple virus that merely replicates itself is the easiest to detect. If a user launches an infected program, the virus gains control of the computer and attaches a copy of itself to another program file. After it spreads, the virus transfers control back to the host program, which functions normally. Yet no matter how many times a simple virus infects a new file or floppy disk, for example, the infection always makes an exact copy of itself. Anti-virus software need only search, or scan, for a tell-tale sequence of bytes known as a signature found in the virus. In response, virus authors began encrypting viruses. The idea was to hide the fixed signature by scrambling the virus, making it unrecognizable to a virus scanner. These viruses are called encrypted viruses."
   -- Carey Nachenberg for Symantec, see 6.5

4.4 Introduction to Encrypted viruses

An encrypted virus looks like this:

1 Decryption Routine :  A program which decrypts the encrypted virus
----------------------  using the encryption key.

2 Encryption Key :      The "key" to decrypt the encrypted virus. Without
------------------      this key, the encrypted virus cannot be decrypted.
                        The encryption key changes every time the virus
                        replicates.

3 Encrypted Virus :     The encrypted virus. Because the key changes every
-------------------     time the virus replicates, so does the encrypted
                        virus.

This might look like a hard-to-detect virus, but it is not, because the Decryption Routine never changes. The virus scanner just looks for the routine, and when it finds a file containing it, it reports a virus.

4.5 Introduction to Polymorphic viruses

"In retaliation, virus authors developed the polymorphic virus. Like an encrypted virus, a polymorphic virus includes a scrambled virus body and a decryption routine that first gains control of the computer, then decrypts the virus body. However, a polymorphic virus adds to these two components a third: a mutation engine that generates randomized decryption routines that change each time a virus infects a new program. With no fixed signature to scan for, and no fixed decryption routine, no two infections look alike. The result is a formidable adversary. The Tequila and Maltese Amoeba viruses caused the first widespread polymorphic infections in 1991. In 1992, Dark Avenger, author of Maltese Amoeba, distributed the Mutation Engine, also known as MtE, to other virus authors with instructions on how to use it to build still more polymorphics. Today, anti-virus researchers report that polymorphic viruses comprise about five percent of the more than 8,000 known viruses."
   -- Carey Nachenberg for Symantec, see 6.5

4.5.a Polymorphic Detection

Anti-virus researchers first fought back by creating special detection routines designed to catch each polymorphic virus, one by one. This approach proved inherently impractical, time-consuming, and costly. Each new polymorphic requires its own detection program.

4.5.a.a Generic Decryption

"A scanner that uses generic decryption relies on this behavior to detect polymorphics.
It loads this file into a self-contained virtual computer created from RAM. Inside this virtual computer, program files execute as if running on a real computer. The scanner monitors and controls the program file as it executes inside the virtual computer. A polymorphic virus running inside the virtual computer can do no damage because it is isolated from the real computer. The key problem with generic decryption is speed. Generic decryption is of no practical use if it spends five hours waiting for a polymorphic virus to decrypt inside the virtual computer."
   -- Carey Nachenberg for Symantec, see 6.5

4.5.a.b Heuristic-Based Generic Decryption

"To solve this problem, generic decryption employs heuristics, a generic set of rules that helps differentiate non-virus from virus behavior. As an example, a typical nonvirus program will in all likelihood use the results from math computations it makes as it runs inside the virtual computer. On the other hand, a polymorphic virus may perform similar computations, yet throw away the results because those results are irrelevant to the virus. Heuristic-based generic decryption looks for such inconsistent behavior. An inconsistency increases the likelihood of infection and prompts a scanner that relies on heuristic-based rules to extend the length of time a suspect file executes inside the virtual computer, giving a potentially infected file enough time to decrypt itself and expose a lurking virus.

Inhibitor Rules:
 • If the contents of a register are destroyed before being used, increase VirusProbability by 1.2%.
 • If a NOP instruction is encountered, then increase VirusProbability by .5%.
 • If the program does no memory writes within 100 executed instructions, decrease VirusProbability by 5%.
 • If the program generates DOS interrupts, decrease VirusProbability by 15%.

Unfortunately, heuristics demand continual research and updating.
Heuristic rules tuned to detect 500 viruses, for example, may miss 10 of those viruses when altered to detect 5 new viruses."
   -- Carey Nachenberg for Symantec, see 6.5

4.5.a.c The Striker System

"Symantec’s Striker system provides anti-virus researchers with a new weapon to detect polymorphics. Like generic decryption, each time it scans a new program file, Striker loads this file into a self-contained virtual computer created from RAM. The program executes in this virtual computer as if it were running on a real computer. However, Striker does not rely on heuristic guesses to guide decryption. Instead, it relies on virus profiles or rules that are specific to each virus, not a generic set of rules that differentiate nonvirus from virus behavior. When scanning a new file, Striker first attempts to exclude as many viruses as possible from consideration, just as a doctor rules out the possibility of chicken pox if an examination fails to detect scabs on a patient’s body. For example, different viruses infect different executable file formats. Some infect only .COM files. Others infect only .EXE files. Some viruses infect both. Very few infect .SYS files. As a result, as it scans an .EXE file, Striker ignores polymorphics that infect only .COM and .SYS files. If all viruses are eliminated from consideration, then the file is deemed clean. Striker closes it and advances to scan the next file. To date, generic decryption has proved to be the single most effective method of detecting polymorphics. Striker improves on this approach. Yet it is only a matter of time before virus authors design some new, insidious type of virus that evades current methods of detection."
   -- Carey Nachenberg for Symantec, see 6.5

4.6 Introduction to virus-like REM

Virus-like REM looks like this:

REM:
-----
1 Mutation Unit:       The part of the REM which loads itself and
----------------       changes one or more bytes.
                       The Reproduction Unit is also changed.

2 Reproduction Unit:   The part of the REM which makes sure
--------------------   the mutated REM will spread. This part will
                       infect .EXE files or, for example, simply
                       email itself to other users.

The Mutation Unit (which is the first part of the code) changes the REM a little. The Reproduction Unit takes care of other things, such as emailing itself to other users, or disabling the Antivirus Auto-Update Function.

Currently, there is no detection method against virus-like REMs and REM-like viruses, so antivirus experts have to write a special detection module for every REM-like virus or virus-like REM.

-----------------------------------------------------------------------

5 About the author

It's very hard to write a few lines about yourself that someone else is going to read, but here is an attempt: I'm interested in every aspect of almost everything.

Anyone who has a comment, a question, an answer, an idea, a compliment, and anyone who doesn't agree with something I mentioned in this introduction, feel free to send an e-mail to email@example.com . Needless to say, every email gets a reply. One could also check my homepage at: http://t-Omicr0n.hexyn.be/ to find out what I've been doing lately.

-----------------------------------------------------------------------

6 Bibliography

Table of Contents:
   6.1 Evolution vs Creation
   6.2 Earth Timeline
   6.3 The Human Genome: A Creationist Overview
   6.4 DebateDarwin.com
   6.5 Understanding and Managing Polymorphic Viruses
   6.6 Brain

6.1 Evolution vs Creation

A very impressive and must-read paper written by Garth D. Wiebe on why he believes in creation instead of evolution. Written in English.
Document URL: http://www.ultranet.com/~wiebe/e.htm

6.2 Earth Timeline

A simple image showing what happened on Earth, and when, a long time ago. It starts 600 million years ago. Commented in Dutch.
Image URL: http://montessori-infosite.kennisnet.nl/1emens/1emensplaat/tijdlijngr.JPG

6.3 The Human Genome Project

A Creationist explains the Human Genome Project:
Document URL: http://www.icr.org/headlines/humangenomemap.html

The Human Genome Project: Genome 'treasure trove'
Document URL: http://newsvote.bbc.co.uk/hi/english/sci/tech/newsid_1164000/1164839.stm

6.4 DebateDarwin.com

"The monkey trial is still with us. Evolution is being attacked, and is fighting back. It's a lively debate full of sound and fury and signifying something, but what?" This site is to give a hearing to the actors in this play about origins. That covers quite a bit of ground --- evolutionists, creationists, intelligent designers,...
Document URL: http://www.debatedarwin.com/

6.5 Understanding and Managing Polymorphic Viruses

Another excellent paper, by Carey Nachenberg for Symantec. It deals with Polymorphic Viruses and is easy to read, even for beginners.
Document URL: http://www.norton.com/

6.6 Brain

Not to be confused with intelligence. One who has brains is not automatically intelligent.

The portion of the vertebrate central nervous system that is enclosed within the cranium, continuous with the spinal cord, and composed of gray matter and white matter. It is the primary center for the regulation and control of bodily activities, receiving and interpreting sensory impulses, and transmitting information to the muscles and body organs. It is also the seat of consciousness, thought, memory, and emotion.

Note: Homepage and contact info are outdated.