META-GENETIC PROGRAMMING

Evolving a genetic programming system is just like evolving any other computer program. Except a lot more complicated.

But it is a good idea for the following two reasons:

1. Improvement compounds upon itself.

Let's say that you use your current genetic programming system to evolve a new genetic programming system with better fitness (speed) than the one you are currently using. Great! Get rid of the one you are currently using and use the new one in its place - to evolve one with even better fitness. Then replace the current one with that faster one, and so on and so forth. As you can see, the process will get faster as you go along, because you will continually be using genetic programming systems that are faster and faster.

2.
It removes the creative limitations of the human brain from the process.

Right now, we use human programmed genetic programming systems to evolve computer programs. Unfortunately these systems fall victim to human assumptions of what makes a good genetic programming system. In fact, the very idea that a system for automatically creating computer programs must involve a strategy derived from evolution and genes is a notion put forth by humans. It could turn out to be completely wrong. I believe the only way to find out is by admitting that we as humans are not creative enough to write a good genetic programming system, and passing that responsiblity entirely over to the computer.

Anyway, let's get back to the problem at hand, which is how exactly to do this.

A. Gather up an initial population of human programmed genetic programming systems

Even a population of one is okay. But don't start with nothing, or it will take eons.

B. Convert them into the programming language that is used by your genetic programming system

Your genetic programming system is probably written in C++ or Java, but the programs it evolves are created with something simpler, like Lisp. If this is the case, you must re-write your entire genetic programming system in Lisp, so that you can feed it to itself, as an initial population member. This is no small task, but unfortunately it must be done. As of June 27th this is what I am working on. I am running into trouble converting my C++ genetic programming system into the stripped down language it uses to evolve programs. So, I have decided to change this stripped down language into something a little more comprehensible, basically a virtual machine that runs a variation of the x86 instruction set. The variation is that this system will be impossible to crash. For simplicity's sake, only 16 bit registers/mem-locations will be used to start with. There will only be 65536 memory locations, holding both code and data. JMPing or MOVing to any of these locations will never lead to a crash. JMPing, because the instruction set I will use will have exactly 65536 commands (many of them duplicates), and any invalid jumps that go outside the allocated space will be redirected back in. To avoid infinite loops, the part of the virtual machine that executes each command will check before each execution whether it has gone over a certain human-determined limit for number of cycles.

Once this is complete, I will merely need to translate my compiled C++ code (x86 asm) into this new instruction set, which should be relatively easy.

C. Get Started

Load the initial population of human programmed genetic programming systems, as well as the description of the problem, into your genetic programming system. The description of the problem is where things get complicated. But, it is like any simpler problem. Let's use an example to demonstrate. In this example, you want a program that takes the inputs 1, 2, 4, runs some unknown calculations on them, and comes up with the outputs, 6, 8, and 12 respectively. So, you feed this description into your genetic programming system, and keep evolving programs until it finds one that outputs 6, 8, 12 for the three inputs listed previously. It will probably be something like the equation f(x)=2x + 4. Simple, right? The description of the problem of how to automatically program a genetic programming system, on the other hand, is not so simple. The inputs will be descriptions of problems rather than numbers. Each input will be a description of a problem, and each output will be a program that solves that problem. You of course have no idea what the outputs are, you will have to test them to see if they do in fact solve the problem described by the input. Anyway, to compare to the simple example, if you have three inputs they will not be something like 1,2,4. They will be more like "description of problem1 (input 3,54,34 output 6,332,333)" ,
"description of problem2 (input 3434,5443,-23,33.233,403 output 0,0,3,44,2,3)", "description of problem3 (input 122,332,233,444,33 output 1,1,0,0,1)". And the outputs will be computer programs, in Lisp or whatever language is used by your system. I am not going to write out three example computer programs, just consider them to be "computer program1", "computer program2","computer program3". The fitness measure will have two parts.

One, is for all of the inputs, whether or not the computer program output correctly solves the problem presented by its corresponding input. I do not think there is a need to calculate the degree to which it solves the problem, just a yes or no is fine. So, if you have twelve inputs the fitness part one could be something like 7 yes's 5 no's. In other words, 7/12.

Two, is for all of the inputs, how long it took to produce each output. If you have three inputs, and the first one took 1 second to produce a program for, the second took 4 seconds to produce a program for, and the third took 2 seconds to produce a program for, the total for the fitness part two would be 7 seconds, or just 7. I guess you could take the mean (3.3333 in this case) instead but I'm not sure that would make any difference. Also, you could go on forever without producing a program that solves a given input. So make sure there is a cutoff time. And, keep in mind that the a genetic programming system will take varying amounts of time to produce an output for a given input, because of the random nature of how it works. So do several runs and take the average of the times.

D. Use simple inputs at the start

You feed your genetic programming system 64 inputs, each one along the lines of "description of problem1 (input 34343434,3434492227872.45545,2332229.299293340,3223200302032033 output 99939,229999.33334,-0223332,293309393939)", and wonder why it doesn't work. Well, duh. This is too complicated. You must use descriptions of problems that can be solved by your current genetic programming system. Even better, use ones that it can solve quickly and easily - at least to start with.

E. Concentrate on just one section if its too slow

I'm not sure whether this will be necessary, but I will write more on this topic later. Basically it is allowing just a certain part of the genetic programming system to evolve (for example, the crossover function) rather than the entire thing.

Note: I have stopped work indefinitely on this project as of August 1, 2005 to pursue a different approach to AI that seems more promising. I was able to implement a meta-genetic programming system in software, which I have posted here. It is very slow and did not yield any results, but it is an early prototype and underwent very little testing so it is by no means a definitive answer to the question of whether meta-genetic programming will work.


Back