META-GENETIC
PROGRAMMING
Evolving a genetic programming system is just like evolving any other
computer program. Except a lot more complicated.
But it is a good idea for the following two reasons:
1. Improvement compounds upon itself.
Let's say that you
use your current genetic programming system to evolve a new genetic
programming system with better fitness (speed) than the one you are
currently using. Great! Get rid of the one you are currently using and
use the new one in its place - to evolve one with even better fitness.
Then replace the current one with that faster one, and so on and so
forth. As you can see, the process will get faster as you go along,
because you will continually be using genetic programming systems that
are faster and faster.
2. It removes the
creative limitations of the human brain
from the process.
Right now, we use
human programmed genetic programming systems to evolve computer
programs. Unfortunately these systems fall victim to human assumptions
of what makes a good genetic programming system. In fact, the very idea
that a system for automatically creating computer programs must involve
a strategy derived from evolution and genes is a notion put forth by
humans. It could turn out to be completely wrong. I believe the only
way to find out is by admitting that we as humans are not creative
enough to write a good genetic programming system, and passing that
responsiblity entirely over to the computer.
Anyway, let's get back to the problem at hand, which is how exactly to
do this.
A. Gather up an
initial population of human programmed genetic programming systems
Even a population of
one is okay. But don't start with nothing, or it will take eons.
B. Convert them into
the programming language that is used by your genetic programming system
Your genetic programming system is probably written in C++ or Java, but
the programs it evolves are created with something simpler, like Lisp.
If this is the case, you must re-write your entire genetic programming
system in Lisp, so that you can feed it to itself, as an initial
population member. This is no small task, but unfortunately it must be
done. As of June 27th this is what I am working on. I am running into
trouble converting my C++ genetic programming system into the stripped
down language it uses to evolve programs. So, I have decided to change
this stripped down language into something a little more
comprehensible, basically a virtual machine that runs a variation of
the x86 instruction set. The variation is that this system will be
impossible to crash. For simplicity's sake, only 16 bit
registers/mem-locations will be used to start with. There will only be
65536 memory locations, holding both code and data. JMPing or MOVing to
any of these locations will never lead to a crash. JMPing, because the
instruction set I will use will have exactly 65536 commands (many of
them duplicates), and any invalid jumps that go outside the allocated
space will be redirected back in. To avoid infinite loops, the part of
the virtual machine that executes each command will check before each
execution whether it has gone over a certain human-determined limit for
number of cycles.
Once this is complete, I will merely need to translate my compiled C++
code (x86 asm) into this new instruction set, which should be
relatively easy.
C. Get Started
Load the initial population of human programmed genetic programming
systems, as well as the description of the problem, into your genetic
programming system. The description of the problem is where things get
complicated. But, it is like any simpler problem. Let's use an example
to demonstrate. In this example, you want a program that takes the
inputs 1, 2, 4, runs
some unknown calculations on them, and comes up with the outputs, 6, 8,
and 12 respectively. So, you feed this description into your genetic
programming system, and keep evolving programs until it finds one that
outputs 6, 8, 12 for the three inputs listed previously. It will
probably be something like the equation f(x)=2x + 4. Simple, right? The
description of the problem of how to automatically program a genetic
programming system, on the other hand, is not so simple. The inputs
will be descriptions of problems rather than numbers. Each input will
be a description of a problem, and each output will be a program that
solves that problem. You of course have no idea what the outputs are,
you will have to test them to see if they do in fact solve the problem
described by the input. Anyway, to compare to the simple example, if
you have three inputs they will not be something like 1,2,4. They will
be more like "description of problem1 (input 3,54,34 output 6,332,333)"
, "description of
problem2 (input 3434,5443,-23,33.233,403 output 0,0,3,44,2,3)", "description
of problem3 (input 122,332,233,444,33 output 1,1,0,0,1)". And the
outputs will be computer programs, in Lisp or whatever language is used
by your system. I am not going to write out three example computer
programs, just consider them to be "computer program1", "computer
program2","computer program3". The fitness measure will have two parts.
One, is for all of the inputs, whether or not the computer program
output correctly solves the problem presented by its corresponding
input. I do not think there is a need to calculate the degree to which
it solves the problem, just a yes or no is fine. So, if you have twelve
inputs the fitness part one could be something like 7 yes's 5 no's. In
other words, 7/12.
Two, is for all of the inputs, how long it took to produce each output.
If you have three inputs, and the first one took 1 second to produce a
program for, the second took 4 seconds to produce a program for, and
the third took 2 seconds to produce a program for, the total for the
fitness part two would be 7 seconds, or just 7. I guess you could take
the mean (3.3333 in this case) instead but I'm not sure that would make
any difference. Also, you could go on forever without producing a
program that solves a given input. So make sure there is a cutoff time.
And, keep in mind that the a genetic programming system will take
varying amounts of time to produce an output for a given input, because
of the random nature of how it works. So do several runs and take the
average of the times.
D. Use simple inputs at the start
You feed your genetic programming system 64 inputs, each one along the
lines of "description of problem1 (input
34343434,3434492227872.45545,2332229.299293340,3223200302032033 output
99939,229999.33334,-0223332,293309393939)", and wonder why it doesn't
work. Well, duh. This is too complicated. You must use descriptions of
problems that can be solved by your current genetic programming system.
Even better, use ones that it can solve quickly and easily - at
least to start with.
E. Concentrate on just one section if its too slow
I'm not sure whether this will be necessary, but I will write more on
this topic later. Basically it is allowing just a certain part of the
genetic programming system to evolve (for example, the crossover
function) rather than the entire thing.
Note: I have
stopped work indefinitely on this project as of August 1, 2005 to
pursue a different approach to AI that seems more promising. I was able
to implement a meta-genetic programming system in software, which I
have posted here.
It is very slow and did not yield any
results, but it is an early prototype and underwent very little
testing so it is by no means a definitive answer to the question of
whether meta-genetic programming will work.
Back