Cultus - framework to find intelligent logical circuits
Attempt in creating a framework that uses a Genetic Algorithm and logical gates to find a spark of intelligence with the idea behind it explained.
Some time ago (around half a year) an interesting idea came to my mind, probably fueled a bit by the AI craze that we're experiencing in recent years. In this post I would like to present it and also the implementation of it that has been given the name Cultus. The code of which can be found here.
The basic gist is this - how can we construct the simplest system which can exhibit "intelligence" and is able to learn new tasks, improving continuously as it spends time doing it.
Currently the standard approach to solving that problem are NNs (Neural Networks). But in essence they are an imperfect attempt of reproducing how our brain works and nature's solution built under chance and resource constraints. Can we somehow "start from scratch" and using some very basic building blocks create a system that one could call intelligent?
The building blocks
So on a fundamental, information processing level we would need two things:
- A way to perform and model any calculation (in binary as the simplest)
- A way to modify and propagate state to modify the way the calculations are done - without this it can never learn and improve over time
The first point we require as the (deterministic) transformation from input to output of our unknown system might be basically anything - we do not want to work under any assumptions like matrix multiplication or SGD. And for that we actually have a pretty obvious "atom" that we can use - the NAND gate (or alternatively NOR). The most important factor here is that it is what is called an universal gate. What this essentially means we can model any possible logical ciruit and binary calculation using them.
As for the second point - for any kind of learning to happen the system must have a state. Without that the gate circuit mentioned above will be simply a pure function, meaning that for a given input it will always return the same output. Meaning no kind of progress (in this case learning, optimization) of the performed calculation can happen. Of course the state itself needs to be connected with the logical circuit, or at least part of it - it needs to impact how calculations are performed and the calculations need to impact the memory. The assumption is, if done right, this feedback loop should enable the circuit to learn and optimize against a given signal over time.
The memory can be in its most basic form, as how it is implemented in Cultus - a map that maps a given address to a value. The inputs of the memory block (the address to write to, address to read from and the value written) are connected to the outputs of the logical gates (at least some of them) and the outputs are forwarded to the input of the whole logical circuit. It can be visualized as follows:
In the above example the memory takes 3 bits as input to controll it and outputs two bits - one which becomes an output bit of the system and one is forwarded to the input.
The collection of inputs, outputs and NAND gates can be called a "logical circuit" and with the addition of the controllable memory block a "smart network" - this will be the nomeclature used going forward.
How to find it
But even if we somehow identified what components we might use to achieve our goal we still have no idea how the final solution should look like - how many gates do we use? How are they connected? How big is the memory (address space, cell size) and how it is connected with the gates? How are the gates and memory "exposed" via the systems inputs and outputs? These are all (extremely) hard questions if our aim is an actually working system that is able to solve relatively complex tasks and has the ability to learn.
So considering the complexity and the effectively almost infinite amount of possibilities of configurations- we should use an automated way to perform the search, preferably something that can be easily scaled over many cores, CPUs and computers to cut down the time needed.
Considering the vast and very non-linear search space our eyes turn toward using a metaheuristic to help us find the solution - in Cultus that being the Genetic Algorithm which is loosely based on the evolutionary process seen in nature. The main benefits of it are:
- Easy to implement
- Easy to pararellize and distribute the calculation to many computers
- Offers good results which when tuned correctly don't get stuck in local optima
The characteristics of the (smart) network are encoded in the chromosome and the fitness is calculated based on some score that the network achieves after multiple playthroughs on some task/game. The fitness calculation which is the most calculation-demanding task can be distributed among many computers which offers almost linear speed if given a big enough population size.
The chromosome
So we need to identify the characteristics of the network and divide them into two groups:
- Givens - these are constant and defined in the configuration
- Variables - these can change over time and are encoded in the chromosome and via crossover + mutation should be modified in a way for the network to converge to an optimum
In Cultus these are identified as follows:
Givens
- Number of inputs/outputs of the smart network - these would receive external signals from the environment and send out them processed to perform actions
- Number of bits in the chromosome encoding the number of NAND gates - this effectively configures the maximum number of gates a given network chromosome can have, e.g. 4 bits = maximum of 2^4=16 NAND gates
- Memory address and read/write bits - this determines the number of inputs/outputs of the memory block, how big is the address space and how much is data can be written/read from memory by the logic network and/or outputs
- Connection count - the maximum number of connections between the elements in the smart network (gates, memory inputs/outputs, inputs/outputs of the whole network). As the chromosome length is constant and finite we need to define this to calculate the number of bits needed to encode the connections
Variables
- Number of NAND gates - the (target) number of gates is encoded in the chromosome taking N bits defined previosuly. This is effectively the maximum number of gates as some might get removed if they have dangling inputs/outputs and do not contribute to the result in general
- Connections - the connections between the elements. Each connection defines the input and output element within the network
Additionally in regards to the connections we want to clean the network elements and connections which do not contribute to the result - like gates with only one input connection defined, without an output connection, etc. to optimize the inference. There are no special connection or bits regardin the memory block as the inputs and outputs of the memory are simply added to the logical network and are memory is block is "opaque" to the network and chromosome evolution.
An example bitstring encoding the chromosome of a single connection network in Cultus:
1100 (0 0000 0 0000)
The above bitstring consists of 3 sections:
- 1100 - in the network there will be (before cleanup) 12 NAND gates (1100 in binary = 12 decimal)
- 0 0000 0 0000 - represents the connection
- 0 - the type of the input node (0 - network input, 1 - NAND gate output)
- 0 - the type of the output node (0 - network output, 1 - NAND gate input)
- 0000 - index of the input node (e.g. here the 0th index network input)
- 0000 - index of the output node (e.g. here the 0th index network output)
There will be as many connection bitstrings as defined by the connection count constant. The size of the whole bitstring and the connection bitstrings is determined by the other parameters like the maximum number of NAND gates, number of inputs/outputs of the whole network and the parameters of the memory module when included into the network.
After the bitstring is translated into an in-memory representation cleanup is done to remove unneeded connections or those that produce a cycle.
Fitness
So now with a representation of the network and an algorithm that will enable us to search for an optimal one we're now missing one component - a way to determine which solutions are better than other, or in the context of GAs, the fitness.
In general our fitness should measure how well a given network learns - ideally it should not only tell us how well the solution is doing on the curriculum but also that it improves its performance over time spent on doing it.
For Cultus the task on which it trains is pretty simple and resembles something we can call meta-reinforcement learning. We use a very simple game where the player needs to collect points (with different value) spread around the map which might also contain some blocking titles which should be avoided. The game ends if either you collect all of the points possible or the step limit is reached. The final score is simply the sum of points with a bonus from the remaining steps (e.g. if all points were collected with 25 steps/moves and the maximum is 30 then 5 bonus points are added).
The levels are handcrafted using a simple text definition, e.g.:
###########
#@77777777#
#777777777#
#777777777#
###########
So in the above example the player starts at the upper-left corner (@ symbol) and the whole map is a simple square where every title except for the starting one contains a 7-point reward.
Each evaluation might consist of multiple levels and each one of them is played many times - this is to help evaluate if any progress is made as the network gains experience playing it. The memory is kept as-is between the playthroughs which should enable gaining experience and knowledge after each of them.
The final fitness can be a simple sum and/or average of the points collected but a twist is added - the score is weighted against the playthrough index. So that points and games later in the evaluation have more impact, which should also promote finding networks that improve over time (learn) when playing the levels.
Parallelization
As the search is very computer and time consuming a very important aspect to the implementation is trying to parallelize in any way possible. Thankfully the way GA works makes it relatively easy to run the evaluation in parallel on a single computer but also distribute the fitness calculations to different computers by "chopping up" (batch processing) the whole population into smaller chunks.
The way it is done in Cultus is by using two AMQP (RabbitMQ) queues - one for the chromosomes for which the fitness is to be calculated and one for the chromosome + fitness pairs which are used for generating the next population using the GA.
When the fitness calculation is complex enough (takes enough time compared to the overhead of receiving the chromosome, serde, sending the result, network lag, etc.) the system should scale almost linearly with the amount of computers.
Implementation
Cultus is an example implementation of the ideas that I've explained made by me, also as an attempt to learn the Rust programming language. As my Rust journey has been then in its infancy the code quality might not be sky-high but in general the solution has been tested and works.
Altough when ran the results show signs of convergence I haven't been able to find any sensible result that would exhibit what one could call "intelligence" as can be seen from Neural Networks (even the basic ones) but this goal might be out of my reach considering the limited resources I have at hand (16-core AMD Ryzen 9 7950X). But maybe someone that has access to a real computing cluster might come up with a better result - currently it is more of an exercise in Rust and presentation of an idea.
The code can be found on my GitHub. Contributions are welcome :)
Final words
If you managed to get to the end then I would want to thank you for reading and I hope that the post (and subject in general) interested you.
As mentioned above - I would be glad to accept PRs and changes to the code to improve it and hear your opinions about what can be changed/fixed/improved in the code.
Cheers!