Softbot Design with WANNs

Introduction

I’m a Brazilian undergraduate student in Computer Science and I spent the past 3 months at the University of Tsukuba doing a research internship under Professor Claus Aranha. Here I will describe my project during these months to the best of my memory, and at the end write down some lessons learned.

The research idea

Softbot Design

Soft robots (softbots) are robots built from highly compliant materials, similar to those found in living organisms [1]. There are many interesting applications for robots made from biological materials, like delivering drugs to specific parts of the human body and general interaction with the insides of a human.

The paper Unshackling Evolution models the design of softbots as filling a 3D cube with voxels, each voxel being one of 4 possible types, and then simulating the resulting robot using a physics simulation library called Voxelyze. The aim of the task is to generate softbots that walk the farthest in a given time span, and the paper compares designing the softbots with a direct encoding versus a generative encoding (CPPNs).

CPPNs

Compositional Pattern Producing Networks are one of the coolest ideas I’ve seen in Machine Learning, combining the flexibility of Neural Networks and the ability of Genetic Algorithms to optimize functions with hostile optimization landscapes to generate pretty much anything.

Now for the more technical details: CPPNs generate patterns in a manner similar to a printer head, walking through each point in space and associating an output with it. For example, if we want to generate an image, a CPPN receives the x and y coordinates of each pixel and outputs the color of that specific pixel. You can also pass some additional information in the input, for example the distance from the origin or from the center, or, if you’re building 3D structures, the distance from the axes.

But how do we model the function associating points in space to colors? That’s where the Neural Network comes in, as NNs are universal function approximators and so we could in theory capture any pattern we want, given a sufficiently big net.

Finally, CPPNs tend not to have a fixed architecture and use many different activation functions across their neurons, so we use a neuroevolution algorithm, like NEAT, to evolve a neural network that produces the pattern we want.
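To make the "printer head" idea concrete, here is a minimal sketch of a hand-wired toy CPPN that is queried once per pixel. The function names (`cppn`, `render`), the wiring, and all the constants are made up for illustration. A real CPPN would have its topology and activation functions evolved, not written by hand.

```python
import math

def cppn(x, y, d):
    """Toy hand-wired CPPN: maps one point to an intensity in (0, 1)."""
    h1 = math.sin(3.0 * x)           # periodic pattern along x
    h2 = math.exp(-(2.0 * d) ** 2)   # Gaussian bump around the center
    h3 = abs(y)                      # mirror symmetry across the x axis
    out = 1.5 * h1 + 2.0 * h2 - 1.0 * h3
    return 1.0 / (1.0 + math.exp(-out))  # sigmoid squashes into (0, 1)

def render(size):
    """Query the CPPN once per pixel, with coordinates scaled to [-1, 1]."""
    image = []
    for row in range(size):
        y = 2.0 * row / (size - 1) - 1.0
        line = []
        for col in range(size):
            x = 2.0 * col / (size - 1) - 1.0
            d = math.hypot(x, y)     # extra input: distance from the center
            line.append(cppn(x, y, d))
        image.append(line)
    return image

image = render(8)
```

Because the network only ever sees coordinates, the same function can be rendered at any resolution by changing `size`, and mixing activations such as sine, Gaussian and absolute value is what gives the output its repetition and symmetry.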

Weight Agnostic Neural Networks

While CPPNs are classically evolved using NEAT, my research was initially concerned with what kind of CPPNs would be evolved using David Ha’s Weight Agnostic Neural Networks (WANNs).

The aim of WANNs is to evolve a neural network architecture that encodes the solution to the problem, de-emphasizing the weights of the connections between neurons. To that end we generate a population of randomly wired nets and evolve them by randomly adding new nodes, adding new connections between nodes and changing a node’s activation function. Each network is tested with a few (generally 6) random shared weights, and the fitness of the individual is set to the average fitness of the net across those weights.
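The evaluation step described above can be sketched as follows. This is not the code from the WANN repository: the genome format, the toy task, the names (`GENOME`, `forward`, `task_fitness`, `wann_fitness`) and the sampling range for the shared weights are all assumptions chosen to keep the example small.

```python
import math
import random

ACTIVATIONS = {"tanh": math.tanh, "sin": math.sin, "linear": lambda v: v}

# Hypothetical genome: nodes listed in topological order after the 2 inputs.
# Each node has an activation name and the indices of the nodes feeding it.
GENOME = [
    ("tanh", [0, 1]),    # node 2 reads both inputs
    ("linear", [0, 2]),  # node 3 (the output) reads input 0 and node 2
]

def forward(genome, inputs, shared_weight):
    """Run the net with every connection using the same weight."""
    values = list(inputs)
    for activation, sources in genome:
        total = sum(shared_weight * values[s] for s in sources)
        values.append(ACTIVATIONS[activation](total))
    return values[-1]

def task_fitness(genome, shared_weight):
    """Toy task: negative mean squared error against the target x0 + x1."""
    samples = [(0.0, 0.0), (0.5, -0.5), (1.0, 0.5)]
    error = sum(
        (forward(genome, s, shared_weight) - (s[0] + s[1])) ** 2 for s in samples
    )
    return -error / len(samples)

# Six random shared weights; the range [-2, 2] is an assumption.
rng = random.Random(0)
SHARED_WEIGHTS = [rng.uniform(-2.0, 2.0) for _ in range(6)]

def wann_fitness(genome, weights=SHARED_WEIGHTS):
    """Fitness of an architecture: average over all shared weights."""
    return sum(task_fitness(genome, w) for w in weights) / len(weights)

score = wann_fitness(GENOME)
```

Since the weight is never tuned to a particular topology, a high `score` can only come from the wiring and the choice of activation functions, which is exactly what the search is meant to reward.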

WANNs show surprisingly good performance while using a single shared weight across all solutions compared to a normal fixed architecture with the same constraint of using a single weight, but they only achieve performance comparable to state of the art by training the weights of the final architecture, as you would with a normal NN.

Reading the code and creating the environment

Getting the codebases and translating to modern Python

My first challenge with this research was setting up my environment, as I initially had to combine David Ha’s Weight Agnostic Neural Networks code with the Unshackling Evolution task. Thankfully I found Kriegman’s GitHub, where he implemented Unshackling Evolution in Python, calling it evosoro (for evolution of soft robots). It was still in Python 2.7, though, so I needed to do some translating to Python 3 before merging the codebases.

A few dozen Unicode errors later, evosoro was up and running in Python 3.6 and I could start thinking about merging the codebases.

Merging the codebases

Both projects were somewhat extensive, evosoro especially, as it was written in a very general way to make it easier to implement other papers beyond Unshackling Evolution. After a good deal of time reading through the code I decided to use David Ha’s code for optimization and create a Gym task based on Voxelyze.

Initial experiments

Initially the experiments showed little progress: the softbots generated after hundreds of generations were barely capable of moving at all and were far below the expected fitness, which was quite worrisome.

I spoke to Claus and he advised me to check the output of my models instead of just looking at the metrics, as that would help me better understand what my model was learning and what exactly the bug was.

Quality Diversity Detour

Since I believed I “just” needed to fix a few bugs, I decided I could already study some techniques to explore the design of softbots, so Claus suggested I study Quality Diversity algorithms and try to implement them in the project. I read Quality and Diversity Optimization: A Unifying Modular Framework and Quality Diversity Through Surprise, learning about extremely interesting techniques, like MAP-Elites and Novelty Search with Local Competition, that aim to increase the diversity of solutions found by evolutionary algorithms while still maintaining high fitness, in fact getting higher fitness than algorithms focused on quality alone.
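As a rough illustration of how MAP-Elites keeps diversity and quality at the same time, here is a minimal sketch on a toy 2D problem. The objective, the behavior descriptor, and every parameter below are placeholders I made up for the example. In the softbot setting the solutions would be CPPNs and the descriptor some measurable trait of the robot.

```python
import random

def fitness(x):
    """Toy objective: solutions closer to the origin are better."""
    return -(x[0] ** 2 + x[1] ** 2)

def descriptor(x, bins):
    """Toy behavior descriptor: discretize each coordinate of [-1, 1]."""
    return tuple(min(bins - 1, int((v + 1.0) / 2.0 * bins)) for v in x)

def map_elites(iterations=2000, bins=5, warmup=100, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, solution): one elite per cell
    for i in range(iterations):
        if archive and i >= warmup:
            # Pick a random elite and mutate it, clipping to [-1, 1].
            _, parent = rng.choice(list(archive.values()))
            child = [max(-1.0, min(1.0, v + rng.gauss(0.0, 0.1))) for v in parent]
        else:
            # Warm-up phase: sample random solutions.
            child = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        cell = descriptor(child, bins)
        score = fitness(child)
        # A child only replaces the elite of its own cell.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, child)
    return archive

archive = map_elites()
```

The key design choice is that competition is local: a solution only has to beat the current elite of its own cell, so mediocre but unusual solutions survive and can later act as stepping stones toward better ones.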

After around one week of studying I presented what I learned to the procedural generation group and went back to focusing on debugging the code, instead of adding functionality to a buggy codebase.

Lack of results and some despair

After looking at the specific output of my networks I found a bug I had introduced when creating the evosoro environment. It caused my algorithm to never generate one of the materials and to consider “empty” as a possible material even after the first check. Fixing it helped the program generate less sparse softbots and increased the performance, but it was still quite low.

Besides that, Voxelyze took quite a while to run, and since each WANN needs to be tested 6 times, experiments would take an entire day running only for me to be met with low performance. Since I had to deliver some kind of report to my funding agency at the end of the program, I stopped working on the softbot design for a while and did something that wasn’t done in the original WANN paper despite being relevant: comparing WANN search with its parent method, NEAT.

Comparing WANNs and NEAT

To that end I compared NEAT and WANN with the same hyperparameters on the cartpole swing-up task. The main takeaways were that NEAT trained faster, requiring considerably less compute to reach good performance, but while a WANN’s connections could be trained to reach 3x the network’s initial performance, NEAT would only gain around 10% extra performance. That wasn’t particularly surprising, as NEAT optimizes the weights during evolution, while WANNs use a fixed shared weight.

[Figure: WANN vs NEAT]

Bug found!

I thought about Claus’s advice of visualizing the output of my algorithm again and realized that the output wasn’t just the softbots, but also the CPPN architectures, so I worked on visualizing the generated networks. There was a directory called vis in the WANNs repo on GitHub, so I used it. At first there were a few incompatibilities that I had to fix, but soon enough a very weird error was being reported: the activation functions of many nodes were not in the correct range (from 1 to 10).

That was counterintuitive. I took a look at the code and couldn’t find any specific mistake in it, so I added a quick debug print that would tell me whenever an activation outside the correct range was generated. After a few extra experiments I realized there was a bug in Google’s code! Specifically, they have a function that uses the + operator to merge lists, but when the function was called one argument was a list while the other was a NumPy array. In that case the + operator behaves rather differently: it sums the contents element-wise instead of concatenating them. I fixed the bug and sent a pull request to their GitHub repo, which Adam Gaier accepted.
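The pitfall is easy to reproduce in isolation. The snippet below is not the code from the WANN repository. The variable names and values are made up, and it only shows the general behavior of + when a list meets a NumPy array, plus one explicit way to concatenate.

```python
import numpy as np

# Hypothetical activation ids, each one valid on its own (between 1 and 10).
ids_a = [3, 7, 9]            # a plain Python list
ids_b = np.array([5, 2, 8])  # a NumPy array of the same length

# list + ndarray is handled by NumPy, which adds element-wise
# instead of concatenating as list + list would.
summed = ids_a + ids_b

# Being explicit about concatenation avoids the ambiguity.
merged = np.concatenate([np.asarray(ids_a), ids_b])

print(summed)  # [ 8  9 17]
print(merged)  # [3 7 9 5 2 8]
```

Note how the element-wise sum silently produces 17, a value outside the 1 to 10 range, without raising any error. If the two operands had different lengths, NumPy would raise a broadcasting error instead, which would have made the problem much easier to spot.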

Evolving Neural Nets post bug

After fixing the bug the performance improved again, and it was finally possible to generate neural networks with many different kinds of activation functions. Here’s an example of a net from before the fix:

[Figure: Bug net]

And one from after:

[Figure: Debug net]

Different inputs and results

Finally, I am now studying the use of different inputs to my CPPN, besides the x, y and z coordinates and the distance from the center. So far I have tested not passing the distance from the center, and also passing the material used in each voxel’s neighbors. Here are some preliminary results, all using 30 generations with a population of 60 individuals, with the WANNs being untrained, just evolved:

Softbots comparison

| Method | No Center | Center | Neighbors |
|---|---|---|---|
| NEAT-CPPN | 15.94 | 20.17 | 21.31 |
| WANN-CPPN | 18.66 | 19.22 | 13.83 |

I still need to run these experiments for more generations, as that will probably allow a bigger performance difference to arise, and run them a few more times to get averages that account for good and bad random seeds affecting the results.

There were some extra challenges related to weight optimization using CMA-ES and PEPG, but this post is already too long as it is, so maybe I will talk about that in the future.

TL;DR: Lessons Learned