Tuesday, December 29, 2015

How to run learning agents against PyGame

PyGame is the best-supported library for programming games in Python. Thousands of open source games have been built with it.
I wanted to do some reinforcement learning neural networks in games and PyGame seemed the best choice.
But I was a bit disappointed that most examples involved hacking the original game files.
I wanted to see if I could write a general framework for running learning agents in PyGame that would require zero touching of the game's files.

If you are interested in writing your own implementation of this then read on, but if you just want to dive straight in and start playing with your own learning agents I've uploaded a project here which contains examples of setting up Pong and Tetris.

Getting started

  • You will need Python 2 or 3 installed.
  • You will need to install PyGame, which can be obtained here. You can also install it via pip, but this can be tricky on Windows.
  • You will need numpy, downloadable from here, or it can also be installed via pip.
  • You will also need a game of some sort to try and set up. Here is a good simple pong game.

The plan

There are 3 things a learning agent needs to be able to play a game:
  1. An array of all the pixels on the screen whenever it is updated.
  2. The ability to output a set of key presses back into the game.
  3. A feedback value to tell it when it is doing well/badly.
We will tackle them in order.

Grabbing the screen buffer

In PyGame there is this handy method which will give us a 3-dimensional numpy array containing the colour of every pixel on screen.
 screen = pygame.surfarray.array3d(pygame.display.get_surface())  

We could write our learning agent by hacking the game file and inserting this line into the main loop after the screen has been updated. But a better way to do it (and a way that allows us to have zero touches of the game file) is to intercept any calls to the pygame.display.update method and grab the buffer there, like so:
 import pygame  
 import pygame.surfarray  

 # function that takes two functions and returns a new function that calls both  
 def function_combine(screen_update_func, our_intercepting_func):  
   def wrap(*args, **kwargs):  
     screen_update_func(*args, **kwargs) # call the screen update func we intercepted so the screen buffer is updated  
     our_intercepting_func() # call our own function to get the screen buffer  
   return wrap  

 def on_screen_update():  
   surface_array = pygame.surfarray.array3d(pygame.display.get_surface())  
   print("We got the screen array")  

 # set our on_screen_update function to always get called whenever the screen updated  
 pygame.display.update = function_combine(pygame.display.update, on_screen_update)  
 # FYI the screen can also be modified via flip, so this might be needed for some games  
 pygame.display.flip = function_combine(pygame.display.flip, on_screen_update)  

You can try this out by inserting this code before you start your game; it will print a message each time the screen buffer is grabbed.

Intercepting key presses

The normal way of detecting key presses in PyGame is via this method:
 events = pygame.event.get()  

So we can intercept it and have it return our learning agent's key presses:
 import pygame  
 from pygame.constants import K_DOWN, KEYDOWN  

 def function_intercept(intercepted_func, intercepting_func):  
   def wrap(*args, **kwargs):  
     # call the function we are intercepting and get its result  
     real_results = intercepted_func(*args, **kwargs)
     # call our own function and return our new results  
     new_results = intercepting_func(real_results, *args, **kwargs)  
     return new_results  
   return wrap  

 def just_press_down_key(actual_events, *args, **kwargs):  
   return [pygame.event.Event(KEYDOWN, {"key": K_DOWN})]  

 pygame.event.get = function_intercept(pygame.event.get, just_press_down_key)  

I should also warn you that the pygame.event.get method can be called with args to filter which events are returned. If you're running a game that uses these you will either need to handle them or just use my complete implementation here.
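One way to handle those filter args (sketched here with stand-in constants and a fake event class so it runs without pygame; in the real thing these come from pygame itself) is to check the positional args before returning our injected events:

```python
# stand-in constants and event class so the sketch runs without pygame
KEYDOWN, K_DOWN, QUIT = 2, 274, 12

class FakeEvent(object):
    def __init__(self, type_, key=None):
        self.type = type_
        self.key = key

def just_press_down_key(actual_events, *args, **kwargs):
    our_events = [FakeEvent(KEYDOWN, key=K_DOWN)]
    if args:  # the game passed an event-type filter, e.g. pygame.event.get(QUIT)
        wanted = args[0] if isinstance(args[0], (list, tuple)) else [args[0]]
        return [e for e in our_events if e.type in wanted]
    return our_events
```

If the game filters for event types we aren't injecting, we simply return an empty list, which is what the game would expect.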

Getting feedback to the player

The final piece of the puzzle is handling the feedback/reward from the game. Unfortunately there is no standard way of doing scoring in PyGame so this will always require some amount of going through the game code, but it can still be done with zero touches.

For the pong game the scores are stored in two global variables bar1_score and bar2_score, which can be imported. Our reward is when the score changes in our favor.

 last_bar1_score = last_bar2_score = 0  

 def get_feedback():  
   global last_bar1_score, last_bar2_score  

   # import must be done inside the method because otherwise importing would cause the game to start playing  
   from games.pong import bar1_score, bar2_score  
   # get the difference in score between this and the last run  
   score_change = (bar1_score - last_bar1_score) - (bar2_score - last_bar2_score)  
   last_bar1_score = bar1_score  
   last_bar2_score = bar2_score  
   return score_change  

But for other games, such as Tetris, there may not be a globally scoped score variable we can grab. But there may be a method or set of methods that we know signal something good/bad, such as player_takes_damage, level_up or kill_enemy. We can use our function_intercept code from before to grab these. Here is an example in Tetris using the result of removeCompleteLines to reward our agent:
 import tetris  

 new_reward = 0.0
 def add_removed_lines_to_reward(lines_removed, *args, **kwargs):  
   global new_reward  
   new_reward += lines_removed  
   return lines_removed  

 tetris.removeCompleteLines = function_intercept(tetris.removeCompleteLines,  
                                                 add_removed_lines_to_reward)  

 def get_reward():  
   global new_reward  
   temp = new_reward  
   new_reward = 0.0  
   return temp  

Dealing with frame rates

One final issue you may need to consider is that the learning agent will significantly impact the execution speed of the game. In a lot of games the physics is scaled by the elapsed time since the last update. If your agent takes 1 second to process a single frame of Pong, then by the next update loop the ball will have already passed off the screen. The agent may also struggle to learn if there is significant variance in the movement between frames.
This can be handled by intercepting the pygame.time.get_ticks method and pygame.time.Clock in the same way as we have the other functions. See this file for details.
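A minimal sketch of the idea, without pygame: replace the clock with one that pretends a fixed amount of time passes each frame, however long the agent really took to think. The FIXED_FPS value and the commented wiring line are assumptions about how you would hook it up:

```python
FIXED_FPS = 10
_frame_count = [0]  # mutable so the function below can update it

def fixed_get_ticks():
    # pretend exactly 1000 / FIXED_FPS milliseconds pass per frame,
    # regardless of real elapsed time
    _frame_count[0] += 1
    return _frame_count[0] * 1000 // FIXED_FPS

# in a real setup you would then do:
# pygame.time.get_ticks = fixed_get_ticks
```

Now the game's physics sees a perfectly steady 10 frames per second no matter how slow the agent is.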

Pong in PyGamePlayer

Now all that remains is to stitch all those parts together and plug in the learning agent. In my project I've chosen to do this in a class, but it would work fine as a script.
Below is an example of the full thing set up to learn against Pong using the PyGamePlayer module. The PongPlayer simply needs to inherit from the PyGamePlayer class and implement the get_keys_pressed and get_feedback methods; the framework handles everything else.
 from pygame.constants import K_DOWN  
 from pygame_player import PyGamePlayer  

 class PongPlayer(PyGamePlayer):  
   def __init__(self):  
     """Example class for playing Pong"""  
     super(PongPlayer, self).__init__(force_game_fps=10) # we want to run at 10 frames per second
     self.last_bar1_score = 0.0  
     self.last_bar2_score = 0.0  

   def get_keys_pressed(self, screen_array, feedback):  
     # The code for running the actual learning agent would go here, with the screen_array and feedback as the inputs
     # and an output for each key in the game, activating them if they are over some threshold.
     return [K_DOWN]  

   def get_feedback(self):  
     # import must be done here because otherwise importing would cause the game to start playing  
     from games.pong import bar1_score, bar2_score  

     # get the difference in score between this and the last run  
     score_change = (bar1_score - self.last_bar1_score) - (bar2_score - self.last_bar2_score)  
     self.last_bar1_score = bar1_score  
     self.last_bar2_score = bar2_score  
     return score_change

 if __name__ == '__main__':  
   player = PongPlayer()  
   # importing pong will start the game playing  
   import games.pong  

So huzzah! We now have the world's worst Pong AI.

In my next post I'll go through writing a good reinforcement learning agent for this.
If you have any questions/correction please don't hesitate to contact me.

Full source code here.

Saturday, December 5, 2015

Particle Swarm Optimization in Python

Here is a short and sweet particle swarm optimization implementation in Python. For more information on particle swarm optimization, check out Particle swarm optimization in F#.

Example usage is like so:
def simple_error_function(args):  
   return args[0]+args[1]  

number_of_parameters = 2  
max_iterations = 100  
best_parameters, best_error_score = particle_swarm_optimize(simple_error_function, 
                                         number_of_parameters, max_iterations)  

The result will be the parameters that produced the lowest value of simple_error_function.

You can add custom parameter initialization:
def simple_error_function(args):
    return args[0]+args[1]

def parameter_init():
    return random.random()*0.5-10

number_of_parameters = 2
max_iterations = 100
best_parameters, best_error_score = particle_swarm_optimize(simple_error_function, number_of_parameters,
                                                            max_iterations, parameter_init=parameter_init)

Other arguments

  •  stopping_error: float = 0.001 #stop when we have a particle with this score 
  • num_particles: int = 10 #number of particles to run 
  • max_iterations_without_improvement: int = 10 #stop when we have this many consecutive turns without improvement 
  • c1: float = 2.0 # c1 PSO parameter 
  • c2: float = 2.0 # c2 PSO parameter
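For reference, a minimal sketch of what a particle_swarm_optimize with the signature above might look like. This is my own illustrative implementation, with an added inertia term for stability, not the post's linked code:

```python
import random

def particle_swarm_optimize(error_func, number_of_parameters, max_iterations,
                            num_particles=10, c1=2.0, c2=2.0, inertia=0.5,
                            parameter_init=random.random):
    n = number_of_parameters
    positions = [[parameter_init() for _ in range(n)] for _ in range(num_particles)]
    velocities = [[0.0] * n for _ in range(num_particles)]
    best_positions = [list(p) for p in positions]
    best_scores = [error_func(p) for p in positions]
    g = min(range(num_particles), key=lambda i: best_scores[i])
    global_best, global_best_score = list(best_positions[g]), best_scores[g]

    for _ in range(max_iterations):
        for i in range(num_particles):
            for d in range(n):
                # pull each particle toward its own best and the swarm's best
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * random.random() * (best_positions[i][d] - positions[i][d])
                                    + c2 * random.random() * (global_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            score = error_func(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = list(positions[i]), score
                if score < global_best_score:
                    global_best, global_best_score = list(positions[i]), score
    return global_best, global_best_score
```

Running it on a simple error function like the one above will home in on the minimum within the allowed iterations.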

Sunday, November 8, 2015

Quick summaries of research papers around dynamically generating network structure

I've been reading a lot of research papers on how the structure of ANNs can be generated/detected. Here are some quick summaries of interesting papers in that area.

Dynamic Node creation in backpropagation networks

  • Looks at attempting to find a good number of hidden nodes for a network by starting small and growing to the correct number of hidden nodes
  • Claims that the approach of training the network then freezing existing nodes before adding new unfrozen nodes does not work well.
  • Adding new nodes then retraining the whole network appears to work better
  • The technique: 
    • A one hidden layer network is created with predefined number of hidden nodes
    • The network is trained, when the error rate stops decreasing over a number of iterations a new hidden node is added
    • Nodes stop being added either when a certain precision is reached or when the error starts increasing on a validation set
  • Results: Seems to train in not significantly longer time than fixed networks and tends to find near-optimal solutions
  • Remaining questions they want to investigate are how big the node-adding window should be and where nodes should be added in multi-layer networks
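The technique's training loop might be sketched like this (the function names, plateau window and thresholds are my own illustration, not the paper's):

```python
def grow_network(train_epoch, add_hidden_node, target_error,
                 window=5, min_improvement=1e-3, max_nodes=50):
    """Train, and add a hidden node whenever the error plateaus."""
    errors, nodes = [], 1
    while True:
        errors.append(train_epoch())
        if errors[-1] <= target_error or nodes >= max_nodes:
            return nodes, errors[-1]
        # error stopped decreasing over the window: grow the network
        if len(errors) >= window and errors[-window] - errors[-1] < min_improvement:
            add_hidden_node()
            nodes += 1
            errors = []  # restart the plateau window after growing
```

The caller supplies train_epoch (returns current error) and add_hidden_node; the loop stops either at the desired precision or at a node budget, mirroring the paper's stopping criteria.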

Optimal Brain Damage

  • After training a standard ANN we can potentially improve speed and generalization performance if we delete some of the nodes/connections
  • Optimal brain damage deletes parameters (sets them to 0 and freezes them from future use) based on their impact on the error of the model.
  • Results: Against the MNIST data it was shown that up to a fifth of the parameters could be deleted with no degradation in training performance and some improvement in generalization.
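The deletion criterion can be sketched as follows. The saliency formula (0.5 · H_kk · w_k²) is the paper's second-order estimate of how much the error grows when parameter k is removed; the flat-list interface is my simplification:

```python
def obd_prune(weights, hessian_diag, fraction=0.2):
    """Zero out the `fraction` of parameters with the lowest OBD saliency,
    where saliency for parameter k is 0.5 * H_kk * w_k ** 2."""
    saliency = [0.5 * h * w ** 2 for w, h in zip(weights, hessian_diag)]
    n_delete = int(len(weights) * fraction)
    doomed = set(sorted(range(len(weights)), key=lambda k: saliency[k])[:n_delete])
    return [0.0 if k in doomed else w for k, w in enumerate(weights)]
```

In a real implementation the zeroed parameters would also be frozen out of future training, as the paper describes.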

The Cascade Correlation Algorithms

  • A different approach to evolving structure for neural networks. 
  • Starts with a network with just fully connected input and output nodes. These are trained using quickprop until error rate stops decreasing.
  • Then multiple candidate nodes are created, connected to every node except the output nodes (eventually hidden nodes will be added too), with different random weights
  • These are then all trained to try and maximize the correlation with the output error.
  • The candidate with the best correlation is selected
  • Repeat training and adding in candidate nodes until stopping point
  • Results: Appears to do very well, producing results for things like XOR and various other functions that are difficult for normal ANNs.
  • It has 2 problems
    • Over fitting
    • Because the eventual network is a cascade of neurons (later neurons are really deep because they depend on every previous neuron) it runs a lot slower than a normal ANN
  • Can also be extended to work on recurrent networks
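The candidate-selection step above can be sketched like so. The quantity Fahlman's algorithm maximizes is the magnitude of the covariance between each candidate's output and the network's residual error; the list-based interface is my own simplification:

```python
def best_candidate(candidate_outputs, residual_errors):
    """Cascade-correlation selection step: keep the candidate whose output
    covaries most strongly (in magnitude) with the residual error."""
    n = len(residual_errors)
    mean_error = sum(residual_errors) / n

    def covariance_magnitude(values):
        mean_value = sum(values) / n
        return abs(sum((v - mean_value) * (e - mean_error)
                       for v, e in zip(values, residual_errors)))

    return max(range(len(candidate_outputs)),
               key=lambda i: covariance_magnitude(candidate_outputs[i]))
```

A candidate whose output tracks the error pattern (in either direction, hence the absolute value) wins and gets frozen into the network.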

Cascade-Correlation Neural Networks: A Survey

  • Presents some approaches to reducing the problems with cascade correlation.
  • Sibling/descendant cascade attempts to reduce the problem with networks going too deep.
    • There are 2 pools of candidate neurons, sibling(only connected to neurons in previous layers) and descendant(same as for normal cascade). 
    • Often siblings do get selected ahead of descendants, reducing the depth of the network
  • Says Optimal Brain Damage is effective at pruning the network after structure is discovered this helps with over fitting.
  • Knowledge based CCNN are another approach where candidates nodes can be normal functions or fully fledged ANN of there own. This approach sounds a lot of fun, would love more info on how successful it is.
  • Rule based CCNN, similar to the above but things like OR and AND operators are used. Again sounds alot of fun would love to know how it performs.

Tuesday, October 27, 2015

Even more quick summaries of research papers

Following on from part 1 and part 2, here are even more quick summaries of research papers.

Generative NeuroEvolution for Deep Learning

  • Applies HyperNEAT to deep learning
  • Uses the MNIST hand-drawn digits dataset, trains both a normal deep network and a convolutional network
  • Results: 
    • HyperNEAT on its own performs very badly. 
    • Using HyperNEAT to generate a number of layers and then backprop on the final layer achieves 58.4% for normal ANNs
    • Using HyperNEAT to generate a number of layers and then backprop on the final layer for a Convolution Neural net achieved performance of 92.1%
  • This paper seems to miss that one of the advantages of HyperNEAT is the ability to scale to different sizes of input, so it would be nice to see how it performs when given images with different dimensions vs a traditional approach, which has to do a standard Photoshop-style resize and then work off that image.
  • Also would love to see some details of exactly how the algorithms were implemented.
  • A big problem with HyperNeat is that though it can express a potentially infinite number of nodes and connections, the number of hidden nodes must be decided in advance
  • ES-HyperNeat is an attempt to extend HyperNeat to be able to determine how many hidden nodes it should have.
  • ES-HyperNeat uses the connection weight values themselves to determine node placement; areas with more complex (higher variance) weight patterns should be given more nodes.
  • There is an addition for HyperNEAT called Link Expression Output (LEO) where whether or not there is a connection between 2 nodes is calculated not from the magnitude of the weights but from another evolved parameter. This also works with ES-HyperNEAT
  • Algorithm is:
    • When creating a connection between 2 nodes (e.g. an input and an output), rather than just calculating a single weight value we create 4 hidden nodes (quadtree style) and calculate the weights for each
    • if the variance of the weights is above some threshold we create all four nodes 
    • otherwise just have the single connection
    • when doing the connections for the 4 sub-nodes we may do more subdivisions (up to some max)
  • Ran experiments against maze navigation, multi-modal problems and the retina problem
  • Results: Outperformed HyperNeat
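The quadtree-style subdivision in the algorithm above can be sketched like this (the threshold, depth limit and sampling scheme are illustrative assumptions, not the paper's exact values):

```python
def divide(cppn_weight, x, y, size, depth=0, max_depth=3, variance_threshold=0.03):
    """Quadtree-style division: sample the weight function at the four
    sub-squares and keep subdividing where the weight pattern has high variance."""
    half = size / 2.0
    corners = [(x - half, y - half), (x + half, y - half),
               (x - half, y + half), (x + half, y + half)]
    weights = [cppn_weight(cx, cy) for cx, cy in corners]
    mean = sum(weights) / 4.0
    variance = sum((w - mean) ** 2 for w in weights) / 4.0
    if depth < max_depth and variance > variance_threshold:
        # complex region: recurse into all four sub-squares
        return [node for cx, cy in corners
                for node in divide(cppn_weight, cx, cy, half, depth + 1,
                                   max_depth, variance_threshold)]
    return [(x, y)]  # low variance: one node is enough for this region
```

A flat weight pattern yields a single node, while a varying one gets progressively finer node placement, which is the core ES-HyperNEAT idea.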

Deep Learning using Genetic Algorithms

  • Tests using GAs to encode and then decode features of an image
  • Seems to have some success
  • They encounter some problems that may be better solved using NEAT/HyperNEAT

Novelty Search and the Problem with Objectives

  • Normally genetic algorithms are trained with an objective (fitness) function and the agents that perform best against it are selected
  • This leads to problems of local minima and will often result in lots of similar behaviors with small tweaks being selected, not allowing for the more complex sets of interactions required.
  • Novelty search instead adds to the fitness function a measure of the distinctness, novelty or difference of the result of a behavior. 
  • Gives the example of teaching a robot to walk: a fitness function would start off just selecting the robot that fell the furthest towards the goal, which would likely never lead to complex interesting behavior. 
  • But if you reward different kinds of falls, some may arise that balance for a bit in any direction; over time these can adapt to move toward the actual objective
  • Recommends average distance to k nearest neighbors as a good measure of novelty.
  • Novelty search shows very good results vs objective search in maze problems and bi-pedal walking among others
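The recommended novelty measure (average distance to the k nearest neighbours in the archive of previously seen behaviours) is easy to sketch:

```python
def novelty(behaviour, archive, k=3):
    """Average Euclidean distance from `behaviour` to its k nearest
    neighbours in the behaviour archive."""
    distances = sorted(
        sum((a - b) ** 2 for a, b in zip(behaviour, other)) ** 0.5
        for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)
```

Behaviours far from everything previously seen score high and are rewarded, regardless of how well they score on the original objective.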

Unsupervised Feature Learning through Divergent Discriminative Feature Accumulation

  • Uses HyperNEAT with novelty search as the unsupervised layer in deep learning
  • Number of hidden nodes is not set in advance; rather, successive features are learned based on their difference from previous features until a certain number is reached
  • Only 2 layer network trained, learning 1500->3000 features
  • Results: Against MNIST data does amazingly well, incredibly impressive
  • Looks like this is a good alternative to restricted Boltzmann machines
  • Would love to see how it might do with more layers/convolutional architecture, scaling to different image sizes

Thursday, October 22, 2015

Quick summaries of research papers around NEAT Part 2

Continuing on from my previous post here are some more quick summaries of research papers.

Evolving Reusable Neural Modules

  • Attempts to improve on NEAT by dividing the evolved nets into modules. 
  • These modules are in themselves smaller neural nets with input, output and hidden nodes.
  • Blueprints are used to combine the modules into the final neural nets. Blueprints contain lists of modules to be used and mappings from the real inputs/outputs and the module input/outputs
  • Both blueprints and module are evolved and speciated.
  • The idea is having modules reduces the number of dimensions in the search space. It is somewhat analogous to the modules being chromosomes and blueprints being arrangements of chromosomes.
  • Experiment: NEAT and modular NEAT were run on a board game (very roughly like Go).
  • Results: Modular NEAT was seen to evolve better solutions and for those better solutions to appear about 4 times faster. Though this level of improvement is possibly quite tied to the kind of task being learned.

Transfer Learning Across Heterogeneous Tasks Using Behavioural Genetic Principles

  • Transfer learning is applying learning in one domain to a related but different domain, e.g. speech recognition on male voices, to speech recognition on female voices.
  • 4 challenges of transfer learning:
    • Successfully learning related tasks from source tasks
    • Determining task relatedness
    • Avoiding negative transfer
    • More closely imitate human learning
  • Approach steps:
    • Have a set of potentially related tasks, choose one as the source, we aim to learn them all
    • Have all the parameters for a neural net encoded for a genetic algorithm, including number of hidden nodes and learning rate.
    • Create a population of x pairs of identical neural nets and x pairs of neural nets where 50% of genes are shared (between pairs)
    • Unique training sets are created for each individual in the population by randomly filtering out a subset of the training data.
    • Train every individual on the source task and then each other task independently.
    • After training measure the results and calculate how much of the performance was down to genes and how much down to environment by comparing the performance of identical and non-identical twins.
    • Select from the identical twin population, taking into account how much of their performance was down to genes rather than environment.
    • Select until convergence
  • The tasks were:
    • Learning past tenses of English words
    • Mapping patterns to identical patterns
    • Categorizing patterns into
    • Patterns with errors
    • Arbitrary patterns, since random patterns should show no generalization
  • Results: The networks were able to use direction of change in heritability (performance resulting from genes) to indicate task relatedness.
  • Related tasks were learned better than using standard methods
  • Would be interesting to see how NEAT would work with this method.
  • This uses the approach from the above paper on 3 pieces of financial data:
    • Statlog - Australian credit approval
    • Statlog - German credit data
    • Banknote authentication
  • Results: Seems reasonably successful
  • When they say weight symmetry they are referring to the weights used in a network feeding forward vs the weights used when doing back propagation.
  • Interesting food for thought is if weight symmetry is not important this could mitigate the vanishing gradient problem in deep neural nets...
  • They run 15 different data sets in the experiment all of which may be worth looking at for other experiments.
  • Results: Seemed pretty convincing that weight symmetry was not important, in particular an update rule they called Batch-Manhattan actually outperformed standard SGD.
  • Batch-Manhattan update rule is (roughly, in Python-style pseudocode): 
mini_batch = random.sample(dataset_samples, mini_batch_size) # select a random set of samples to be our mini-batch
update_magnitude = -sign(sum(weight_derivative(x) for x in mini_batch)) * momentum * previous_update_magnitude - decay * current_weight
new_weight = current_weight + learning_rate * update_magnitude
previous_update_magnitude = update_magnitude
  • One thing to note about the above is that the weight_derivative function potentially does use the weights in the back propagation step. This is where I would love to see the actual source used to generate these results.
  • Though the magnitude of the update was not found to be that important, the sign (unsurprisingly) was.
  • Would love to see more analysis of how removing the weights in back prop might affect very deep networks.

Saturday, October 17, 2015

Quick summaries of research papers around NEAT

I'm starting on a project around neural networks and as preparation I have been reading a lot of research papers and making quick summaries in case I need to go back to anything in them. I present them here, mostly for myself, but they may also be useful to others. Apologies in advance if you were involved in any of these and I have horribly butchered your work...

Efficient Evolution of Neural Networks through Complexification

  • Presents the NEAT, Neural Networks through Augmenting Topologies, approach to training neural networks.
  • Main website for neat is http://nn.cs.utexas.edu/?neat here is the C# source example http://sharpneat.sourceforge.net/
  • Weights for the network are learned using evolutionary algorithms. The 3 main things that make NEAT special are:
    • Uses an approach that keeps track of the ids of connections and nodes so that genes compete more efficiently against equivalent genes (hard to explain this quickly but if you read it, it is a good approach).
    • Uses speciation to protect innovation: the evolving agents are divided into subgroups and only need to compete against other agents in their "species". In this way innovations are given time to improve before being wiped out, which allows them to develop to the point where they may be useful.
    • Reduces dimensions through complexification. We start with simpler nets with fewer hidden nodes, then over time evolve more; this makes early evolution quicker and allows for greater complexity over time.
  • Had very good results for a range of tasks.
  • The networks that evolve can have many layers and also be very nonstandard, e.g. connections from input layer direct to 3rd layer.
  • Worked with recurrent networks as well.
  • Here is a good blog post on using NeatSharp http://www.nashcoding.com/2010/07/17/tutorial-evolving-neural-networks-with-sharpneat-2-part-1/

Compositional Pattern Producing Networks

  • Rather than creating a neural network directly, we create a function which takes the coordinates of 2 nodes as inputs and outputs the weight of the connection between those 2 nodes. We then call that function for every point of a space to generate a neural net. Here is the python pseudo code for the function:
def createNodeWeight(node1coordinates, node2coordinates):
     weightOfConnectionBetween2Nodes = ...  # result of some chain of composed functions
     return weightOfConnectionBetween2Nodes
  • Why do this?
    • In nature we often see that x^10 synapse connections are generated from x genes, so we should look for similar mechanisms
    • If our task is related to images, or learning anything with physical dimensions, we can use the real-world coordinates of the inputs to influence our results; e.g. with images we can place the input for each pixel of an image at that coordinate.
    • This allows us to very easily scale to images of different resolutions.
    • Reduces the number of dimensions you need to search through to find a solution. You don't need to find the correct weight for every connection, just the smaller number of correct values that generate the connections.
  • The createNodeWeight function can itself use a neural network. This works very well; generating the values for that network using NEAT is called hyperNEAT
  • Can create really interesting images
  • Here is the github for hyperNeatSharp and here is a good tutorial on using it.
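To make the CPPN idea concrete, here is a toy sketch: the composed function is an arbitrary invention for illustration, and the net is generated by calling it for every input/output coordinate pair:

```python
import math

def create_node_weight(node1, node2):
    # an arbitrary composed function for illustration: symmetric in x,
    # so the generated net will show the kind of regularity CPPNs exploit
    (x1, y1), (x2, y2) = node1, node2
    return math.sin(abs(x1 - x2)) * math.exp(-(y1 - y2) ** 2)

# place 4 inputs along the line y=0 and 4 outputs along y=1,
# then generate every connection weight from the coordinates alone
input_coords = [(x / 4.0, 0.0) for x in range(4)]
output_coords = [(x / 4.0, 1.0) for x in range(4)]
weights = {(i, o): create_node_weight(i, o)
           for i in input_coords for o in output_coords}
```

Note the search space is just the parameters of create_node_weight, however many connections we generate, and a finer input grid scales the same function up to higher resolutions.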

Autonomous Evolution of Topographic Regularities in Artificial Neural Networks

  • Applies hyperNEAT to learning checkers; the inputs exist across 2 dimensions (the checkers board)
  • Training is done by checking fitness against a standard min-max search algorithm, with some randomness thrown into the evaluation of the result so the game is not too deterministic.
  • Results: hyperNEAT was compared against NEAT and also NEAT-EI (an augmented version of NEAT); hyperNEAT was able to evolve to beat the min-max with depth 4, and a lot quicker than NEAT-EI.
  • Generalization appears better than NEAT-EI, as hyperNEAT is able to beat it in a direct game
  • Quite interesting stuff on why hyperNEAT performs well
  • Would love to see this applied to chess.

Evolving Adaptive Neural Networks with and without Adaptive Synapses

  • Experiment: A standard food-foraging exercise with the twist that the food may either all be normal, giving +points, or all be poison, giving -points. If the food is poison then when it is eaten a pain output neuron is stimulated. 
  • If the food is all poison then the optimum strategy is to stop searching for food. There is no way a fixed optimum strategy can be evolved, because whether the food is poison or not is randomized, so the agent must be able to adapt.
  • 2 sets of agents were evolved. Both used NEAT, including the ability to generate recurrent connections.
  • 1 set was also able to evolve local learning rules for either strengthening or weakening connections between nodes. The idea being that the agent would evolve a rule that would stop it foraging for food once it encountered the poison.
  • Results: Interestingly, it turned out that the first agent could actually learn to stop foraging for the poison on its own through recurrent connections, and learned to do this faster than the agent with adaptive rules could, because of its lower number of dimensions.
  • It would be interesting to run a similar experiment on a game where both poison and food were present in the environment, with some sensor for the agent to distinguish between them, so that once it had found poison it got extra points for still picking out the food. Could a recurrent neural net still perform as well there?
More in part 2

Saturday, August 29, 2015

Presenting WikiDataDotNet - Client API for WikiData


WikiData is one of those things that sets the mind boggling at the possibilities of the internet. It's a project, started by the Wikimedia foundation, to collect structured data on everything. If you are doing anything related to machine learning, it is the best source of data I have so far found.

It aims to contain an item on everything, and for each item a collection of statements describing aspects of it and its relationship to other items. Everything makes more sense with an example; here is its record on the item Italy, which can be found in the API like so:

This will return a JSON file with sections like:

       "id": "Q38",
       "labels": {  
          "en": {  
           "language": "en",
           "value": "Italy"

Here we see the id of the item, in this case Q38, which is used for looking Italy up. Then labels contains the name of Italy in each language. Further down there is also a section, aliases, that contains alternate names for Italy in every language.

Further down we get to the really interesting stuff: claims.

          "P36": [  
             "mainsnak": {  
               "snaktype": "value",  
               "property": "P36",  
               "datavalue": {  
                 "value": {  
                   "entity-type": "item",  
                   "numeric-id": 220  
                 "type": "wikibase-entityid"  
               "datatype": "wikibase-item"  
             "type": "statement",  
             "qualifiers": {  
               "P580": [  

These are a series of statements about the different aspects of the item. For example the above P36 is a claim about what the capital of Italy is. Claims are also entities in the API, so they can also be looked up like so https://www.wikidata.org/w/api.php?action=wbgetentities&ids=P36

mainsnak is the main statement associated with this claim (a Snak in wikidata is any basic assertion that can be made about an item). These all have a value and a type. In this case, for the claim about Italy's capital, the value is a reference to a wiki entry, which can again be looked up from WikiData if you append a Q to the beginning of the numeric id. You may have already worked out what the entity here is: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q220

Other claims on Italy include location, who it shares a border with, public holidays, provinces, basic form of government, head of state, population (across history), head of government... the list is endless (no wait, actually it's 64 entries long).
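If you just want the same lookups from plain Python, you can hit the wbgetentities endpoint shown above directly; the helper names below are my own:

```python
import json
try:
    from urllib.request import urlopen   # Python 3
    from urllib.parse import urlencode
except ImportError:
    from urllib import urlopen, urlencode  # Python 2

def build_entity_url(entity_id):
    params = urlencode({"action": "wbgetentities",
                        "ids": entity_id,
                        "format": "json"})
    return "https://www.wikidata.org/w/api.php?" + params

def get_entity(entity_id):
    # requires network access
    response = urlopen(build_entity_url(entity_id))
    return json.loads(response.read().decode("utf-8"))["entities"][entity_id]

# italy = get_entity("Q38")
# print(italy["labels"]["en"]["value"])
```

The JSON that comes back has the id/labels/claims structure walked through above.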

Presenting WikiDataDotNet

I've been working on a project that needed to query against WikiData from .Net. The only existing .Net API for this I could find is Wikibase.NET, for writing wiki bots. It hasn't been updated in a while and unfortunately a quick test reveals it no longer works. At a future date I may fix it up, but in the meantime I've created this quick query-only API: WikiDataDotNet


It currently provides the ability to request entities:
 let italy = WikiDataDotNet.Request.request_entity "Q38"   
 var italy = WikiDataDotNet.Request.request_entity("Q38");  

and do a text search against wiki data:
 let search_result = WikiDataDotNet.Request.search "en" "Headquarters of the U.N"  
 var searchResult = WikiDataDotNet.Request.search("en", "Headquarters of the U.N");  

That's it for functionality so far. My next plans are to make it easier to look up Claims against items and do caching of Claims. Also maybe some kind of LINQ style querying interface would be nice.

Tuesday, August 18, 2015

How do I get my team to start unit testing

A team lead recently asked me (this genuinely happened, this isn't just a rhetorical tic), "How do I get my team to start unit testing?". Which sounds like a great title for a blog post...

In my opinion, the task of getting a team to write unit tests is really the task of getting a programmer to believe it is in their best interests to write unit tests. There are plenty of tools, such as SonarQube, to give technical feedback on unit test coverage, but without the team buying in they won't achieve much. It is very easy, and of little benefit, to do unit testing badly. So like a good salesman you need to sell them on why they will benefit from taking the extra time to unit test their already perfectly acceptable code (as they see it; if they don't believe the code they are currently writing is acceptable then there are other problems).

There are many reasons a person should unit test. Some reasons are noble and good, to do with doing the best job you can for your company and your fellow professionals. But that doesn't work for everyone, so for those less nobly inclined there are also selfish reasons that are still valid.

The noble reasons:

  • Next level success: 1st level success, someone reports a bug, you fix it. Next level success: someone reports a bug, you fix it and you write tests that mean no one in the future can reintroduce this bug. Unit testing allows you to future proof your code.
  • Quick feedback: One of the biggest factors in your ability to improve in any activity is your feedback loop. If you want to get good at chess and you are playing against a good player, they can tell you "that move was bad" immediately after you make a bad move. Otherwise you may have to play the rest of the game, and then lose a number of similar games, before you work out it was that particular type of move that was the mistake. Unit testing allows you to get much quicker feedback. When you make a change to an application, you don't have to run it, set up each scenario by hand, then check by hand for correct behavior. With unit testing you can get feedback across multiple scenarios across the application in under 10 seconds.
  • Encourages good design: There have been loads of articles written by better writers than me on this subject. Good application design goes hand in hand with designing for unit testing. Separation of concerns, single responsibility principle, dependency injection, etc.

The less than noble reasons:

  • Plausible deniability: If something goes wrong anywhere near their code they can point to the unit tests and say "well I know my code works, it must be someone else's problem". This has happened to me: I was asked to write some code that displayed the number of business days old a certain item was. When it started displaying -1 days old in prod, I could take their inputs, put them into my unit test and show that my code was correct (the problem turned out to be that we were being sent items from the future due to an incorrect date conversion further upstream).
  • Your future employability: Nowadays unit testing is so widespread that you will be asked a question about it in most interviews. You may not care so much about how you do in this job, but don't you want to be applying best practice so you can get that shiny new future job?
  • Holding on to requirements: User A asks for a feature to work in a particular way. You make the change and put it into prod, then User B comes to you to complain: he had asked for the feature to work in a different way and now it doesn't work for him. Unit tests can remove a lot of these kinds of problems, because you can mark the test with who requested the functionality. Unit tests are a great way to capture requirements permanently and raise these kinds of conflicting requests earlier.
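For that last point, here's one way it can look in practice: a hypothetical Python sketch (using the business-days example from earlier as the code under test) where the requester is recorded in the test's docstring, so conflicting requests surface in review rather than in prod.

```python
import unittest
from datetime import date, timedelta

def business_days_old(created, today):
    # illustrative implementation: count weekdays between the two dates
    days = 0
    current = created
    while current < today:
        if current.weekday() < 5:  # Monday-Friday
            days += 1
        current += timedelta(days=1)
    return days

class BusinessDaysTests(unittest.TestCase):
    def test_weekends_not_counted(self):
        """Requested by User A: item age should count business days only."""
        # Friday 2015-06-05 to Monday 2015-06-08 spans one business day
        self.assertEqual(business_days_old(date(2015, 6, 5), date(2015, 6, 8)), 1)
```

When User B asks for calendar days instead, the docstring tells you exactly whose requirement you're about to break.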

So now the team are fully behind the plan and raring to go. Well, probably not immediately. In teams I've been involved with it takes a good few months of pushing these points, making sure that unit test percentages are reviewed, committed code is reviewed, and unit tests are always required as a part of it. People need to see the benefit from doing increased testing and this may take time and energy. But over time it will happen if you're persistent.

Thursday, June 18, 2015

Why there is a big unsupervised learning shaped hole in the universe

The problem I had when I first started reading about unsupervised learning was that I didn't understand why it needed to exist. Supervised learning makes sense: you give a neural net an input and an output and tell it "find a way to get from one to the other". Unsupervised learning doesn't have that. You are just giving it some data and saying "do... something". Even the diagrams were confusing: I see input nodes, I see hidden nodes, what's the output?

The best way to get across the need for unsupervised learning is to talk about the end goal. Across the first year of a baby's life it learns a huge number of things: how to focus its eyes, how to coordinate its limbs, how to distinguish its parents from other adults, that objects are permanent things. Until it learns to talk it is getting almost no feedback about any of this, yet it still learns. Getting good training data is hard, so the idea is: wouldn't it be great if we could set up a machine and it would just go off and learn on its own?

Unfortunately so far we are a long way from that, and the technique shown here seems trivial compared to that goal. But the goal is interesting enough that it is worth pursuing. The answer to the question "what am I asking an unsupervised network to do?" is "learn the data". The output will be a representation of the data that is simpler than the original. If the input is 10,000 pixels of an image, the output can be any smaller number. What a lot of the simpler unsupervised nets do is transform the input into a single number that represents a group of similar sets of inputs. These groups are called clusters.

An example competitive learning neural net

A competitive learning neural net attempts to group its inputs into clusters. The code for it is really very simple. Here is all that is needed in the constructor (if you don't like Python it is also available in C#, Java and F#):

from random import uniform

class UnsupervisedNN(object):
   def __init__(self, size_of_input_arrays, number_of_clusters_to_group_data_into):
     #we have 1 hidden node for each cluster
     self.__connections = [[uniform(-0.5, 0.5) for j in range(number_of_clusters_to_group_data_into)] for i in range(size_of_input_arrays)]  
     self.__hidden_nodes = [0.0]*number_of_clusters_to_group_data_into  

When we give it an input, it will activate the hidden nodes based on the sum of the connections between that input and each hidden node. It makes more sense in code, like so:

def feed_forward(self, inputs):  
     #We expect inputs to be an array of floats of length size_of_input_arrays.
     for hidden_node_index in range(len(self.__hidden_nodes)):  
       activation = 0.0
       #each hidden node will be activated from the inputs.  
       for input_index in range(len(self.__connections)):  
         activation += inputs[input_index]*self.__connections[input_index][hidden_node_index]  

       self.__hidden_nodes[hidden_node_index] = activation

     #now we have activated all the hidden nodes we check which has the highest activation
     #this node is the winner and so the cluster we think this input belongs to
     return self.__hidden_nodes.index(max(self.__hidden_nodes))  

So as it stands we have a method for randomly assigning data to clusters. To make it something useful we need to improve the connections. There are many ways this can be done; in competitive learning, after you have selected a winner you make its connections more like that input. A good analogy: imagine we have 3 inputs, one for each of the colors red, green and blue. If we get the color yellow, the active inputs were red and green. So after a winning node is selected, its connections to those colors are increased so future red and green items are more likely to be considered part of the same cluster. But because there is no blue, the connection to blue is weakened:

def train(self, inputs):
     winning_cluster_index = self.feed_forward(inputs)
     learning_rate = 0.1
     for input_index in range(len(self.__connections)):
       weight = self.__connections[input_index][winning_cluster_index]
       self.__connections[input_index][winning_cluster_index] = weight + learning_rate*(inputs[input_index]-weight)

A problem we can have here, though, is that a cluster can be initialized with terrible weights, such that nothing is ever assigned to it. In order to fix this a penalty is added to each hidden node: whenever a hidden node is selected its penalty is increased, so that if a node keeps winning, the other nodes will eventually start getting selected. This penalty is also known as a conscience or bias.

To add a bias we just need to initialize an array in the constructor with one entry for each cluster:
     self.__conscience = [0.0]*number_of_clusters_to_group_data_into

Change our feed forward to:
 def feed_forward(self, inputs):  
     for hidden_node_index in range(len(self.__hidden_nodes)):  
       activation = self.__conscience[hidden_node_index]  
       for input_index in range(len(self.__connections)):  
         activation += inputs[input_index]*self.__connections[input_index][hidden_node_index]  
       self.__hidden_nodes[hidden_node_index] = activation  
     return self.__hidden_nodes.index(max(self.__hidden_nodes))  

Then in training we just subtract a small amount every time a cluster wins (conscience_learning_rate is a small constant, e.g. 0.01):
     self.__conscience[winning_cluster_index] -= conscience_learning_rate
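Putting the pieces together, here is a self-contained sketch of the whole net (class and parameter names are my own; the learning rates are illustrative defaults) trained on some obviously clusterable color data:

```python
from random import seed, uniform

class CompetitiveNet(object):
    """Competitive learning net with a conscience, as described above."""
    def __init__(self, size_of_input_arrays, number_of_clusters):
        self.connections = [[uniform(-0.5, 0.5) for _ in range(number_of_clusters)]
                            for _ in range(size_of_input_arrays)]
        self.conscience = [0.0] * number_of_clusters
        self.number_of_clusters = number_of_clusters

    def feed_forward(self, inputs):
        # activate each cluster node, starting from its conscience penalty
        activations = []
        for cluster in range(self.number_of_clusters):
            activation = self.conscience[cluster]
            for i in range(len(self.connections)):
                activation += inputs[i] * self.connections[i][cluster]
            activations.append(activation)
        # the most activated node is the winning cluster
        return activations.index(max(activations))

    def train(self, inputs, learning_rate=0.1, conscience_rate=0.01):
        winner = self.feed_forward(inputs)
        # move the winner's connections towards this input
        for i in range(len(self.connections)):
            weight = self.connections[i][winner]
            self.connections[i][winner] = weight + learning_rate * (inputs[i] - weight)
        # penalize the winner so other nodes get a chance
        self.conscience[winner] -= conscience_rate
        return winner

seed(42)  # fixed seed so runs are repeatable
net = CompetitiveNet(3, 2)
reds = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]   # mostly-red inputs
blues = [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]  # mostly-blue inputs
for _ in range(50):
    for example in reds + blues:
        net.train(example)
print(net.feed_forward(reds[0]), net.feed_forward(blues[0]))
```

After training, similar inputs should tend to land in the same cluster index.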

Competitive learning nets are nice but are a long way from the goal of full unsupervised learning. In a future post I'm going to cover the Restricted Boltzmann Machine, which is used in the shallow layers of deep learning to give us a simpler representation of an image to work with.

Full code is available on github in Python, C#, Java and F#

Wednesday, May 20, 2015

Programming a programming computer game – .Net run time type creator

A while ago me and a friend had an idea for a computer game. You would control a collection of bacteria, all of which needed to feed and would die of old age given enough time. They could also reproduce in order to keep their population going, and fight enemy bacteria populations controlled by another player. The aim of the game was for your bacteria to out-compete the other player's bacteria on the map. So far so unoriginal; our new idea was that rather than controlling the creatures by, say, using a mouse and keyboard to give them orders, you would instead control them by writing the code for how they behaved. It would be a real-time competitive programming game.

The game interface would be a map with a text panel on the right where the user would enter code that the creatures would execute to make their decisions. There were commands for where to move, what to eat, when to breed, etc. This was also a really nice space from which to play with algorithms like neural nets, evolutionary algorithms, clustering, A*, etc. We played around with it a bit and had a fair amount of fun, but we eventually realized that even for us who had built it, the game was too complicated for anyone to actually play. At least not in real time. So we abandoned it as a fun experiment.

But I recently saw this post on stack overflow that reminded me of that game. So I thought I would share some of the code for how to do in-application code compilation in .Net. Hopefully it will be of use to some people, and if I get enough interest I may try and clean up the rest of the code and release it as an open source project. Because despite being painfully complicated, when it did work it was fun, at least for uber nerds like us.


Here is the one and only method in the lib:

 public static T CreateType<T>(string source,   
                          IEnumerable<string> assemblies,   
                          out List<string> compilationErrors)   
                             where T : class  
It will attempt to create an instance of the type T from the source passed in. The source will be compiled with references to all the assemblies in the assemblies parameter. So for example you could do this with it.

 namespace RunTimeTypeCreator.Tests   
 {  
    public interface ITestType   
    {  
      bool Success { get; }   
    }  
  
    public class RunTimeTypeCreatorTests   
    {  
       public static bool TestVariable = false;   
  
       public void Example()   
       {  
          const string source = @"   
 using RunTimeTypeCreator.Tests;   
 public class TestTypeClass : RunTimeTypeCreatorTests.ITestType   
 {  
    public bool Success { get { return RunTimeTypeCreatorTests.TestVariable; } }   
 }";  
          List<string> compilationErrors;       
          var type = RunTimeTypeCreator.CreateType<ITestType>(source,   
                     new[] { "RunTimeTypeCreator.Tests.dll" }, //the name of this assembly  
                     out compilationErrors);   
  
          TestVariable = true;   
          //will print true   
          Console.WriteLine(type.Success);   
          TestVariable = false;   
          //will print false   
          Console.WriteLine(type.Success);   
       }  
    }  
 }  

Which is kind of cool, I think. Here's a quick run-through of how it works. This just shows the code minus bits of validation and error reporting, so if you want the full thing I would recommend getting it from github.

 var csc = new CSharpCodeProvider();   
 var parameters = new CompilerParameters   
 {  
      //we don't want a physical executable in this case   
      GenerateExecutable = false,   
      //this is what allows it to access variables in our domain   
      GenerateInMemory = true   
 };  
 //add all the assemblies we care about   
 foreach (var assembly in assemblies)   
      parameters.ReferencedAssemblies.Add(assembly);   
 //compile away, will load the class into memory   
 var result = csc.CompileAssemblyFromSource(parameters, source);   
 //we compiled successfully so now just use reflection to get the type we want   
 var types = result.CompiledAssembly.GetTypes()   
              .Where(x => typeof(T).IsAssignableFrom(x)).ToList();    
 //create the type and return   
 return (T)Activator.CreateInstance(types.First());  
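As an aside, the same run-time type creation trick is much less work in a dynamic language. Here's a rough Python analogue of CreateType (my own illustration, nothing to do with the .Net lib) using the built-in compile/exec:

```python
def create_type(base_type, source):
    # compile and execute the source in a namespace that can see the base type
    namespace = {base_type.__name__: base_type}
    exec(compile(source, "<runtime>", "exec"), namespace)
    # reflection step: find the first class in the source that subclasses base_type
    for value in namespace.values():
        if (isinstance(value, type) and issubclass(value, base_type)
                and value is not base_type):
            return value()
    raise TypeError("no subclass of %s found in source" % base_type.__name__)

class ITestType(object):
    success = False

source = """
class TestTypeClass(ITestType):
    success = True
"""
instance = create_type(ITestType, source)
print(instance.success)
```

No compiler parameters or assembly references needed, though you inherit all the same sandboxing worries mentioned below.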

I also had some other code around validating what the user was doing: making sure they weren't trying to access the file system, open ports or create memory leaks/infinite recursion. I'll try and clean this up and post it at a future date.

Full code is available here on github. 

Monday, May 18, 2015

Particle Swarm Optimization in F# part 2 - Parallelizing

In the last post I gave an example of the particle swarm optimization algorithm in F#. F# has a few nice features, but the main reason I wanted to use it was because it is so easy to write multi-threaded applications with it.

Multi-threaded PSO version 1

If I want my algorithm to run multi-threaded all I have to do is take this line in the update_particles function.

let updated_particles = particles |> List.map (fun x -> update_particle args loss_func x global_best_params)   

And change it to:
 let updated_particles = particles |> List.map (fun x -> async { return update_particle args loss_func x global_best_params })  
                                   |> Async.Parallel   
                                   |> Async.RunSynchronously  
                                   |> Array.toList  

Like magic the whole application now runs in parallel, and because I have no mutable types I can guarantee there are no issues with cross-thread calls. There are a few things I don't like about this though: this is 3 lines when really it should be one, and it creates an array that I then have to map back into a list. It is an annoying quirk of F# that it is much easier to run Arrays in parallel, via the Parallel.map function, than Lists; but arrays are mutable, losing the guaranteed thread safety...
A bit of help from stack overflow yielded PSeq, a lib allowing you to run parallel operations against F# sequences. So the above can be rewritten:
 let updated_particles = particles  |> PSeq.map (fun x -> update_particle args loss_func x global_best_params)   
                                    |> PSeq.toList  

Multi-threaded PSO version 2

Now running in parallel, our speed is improved, but we still don't use all cores as efficiently as possible. Each iteration of updating the particles waits for every particle to complete. This is reasonable if they all take the same amount of time, but let's say the function is something that could execute in 1 second or in 100 seconds. We always have to wait for the longest to complete, and once only 1 is left we are only running on a single thread.
A better alternative is to run the whole lifetime of each particle in parallel. The only piece of data that needs to travel between the particles is the global best parameters. This can be handled by passing it as a ref and having a setter function, so we always just take the current global best at start up and update it whenever we have a new value.
The changes we need to make for this are:
Remove the update_particles and run_until_stop_condition methods and replace them with this:
let rec private run_particle (args : Args) (particle : Particle) (get_global_best : unit -> Particle) check_particle_against_global_best loss_func iterations_to_run : Particle =
    let updated_particle = update_particle args loss_func particle (get_global_best()).Parameters
    check_particle_against_global_best updated_particle

    let new_iterations_to_run = iterations_to_run - 1
    if stop_condition args iterations_to_run (get_global_best()).Local_best_loss then
        updated_particle
    else
        run_particle args updated_particle get_global_best check_particle_against_global_best loss_func new_iterations_to_run

The execute method needs to be modified to run run_particle in parallel:
 let execute (args : Args) (loss_func : list<float> -> float) (initial_weights : seq<list<float>>) =      
   let particles = initial_weights |> Seq.take args.particles  
             |> PSeq.map (fun w -> Particle(w, [for _ in 1 .. w.Length -> 0.0], w, loss_func w))  
             |> Seq.toList  
  
   let global_best = ref (particles |> List.minBy (fun x -> x.Local_best_loss) )  
   let monitor = new System.Object()  
   let check_particle_against_global_best (particle : Particle) =  
     lock monitor (fun() -> if particle.Local_best_loss < global_best.Value.Local_best_loss then  
                   global_best.contents <- particle)  
  
   let final_particles = particles |> PSeq.map (fun x -> run_particle args x (fun () -> global_best.Value) check_particle_against_global_best loss_func args.iterations)  
                   |> PSeq.toList  
   (global_best.Value.Local_best, global_best.Value.Local_best_loss, final_particles)  

Now this looks a bit more like traditional C# multi-threaded code, and we now have the possibility of screwing it up. But hopefully we have contained the problem enough to be confident we haven't. We keep a ref to the particle that is our global best, a monitor to lock against when updating, and finally check_particle_against_global_best, to which we pass each new particle we create to see if it is an improvement.
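The shared-best-value-guarded-by-a-lock pattern translates to most languages. Here's an illustrative Python sketch of the same idea (class and method names are my own, not from the F# code):

```python
import threading

class GlobalBest:
    """Thread-safe holder for the best (parameters, loss) seen so far."""
    def __init__(self, parameters, loss):
        self._lock = threading.Lock()
        self._parameters, self._loss = parameters, loss

    def get(self):
        # snapshot of the current best, taken under the lock
        with self._lock:
            return self._parameters, self._loss

    def check(self, parameters, loss):
        # update only if this result improves on the current best
        with self._lock:
            if loss < self._loss:
                self._parameters, self._loss = parameters, loss

best = GlobalBest([0.0], 5.0)
best.check([1.0], 3.0)   # improvement: accepted
best.check([2.0], 4.0)   # worse: ignored
print(best.get())
```

Each worker thread calls check after every evaluation and get whenever it needs the current global best, exactly like the ref plus monitor above.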

Speed tests

Here is a method to write out the execution speed of an action:
 let speed_test action title =    
   let iterations = 20   
   let results =   
      [for i in 0 .. iterations do   
          System.GC.Collect()  
          let sw = System.Diagnostics.Stopwatch.StartNew()   
          action()  
          yield sw.Elapsed] |> List.sort   
   let median_result = List.nth results (iterations/2)   
   printfn "%s : %A" title median_result   

The GC.Collect() forces the .Net framework to do a full garbage collection. If this isn't done the speed of one test can be affected by the memory used in the previous test. I'm taking the median time here, which I think is a better measure than the mean. Whenever you run a set of speed tests there will always be a few that take ages because of spooky behavior of OS's. The middle time ignores these occasional outliers.
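The same collect-then-time-then-take-the-median approach is easy to sketch in Python (an illustrative translation, not the F# code):

```python
import gc
import statistics
import time

def speed_test(action, title, iterations=21):
    results = []
    for _ in range(iterations):
        gc.collect()  # full collection so prior runs don't skew this timing
        start = time.perf_counter()
        action()
        results.append(time.perf_counter() - start)
    # median rather than mean, to ignore the occasional outlier run
    median = statistics.median(results)
    print("%s : %fs" % (title, median))
    return median

speed_test(lambda: sum(range(100000)), "sum test")
```

An odd iteration count means the median is an actual observed run rather than an average of two.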


From my fairly old 4 core desktop
                                            Single threaded   Multi-threaded 1   Multi-threaded 2
Fixed length function time in seconds            2.45              1.56               1.33
Variable length function time in seconds         4.45              1.44               0.71

So looks like Multi-threaded 2 gives a pretty decent improvement. Full code is here on github.

Monday, May 4, 2015

Particle Swarm Optimization in F#

Why is particle swarm optimization good?

Let's say you have a function that takes an array of inputs and produces a single output. You have an objective: you want to find what input results in the lowest possible output for this function. If the function were differentiable you could just compute the gradient and follow it back to reach your objective, but in this hypothetical scenario it's not. If the range of inputs were quite small you could just search the entire input space to find the best output, but again not in this scenario. If you've reached this point, this is when particle swarm optimization (referred to as PSO from here on) might be a good idea.

When would you use it?

One example might be a function that runs a simulation of how an aircraft flies. The inputs are the wing span, tail length, the different kinds of engines that can be used, etc. The output is how much fuel the aircraft uses in the simulation. Or perhaps you could have a function simulating an economic model.

What is it?

In PSO you have many "particles" which will each execute the function with their own set of parameters for it. After each execution, each particle will look at the result it got and attempt to find better input parameters. It does this by looking at its own best result and attempting to move towards that, and also at the best result of all the other particles. In this way the particles search through the input space, but in such a way that they prioritize successful areas. Also, with there being many particles searching, each with different sets of parameters, you hope to avoid problems of local minima.

 velocity = inertia_weight * previous_velocity  
         + constant1*random1*(local_best_parameters-previous_parameters)  
         + constant2*random2*(global_best_parameters-previous_parameters)  
 parameters[time] = parameters[time-1]+velocity[time]  

Here inertia_weight is just an array of numbers, most likely all between 0 and 1, used to dampen the previous velocity, to stop the particles heading off into the middle of nowhere. Also, velocity is often constrained by a maximum value. constant1 and constant2 are used to control how much the particle is affected by the local and global bests. Normally they are set to between 1 and 2.
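The update above can be sketched numerically in Python like this (an illustrative translation with a scalar inertia weight; default values for the constants are assumptions):

```python
import random

def update(params, velocity, local_best, global_best,
           inertia=0.8, c1=2.0, c2=2.0, max_velocity=10.0):
    r1, r2 = random.random(), random.random()
    new_velocity = []
    for p, v, l, g in zip(params, velocity, local_best, global_best):
        # inertia term + attraction to local best + attraction to global best
        nv = inertia * v + c1 * r1 * (l - p) + c2 * r2 * (g - p)
        # constrain by the maximum velocity
        nv = max(min(nv, max_velocity), -max_velocity)
        new_velocity.append(nv)
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity
```

Note that if a particle is already sitting at both its local and global best with zero velocity, the update leaves it exactly where it is.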

Is there any intuitive way of understanding this?

The thing that works for me is imagining a collection of bugs, one for each particle. Each bug is trying to find some food in an area. They can't see the food, but their sense of smell tells them how close to it they are; this is the loss function, and their position is the parameters for the function. Imagine all the bugs spread out in the area, all moving around it. Each iteration they reconsider what direction they are going. Based on their smell they consider: is my current position better than the best position I have ever been in? If it isn't, they adjust their direction to move towards that best position. But these bugs are also social creatures with a hive mind, so they also consider how their current position compares to the best position any of the bugs have ever reached, and adjust to move towards that. Over time each bug is getting a better and better personal best and the global best is also slowly improving. Imagine them swarming across the area, gradually grouping more and more tightly around the successful points. Hope that makes things less, not more, confusing...

Why not use a genetic algorithm?

There is some overlap; genetic algorithms more naturally fit discrete variables (e.g. bool) while PSO fits better with continuous ones. But either can be modified to take the full range. I think when working with continuous variables PSO will work out to be more efficient in finding good solutions, but this is probably massively variable depending on the data. This is one of those annoying situations in machine learning where you have to try things and just see what looks like it's working.

Why F#?

The best things about F# are its lightweight syntax and how easy it is to write easy-to-read parallel code. PSO can achieve massive benefits from running in parallel. Also my day job is mostly working in C#, Java and Python, so F# makes a nice change of pace.

Show me the code

I wanted to write all the code in a pure functional style. This means aiming for all functions being deterministic: the output is solely dependent on the input, with no side effects or state. For the PSO algorithm there are a lot of variables, such as constant 1, constant 2, inertia weight, etc. These could all be set as class variables, but because I wanted this to be pure they should all be passed in as arguments to the functions that use them. This would mean a lot of messy long functions, so I made an Args type to hold all these variables.

 type Args(inertia_weight : list<float>, ?particles : int, ?iterations : int, ?max_velocity : float, ?success_threshold : float, ?c1 : float, ?c2 : float) =  
   member this.inertia_weight = inertia_weight  
   member this.particles = defaultArg particles 10  
   member this.iterations = defaultArg iterations 100  
   member this.max_velocity = defaultArg max_velocity 10.0  
   member this.success_threshold = defaultArg success_threshold 0.0001  
   member this.c1 = defaultArg c1 2.0  
   member this.c2 = defaultArg c2 2.0  

Next we have the particle class. Note I went for lists as the type for all the variables. I didn't want to use Arrays because they are not immutable. Sequences are the most functionally pure choice, but in reality they are quite a lot slower than lists. Also StructuredFormatDisplay is used so they print nicely.

 [<StructuredFormatDisplay("Particle {Parameters}")>]  
 type Particle =  
   val Parameters : list<float>  
   val Velocity : list<float>  
   val Local_best : list<float>  
   val Local_best_loss : float  
   new(parameters, velocity, local_best, local_best_loss) =   
     { Parameters = parameters; Velocity = velocity; Local_best = local_best; Local_best_loss = local_best_loss }  

Here is the main algorithm:

 let update_particle (args : Args) loss_func (particle : Particle) (global_best : list<float>) : Particle =  
   let limit_velocity velocity max_velocity =  
     velocity |> List.map (fun x -> if x > 0.0 then  
                       min x max_velocity  
                     else  
                       max x (-max_velocity))  
   let random = new System.Random()  
   let r1 = random.NextDouble()*args.c1  
   let r2 = random.NextDouble()*args.c2  
   let velocity = (List.map2 (fun w v -> w*v) args.inertia_weight particle.Velocity, // multiply last velocity by inertia weight  
           List.map2 (fun l p -> r1*(l-p)) particle.Local_best particle.Parameters, //get attraction of local best  
           List.map2 (fun g p -> r2*(g-p)) global_best particle.Parameters)//get attraction of global best  
           |||> List.map3 (fun x y z -> x+y+z)//add the result of these 3 calculations together  
           |> limit_velocity <| args.max_velocity //limit velocity by max  
   let new_parameters = (particle.Parameters, velocity) ||> List.map2 (fun x y -> x + y)  
   let new_loss = loss_func new_parameters  
   if new_loss < particle.Local_best_loss then  
     Particle(new_parameters, velocity, new_parameters, new_loss)   
   else  
     Particle(new_parameters, velocity, particle.Local_best, particle.Local_best_loss)  

For PSO we need to run this on a full collection of particles like so

 let update_particles (args : Args) (particles : list<Particle>) (global_best_params : list<float>) (global_best_loss : float) loss_func : list<Particle> * list<float> * float =    
   let updated_particles = particles |> List.map (fun x -> update_particle args loss_func x global_best_params)  
   let best_from_this_iteration = updated_particles |> List.minBy (fun x -> x.Local_best_loss)  
   if global_best_loss < best_from_this_iteration.Local_best_loss then  
     (updated_particles, global_best_params, global_best_loss)  
   else  
     (updated_particles, best_from_this_iteration.Local_best, best_from_this_iteration.Local_best_loss)  

A method for running this loop until our stop condition is reached

 let rec private run_until_stop_condition (args : Args) (particles : list<Particle>) (global_best_params : list<float>) (global_best_loss : float) loss_func iterations_to_run =  
   let stop_condition (args : Args) iterations global_best_loss =  
     iterations <= 0 || global_best_loss <= args.success_threshold  
   let (new_particles, new_global_best_params, new_global_best_loss) = update_particles args particles global_best_params global_best_loss loss_func  
   let new_iterations_to_run = iterations_to_run - 1  
   if stop_condition args iterations_to_run new_global_best_loss then  
     (new_global_best_params, new_global_best_loss, new_particles)  
   else  
     run_until_stop_condition args new_particles new_global_best_params new_global_best_loss loss_func new_iterations_to_run  

Now example usage looks like this.

 let target_point = [23.0;54.0]  
 let loss_func (x : list<float>) = x |> List.map2 (fun x y -> sin 1.0/x-sin 1.0/y) target_point |> List.sum |> abs  
 let random = new System.Random()  
 let initial_weights = seq{ while true do yield [random.NextDouble()*1000.0;random.NextDouble()*1000.0]};  
 let args = Args(inertia_weight=[0.8; 0.8], particles=10, success_threshold=0.01, iterations = 10);  
 let (global_best_params, global_best_loss, particles) = execute args loss_func initial_weights;;  
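For readers who don't know F#, the whole algorithm can also be sketched end to end in Python (an illustrative translation, not the post's actual code; the defaults mirror the Args values above):

```python
import random

def pso(loss, dim, particles=10, iterations=100,
        inertia=0.8, c1=2.0, c2=2.0, max_velocity=10.0):
    random.seed(0)  # fixed seed so runs are repeatable
    positions = [[random.uniform(-10.0, 10.0) for _ in range(dim)]
                 for _ in range(particles)]
    velocities = [[0.0] * dim for _ in range(particles)]
    local_best = [p[:] for p in positions]
    local_best_loss = [loss(p) for p in positions]
    best = min(range(particles), key=lambda i: local_best_loss[i])
    global_best, global_best_loss = local_best[best][:], local_best_loss[best]
    for _ in range(iterations):
        for i in range(particles):
            r1, r2 = random.random(), random.random()
            for d in range(dim):
                # inertia + attraction to local best + attraction to global best
                v = (inertia * velocities[i][d]
                     + c1 * r1 * (local_best[i][d] - positions[i][d])
                     + c2 * r2 * (global_best[d] - positions[i][d]))
                velocities[i][d] = max(min(v, max_velocity), -max_velocity)
                positions[i][d] += velocities[i][d]
            new_loss = loss(positions[i])
            if new_loss < local_best_loss[i]:
                local_best[i], local_best_loss[i] = positions[i][:], new_loss
                if new_loss < global_best_loss:
                    global_best, global_best_loss = positions[i][:], new_loss
    return global_best, global_best_loss

# minimize the sphere function x^2 + y^2
best_params, best_loss = pso(lambda x: sum(v * v for v in x), dim=2)
print(best_params, best_loss)
```

Since the local and global bests only ever improve, the returned loss can never be worse than the best of the random starting points.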

Here is a link to the full code on github. In the next post I look at options for parallelizing the code.