Wednesday, October 18, 2017

Book review: The Lean Manager

The Lean Manager is a book about the manager of a car factory in France. His company has just gone through a takeover, and the new CEO informs him that he must close the factory. After some begging, the CEO agrees that if significant improvements are made to productivity before the closure can be arranged (it's France, labor laws are very complex), he may keep it open.

How is our hero to make these improvements at a factory that has been stagnating for so long?

Through the application of LEAN principles.

If you have worked in any manufacturing/tech industry jobs you will have heard the term LEAN. LEAN management is a system developed by Toyota that led to huge improvements in their productivity. This book is from the same family as The Phoenix Project, which I wrote about here: a management book in the form of a novel. Like The Phoenix Project, it is also a surprisingly good novel. A few empty characterizations aside, the story was engaging, and I found myself really hoping the factory would turn around and the jobs would be saved.

What is interesting from a management/LEAN point of view is that in the book the factory already follows LEAN processes. They have a continuous-improvement officer, and the main character has already gained various LEAN certifications. What the company CEO teaches the manager is that there is a difference between the LEAN process, which is a prescriptive set of actions to apply, and true LEAN working, which is more about educating employees to collectively solve problems.

The CEO's first advice for improving the factory has 3 parts:
    1. Fix quality problems
    2. Reduce inventory to free up cash
    3. Lower costs by eliminating the waste found doing 1 and 2

Following this advice, the manager tries installing red bins at every point in the factory. When a part comes off a line in a defective state it is put in the red bin. Multiple times a day, inspections are made to check on the contents of the red bins. At first parts are put into the red bins, but this doesn't feed through to improvements. Senior managers argue about the reasons for different defects, and nothing is really done.

This leads to the next issue: in order for people to solve problems, there needs to be agreement on what the problem is. Only then can group problem solving begin. And there can only be agreement on what the problem is if these things are clearly tracked. This involved better monitoring and much more education of the workers in terms of what the quality standards were.

Once there was clear agreement on what the problem was and how progress could be tracked, they could start to look at which teams and people were the most productive, and feed the improvements they found across the factory. This is where continuous improvement could start to happen.

 

My take-aways

To create a good product, you must have a good process. But a good process only comes from good people, and from having those people engaged with the process. The first step to improving your product is improving your people. Improving people is not simply a matter of sending existing employees on training courses or hiring more qualified new staff. It is about building a culture of continuous improvement and collaboration, where the best ideas from one employee are adopted by the rest of their team and then the rest of the company. A factory is not just about part flow, but also knowledge flow. To improve parts, improve people.

In trying to create such a culture, the first problem encountered at the plant is that it is very hard for employees to work together to solve problems. People look to avoid blame. If something goes wrong, the warehouse will blame the engineers, who will blame the maintenance teams, and so on. The first barrier to solving problems together is that there must be consensus on what the problem is. When there is a problem it must be stopped at the source: a bad part must be caught at the place it is made, and not become the part that causes a problem in a larger component.

This is why the pull system goes hand in hand with LEAN. The shorter work cycles in a pull system mean it is much easier to identify where the process goes wrong and get people together to fix it.

Kaizen is a term used by Toyota. It roughly translates to "change for better". But as applied by Toyota it is a practice teams actively meet for. When engaged in kaizen there is a checklist to go through:
  1. What is the problem we are trying to solve?
  2. What result do we expect?
  3. What principle should we apply?
  4. Did we get the result we wanted? Why? What did we learn?
The Toyota 4 obsessions:
  1. Managing production sites through stable teams of multi-skilled workers
  2. Getting everyone involved in quality
  3. Achieving just-in-time production by reducing lead time
  4. Reducing costs all around by eliminating waste

Another big theme in the book, and a solution to so many problems, is what they term "Go and see". How do you get people to improve productivity? Go and see who is doing a good job and what they are doing. How do you reduce waste? Go and see what is being wasted. How do you get people to take responsibility for the process? Go and see what they are worrying about. This really rang true for me: so much of what I've seen when I've worked at organizations with bad cultures is a lack of managers actually looking into what is going on.

The book makes clear the distinction between this and micromanaging. You're not going in to fix the problem; you don't need to have the solution to a waste or productivity issue. You're going in to listen and understand, to see why things aren't happening, and to bring the right people together to solve the right problems. The Toyota way says that a manager should wash his hands 3 times every day, because every day he is going to the factory floor 3 times to see what is happening.

This concept of go and see is the thing I most took away from the book. There are all kinds of tools for tracking performance and productivity, and many of them are great, but there is no substitute for actually going and seeing.

 

Conclusion

Overall I really enjoyed this book and would recommend it to anyone involved in management, or just interested in understanding LEAN a bit better. You can pick it up on Amazon here.

Thursday, July 20, 2017

How do I know if I’m good at programming?

I was talking to a very junior programmer recently and he asked me a great question. A question so good, it made me stop and think about my perspective on how learning happens. The question was:

“How do I know if I’m good at programming?”

I've often been asked the other side of this question: "How do I get good at programming?", or "What can I do to get good at programming?". The answer is some combination of: experiment on your own projects, do courses, read books, work with good programmers and contribute to open source projects. But if you think about the problem in terms of "how do I know if I'm good?" it becomes much more of an engineering approach. You have a metric; optimize it.

Aside from practice, the most important thing for improving at any task is good feedback. When starting in a new domain without a good teacher, good feedback is difficult to come by. A course, for example, may have exams or projects on which you get feedback. But once you've completed a course, how do you get the higher-level feedback on your overall improvement? If you have a good mechanism for feedback, the answers for which thing to pursue next flow much more easily.

So how do I know if I’m good at programming? A good place to start is to ask “what is good code?”. If a programmer can’t produce good code, they aren’t a good programmer.

What is good code?

All code exists to complete tasks. The first mark of good code is that it completes the desired task. The task may vary massively in level of complexity, but the code can never do better than complete the task. Feedback for this is simple: does the code achieve the desired aim? The code should only complete the desired task; there should not be other undesirable side-effects. Writing a good set of unit tests around your code can act as a success metric.
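
For example, here is a minimal sketch using Python's built-in unittest module; median is just a hypothetical stand-in for whatever your task is, and the last test checks for one kind of undesirable side-effect, mutating the input.

import unittest

def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

class TestMedian(unittest.TestCase):
    def test_odd_length(self):
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even_length(self):
        self.assertEqual(median([4, 1, 3, 2]), 2.5)

    def test_does_not_mutate_input(self):
        values = [3, 1, 2]
        median(values)
        self.assertEqual(values, [3, 1, 2])  # no undesirable side-effects

if __name__ == "__main__":
    unittest.main()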

If you cannot successfully complete the task, here is your first good piece of feedback, which will show you exactly what you need to learn next. Identify the knowledge you lack and seek out the most relevant resource. Systems theory tells us that if you're not optimizing on the constraint, you're wasting your energy. This is the constraint; learn only this thing.

Is the code readable?

A well-written piece of code is a clean, concise expression of ideas. It should be as easy as possible for another programmer to understand what you've written. Get familiar with the idioms and syntactic sugar of your language. There might be nice ways to write in just one line what you've written in 3, while still maintaining readability. Make sure your code is correctly documented, explaining why, not what, the code is doing. To test this, if you don't have friends who can look at your code, I would recommend posting it to https://codereview.stackexchange.com/ or some other similar site. Yourself 6 months in the future is also a pretty good substitute for a stranger.
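
As a small, made-up example of the 3-lines-to-1 point:

numbers = [1, 2, 3, 4, 5, 6]

# three lines to collect the squares of the even numbers...
squares = []
for n in numbers:
    if n % 2 == 0:
        squares.append(n * n)

# ...or one idiomatic line that is arguably more readable
squares = [n * n for n in numbers if n % 2 == 0]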

Is it easy to extend or modify?

It's a lot of programmers' favorite complaint: "the requirements changed". This is reality; you're not programming exam questions with clearly stated features and aims. In the real world requirements are always changing, and that's a good thing. If a task took you 3 months to complete and you have no new requirements, someone is not doing a good job. More time should bring in new information and requirements.

In an interview I like to ask a candidate to complete a fairly simple programming task, such as programming an elevator system. Once they've finished that, I ask them to implement a crazy feature that they could never have foreseen. Their response to this, both interpersonal and technical, tells you a huge amount about their skill.

A good programmer should be planning for this. If you wrote a program to complete task x, see how easy it is to modify it to do task x and y, but not when conditions k through q occur (unless k and m occur at the same time, in which case execute y but not x). CS 101 concepts - like polymorphism and the difference between inheritance and composition - that seemed meaningless at the time may now feel interesting.
Writing code to change is a thing you get a feel for with more experience, but don't add more abstraction layers too soon. Premature abstraction leads to things like Enterprise FizzBuzz (https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpriseEdition).
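
To make the composition point concrete, here is a hypothetical sketch of the elevator task, with the scheduling policy pulled out into a strategy object that the elevator delegates to. An unforeseen new requirement can then often be met by writing a new strategy rather than rewriting the Elevator class (all the names here are illustrative):

class FifoStrategy:
    """Serve requests in the order they arrived."""
    def next_floor(self, requests, current_floor):
        return requests[0]

class NearestFirstStrategy:
    """Serve whichever requested floor is closest."""
    def next_floor(self, requests, current_floor):
        return min(requests, key=lambda floor: abs(floor - current_floor))

class Elevator:
    """The elevator delegates scheduling decisions to its strategy."""
    def __init__(self, strategy):
        self.strategy = strategy
        self.current_floor = 0
        self.requests = []

    def request(self, floor):
        self.requests.append(floor)

    def step(self):
        if self.requests:
            self.current_floor = self.strategy.next_floor(self.requests, self.current_floor)
            self.requests.remove(self.current_floor)

# the "crazy new feature" often becomes just another strategy class
elevator = Elevator(NearestFirstStrategy())
elevator.request(5)
elevator.request(2)
elevator.step()
print(elevator.current_floor)  # 2 - the nearest requested floor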

Is the code efficient?

Efficiency has a few different components: speed to finish, and various resource components such as memory usage, CPU usage, etc. Luckily all of these are easy to track, and there are profilers in every language that can show you where program time and resources are being used. Get familiar with these tools and then see how much you can shave off these metrics. If you are re-engineering an existing problem, such as writing your own LRU cache, you can look up the theoretical best performance and compare it to your own. Here you may want to start thinking about your code from a big-O perspective. It is also useful to know roughly how long different operations take: https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html.
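
In Python, for instance, the standard library's cProfile module is enough to get started; a minimal sketch, with slow_function standing in for your real code:

import cProfile
import pstats

def slow_function():
    total = 0
    for i in range(1_000_000):
        total += i ** 2
    return total

# profile the call, save the stats, and print the most expensive
# functions first
cProfile.run("slow_function()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(5)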

Can you write it quickly?

Especially early on in learning, it's important not to focus on speed. Competing in speed-coding competitions is impressive, but when it comes to the craft of programming, it's a poor way to learn. Speed should not be the goal, more a measure of progress. If you are mastering a domain then you should be able to write code that completes its task, is efficient, readable, and easy to extend, in a shorter amount of time.

It is easy to measure the time taken to complete a task, but that metric is difficult to act upon. Instead, try to analyze the amount of time spent on the various subtasks. Are these the areas you need to deepen your understanding of? Maybe you are spending large amounts of time doing manual testing of your code; could this be sped up through automation? You could also start to look at the tools you are using. Your IDE has all kinds of useful hotkeys that can improve speed and free your brain from mechanical tasks, so it can focus on the higher-level problems.
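
On the automation point, even a tiny script can replace a repeated manual check; a hypothetical sketch (my_program.py and its expected output are made up):

import subprocess

# run the program against a known input, exactly as you would by hand...
result = subprocess.run(
    ["python", "my_program.py", "--input", "example.txt"],
    capture_output=True, text=True, check=True,
)
# ...and check the output automatically instead of eyeballing it
assert "expected summary line" in result.stdout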

Finishing up

If your code achieves all these things to a high standard, then you are good at programming for that task. If not, there are clear areas in which to improve. You can then look towards more complex tasks, or tasks in other areas at which you wish to be good.

I would like to add one more thing to finish this article. Given the hard limit on hours in the day, and a harder limit on productive hours, there is only so much a single programmer can do. Elon Musk has a reputation as a productive guy, but if he had been the only engineer at PayPal they would not have gotten far. At a certain level, if you want to build great things you need to be working in a team. At this point programming becomes a social activity: you want to be able to program not just with your own hands and brain but, on some level, through the whole team you work with. Being a good programmer is then about not just how good your code is, but how good the code of the people you work with is.

There are many ways to get feedback on this. After you code-review a colleague's work, do they produce better code? The next question is: have they just improved the specific code you reviewed, or has their general code improved as well? Getting people to improve in this way is not simply a matter of technical feedback, but also motivation. Can you get them excited about the task, to understand why different improvements or approaches are important? Are you improving the overall skills of the people around you? The famous 27x research says that the best programmers are 27 times better than the worst. Well, if you are a 7x programmer and you help 4 other people in your team go from 1x to 7x, then you're far more productive than a 27x programmer.

For me these are the essential skills of a good programmer, and if you are achieving any of them, even in some small way, you can sleep soundly knowing you are a good programmer.

Monday, May 22, 2017

Book: Python Deep Learning

My co-authors and I recently finished work on a book called Python Deep Learning. It is now available on Amazon: https://www.amazon.co.uk/Python-Deep-Learning-Gianmario-Spacagna/dp/1786464454

The book aims to give a broad introduction to deep learning and show how to implement and use various techniques in Python. It includes examples of many applications of deep learning, including image recognition, speech recognition, and anomaly detection in financial data.

My 2 chapters focus on my particular interest in using deep learning to play games. I've included examples of building AIs in Python with TensorFlow that can master Pong, Breakout and Go.

If you are interested, the code KVGRSF30 gives a 30% discount on the e-book version from the publisher's website.


Saturday, October 1, 2016

AlphaToe

AlphaGo

Is an AI developed by Google DeepMind that recently became the first machine to beat a top-level human Go player.

AlphaToe

Is an attempt to apply the same techniques used in AlphaGo to Tic-Tac-Toe. Why? I hear you ask. Tic-Tac-Toe is a very simple game and can be solved using basic min-max.

Because it's a good platform to experiment with some of the AlphaGo techniques, which, it turns out, work at this scale. Also, the neural networks involved can be trained on my laptop in under an hour, as opposed to the weeks on an array of supercomputers that AlphaGo required.
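
For reference, "basic min-max" is just an exhaustive search of the game tree, picking the move that is best for whoever's turn it is. A minimal sketch, assuming a hypothetical game-state API:

def minimax(state, maximizing):
    # assumes state.score() returns +1 for a first-player win, -1 for a
    # loss, 0 for a draw and None while the game is still in progress
    if state.score() is not None:
        return state.score()
    scores = [minimax(state.apply_move(move), not maximizing)
              for move in state.available_moves()]
    return max(scores) if maximizing else min(scores)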

The project is written in Python using TensorFlow; the GitHub repo is here: https://github.com/DanielSlater/AlphaToe. It contains code for each step that AlphaGo used in its learning. It also contains code for Connect 4, and the ability to build games of Tic-Tac-Toe on larger boards.

Here is a sneak peek at how it did in the 3x3 game. In this graph it is training as first player, and gets to an 85% win rate against a random opponent after 300,000 games.




I will do a longer write-up of this at some point, but in the meantime here is a talk I did about AlphaToe at a recent DataScienceFestival event in London, which gives a broad overview of the project:


  

Thursday, May 26, 2016

Using Net2Net to speed up network training

When training neural networks there are 2 things that combine to make life frustrating:
  1. Neural networks can take an insane amount of time to train.
  2. How well a network is able to learn can be hugely affected by the choice of hyperparameters (here this refers mainly to the number of layers and number of nodes per layer, but can also include the learning rate, activation functions, etc.), and without training a network in full you can only guess at which choices are better.
If a network could be trained quickly, number 2 wouldn't really matter; we could just do a grid search (or even particle swarm optimization, or maybe Bayesian optimization) to run through lots of different possibilities and select the hyperparameters with the best results. But for something like reinforcement learning in computer games the amount of time to train is counted in days, so you'd better hope your first guess was good...

My current research is around ways to get neural networks to adjust their size automatically, so that if there isn't sufficient capacity in a network it will in some way determine this and resize itself. So far my success has been (very) limited, but while working on that I thought I would share this paper: Net2Net: Accelerating Learning via Knowledge Transfer, which has a good, simple approach to resizing networks manually while keeping their activations unchanged.

I have posted a numpy implementation of it here on Github.

Being able to manually resize a trained network can give big savings on network training time, because when searching through hyperparameter options you can start off with a small, partially trained network and see how adding extra hidden nodes or layers affects test results.

Net2Net comprises 2 algorithms: Net2WiderNet, which adds nodes to a layer, and Net2DeeperNet, which adds new layers. A minimal numpy sketch of the Net2WiderNet code, assuming simple fully connected layers (illustrative rather than the exact code on GitHub), looks like this:
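
import numpy as np

def net2wider(weights_in, bias, weights_out, noise_std=0.01):
    # illustrative sketch, not the exact code from the repo
    # weights_in:  (n_below, n_hidden) weights into the layer being widened
    # bias:        (n_hidden,) biases of that layer
    # weights_out: (n_hidden, n_above) weights from that layer to the next
    n_below, n_hidden = weights_in.shape

    # the new node is a clone of a random node from the same layer, plus
    # a little noise so the original and the copy can drift apart in training
    clone = np.random.randint(n_hidden)
    new_in = weights_in[:, clone] + np.random.normal(0.0, noise_std, n_below)
    new_bias = bias[clone] + np.random.normal(0.0, noise_std)

    # halve the outgoing weights of the original node and give the same
    # halved weights to the clone, so the next layer's input is unchanged
    weights_out = np.copy(weights_out)
    new_out = weights_out[clone, :] * 0.5
    weights_out[clone, :] *= 0.5

    return (np.column_stack([weights_in, new_in]),
            np.append(bias, new_bias),
            np.vstack([weights_out, new_out]))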


This creates the weights and biases for a layer 1 node wider than the existing one. To increase the size by more nodes, simply do this multiple times (note the finished library on GitHub has the parameter new_layer_size to set exactly how big you want it). The new node is a clone of a random node from the same layer. The original node and its copy then have their outputs to the next layer halved, so that the overall output from the network is unchanged.

How Net2WiderNet extends a 2-node hidden layer to have 3 nodes


Unfortunately, if 2 nodes in the same layer have exactly the same parameters then their activations will always be identical, which means their back-propagated errors will always be identical, so they will update in the same way and their activations will remain the same; you gain nothing by adding the new node... To stop this happening, a small amount of noise is injected into the new node. This means the two nodes have the potential to move further and further apart as they train.

Net2DeeperNet is quite simple: it creates an identity layer, then adds a small amount of noise. This means that the network's activation is only unchanged if the layer is a linear layer, because otherwise the activation function's non-linearity will alter the output. So bear in mind that if you have an activation function on your new layer (and you almost certainly will), the network's output will be changed, and performance will be worse until the network has gone through some amount of further training.
Here is a minimal sketch of the code, assuming a square fully connected layer (again illustrative rather than the exact code on GitHub):
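
def net2deeper(layer_size, noise_std=0.01):
    # illustrative sketch (numpy imported as np, as above): a new identity
    # layer plus a little noise, so it initially passes its input through
    # almost unchanged - exactly unchanged only if the layer is linear
    weights = np.eye(layer_size) + np.random.normal(0.0, noise_std, (layer_size, layer_size))
    bias = np.random.normal(0.0, noise_std, layer_size)
    return weights, bias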


Usage in TensorFlow

This technique could be used in any neural network library/framework, but here is how you might use it in TensorFlow.

In this example we first train a minimal network, with 100 hidden nodes in each of the first and second layers, for 75 epochs. Then we do a grid search over different numbers of hidden nodes, training each for 50 epochs, to see which leads to the best test accuracy.
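
A rough sketch of the resize step, written against the TensorFlow 1.x-style API of the time (the shapes assume MNIST-sized inputs, and this is illustrative rather than the exact code from the repo):

import tensorflow as tf  # TF 1.x-style API

# hypothetical variables from the small trained network
w1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
w2 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    # ... train the small network for 75 epochs here ...

    # pull the trained weights out of the graph,
    w1_val, b1_val, w2_val = session.run([w1, b1, w2])
    # widen the hidden layer with the numpy routine above,
    w1_new, b1_new, w2_new = net2wider(w1_val, b1_val, w2_val)
    # then build a wider graph initialised from the resized weights and
    # continue training from there instead of starting from scratch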


Here are the final results for the different numbers of hidden nodes:

1st layer    2nd layer    Train accuracy    Test accuracy
100          100          99.04%            93.47%
150          100          99.29%            93.37%
150          150          99.01%            93.58%
200          100          99.31%            93.69%
200          150          98.99%            93.63%
200          200          99.17%            93.54%

Note: don't take this as the best setup for MNIST; the results could still be improved by longer training, dropout to stop overfitting, batch normalization, etc.

Sunday, May 15, 2016

PyDataLondon 2016

Last week I gave a talk at PyDataLondon 2016, hosted at the Bloomberg offices in central London. If you don't know anything about PyData, it is a community of Python data science enthusiasts that runs various meetups and conferences across the world. If you're interested in that sort of thing and they are running something near you, I would highly recommend checking it out.


Below is the YouTube video for my talk and this is the associated GitHub, which includes all the example code.





The complete collection of talks from the conference is here. The standard across the board was very high, but if you only have time to watch a few, here are some of those I saw that you might find interesting.


Vincent D Warmerdam - The Duct Tape of Heroes Bayesian statistics

Bayesian statistics is a fascinating subject with many applications. If you're trying to understand deep learning, at a certain point research papers such as Auto-Encoding Variational Bayes and Auxiliary Deep Generative Models will stop making any kind of sense unless you have a good understanding of Bayesian statistics (and even if you do, it can still be a struggle). This video works as a good introduction to the subject. His blog is also quite good.


Geoffrey French & Calvin Giles - Deep learning tutorial - advanced techniques

This has a good overview of useful techniques, mostly around computer vision (though they could be applied in other areas), such as computing the saliency of inputs in determining a classification, and getting good classifications when there is only limited labelled data.


Ricardo Pio Monti - Modelling a text corpus using Deep Boltzmann Machines in python

This gives a good explanation of how a Restricted/Deep Boltzmann Machine works and then shows an interesting application where a Deep Boltzmann Machine was used to cluster groups of research papers.

Monday, May 2, 2016

Mini-Pong and Half-Pong

I'm going to be giving a talk/tutorial at PyDataLondon 2016 on Friday the 6th of May. If you're in London that weekend I would recommend going; there are going to be lots of interesting talks, and if you do go, please say hi.

My talk is going to be a hands-on, step-by-step guide to building a Pong-playing AI using Q-learning. Unfortunately, training agents for even very simple games still takes ages, and I really wanted to have something training while I do the talk, so I've built two little games that I hope should train a bit faster.
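
For anyone new to Q-learning, the heart of it is a simple value-update rule. Here is a minimal tabular sketch (the talk itself uses a neural network to approximate Q from screen pixels, but the update is the same idea, and all the names here are illustrative):

import random

q_table = {}          # maps (state, action) -> estimated long-term value
actions = [-1, 0, 1]  # e.g. move the paddle up, stay still, move down
alpha, gamma, epsilon = 0.1, 0.99, 0.1

def choose_action(state):
    # epsilon-greedy: mostly exploit what we know, sometimes explore
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table.get((state, a), 0.0))

def q_update(state, action, reward, next_state):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)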

Mini-Pong


This is a version of Pong with some of the visual noise stripped out: no on-screen score, no lines around the board. Also, when you start you can pass args for the screen width and height, and the gameplay should scale with these. This means you can run it at an 80x80 screen size (or even 40x40) and save having to downsize the image when processing.

Half-Pong

This is an even kinder game than Pong. There is only the player's paddle, and you get points just for hitting the other side of the screen. I've found that if you fiddle with the parameters you can start to see reasonable performance in the game within an hour of training (results may vary, massively). That said, even after significant training, the kinds of results I see are some way off how well Google DeepMind report doing. Possibly they are using other tricks not reported in the paper, or just lots of hyperparameter tuning, or there are still more bugs in my implementation (entirely possible; if anyone finds any, please submit them).

I've also checked in some checkpoints of a trained Half-Pong player, in case anyone just wants to quickly see it running. Simply run this, from the examples directory:



It performs significantly better than random, though still looks pretty bad compared to a human. 

Distance from building our future robot overlords, still significant.