Exploring Neural Networks, p. 3

(see part 2 here)

So eventually I got to analyse how the training mini-batch size affects a network that uses Batch Normalisation. There are several factors at play here:

  • A larger batch size is good for normalisation – the more samples we normalise over, the more accurate the estimates of the batch mean and variance become. In effect, a large mini-batch size should make these estimates vary less between mini-batches (see the sketch below this list).
  • A smaller batch size results in more frequent (though noisier) stochastic gradient descent steps, which may increase learning speed and the final success rate.
  • It is computationally cheaper to process large batches, because of the parallel nature of modern hardware (especially GPUs).
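To make the first point concrete, here is a minimal sketch (my own, in C++ – not the actual code used in my experiments) of how Batch Normalisation estimates statistics over a single mini-batch. The learnable scale and shift parameters are omitted; the point is only that the mean and variance are computed from however many samples the mini-batch contains, so small batches give noisy estimates:

#include <cmath>
#include <cstdio>
#include <vector>

// Normalise one feature across a mini-batch, in place.
void batch_normalise(std::vector<float>& x, float epsilon = 1e-5f) {
    const float n = static_cast<float>(x.size());  // mini-batch size

    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= n;

    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;  // with few samples, this estimate is very noisy

    // The larger the mini-batch, the closer mean and var are to the
    // statistics of the whole training set.
    for (float& v : x) v = (v - mean) / std::sqrt(var + epsilon);
}

int main() {
    std::vector<float> batch = {1.0f, 2.0f, 3.0f, 4.0f};
    batch_normalise(batch);
    for (float v : batch) std::printf("%f\n", v);
}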

Supposedly there might be an optimal mini-batch size for Batch Normalisation. In order to find it, I tested the same network again with various mini-batch sizes, observed its performance, averaged the results from multiple runs, and plotted them.

Read the rest of this entry »

Exploring Neural Networks, p. 2

(see part 1 here)

Once I had fixed all the bugs in my Batch Normalisation implementation and fine-tuned all the parameters, I started getting reasonable results. In particular, it turned out that I needed to increase the weight decay constant significantly (more than tenfold). I also had to modify the learning rate schedule so that it decays much faster; this makes sense, because Batch Normalisation is supposed to speed up learning. Eventually, the network:

3 channels ->  64 3x3 convolutions -> 3x3 maxpool -> BN -> ReLU
           -> 128 3x3 convolutions -> 2x2 maxpool -> BN -> ReLU
           -> 1024 to 1024 product ->                BN -> ReLU
           -> 1024 to  512 product ->                BN -> Sigmoid
           ->  512 to   10 product
           ->  SoftMax

achieved a 79% success rate on the test set.
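As a side note on the faster-decaying learning rate mentioned above: the exact schedule is not spelled out in this excerpt, but for illustration a simple exponential decay looks like this (all constants below are placeholders, not the values from my experiments):

#include <cmath>
#include <cstdio>

int main() {
    // Placeholder constants, purely for illustration.
    const double base_lr = 0.1;
    const double decay   = 0.8;  // "fast"; a slow schedule might use ~0.98
    for (int epoch = 0; epoch < 10; ++epoch)
        std::printf("epoch %2d: lr = %.6f\n",
                    epoch, base_lr * std::pow(decay, epoch));
}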

I was interested in the advantage of using BN. To investigate it, I created another network – an identical clone of the one described above, but with no Batch Normalisation performed at all. Comparing the results of these two networks should express the gain introduced by using BN.

Read the rest of this entry »

Exploring Neural Networks, p. 1

As a final assignment on the Neural Networks course I took part in (University of Wrocław, Institute of Computer Science, winter 2015/2016), I am tasked with designing, implementing and training a neural net that classifies CIFAR-10 images with a reasonable success rate. I am also encouraged to experiment with the network by implementing some of the recent inventions that may, in one way or another, improve its performance. I will be sharing my results and observations here, in this post and in a few more that will follow within the next two weeks.

The source code I am using for my experiments is available on GitHub. The sources come with a number of utilities that simplify running them on our lab's computers, which may come in handy if you are a fellow student peeking at my progress; if you are not, you should ignore all files except those in the ./project directory.

Read the rest of this entry »

Current progress on AlgAudio

… or “what I’ve been working on for the past three months”.

So this summer I participated in a programming internship at the Audiovisual Technology Center – CeTA in Wrocław. CeTA is developing a number of very exciting projects, and the one I had the pleasure to work on is AlgAudio.

screenshot

(download links available below)

AlgAudio is a new signal processing framework that we’ve been developing from scratch. The user builds an audio processing network by placing “building blocks” of simple operations, connecting them together, configuring their parameters, and defining how the parameters should influence each other. The network works in real time, so any changes to the parameters are immediately reflected in the output audio. This makes AlgAudio a perfect tool for live performances.
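To give a rough idea of what “parameters influencing each other” means, here is a tiny sketch of a parameter-driven dataflow graph. The types below are hypothetical and heavily simplified – this is not AlgAudio’s actual API:

// Hypothetical types, for illustration only.
#include <functional>
#include <vector>

struct Param {
    float value = 0.0f;
    std::vector<std::function<void(float)>> listeners;

    void set(float v) {
        value = v;
        for (auto& f : listeners) f(v);  // changes propagate immediately
    }
};

// Define how one parameter should influence another,
// e.g. a filter cutoff following an oscillator frequency.
void link(Param& source, Param& target, float scale) {
    source.listeners.push_back(
        [&target, scale](float v) { target.set(v * scale); });
}

int main() {
    Param freq, cutoff;
    link(freq, cutoff, 2.0f);
    freq.set(440.0f);  // cutoff becomes 880.0f with no manual update
}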

Read the rest of this entry »

C++11: std::threads managed by a designated class

Recently I noticed an unobvious problem that may appear when using std::threads as class fields. I believe it is easy to run into if one is not careful when implementing C++ classes, due to its tricky nature. Its solution also provides an elegant example of what has to be considered when working with threads in object-oriented C++, so I decided to share it.

Consider a scenario where we would like to implement a class that represents a particular thread activity. We would like it to:

  • start the thread it manages when an instance is constructed
  • stop that thread when the instance is destroyed

I will present the obvious implementation, explain the problem with it, and describe how to deal with it.
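As a taste of what is discussed, here is a minimal sketch of such a class – my own simplified version; the details, and the subtle problem itself, are covered in the rest of the entry:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

class Worker {
public:
    // The managed thread starts as soon as an instance is constructed.
    Worker() : stop_(false), thread_(&Worker::run, this) {}

    ~Worker() {
        stop_ = true;    // ask the thread to finish...
        thread_.join();  // ...and wait until it actually does
    }

private:
    void run() {
        while (!stop_) {
            std::cout << "working...\n";
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
        }
    }

    std::atomic<bool> stop_;
    // Declared after stop_ on purpose: members are initialised in
    // declaration order, and the thread body reads stop_.
    std::thread thread_;
};

int main() {
    Worker w;  // thread starts here
    std::this_thread::sleep_for(std::chrono::milliseconds(350));
}  // thread is stopped and joined here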

Read the rest of this entry »

Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs

This post assumes some basic C skills.

Linux puts you in full control. Not everyone sees it this way, but a power user loves to be in control. I’m going to show you a basic trick that lets you heavily influence the behavior of most applications, which is not only fun, but also, at times, useful.

A motivational example

Let us begin with a simple example. Fun first, science later.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(){
  srand(time(NULL));  /* seed the generator with the current time */
  int i = 10;
  while(i--) printf("%d\n",rand()%100);  /* print ten numbers in 0..99 */
  return 0;
}

Simple enough, I believe. I compiled it with no special flags, just

gcc random_num.c -o random_num

I hope the resulting output is obvious: ten randomly selected numbers in the range 0–99, hopefully different each time you run the program.

Now let’s pretend we don’t really have the source of this executable. Either delete the source file or move it somewhere else – we won’t need it. We will significantly modify this program’s behavior, without touching its source code or recompiling it.

For this, let’s create another simple C file:

int rand(){
    return 42; //the most random number in the universe
}

We’ll compile it into a shared library.

gcc -shared -fPIC unrandom.c -o unrandom.so

So what we have now is an application that outputs some random data, and a custom library which implements the rand() function so that it always returns 42. Now… just run random_num this way, and watch the result:

LD_PRELOAD=$PWD/unrandom.so ./random_num

If you are lazy and did not do it yourself (and somehow failed to guess what must have happened), I’ll let you know: the output consists of ten 42s.
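One more taste of what preloading makes possible before you read on: instead of replacing a function outright, a preloaded library can wrap the original one, by asking the dynamic linker for the next definition of the symbol. A sketch of my own (in C++ this time, with extern "C" keeping the symbol unmangled; RTLD_NEXT is a GNU extension, and the file name is made up):

// spy_rand.cpp -- compile with:
//   g++ -shared -fPIC spy_rand.cpp -o spy_rand.so -ldl
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <cstdio>

extern "C" int rand(void) {
    // Look up the *next* rand() in library search order,
    // i.e. the real one from libc that this library shadows.
    static auto real_rand =
        reinterpret_cast<int (*)(void)>(dlsym(RTLD_NEXT, "rand"));
    int result = real_rand();
    std::fprintf(stderr, "rand() called, real result: %d\n", result);
    return result;  // pass the real value through, after logging it
}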

Read the rest of this entry »

There is something wrong with the new UDS system.

When I read the news about Canonical’s decision to change the way the Ubuntu Developer Summit works (original announcement here), I was totally astonished. I expected this change would cause a lot of buzz within the community, especially given that many recent Canonical decisions are considered very controversial. The decision itself surprises me greatly, as I can spot a large number of problems it may cause, as well as problems with the way the decision was handled. Jono Bacon’s article explaining the decision did not satisfy me either: it explains the general reasoning behind the idea, but it does not clarify everything.

Read the rest of this entry »