# 01 Intro
APRIL-ANN (A Pattern Recognizer In Lua with Artificial Neural Networks) is more than an ANN toolkit: it is a pattern recognition project. Simple Lua scripts can be written to run ANN experiments. Some examples are shown below.
Take note that APRIL-ANN offers inline help with three basic commands:

```lua
april_help(...)
april_dir(...)
april_list(...)
```
The `april_help(object)` function takes an object (a Lua table, function, userdata, ...) as a parameter and shows the corresponding help via standard output.

The `april_dir(object)` function also takes an object as a parameter and shows the corresponding help via standard output. It is the same as `april_help`, but a lot less verbose.

The `april_list(table)` function takes a table and shows its content using the `pairs` function. It has nothing to do with inline help, but it is useful in a lot of circumstances when developing scripts.
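For instance, `april_list` can be tried on any plain Lua table (the table below is made up for illustration):

```
> t = { learning_rate = 0.01, momentum = 0.5 }
> april_list(t)
```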
Play a little with it: execute `april_help(ann.components)`, after that `april_help(ann.components.base)`, and see what happens ;)
If you want to access the documentation of instance methods, you have two ways:
- Use the `..` operator (that is, the `__concat` metamethod) with a class table plus a method name string:

```
> april_help(ann.components.base .. "forward")
method    Computes forward step with the given token
description: Computes forward step with the given token
parameters:
  1 An input token (usually a matrix)
  2 A boolean indicating if the forward is
    during_training or not. This information is used by
    ann.components.actf objects to apply dropout
    during training, and to halve the activation during
    validation (or test). It is [optional], by default
    is false.
outputs:
  1 An output token (usually a matrix)
```
- Declare an instance of the class and execute `april_help(obj)`:

```
> c = ann.components.base()
> april_help(c.forward)
method    Computes forward step with the given token
description: Computes forward step with the given token
parameters:
  1 An input token (usually a matrix)
  2 A boolean indicating if the forward is
    during_training or not. This information is used by
    ann.components.actf objects to apply dropout
    during training, and to halve the activation during
    validation (or test). It is [optional], by default
    is false.
outputs:
  1 An output token (usually a matrix)
```
APRIL-ANN incorporates an adaptation of the [Lua rlcompleter module](https://github.com/rrthomas/lua-rlcompleter) for Lua 5.2 and for the APRIL-ANN object-oriented implementation. It auto-completes pathnames, global names, table fields, and object methods when the `<tab>` key is pressed:
```
> matrix.<tab><tab>
_NAME                dict           fromString       ...
_VERSION             fromFilename   fromTabFilename  ...
__sliding_window__   fromHEX        join             ...
as                   fromMMap       loadImage        ...
...                  ...            ...
> matrix.fromTab<tab>
> matrix.fromTabFilename<tab><tab>
fromTabFilename
> matrix.fromTabFilename
```
Almost any object can be serialized to a disk file, a string, or a stream using the `util.serialize()` function. Similarly, it can be deserialized using the `util.deserialize()` function.
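A minimal sketch, assuming `util.serialize` takes the object plus a destination filename and `util.deserialize` takes that filename (the matrix constructor arguments here are only illustrative):

```lua
m = matrix(2,2,{1,2,3,4})      -- any serializable object
util.serialize(m, "m.lua")     -- write it to a disk file
m2 = util.deserialize("m.lua") -- read it back
```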
## XOR problem

The code described here is at the repo path `EXAMPLES/xor.lua`. First, we need to create an ANN component object which will be trained:

```lua
thenet = ann.mlp.all_all.generate("2 inputs 2 logistic 1 logistic")
```
The object `thenet` is a Multilayer Perceptron (MLP) with 2 inputs, a hidden layer of 2 neurons with the logistic activation function, and 1 output neuron with the logistic activation function. Several activation functions are available: logistic, tanh, linear, softmax, log_logistic, sin, softsign, softplus, ... (see `april_help(ann.components.actf)`).
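Deeper topologies are described the same way; this sketch just combines activation names from the list above (the layer sizes are made up):

```lua
-- 10 inputs, two hidden layers of 32 tanh units, 5 softmax outputs
another_net = ann.mlp.all_all.generate("10 inputs 32 tanh 32 tanh 5 softmax")
```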
Now, in order to develop scripts easily and quickly, a trainer helper wrapper can be used:

```lua
bunch_size = 4
trainer = trainable.supervised_trainer(thenet, ann.loss.mse(1), bunch_size)
```

The trainer needs the ANN component, the loss function, and the bunch size. The bunch size is the same as the mini-batch size: it is used to train several patterns at the same time, increasing the speed of the experiment. Values between 32 and 64 are typically used, but in this example only 4 is possible, since the XOR problem consists of just 4 patterns.
The next step is to build the component and randomize its weights:

```lua
trainer:build()
trainer:randomize_weights{
  random = random(1234),
  inf    = -0.1,
  sup    =  0.1 }
```

The weights will be initialized uniformly in the range [inf, sup], using the given `random` object with 1234 as random seed. It is also possible to scale the initialization of each layer by its fan-in and/or fan-out (see the `use_fanin` and `use_fanout` flags in the DIGITS example below).
The component has several learning parameters which need to be configured:

```lua
trainer:set_option("learning_rate", 1.0)
trainer:set_option("momentum", 0.5)
trainer:set_layerwise_option("w.*", "weight_decay", 1e-05)
```
Data to train the ANN is defined using `matrix` and `dataset` objects. It is possible to build the XOR problem on a `matrix` and use it as training `dataset`s:

```lua
m_xor = matrix.fromString[[
4 3
ascii
0 0 0
0 1 1
1 0 1
1 1 0
]]
ds_input  = dataset.matrix(m_xor, {patternSize={1,2}})
ds_output = dataset.matrix(m_xor, {offset={0,2}, patternSize={1,1}})
```
The variable `m_xor` is a matrix object, loaded from the given string. `ds_input` is a `dataset.matrix` object, which traverses the matrix by rows, computing a sliding window of `patternSize={1,2}`. The desired output of the ANN is another `dataset.matrix`, but in this case the sliding window has size (1,1) and skips the first two columns (`offset={0,2}`).
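A quick sanity check of the datasets might look like this (a sketch assuming the standard `numPatterns`/`getPattern` dataset methods, which return the number of patterns and a pattern as a Lua table):

```lua
print(ds_input:numPatterns())                     -- 4
print(table.concat(ds_input:getPattern(2), " "))  -- 0 1
print(table.concat(ds_output:getPattern(2), " ")) -- 1
```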
Finally, we need to train the ANN:

```lua
-- train_error is used instead of error, to avoid shadowing Lua's
-- global error function
for i=1,10000 do
  local train_error = trainer:train_dataset{ input_dataset  = ds_input,
                                             output_dataset = ds_output }
  print(i, train_error)
end
```
This code trains the ANN for 10,000 epochs, feeding it with `input_dataset` and using the given `output_dataset` as desired output. Patterns are grouped into mini-batches of size 4 (`bunch_size`), and each training epoch is a pass over the full dataset.
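After training, the final loss over the same patterns can be measured with `trainer:validate_dataset`, which computes the loss without updating the weights (the same method is used for validation in the DIGITS example below):

```lua
local final_loss = trainer:validate_dataset{ input_dataset  = ds_input,
                                             output_dataset = ds_output }
print("final loss:", final_loss)
```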
This simple example gives you some insight into how to use the APRIL-ANN toolkit, but it is not enough for more complicated problems. The next section explains the DIGITS problem, which trains an ANN to classify handwritten digits.
## DIGITS task

The task addressed in this section is the classification of handwritten digits. The code is at `EXAMPLES/digits.lua`, and can be executed with the command `april-ann digits.lua`. This task uses as data a large PNG image with handwritten digits arranged in rows and columns. Each column corresponds to a digit class (from 0 to 9), and each row contains 10 examples (one for each class). There are 1000 patterns (100 for each class). So, first the image is loaded using the following code, and converted to a matrix where 0 represents white and 1 represents black:

```lua
digits_image = ImageIO.read(string.get_path(arg[0]).."digits.png")
m1 = digits_image:to_grayscale():invert_colors():matrix()
```
This code uses the `ImageIO.read` function to load the PNG image (you need to compile the libpng package), and the `string.get_path` function to find where the file is located. The image is converted to grayscale, its colors are inverted so that 0=white and 1=black, and finally the corresponding matrix of the image is generated.
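With 16x16 digits arranged in 100 rows by 10 columns, the loaded matrix should be 1600x160; a quick check (assuming the matrix `dim()` method, which returns the dimensions as a Lua table):

```lua
print(table.concat(m1:dim(), "x"))  -- expected: 1600x160
```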
Second, the training input and output dataset are generated following this code:
-- TRAINING --
train_input = dataset.matrix(m1,
{
patternSize = {16,16},
offset = {0,0},
numSteps = {80,10},
stepSize = {16,16},
orderStep = {1,0}
})
-- a simple matrix for the desired output
m2 = matrix(10,{1,0,0,0,0,0,0,0,0,0})
-- a circular dataset which advances with step -1
train_output = dataset.matrix(m2,
{
patternSize = {10},
offset = {0},
numSteps = {800},
stepSize = {-1},
circular = {true}
})
This is a more complicated example of how to create datasets from matrices. The variable `train_input` is a `dataset.matrix` generated by a sliding window of size 16x16 (the size of one digit), which moves in steps of 16x16 (first 16 over columns, and when it arrives at the end, it moves 16 over rows and returns to column 0). The number of steps (`numSteps`) is 80 over rows and 10 over columns. The output dataset needs a special matrix which contains only one 1 and nine 0s, so that the position of the 1 in each pattern corresponds to its class. The `dataset.matrix` in this case slides backwards (`stepSize={-1}`), so the 1 moves forward, and it is circular (window positions out of the matrix take the values of the opposite matrix positions). It has 800 patterns (80x10).
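To see the effect of the backward circular window, one can print the first desired-output patterns (again assuming the standard `getPattern` method): the 1 advances one position per pattern because the window slides backwards over a circular matrix.

```lua
print(table.concat(train_output:getPattern(1), " ")) -- 1 0 0 0 0 0 0 0 0 0
print(table.concat(train_output:getPattern(2), " ")) -- 0 1 0 0 0 0 0 0 0 0
print(table.concat(train_output:getPattern(3), " ")) -- 0 0 1 0 0 0 0 0 0 0
```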
For the validation datasets the script is similar:

```lua
-- VALIDATION --
val_input = dataset.matrix(m1,
                           {
                             patternSize = {16,16},
                             offset      = {1280,0},
                             numSteps    = {20,10},
                             stepSize    = {16,16},
                             orderStep   = {1,0}
                           })
val_output = dataset.matrix(m2,
                            {
                              patternSize = {10},
                              offset      = {0},
                              numSteps    = {200},
                              stepSize    = {-1},
                              circular    = {true}
                            })
```
However, in this case the `val_input` dataset needs a non-zero `offset` parameter, because the validation patterns are the last 200 (the window begins at image row 1280). The first 800 digits are used for training.
The MLP is generated following the same steps as for XOR, but in this case the topology description string uses tanh activation in the hidden layer and log_softmax activation in the output layer. Here the `use_fanin` and `use_fanout` flags are set to true, and the loss function is `multi_class_cross_entropy`, a version of the cross-entropy loss which is mathematically simplified for log_softmax output activation functions (if you use another output activation you must use `mse`). The two-class version of cross-entropy (`ann.loss.cross_entropy`) is simplified to be used with log_logistic outputs:
```lua
bunch_size = 64
thenet  = ann.mlp.all_all.generate("256 inputs 128 tanh 10 log_softmax")
trainer = trainable.supervised_trainer(thenet,
                                       ann.loss.multi_class_cross_entropy(),
                                       bunch_size)
trainer:build()
trainer:randomize_weights{
  random     = random(52324),
  use_fanin  = true,
  use_fanout = true,
  inf        = -1,
  sup        =  1,
}

trainer:set_option("learning_rate", 0.01)
trainer:set_option("momentum",      0.01)
trainer:set_layerwise_option("w.*", "weight_decay", 1e-05)
```
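For reference, a binary task would pair a log_logistic output with the two-class loss mentioned above; a minimal sketch (the layer sizes are illustrative assumptions):

```lua
-- one log_logistic output unit paired with ann.loss.cross_entropy
binary_net     = ann.mlp.all_all.generate("256 inputs 128 tanh 1 log_logistic")
binary_trainer = trainable.supervised_trainer(binary_net,
                                              ann.loss.cross_entropy(),
                                              bunch_size)
```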
For training, we need to declare a table which contains the input/output dataset pair and some specific parameters (e.g. a shuffle `random` object to train each epoch with a different permutation of the patterns):

```lua
training_data = {
  input_dataset  = train_input,
  output_dataset = train_output,
  shuffle        = random(25234),
}

validation_data = {
  input_dataset  = val_input,
  output_dataset = val_output,
}
```
The final snippet trains the MLP using holdout validation, following a stopping criterion which depends on the ratio current_epoch/best_validation_epoch: when this ratio is greater than 2, training is stopped (that is, training will stop at epoch 200 if the last best validation epoch is 100, and at epoch 400 if the last best validation epoch is 200). The stopping criterion is built with the helper function `trainable.stopping_criteria.make_max_epochs_wo_imp_relative`, and the MLP is trained using the class `trainable.train_holdout_validation`. This class receives a table whose fields are self-explanatory, follows a holdout-validation algorithm in its `execute` method, and after each epoch its `get_state_string` method is used for printing.
print("# Epoch Training Validation BestEpoch BestValidation")
stopping_criterion =
trainable.stopping_criteria.make_max_epochs_wo_imp_relative(2)
train_func = trainable.train_holdout_validation{
min_epochs = 4,
max_epochs = 1000,
stopping_criterion = stopping_criterion,
}
clock = util.stopwatch()
clock:go()
epoch_function = function()
local tr_loss = trainer:train_dataset(training_data)
local va_loss = trainer:validate_dataset(validation_data)
return trainer,tr_loss,va_loss
end
while train_func:execute(epoch_function) do
print(train_func:get_state_string())
end
clock:stop()
cpu,wall = clock:read()
num_epochs = result.last_epoch
printf("# Wall total time: %.3f per epoch: %.3f\n", wall, wall/num_epochs)
printf("# CPU total time: %.3f per epoch: %.3f\n", cpu, cpu/num_epochs)
printf("# Validation error: %f", result.best_val_error)
This introduction has shown the basic steps to write and execute scripts for pattern recognition using ANNs and the APRIL-ANN toolkit. Please, feel free to use these scripts as initial templates for your own ;)
## Features

APRIL-ANN has a lot of interesting features. The following list shows the most important ones, which are detailed in the following sections of this documentation:
- Multidimensional `matrix` library. It allows efficient mathematical operations in Lua.
- Abstract token definition. A token represents anything, and is used in several parts of the toolkit for information interchange: `matrix` instances can be wrapped into a `tokens.matrix` instance, and they are interchangeable in ANN components.
- Dataset abstraction. It has the ability to build powerful sliding windows over matrices. At the same time, it is possible to filter datasets, producing new datasets on-the-fly. Two abstractions exist: `dataset` and `dataset.token`.
- Artificial neural networks. Different packages are implemented to perform efficient training of ANNs, built around three main concepts: ANN component, loss function, and optimization algorithm.
- Trainable package. This package ties together all the ANN machinery, and is a good starting point to work with ANNs. It implements a lot of useful code for introspection, training, and testing.
- Random package. The generation of pseudo-random numbers lives in this package.
- Automatic differentiation. For more advanced machine learning, an experimental library for automatic differentiation has been added. It allows specifying much more general models than the ANN abstraction, but with an important loss in efficiency. However, it is useful for trying out research ideas with little implementation effort, before implementing them as ANNs.
- Matlab package. It allows loading (not saving) matrices and data in MAT format. It is still in an experimental phase, but the most important features are available.
- Statistics package. Look here for standard statistical techniques: PCA, running mean and variance computation, Pearson correlation, ...
- Complex numbers. In an experimental phase, APRIL-ANN allows working with complex numbers and complex matrices.
- Util package. It contains a lot of utilities for Lua script development.
- GZIO package. This is a binding of zlib for loading/saving compressed files.
- Image and ImageIO packages. The Image class allows working with color or grayscale images. The ImageIO package implements useful functions for generic read/write of images, depending on their file extension.