ann.autoencoders
Package autoencoders could be loaded via the standalone binary, or in Lua with require("aprilann.autoencoders").
Stacked Denoising Auto-Encoders (SDAE) are a kind of deep neural network which is pre-trained following a greedy layerwise algorithm, introducing noise at the input of each layerwise auto-encoder. Some helper functions are implemented to ease the training of SDAE.
Greedy layerwise pre-training consists of training each pair of layers, from input to output, in a greedy way (see the SDAE paper: Vincent et al., 2010). Pre-training receives as input a table with the parameters of the training algorithm. For example, a table like this:
layers = {
{ size= 256, actf="logistic"}, -- INPUT
{ size= 256, actf="logistic"}, -- FIRST HIDDEN LAYER
{ size= 128, actf="logistic"}, -- SECOND HIDDEN LAYER
{ size= 32, actf="logistic"}, -- THIRD HIDDEN LAYER
}
perturbation_random = random(824283)
params_pretrain = {
input_dataset = train_input, -- a dataset which is the input of the autoencoders
replacement = nil, -- a number (or nil) indicating replacement
on_the_fly = false, -- a boolean (or nil) for on-the-fly
shuffle_random = random(1234), -- for shuffling during backpropagation
weights_random = random(7890), -- for weights random initialization
layers = layers, -- layers description
supervised_layer = { size = 10, actf = "log_softmax" }, -- it is possible to pre-train the supervised layer
output_datasets = { train_output }, -- the output dataset
bunch_size = bunch_size, -- the size of the mini-batch
optimizer = function() return ann.optimizer.sgd() end, -- optimizer function
training_options = { -- this table contains learning options and dataset noise filters
-- global options
global = {
-- pure ANN learning hyperparameters
ann_options = { learning_rate = 0.01,
momentum = 0.02,
weight_decay = 1e-05 },
-- noise filters (a pipeline of filters applied to the input, in order); each one is a function that returns a dataset
noise_pipeline = { function(ds) return dataset.perturbation{ -- gaussian noise
dataset = ds,
mean = 0, -- gaussian mean
variance = 0.01, -- gaussian variance
random = perturbation_random } end,
function(ds) return dataset.salt_noise{ -- salt noise (or mask noise)
dataset = ds,
vd = 0.10, -- percentage of values masked
zero = 0.0, -- mask value
random = perturbation_random } end },
min_epochs = 4,
max_epochs = 200,
pretraining_percentage_stopping_criterion = 0.01,
},
-- it is possible to overwrite global values with layerwise dependent values (also noise_pipeline)
layerwise = { { min_epochs=50 }, -- first autoencoder pretraining
{ min_epochs=20 }, -- second autoencoder pretraining
{ ann_options = { learning_rate = 0.04,
momentum = 0.02,
weight_decay = 4e-05 },
min_epochs=20 }, -- third autoencoder pretraining
{ min_epochs=10 }, }, -- supervised pretraining
}
}
Fields supervised_layer and output_datasets are optional. If they are given, the last layer will be pre-trained in a supervised manner; the rest of the layers are pre-trained in an unsupervised manner.
If the field input_dataset is supplied, then the distribution field is forbidden and, when pre-training the supervised layer, the output_datasets table must contain only one element.
If the field distribution is supplied, then input_dataset is forbidden and, when pre-training the supervised layer, the output_datasets table has the same number of items as the distribution table. In this last case, each item output_datasets[i] is the corresponding supervised output dataset for each item of distribution[i].input_dataset.
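As an illustration of the second case, a hypothetical parameters table could replace input_dataset with a distribution table, as in the following sketch. The names train_input_a, train_input_b, train_output_a and train_output_b are placeholders, and the prob weighting field name is an assumption; check the package documentation for the exact field names.
-- hypothetical sketch: two sampling distributions instead of a single input_dataset
params_pretrain.input_dataset = nil
params_pretrain.distribution  = {
  { input_dataset = train_input_a, prob = 0.7 }, -- prob is an assumed field name
  { input_dataset = train_input_b, prob = 0.3 },
}
-- one supervised output dataset per distribution item
params_pretrain.output_datasets = { train_output_a, train_output_b }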
This table is passed as argument to the algorithm:
sdae_table,deep_net = ann.autoencoders.greedy_layerwise_pretraining(params_pretrain)
This function returns one or two tables:
- sdae_table = { bias={ ... }, weights={ ... } }: contains the bias and weights of each unsupervised pre-trained layer.
- deep_net: an ANN component which could be used for fine-tuning training. If you did not pre-train the supervised layer, you need to manually push the supervised layer onto this component.
### Building codifier from SDAE table ###
codifier_net = ann.autoencoders.build_codifier_from_sdae_table(sdae_table,
bunch_size,
layers)
The codifier is the SDAE without the supervised layer at the output. It needs the same layers definition as the greedy pre-training function. It returns an ANN object which receives a pattern as input and produces its encoding.
### Fine-tuning supervised deep ANN ###
The supervised deep ANN could be fine-tuned using a cross-validation training algorithm. If you pre-trained the supervised layer, the deep_net object is directly the whole ANN. Otherwise, you will need to add a new layer to the codifier_net, as in this example:
-- if you want, you could clone the deep_net to keep it as it is
local codifier_net = deep_net:clone()
codifier_net:build{ weights = deep_net:copy_weights() }
-- We add an output layer with 10 neurons and softmax activation function
local last_layer = ann.components.hyperplane{
dot_product_weights="lastw",
bias_weights="lastb",
output=10
}
deep_net:push( last_layer )
deep_net:push( ann.components.actf.log_softmax() )
trainer = trainable.supervised_trainer(deep_net, loss_function or nil, bunch_size or nil)
-- The output size needs to be overwritten, so it needs to be given at the build method
trainer:build{ output = 10 }
weights_random = random(SEED)
-- Now, THERE EXIST TWO WAYS to randomize the weights of last_layer
-- FIRST using the trainer
trainer:randomize_weights{
name_match="^last[bw]$", -- only randomize connection objects whose name matches
inf=-0.1,
sup=0.1,
random=weights_random
}
-- SECOND using the component
-- (BE CAREFUL AND USE ONLY ONE OF THESE WAYS)
for _,cnn in pairs(last_layer:copy_weights()) do
  cnn:randomize_weights{
    inf=-0.1,
    sup=0.1,
    random=weights_random
  }
end
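After the weights of the new layer are initialized, fine-tuning proceeds as ordinary supervised training of deep_net. The following is a minimal sketch, assuming training and validation datasets named train_input, train_output, val_input and val_output, and assuming the set_option, train_dataset and validate_dataset methods of trainable.supervised_trainer; the hyperparameter and epoch values are illustrative only.
-- minimal fine-tuning sketch (illustrative hyperparameters, assumed datasets)
trainer:set_option("learning_rate", 0.01)
trainer:set_option("momentum",      0.02)
trainer:set_option("weight_decay",  1e-05)
local shuffle_random = random(9876)
for epoch=1,100 do
  -- one epoch over the training data
  local train_loss = trainer:train_dataset{ input_dataset  = train_input,
                                            output_dataset = train_output,
                                            shuffle        = shuffle_random }
  -- loss over the held-out validation data
  local val_loss   = trainer:validate_dataset{ input_dataset  = val_input,
                                               output_dataset = val_output }
  print(string.format("%4d train=%.6f val=%.6f", epoch, train_loss, val_loss))
end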
### Compute encoding ###
With a trained SDAE (without supervised layer), it is possible to compute encodings of input patterns using this function:
trainer = trainable.supervised_trainer(codifier_net)
encoded_dataset = trainer:use_dataset(input_dataset)
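The returned encoded_dataset is a regular dataset object, so the encodings can be inspected pattern by pattern or used as the input of another model. A small sketch, assuming the usual numPatterns, patternSize and getPattern dataset methods:
-- number of encoded patterns and the size of each encoding
print(encoded_dataset:numPatterns(), encoded_dataset:patternSize())
-- the encoding of the first pattern, as a plain Lua table of numbers
local first_code = encoded_dataset:getPattern(1)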