Paco Zamora Martinez edited this page Jun 14, 2015 · 23 revisions

Introduction

Package util can be loaded via the standalone binary, or in Lua with require("aprilann.util"). This package is the most important and the most dangerous one: it extends the standard Lua tables with new functionality, and adds several utilities to the global table.

List of utilities added to Lua for scripting purposes.

Functions

Serialization and deserialization of objects

util.serialize

str = util.serialize(obj)

util.serialize(obj, filename)

util.serialize(obj, stream)

util.deserialize

obj = util.deserialize(str)

obj = util.deserialize(filename)

obj = util.deserialize(stream)

util.to_lua_string

str = util.to_lua_string(obj, format)

Functional programming extensions

Lua has been extended with new functions which work on top of Lua iterators. Basic concepts such as map, reduce, and filter have been implemented.

bind

func = bind(function, ...)

Freezes any of the positional arguments of a function. The arguments of bind can be nil, and the returned function merges its own arguments with the list given to bind, filling the nil gaps in order.

> f = bind(math.add, 2)
> = f(4)
6
> f = bind(math.div, nil, 3)
> = f(6)
2
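
The gap-filling behaviour described above can be sketched in plain Lua. This is only an illustration under stated assumptions: bind_sketch, add and div are hypothetical names, not the APRIL-ANN implementation, and it assumes that call-time arguments fill the nil gaps from left to right and that any remaining ones are appended.

```lua
-- Hypothetical re-implementation of bind, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local function bind_sketch(func, ...)
  local frozen, n = { ... }, select("#", ...)
  return function(...)
    local extra, m = { ... }, select("#", ...)
    local args, used = {}, 0
    -- fill the nil gaps of the frozen list with call-time arguments, in order
    for i = 1, n do
      if frozen[i] == nil and used < m then
        used = used + 1
        args[i] = extra[used]
      else
        args[i] = frozen[i]
      end
    end
    -- append the call-time arguments that were not consumed by a gap
    local total = n
    for j = used + 1, m do
      total = total + 1
      args[total] = extra[j]
    end
    return func(unpack(args, 1, total))
  end
end

-- stand-ins for math.add and math.div
local add = function(a, b) return a + b end
local div = function(a, b) return a / b end

print(bind_sketch(add, 2)(4))       -- 6
print(bind_sketch(div, nil, 3)(6))  -- 2 (shown as 2.0 on Lua 5.3 and later)
```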

reduce

whatever = reduce(function, initial_value, iterator)

The reduce function applies a binary function operator to pairs of values: the first argument is the value accumulated by the reduction up to the current iteration, and the second argument is the value at the current iteration. If the iterator returns two or more elements every time it is called, the second one is taken.

> value = reduce(math.min, math.huge, ipairs({4, 2, 1, 10}))
> print(value)
1
> value = reduce(function(acc,v) return acc*2+v end, 0, string.gmatch("01101", "." ))
> print(value)
13
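
A minimal sketch of such a reduction over a Lua iterator triplet follows. The name reduce_sketch is hypothetical and the code is not taken from APRIL-ANN; it only mirrors the rule stated above (use the second value when the iterator produces two or more, otherwise the first).

```lua
-- Hypothetical re-implementation of reduce, for illustration only.
local function reduce_sketch(func, initial, f, s, v)
  local acc = initial
  while true do
    local a, b = f(s, v)
    if a == nil then break end  -- iterator exhausted
    v = a                       -- the first value is the control variable
    -- with two or more values per iteration, the second one is reduced
    if b ~= nil then acc = func(acc, b) else acc = func(acc, a) end
  end
  return acc
end

print(reduce_sketch(math.min, math.huge, ipairs({ 4, 2, 1, 10 })))  -- 1
print(reduce_sketch(function(acc, bit) return acc * 2 + tonumber(bit) end,
                    0, string.gmatch("01101", ".")))                -- 13
```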

apply

apply(func, iterator)

Applies a function to all the elements produced by the iterator. The function is called with all the elements returned by each iterator call.

> t = { "a", "c", 3, 2 }
> apply(function(i,v1,v2) print(i,v1,v2) end, multiple_ipairs(t,t))
1	a       a
2	c       c
3	3       3
4	2       2

map

table = map(func, iterator)

Returns a table which is the result of applying the given function to all the items produced by the given iterator.

> tmapped = map(bind(math.mul, 2), ipairs({1, 2, 3, 4}))
> print(table.concat(tmapped, " "))
2 4 6 8

map2

table = map2(func, iterator)

The same as map, but the function receives the pair key,value.

> tmapped = map2(function(k,v) return k+v*2 end, ipairs({1, 2, 3, 4}))
> print(table.concat(tmapped, " "))
3 6 9 12

mapn

table = mapn(func, iterator)

The same as map2, but the function receives all the elements returned by the iterator at each iteration.

> tmapped = mapn(function(idx, ...) return table.pack(...) end,
>>          multiple_ipairs({1, 2, 3, 4},{5, 6, 7, 8}))
> for i,v in ipairs(tmapped) do print(i, table.concat(v, " ")) end
1	1 5
2	2 6
3	3 7
4	4 8

filter

table = filter(func, iterator)

Returns a table which contains only the elements produced by the iterator for which the given func function returns true. The function receives only one value.

> t = filter(function(v) return v%2 == 0 end, ipairs{1,2,3,4,5,6,7})
> print(table.concat(t, " "))
2 4 6
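
As a rough sketch, a filter over an iterator could look as follows. filter_sketch is a hypothetical name, and the code assumes (as in the example above with ipairs) that the predicate receives the second value of each iteration when there is one, and the first value otherwise.

```lua
-- Hypothetical re-implementation of filter, for illustration only.
local function filter_sketch(pred, f, s, v)
  local out = {}
  while true do
    local a, b = f(s, v)
    if a == nil then break end  -- iterator exhausted
    v = a
    -- assumption: the value is the second element when present
    local value = b
    if value == nil then value = a end
    if pred(value) then out[#out + 1] = value end
  end
  return out
end

local evens = filter_sketch(function(x) return x % 2 == 0 end,
                            ipairs({ 1, 2, 3, 4, 5, 6, 7 }))
print(table.concat(evens, " "))  -- 2 4 6
```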

iterable_map

another_iterator = iterable_map(func, iterator)

Returns an iterator which, every time it is called, maps the given function func over the next element of the given iterator. It allows multiple returned values from the given iterator (map and map2 only allow pairs key,value).

Additionally, by calling coroutine.yield(...), the mapping function can return more than one set of values at each iteration, which allows the implementation of ConcatMap iterators.

> -- standard map using iterable_map
> t = { Lemon = "sour", Cake = "nice", }
> for ingredient, modifier, taste in iterable_map(function(a, b)
>>                                         return a:lower(),"slightly",b:upper()
>>                                       end, pairs(t)) do
>>  print(ingredient .." is ".. modifier .. " " .. taste)
>> end
lemon is slightly SOUR
cake is slightly NICE
> 
> -- ConcatMap iterator using iterable_map
> t = { Lemon = "sour", Cake = "nice", }
> for ingredient, modifier, taste in iterable_map(function(a, b)
>>                                         coroutine.yield(a:lower(),"very",b:upper())
>>                                         return a, "slightly", b
>>                                       end, pairs(t)) do
>>  print(ingredient .." is ".. modifier .. " " .. taste)
>> end
cake is very NICE
Cake is slightly nice
lemon is very SOUR
Lemon is slightly sour

The following example uses this function to extract all the words contained in a file:

> for str in iterable_map(function(line)
>>                           for _,str in ipairs(string.tokenize(line)) do
>>                             coroutine.yield(str)
>>                           end
>>                         end, io.lines("AUTHORS.txt")) do
>>  print(str)
>> end
In
this
project
has
been
worked:
-
Salvador
España
Boquera
-
Jorge
Gorbe
Moya
-
Adrián
Palacios
Corella
-
Joan
Pastor
Pellicer
-
Francisco
Zamora
Martínez

This function is taken from http://www.corsix.org/content/mapping-and-lua-iterators.
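
The coroutine mechanism behind this kind of iterator can be sketched in plain Lua. The code below is a simplified illustration with a hypothetical name (iterable_map_sketch), not the APRIL-ANN source; it assumes the source iterator never returns nil in the middle of a tuple.

```lua
-- Hypothetical re-implementation of iterable_map, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local function iterable_map_sketch(func, f, s, v)
  -- forwards the values returned by func, if any, to the consumer
  local function emit(...)
    if select("#", ...) > 0 then coroutine.yield(...) end
  end
  return coroutine.wrap(function()
    while true do
      local values = { f(s, v) }
      if values[1] == nil then return end  -- source iterator exhausted
      v = values[1]
      -- func runs inside the coroutine, so it may call coroutine.yield
      -- itself to emit any number of value tuples before returning
      emit(func(unpack(values)))
    end
  end)
end

-- ConcatMap usage: one line in, several words out
local lines = { "In this project", "has been worked" }
local words = {}
for w in iterable_map_sketch(function(_, line)
           for word in line:gmatch("%S+") do coroutine.yield(word) end
         end, ipairs(lines)) do
  words[#words + 1] = w
end
print(table.concat(words, " "))  -- In this project has been worked
```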

iterable_filter

another_iterator = iterable_filter(func, iterator)

Returns an iterator which, every time it is called, uses the given function func to filter the elements produced by the given iterator. It allows multiple returned values from the given iterator.

> for v in iterable_filter(function(key,value) return value%2==0 end,
>>                         ipairs{1,2,3,4,5,6,7}) do
>>  print(v)
>> end
2
4
6

iterator class

The iterator class provides an easy and natural interface to the previous functions and to newer ones. Its most important advantage is that it always relies on Lua iterators, so it is lazy: the code is not executed until the iterator is traversed. The iterator class is a wrapper of Lua iterators.

The following methods return an iterator object or a Lua iterator:

  • obj = iterator(Lua iterator): the constructor receives an iterator, for example the output of the ipairs function, and returns an instance of the iterator class.
> it = iterator(ipairs{ 1, 2, 3})
  • Lua iterator = obj:get(): returns the current state of the underlying Lua iterator.

  • Lua iterator = obj(): the same as previous method.

> it = iterator(ipairs{ 1, 2, 3})
> for k,v in it() do print(k,v) end
1	1
2	2
3	3
  • iterator = obj:map(func): this method is a wrapper of iterable_map function, and returns an instance of iterator class.
> it = iterator(ipairs{ 1, 2, 3}):map(function(k,v) return v*2 end)
> for v in it() do print(v) end
2
4
6
  • iterator = obj:filter(func): this method is a wrapper of iterable_filter function, and returns an instance of iterator class.
> it = iterator(range(1,50)):filter(function(n) return (n%10)==0 end)
> for v in it() do print(v) end
10
20
30
40
50
  • iterator = obj:field(...): this method receives a list of keys. It expects the underlying iterator to produce a sequence of tables. It returns an iterator which takes from each table the values at the given keys, and returns a flattened list of values. There is an example below the following method.

  • iterator = obj:select(...): this method receives a list of numbers. It returns an iterator which selects only the output variables produced by the iterator at the given position numbers.

> layers = { { size=10 }, { size=100 } }
> iterator(ipairs(layers)):select(2):field("size"):apply(print)
10
100
  • iterator = obj:enumerate(): enumerates the returned values, adding at first position a number.
> iterator(pairs{ a=4, b=3 }):enumerate():apply(print)
1	a	4
2	b	3
  • iterator = obj:iterate(func): this method is a specialization of the map method for applying Lua iterator functions to each element of obj. The given func is expected to return an iterator over the given element. It is useful for tasks such as word counting:
> out =  iterator(io.lines("AUTHORS.txt")):
>> iterate(function(line) return string.gmatch(line, "[^\r\n\t ]+") end):
>> reduce(function(acc,w) acc[w] = (acc[w] or 0) + 1 return acc end,{})
> iterator(pairs(out)):apply(print)
has	1
Pastor	1
In	1
worked:	1
Palacios	1
-	5
España	1
Boquera	1
Joan	1
Francisco	1
Adrián	1
Martínez	1
been	1
Pellicer	1
Jorge	1
Zamora	1
Corella	1
this	1
Moya	1
Gorbe	1
Salvador	1
project	1
  • iterator = obj:call(funcname, ...): this method is a map over all the values by calling the method funcname (a string) using the given arguments. Because it is a method, the first argument of funcname will be each iterator value.
> for k in iterator(ipairs({ "h", "w" })):select(2):call("toupper"):get() do
>>   print(k)
>> end
H
W

The following methods are finalizers, so, they return a value, not an iterator:

  • whatever = obj:reduce(func, initial_value): this method is a wrapper of reduce function.
> = iterator(range(1,50)):reduce(function(acc,a) return acc+a end, 0)
1275
> = iterator(range(1,50)):reduce(math.add, 0)
1275
  • obj:apply(func): this method is a wrapper of apply function.
> iterator(range(1,50)):filter(function(n) return (n%10)==0 end):apply(print)
10
20
30
40
50
  • string = obj:concat(sep1,sep2): concatenates all the elements using the sep1 and sep2 strings. sep1 joins the elements of one iterator call, and sep2 joins the elements of different iterations. The empty string is used when both sep1 and sep2 are nil. If only sep1 is given, sep2 defaults to sep1.
> = iterator(range(1,50)):filter(function(n) return (n%10)==0 end):concat(" ")
10 20 30 40 50
  • table = obj:table(): returns a table with all the iterator values, using as key the first produced value, and the rest as value. If only one value is produced, the table will be indexed as an array.
> t = { "one", "two", "three" }
> p = iterator(ipairs(t)):map(function(k,v) return v,k end):table()
> iterator(pairs(p)):apply(print)
one	1
two	2
three	3

Using objects of this class, it is possible to produce code like this:

-- This example computes the dot product of two array tables. math.mul and math.add are
-- auxiliary functions implemented in APRIL-ANN for the fast development of reductions.
> v = iterator(multiple_ipairs({1,2,3},{4,5,6})):select(2,3):
>>    map(math.mul):
>>    reduce(math.add, 0)
> print(v)
32
>
> -- The following code is equivalent without using iterator class
> v = reduce(function(a,b) return a+b end, 0,
>>           iterable_map(function(k,a,b) return a*b end,
>>                        multiple_ipairs({1,2,3},{4,5,6})))
> print(v)
32
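
To make the laziness concrete, here is a small self-contained sketch of a wrapper around Lua iterators with map, filter and reduce. LazyIter, lazy and puller are hypothetical names; this is not the APRIL-ANN iterator class, only an illustration of the idea that nothing is evaluated until a finalizer traverses the chain.

```lua
-- Hypothetical lazy wrapper over Lua iterators, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local LazyIter = {}
LazyIter.__index = LazyIter

-- wraps a Lua iterator triplet (function, state, control variable)
local function lazy(f, s, v)
  return setmetatable({ f = f, s = s, v = v }, LazyIter)
end

function LazyIter:get() return self.f, self.s, self.v end

-- returns a function which pulls the next tuple, or nil when exhausted
local function puller(self)
  local f, s, v = self:get()
  return function()
    local values = { f(s, v) }
    if values[1] == nil then return nil end
    v = values[1]
    return values
  end
end

function LazyIter:map(func)
  local pull = puller(self)
  return lazy(function()
    local values = pull()
    if values then return func(unpack(values)) end
  end)
end

function LazyIter:filter(pred)
  local pull = puller(self)
  return lazy(function()
    local values = pull()
    while values and not pred(unpack(values)) do values = pull() end
    if values then return unpack(values) end
  end)
end

-- finalizer: this is the point where the chain is actually traversed
function LazyIter:reduce(func, acc)
  for a, b in self:get() do
    if b ~= nil then acc = func(acc, b) else acc = func(acc, a) end
  end
  return acc
end

local calls = 0
local it = lazy(ipairs({ 1, 2, 3, 4 }))
  :filter(function(_, x) return x % 2 == 0 end)
  :map(function(_, x) calls = calls + 1; return x * x end)
print(calls)                                              -- 0, nothing evaluated yet
print(it:reduce(function(acc, x) return acc + x end, 0))  -- 20
print(calls)                                              -- 2
```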

Basic functions

april_list

april_list(table)

This function is equivalent to the following piece of code: for i,v in pairs(table) do print(i,v) end

april_help

april_help(obj)

Shows the documentation of the object given as argument. If the object is a class, you can access its instance methods by using the .. operator:

> -- using .. operator to access instance method get_state_table
> april_help(trainable.train_holdout_validation.."get_state_table")
method  Returns the state table of the training

description: Returns the state table of the training

outputs:
	             best Best trained model
	       best_epoch Best epoch
	   best_val_error Best epoch validation loss
	    current_epoch Current epoch
	             last Last trained model
	      train_error Train loss
	 validation_error Validation loss

> -- showing help of the given class
> april_help(trainable.train_holdout_validation)
ID: trainable.train_holdout_validation
class  Training class using holdout validation

description: This training class defines a train_func which follows a training
             schedule based on validation error or in number of epochs. Method
             execute receives a function which trains one epoch and returns the
             trainer object, the training loss and the validation loss. This
             method returns true in case the training continues, or false if the
             stop criterion is true.

...

april_dir

april_dir(string)

This is the same as april_help, but less verbose.

luatype

luatype(whatever)

APRIL-ANN replaces the original type function with a new one which returns the object id when its argument is a class instance. Use luatype when you need to know the exact type given by Lua.

check_version

boolean = check_version(major,minor,commit)

Checks whether the version of the software is major.minor with the given commit. Returns true on success; otherwise returns false and shows a message in stderr.

april_print_script_header

april_print_script_header(arg, file=stdout)

This function writes to the given file (or to stdout if not given) the given arg table (normally the arg received by the script), along with information about the HOST where the script is executed and the current date and time:

> april_print_script_header({ [0]="hello" })
# HOST:	 django
# DATE:	 dv jul  5 14:16:53 CEST 2013
# CMD: 	 hello 

multiple_ipairs

iterator,s,v = multiple_ipairs(...)

Returns an iterator which traverses several tables in parallel. If they don't have the same size, the missing elements are returned as nil, so the number of iterations is equal to the maximum size of the given tables.

> for i,a,b,c in multiple_ipairs({1,2,3,4},{1,2},{3,4,5}) do print(i,a,b,c) end
1	1	1	3
2	2	2	4
3	3	nil	5
4	4	nil	nil
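
The nil-padding behaviour can be sketched in plain Lua as follows. multiple_ipairs_sketch is a hypothetical name, and the sketch assumes that the length operator # gives the size of each array table.

```lua
-- Hypothetical re-implementation of multiple_ipairs, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local function multiple_ipairs_sketch(...)
  local tables, n = { ... }, select("#", ...)
  -- the traversal lasts as long as the longest table
  local max = 0
  for k = 1, n do max = math.max(max, #tables[k]) end
  return function(_, i)
    i = i + 1
    if i > max then return nil end
    local row = {}
    for k = 1, n do row[k] = tables[k][i] end  -- nil where a table is too short
    return i, unpack(row, 1, n)                -- explicit bounds keep the nils
  end, nil, 0
end

for i, a, b, c in multiple_ipairs_sketch({ 1, 2, 3, 4 }, { 1, 2 }, { 3, 4, 5 }) do
  print(i, a, b, c)
end
```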

multiple_unpack

... = multiple_unpack( [table1, [table2, [...]]] )

Unpacks multiple tables together, one at a time and in sequential order.

> print( multiple_unpack( {1,2,3}, {4,5}, {6,7,8,9} ) )
1	2	3	4	5	6	7	8	9
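
A possible way to obtain this behaviour in plain Lua is to flatten the arrays into one table and unpack it once. multiple_unpack_sketch is a hypothetical name used only for this illustration.

```lua
-- Hypothetical re-implementation of multiple_unpack, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local function multiple_unpack_sketch(...)
  local flat, n = {}, 0
  for i = 1, select("#", ...) do
    local t = select(i, ...)  -- the i-th table argument
    for j = 1, #t do
      n = n + 1
      flat[n] = t[j]
    end
  end
  return unpack(flat, 1, n)
end

print(multiple_unpack_sketch({ 1, 2, 3 }, { 4, 5 }, { 6, 7, 8, 9 }))
```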

glob

table = glob(...)

Returns a list of filenames which match any of the wildcard arguments received by the function.

> -- prints the name of all the files which have .lua or .h extensions
> for i,filename in ipairs(glob("*.lua", "*.h")) do print(filename) end

parallel_foreach

results = parallel_foreach(num_processes, iterator or array or number, func)

Executes a function over the given iterator (an instance of the iterator class), array table, or number of repetitions, forking the calling process into num_processes processes to improve the performance of the operation. NOTE that the parallelization is performed by forking the caller process, so all child processes can access the variables assigned and allocated before the fork, but they don't share memory: it is copied on write.

> t = map(function(v) return v end, range(1,10))
> parallel_foreach(2, t, function(value) print(value*100) end)
200
400
600
800
1000
100
300
500
700
900

Additionally, if the function returns any value, parallel_foreach serializes the output of each process to a temporary file and, at the end, deserializes the content in the original process. This is useful when the overhead of the serialization-deserialization procedure is small compared with the computation performed by the processes.

> ret = parallel_foreach(2, 10, function(value) return value*100 end)
> print(table.concat(ret, "\n"))
100
200
300
400
500
600
700
800
900
1000

You can use iterators to control which data the called function receives:

> ret = parallel_foreach(2, iterator(ipairs{4,3,2,1}),
>>                       function(key,value) return key+value*100 end)
> print(table.concat(ret, "\n"))
401
302
203
104
>
> ret = parallel_foreach(2, iterator(ipairs{4,3,2,1}),
>>                       function(key,value) return key,key+value*100 end)
> print(table.tostring(ret))
{{1,401,["n"]=2},{2,302,["n"]=2},{3,203,["n"]=2},{4,104,["n"]=2}}

clrscr

clrscr()

Clears the screen.

printf

printf(...)

Equivalent to the C printf function.

fprintf

fprintf(file,...)

Equivalent to the C fprintf function.

range

range(inf,sup, step=1 )

This function returns an iterator which starts at inf, ends at sup, and performs steps of the given step size.

> for i in range(10,20,2) do print(i) end
10
12
14
16
18
20
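
A closure-based iterator with the same observable behaviour could be sketched as below. range_sketch is a hypothetical name, and the sketch assumes a positive step.

```lua
-- Hypothetical re-implementation of range, for illustration only.
-- Assumes step > 0.
local function range_sketch(inf, sup, step)
  step = step or 1
  local i = inf - step
  return function()
    i = i + step
    if i <= sup then return i end  -- returns nothing (nil) when finished
  end
end

for i in range_sketch(10, 20, 2) do print(i) end
```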

util.version

major,minor,commit = util.version()

Returns the version numbers.

util.omp_set_num_threads

util.omp_set_num_threads(number)

Modifies the number of threads for OMP.

> util.omp_set_num_threads(8)

util.omp_get_num_threads

number = util.omp_get_num_threads()

Returns the number of threads used by OMP.

> print(util.omp_get_num_threads())
8

Math table extensions

math.add

number = math.add( a ,b )

Returns the result of a+b.

> = math.add(2,3)
5

math.sub

number = math.sub( a, b )

Returns the result of a-b.

> = math.sub(2,3)
-1

math.mul

number = math.mul( a, b )

Returns the result of a*b.

> = math.mul(2,3)
6

math.div

number = math.div( a, b )

Returns the result of a/b.

> = math.div(2,3)
0.66666666666667

math.eq

boolean = math.eq( a, b )

Returns the result of a==b.

math.lt

boolean = math.lt( a, b )

Returns the result of a<b.

math.le

boolean = math.le( a, b )

Returns the result of a<=b.

math.gt

boolean = math.gt( a, b )

Returns the result of a>b.

math.ge

boolean = math.ge( a, b )

Returns the result of a>=b.

math.land

value = math.land( a, b )

Returns the result of a and b.

math.lor

value = math.lor( a, b )

Returns the result of a or b.

math.lnot

boolean = math.lnot( a )

Returns the result of not a.

math.round

number = math.round(number)

Returns the integer nearest to the given real number.

> = math.round(4/3)
1

math.clamp

number = math.clamp(value,lower,upper)

Clamps the given value to the range [lower,upper]: if it is out of the range, it is forced to the nearest limit.

> print(math.clamp(15,3,6), math.clamp(0,3,6), math.clamp(4,3,6))
6        3        4
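
Clamping is commonly written as a combination of math.min and math.max. The following one-liner, with the hypothetical name clamp_sketch, reproduces the output above and assumes lower <= upper.

```lua
-- Hypothetical re-implementation of math.clamp, for illustration only.
-- Assumes lower <= upper.
local function clamp_sketch(value, lower, upper)
  return math.max(lower, math.min(value, upper))
end

print(clamp_sketch(15, 3, 6), clamp_sketch(0, 3, 6), clamp_sketch(4, 3, 6))
```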

String table extensions

Operator %

Inspired by the Penlight library, a Python-like operator % has been defined. It produces formatted strings and implements map-like substitutions:

> = "$obj1 = %.4f\n$obj2 = %.4f" % {20.4, 12.36, obj1="cat", obj2="dog"}
cat = 20.4000
dog = 12.3600
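
One way to get such an operator in plain Lua is to set the __mod metamethod of the string metatable. The sketch below is only an approximation with a hypothetical name (format_sketch): it assumes that $name keys are replaced first and that the array part of the table feeds string.format, and it does not protect against substituted values which contain % characters.

```lua
-- Hypothetical re-implementation of the string % operator, for illustration only.
local unpack = table.unpack or unpack  -- Lua 5.1 / 5.2+ compatibility

local function format_sketch(fmt, values)
  -- replace every $name whose key exists in values; unknown names are kept
  local expanded = fmt:gsub("%$([%a_][%w_]*)", function(name)
    local v = values[name]
    if v ~= nil then return tostring(v) end
  end)
  -- the array part of values feeds the usual string.format directives
  return expanded:format(unpack(values))
end

-- install it as the % operator of all strings (a global side effect)
getmetatable("").__mod = format_sketch

print("$obj1 = %.4f\n$obj2 = %.4f" % { 20.4, 12.36, obj1 = "cat", obj2 = "dog" })
```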

truncate

string = str:truncate(columns, prefix)

basename

string = path:basename()

Returns the basename (the last filename) of a given path.

> print(("/a/path/to/my/file.txt"):basename())
file.txt

dirname

string = path:dirname()

Returns the path, removing the basename.

> print(("/a/path/to/my/file.txt"):dirname())
/a/path/to/my/

remove_extension

string,string = path:remove_extension()

Removes the extension of the filename in the given path, and returns the path without the extension and the extension string.

> print(("/a/path/to/my/file.txt"):remove_extension())
/a/path/to/my/file	txt

get_extension

string = path:get_extension()

Returns only the extension of the given path string.

> print(("/a/path/to/my/file.txt"):get_extension())
txt

get_path

string = path_with_filename:get_path(sep)

Synonym of dirname().

lines_of

iterator = str:lines_of()

Returns an iterator function which traverses the given string split by the newline character.

> for line in ("one\ntwo"):lines_of() do print(line) end
one
two

chars_of

iterator = str:chars_of()

Returns an iterator function which traverses the given string split into characters.

> for i,ch in ("one two"):chars_of() do print(i,ch) end
1	o
2	n
3	e
4	 
5	t
6	w
7	o

tokenize

table = str:tokenize(sep=' \t\n\r')

Returns a table with the string tokenized using the given sep set of characters.

> for i,token in ipairs((" one\ntwo\tthree four"):tokenize("\t\n ")) do print(i,token) end
1	one
2	two
3	three
4	four
> for i,token in ipairs(string.tokenize(" one\ntwo\tthree four", "\n ")) do print(i,token) end
1	one
2	two	three
3	four
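
A tokenizer with this behaviour can be sketched on top of string.gmatch by building a character class from the separator set. tokenize_sketch is a hypothetical name, not the APRIL-ANN implementation.

```lua
-- Hypothetical re-implementation of tokenize, for illustration only.
local function tokenize_sketch(str, sep)
  sep = sep or " \t\n\r"
  -- escape the characters which are special inside a [...] character class
  local escaped = sep:gsub("[%^%]%-%%]", "%%%0")
  local tokens = {}
  for token in str:gmatch("[^" .. escaped .. "]+") do
    tokens[#tokens + 1] = token
  end
  return tokens
end

for i, token in ipairs(tokenize_sketch(" one\ntwo\tthree four", "\n ")) do
  print(i, token)
end
```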

tokenize_width

table = str:tokenize_width(width=1)

join

The string.join function is equivalent to Lua table.concat function.

Table table extensions

table.insert

table = table.insert(table,value)

The original table.insert function is replaced with a new one which returns the table given as first argument, so that it can be combined with the reduce function.

table.luainsert

table.luainsert(table,value)

The original Lua table.insert function.

table.clear

table.clear(table)

Removes all the elements of a table, but it doesn't force Lua to deallocate the memory. This function is useful if you want to reuse a table variable several times inside a loop: it is better to clear the table than to allocate a new one.

> t = {}
> for i=1,1000 do table.clear(t) --[[ STUFF USING t ]] end
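
In plain Lua, clearing a table in place can be sketched by assigning nil to every existing key, which is allowed during a pairs traversal. clear_sketch is a hypothetical name; the point of the example is that the same table object is reused across iterations.

```lua
-- Hypothetical re-implementation of table.clear, for illustration only.
local function clear_sketch(t)
  -- assigning nil to existing fields is safe while traversing with pairs
  for k in pairs(t) do t[k] = nil end
end

local t = {}
local same_object = t
for i = 1, 3 do
  clear_sketch(t)
  t[1], t.round = i * 10, i  -- reuse the same table on every iteration
end
print(t == same_object, t[1], t.round)
```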

table.unpack_on

table.unpack_on(table, dest_table)

This function copies the fields of the given table into the table dest_table. It is useful to put table fields in the global scope of Lua.

> print(a, b, c)
nil	nil	nil
> t = { a=1, b=2, c=3 }
> table.unpack_on(t, _G)
> print(a, b, c)
1	2	3

table.invert

table = table.invert(table)

Returns the table resulting from the inversion of key,value pairs of the given table argument.

> t = { "a", "b", "c" }
> t_inv = table.invert(t)
> for i,v in pairs(t_inv) do print(i,v) end
a	1
c	3
b	2

table.slice

table = table.slice(t, ini, fin)

Returns from the given table the slice of elements starting at ini and finishing at fin.

> t = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }
> print(table.unpack(table.slice(t, 2, 4)))
2	3	4

table.search_key_from_value

key = table.search_key_from_value(table,value)

This function searches for a value in the given table and returns its key. If the value is repeated (obviously under different keys), any one of the possible keys is returned, and it is not possible to determine which one.

> print(table.search_key_from_value({ a=15, b=12 }, 12))
b

table.reduce

whatever = table.reduce(table,function,initial_value)

Equivalent to reduce(function, initial_value, pairs(table)).

table.imap

table = table.imap(table,function)

Equivalent to map(function, ipairs(table)).

table.map

table = table.map(table,function)

Equivalent to map(function, pairs(table)).

table.imap2

table = table.imap2(table,function)

Equivalent to map2(function, ipairs(table)).

table.map2

table = table.map2(table,function)

Equivalent to map2(function, pairs(table)).

table.ifilter

table = table.ifilter(table,function)

This function traverses the given table as an array (using the ipairs function), and returns a new table which contains only the elements for which the given function returns true. The function is called with the pair key,value as two arguments.

table.filter

table = table.filter(table,function)

The same as table.ifilter, but for general tables (using the pairs function).

table.join

table = table.join(t1,t2)

Returns a table which is the concatenation of the two given tables.

> t = table.join({1,2,3}, {10,11,12})
> print(table.concat(t, " "))
1 2 3 10 11 12

table.deep_copy

table = table.deep_copy(table)

Returns a table which is a deep copy of the Lua data-values contained at the given table, and a shallow copy (copied by reference) of its C++ references.
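
The distinction between deep-copied Lua values and shared references can be sketched as follows. deep_copy_sketch is a hypothetical name; the handling of cyclic tables through the seen argument is an addition of this sketch and is not stated by the documentation. Non-table values such as userdata and functions are returned as they are, so they remain shared.

```lua
-- Hypothetical re-implementation of table.deep_copy, for illustration only.
local function deep_copy_sketch(value, seen)
  -- userdata, functions and other non-table values are shared, not copied
  if type(value) ~= "table" then return value end
  seen = seen or {}
  if seen[value] then return seen[value] end  -- already copied (cycles)
  local copy = {}
  seen[value] = copy
  for k, v in pairs(value) do
    copy[deep_copy_sketch(k, seen)] = deep_copy_sketch(v, seen)
  end
  return copy
end

local shared = function() end
local original = { 1, 2, nested = { foo = "bar" }, callback = shared }
local copy = deep_copy_sketch(original)
copy.nested.foo = "changed"
print(original.nested.foo)                 -- bar
print(copy.callback == original.callback)  -- true
```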

table.linearize

table = table.linearize(table)

Converts an unsorted dictionary into an array, throwing away the keys. The order of the array is not determined.

table.tostring

string = table.tostring(table)

This function converts the given table to a string which contains the table values, and which could be loaded as a Lua chunk. It only works with tables which don't contain C++ references.

> t = { 1, 2, a={ ["foo"] = "bar" } }
> print(table.tostring(t))
{
[1]=1,[2]=2,["a"]=
{
["foo"]="bar"
}
}

table.max

number,index = table.max(table)

This function returns the maximum value and the key which contains it. The table is traversed using the pairs function.

table.min

number,index = table.min(table)

This function returns the minimum value and the key which contains it. The table is traversed using the pairs function.
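
Both functions can be seen as one traversal with a different comparison. The sketch below uses a hypothetical helper, extreme_sketch, which takes the comparison as a parameter; it is not the APRIL-ANN implementation, and when several keys hold the same extreme value the returned key depends on the pairs traversal order.

```lua
-- Hypothetical helper covering both table.max and table.min, for illustration only.
local function extreme_sketch(t, better)
  local best_value, best_key
  for key, value in pairs(t) do
    if best_value == nil or better(value, best_value) then
      best_value, best_key = value, key
    end
  end
  return best_value, best_key
end

local scores = { a = 15, b = 12, c = 40 }
print(extreme_sketch(scores, function(x, y) return x > y end))  -- 40	c
print(extreme_sketch(scores, function(x, y) return x < y end))  -- 12	b
```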

table.argmax

index = table.argmax(table)

This function is equivalent to table.max returning only the index.

table.argmin

index = table.argmin(table)

This function is equivalent to table.min returning only the index.

Io table extensions

io.uncommented_lines

iterator = io.uncommented_lines( [filename] )

Returns an iterator function which traverses the lines of the given filename (if not given, it uses io.stdin), skipping the lines which begin with the # symbol.

> for line in io.uncommented_lines() do --[[ STUFF ]] end

Miscellaneous classes

util.stopwatch

util.stopwatch

util.vector_uint

util.vector_uint

util.vector_float

util.vector_float
