SPL Viewing Lambda Syntax
The main goal of Lambda syntax is to let programmers define and use temporary functions quickly and conveniently, which reduces the amount of code and improves development efficiency. SPL focuses on structured data computing, where the computational logic is often complex and the code lengthy. To better fit this application environment, SPL makes some modifications to Lambda syntax.
Conventional Lambda syntax has no built-in loop variable. Loop variables are used almost everywhere, yet programmers have to define one every time.
Example 1: To filter out even numbers in the set of integers A=[2,1,2], the conventional Lambda syntax would be written as: A.select(n -> n % 2 == 0)
In the above pseudo code, select is a higher-order function for filtering. Its parameter can be a Lambda expression, that is, an anonymous function written in Lambda syntax. A Lambda expression has two parts separated by ->: the part before -> defines the function parameters, and the part after -> is the function body. The n in the parameter definition is a loop variable, so the function body can refer to the current member of A directly as n.
SPL aims for code that is as short and easy to understand as possible. It builds in ~ as the name of the loop member variable and omits the parameter definition. Example 1 can be written as follows: A.select(~%2==0)
All Lambda expressions use the same loop variable name, which helps keep the programming style consistent:
Example 2: multiply each member by 3: A.(~*3)
Example 3: grouping: A.group(~)
For calculations with more steps, it is usually necessary to use loop variables multiple times. Built-in loop variables not only reduce the amount of code, but also make the code structure clearer and easier to read by removing the parameter definition part in Lambda expressions:
Example 4: Sum of squares of the first 100 odd numbers: to(100).(~ * 2 - 1).sum(~ * ~)
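For comparison, Example 4 can be sketched in Python, used here only as a stand-in for the conventional Lambda syntax (it is not SPL). The names `first_odds` and `sum_of_squares` are illustrative. Each lambda has to declare its own parameter, where the SPL version uses ~ in both steps.

```python
# Python stand-in for the SPL expression to(100).(~ * 2 - 1).sum(~ * ~).
# Each lambda names its own loop variable (n, then x).
first_odds = list(map(lambda n: n * 2 - 1, range(1, 101)))  # 1, 3, ..., 199
sum_of_squares = sum(map(lambda x: x * x, first_odds))

print(sum_of_squares)  # 1333300
```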
Conventional Lambda syntax also has no built-in loop count, even though loop counts are commonly needed in Lambda expressions. Programmers have to manage the count themselves, which makes the code quite cumbersome.
Example 5: To select the members at even positions in the set of integers A=[1,4,4,4,5], the conventional Lambda syntax would be written as:
index=0
A.select((index=index+1, index %2 ==0))
In the above pseudo code, the loop count has to be defined outside the Lambda expression and then maintained and used inside it by the programmer. This widens the scope of the variable. It would be better to replace the Lambda expression with a basic for loop: the code is longer, but the variable scope is smaller and the code is less error-prone.
SPL also builds the loop count # into its Lambda syntax, which gives both a smaller scope and shorter code. The code for Example 5 can be written as: A.select(#%2==0)
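The scope problem in Example 5 can be illustrated with a small Python sketch (Python stands in for the conventional syntax; the names `counter`, `at_even_position`, and the two result lists are hypothetical). The first variant keeps the count outside the function, as in the pseudo code. The second uses `enumerate`, which is the closest Python counterpart to a built-in loop count.

```python
A = [1, 4, 4, 4, 5]

# Variant 1: the count lives outside the predicate and is mutated inside it,
# so its scope is wider than the filtering step that needs it.
counter = {"index": 0}

def at_even_position(_member):
    counter["index"] += 1
    return counter["index"] % 2 == 0

with_outer_counter = list(filter(at_even_position, A))

# Variant 2: enumerate supplies a 1-based position, scoped to the expression.
with_enumerate = [m for pos, m in enumerate(A, start=1) if pos % 2 == 0]

print(with_outer_counter, with_enumerate)  # [4, 4] [4, 4]
```

After variant 1 runs, `counter["index"]` is left at 5, so reusing the predicate without resetting it would select different positions.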
Calculations that depend on the loop count are usually awkward to write, and the built-in loop count simplifies them:
Example 6: For two sequences of the same length, A=[3,8,2] and B=[2,0,4], calculate the sequence of pairwise sums, i.e. [3+2,8+0,2+4]. SPL code: A.(~+B(#))
Example 7: For sequence A=[3,4,5,6,7,8], calculate the difference between the sum of the members at odd positions and the sum of the members at even positions, i.e. (3+5+7) - (4+6+8). SPL code: A.sum(if(#%2==1,1,-1)*~)
Example 8: Sequence A stores sales for 12 months. Calculate the growth for each month, that is, the difference between each month's sales and the previous month's sales: A.(if(#>1,~-A(#-1),0))
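To make the expected results of Examples 6 to 8 concrete, here is a Python sketch (again a stand-in, not SPL). The 12 monthly sales figures are made-up illustration data, and all variable names are hypothetical. `enumerate` plays the role of #.

```python
# Example 6: pairwise addition of two sequences of equal length.
A = [3, 8, 2]
B = [2, 0, 4]
pairwise = [a + B[i] for i, a in enumerate(A)]

# Example 7: members at odd positions count positively, even positions negatively.
C = [3, 4, 5, 6, 7, 8]
alternating = sum((1 if pos % 2 == 1 else -1) * m
                  for pos, m in enumerate(C, start=1))

# Example 8: month-over-month growth; the first month has no predecessor, so 0.
# Illustration data only.
sales = [100, 120, 90, 130, 150, 140, 160, 170, 165, 180, 200, 210]
growth = [m - sales[i - 1] if i > 0 else 0 for i, m in enumerate(sales)]

print(pairwise)     # [5, 8, 6]
print(alternating)  # -3
print(growth)       # [0, 20, -30, 40, 20, -10, 20, 10, -5, 15, 20, 10]
```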
An adjacent reference is a reference, within an expression, to members or sets that are adjacent to the current loop member. Conventional Lambda syntax has no adjacent reference syntax, so programmers have to write it out themselves, which is cumbersome. In Example 8 above, for instance, A(#-1) is used to reach the sales one month before the current month.
SPL extends Lambda syntax with adjacent references, which greatly simplifies the code. SPL writes an adjacent reference as ~[t]: for example, ~[-1] is the member at the previous position, and ~[0] is equivalent to ~.
Example 9: rewrite Example 8 with adjacent reference syntax: A.(if(#>1,~-~[-1],0))
After using adjacent references, the related complex calculations can be simplified:
Example 10: Determine whether sequence A is strictly increasing (the ~!=int(~) term also flags non-integer members): A.pselect(~!=int(~) || ~<=~[-1])==null
Example 11: Sequence A stores sales for 12 months. Calculate the three-month moving average; for January and December, average only the two available months: A.(avg(~[-1],~,~[1]))
Adjacent references can also be written as an interval set, that is, ~[begin:end]. When "begin" is omitted, the interval starts from the first member, and when "end" is omitted, it continues to the last member. For example, ~[-1:1] is the set of 3 members from the previous one to the next one. Example 11 can be simplified as: A.(~[-1:1].avg())
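The interval form can be mimicked in Python with a small helper. This is only a sketch of the behaviour described above, not SPL's implementation: `adjacent` is a hypothetical function, the sales figures are made up, and the window is clipped at both ends so that the first and last positions average two values.

```python
def adjacent(seq, i, begin, end):
    """Return the members from offset `begin` to offset `end` around index i,
    clipped to the bounds of seq (a rough analogue of ~[begin:end])."""
    lo = max(i + begin, 0)
    hi = min(i + end, len(seq) - 1)
    return seq[lo:hi + 1]

sales = [10, 20, 30, 40]  # illustration data only

# Three-member moving average; the edge positions see only two members.
moving_avg = [sum(w) / len(w)
              for w in (adjacent(sales, i, -1, 1) for i in range(len(sales)))]

print(moving_avg)  # [15.0, 20.0, 30.0, 35.0]
```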
Conventional Lambda syntax makes no special simplification for structured data. A reference to a field of a single data table usually has to be prefixed with the name of the structured data object or of the loop variable (both refer to the same thing), which makes the code cumbersome. Example 12: To calculate the total amount from a set of order records, the conventional Lambda syntax would be written as:
Orders.sum(Orders.Price * Orders.Quantity) or Orders.sum(~.Price * ~.Quantity)
SPL's Lambda syntax makes a special simplification for structured data, which is by far the dominant data type in its domain. When referencing fields of a single table, SPL allows the table name to be omitted. Example 12 can be written directly as: Orders.sum(Price*Quantity)
This syntax for omitting table names originates from SQL, such as select sum(Price * Quantity) from Orders
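For contrast with Example 12, a Python sketch of the same total shows the record variable that a conventional lambda cannot drop. The `orders` list and its values are hypothetical illustration data; only the field names Price and Quantity come from the example.

```python
# Hypothetical order records; only the field names come from Example 12.
orders = [
    {"Price": 10.0, "Quantity": 3},
    {"Price": 4.5, "Quantity": 2},
    {"Price": 20.0, "Quantity": 1},
]

# Every field access has to go through the loop variable `o`.
total_amount = sum(map(lambda o: o["Price"] * o["Quantity"], orders))

print(total_amount)  # 59.0
```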
Referencing the field names of a single table directly makes SPL code shorter and easier to read, which is particularly evident in more complex calculations. Example 13: Based on the sales table, give a 5% performance bonus to the top 10% of salespeople in 2014.
| | A | B |
|---|---|---|
| 1 | =connect("db").query("select * from sales where year(OrderDate)=2014") | / Connect to the data source and read the sales table |
| 2 | =A1.groups(SellerId;sum(Amount):Amount) | / Group by seller and total each seller's sales for the year |
| 3 | =A2.sort@z(Amount).to(A2.len()*0.1) | / Sort by sales in descending order and take the top 10% |
| 4 | =A3.run(Amount*=1.05) | / Use A.run() to loop through the top 10% and apply the 5% bonus to each person |
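The four steps of Example 13 can be traced in a Python sketch. It is a rough analogue under stated assumptions: the `sales` rows are made-up data standing in for the database query, the names are hypothetical, and the bonus step follows the cell code literally by multiplying each total by 1.05.

```python
from collections import defaultdict

# Made-up (SellerId, Amount) rows standing in for the 2014 query result.
sales = [(1, 500.0), (2, 300.0), (1, 700.0), (3, 900.0), (4, 100.0),
         (5, 250.0), (6, 400.0), (7, 350.0), (8, 150.0), (9, 50.0),
         (10, 600.0), (3, 400.0)]

# Step 2: group by seller and total the amounts.
totals = defaultdict(float)
for seller_id, amount in sales:
    totals[seller_id] += amount

# Step 3: sort by total in descending order and keep the top 10%.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
top_count = int(len(ranked) * 0.1)

# Step 4: multiply each selected total by 1.05.
rewarded = [(seller_id, round(total * 1.05, 2))
            for seller_id, total in ranked[:top_count]]

print(rewarded)  # [(3, 1365.0)]
```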
Adjacent reference syntax can also be used with structured data. Example 14: Find the maximum number of consecutive days on which a stock rose.
| | A | B |
|---|---|---|
| 1 | =T("share_index.csv").sort(TDATE) | / Read the file and sort by date |
| 2 | =A1.group@i(CLOSING<CLOSING[-1]) | / Put consecutively rising records into the same group |
| 3 | =A2.max(~.count()) | / Get the size of the largest group |
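The grouping logic of Example 14 can be followed step by step in a Python sketch. The `closing` prices are made-up data, the names are hypothetical, and the loop mirrors the condition in cell A2: a new group starts whenever the closing price is lower than the previous one.

```python
# Made-up closing prices, already in date order.
closing = [10.0, 11.0, 12.0, 11.0, 12.0, 13.0, 14.0, 13.0]

groups = []
for i, price in enumerate(closing):
    # Start a new group at the first record or when the price falls.
    if i == 0 or price < closing[i - 1]:
        groups.append([])
    groups[-1].append(price)

longest = max(len(g) for g in groups)

print(groups)   # [[10.0, 11.0, 12.0], [11.0, 12.0, 13.0, 14.0], [13.0]]
print(longest)  # 4
```

In this sketch the result is the size of the largest group, so it counts the day that opens the run as well as the days that follow it.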
In addition to extending the Lambda syntax of functional programming, SPL also simplifies object-oriented programming where appropriate. Both aim to reduce the amount of code and improve development efficiency; the latter is not covered here.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code