Neither Go nor R have matrix types as first class citizens, and R's approach to specifying matrixness is not currently understood by rgo
(this may change). So working with matrices requires some extra effort.
To show the work involved this example will reprise the Gonum stat example at https://pkg.go.dev/gonum.org/v1/gonum/stat?tab=doc#example-CC which performs a canonical correlations analysis on the MASS::Boston data.
This needs a Go wrapper around the CC.CanonicalCorrelations
method from the Gonum stat package.
func CCA(x, y blas64.GeneralCols) (ccors []float64, pVecs, qVecs, phiVs, psiVs blas64.GeneralCols, err error) {
var xdata, ydata mat.Dense
xdata.SetRawMatrix(rowMajor(x))
ydata.SetRawMatrix(rowMajor(y))
var cc stat.CC
err = cc.CanonicalCorrelations(&xdata, &ydata, nil)
if err != nil {
return nil, pVecs, qVecs, phiVs, psiVs, err
}
ccors = cc.CorrsTo(nil)
var _pVecs, _qVecs, _phiVs, _psiVs mat.Dense
cc.LeftTo(&_pVecs, true)
cc.RightTo(&_qVecs, true)
cc.LeftTo(&_phiVs, false)
cc.RightTo(&_psiVs, false)
return ccors,
colMajor(_pVecs.RawMatrix()),
colMajor(_qVecs.RawMatrix()),
colMajor(_phiVs.RawMatrix()),
colMajor(_psiVs.RawMatrix()),
err
}
Note that we need some helpers to convert the column major R matrix type to the Gonum matrix values used by CanonicalCorrelations
func rowMajor(a blas64.GeneralCols) blas64.General {
t := blas64.General{
Rows: a.Rows,
Cols: a.Cols,
Data: make([]float64, len(a.Data)),
Stride: a.Cols,
}
t.From(a)
return t
}
and back again.
func colMajor(a blas64.General) blas64.GeneralCols {
t := blas64.GeneralCols{
Rows: a.Rows,
Cols: a.Cols,
Data: make([]float64, len(a.Data)),
Stride: a.Rows,
}
t.From(a)
return t
}
With this done, we can perform the usual steps to build an rgo
package, starting with defining the package's go.mod
file with
$ go mod init github.com/rgonomic/rgo/examples/cca
running rgo init
with an argument pointing to the current directory since we are at the root of github.com/rgonomic/rgo/examples/cca,
$ rgo init .
Since there is only one function and it has an all upper-case name, we don't need to make any change to the rgo.json
file.
The wrapper code is then generated by running the build subcommand.
$ rgo build
This will generate the Go, C and R wrapper code for the R package, and collate all the licenses in the source package into the LicenseDir
directory. At this stage the DESCRIPTION
file should be edited and non-relevant licenses should be removed.
The package can now be installed.
$ R CMD INSTALL .
To replicate the Gonum example, we need the Boston data in the same format described in the example.
> library(MASS)
> x <- cbind(Boston$crim, Boston$indus, Boston$nox, Boston$dis, Boston$rad, Boston$ptratio, Boston$black)
> y <- cbind(Boston$rm, Boston$age, Boston$tax, Boston$medv)
> boston <- cbind(x, y)
We also need a couple of helpers to convert the R matrix representation based on attributes to a list with the values needed to populate the struct values (blas64.GeneralCols
) that the Go CCA
function accepts, and the convert the results back again.
> mat_list <- function(a) {
+ return(list(Rows = nrow(a), Cols = ncol(a), Data = as.vector(a), Stride = nrow(a)))
+ }
> list_mat <- function(a) {
+ return(matrix(data = a$Data, nrow = a$Rows, ncol = a$Cols))
+ }
Note that we could leave the result in row-major and assume that the user will get R to do that work by adding a byrow = TRUE
, but we don't want to be that person.
With all that done, invoking cca
with our Boston data gives us the results that we expect.
> library(cca)
> r <- cca::cca(mat_list(x), mat_list(y), NULL)
> r$err
NULL
> r$ccors
[1] 0.9451239 0.6786623 0.5714338 0.2009740
> list_mat(r$pVecs)
[,1] [,2] [,3] [,4]
[1,] -0.25743919 0.01584775 0.2122170 -0.09457338
[2,] -0.48365944 0.38371019 0.1474448 0.65973249
[3,] -0.08007764 0.34935567 0.3287336 -0.28620404
[4,] 0.12775864 -0.73374277 0.4851135 0.22479649
[5,] -0.69694320 -0.43417488 -0.3602873 0.02906616
[6,] -0.09909033 0.05034112 0.6384331 0.10223671
[7,] 0.42604600 0.03233344 -0.2289528 0.64192329
> list_mat(r$qVecs)
[,1] [,2] [,3] [,4]
[1,] 0.01816605 -0.1583489 -0.006672358 -0.98719354
[2,] -0.23476990 0.9483315 -0.146242051 -0.15544708
[3,] -0.97007040 -0.2406072 -0.025183898 0.02091341
[4,] 0.05930007 -0.1330460 -0.988905715 0.02911615
> list_mat(r$phiVs)
[,1] [,2] [,3] [,4]
[1,] -0.0027462234 0.0093444514 0.048964393 -0.015496719
[2,] -0.0428564455 -0.0241708702 0.036072347 0.183898323
[3,] -1.2248435649 5.6030921365 5.809414458 -4.792681219
[4,] -0.0043684825 -0.3424101165 0.446996122 0.115016181
[5,] -0.0741534070 -0.1193135795 -0.111551831 0.002163876
[6,] -0.0233270323 0.1046330818 0.385304598 -0.016092787
[7,] 0.0001293051 0.0004540747 -0.003029632 0.008189548
> list_mat(r$psiVs)
[,1] [,2] [,3] [,4]
[1,] 0.030159336 -0.30022193 0.087821738 -1.9583226532
[2,] -0.006548310 0.03922121 -0.011757078 -0.0061113064
[3,] -0.005207552 -0.00457702 -0.002276231 0.0008441873
[4,] 0.002011174 0.00373528 -0.129257807 0.1037709056