Skip to content

Latest commit

 

History

History

cca

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Working with matrices

Neither Go nor R have matrix types as first class citizens, and R's approach to specifying matrixness is not currently understood by rgo (this may change). So working with matrices requires some extra effort.

To show the work involved this example will reprise the Gonum stat example at https://pkg.go.dev/gonum.org/v1/gonum/stat?tab=doc#example-CC which performs a canonical correlations analysis on the MASS::Boston data.

This needs a Go wrapper around the CC.CanonicalCorrelations method from the Gonum stat package.

func CCA(x, y blas64.GeneralCols) (ccors []float64, pVecs, qVecs, phiVs, psiVs blas64.GeneralCols, err error) {
	var xdata, ydata mat.Dense
	xdata.SetRawMatrix(rowMajor(x))
	ydata.SetRawMatrix(rowMajor(y))

	var cc stat.CC
	err = cc.CanonicalCorrelations(&xdata, &ydata, nil)
	if err != nil {
		return nil, pVecs, qVecs, phiVs, psiVs, err
	}
	ccors = cc.CorrsTo(nil)

	var _pVecs, _qVecs, _phiVs, _psiVs mat.Dense
	cc.LeftTo(&_pVecs, true)
	cc.RightTo(&_qVecs, true)
	cc.LeftTo(&_phiVs, false)
	cc.RightTo(&_psiVs, false)

	return ccors,
		colMajor(_pVecs.RawMatrix()),
		colMajor(_qVecs.RawMatrix()),
		colMajor(_phiVs.RawMatrix()),
		colMajor(_psiVs.RawMatrix()),
		err
}

Note that we need some helpers to convert the column major R matrix type to the Gonum matrix values used by CanonicalCorrelations

func rowMajor(a blas64.GeneralCols) blas64.General {
	t := blas64.General{
		Rows:   a.Rows,
		Cols:   a.Cols,
		Data:   make([]float64, len(a.Data)),
		Stride: a.Cols,
	}
	t.From(a)
	return t
}

and back again.

func colMajor(a blas64.General) blas64.GeneralCols {
	t := blas64.GeneralCols{
		Rows:   a.Rows,
		Cols:   a.Cols,
		Data:   make([]float64, len(a.Data)),
		Stride: a.Rows,
	}
	t.From(a)
	return t
}

With this done, we can perform the usual steps to build an rgo package, starting with defining the package's go.mod file with

$ go mod init github.com/rgonomic/rgo/examples/cca

running rgo init with an argument pointing to the current directory since we are at the root of github.com/rgonomic/rgo/examples/cca,

$ rgo init .

Since there is only one function and it has an all upper-case name, we don't need to make any change to the rgo.json file.

The wrapper code is then generated by running the build subcommand.

$ rgo build

This will generate the Go, C and R wrapper code for the R package, and collate all the licenses in the source package into the LicenseDir directory. At this stage the DESCRIPTION file should be edited and non-relevant licenses should be removed.

The package can now be installed.

$ R CMD INSTALL .

To replicate the Gonum example, we need the Boston data in the same format described in the example.

> library(MASS)
> x <- cbind(Boston$crim, Boston$indus, Boston$nox, Boston$dis, Boston$rad, Boston$ptratio, Boston$black)
> y <- cbind(Boston$rm, Boston$age, Boston$tax, Boston$medv)
> boston <- cbind(x, y)

We also need a couple of helpers to convert the R matrix representation based on attributes to a list with the values needed to populate the struct values (blas64.GeneralCols) that the Go CCA function accepts, and the convert the results back again.

> mat_list <- function(a) {
+ 	return(list(Rows = nrow(a), Cols = ncol(a), Data = as.vector(a), Stride = nrow(a)))
+ }
> list_mat <- function(a) {
+ 	return(matrix(data = a$Data, nrow = a$Rows, ncol = a$Cols))
+ }

Note that we could leave the result in row-major and assume that the user will get R to do that work by adding a byrow = TRUE, but we don't want to be that person.

With all that done, invoking cca with our Boston data gives us the results that we expect.

> library(cca)
> r <- cca::cca(mat_list(x), mat_list(y), NULL)
> r$err
NULL
> r$ccors
[1] 0.9451239 0.6786623 0.5714338 0.2009740
> list_mat(r$pVecs)
            [,1]        [,2]       [,3]        [,4]
[1,] -0.25743919  0.01584775  0.2122170 -0.09457338
[2,] -0.48365944  0.38371019  0.1474448  0.65973249
[3,] -0.08007764  0.34935567  0.3287336 -0.28620404
[4,]  0.12775864 -0.73374277  0.4851135  0.22479649
[5,] -0.69694320 -0.43417488 -0.3602873  0.02906616
[6,] -0.09909033  0.05034112  0.6384331  0.10223671
[7,]  0.42604600  0.03233344 -0.2289528  0.64192329
> list_mat(r$qVecs)
            [,1]       [,2]         [,3]        [,4]
[1,]  0.01816605 -0.1583489 -0.006672358 -0.98719354
[2,] -0.23476990  0.9483315 -0.146242051 -0.15544708
[3,] -0.97007040 -0.2406072 -0.025183898  0.02091341
[4,]  0.05930007 -0.1330460 -0.988905715  0.02911615
> list_mat(r$phiVs)
              [,1]          [,2]         [,3]         [,4]
[1,] -0.0027462234  0.0093444514  0.048964393 -0.015496719
[2,] -0.0428564455 -0.0241708702  0.036072347  0.183898323
[3,] -1.2248435649  5.6030921365  5.809414458 -4.792681219
[4,] -0.0043684825 -0.3424101165  0.446996122  0.115016181
[5,] -0.0741534070 -0.1193135795 -0.111551831  0.002163876
[6,] -0.0233270323  0.1046330818  0.385304598 -0.016092787
[7,]  0.0001293051  0.0004540747 -0.003029632  0.008189548
> list_mat(r$psiVs)
             [,1]        [,2]         [,3]          [,4]
[1,]  0.030159336 -0.30022193  0.087821738 -1.9583226532
[2,] -0.006548310  0.03922121 -0.011757078 -0.0061113064
[3,] -0.005207552 -0.00457702 -0.002276231  0.0008441873
[4,]  0.002011174  0.00373528 -0.129257807  0.1037709056