Unexpected error during Ridge Regression: LAPACK GETRF error code: -4 #788

Open
alexdupre opened this issue Sep 26, 2024 · 6 comments

@alexdupre

Describe the bug
During ridge regression, for some slices of the original training data, the dpotrf function returns error code -4, even though the lda argument is greater than the n argument.

Expected behavior
No error. FWIW, ridge regression with scikit-learn on the same data works fine.

Actual behavior

Iteration: 0
Iteration: 1
Iteration: 2
10:16:43.137 [main] ERROR smile.math.matrix.Matrix - LAPACK GETRF error code: -4
Exception in thread "main" java.lang.ArithmeticException: LAPACK GETRF error code: -4
	at smile.math.matrix.Matrix.cholesky(Matrix.java:1732)
	at smile.regression.RidgeRegression.fit(RidgeRegression.java:206)
	at smile.regression.RidgeRegression.fit(RidgeRegression.java:116)
	at smile.regression.package$.$anonfun$ridge$1(package.scala:115)
	at smile.util.package$time$.apply(package.scala:67)
	at smile.regression.package$.ridge(package.scala:115)
	at Bug$.$anonfun$new$1(Bug.scala:15)


Code snippet

import smile._
import smile.data.formula._
import smile.regression._

import scala.language.postfixOps

object Bug extends App {

  val df = read.csv("bug.csv")
  (0 to 3).foreach { i =>
    println(s"Iteration: $i")
    val trainData = df.slice(i, 50 + i) // 50-row window starting at row i
    ridge("Label" ~, trainData, 1)      // lambda = 1
  }

}

Input data
bug.csv

Additional context

  • OpenJDK 21 on WSL2 and OpenJDK 11 on Ubuntu
  • Smile 3.1.1
@alexdupre
Author

After digging into the issue, it looks like the problem is caused by a feature column whose values are all equal.

In theory, this scenario should have been caught before calling the dpotrf function, here:

for (int j = 0; j < scale.length; j++) {
    if (MathEx.isZero(scale[j])) {
        throw new IllegalArgumentException(String.format("The column '%s' is constant", X.colName(j)));
    }
}

In practice, the constant check passed because the calculated standard deviation was NaN instead of a value that is zero within machine precision.

This snippet, simulating a vector of length 48 filled with the constant value 62571.43, shows the issue:

  import smile.math.MathEx

  val c = 62571.43
  val m = 48
  val column = Array.fill(m)(c)

  val sum = column.sum
  val mean = sum / m // 62571.43000000003
  val sumsq = column.map(v => v * v).sum // 1.8792882490775534E11
  val variance = sumsq / m - mean * mean // -4.76837158203125E-7

  val sd = Math.sqrt(variance) // NaN
  val isZero = MathEx.isZero(sd) // false

The number that gets passed to Math.sqrt is negative instead of zero because of floating-point arithmetic and the algorithm used by colSds, and so the result is NaN. I'm not sure what the correct way to fix the standard deviation function is for this scenario, but the algorithm implemented by the Commons Math library correctly returns a variance of 0 in this case: https://github.com/apache/commons-math/blob/e580cde5f77019bb1b60c195a38de56ca4f5dcb0/commons-math-legacy/src/main/java/org/apache/commons/math4/legacy/stat/descriptive/moment/Variance.java#L399-L425
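
For reference, here is a minimal Scala sketch of that corrected two-pass computation (an illustration of the Commons Math approach, not Smile's or Commons Math's actual code), applied to the same constant column:

  import scala.math.sqrt

  // Two-pass variance: compute the mean first, then average the squared deviations,
  // subtracting the correction term accum2^2 / n (accum2 would be 0 in exact arithmetic).
  // The correction cancels the gross rounding error, so the result for the constant
  // column is ~0 rather than -4.77E-7. Divides by n to match the snippet above.
  def twoPassVariance(xs: Array[Double]): Double = {
    val n = xs.length
    val mean = xs.sum / n
    var accum = 0.0   // sum of squared deviations from the mean
    var accum2 = 0.0  // sum of deviations from the mean
    for (v <- xs) {
      val d = v - mean
      accum += d * d
      accum2 += d
    }
    (accum - accum2 * accum2 / n) / n
  }

  println(twoPassVariance(Array.fill(48)(62571.43))) // ~0.0, not -4.76837158203125E-7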

@haifengl
Owner

haifengl commented Oct 3, 2024

Thanks for the deep dive! I have added a safeguard to the colSds computation. Please try the master branch to see if it fixes the issue.
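
(A sketch of the kind of safeguard described, not the actual commit: assuming it clamps a variance that rounding has pushed slightly negative back to zero before taking the square root.)

  // Hypothetical per-column guard: a variance driven slightly negative by rounding is
  // clamped to 0.0, so sqrt returns 0.0 (which MathEx.isZero catches) instead of NaN.
  def colSd(column: Array[Double]): Double = {
    val n = column.length
    val mean = column.sum / n
    val sumsq = column.map(v => v * v).sum
    math.sqrt(math.max(0.0, sumsq / n - mean * mean))
  }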

@alexdupre
Author

I haven't tried it yet, but the fix seems fine for my case (I don't know whether, with some data, the floating-point error could produce a small positive variance that is then not considered zero).

While I was experimenting with ridge regression I found another unexpected error: if I use a lambda equal to 0, to emulate a simple linear regression, I get another LAPACK error code from the same potrf function. Looking at the following code, it seems 0 should be a supported value:

for (int i = 0; i < p; i++) {
    if (lambda[i] < 0.0) {
        throw new IllegalArgumentException(String.format("Invalid lambda[%d] = %f", i, lambda[i]));
    }
}

@haifengl
Owner

haifengl commented Oct 4, 2024

My guess is that your data is (very close to) collinear so that potrf fails when lambda = 0.0.
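
To illustrate that guess with a toy example (plain Scala, no Smile calls): if two feature columns are proportional, the Gram matrix X'X is singular, so the Cholesky factorization that potrf computes fails, while adding lambda * I with lambda > 0 restores positive definiteness:

  val x1 = Array(1.0, 2.0, 3.0, 4.0)
  val x2 = x1.map(_ * 2.0) // perfectly collinear with x1

  def dot(a: Array[Double], b: Array[Double]) = a.zip(b).map { case (u, v) => u * v }.sum

  // 2x2 Gram matrix X'X of the two columns
  val g = Array(Array(dot(x1, x1), dot(x1, x2)),
                Array(dot(x2, x1), dot(x2, x2)))

  def det2(m: Array[Array[Double]]) = m(0)(0) * m(1)(1) - m(0)(1) * m(1)(0)

  println(det2(g)) // 0.0 -> X'X is singular, so potrf cannot factor it

  val lambda = 1.0
  val ridged = Array(Array(g(0)(0) + lambda, g(0)(1)),
                     Array(g(1)(0), g(1)(1) + lambda))
  println(det2(ridged)) // 151.0 > 0 -> X'X + lambda*I is positive definite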

@alexdupre
Author

It might be. My experiment was to port some code from scikit-learn to Scala (where I'm more proficient) and verify that it achieves similar results for the same datasets with the "same" algorithm. But it seems Smile is a bit pickier about the dataset, while sklearn accepted everything. I'll continue my tests.

@haifengl
Owner

haifengl commented Oct 4, 2024

In the case of lambda = 0.0, you should use OLS, which has a different way to handle collinearity. RidgeRegression is meant to handle collinear data with a non-zero lambda.
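
A sketch of that suggestion against the reproduction code above (assuming the ols helper in smile.regression's Scala API; argument defaults may vary by version):

  import smile._
  import smile.data.formula._
  import smile.regression._

  import scala.language.postfixOps

  val df = read.csv("bug.csv")
  val trainData = df.slice(0, 50)
  // Fit ordinary least squares instead of ridge(..., lambda = 0); per the comment
  // above, OLS handles (near-)collinear data in a different way.
  val model = ols("Label" ~, trainData)
  println(model)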
