PCA example in docs throws a matrix index error (BoundsError) #179

Open
smartinsightsfromdata opened this issue Jan 27, 2022 · 3 comments

smartinsightsfromdata commented Jan 27, 2022

I ran the code from the example (with a minor change, as discussed in #166) and got a matrix index error.

This is the code (up to the error):

using MultivariateStats, RDatasets, Plots
plotly() # using plotly for 3D-interactive graphing

# load iris dataset
iris = dataset("datasets", "iris")

# split half to training set
Xtr = convert(Array,DataArray(iris[1:2:end,1:4]))'
Xtr_labels = convert(Array,DataArray(iris[1:2:end,5]))

# split other half to testing set
Xte = convert(Array,DataArray(iris[2:2:end,1:4]))'
Xte_labels = convert(Array,DataArray(iris[2:2:end,5]))

# suppose Xtr and Xte are training and testing data matrix,
# with each observation in a column

# train a PCA model, allowing up to 3 dimensions
M = fit(PCA, Xtr; maxoutdim=3)

# apply PCA model to testing set
Yte = MultivariateStats.transform(M, Xte)  # this is my change to avoid confusing usage of `transform` keyword

# reconstruct testing observations (approximately)
Xr = reconstruct(M, Yte)

The code above runs without error. The following block fails at the `Yte[:,Xte_labels.==...]` indexing expressions.

# group results by testing set labels for color coding
setosa = Yte[:,Xte_labels.=="setosa"]
versicolor = Yte[:,Xte_labels.=="versicolor"]
virginica = Yte[:,Xte_labels.=="virginica"]

This is the error for the first line:

BoundsError: attempt to access 2×4 Matrix{Float64} at index [1:2, 75-element BitVector]

Stacktrace:
 [1] throw_boundserror(A::Matrix{Float64}, I::Tuple{Base.Slice{Base.OneTo{Int64}}, Base.LogicalIndex{Int64, BitVector}})
   @ Base ./abstractarray.jl:691
 [2] checkbounds
   @ ./abstractarray.jl:656 [inlined]
 [3] _getindex
   @ ./multidimensional.jl:838 [inlined]
 [4] getindex(::Matrix{Float64}, ::Function, ::BitVector)
   @ Base ./abstractarray.jl:1218
 [5] top-level scope
   @ In[11]:2
 [6] eval
   @ ./boot.jl:373 [inlined]
 [7] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196
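
The BoundsError above reports a 2×4 matrix being indexed with a 75-element BitVector. A logical index along a dimension needs exactly as many entries as that dimension has, so the shapes can be checked before indexing. The following is a minimal sketch using stand-in data only; `Y`, `Y_ok`, `labels`, and `compatible` are hypothetical names, and it does not say why `Yte` came out as 2×4 in the report.

```julia
# Stand-in data (hypothetical): a 2×4 matrix like the one named in the error,
# and 75 labels like the testing half of iris.
Y = rand(2, 4)
labels = repeat(["setosa", "versicolor", "virginica"], inner=25)

mask = labels .== "setosa"                 # 75-element BitVector

# A logical index must match the length of the dimension it indexes.
compatible = size(Y, 2) == length(mask)    # false: 4 columns vs 75 labels

# Index only when the shapes agree; Y[:, mask] would throw a BoundsError here.
setosa = compatible ? Y[:, mask] : nothing

# With one column per label, the same expression succeeds.
Y_ok = rand(2, 75)
setosa_ok = Y_ok[:, mask]                  # keeps the 25 columns labelled "setosa"
```

Comparing `size(Yte)` with `length(Xte_labels)` right after the transform step shows whether the observations ended up in columns as the example expects.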
wildart (Collaborator) commented Jan 27, 2022

`transform` is deprecated; please use `predict` instead. See the updated example here: https://juliastats.org/MultivariateStats.jl/dev/pca/#Example

smartinsightsfromdata (Author) commented

@wildart `predict` does not work for me.
I'm on Julia 1.7.1 and MultivariateStats v0.8.0.

Using

Yte = predict(M, Xte)

I get

MethodError: no method matching predict(::PCA{Float64}, ::LinearAlgebra.Adjoint{Float64, Matrix{Float64}})
Closest candidates are:
  predict(::Discriminant, ::AbstractMatrix) at ~/.julia/packages/MultivariateStats/HTpHt/src/lda.jl:26

Stacktrace:
 [1] top-level scope
   @ In[9]:1
 [2] eval
   @ ./boot.jl:373 [inlined]
 [3] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196

If I use `Yte = MultivariateStats.transform(M, Xte)` instead, it works.
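
One way to read the MethodError: the only candidate listed takes a `Discriminant` as its first argument and any `AbstractMatrix` as its second. The sketch below, using stand-in random data rather than iris, checks that a transposed matrix already satisfies the second slot. That suggests the mismatch is in the first argument, so the installed version appears to have no `predict` method for `PCA`.

```julia
using LinearAlgebra

X = rand(75, 4)    # stand-in for the 75×4 numeric block of the data frame
Xt = X'            # lazy transpose, as produced by the trailing ' in the code above

# The wrapper type matches the one named in the MethodError.
is_adjoint = Xt isa Adjoint{Float64, Matrix{Float64}}

# It is still an AbstractMatrix, so it fits the second argument of the candidate.
is_abstract_matrix = Xt isa AbstractMatrix

dims = size(Xt)    # (4, 75): one observation per column
```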

wildart (Collaborator) commented Jan 27, 2022

Please update; a new version has been published.
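
After updating (for example with `using Pkg; Pkg.update("MultivariateStats")` in a fresh session), the example from this thread might look like the sketch below. It assumes the installed release provides `predict` for `PCA`, which the thread reports v0.8.0 does not. The `Matrix(...)` and `Vector(...)` conversions are an assumption standing in for the `convert(Array, DataArray(...))` calls quoted earlier; they are not taken from the thread.

```julia
using MultivariateStats, RDatasets

# load iris dataset
iris = dataset("datasets", "iris")

# odd rows for training, even rows for testing; observations end up in columns
Xtr = Matrix(iris[1:2:end, 1:4])'
Xtr_labels = Vector(iris[1:2:end, 5])
Xte = Matrix(iris[2:2:end, 1:4])'
Xte_labels = Vector(iris[2:2:end, 5])

# train a PCA model, allowing up to 3 dimensions
M = fit(PCA, Xtr; maxoutdim=3)

# apply the model to the testing set with predict (replaces the deprecated transform)
Yte = predict(M, Xte)

# reconstruct testing observations (approximately)
Xr = reconstruct(M, Yte)

# the label mask is only valid when there is one column per label
@assert size(Yte, 2) == length(Xte_labels)
setosa = Yte[:, Xte_labels .== "setosa"]
versicolor = Yte[:, Xte_labels .== "versicolor"]
virginica = Yte[:, Xte_labels .== "virginica"]
```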
