GridsearchCV and pipeline: input dimensionality #263
Currently, basis objects assume a single input, and addressing this issue is a work in progress. The current work-around is to define a basis in
Hi, sorry for the delay with this fix, but we were in the middle of improving the basis module structure. The other, simpler alternative is to install the branch directly from your environment.

You should update the installation once we release the new version of nemos with these fixes incorporated. Below is how to fix your script with the new `TransformerBasis` changes:

```python
import nemos
import pandas as pd
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

# generate some data for illustration
spike_dat = np.random.poisson(size=(20, 1000))

# region Hyperparameter tuning
num_bases = 10
filter_size = 100
neurons_slice = np.arange(10)
time_vec_cut_index = 10000
glm_neuron_id = 0
print(f"Number of basis functions: {num_bases}")

# new basis class name (now, distinct classes for conv and eval)
basis = nemos.basis.RaisedCosineLinearConv(
    n_basis_funcs=num_bases, window_size=filter_size
)
transformer_basis = basis.to_transformer()

neuron = 15
print(f"{neuron} Neurons considered = {neurons_slice[0:neuron]}")
spike_counts = spike_dat[neurons_slice[0:neuron], :time_vec_cut_index].T

# must tell the transformer how many inputs the basis has to process;
# you can pass the number of inputs or the input array directly
transformer_basis.set_input_shape(spike_counts)

train_spike_counts = spike_counts[0:int(len(spike_counts) * 0.7), :]

pipeline = Pipeline(
    [
        (
            "transformerbasis",
            transformer_basis,
        ),
        (
            "glm",
            nemos.glm.GLM(
                regularizer_strength=0.5,
                regularizer="Ridge",
                solver_kwargs={"verbose": True},
            ),
        ),
    ]
)

param_grid = dict(
    glm__regularizer_strength=(0.1, 0.01, 0.001, 1e-5),
    transformerbasis__n_basis_funcs=(5, 10, 15, 20),
)

gridsearch = GridSearchCV(
    pipeline,
    param_grid=param_grid,
    cv=2,
)
gridsearch.fit(train_spike_counts, train_spike_counts[:, glm_neuron_id].flatten())

cvdf = pd.DataFrame(gridsearch.cv_results_)
cvdf_wide = cvdf.pivot(
    index="param_transformerbasis__n_basis_funcs",
    columns="param_glm__regularizer_strength",
    values="mean_test_score",
)
```
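The grid-search-over-a-pipeline pattern above is independent of nemos. As a minimal, self-contained sketch of the same idea, here is the equivalent setup with plain scikit-learn estimators standing in for the basis transform and the GLM (all names here, such as `pipe` and `search`, are illustrative and not from the thread):

```python
# Hypothetical sketch: StandardScaler plays the role of the basis transform,
# Ridge plays the role of the GLM; the double-underscore parameter-naming
# convention ("<step>__<param>") is exactly what the nemos example relies on.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.poisson(size=(200, 10)).astype(float)  # stand-in for spike counts
y = X[:, 0]  # predict one "neuron" from all counts, as in the issue

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])
search = GridSearchCV(
    pipe,
    param_grid={"ridge__alpha": (0.1, 0.01, 0.001)},
    cv=2,
)
search.fit(X, y)
print(search.best_params_)
```

`GridSearchCV` clones the pipeline for each parameter combination and calls `set_params` with the `step__param` keys, which is why `transformerbasis__n_basis_funcs` and `glm__regularizer_strength` work in the nemos example once the transformer exposes those parameters.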
@FlyingFordAnglia let me know if that's working well for you, and thank you for bringing this up!
Hi! I am trying to fit a GLM to some spiking data from a bunch of neurons. My design matrix is the binned spike counts of all neurons, and my 'y' is the spike counts of the neuron I am interested in. Before fitting the GLM, I wanted to run a grid search for hyperparameter tuning.
When I run the attached code, I get the following error:
From what I can gather, it appears that the fit_transform method that GridSearchCV uses internally expects a design matrix with a single column, not an (n_samples, n_features) matrix. How can I get this to work?
My installed nemos version is 0.1.6, and my sklearn version is 1.5.0.
A tangential question: How do I integrate batch gradient descent with this pipeline?
Any help would be appreciated, thanks!