IO function for loading ops.npy at different operation system #832

Tevin-yue · 2024-12-12T05:02:05Z

Feature you'd like to see:

Thank you very much for your work. I have some suggestions regarding adding functionalities to the I/O functions. Here's the scenario: the analysis is completed on Linux, and the post-processing is done on Windows. In this case, there are issues with load_ops, which is related to pathlib. Considering such a scenario, I think it's necessary to add functionality to accommodate the differences between Linux and Windows. You can refer to the following code for a solution.
'''

    class PosixPathToWindowsPathUnpickler(pickle.Unpickler):  
        def find_class(self, module, name):  
            if module == "pathlib" and name == "PosixPath":  
                return pathlib.WindowsPath  
            return super().find_class(module, name)  
      
    def load_with_custom_unpickler(file_path):  
        with open(file_path, 'rb') as f:   
            version = np.lib.format.read_magic(f)  
            np.lib.format._check_version(version)  
            shape, fortran_order, dtype = np.lib.format.read_array_header_1_0(f)  
            return PosixPathToWindowsPathUnpickler(f).load() 
    
    def load_ops(ops_path, device=None):
        if device is None:
            if torch.cuda.is_available():
                device = torch.device('cuda')
            else:
                device = torch.device('cpu')
    
        try:   
            ops = np.load(ops_path, allow_pickle=True).item()
        except Exception as e:  
            ops = load_with_custom_unpickler(ops_path).item()
    
        for k, v in ops.items():
            if k in ops['is_tensor']:
                ops[k] = torch.from_numpy(v).to(device)
        # TODO: Why do we have one copy of this saved as numpy, one as tensor,
        #       at different levels?
        ops['preprocessing'] = {k: torch.from_numpy(v).to(device)
                                for k,v in ops['preprocessing'].items()}
    
        return ops

’‘’

Secondly, I suggest generating a file that maps clusters to specific channels in the Kilosort results, indicating which channel each cluster belongs to. This would be quite useful for post-processing.

Thirdly, I have a minor question regarding the results: there appear to be very similar waveforms. I would like to ask if this is because Neuropixels inherently capture many such signals (which aren't observed in practice), or if it requires adjusting different parameters and re-sorting.

Lastly, I hope there could be a feature to extract all waveforms, not just a subset. This would also help in assessing the quality of the sorting. Just like this.

Additional Context

No response

The text was updated successfully, but these errors were encountered:

Tevin-yue · 2024-12-13T09:22:40Z

One more question: I have noticed that the corresponding cluster channel information remains consistent after using Phy2. Which one is more accurate?

jacobpennington · 2024-12-13T22:49:58Z

What errors are you encountering when using load_ops?
There is already a utility function for this, look at kilosort.data_tools.get_best_channel to see how the channel is determined.
I don't understand your question. Which signals are you saying are not observed? Waveforms within each cluster are expected to be quite similar.
There are already utility functions for this, see kilosort.data_tools, in particular: get_cluster_spikes, get_best_channel, and get_spike_waveforms. There is also an example of this in this tutorial: https://kilosort.readthedocs.io/en/latest/tutorials/plotting_example.html
Your "one more question:" I'm not sure what you're asking for the accuracy of. Which two things are you comparing?

Tevin-yue · 2024-12-15T02:49:04Z

Thank you very much for your prompt response. I apologize for not clarifying the issue earlier. Let me provide more details:

The issue arises when KS is run on Linux and the results are then viewed on Windows, resulting in the following error:

   NotImplementedError                       Traceback (most recent call last)
   Cell In[4], line 1
   ----> 1 ops = np.load(r'./test/ops.npy', allow_pickle=True)
   
   File E:\miniconda\envs\analysis\lib\site-packages\numpy\lib\npyio.py:456, in load(file, mmap_mode, allow_pickle, fix_imports, encoding, max_header_size)
       453         return format.open_memmap(file, mode=mmap_mode,
       454                                   max_header_size=max_header_size)
       455     else:
   --> 456         return format.read_array(fid, allow_pickle=allow_pickle,
       457                                  pickle_kwargs=pickle_kwargs,
       458                                  max_header_size=max_header_size)
       459 else:
       460     # Try a pickle
       461     if not allow_pickle:
   
   File E:\miniconda\envs\analysis\lib\site-packages\numpy\lib\format.py:800, in read_array(fp, allow_pickle, pickle_kwargs, max_header_size)
       798     pickle_kwargs = {}
       799 try:
   --> 800     array = pickle.load(fp, **pickle_kwargs)
       801 except UnicodeError as err:
       802     # Friendlier error message
       803     raise UnicodeError("Unpickling a python object failed: %r\n"
       804                        "You may need to pass the encoding= option "
       805                        "to numpy.load" % (err,)) from err
   
   File E:\miniconda\envs\analysis\lib\pathlib.py:962, in Path.__new__(cls, *args, **kwargs)
       960 self = cls._from_parts(args)
       961 if not self._flavour.is_supported:
   --> 962     raise NotImplementedError("cannot instantiate %r on your system"
       963                               % (cls.__name__,))
       964 return self
   
   NotImplementedError: cannot instantiate 'PosixPath' on your system

When operating within the same system, I have not encountered this problem. As previously mentioned, it seems to be related to the pathlib path issue.

Questions Q2, Q4, and Q5 are actually about channel issues. I noticed the use of kilosort.data_tools.get_best_channel and I understand it. In Phy, I reviewed the script it uses, which differs from what is used here (Phy uses tools from phylib, which seem to determine the cluster's channel based on amplitude rather than the template). After generating results with KS and simply loading them into Phy for viewing and saving, the number of channels for a cluster changes and can differ by one or two channels from those identified by get_best_channel. An example is as follows:

Therefore, in Q5, I would like to ask which method is more appropriate to rely on after post-processing in Phy.
In Q3, when detecting spikes in NP, I noticed many small positive spikes that differ slightly from the ideal waveforms (negative followed by positive), and the proportion seems a bit high. While NP can indeed detect signals like dendritic or even intracellular signals, such cases might exist. However, should a high proportion of these signals be considered unusual?

jacobpennington · 2024-12-18T19:21:29Z

Re: 3: are you using the default values for detection thresholds? Can you please upload kilosort4.log from the results directory?

jacobpennington · 2024-12-18T19:23:31Z

Re: channel numbers sometimes being different in KS4 versus Phy, use whichever one you decide looks more appropriate for your data. You should get similar results for both methods, but they will not always be identically. Especially if the cluster contains a lot of noise, in which case the channel assignment is not reliable.

Tevin-yue · 2024-12-19T01:23:44Z

Re 3: Thank you for your help. Attached is a KS run log file related to question 3. Additionally, I understand the issue regarding the channels, thank you for the explanation.
kilosort4.log

Tevin-yue changed the title ~~FEATURE: <Please replace this text with a comprehensive title>io improve~~ IO function for loading ops.npy at different operation system Dec 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IO function for loading ops.npy at different operation system #832

IO function for loading ops.npy at different operation system #832

Tevin-yue commented Dec 12, 2024

Tevin-yue commented Dec 13, 2024

jacobpennington commented Dec 13, 2024

Tevin-yue commented Dec 15, 2024

jacobpennington commented Dec 18, 2024

jacobpennington commented Dec 18, 2024

Tevin-yue commented Dec 19, 2024

IO function for loading ops.npy at different operation system #832

IO function for loading ops.npy at different operation system #832

Comments

Tevin-yue commented Dec 12, 2024

Feature you'd like to see:

Additional Context

Tevin-yue commented Dec 13, 2024

jacobpennington commented Dec 13, 2024

Tevin-yue commented Dec 15, 2024

jacobpennington commented Dec 18, 2024

jacobpennington commented Dec 18, 2024

Tevin-yue commented Dec 19, 2024