Skip to content
This repository has been archived by the owner on Oct 30, 2024. It is now read-only.

Improve analysis #49

Merged
merged 126 commits into from
May 5, 2023
Merged

Improve analysis #49

merged 126 commits into from
May 5, 2023

Conversation

lauraporta
Copy link
Member

@lauraporta lauraporta commented Apr 20, 2023

@openai: ignore
☝🏻stops future reviews of the bot

👉🏻 @ Future human reviewer

I am experimenting using a chatgpt bot to have PR reviewed, while waiting for copilot X to arrive.

The bot is not able to process files that are too large, and unfortunately those are the core of the PR.
These files are:

  • spatial_freq_temporal_freq.py
  • photon_data.py

So no summary and no comments on the code are provided for these two files. Therefore I have updated the descriptions of the bot filling them up with the missing info.

I noticed also that applies comments relevant to one file in the wrong file. I'll see if I can move them around. Sorry about that.

⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️
N.B.: a previous PR (#25) had many relevant suggestions that are still applicable to this one. They are not addressed in this PR since they require extensive refactoring. This PR is mostly focused on adding new functionalities and writing better tests.
⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️

Finally, these changes where actually done while developing the dashboard, which you can find in the commits. Please ignore any changes related to it. I have split the changes related to the analysis and put them all in this PR to keep things cleaner.

Laura's summary

Broad description of the main changes:

  • The application can now be asily started by running python3 demo_cly.py in your environment. app.py substitutes what main.py was and now allows us to save the analysis result as an object to be then used by the dashboard.
  • The analysis class is now called FrequencyResponsiveness and used with the name responsiveness for short when instantiated.
  • Anything that can be considered "data", analysed or not, is stored in responsiveness.data which is a PhotonData object. Anything else in responsiveness is a method that takes care of one bit of the analysis. On instantiation, FrequencyResponsiveness has the attribute data evaluated.
  • PhotonData has been changed to accomodate this changes. Now all the attributes that will be filled by the analysis are instantiated at the creation of the PhotonData object. Now in the docstring of the class there is a list of all possible attributes. I added some checks to evaluate the internal consistency of the datasets with expected resutls.
  • the analysis pipeline is now triggered with __call__, by simply calling the instantiated object: responsiveness(). The analysis pipeline consists of:
    1. calculating statistics (not changed)
    2. fitting the response to a gaussian and storing the values in data. The methods dealing with this part are what has been added to the class.
  • The methods in the part 2 of the analysis call methods contained in gaussian_calculations.py. They are static methods describing the gaussian equation in use and the way to fit the data to it.
  • Tests have been improved by adding a script to generate expected outputs in order to be compared with the tests results. The results are stored in a folder called mock_data/. This new feature required changes in the tests, in conftest.py and the creation of a shared set of helper methods in generate_mock_data.py.
  • I am starting to document the comparisons that I have made with my code and the original matlab codebase in the docs.
  • There are overall various refactorying changes and a more widespread use of typing. Some docstrings have been improved.

Summary by OpenAI

Release Notes:

  • New Feature: Added fitting parameters to the config file and modified some existing parameters such as fps_tree_photon, spatial_frequencies, temporal_frequencies, directions, n_spatial_frequencies, and n_temporal_frequencies.
  • Refactor: Improved the structure of the application by replacing the main function with a new app.py file that allows the application to be started by running python3 demo_cly.py.
  • Documentation: Improved the documentation of classes and their methods. It adds more details to the docstrings, clarifies some comments, and fixes typos.
  • Test: Added unit tests for Gaussian-related methods in gaussians_calculations.py.

Changes abound, code quality improved,
Bugs squashed, docs clarified, tests approved,
With new features added, and refactorings too,
This PR brings joy, and we thank you! 🎉👏

@github-actions
Copy link

github-actions bot commented Apr 20, 2023

🤖 OpenAI

High-level Summary

The changes made to the codebase include improvements in logic, security, performance, data races, consistency, error handling, maintainability, modularity, complexity, and optimization. The changeset includes modifications to various files such as .gitignore, MANIFEST.in, demo_cli.py, app.py, config_switches.py, load_data.py, data_raw.py, parser2p.py, parser2pRSP.py, and test files. The changes involve adding new parameters to the config file, refactoring functions, improving documentation, adding unit tests, and modifying the application's structure to improve its usability.

Files Summary

File Changes
.gitignore, setup_for_demo.py Added *.pickle and *.obsidian to .gitignore file. Added fitting parameters to the config file and modified some existing parameters such as fps_tree_photon, spatial_frequencies, temporal_frequencies, directions, n_spatial_frequencies, n_temporal_frequencies, and n_directions.
MANIFEST.in Added two lines to include documentation files in the distribution package. Removed a line that includes YAML files and does not exclude .md and .png files from being included.
demo_cli.py, rsp_vision/main.py Added import statement and calls analysis_pipeline() function if it exists. The main function has been removed and replaced with a new app.py file that allows the application to be started by running python3 demo_cly.py. The analysis class is now called FrequencyResponsiveness and used with the name responsiveness for short when instantiated. The PhotonData object has been changed to accommodate these changes.
docs/reproduce_matlab_output.md Added proof of accurate MATLAB output reproduction and comparison with Python implementation. Handles NaN values better in the new implementation.
rsp_vision/console_application/app.py Added a decorator to handle exceptions, a function to start logging, and a function to get the name of the module for logging purposes. The main function now uses rich.prompt to get input from the user instead of command-line arguments. Imported new modules and classes, such as logging, pickle, rich, and PhotonData.
rsp_vision/load/config_switches.py, rsp_vision/load/load_data.py Refactored get_fps function to include type hints and remove unused ascii_array_to_string function. The read function is removed and replaced with read_config_file. The yaml module is imported. The load_data function now calls a new function load_data_from_filename which replaces the old load function. A new function read_config_file is added to read the configuration file.
rsp_vision/objects/data_raw.py, rsp_vision/objects/file.py, rsp_vision/objects/folder_naming_specs.py Improved the documentation of classes and their methods. It adds more details to the docstrings, clarifies some comments, and fixes typos.
rsp_vision/objects/parsers2p/parser2p.py, rsp_vision/objects/parsers2p/parser2pRSP.py The changes in parser2p.py involve minor modifications to the docstrings and comments. The changes improve readability and do not introduce new functionality or modify existing logic. The diff updates the docstrings and formatting of Parser2pRSP class in parser2pRSP.py. It also updates the comments and formatting of some methods.
tests/test_integration/calculate_outputs.py, tests/test_integration/generate_mock_data.py, tests/test_integration/test_photon_data.py, tests/test_integration/test_sf_tf.py, tests/test_integration/test_photon_data.py, tests/test_integration/test_sf_tf_static_methods.py Minor changes to add fixtures, rename variables, and use fixtures instead of functions for testing purposes. No issues with logic, security, performance, data races, consistency, error handling, maintainability, modularity, complexity, or optimization were found.
tests/test_unit/test_gaussian_related_methods.py Added unit tests for Gaussian-related methods in gaussians_calculations.py. Tests include functions to fit a 2D Gaussian to data, create a Gaussian matrix, and perform a single Gaussian fit.

Chat with 🤖 OpenAI Bot (@openai)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @openai in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Files ignored due to filter (1)

Ignored files

  • tox.ini
Files not summarized due to errors (5)

Failed to summarize

  • rsp_vision/analysis/gaussians_calculations.py (diff tokens exceeds limit)
  • rsp_vision/analysis/spatial_freq_temporal_freq.py (diff tokens exceeds limit)
  • rsp_vision/objects/photon_data.py (diff tokens exceeds limit)
  • tests/test_integration/generate_mock_data.py (diff tokens exceeds limit)
  • tests/test_integration/test_sf_tf.py (diff tokens exceeds limit)
Files not reviewed due to simple changes (23)

Skipped review

  • .gitignore
  • MANIFEST.in
  • demo_cli.py
  • docs/reproduce_matlab_output.md
  • rsp_vision/console_application/app.py
  • rsp_vision/load/config_switches.py
  • rsp_vision/load/load_data.py
  • rsp_vision/load/read_config.py
  • rsp_vision/main.py
  • rsp_vision/objects/data_raw.py
  • rsp_vision/objects/file.py
  • rsp_vision/objects/folder_naming_specs.py
  • rsp_vision/objects/parsers2p/parser2p.py
  • rsp_vision/objects/parsers2p/parser2pRSP.py
  • rsp_vision/plots/plotter.py
  • rsp_vision/utils.py
  • setup_for_demo.py
  • tests/conftest.py
  • tests/test_integration/calculate_outputs.py
  • tests/test_integration/test_photon_data.py
  • tests/test_integration/test_sf_tf_static_methods.py
  • tests/test_unit/test_gaussian_related_methods.py
  • tests/test_unit/test_utils.py

@lauraporta lauraporta marked this pull request as ready for review April 20, 2023 14:11
@lauraporta lauraporta linked an issue Apr 20, 2023 that may be closed by this pull request
Copy link
Member

@JoeZiminski JoeZiminski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lauraporta excellent work! This looks very clean and thoroughly tested. The demo example run well and all tests pass locally. Only minor suggestions. At some point would be nice to talk about the method (e.g. Andermann et al., 2011.) as looks interesting.

docs/reproduce_matlab_output.md Show resolved Hide resolved
@@ -30,7 +31,7 @@ def __init__(self, data: dict, is_allen: bool = True):
self.neuropil_coeficient = self._unpack_data(data["r_neu"], data)
logging.info("Unpacked r_neu")

self.stim = self._unpack_data(data["stim"], data)
self.stim: np.ndarray = self._unpack_data(data["stim"], data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice typing, I am not very sure about the difference between using the typing module or numpy.typing module and the standard types (e.g. typing.Dict vs. dict), np.ndarray vs. numpy.typing.NDArray). However down the line it might be useful to be using the typing module type hints. I think they let you do a bit more (e.g. NDArray[np.integer] or typing.Dict[str, Dict[str, int]] (e.g.). (though this might be possible with pyton standard types). But they don't seem to have everything covered yet, like str or int

rich.print("Something went wrong 😱")
logging.exception(e)

return inner_function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

# pipeline draft
start_logging()

# TODO: add TUI or GUI fuctionality to get input from user
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have Adam in my brain, this could go into a new issue

logging.info("Analysis finished")
logging.info(f"Updated photon_data object: {photon_data}")

with open(f"{folder_name}_data.pickle", "wb") as f:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I understand correctly this saves the results in the same directory as the original data file. It it makes sense to, down the line this could use the SWC-Blueprint setup to make a new derivatives folder where all output is saved. At some point it would be cool to discuss how this pipeline can be generalised it seems like a lot is here for 2p analysis

return result


def fit_2D_gaussian_to_data(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider for readability ordering the functions in the file in their call order (unless I have missed something) fit_2D_gaussian_to_data > single_fit > create_gaussian_matrix > elliptical_gaussian_andermann

An array containing the spatial frequencies of the stimuli.
tfs : np.ndarray
An array containing the temporal frequencies of the stimuli.
response_matrix : np.ndarray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could maybe give a tiny bit more detail here e.g. the dimension sizes (in terms of the variables) e.g. is response_matrix a sfs by tfs size matrix? (i.e. the fit 2D gaussian of this matrix is a joint distribution on the probability of the neuron firing at some given spatial frequency, temporal frequency?

𝜻_power_law_exp = 0.5

# Expected output
expected_output = 0.9999000049998332
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any way to test over a range of values? I guess not without re-implementing the function. I wonder if a couple of simple / extreme cases could be added for full covering. Its not super-important though

), "Optimized parameters above upper bounds"


def test_fit_2D_gaussian_to_data(parameters_to_fit):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to also do something like create a 2D multivariate gaussian with known means, covariances and check that the fit matches the pre-set means, covariances? This could provide a bit more flexibility on the input ranges to test. But, maybe it is not as simple as this to to the particulars of this fitting function, specialised for this use case

@@ -0,0 +1,45 @@
import pathlib
from functools import lru_cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test_data folder could be moved to a folder called tests/data

@lauraporta
Copy link
Member Author

Joe, thanks for all your comments and suggestions. Right now I would not address them, I add them to the backlog of the possible improvements that I could do in the future.

@lauraporta lauraporta merged commit d585bf7 into developement May 5, 2023
@lauraporta lauraporta deleted the improve-analysis branch May 9, 2023 09:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Analysis behind murakami plot
2 participants