Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Update logging to allow end-user logging level #784

Open
tabedzki opened this issue Sep 13, 2024 · 12 comments
Open

Discussion: Update logging to allow end-user logging level #784

tabedzki opened this issue Sep 13, 2024 · 12 comments
Assignees
Labels
enhancement New feature or request

Comments

@tabedzki
Copy link
Contributor

tabedzki commented Sep 13, 2024

Feature you'd like to see:

Current Situation

In run_kilosort.py, setup_logger defines the root logger via logging.basicConfig(). Python's logging guide suggests that the loggers typically be configured by the end user and not by libraries themselvs (relevant snippets posted below).

Information Seeking/Understanding

  • What was the original intention of adding hard-coding the handler?
    • Was Kilosort intended to be run as a module only and that is why the logger was set up as such?
  • Would you be open to refactoring the logger to provide more control to the end user? The current approach can lead to multiple outputs (see Example 1 below).
    • I'd be happy to open a PR if you find this useful.

Kilosort's interactions with external loggers

I've been able to generate small tests that can demonstrate the ability for kilosort4 to still log to its file handler while allowing the end user to control the root logger and other logging features.

Relevant snippets from Logging Guide

Note: It is strongly advised that you do not log to the root logger in your library. Instead, use a logger with a unique and easily identifiable name, such as the name for your library’s top-level package or module. Logging to the root logger will make it difficult or impossible for the application developer to configure the logging verbosity or handlers of your library as they wish.

Note: It is strongly advised that you do not add any handlers other than NullHandler to your library’s loggers. This is because the configuration of handlers is the prerogative of the application developer who uses your library. The application developer knows their target audience and what handlers are most appropriate for their application: if you add handlers ‘under the hood’, you might well interfere with their ability to carry out unit tests and deliver logs which suit their requirements.

Additional Context

https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library

@jacobpennington
Copy link
Collaborator

Thanks for this. Yes, please open a pull request and I'll make tweaks if needed. I'm aware of the recommendation to not use the root logger, but I was unable to find a combination of settings that achieved what I wanted while still including error messages from other libraries.

@jacobpennington jacobpennington self-assigned this Sep 13, 2024
@jacobpennington jacobpennington added enhancement New feature or request labels Sep 13, 2024
@tabedzki
Copy link
Contributor Author

Before I open a PR, what is the use case for these logs? Are you looking for these logs to be generated while using the app in the GUI mode or are you intending this to be done when other people use kilosort in library mode (as demonstrated in my examples)?

@jacobpennington
Copy link
Collaborator

The same log needs to be generated whether run through the GUI or using the API.

@tabedzki
Copy link
Contributor Author

Since the GUI could have some GUI related errors that don't appear in the API version of the code, should those also be logged in the kilosort4.log or should those be written to a different log?

If they are meant to be in the same log file, then based off the structure, I'd suggest refactoring the code to remove the basicConfig from where it is currently initialized and instead have it set up in the GUI main.py since that becomes the entry point of the standalone application. Therefore, it won't setup the root logger when another person uses Kilosort as a library.

So then each file would have near the import statements

import logging 
logger = logging.getLogger(__name__)

and the gui.py would then define the log handlers.

I'll get around to making a first draft for this PR next week.


How long do the integration tests take to run? I am curious if they would be a good place to test the logging as part of development.

@tabedzki
Copy link
Contributor Author

On reflection, could you specify what you wanted you wanted to achieve in terms of logging at a high level? Knowing what you were looking to achieve would help me for the PR.

@jacobpennington
Copy link
Collaborator

As for GUI-specific errors, it would be most convenient if they go in the same log, but they could also go in a separate log if it makes things difficult. GUI errors are usually much easier for me to debug, so I'm not as worried about that.

The tests are fairly quick on GPU, they just take 1-2 minutes for me (or ~10 seconds if you skip the full pipeline test). You just have to make sure to add the appropriate flags. If you're adding some separate tests that only interact with logging, you probably don't need the full pipeline. So you could do pytest --gpu to just run the fast ones.

The general goals with logging were:

  1. Make it automatic. There should always be a kilosort4.log file generated any time someone uses kilosort.run_kilosort or clicks the "Run" button in the GUI, without the user needing to do anything.
  2. Include the full traceback for errors from Kilosort4 and errors from 3rd party packages (hence the big try-except statement in run_kilosort and gui.sorter).
  3. Send info level statements (and above) to both console and log, send debug level statements to log only.
  4. Exclude debug level spam from 3rd party packages like numba and matplotlib, which flood the log file with a lot of unnecessary information otherwise.

@tabedzki
Copy link
Contributor Author

So I've played around with trying to get all the requirements working and believe I have been able to get something that addresses most of the points without using the root logger by instead using a module based logger called kilosort or just __package__. This works nicely as anything defined for the package logger overall is also defined for subloggers when doing logger = logging.getLogger(__name__) at the top of each file. I haven't addressed the GUI portion yet but one should be able to define the QtHandler as a handler using dictConfig. I'll have to get back to you on this.

I have been able to address the double traceback to the console. By attaching the console to the root logger only and setting the kilosort logger to propagate, you can get all the messages written to the console as well as the file while handling the different logging levels as we had talked about before. If we leave the propagate as True during the logger.exception() then we have a double traceback because one traceback comes from the logger and the other comes from the interpreter when it receives raisewithout any error handling. If we turn setpropagate=Falseright before thelogger.exception()and revert it to the old value right after, then the first traceback does not propagate up to the root logger/console and the only error that appears if you do not handle theraise`.

What are your thoughts on this workaround to prevent the double console? If you would like, I can share the test repo that I've been using to address/test the different logging configurations.

@jacobpennington
Copy link
Collaborator

Are you saying we would change every logger.debug(...) call to something like this?

set_propagate(False)  # pseudo-code
logger.debug(...)
set_propagate(True)

If so, I would rather not use that work-around. It adds a lot of extra mess to the code, and either myself or future devs will likely forget to do that for log statements that get added later which could get confusing. I think having double tracebacks is preferable to that.

@tabedzki
Copy link
Contributor Author

Hi Jacob. Since the double output only occurs when the program (A) logs the exception, (B) raises the error that then (C) is not handled by the user/program that called the function that raised the error. This occurs in only two places run_kilosort() and KiloSortWorker.run(). You would not have to do this on any logger.debug() or any other logger.exception() unless you also raise the error handled back to the interpreter.

Here is an example repo that seems to tick all the boxes specified above. I haven't written up the GUI logger yet. It seems that it is doable using a combination of the logging cookbook and this video (around 13 minutes) that explains how to use your own class in the logger.

@jacobpennington
Copy link
Collaborator

Sorry for the delayed response, I've been busy working on other problems. Yes, that looks like it does what it would need to do.

@tabedzki
Copy link
Contributor Author

Sorry for my delayed response Jacob. I am glad you like the idea. I'll try to implement it later this week or next week.

@jacobpennington
Copy link
Collaborator

Hello again,

I ended up adjusting the way we're handling logging in the most recent commit to deal with a separate issue, and figured out the missing piece for getting this to work the way I tried before. When you get a chance, can you please try pulling the latest changes and see if they fix the logging problems you were having with your use-case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants