Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fan: shutdown printer when a fan fails #6783

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

onnlucky
Copy link

@onnlucky onnlucky commented Jan 10, 2025

This PR shuts down the printer with a message when a fan with a tachometer doesn't read a fan speed when it should.

Basically, if we are at least ten seconds after the fan was last scheduled to turn on, and the rpm is zero, then it reports an error.

@onnlucky onnlucky force-pushed the feature/fan-failure branch 2 times, most recently from 5323d1c to 59a7eec Compare January 10, 2025 20:54
@Sineos
Copy link
Collaborator

Sineos commented Jan 11, 2025

Thank you for submitting a PR, please be aware that you need to sign off to accept the developer certificate of origin, please see point 3 in master/docs/CONTRIBUTING.md#what-to-expect-in-a-review

In addition you might want to cross-check the discussion here: #6429

if rpm > 0:
return
msg = "Fan '%s' Failure" % (self.name,)
self.printer.invoke_shutdown(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it would be much better to have another Fan parameter which is whether to actually shutdown the print based on the tacho feedback discrepency.

There's likely people out there with defective fans that don't want their prints to crash just because the tacho feedback doesn't work 100% reliably (or that have their fans misconfigured as tacho style when they aren't).

Default behaviour should be consistent with current behaviour (i.e. no shutdown), and should be opt-in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's likely people out there with defective fans that don't want their prints to crash just because the tacho feedback doesn't work 100% reliably

IMO, this is a very strange reasoning.
Honestly, if you want a tach signal, you want to secure critical components. Adding a tach signal to ramshackle hardware just to have "something that does not work 100%" displayed in a web interface seems nonsensical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this is a very strange reasoning.

Are you suggesting there is no one out there like this?
They would currently have a printer that works 'reliably' (i.e. their prints succeed).
This PR proposal as it stands will start having their prints fail.. after just upgrading Klipper... but for what gain?

When introducing behaviour that would result in issues for these people, then the behaviour should be opt in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the argument to make this opt-in, because it has the potential to shutdown printers that are currently working. However, it is not like a printer that can reliably do multi-hour prints today, will now fail in the second hour. It will show the error, and the fan in question, probably within 10 minutes, if not immediately.

On the other hand, how many working printers are there today, for which a fan will break within the next year? For those owners, after an upgrade, their printer will error, instead of potentially damaging itself.

And it would seem to me that klipper prefers safety. It won't make you do safe moves when not homed. Or fail if a hotend heats outside of expectations. Etc.

Also, to opt out of this new error behavior, the tachometer can simply be removed from the config. What use is a glitching or non functional tachometer anyhow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all tacho fans are critical though. If I've got Klipper running two enclosure fans with tachometers, and I decide to replace one of them mid-print... this shouldn't be an issue right? since it's not necessarily a critical fan.
But I still want to have tacho feedback, because I did want some ability to monitor if the fan has failed, even if that monitoring isn't intended to cause a panic and an immediate shutdown.
Ditto with...
https://www.klipper3d.org/Config_Reference.html#controller_fan
https://www.klipper3d.org/Config_Reference.html#temperature_fan
https://www.klipper3d.org/Config_Reference.html#fan_generic

Lots of people have likely configured tachometer fans just because they like to see the RPM.. not because the RPM is ultra critical to their successful printing. Forcing this RPM to now be ultra critical seems like a bad move.

I get that you want to have RPM monitoring shutdown your printer in certain situations. And having this as opt-in would entirely allow that. Having it opt-in would also leave everyone else unimpacted unless they wanted to.

A new parameter which could be tachometer_fault_action could be a good option, that could have options like none (default), warn, shutdown
That way people that don't care are unimapcted, people that care a little can configure it to produce warnings on discrepencies, people that care a lot can have it shutdown the print on discrepencies.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points. And thanks for the feedback.

I've added that tachometer_fault_action, with None as default option. Not exactly sure how to handle the warning case. I'm now logging via import logging, that is what other code seems to do. And to prevent total spamming, it only warns once per minute. Could also do that only once. Or only once per time the speed goes from 0 to some value.

However, I would prefer the default to be shutdown. WDYT?

Also, I haven't tested this version yet. It is very time consuming for me to run this code and testing the options, as I have only one printer, and it doesn't even run mainline. Is there not a way to build and run unit tests for klippy?

(Also, don't disconnect things while running, unless designed to be hot pluggable, or chances are good it will cause damage.)

@onnlucky onnlucky force-pushed the feature/fan-failure branch from 89a5361 to dd267f6 Compare January 11, 2025 19:56
@onnlucky onnlucky changed the title shutdown printer if a fan fails fan: shutdown printer when a fan fails Jan 11, 2025
@onnlucky onnlucky force-pushed the feature/fan-failure branch from 37335d2 to d4bcff9 Compare January 12, 2025 10:41
@KevinOConnor
Copy link
Collaborator

Thanks. We can certainly add a shutdown safety check for tachometers.

A couple of notes on the implementation:

  • It's not valid to perform a safety check from the get_status() method as nothing assures that method is called regularly. It gets called upon request from a GUI (the API server) or from a macro. So, if those sources don't request a fan status then the safety check wouldn't be run.
  • The fan speed requests are queued well in advance of the reception of tachometer reports. It isn't immediately clear to me that this code handles that correctly (for example, will it incorrectly shutdown if a fan enable is queued far in the future, or will it correctly shutdown if there are a constant stream of fan speed changes being queued). Can you briefly explain how the code is handling these types of corner cases?
  • I think it would be preferable for the safety checking code to be in the FanTachometer class and not the Fan class. (To get access to the requested fan speed, the Fan class can call a new self.tachometer.note_fan_speed() on speed changes.)
  • For what it is worth, the tachometer_fault_action config options seems a little unwieldy to me. Perhaps implement a tachometer_minimum_rpm option instead (and a setting of zero would disable shutdowns)?
  • I agree it would be useful to disable shutdowns for fans. For what it is worth, I don't have a preference if the default is to shutdown or the default is to not shutdown. If it is felt the default should be to shutdown then Config_Changes.md should state that, and if the default is not changing then we would not add a new entry to Config_Changes.md.

Cheers,
-Kevin

@onnlucky
Copy link
Author

  • It's not valid to perform a safety check from the get_status() method as nothing assures that method is called regularly. It gets called upon request from a GUI (the API server) or from a macro. So, if those sources don't request a fan status then the safety check wouldn't be run.

Makes sense.

  • The fan speed requests are queued well in advance of the reception of tachometer reports. It isn't immediately clear to me that this code handles that correctly (for example, will it incorrectly shutdown if a fan enable is queued far in the future, or will it correctly shutdown if there are a constant stream of fan speed changes being queued). Can you briefly explain how the code is handling these types of corner cases?

It tracks the scheduled print time. And when verifying, it also uses print times. In this last version, it actually uses the mcu's counter_state update event time (in print time). So unless the MCU is running behind its own schedule, this should work. (Right?)

It now only tracks when the fan first turns on. And allows for 10 seconds of grace time for the fan to start.

There are corner cases this wont catch, indeed. Like the moment the speed is scheduled to become 0, there will be no more checking until after that time. And if the fan is turning on and off in less the grace period, it will never check. We can probably make that grace period a lot smaller, I haven't experimented with that too much yet.

With a lot more code, we could make a queue of times the fan should be on or off, and when a counter events come in, check where we should be in that queue. But that is probably overkill. And will still not be able to catch error if fans are turned on for less than the grace period.

  • I think it would be preferable for the safety checking code to be in the FanTachometer class and not the Fan class. (To get access to the requested fan speed, the Fan class can call a new self.tachometer.note_fan_speed() on speed changes.)

It's now callbacks from the mcu counter state event. Which is setup to happen once per second.

  • For what it is worth, the tachometer_fault_action config options seems a little unwieldy to me. Perhaps implement a tachometer_minimum_rpm option instead (and a setting of zero would disable shutdowns)?

The relationship between speed and rpm is not particularly well defined. On the other hand, perhaps fans can fail, while still running, but very slowly? I'll try this in a next update.

  • I agree it would be useful to disable shutdowns for fans. For what it is worth, I don't have a preference if the default is to shutdown or the default is to not shutdown. If it is felt the default should be to shutdown then Config_Changes.md should state that, and if the default is not changing then we would not add a new entry to Config_Changes.md.

👍

Cheers, -Kevin

Thanks for the feedback.

A fan with a tachometer will now check if the fan is running when it is
supposed to. And shutdown the printer with an error if not.

Signed-off-by: Onne Gorter <[email protected]>
Add a new setting tachometer_fault_action with default setting of None.
Can also be set to warning or shutdown.
Hook up tachometer updates via fequency_counter callbacks.
And hook up fan errors via a tachometer error_callback.

Also simplify tachometer.get_rmp and fan.get_status.
@onnlucky onnlucky force-pushed the feature/fan-failure branch from 739374b to 212847e Compare January 25, 2025 09:36
@onnlucky
Copy link
Author

Apologies for not moving faster, I did not have a lot of time recently.

I've now made a queue for the checks. I've tested it decently well on my machine. And I've reworked the setting to be tachometer_min_rpm. Defaults to 1, which will just check the fan is at least running when it should. Can be set to something close to the maximum rpm of the fan. The code checks rpm >= speed * min_rpm.

I've hardcoded a 2.5 seconds period for the fan and measurements to catch up. Shorter also works for me. But maybe some fans are slower to get up to speed. But making this too long means we wont be checking when the fan switches speeds a lot.

@onnlucky onnlucky requested review from bevanweiss and Sineos January 25, 2025 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants