zpool health state as value #2
Comments
Hi Ben, the way I handle this with InfluxDB is to put the state in a field in its decoded form. However, InfluxDB allows strings (enums) for its fields and, sadly, Prometheus only allows floats for its values. So we could add the state as a string (enum) in a label. Thoughts?
Or, we could expose the states directly as numbers and document the values while not trying to apply the UI logic. This would be trivial. For reference, the internal state numbers are documented starting here:
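As a rough illustration of exposing the internal state numbers directly, the sketch below mirrors the `vdev_state_t` enum from the ZFS headers and formats one sample in the Prometheus text exposition format. The metric name `zpool_vdev_state_value`, the label names, and the helper `state_sample` are hypothetical, chosen only for this example; they are not what the exporter emits.

```python
from enum import IntEnum


class VdevState(IntEnum):
    """Mirrors vdev_state_t from the ZFS headers (sys/fs/zfs.h)."""
    UNKNOWN = 0    # uninitialized vdev
    CLOSED = 1     # not currently open
    OFFLINE = 2    # not allowed to open
    REMOVED = 3    # explicitly removed from system
    CANT_OPEN = 4  # tried to open, but failed
    FAULTED = 5    # external request to fault device
    DEGRADED = 6   # replicated vdev with unhealthy children
    HEALTHY = 7    # presumed good


def state_sample(pool: str, vdev: str, state: VdevState) -> str:
    """Format one gauge sample as a Prometheus text exposition line.

    Metric and label names are hypothetical (assumptions for this sketch).
    """
    return f'zpool_vdev_state_value{{name="{pool}",vdev="{vdev}"}} {int(state)}'


print(state_sample("tank", "root/raidz-0", VdevState.DEGRADED))
# prints: zpool_vdev_state_value{name="tank",vdev="root/raidz-0"} 6
```

With this numbering the healthy state is the highest value, so an alert condition could plausibly be "value below 7", leaving the mapping to display names to the dashboard.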
I think that's the right call. Grafana should be able to apply that logic if we want graphs that match zpool status output. And of course the STATE label should remain as it is now. I would be happy to contribute my dashboard once I've built it (no promises on quality, I'm not a professional dashboard builder!). It looks like the zpool status logic is here: However, that includes the state SPLIT, which I haven't seen documented anywhere. Thank you!
That is exactly right: there are "user" states that aren't described in any one place, not even in zpool_state_to_name(). I've got some other changes to push this weekend (adding size distribution histograms) and I'll add the trivial state values. Then we can work from there to see if there is a better way to consume the info. FYI, SPLIT occurs when the zpool split command is used, which is not a commonly used command. But it shows how hardcoding the enum values in dashboards becomes hard to maintain.
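To illustrate why the display names are awkward to hardcode, here is a minimal sketch of how a user-visible name can depend on both the internal state and an auxiliary reason. It is loosely modeled on a recollection of the libzfs naming logic; the exact branches and the aux names used here (`CORRUPT_DATA`, `BAD_LOG`, `SPLIT_POOL`) are assumptions and should be checked against the ZFS source before being relied on.

```python
def state_to_name(state: str, aux: str = "NONE") -> str:
    """Approximate the display name for an internal vdev state.

    Both arguments are symbolic names; the branch structure is an
    assumption for illustration, not a copy of the libzfs code.
    """
    if state in ("CLOSED", "OFFLINE"):
        return "OFFLINE"
    if state == "REMOVED":
        return "REMOVED"
    if state == "CANT_OPEN":
        # The same internal state maps to three different display names,
        # depending on the auxiliary reason.
        if aux in ("CORRUPT_DATA", "BAD_LOG"):
            return "FAULTED"
        if aux == "SPLIT_POOL":
            return "SPLIT"
        return "UNAVAIL"
    if state == "FAULTED":
        return "FAULTED"
    if state == "DEGRADED":
        return "DEGRADED"
    if state == "HEALTHY":
        return "ONLINE"
    return "UNKNOWN"


print(state_to_name("CANT_OPEN", "SPLIT_POOL"))
```

Because a single number cannot carry the auxiliary reason, a dashboard that decodes only the state value would not be able to tell SPLIT apart from UNAVAIL under this sketch.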
Oh, I just realized that we rely on vdev mapping to go from ZFS vdev to physical location, so it would be most helpful if the path were included in the health metric. For example, we name our devices : and have paths like /dev/disk/by-vdev/1:7... I'm only seeing these as labels on some of the metrics.
We began work on a Python parser before finding your project (I found 4 other ZFS exporters for Prometheus through Google, but it took combing through ZFS on Linux issues to find this one!). We had decided that as long as new states are added to the end of the enum, it should be safe enough. I hope that is what they will do if they need to add any internal states to ZFS. I modified the Ubuntu CMakeLists.txt to enable CPack creation of a DEB package. I'll contribute that shortly.
Hi Ben, can you take a look at #5 and give feedback?
We have adopted zpool_prometheus for use on one of our clusters. We are Prometheus and Grafana users with some existing detailed dashboards.
I would very much like to be able to alert on health state and I cannot find a way to do this with the health state information in a label. I can make some useful graphs with the multistat plugin, grouped by label values, but that panel doesn't seem to support alerting.
Other zfs exporters index the health values like this:
0 ONLINE
1 DEGRADED
2 FAULTED
3 OFFLINE
4 UNAVAIL
5 REMOVED
6 AVAIL
7 INUSE
-1 no data/timeout
Is this something you would consider adding? No existing metrics would change, it would just be one additional metric per vdev.
Perhaps I am missing a way to do this in Grafana?
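The index list above, as used by other exporters, can be sketched as a simple lookup that turns the state label into a numeric value suitable for alerting. This is only an illustration: the names `HEALTH_INDEX`, `NO_DATA`, and `health_value` are hypothetical, and mapping an unrecognized state name to -1 is an assumption made for this sketch rather than something the list specifies.

```python
from typing import Optional

# Index values as listed above for other ZFS exporters.
HEALTH_INDEX = {
    "ONLINE": 0,
    "DEGRADED": 1,
    "FAULTED": 2,
    "OFFLINE": 3,
    "UNAVAIL": 4,
    "REMOVED": 5,
    "AVAIL": 6,
    "INUSE": 7,
}
NO_DATA = -1  # no data/timeout


def health_value(state: Optional[str]) -> int:
    """Return the numeric index for a health state name.

    Missing or unrecognized names fall back to NO_DATA (an assumption).
    """
    if not state:
        return NO_DATA
    return HEALTH_INDEX.get(state.strip().upper(), NO_DATA)


for name in ("ONLINE", "degraded", None):
    print(name, health_value(name))
```

With a value like this exported per vdev, an alert could fire on any value other than 0, while the existing state label stays available for display.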