-
-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature] Added check for WiFi Clients #623
base: master
Are you sure you want to change the base?
Conversation
085383d
to
958bd6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ordering in the alert settings tab seems alphabetical, so calling the two checks Max WiFi Clients and Min WiFi Clients causes the memory check to sit in between, which is bad ux. I think we should rename these to "WiFi Clients (Maximum)" and "WiFi Clients (Minimum)".
docs/user/checks.rst
Outdated
|
||
.. _wifi_client_check: | ||
|
||
WiFi Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WiFi Client | |
WiFi Clients (Maximum & Minimum) |
docs/user/checks.rst
Outdated
This check sends alerts based on the total number of WiFi Clients | ||
connected to a device. It sends two types of alerts: | ||
|
||
- **Maximum WiFi Users**: When the total number of WiFi clients connected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- **Maximum WiFi Users**: When the total number of WiFi clients connected | |
- **Maximum WiFi Clients**: When the total number of WiFi clients connected |
I recommend consistency with the wording to avoid confusing readers.
docs/user/checks.rst
Outdated
provides valuable insights into the network's performance, signaling | ||
when a specific access point is overwhelmed by an excessive number of | ||
WiFi users. | ||
- **Minimum WiFi Users**: When the total number of WiFi clients connected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- **Minimum WiFi Users**: When the total number of WiFi clients connected | |
- **Minimum WiFi Clients**: When the total number of WiFi clients connected |
Same here.
@@ -8,6 +8,7 @@ | |||
('openwisp_monitoring.check.classes.Ping', 'Ping'), | |||
('openwisp_monitoring.check.classes.ConfigApplied', 'Configuration Applied'), | |||
('openwisp_monitoring.check.classes.Iperf3', 'Iperf3'), | |||
('openwisp_monitoring.check.classes.WifiClient', 'Wifi Client'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
('openwisp_monitoring.check.classes.WifiClient', 'Wifi Client'), | |
('openwisp_monitoring.check.classes.WifiClients', 'Wifi Clients'), |
docs/user/settings.rst
Outdated
@@ -297,6 +297,60 @@ check when running on multiple servers. Make sure it is always greater | |||
than the total iperf3 check time, i.e. greater than the TCP + UDP test | |||
time. By default, it is set to **600 seconds (10 mins)**. | |||
|
|||
.. _openwisp_monitoring_auto_wifi_client_check: | |||
|
|||
``OPENWISP_MONITORING_AUTO_WIFI_CLIENT_CHECK`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use "clients" (plural) instead of client in all references to this feature.
}, | ||
}, | ||
}, | ||
'min_wifi_clients': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'min_wifi_clients': { | |
'wifi_clients_min': { |
}, | ||
}, | ||
'min_wifi_clients': { | ||
'label': _('Min WiFi Clients'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar change here as for max
'min_wifi_clients': { | ||
'label': _('Min WiFi Clients'), | ||
'name': '{name}', | ||
'key': 'min_wifi_clients', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar change here as for max
'name': '{name}', | ||
'key': 'min_wifi_clients', | ||
'field_name': 'clients', | ||
'alert_settings': {'operator': '<', 'threshold': 0, 'tolerance': 240}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'alert_settings': {'operator': '<', 'threshold': 0, 'tolerance': 240}, | |
'alert_settings': {'operator': '<', 'threshold': 1, 'tolerance': 1440}, |
Shouldn't this be less than 1? Would less than zero trigger at all?
I would set a tolerance of 3 days by default (4320 minutes), which would indicate an AP hasn't been used for 3 days in a row, which would account for a full weekend plus festive day, a common occurrence in many countries.
'name': '{name}', | ||
'key': 'max_wifi_clients', | ||
'field_name': 'clients', | ||
'alert_settings': {'operator': '>', 'threshold': 40, 'tolerance': 60}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'alert_settings': {'operator': '>', 'threshold': 40, 'tolerance': 60}, | |
'alert_settings': {'operator': '>', 'threshold': 50, 'tolerance': 120}, |
b5cdd83
to
f5c0b05
Compare
def check(self, store=True): | ||
values = timeseries_db.read( | ||
key='wifi_clients', | ||
fields='COUNT(DISTINCT(clients))', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, please move the influxdb query language details to the timeseries DB backend.
'content_type': self.related_object._meta.label_lower, | ||
'object_id': str(self.related_object.pk), | ||
}, | ||
since="'{}'".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
@@ -962,7 +962,7 @@ def _is_crossed_by(self, current_value, time=None, retention_policy=None): | |||
if time is None: | |||
# retrieves latest measurements, ordered by most recent first | |||
points = self.metric.read( | |||
since=f'{self._tolerance_search_range}m', | |||
since=timezone.now() - timedelta(minutes=self._tolerance_search_range), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this change, the code was iterating though points that was beyond the tolerance search range because of the broken InfluxQL query (time >= 15m
).
f7ecaac
to
a1fa913
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Testing it and waiting for more feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feedback from testing:
- As you had anticipated previously in a discussion we had, setting devices in PROBLEM state as soon as the check detects the configured minimum threshold of wifi clients is confusing for users, I think we could change the check to take into account the points of the a specified time range (eg: 3 days by default), we may need a new setting for this as using the tolerance wouldn't work, because this logic needs to be introduced in the checkitself.
- I have seen alerts for devices which have been offline for a few days, can we look into ways of avoiding the alert to trigger if the device is in critical state already (it's redundant)?
This reverts commit d68559c.
- The check counts the number of connected clients for the period defined in OPENWISP_MONITORING_WIFI_CLIENTS_MIN_CHECK_INTERVAL and OPENWISP_MONITORING_WIFI_CLIENTS_MAX_CHECK_INTERVAL - Don't send alerts for devices that have "critical" health status
a653668
to
0fc683e
Compare
docs/user/settings.rst
Outdated
<wifi_clients_check>` to monitor the maximum number of connected clients. | ||
|
||
This determines how far back the check looks when evaluating the maximum | ||
client threshold. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this sentence is not really human friendly.
I think we should explain that we look for clients which have connected between now and the interval of minutes specified there of something like that.
docs/user/settings.rst
Outdated
<wifi_clients_check>` to monitor the minimum number of connected clients. | ||
|
||
This determines how far back the check looks when evaluating the minimum | ||
clients threshold. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here and we can spend a few words on why this is a lot higher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking better, let's continue testing a bit more!
Checklist
Description of Changes