Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema: change sort order of history event type enum #626

Merged
merged 1 commit into from
Aug 7, 2023

Conversation

julianbrost
Copy link
Contributor

@julianbrost julianbrost commented Aug 1, 2023

This improves the resulting sort order when ORDER BY event_time, event_type is used. state_change comes first as it can cause many of the other events like trigger downtimes, remove acknowledgements and send notifications. Similarly, notification comes last as any other event can result in a notification. This will result in history events for scenarios like state changes, triggers downtime, sends downtime start notification being sorted in that order.

Apart from that, end events sort before the corresponding start events as any ack/comment/downtime/flapping period should last for more than a millisecond, therefore if there should be two events within the same millisecond, the end event corresponds to the older period and is sorted first. The remaining order is by impact and cause, see inline comment or #626 (comment).

Both schema upgrades will lock the history table for a few minutes, blocking other read and write access including by the Icinga DB daemon and Icinga DB Web. On my (totally representative) laptop, this was about 2 minutes for 19M rows in MariaDB and 8 minutes for 22M rows in PostgreSQL (worst-case times, I've also seen it faster on both).

If you're wondering about why the default value for PostgreSQL is changed: I don't think that this does anything useful, it's just the first enum value to emulate that default behavior from MySQL.

For the review

  • It's better if more people look at this, we really don't want to have to change this again.
  • Please double check the ordering, ideally make your own thoughts and check if they are compatible with my suggestion.

fixes #614
refs Icinga/icingadb-web#587

@julianbrost julianbrost added this to the 1.1.1 milestone Aug 1, 2023
@cla-bot cla-bot bot added the cla/signed label Aug 1, 2023
@julianbrost julianbrost changed the base branch from master to fix/indexes-for-host-and-service-group-615 August 1, 2023 15:04
Base automatically changed from fix/indexes-for-host-and-service-group-615 to master August 2, 2023 09:20
@julianbrost julianbrost force-pushed the history-event-type-order branch 2 times, most recently from e5d2cba to 5ba31cd Compare August 3, 2023 09:15
@julianbrost julianbrost marked this pull request as ready for review August 3, 2023 09:17
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record: the change as such is not super critical. (We're talking about two events for the same object in the same millisecond.) Neither is the schema upgrade which is just for consistency.

Suggestion

Defer this to 1.2.0 to make the bugfix release more attractive.

schema/mysql/schema.sql Outdated Show resolved Hide resolved
schema/pgsql/upgrades/1.2.0.sql Show resolved Hide resolved
schema/mysql/upgrades/1.2.0.sql Outdated Show resolved Hide resolved
schema/mysql/upgrades/1.2.0.sql Outdated Show resolved Hide resolved
schema/mysql/upgrades/1.2.0.sql Show resolved Hide resolved
@julianbrost
Copy link
Contributor Author

(We're talking about two events for the same object in the same millisecond.)

Which, however isn't some esoteric edge case but something that actually happens as flexible downtime starts and notifications are written with the same timestamp as the corresponding state change. So pretty much the main goal of this change is to get this sorted properly in the history.

schema/mysql/schema.sql Outdated Show resolved Hide resolved
Copy link
Member

@nilmerg nilmerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can only comment on the ordering, which is fine.

@julianbrost julianbrost mentioned this pull request Aug 3, 2023
-- remove acknowledgements and send notifications. Similarly, notification comes last as any other event can result
-- in a notification. End events sort before the corresponding start events as any ack/comment/downtime/flapping
-- period should last for more than a millisecond, therefore, the old period ends first and then the new one starts.
event_type enum('state_change', 'ack_clear', 'comment_remove', 'downtime_end', 'flapping_end', 'ack_set', 'comment_add', 'downtime_start', 'flapping_start', 'notification') NOT NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would order as follows: state_change, ack_clear, downtime_end, flapping_end, comment_remove, comment_add, flapping_start, downtime_start, ack_set, notification.

Reasoning:

  • Comments are informative.
  • Flapping is automatic and changes mechanics.
  • Downtimes are semi-automatic, require user action (or configuration) and change mechanics.
  • Acks are pure user actions and change mechanics.

In short, the ordering indicates whether the event changes Icinga mechanics and whether its origin is Icinga or the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only changes the ordering within the start and end events (without mixing the two groups), so basically only changes where I've arbitrarily chosen to sort alphabetically, so no complaints from me.

Reasoning: [...] In short, the ordering indicates whether the event changes Icinga mechanics and whether its origin is Icinga or the user.

So basically assigns the events some kind of importance (ack being the most important and comment the least), so that an ack_set will show up as the most recent entry (besides notifications).

This improves the resulting sort order when `ORDER BY event_time, event_type`
is used. `state_change` comes first as it can cause many of the other events
like trigger downtimes, remove acknowledgements and send notifications.
Similarly, `notification` comes last as any other event can result in a
notification. This will result in history events for scenarios like state
changes, triggers downtime, sends downtime start notification being sorted in
that order.

Apart from that, end events sort before the corresponding start events as any
ack/comment/downtime/flapping period should last for more than a millisecond,
therefore if there should be two events within the same millisecond, the end
event corresponds to the older period and is sorted first.
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IF we can make the order EVEN smarter, I don't know how 👍

@julianbrost julianbrost merged commit 1ecea76 into main Aug 7, 2023
31 checks passed
@julianbrost julianbrost deleted the history-event-type-order branch August 7, 2023 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prepare history table for order by event_time desc, event_type desc
4 participants