-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to model crash transitions ? #77
Comments
I was considering adding Can you let me know the exact case when you would need this? Based on the name lock_service, I assume, the lock is managed by a separate microservice/database. In that case, no same-action cleanup solution would really work, when the model is translated into real code during development. The real solution would require a separate fair action to detect and cleanup. |
Yeah exactly. We have a process that acquires a lock from the lock service and then that process can die while holding the lock. In production we have a periodic health check that can detect that the process has died and then it performs some cleanup. But i'm not sure how to model that detection. |
This will depend on how much you would want to model and the what happens when the node is working and thinks it is still holding the lock, but the lock service thinks the node failed and released the lock for some other process to acquire. In the real world, the implementation most likely will have a timeout on the lock so it will get released after the timeout, and the nodes acquiring it will send periodic heartbeats, the heartbeats acts as lease renewal of the lock. The heartbeat interval will be significantly shorter than the lock release timeout. Before each critical operation, it must check if it still holds the lock. (That is, the lock was last renewed less than the heartbeat interval) Since the system you are modeling, you are not interested in how the locks renewal and leases are implemented, you don't need to model the heartbeats, renewal etc, all we need to know is if the node still holds the lock. You can simply maintain a
|
Btw, the spec might be slightly simpler if lock_service.acquire_lock() returned a lock that you store as a state variable. The |
Thanks!, it's similar to what i've been doing. I didn't realize that i need a In my actual model there are several more lines and i'd like to model failing at any point, so it gets a bit tedious to check "holds_lock" everywhere. Implementing try/except makes sense to me, but maybe with different terminology. It's up to the author to think about if the model reflects reality ?. |
In a real implementation, isn't the lock always checked (either explicitly to check if the lock hasn't timeout or implicitly where the individual operations have a much shorter timeout compared to the lock timeout) before any critical operation? In most cases |
I want to model a serial action which can crash. Heres an example
it is ok to yield anywhere in this action, but if the action crashes I would like to cleanup some state and release the lock.
right now im using a work around where i make the action atomic and use something like
to place failures in between lines.
The text was updated successfully, but these errors were encountered: