-
Notifications
You must be signed in to change notification settings - Fork 268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor towards using a database mutex #351
base: master
Are you sure you want to change the base?
Conversation
# Conflicts: # post_office/lockfile.py
Thanks for the PR. Having an optional DB backed lock is a good idea, however, I’d like to make DB based lock optional. So the file based lock should still be the default (at least for a period of time), but we should allow the user to pass in “lock_type” argument so they can choose which locking mechanism they want to use. Making locking mechanisms is often tricky, I had to make a few changes to the locking mechanism in the earlier years (even though the initial code is based on other people’s code). Does this PR reuse code which have been tested somewhere else or is this written from scratch? |
OK, that's understandable.
In a company I worked for, we used DB locks for all kinds of critical sections. This code was proprietary, written in C++ and worked only against OracleDB. I therefore looked for similar projects for Django. The one which came closest was https://github.com/ambitioninc/django-db-mutex What I really dislike there is, that a timeout exception is raised after a timed out task finishes. That imo doesn't make any sense at all. Instead I wanted a timeout, which really interrupts a running task, in case it gets stuck. Therefore you have to use processes (instead of threads), because the underlying ALRT-signal is sent to the main thread. Unfortunately the unit tests on Travis-CI fail for some reason I haven't understood yet. I will check this and re-add the file locking mechanism. |
@selwin I'm still struggling with running unit tests. Reason is the call to |
I was just thinking about this. Instead of a DB based lock, should we implement a memcached/Redis based lock first? Memcached/Redis based locks is significantly easier to implement since they're atomic and can be used on distributed systems too. |
I would stay with a database lock. It is the most portable solution, because a database is required anyway and it scales better than file based locks. Adding memcached or Redis to the technology stack adds another moving part, which might be discouraging for first- or rare-time users. This btw. does not answer my question about abandoning threads and moving multiprocessing into |
Oh yes, sorry about that.
I don't understand how multiprocessing is used in Django's request/response cycle. The API to send email is
So from my perspective, multiprocessing only really comes into play when I'm not familiar with Celery so I'd defer to your judgement. But I'd require that the Celery task that you trigger don't have a process/threads config that's separate from post office (or it should belong to a separate package altogether).
Ok, in this case I'd suggest also looking into the possibility of using an optional dependency (if there's any) approach. |
Hi @selwin
thanks for merging the last pull request, it emphasizes how important locking is! However using a file for locking, in my opinion, is not a good option anymore. With the advent of Docker/Kubernetes, very often the web application and the Celery worker run on different containers, without sharing a volume which could be used for keeping that lockfile. It also makes the handling of such a lockfile difficult, because it usually is created with a temporary name.
I therefore propose the switch to database locking. This technique is commonly used across many applications. This pull request implements database locking in a reusable manner. Please check the docstrings and the examples in the unit tests for a better understanding.