gitolite.conf not correctly updated while Repository.fetch_changesets is running #117

Closed
juselius opened this issue Nov 7, 2011 · 24 comments

Comments

@juselius

juselius commented Nov 7, 2011

I recently converted a largish project, with 1.5 million lines of code and 15 years' worth of commit logs, from SVN + Trac to git + Redmine. In the process it became painfully evident that there is a severe performance problem with Repository.fetch_changesets: the initial run took no less than 52 hours! After the first run, it now takes "only" 2-3 minutes to chew through all projects. I know the performance of fetch_changesets has been a topic of discussion at the Redmine site, and I strongly feel that this issue must be resolved somehow. The problem is that I don't know whether this is an issue in redmine_git_hosting or in Redmine itself.

During the initial update we noticed a nasty problem: while fetch_changesets was running, gitolite.conf was not updated correctly for any project. Sometimes users were correctly added to or removed from projects (we have about 100 different projects), but most of the time they were not. We now have a mess of projects where Redmine thinks a user is a member but gitolite does not, and vice versa.

@rcross

rcross commented Nov 7, 2011

This might be related to a problem I found. While the performance issue definitely needs to be addressed, I think a workaround that would help (and might help with other problems too) would be a way to regenerate the gitolite config file. Or at least a way to generate a file with all the current Redmine-based permissions, which you can check against the live gitolite config file.
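
A minimal sketch of the kind of check suggested here, assuming both sides have already been reduced to a hash of repository name to user names. The function name `permission_drift` and the input shape are hypothetical; extracting the data from the Redmine database and from gitolite.conf is not shown.

```ruby
require "set"

# Hypothetical helper: compare what Redmine expects with what gitolite has.
# Both arguments map a repository name to a collection of user names.
# Returns only the repositories where the two sides disagree.
def permission_drift(expected, actual)
  drift = {}
  (expected.keys | actual.keys).sort.each do |repo|
    want = Set.new(expected.fetch(repo, []))
    have = Set.new(actual.fetch(repo, []))
    missing = (want - have).to_a.sort # in Redmine, absent from gitolite
    stale = (have - want).to_a.sort   # still in gitolite, gone from Redmine
    next if missing.empty? && stale.empty?

    drift[repo] = { missing: missing, stale: stale }
  end
  drift
end
```

An empty result would mean the two views agree; anything else lists, per repository, which users are missing from gitolite and which are stale.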

@juselius
Author

Ok, so the situation is much, much worse than I suspected. The massive fetch_changesets finished on Sunday evening. On Monday morning everything looked like it was working again, but now I get mails from numerous users that their keys are not working. Reinstalling or adding new keys does not help either. The last gitolite configuration update is from 10 am Monday morning, and since then no modifications have been recorded. This is quickly becoming catastrophic! I need to reboot the server, so I hope this will help. I'm pushing 10-12 hour days already, and I simply do not have time to debug this. Sorry.

@juselius
Author

Rebooting did not help. Gitolite is stone dead.

@ghost

ghost commented Nov 10, 2011

Same problem here. 90 users not getting their ssh keys propagated through the system.
No information seems to move between redmine and gitolite.
At this moment new users signing up to our redmine server cannot access the git repository via ssh keys.
A bugfix would be highly appreciated.

@kubitron

Ok. I've been tracing down these issues a bit.

First, the slow fetch_changesets performance is an obvious problem that I have been talking to the Redmine folks about (but I think it will have to be fixed in this plugin, given the politics). Will provide a patch. The problem is that fetch_changesets does a save to the repository for every revision it looks at, which causes the plugin to perform update_repositories over and over again. Ack!
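
One generic way to avoid repeating an expensive update on every save is to coalesce the requests made during a bulk operation into a single run at the end. The sketch below only illustrates that idea; `UpdateCoalescer` is a hypothetical name, and this is not necessarily how the plugin or the promised patch handles it.

```ruby
# Hypothetical wrapper that collapses many update requests into one.
class UpdateCoalescer
  def initialize(&update)
    @update = update   # the expensive operation, e.g. rewriting the config
    @depth = 0         # nesting level of active batches
    @pending = false   # was an update requested while batching?
  end

  # Ask for an update. Outside a batch it runs immediately;
  # inside a batch it is only remembered.
  def request
    if @depth > 0
      @pending = true
    else
      @update.call
    end
  end

  # Defer every request made inside the block, then run at most one update
  # when the outermost batch finishes (also if the block raises).
  def batch
    @depth += 1
    yield
  ensure
    @depth -= 1
    if @depth.zero? && @pending
      @pending = false
      @update.call
    end
  end
end
```

With this shape, a loop that requests an update once per revision inside `batch` would trigger one update in total instead of one per revision.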

The problem with inconsistencies in the gitolite config file is tricky, but I believe it can be traced to a situation where two different threads are both trying to run the update_repositories routine in lib/git_hosting.rb. If one of these updates takes too long (5 seconds), then the other gives up and doesn't update the config file/keys. I am tempted to increase the timeout and put an error in the log to confirm the premature exit. Note that the performance bug above exacerbates this problem, because update_repositories gets run so much.

As for "gitolite being dead": I encountered this problem. It happens because something (someone?) removes the administrative key from the gitolite/.ssh/authorized_keys file! Not sure how this happens yet, but it can be fixed by running gl-setup with the proper public key as an argument.
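
A minimal sketch of the behaviour described here: wait for an exclusive lock up to a timeout, and log an error instead of giving up silently. The helper name `with_update_lock`, the lock file, and the polling interval are assumptions for illustration; this is not the code in lib/git_hosting.rb.

```ruby
require "logger"
require "tmpdir"

# Hypothetical helper: run the block while holding an exclusive lock on
# lock_path. Polls until `timeout` seconds have elapsed. Returns true if the
# block ran, false (after logging an error) if the lock was never obtained.
def with_update_lock(lock_path, timeout:, logger:, poll: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |f|
    # flock returns false when LOCK_NB is set and the lock is held elsewhere.
    until f.flock(File::LOCK_EX | File::LOCK_NB)
      if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
        logger.error("update skipped: could not get lock within #{timeout}s")
        return false
      end
      sleep poll
    end
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)
    end
  end
  true
end
```

A caller could treat a `false` return as a signal to retry later, so that a skipped update does not leave the config file out of date unnoticed.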

@kubitron

Actually, silly me. The last item I mentioned above is Issue #110 and has been reported by someone else.

@juselius
Author

I have checked, and the administrative key has not been removed from authorized_keys. However, when I look in my Apache error log, I find the following error:

fatal: Not a git repository: '/tmp/redmine_git_hosting/gitolite-admin/.git'.

The directory /tmp/redmine_git_hosting/gitolite-admin exists, but is empty. This is strange, because I can run

GIT_SSH=/tmp/redmine_git_hosting/gitolite_admin_ssh git clone [email protected]:gitolite-admin.git

without problems as user redmine. I have not checked the code yet, but this indicates that gitolite-admin has been checked out, and later deleted without deleting the directory. The next time git tries to clone gitolite-admin, it fails because the directory is already there. I'll see if I can identify the problem later.

kubitron: Really fantastic if you have a fix for the performance problems! :) I had a quick look at the code some months ago, and realized that fetch_changesets is causing a storm of git clones, pulls and what-have-yous. But since I'm not much of a Ruby/Rails programmer I postponed my plans of trying to fix the problem...

@juselius
Author

Does anybody know how to regenerate gitolite.conf and keydir from the Redmine database? I now have a whole bunch of users who have registered keys which have never made it into gitolite. The same goes for user project permissions. Redmine has all the correct information, but gitolite has not been updated.

@kubitron

Ok. Bakerjonas.

I ran into the problem you mention. Just delete the /tmp/redmine_git_hosting directory and restart Redmine. This will probably help a lot for you (the code that does the checkout is a bit primitive and isn't able to understand the existence of an empty gitolite-admin directory). This raises the question of how such a situation could occur.
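
A minimal sketch of a guard for this case, assuming that a directory without a `.git` entry can be treated as a stale checkout and removed so the next checkout starts clean. The helper names `usable_clone?` and `clear_stale_clone` are hypothetical and not part of the plugin.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical check: a usable working copy has a .git entry inside it.
def usable_clone?(dir)
  File.exist?(File.join(dir, ".git"))
end

# Remove a directory that exists but is not a usable clone (for example an
# empty gitolite-admin directory). Returns true if something was removed,
# false if the directory was missing or already looked like a valid clone.
def clear_stale_clone(dir)
  return false unless File.directory?(dir)
  return false if usable_clone?(dir)

  FileUtils.rm_rf(dir)
  true
end
```

Running such a check before the checkout step would have the same effect as deleting the directory by hand, but only when the directory is actually unusable.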

As for regenerating, do you run the sys/fetch-changesets process in the background? I believe that this has the potential to fix the gitolite.conf file -- after you do what I suggest in the previous paragraph.

@kubitron

I'll get you a patch for the performance issue as soon as I test it. (I had one that modified redmine proper, but it appears that it won't be incorporated).

@juselius
Author

The first thing I did when I realized there was a problem with the /tmp/redmine_git_hosting/gitolite-admin directory was to delete it and restart. Unfortunately, it didn't take long before it was corrupted again. How this happens is a mystery to me. I noticed that there might have been a problem related to the use of sudo in the local cron script which runs the fetch_changesets script every 5 min. I have changed the script, and in a few hours I will know.

Anyway, I reset everything and ran fetch_changesets by hand. You were right kubitron, then gitolite was finally updated! I'm not 100% sure it's entirely correct though. It seems that only users and keys which have been added to projects have been updated. Users who were removed from projects still have their keys listed in gitolite.conf. I have to triple check this, but I'm pretty sure this is the case.

I have also spent a few hours learning Ruby and Rails in more detail (having a Python/Django background it's pretty straightforward), so now I'm much better equipped for debugging ;)

@kubitron

Um. So, /tmp/redmine_git_hosting/gitolite-admin keeps getting corrupted by being empty?

@juselius
Author

It looks like it's working now. It seems there was a problem with my cron script. It was running as root (it did a bunch of things before running fetch_changesets), and used sudo to run fetch_changesets as user redmine. Something in the environment must have been initialized wrong (I'm guessing), because I changed the cron script to run as user redmine and now fetch_changesets works correctly. Weird, but strange ;)

@kubitron

Ok. I have a patch that should fix the performance problems, if you are game to give it a try. It is the head commit on my branch:

https://github.com/kubitron/redmine_git_hosting

This patch should reduce your 52 hour times to something much shorter...!

The other commit adds SELinux support, if you like. I'll post some additional patches to help with some of the weird problems we are seeing here.

@juselius
Author

Thank you kubitron! I have applied all your patches (although we are not running the SE parts). As far as I can tell, things are working fine, and the runtime of fetch_changesets dropped from approx. 2 minutes to 30 seconds! :) Great work! In a day or two we'll know if it solves all the problems. Best regards, Jonas.

@kubitron

Glad that this helps.

These patches will not fix the synchronization errors that you have, but should make them less frequent. I will put up another set of patches to help with that. I just wanted to get the performance fixes out first, since they will greatly reduce the chance for synchronization problems.

Let me know if there are problems with these patches. The SELinux patch that you applied will not get in your way unless you enable it. (That was my intention: simply provide an option if people needed it.) Note that there is a new bin directory in the top level of the plugin with binaries in it. Also, there is one extra level in the tmp file directory (things are in /tmp/redmine_git_hosting/git-user-name/gitolite-admin). This should all be transparent.

One thing that would be helpful: look in your production.log file. You should see a vastly reduced number of references to fetching gitolite-admin. Do you see this?

@juselius
Author

Hi Kubitron! Yes, the number of gitolite-admin updates has been significantly reduced, so the performance patch works wonders! I have now applied the patch to three different Redmine servers, and they all work fine. If I don't get any complaints in 5 days, I think we can say that it works perfectly as intended :) I agree that the chance of a synchronization error is much smaller, now that fetch_changesets runs in approx. 10 seconds (if there are no updates). Great work!

@kubitron

I added a slight patch which you can upload (I had a minor bug). Should help a bit more. (Don't forget to execute db:migrate_plugins).

Also, I have a rewrite of the internals of the update code which should resync the gitolite.conf file under a variety of error conditions. Will upload as soon as I test it more. This code fixes a variety of bad gitolite.conf problems, completely resyncs the keydir directory, and even fixes some cases in which the administrative key gets disconnected....

@ghost

ghost commented Nov 19, 2011

thanks a lot kubitron!

@juselius
Author

Thanks kubitron! I got a whole bunch of conflicts when I did a pull from your master branch (from which I had pulled earlier). I didn't have time to look at the details. Instead I just cloned your repository. So far everything seems to work smoothly!

@kubitron

On 11/24/2011 3:11 AM, bakerjonas wrote:

> Thanks kubitron! I got a whole bunch of conflicts when I did a pull from your master branch (from which I had pulled earlier). I didn't have time to look at the details. Instead I just cloned your repository. So far everything seems to work smoothly!

Sorry about that. I decided to squash some updates to try to keep the number of commits small (in case they ever decide to actually incorporate my changes into the original). I won't do that again.

Are you interested in trying a rewritten version (new, non-conflicting update) that is able to self-correct from a variety of problems?

--KUBI--

@kubitron

Ok, guys. If you are game, I just uploaded a new set of patches on my master branch which should make the gitolite configuration largely self-correcting. Check out:

https://github.com/kubitron/redmine_git_hosting

Consider this Beta++ (should be pretty stable, but interested in what you guys think). You need to get all of the commits off the master branch (there are 5 of them). Don't forget to migrate plugins again:

rake db:migrate_plugins RAILS_ENV=production

Note that the log file should be much more informative about errors that have occurred or are occurring.

@mindo

mindo commented Dec 6, 2011

I have it running without problems. All the quirks I had with the original plugin are gone and I don't have any stability issues.

@kubitron

kubitron commented Dec 6, 2011

Great!

Anyone else find that my new version is more stable? I would like to get a wider set of feedback (so that I could call it better than Beta++). If so, please register your feedback on my pull request #124.
