Optimize parser by removing repeated hash merges #515
Merged
Changes
Thanks for the helpful gem!
While profiling the startup time of a large Rails app that manually invokes `Dotenv.load` very early in the boot process (in order to get access to env vars ASAP), I noticed that the subsequent `Dotenv.load` triggered automatically by this gem's railtie was taking an unusual amount of time to complete. The time was mostly being spent in the variable substitution module, on a single line that performs a hash merge with `ENV`. Since `ENV` had already been loaded up with ~2,000 extra variables by the first run of dotenv, that hash merge is not a trivially cheap operation, and the cost added up because it was run when parsing each line.

From what I understand, the purpose of that line is to build a lookup table that gives priority either to env vars already in `ENV` or to env vars defined on an earlier line of the file, depending on the value of the "overwrite" flag. We can make this operation unnecessary by updating the parser to simply skip over lines re-defining a variable already in `ENV` when `overwrite = false`, leaving the variable substitution module not even having to worry about the prioritization.

This leads to a modest performance improvement when parsing a large .env file, and a significant one when parsing a large .env file while `ENV` is already heavily populated by some other process (most likely a previous run of dotenv, though I can imagine there are other, less avoidable reasons this could happen as well). The .env file used for this benchmark was created from this script:
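To make the idea concrete, here is a minimal, self-contained sketch of the skip-based approach, assuming a toy `KEY=VALUE` parser; the names (`parse`, `LINE`) are hypothetical and not the gem's actual internals. The key point is the early `next`: when `overwrite` is false, a line whose key already exists in the environment is dropped before substitution, so there is no need to rebuild a merged lookup table on every line.

```ruby
LINE = /\A(?<key>[A-Za-z_][A-Za-z0-9_]*)=(?<value>.*)\z/

# Parse KEY=VALUE lines, substituting $NAME references from earlier
# definitions. `env` stands in for ENV so the sketch is self-contained.
def parse(contents, env, overwrite: false)
  vars = {}
  contents.each_line do |raw|
    m = LINE.match(raw.strip) or next
    key, value = m[:key], m[:value]

    # The optimization: skip re-definitions up front instead of resolving
    # priority later via an O(ENV.size) merge per line.
    next if !overwrite && env.key?(key)

    vars[key] = value.gsub(/\$(\w+)/) do |ref|
      name = Regexp.last_match(1)
      # Cheap substitution lookup: earlier lines first, then the environment.
      vars.fetch(name) { env.fetch(name, ref) }
    end
  end
  vars
end

env = { "HOME" => "/root" }              # stand-in for a pre-populated ENV
src = "HOME=/tmp\nCACHE=$HOME/cache\n"

parse(src, env)                   # HOME skipped; $HOME resolves from env
parse(src, env, overwrite: true)  # HOME re-defined; $HOME uses the new value
```

With `overwrite: false` the `HOME=/tmp` line is ignored entirely, which is what lets the substitution step use a plain hash lookup instead of merging `env` into a priority table for every parsed line.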
Validation
The RSpec test suite for this gem looks to be pretty thorough, and it all still passes after this change, so I don't believe this will cause any unintended changes in functionality.