Add guide on how to estimate clade frequencies #53
Description of proposed changes
Adds the Jupyter notebook and the corresponding reStructuredText version of a how-to guide for estimating clade frequencies from SARS-CoV-2 data.
An open question with this guide (and others like it in the future) is where we should source the data. The benefit of the current approach is that users do not need to prepare any data in advance; data are fetched from the live Nextstrain builds. The disadvantages are that the guide's static figures quickly diverge from current data, and that we don't show users how to load their own local data, which may be much more relevant to them.
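One possible way to support both data sources is a small loader that accepts either a URL or a local path. The sketch below is only an illustration of that idea, not code from the guide: the helper name `load_dataset_json` is hypothetical, and it assumes the dataset is served or stored as plain, uncompressed JSON.

```python
import json
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def load_dataset_json(source):
    """Load a JSON dataset from an http(s) URL or from a local file path.

    Hypothetical helper for illustration; assumes uncompressed JSON.
    """
    source = str(source)

    # Treat anything with an http or https scheme as a remote dataset.
    if urlparse(source).scheme in ("http", "https"):
        with urlopen(source) as response:
            return json.load(response)

    # Otherwise, fall back to reading a local file prepared by the user.
    with Path(source).open(encoding="utf-8") as handle:
        return json.load(handle)
```

With a loader like this, the guide could keep fetching live data by default while letting users pass a path to their own file instead.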
Another potential issue is how we should maintain guides like this that we generate directly from a notebook environment. To prepare this guide for the docs, I had to manually copy images into the `images` directory and rename them for clarity. The HTML/CSS presentation of tables is also not ideal. We might want to standardize these steps for future guides, even if the standard is just a checklist in the documentation's documentation.
Testing
The initial guide was tested by @kistlerk and this version reflects edits based on (most of) her comments. One comment I did not address here was a suggestion to allow users to source their own local data for the guide instead of fetching the live Nextstrain data (see discussion above).
The guide is available through this PR's RTD build.