Python plans #17

Open
gndaskalova opened this issue Nov 20, 2018 · 7 comments

@gndaskalova
Contributor

  • Reproducible environments - conda environments
  • Jupyter notebooks - JupyterLab too, since it is the future
  • Physical modeling - NumPy 2 (integration methods) @dvalters
  • Statistical modeling - intro, intermediate, advanced @dfulu
  • Making maps - cartopy
  • Timeseries tutorial - xarray
    • Moving average filter
  • Text analysis - e.g. with tweets @dfulu
  • plotly & data visualisation
  • pandas advanced data vis

Note which package versions each tutorial uses.
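
One possible way to record the versions, as a minimal sketch using only the standard library. The helper name `report_versions` and the package list are assumptions for illustration, not something agreed in this thread.

```python
from importlib import metadata


def report_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed are reported as 'not installed'.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Hypothetical package list; each tutorial would list its own dependencies.
for name, version in report_versions(["numpy", "pandas", "xarray", "cartopy"]).items():
    print(f"{name}: {version}")
```

The printed lines could be pasted at the top of a tutorial. For a full conda environment, `conda env export > environment.yml` captures the same information in a reusable file.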

@gndaskalova
Contributor Author

  • Reproducible environments - conda environments. If this is not a tutorial on its own, introduce reproducibility at the start of each tutorial?
  • Jupyter notebooks - JupyterLab too, since it is the future
  • Physical modeling - NumPy 2 (integration methods) @dvalters
  • Statistical modeling - intro, intermediate, advanced @dfulu
  • Making maps - cartopy (Ashley)
  • Timeseries tutorial - xarray (Ashley)
  • Text analysis - e.g. with tweets @dfulu
  • plotly & data visualisation @dfulu
  • pandas advanced data vis @dvalters

@dvalters
Member

dvalters commented Nov 20, 2018

For Numerical Modelling with Python, I think it would be good to split it into two parts: a 'simpler', less mathematical lesson using Cellular Automaton models (not much maths involved and easy to understand conceptually), and then a second tutorial (Part II) looking at a Finite Difference model in Python/NumPy (slightly more mathsy, but minimal). I would take the examples from my own work with flood models/weather models, so people can get a flavour with the first tutorial and take it further in the second if they like.

Proposed dates for finishing the draft tutorials:

- [ ] Numerical Modelling with Python I: Building a flood model (Cellular Automaton model) -- Dec 30th
- [ ] Numerical Modelling with Python II: Building a weather 'forecasting' model (Finite Difference Model) -- Jan 30th '19 ?
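
To give a flavour of how little maths a cellular automaton needs, here is a minimal sketch of a flood-spreading rule in plain Python. The rule, the grid values and the function name `flood_step` are all made up for illustration and are not taken from the planned tutorial: a dry cell becomes wet if it lies at or below an assumed water level and has at least one wet neighbour.

```python
def flood_step(wet, elevation, water_level):
    """Apply one cellular-automaton update and return the new wet/dry grid.

    A dry cell floods when its elevation is at or below water_level and
    at least one of its four direct neighbours is already wet.
    """
    rows, cols = len(wet), len(wet[0])
    new_wet = [row[:] for row in wet]  # copy so all cells update together
    for r in range(rows):
        for c in range(cols):
            if wet[r][c] or elevation[r][c] > water_level:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(
                0 <= nr < rows and 0 <= nc < cols and wet[nr][nc]
                for nr, nc in neighbours
            ):
                new_wet[r][c] = True
    return new_wet


# Hypothetical 3x3 terrain: the cells with elevation 5 act as barriers.
elevation = [
    [1, 1, 5],
    [1, 5, 1],
    [1, 1, 1],
]
# Water starts in the top-left corner only.
wet = [
    [True, False, False],
    [False, False, False],
    [False, False, False],
]
for _ in range(10):
    wet = flood_step(wet, elevation, water_level=2)

for row in wet:
    print("".join("~" if cell else "." for cell in row))
```

After enough steps the water reaches every low-lying cell connected to the source, while the two high cells stay dry.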

If there's time, I'll do the extra Pandas II tutorial as well - I already have some material that I could use/extend for this. I'll update with a time for this later.

  • Visualising Geospatial Data II: Geopandas and more -- Feb/March '19

@dfulu
Member

dfulu commented Nov 30, 2018

Unsupervised, Text analysis and interactive data vis

Here is my plan for the tutorials I'd like to make. All of them are machine learning type tutorials except for the Plotly / interactive data vis tutorial which is just general data vis skills.

Having made this list I realise how much I have taken on, but I'll be happy to keep working away until it is done. I am also not sure how optimistic I am being on time scales, especially towards the end of this list when I move to MCMC and Gaussian processes. I guess we will have to see.

Some of the items on this list could lead to a second tutorial at some point - definite case to be made for extended funding for coding club ;)

  • Text analysis and the application of unsupervised machine learning to text. Primarily LDA and NMF, maybe k-means for an easy starting point. A good portion of this should be about cleaning up text for modelling, and also about using the Tweepy tool to scrape tweet data.

    • Again this could be split into 2 tutorials depending on depth required.

    • First tutorial -- Dec 30

  • Unsupervised modelling tutorial. Using unsupervised methods for pattern discovery and exploring your data. Possibly including k-means, PCA, t-SNE and self-organising maps.

    • This may be multiple tutorials. I think it is too much for one. If it needs to be broken up, the logical break point(s) would be something like ((intro, k-means), PCA), ((t-SNE,), (SOM,))

    • First tutorial -- Jan 30

  • Supervised ML. Focusing on explainable models (i.e. no neural networks). Include simple linear/polynomial regression fitting and maybe random forest.

    • basics probably doable in one tutorial -- Jan 30

  • Making interactive plots with Plotly, IPython widgets and making animations

    • probably one tutorial, but could come back to Plotly and IPython widgets in a follow-on tutorial

    • First tutorial -- Feb 30

  • MCMC fitting and parameter estimation. Calculating the uncertainty in your model parameters.

    • basics probably doable in one tutorial -- March 30

  • Gaussian Processes tutorial. Making the most from a small data source

    • basics probably doable in one tutorial depending on depth -- Apr 30

PS. Accidentally unassigned this issue. I have no idea how I did it or how to fix it @dvalters, @gndaskalova

@dfulu dfulu unassigned dvalters and dfulu Nov 30, 2018
@smithara

smithara commented Dec 3, 2018

I can work on two tutorials (titles TBC). I might also write up some recommendations about use of Jupyter & conda, aiming towards good workflows and reproducible science, but I'm not too sure about the best way to do this yet. If we instruct people to install certain packages etc, we should be consistent about how to do this - sometimes it might be appropriate for people to use isolated conda environments, for example.

  • Time series analysis with pandas (& xarray) -- Dec 30

    • Introduce a few general techniques: resampling, superposed epoch analysis / compositing, filtering, smoothing ..?
    • Demonstrate the techniques with pandas, then show something similar with xarray for more complicated data
  • Visualisation and working with map projections with cartopy (& xarray) -- Feb?

    • @dvalters: I should see what you have planned with visualisation and geopandas etc

My intention is to give instructions for working with pandas and cartopy independently and show xarray for some more advanced applications. I'm trying to identify some good examples to show with real data as I don't have something suitable yet.
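
As a rough illustration of the smoothing item in the list above, here is a library-free sketch of a simple moving average filter. The function name `moving_average` and the sample values are made up for this example. In a pandas-based tutorial the equivalent would presumably be `Series.rolling(window).mean()`.

```python
from collections import deque


def moving_average(values, window):
    """Return the simple moving average of values over a fixed window.

    Only full windows are averaged, so the result has
    len(values) - window + 1 entries (or none if the input is too short).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buffer = deque(maxlen=window)  # holds the most recent `window` samples
    smoothed = []
    for value in values:
        buffer.append(value)
        if len(buffer) == window:
            smoothed.append(sum(buffer) / window)
    return smoothed


# Made-up sample data, not real measurements.
samples = [1, 2, 3, 4, 5, 6]
print(moving_average(samples, window=3))  # [2.0, 3.0, 4.0, 5.0]
```

Dropping the incomplete windows at the start is one design choice among several. pandas instead pads those positions with NaN so the output keeps the same index as the input.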

@smithara

I didn't include xarray in the time series tutorial, although I may update it with a small additional example / mention of xarray in the future. As for the cartopy tutorial, I'm not sure when I will have time to create it, so we can leave it to a workshop next semester.

@dvalters
Member

dvalters commented Mar 8, 2019

I've been thinking about the numerical modelling workshops, and I think there may be better/more useful topics that can be covered for people based on feedback from some of my colleagues. (It is difficult to cover this kind of modelling in a short 2hr workshop without covering lots of maths...)

I will finish the geopandas one soon, but then I'm proposing the other 2 that I'm writing will be:

  • Object-oriented programming with Python (introductory, why it is useful, how it helps your science code etc)
  • Writing Pythonic Code (i.e. best practices in Python, tips and tricks)

These would help with understanding concepts covered in the later tutorials.

@smithara

@dvalters FYI, I started something new on plotting and cartopy, with some reference to OOP: https://github.com/smithara/python_tutorials/blob/master/matplotlib_cartopy_subplots.ipynb


4 participants