diff --git a/lectures/pandas/data_clean.md b/lectures/pandas/data_clean.md
index c7452768..328c4c26 100644
--- a/lectures/pandas/data_clean.md
+++ b/lectures/pandas/data_clean.md
@@ -254,12 +254,12 @@ df.fillna(value=100)
 
 ```{code-cell} python
 # use the _next_ valid observation to fill the missing data
-df.fillna(method="bfill")
+df.bfill() # in new versions of pandas, bfill will directly fill missing data
 ```
 
 ```{code-cell} python
 # use the _previous_ valid observation to fill missing data
-df.fillna(method="ffill")
+df.ffill()
 ```
 
 We will see more examples of dealing with missing data in future
diff --git a/lectures/pandas/groupby.md b/lectures/pandas/groupby.md
index fb3f249c..0b81b9e9 100644
--- a/lectures/pandas/groupby.md
+++ b/lectures/pandas/groupby.md
@@ -213,7 +213,7 @@ def smallest_by_b(df):
 ```
 
 ```{code-cell} python
-gbA.apply(smallest_by_b)
+gbA.apply(smallest_by_b, include_groups=False)
 ```
 
 Notice that the return value from applying our series transform to `gbA`
@@ -250,7 +250,7 @@ index and a `Date` column added.
 df2 = df.copy()
 df2["Date"] = pd.date_range(
     start=pd.Timestamp.today().strftime("%m/%d/%Y"),
-    freq="BQ",
+    freq="BQE",
     periods=df.shape[0]
 )
 df2 = df2.set_index("A")
@@ -260,7 +260,7 @@ df2
 We can group by year.
 
 ```{code-cell} python
-df2.groupby(pd.Grouper(key="Date", freq="A")).count()
+df2.groupby(pd.Grouper(key="Date", freq="YE")).count()
 ```
 
 We can group by the `A` level of the index.
@@ -272,14 +272,14 @@ df2.groupby(pd.Grouper(level="A")).count()
 We can combine these to group by both.
 
 ```{code-cell} python
-df2.groupby([pd.Grouper(key="Date", freq="A"), pd.Grouper(level="A")]).count()
+df2.groupby([pd.Grouper(key="Date", freq="YE"), pd.Grouper(level="A")]).count()
 ```
 
 And we can combine `pd.Grouper` with a string, where the string denotes a column name
 
 ```{code-cell} python
-df2.groupby([pd.Grouper(key="Date", freq="A"), "B"]).count()
+df2.groupby([pd.Grouper(key="Date", freq="YE"), "B"]).count()
 ```
 
 ## Case Study: Airline Delays
diff --git a/lectures/pandas/timeseries.md b/lectures/pandas/timeseries.md
index 9a434a68..097cfc0a 100644
--- a/lectures/pandas/timeseries.md
+++ b/lectures/pandas/timeseries.md
@@ -442,7 +442,7 @@ Below are some examples.
 
 ```{code-cell} python
 # business quarter
-btc_usd.resample("BQ").mean()
+btc_usd.resample("BQE").mean()
 ```
 
 Note that unlike with `rolling`, a single number is returned for
diff --git a/lectures/tools/maps.md b/lectures/tools/maps.md
index 359cbf5c..f1ce56a7 100644
--- a/lectures/tools/maps.md
+++ b/lectures/tools/maps.md
@@ -13,6 +13,7 @@ kernelspec:
 
 **Co-author**
 > - [Kim Ruhl *University of Wisconsin*](http://kimjruhl.com)
+> - [Philip Solimine *UBC*](https://www.psolimine.net)
 
 **Prerequisites**
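
Review note on the `data_clean.md` hunk: the patch swaps `df.fillna(method=...)` for the dedicated `bfill()` and `ffill()` methods. The snippet below is a minimal sketch of what those two calls do. The frame `toy` and its column `x` are hypothetical stand-ins, not the lecture's `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame for illustration only (not the lecture's `df`).
toy = pd.DataFrame({"x": [1.0, np.nan, 3.0, np.nan]})

# Backward fill: each gap takes the next valid observation.
# The trailing NaN has no later value to borrow, so it stays missing.
back = toy.bfill()

# Forward fill: each gap takes the previous valid observation.
forward = toy.ffill()

print(back["x"].tolist())
print(forward["x"].tolist())
```

Both methods return a new frame and leave `toy` unchanged, which matches how the lecture cells display the result without reassigning `df`.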