2020-01-14_Mixed-Effect-Models.Rmd
---
title: "Mixed Effects Models and other notes from the session"
author: "Lukas Schmid"
date: "9 1 2020"
output: github_document
---
# exams
## example code
- everybody got the best grade
## 48-hours exam
### formalities
- we have to send in five to ten pages (PDF)
- Henrik will not read any pages beyond that
- we will be sent the code at 10:00 on Saturday, 1st of March
- we have to send the code back by 10:00 on Monday, 3rd of March
- there will be an additional 5% on the grade for writing a [Haiku](https://ja.wikipedia.org/wiki/%E4%BF%B3%E5%8F%A5) about the data
- our task is to find the most important patterns in the data: "What is the
most important message in the data?"
- there will be four to nine different datasets; each of us will be assigned
to one of them, in groups of three to four people
> I never said you should work together. But it would be stupid not to. *(Henrik von Werden)*
### content and structure
- tests can be used, but we do not have to explain how they work statistically
- how to structure the code
- tell a story about the data!
- Chan suggests looking at Lukas S.'s code for the tree-ring datasets:
communicate the questions you asked yourself when you looked at the
code
- general starting point: nature of the dataset, descriptive statistics,
formulate assumptions and main questions (Henrik's ideal number of research
questions is five)
- focus on specifics that help you answer your question
- wherever possible, leave all settings at the defaults provided by RStudio
### Henrik's evaluation criteria
- steps of the analysis that do not contribute to any of the questions that
were posed will count against the evaluation
- results should answer the question (or draw the conclusion that the question
cannot be answered by the data)
- graphics have to adhere to common standards (labels, legends, etc.); they can
also be altered in size
- plots should not show off the author's skills, but present the information
as precisely as possible
- try to maintain a balance between the different parts (= answers to the
different questions)
# writing code coherently
- some code was well explained, some not so much
- we need a way to make sure that shared code works on different machines:
  - **send the plain code (.R or .Rmd)** and a **PDF generated from R Markdown**
  - additionally, everyone can send the report in any other format (e.g. .html)
    for more interactive documents
# Mixed Effects Models
- continuation of generalised linear models
- they are very complicated and we might never need them
- baseline example: the question is how a drug works in general, not how it
works *on a specific person*, even though drugs do have varying effects on
different people.
- how it works: the model takes grouping factors (e.g. different hospitals)
as additional input
- e.g. medical studies try to eliminate the influence of general health
(smoking, sports, obesity)
- these factors are called random factors (even though they are not
technically 'random', only in relation to the question)
- in R, several packages are available for fitting mixed effects models
(sometimes called multilevel models): `lme4` for multilevel modeling in
general, `lmerTest` for p-values in these models, and `sjstats` for some tools
that surround mixed effects models
- for a visualisation (coded in R), see <http://mfviz.com/hierarchical-models/>
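
To make the idea of a random factor more concrete, here is a minimal sketch of a random-intercept model in `lme4`. It was not part of the session. It assumes `lme4` is installed and uses the `sleepstudy` example dataset that ships with the package. The choice of `Days` as the fixed effect and `Subject` as the random factor is only an illustration.

```r
# Minimal sketch, not from the session: assumes the lme4 package is installed.
library(lme4)

# sleepstudy is an example dataset shipped with lme4:
# reaction times of 18 subjects over several days of sleep deprivation.
data("sleepstudy", package = "lme4")

# Fixed effect:  Days    (the general effect we are interested in)
# Random factor: Subject (each subject gets its own intercept)
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

fixef(fit)                # overall intercept and slope for Days
VarCorr(fit)              # variance between subjects vs. residual variance
head(ranef(fit)$Subject)  # per-subject deviations from the overall intercept
```

The `(1 | Subject)` term tells the model that differences between subjects are not of interest in themselves but should still be accounted for, which matches the drug example above. Loading `lmerTest` instead of `lme4` before fitting adds p-values to the output of `summary(fit)`.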