Passing "true" class labels as prior for selected z parameters #184

Open
haukelicht opened this issue Dec 9, 2023 · 1 comment

Comments

@haukelicht

Thank you for providing this awesome package!

In the "Workflow" vignette (line 103), you note that '{rater} also supports ... setting (some of) the prior parameters ...'.

How can I implement priors for the "true" category of some of the items? I have "ground truth" labels for a subset of the items in my use case and I want to "pull" the model's $z$ estimates towards these values during inference.

Take, for example, the ratings of item 12 in the anesthesia dataset:

> library(rater)
> data("anesthesia")
> anesthesia[anesthesia$item == 12, ]
   item rater rating
78   12     1      2
79   12     1      2
80   12     1      2
81   12     2      3
82   12     3      3
83   12     4      4
84   12     5      3

What if I know that the "ground truth" rating for this item is 3?

Thank you for your help!

@jeffreypullin
Owner

jeffreypullin commented Dec 13, 2023

Hi Hauke,

Thanks for your interest in rater!

What I would recommend is creating a new 'ground truth' rater in your data for the items you have ground truth for, and then specifying the prior for that rater to encode that it is very accurate (i.e. has large on-diagonal entries in its prior parameter matrix).

In code something like:

library(rater)

# We have 'ground truth' ratings for the first three patients (here, category 1).
# Add them as rows from a new rater 6, keeping the original ratings intact.
ground_truth <- data.frame(item = 1:3, rater = 6, rating = 1)
anesthesia_w_ground_truth <- rbind(anesthesia, ground_truth)

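# Taken from the rater() function's code.
J <- 6  # number of raters, including the 'ground truth' rater
K <- 4  # number of rating categories

# Each row of the prior sums to N, with a fraction p of that on the diagonal.
N <- 8
p <- 0.6
on_diag <- N * p
off_diag <- N * (1 - p) / (K - 1)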

beta_slice <- matrix(off_diag, nrow = K, ncol = K)
diag(beta_slice) <- on_diag

# Default prior for the five original raters.
beta <- array(dim = c(J, K, K))
for (j in 1:5) {
  beta[j, , ] <- beta_slice
}

beta_slice_ground_truth <- matrix(off_diag, nrow = K, ncol = K)
# This value may require tweaking.
diag(beta_slice_ground_truth) <- 15
beta[6, , ] <- beta_slice_ground_truth

fit_w_ground_truth <- rater(anesthesia_w_ground_truth,
                            dawid_skene(beta = beta))
fit <- rater(anesthesia, "dawid_skene")

plot(fit, "theta")
plot(fit_w_ground_truth, "theta")
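
For the item 12 example from the question, the same idea might look like the sketch below. It rebuilds only the seven rows shown in the question so that it runs without the package; the rater number 6 and the name `ground_truth_12` are illustrative assumptions, and in practice the extra row would be appended to the full `anesthesia` data in the same way.

```r
# Ratings for item 12, copied from the rows printed in the question.
item_12 <- data.frame(
  item   = rep(12, 7),
  rater  = c(1, 1, 1, 2, 3, 4, 5),
  rating = c(2, 2, 2, 3, 3, 4, 3)
)

# Hypothetical extra row: rater 6 stands in for the 'ground truth' rater,
# and 3 is the known true category for item 12.
ground_truth_12 <- data.frame(item = 12, rater = 6, rating = 3)

# Append the ground-truth rating; the original ratings are left untouched.
item_12_w_ground_truth <- rbind(item_12, ground_truth_12)
item_12_w_ground_truth
```

The prior for rater 6 would then be set through `beta[6, , ]` exactly as above.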

One disclaimer, however: I have not used this technique in a real data analysis.
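
When tweaking the on-diagonal value, it may help to look at the prior mean accuracy it implies. The sketch below assumes each row of a rater's prior matrix holds the parameters of a Dirichlet distribution, so the prior mean of the on-diagonal entry is that entry divided by the row sum. The helper names `prior_accuracy` and `on_diag_for` are made up for illustration and are not part of rater.

```r
K <- 4
N <- 8
p <- 0.6
off_diag <- N * (1 - p) / (K - 1)

# Prior mean of the on-diagonal entry of one Dirichlet row:
# on_diag divided by the sum of all K parameters in that row.
prior_accuracy <- function(on_diag, off_diag, K) {
  on_diag / (on_diag + (K - 1) * off_diag)
}

default_accuracy <- prior_accuracy(N * p, off_diag, K)    # default raters
ground_truth_accuracy <- prior_accuracy(15, off_diag, K)  # diagonal of 15

# Inverse: the on-diagonal value needed to reach a target prior mean accuracy,
# holding the off-diagonal entries fixed.
on_diag_for <- function(target, off_diag, K) {
  target * (K - 1) * off_diag / (1 - target)
}

on_diag_95 <- on_diag_for(0.95, off_diag, K)
```

With the values above, the default prior has a mean accuracy of 0.6 and a diagonal of 15 gives roughly 0.82, so a considerably larger value may be needed if the ground-truth rater should be treated as nearly infallible. This looks only at the prior mean; the row sum also controls how tightly the prior concentrates around it.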

Hope that helps, let me know how you get on!

Cheers,
Jeffrey
