character limit in translating to stan with ulam()? #407

Open
lorenze3 opened this issue Aug 31, 2023 · 3 comments

I really appreciate the work, both in educating on Bayesian regression in such an entertaining way and in making Stan so easy to use from R.

I'm abusing the rethinking package to avoid writing macros to write Stan code (because someone already did that, right?). I'd use brms for my problem, except I want per-coefficient lower bounds (i.e. some coefficients non-negative, but not all). At any rate, I've got a very long deterministic assignment. Once I add 'too many' terms to the model, ulam() breaks the assignment into two lines, but not in a way that is syntactically correct.

Here's an attempt at a reproducible example:

#!R
library(rethinking)   # provides ulam()
library(tibble)       # provides tibble()

d<-tibble(week=1:100,store_id=rep(c(1,2),50), digital=1:100,
          leads=1:100,facebook=1:100,bing=1:100,cable=1:100,
          google=1:100,marketingcloud=1:100,
          mntn=1:100,newspaper=1:100,ooh=1:100,pinterest=1:100,
          radio=1:100,simplifi=1:100,snapchat=1:100,video=1:100,
          tv=1:100,taboola=1:100,tiktok=1:100,
          twitter=1:100,sin1=sin(1:100),cos1=cos(1:100))

fff<-ulam(alist(b_bing ~normal(0,10),
                b_cable ~ normal(0,10),
                b_digital~normal(0,10),
                b_facebook~normal(0,10),
                b_google~normal(0,10),
                b_marketingcloud~normal(0,10),
                b_mntn~normal(0,10),
                b_newspaper~normal(0,10),
                b_ooh~normal(0,10),
                b_pinterest~normal(0,10),
                b_radio~normal(0,10),
                b_simplifi~normal(0,10),
                b_snapchat~normal(0,10),
                b_video~normal(0,10),
                b_tv~normal(0,10),
                b_taboola~normal(0,10),
                b_tiktok~normal(0,10),
                b_twitter~normal(0,10),
                b_sin1~normal(0,10),
                b_cos1~normal(0,10),
                store_int[store_id]~normal(0,10),
                a0 ~normal(0,10),
                b_week~normal(0,100),
                big_model <- b_bing * bing + b_cable * cable + b_digital * digital + 
                  b_facebook * facebook + b_google * google + b_marketingcloud * 
                  marketingcloud + b_mntn * mntn + b_newspaper * newspaper + 
                  b_ooh * ooh + b_pinterest * pinterest + b_radio * radio + 
                  b_simplifi * simplifi + b_snapchat * snapchat + b_video * 
                  video + b_tv * tv + b_taboola * taboola + b_tiktok * tiktok + 
                  b_twitter * twitter + b_sin1 * sin1 + b_cos1 * cos1 + b_week * 
                  week + store_int[store_id] + a0,
                big_sigma~half_normal(100,100),
                leads ~normal(big_model,big_sigma)),
          data=d,sample=F,declare_all_data = F)

If you check the resulting Stan code in fff$model, you'll see that the big_model[i] assignment is split into two lines, but the first line ends in a dangling +, and the second line does not increment the existing big_model[i]. I'll paste that code in below.

#!stan
data{
    array[100] int store_id;
    array[100] int week;
     vector[100] cos1;
     vector[100] sin1;
    array[100] int twitter;
    array[100] int tiktok;
    array[100] int taboola;
    array[100] int tv;
    array[100] int video;
    array[100] int snapchat;
    array[100] int simplifi;
    array[100] int radio;
    array[100] int pinterest;
    array[100] int ooh;
    array[100] int newspaper;
    array[100] int mntn;
    array[100] int marketingcloud;
    array[100] int google;
    array[100] int facebook;
    array[100] int digital;
    array[100] int cable;
    array[100] int bing;
     vector[100] leads;
}
parameters{
     real b_bing;
     real b_cable;
     real b_digital;
     real b_facebook;
     real b_google;
     real b_marketingcloud;
     real b_mntn;
     real b_newspaper;
     real b_ooh;
     real b_pinterest;
     real b_radio;
     real b_simplifi;
     real b_snapchat;
     real b_video;
     real b_tv;
     real b_taboola;
     real b_tiktok;
     real b_twitter;
     real b_sin1;
     real b_cos1;
     vector[2] store_int;
     real a0;
     real b_week;
     real<lower=0> big_sigma;
}
model{
     vector[100] big_model;
    leads ~ normal( big_model , big_sigma );
    big_sigma ~ normal( 100 , 100 );
    for ( i in 1:100 ) {
        big_model[i] = b_bing * bing[i] + b_cable * cable[i] + b_digital * digital[i] + b_facebook * facebook[i] + b_google * google[i] + b_marketingcloud * marketingcloud[i] + b_mntn * mntn[i] + b_newspaper * newspaper[i] + b_ooh * ooh[i] + b_pinterest * pinterest[i] + b_radio * radio[i] + b_simplifi * simplifi[i] + b_snapchat * snapchat[i] + b_video * video[i] + b_tv * tv[i] + b_taboola * taboola[i] + b_tiktok * tiktok[i] + b_twitter * twitter[i] + b_sin1 * sin1[i] + b_cos1 * cos1[i] + b_week * week[i] + store_int[store_id[i]] + ;
        big_model[i] =     a0;
    }
    b_week ~ normal( 0 , 100 );
    a0 ~ normal( 0 , 10 );
    store_int ~ normal( 0 , 10 );
    b_cos1 ~ normal( 0 , 10 );
    b_sin1 ~ normal( 0 , 10 );
    b_twitter ~ normal( 0 , 10 );
    b_tiktok ~ normal( 0 , 10 );
    b_taboola ~ normal( 0 , 10 );
    b_tv ~ normal( 0 , 10 );
    b_video ~ normal( 0 , 10 );
    b_snapchat ~ normal( 0 , 10 );
    b_simplifi ~ normal( 0 , 10 );
    b_radio ~ normal( 0 , 10 );
    b_pinterest ~ normal( 0 , 10 );
    b_ooh ~ normal( 0 , 10 );
    b_newspaper ~ normal( 0 , 10 );
    b_mntn ~ normal( 0 , 10 );
    b_marketingcloud ~ normal( 0 , 10 );
    b_google ~ normal( 0 , 10 );
    b_facebook ~ normal( 0 , 10 );
    b_digital ~ normal( 0 , 10 );
    b_cable ~ normal( 0 , 10 );
    b_bing ~ normal( 0 , 10 );
}

For my immediate purposes I'm just going to use shorter variable names, but this might be a simple fix. Changing the big_model assignment to

        big_model[i] = b_bing * bing[i] + b_cable * cable[i] + b_digital * digital[i] + b_facebook * facebook[i] + b_google * google[i] + b_marketingcloud * marketingcloud[i] + b_mntn * mntn[i] + b_newspaper * newspaper[i] + b_ooh * ooh[i] + b_pinterest * pinterest[i] + b_radio * radio[i] + b_simplifi * simplifi[i] + b_snapchat * snapchat[i] + b_video * video[i] + b_tv * tv[i] + b_taboola * taboola[i] + b_tiktok * tiktok[i] + b_twitter * twitter[i] + b_sin1 * sin1[i] + b_cos1 * cos1[i] + b_week * week[i] + store_int[store_id[i]];
        big_model[i] = big_model[i] + a0;

seems to work (i.e. the sampling runs instead of generating an error), in case there is some character limit in R or Stan that is hard to work around.

lorenze3 (Author) commented Aug 31, 2023

Oops -- I've now realized this seems to be about the number of terms in the assignment and not character length.

But it seems possible to work around it by breaking the assignment into two formulae:

alist(
    big_model <- <20 terms like b_varname * varname>,
    big_model <- big_model[i] + <remaining terms like b_varname * varname>
)

although I'm not 100% sure that this code (which does compile and run) results in the same sampling as leaving the assignment in one piece.

rmcelreath (Owner) commented:

Interesting bug.

To work around it, use multiple symbols:

big_model <- [stuff]
big_model2 <- big_model + [more stuff]
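
For concreteness, applied to the example above the suggestion might look something like the sketch below. The split point and the names fff2 and big_model2 are illustrative rather than part of the original model. The likelihood is listed first and big_model2 before big_model because the generated code earlier in the thread shows ulam emitting formulas in reverse order, so this ordering puts the big_model assignment before the line that uses it in the Stan model block.

#!R
fff2 <- ulam(alist(
    leads ~ normal(big_model2, big_sigma),
    # listed before big_model so that, after ulam's reverse-order translation,
    # big_model is assigned before this line uses it
    big_model2 <- big_model + b_tv*tv + b_taboola*taboola + b_tiktok*tiktok +
        b_twitter*twitter + b_sin1*sin1 + b_cos1*cos1 + b_week*week +
        store_int[store_id] + a0,
    # first symbol holds the first batch of terms, kept below the count that
    # triggered the truncation
    big_model <- b_bing*bing + b_cable*cable + b_digital*digital +
        b_facebook*facebook + b_google*google + b_marketingcloud*marketingcloud +
        b_mntn*mntn + b_newspaper*newspaper + b_ooh*ooh + b_pinterest*pinterest +
        b_radio*radio + b_simplifi*simplifi + b_snapchat*snapchat + b_video*video,
    big_sigma ~ half_normal(100, 100),
    b_bing ~ normal(0,10), b_cable ~ normal(0,10), b_digital ~ normal(0,10),
    b_facebook ~ normal(0,10), b_google ~ normal(0,10), b_marketingcloud ~ normal(0,10),
    b_mntn ~ normal(0,10), b_newspaper ~ normal(0,10), b_ooh ~ normal(0,10),
    b_pinterest ~ normal(0,10), b_radio ~ normal(0,10), b_simplifi ~ normal(0,10),
    b_snapchat ~ normal(0,10), b_video ~ normal(0,10), b_tv ~ normal(0,10),
    b_taboola ~ normal(0,10), b_tiktok ~ normal(0,10), b_twitter ~ normal(0,10),
    b_sin1 ~ normal(0,10), b_cos1 ~ normal(0,10),
    store_int[store_id] ~ normal(0,10),
    a0 ~ normal(0,10),
    b_week ~ normal(0,100)),
  data=d, sample=FALSE, declare_all_data=FALSE)

With sample=FALSE, the translated code can be inspected via fff2$model, as above, to confirm that neither assignment gets truncated.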

lorenze3 (Author) commented Sep 1, 2023

Thanks for the response! And, dang it, I shoulda thought of that.
