1. Avana - human
2. Asiago - mouse
1. Rule Set 1 score
2. Specificity within protein-coding regions
3. Target site location within the gene
Libraries were tested by selecting for resistance to vemurafenib (Zelboraf) in A375 melanoma cells, which carry the BRAF V600E mutation and are sensitive to MAPK pathway inhibition (Supp Fig 1-2)
sgRNAs were ranked by their log2 fold-change relative to their abundance in the plasmid DNA pool, and the ranks from the two replicates were averaged - Supp Table 4
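The ranking step above can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the function names, the pseudocount of 1 and the toy read counts are assumptions, and read-depth normalization is left out for brevity.

```python
import math

def log2_fold_change(count, plasmid_count, pseudocount=1.0):
    """log2 ratio of a sample count to the plasmid count (pseudocount is an assumption)."""
    return math.log2((count + pseudocount) / (plasmid_count + pseudocount))

def rank_descending(values):
    """Map each sgRNA name to a 1-based rank; rank 1 is the largest value."""
    order = sorted(values, key=values.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(order)}

def average_rank(plasmid, replicates):
    """Rank sgRNAs by log2 fold-change within each replicate, then average the ranks."""
    per_replicate = []
    for rep in replicates:
        lfc = {sg: log2_fold_change(rep[sg], plasmid[sg]) for sg in plasmid}
        per_replicate.append(rank_descending(lfc))
    return {sg: sum(r[sg] for r in per_replicate) / len(per_replicate) for sg in plasmid}

# Toy counts (hypothetical), two replicates
plasmid = {"sgA": 100, "sgB": 100, "sgC": 100}
rep1 = {"sgA": 400, "sgB": 100, "sgC": 50}
rep2 = {"sgA": 200, "sgB": 300, "sgC": 50}
avg = average_rank(plasmid, [rep1, rep2])
```

Averaging ranks rather than raw fold-changes keeps one replicate with a wider dynamic range from dominating the combined ordering.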
RIGER uses data from only the top two perturbations, so STARS was developed - it generates false-discovery rates (FDR), similar to the MAGeCK algorithm
Annotation of PanCancer genes - their loss may restore MAPK pathway activity or provide another path for survival - Supp Table 7
Negative-selection screen performance of Avana in HT29 (colon cancer cell line) - Supp Table 10 and 11 and Supp Fig 10
Screening of Avana for resistance to the purine analog 6-thioguanine in A375 (HPRT1 was targeted most), HT29 and 293T cell lines (HPRT1 was enriched, but NUDT5 targeting produced similar resistance) - Fig 2a, Supp Table 14 and Supp Fig 11
4000 sgRNAs targeting 17 genes - efficacy of each sgRNA versus its position in protein coding region - Supp Fig 12 and Fig 3c
- Linear Regression
- L1-regularized linear regression
- L2-regularized linear regression
- hybrid SVM plus logistic regression
- Random Forest
- Gradient-boosted regression tree
- L1 logistic regression (classifier)
- SVM classification
20% versus 80% classification using SVM with logistic regression gave the best results - Fig 4a and Supp Fig 13.
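A minimal sketch of the labelling that a 20% versus 80% classification implies, assuming the "20%" refers to the most active fifth of sgRNAs by score. The function name and toy scores are hypothetical, and the classifier itself is not shown.

```python
import math

def label_top_fraction(scores, fraction=0.2):
    """Label the top `fraction` of sgRNAs by score as 1 (active) and the rest as 0."""
    n_top = max(1, math.ceil(fraction * len(scores)))
    order = sorted(scores, key=scores.get, reverse=True)
    top = set(order[:n_top])
    return {name: int(name in top) for name in scores}

# Ten hypothetical sgRNAs with activity scores 0..9
toy_scores = {f"sg{i}": float(i) for i in range(10)}
labels = label_top_fraction(toy_scores)
```

The resulting 0/1 labels are what a classifier such as an SVM or logistic regression would be trained on. Whether the cut is applied per gene or across the whole data set is not stated in these notes.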
Linear vs logistic regression model performance on the FC data set - linear performed better - Supp Fig 14
- Previously, position-specific single nucleotides and dinucleotides and the GC count of the sgRNA were used
- Position-independent nucleotide counts
- Location of sgRNA target site within gene
- Biochemical and structural properties - specifically thermodynamic properties
- Microhomology features - did not improve performance
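The sequence-derived features in the list above could be encoded along these lines. This is a sketch under assumptions: the feature names and the four-base toy sequence are made up for illustration, and the gene-location and thermodynamic features are omitted because they need inputs beyond the sgRNA sequence.

```python
from itertools import product

NUCS = "ACGT"
DINUCS = ["".join(p) for p in product(NUCS, repeat=2)]

def sequence_features(seq):
    """Return a dict of simple sequence features for one sgRNA sequence."""
    seq = seq.upper()
    feats = {}
    # Position-specific single nucleotides (one-hot per position)
    for i, nuc in enumerate(seq):
        for n in NUCS:
            feats[f"pos{i}_{n}"] = int(nuc == n)
    # Position-specific dinucleotides (one-hot per adjacent pair)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        for di in DINUCS:
            feats[f"pos{i}_{di}"] = int(pair == di)
    # Position-independent nucleotide counts
    for n in NUCS:
        feats[f"count_{n}"] = seq.count(n)
    # Position-independent dinucleotide counts (overlapping windows)
    for di in DINUCS:
        feats[f"count_{di}"] = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == di)
    # GC count of the sequence
    feats["gc_count"] = seq.count("G") + seq.count("C")
    return feats

feats = sequence_features("ACGT")
```

For a sequence of length n this yields 4n + 16(n - 1) position-specific indicators plus 21 position-independent values, so the feature count grows linearly with sequence length.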