Data science problem solver bringing an analytical perspective honed in the biotechnology industry. Experience developing personal and company-scale tools and databases to aid the translation of complex multi-source data into actionable results. Expertise in pre- and post-experiment statistics, design of experiments, molecular biology, and process development. Skilled in scientific communication, data presentation, and converting company-wide goals into specific questions and actionable experiments. Inquisitive, analytical, innovative.
NYC Data Science Academy Project Portfolio
-
Fraud Detection in Medicare Claims Data - Article - Presentation - Repository
- Built a model that predicts fraudulent Medicare claims with 78% accuracy and fraudulent providers with 87% accuracy by cross-validating and tuning scikit-learn, XGBoost, and CatBoost models.
- Reduced data requirements from 100+ claims to 10 while maintaining accuracy and specificity by predicting fraud at the claim level and then aggregating to the provider level. This cut time-to-prediction from over a year to under 2 months, allowing multiple opportunities to identify fraudulent providers.
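The claim-to-provider aggregation described above can be illustrated with a minimal sketch. It assumes the claim-level model has already produced a fraud probability per claim; the averaging rule, the 0.5 threshold, the 10-claim cap as a parameter, and all function and variable names are hypothetical choices for illustration, not necessarily what the project repository does.

```python
from collections import defaultdict
from statistics import mean


def provider_fraud_scores(claim_scores, max_claims=10):
    """Average claim-level fraud probabilities per provider.

    claim_scores: iterable of (provider_id, probability) pairs in arrival order.
    Only the first `max_claims` claims per provider are used (assumed cap).
    """
    by_provider = defaultdict(list)
    for provider_id, prob in claim_scores:
        if len(by_provider[provider_id]) < max_claims:
            by_provider[provider_id].append(prob)
    return {pid: mean(probs) for pid, probs in by_provider.items()}


def flag_providers(scores, threshold=0.5):
    """Return provider IDs whose aggregated score meets the assumed threshold."""
    return sorted(pid for pid, score in scores.items() if score >= threshold)


# Toy claim-level probabilities (made up for illustration only).
claims = [("P1", 0.9), ("P1", 0.7), ("P2", 0.1), ("P2", 0.3), ("P1", 0.8)]
scores = provider_fraud_scores(claims)
flagged = flag_providers(scores)
print(flagged)
```

Capping the number of claims per provider is what would let a provider be scored after only a handful of claims instead of a full year of history.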
-
Interactive Database Visualization - Article - Repository
- Built an interactive dashboard in R-Shiny for the management of cell stocks across multiple R&D departments with integrated data visualizations and workflow-specific data capture.
- Interviewed managers and operators across departments to tailor the database visualizations and data capture to their specific needs.
-
Playlistr - Repository - Web App
- Uses the Spotify API to create playlists matching a chosen message and saves them to your Spotify account.
- The Dash app provides two interfaces: "pick-and-choose" for manual selection and "auto-solver" for automatic playlist creation. The playlistr.py file can also be run locally but requires Spotify app credentials.
-
Daylist Album Art Generator - Repository - Web App
-
Wit, Wisdom, and Vector Embeddings - Repository
- Applied NLP to visualize the similarity of sayings from Benjamin Franklin's Poor Richard's Almanack. Used SentenceTransformers and UMAP to vectorize the sayings and reduce them to a graphable form.
- The embedded quotes are visualized in a Dash app using Plotly to allow interactions with the graph. Users can enter a new phrase to find the closest match, hover over quotes to view them and their neighbors, or highlight a region and view all quotes in the table below.
- Note: Due to the size of the SentenceTransformer package, the Dash app cannot be hosted on standard platforms (Heroku); see the repo for download and installation instructions.
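The "closest match" lookup can be sketched as a nearest-neighbor search over embedding vectors. This is a minimal illustration under stated assumptions: the three-dimensional vectors stand in for real sentence embeddings, cosine similarity is an assumed choice of metric, and the quote labels and function names are hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_quote(query_vec, quote_vecs):
    """Return the label of the stored vector most similar to the query."""
    return max(quote_vecs, key=lambda q: cosine_similarity(query_vec, quote_vecs[q]))


# Toy vectors standing in for real sentence embeddings (illustrative only).
quote_vecs = {
    "quote_a": [1.0, 0.0, 0.2],
    "quote_b": [0.0, 1.0, 0.1],
    "quote_c": [0.7, 0.7, 0.0],
}
query_vec = [0.9, 0.1, 0.1]  # stand-in for the embedding of a user's phrase
best = closest_quote(query_vec, quote_vecs)
print(best)
```

In a full pipeline the vectors would come from the embedding model, and the search would typically run on the original high-dimensional embeddings, with the UMAP projection reserved for plotting.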
-
Housing Price Analysis with Machine Learning - Article - Repository
-
Web Scraping for Business Analysis - Article - Repository