Email Addresses: [email protected] | [email protected]
(You may see [email protected] - this is a personal email address that I check infrequently)
I am a student at Boston University (BU) where I am working on completing a Master's degree in Computer Engineering with a specialization in Data Analytics.
My degree at BU will take a little longer than a typical Master's program. Until 2021, I was a chemist. I studied chemistry at Stonehill College and worked at PerkinElmer Health Sciences afterwards, but realized I wanted to work in a different field in the long term. Instead of gradually making up classes so that I could enroll directly in a Master's program, I started at BU as a LEAP student (LEAP program). After a couple of semesters of "catch up" courses, I will start as a Master's candidate alongside other computer engineering students.
Personal Projects
- Pet Classification Machine Learning Challenge - still working on finishing touches (Active)
- Sudoku Game in Python - in development - link coming soon
- Data Scientist Learning Path on Codecademy.com (Active)
- Analyzing Data - Subset of Above (Active)
Spring 2022 Project Repositories
Priority Queues for Dijkstra's Single Source Shortest Path Algorithm
Description: For my Advanced Data Structures and Algorithms course, each member of our group implemented a modern heap for use as the priority queue in Dijkstra's algorithm. My role was to implement a rank-pairing heap. This heap is well suited to Dijkstra's algorithm because it supports decrease-key and insert in amortized constant ($O(1)$) time. In addition to implementing four heaps, the group analyzed the runtime of Dijkstra's algorithm with each implementation.
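To show where the priority queue operations occur in Dijkstra's algorithm, here is a minimal sketch. It is not the rank-pairing heap from the project: it uses Python's built-in `heapq` (a binary heap) and simulates decrease-key by pushing a new entry and skipping stale ones. The function name `dijkstra` and the adjacency-list format are assumptions chosen for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from source to every reachable node.

    graph: dict mapping node -> list of (neighbor, non-negative weight).
    This input format is an assumption made for this sketch.
    """
    dist = {source: 0}
    heap = [(0, source)]  # entries are (tentative distance, node)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)  # delete-min
        if u in done:
            continue  # stale entry left behind by an earlier "decrease-key"
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                # heapq has no decrease-key, so push a fresh entry instead.
                # A heap with true decrease-key would update v in place here.
                heapq.heappush(heap, (nd, v))
    return dist


example = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(example, "a"))
```

Each edge relaxation may trigger one decrease-key, so the cost of that operation matters on dense graphs. With a binary heap the overall bound is $O((V + E) \log V)$, while a heap with amortized $O(1)$ decrease-key brings it to $O(E + V \log V)$.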
Multiclass SVM Approaches for Large Multiclass Problems
Description: For a machine learning course, my group set out to implement multiclass support vector machine (SVM) classifiers from the literature - the paper that we used as a template for our implementations is cited below. Since SVMs are binary classifiers, they need additional steps before they can be applied to multiclass classification tasks. This repo implements four multiclass SVM approaches and compares their performance across multiple datasets.
Multiclass SVM Reference: Chih-Wei Hsu and Chih-Jen Lin, "A comparison of methods for multiclass support vector machines," in IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415-425, March 2002, doi: 10.1109/72.991427.
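As an illustration of the "additional steps" needed to use a binary classifier on a multiclass problem, here is a minimal sketch of one common reduction, one-vs-rest: train one binary model per class and predict the class whose model gives the largest decision value. Whether this is one of the four approaches in the repo is not stated here. To keep the sketch to the standard library, the binary learner is a simple perceptron standing in for a binary SVM. All function names and the toy data are hypothetical.

```python
def train_perceptron(points, labels, max_epochs=1000):
    """Train a linear binary classifier; labels must be +1 or -1.

    This is a stand-in for a binary SVM, not an SVM itself.
    Returns the weight vector with the bias as its last entry.
    """
    w = [0.0] * (len(points[0]) + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            xb = list(x) + [1.0]  # append a constant feature for the bias
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if y * score <= 0:  # misclassified (or on the boundary)
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:
            break  # every training point is on the correct side
    return w


def decision_value(w, x):
    """Signed score of x under the linear model w."""
    return sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))


def train_one_vs_rest(points, labels):
    """Train one binary model per class: that class (+1) against all others (-1)."""
    models = {}
    for c in sorted(set(labels)):
        binary = [1 if y == c else -1 for y in labels]
        models[c] = train_perceptron(points, binary)
    return models


def predict(models, x):
    """Pick the class whose binary model is most confident about x."""
    return max(models, key=lambda c: decision_value(models[c], x))


# Toy data: three well-separated clusters in 2D (made up for this sketch).
X = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 0), (5, 1), (0, 5), (1, 5), (0, 6)]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
models = train_one_vs_rest(X, y)
print([predict(models, x) for x in X])
```

On this toy data each class is linearly separable from the rest, so each binary model fits its training labels and the combined prediction recovers the training labels. Real datasets are rarely this clean, which is part of why comparing the different reductions is worthwhile.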
Algorithms and Data Structures Repos
Repos for a class on Intro to Software Eng. (EC327)
Repos for my first programming course at Boston University (EK125)
I feel like I am still just beginning to uncover what I want to work on in the long term because there is still a lot to learn about modern analytics and the future of the field.
That being said, I decided to start at BU because I was fascinated by the amount of information available and the insights that I was seeing from data. I noticed this in particular through sports since I was an athlete when I was younger and continue to follow many professional sports.
In my free time, I enjoy playing with and learning from data - mostly sports data. You will find that a lot of the open-ended work here is of this nature.
However, my professional interests are more focused on applying engineering to data-driven projects. I am excited about software development as it pertains to efficient movement and transformation of data in support of machine learning and data science.
My focus right now is demonstrating through practice what I am learning.
My courses at BU are great for learning foundational concepts and approaches to problem solving, and I work on putting this into practice whenever I get the chance.
My courses at BU have primarily focused on data structures, algorithms, probability, statistics, and machine learning in C/C++ and MATLAB. To gain experience, I set aside time to learn additional skills through courses and tutorials on the web. Since I can tailor the "curriculum," I can pick and choose lessons that I believe will be valuable based on my interests and the future jobs that appeal to me.
Machine Learning - Since this is a major element of modern work with data, I believe it is extremely important to be proficient in the topic. I am currently taking a course in ML (and plan to take a Deep Learning course through BU), and practice implementing models on personal projects to understand how data must be prepared and used.
Cloud Computing & Distributed Systems - I am really excited to continue learning about this topic right now. As I have become more comfortable with software and computer engineering, these topics have become more appealing. I am excited to focus on the foundational skills of this field through my coursework, and I try to learn as much on my own as possible. On my own, I follow courses and tutorials on cloud computing tools and providers (many listed below) so that I can confidently apply the techniques that I have learned, and continue to learn, in distributed environments.
A Short List of Specific Skills
- SQL
- Apache Spark & PySpark
- Hadoop
- Apache Airflow
- AWS (S3, EC2, and RDS) - via AWS Training Site
This includes courses that I have taken at BU as well as topics that I have learned on my own.
- Algorithms
- C/C++
- Python
- Probability and Statistics
- Data Structures
- Data Visualization - matplotlib, seaborn, and (some) ggplot2
- Operating Systems