richardmukechiwa/README.md

About Me

I am Richard Mukechiwa, a passionate data scientist from Zimbabwe with expertise in data analysis, machine learning, and deploying predictive models to solve real-world problems. With a strong foundation in Python, SQL, Power BI, and various machine learning frameworks, I create solutions that offer meaningful business insights. My portfolio showcases my journey through various data science projects that highlight my ability to handle diverse datasets and develop models that drive decision-making.

Skills & Expertise

Data Analysis: Python (Pandas, NumPy), SQL, Power BI

Data Visualization: Matplotlib, Seaborn, Plotly, Tableau, Power BI

Machine Learning: Scikit-learn, XGBoost, Logistic Regression, Prophet, ARIMA, Linear Regression

Data Engineering: SQL, ETL, Web Scraping

Model Deployment: Streamlit

Version Control & Collaboration: Git, GitHub, Jupyter Notebooks

Cloud: AWS (S3, EC2)

Featured Projects

Objective: Built a predictive model to identify customers likely to churn, enabling the business to take proactive retention steps.

Objectives

  • Segment customers into groups based on their Income and Total Purchase Amount.
  • Analyze and visualize customer behavior patterns within each segment.
  • Provide actionable insights to help businesses focus on customer needs, improve services, and optimize resource allocation.

Objectives

  • This project showcases a causal impact analysis performed on a dataset with features such as revenue, quantity, and unit_price. The goal was to assess the impact of an intervention on these variables.

  • Used the CausalImpact package to estimate the effect of an intervention on revenue.

Objectives

This project examines how two game versions, gate_30 and gate_40, influence player retention and the total number of game rounds played. The goal is to determine whether the new version (gate_40) leads to higher retention and increased gameplay.

Key Results:

  • Day 1 Retention: No significant difference (p = 0.062).
  • Day 7 Retention: Significant improvement with gate_40 (p = 0.00083).
  • Game Rounds Played: No significant difference between the two versions (p = 0.405).
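
The README does not say which statistical test produced the retention p-values above. One common choice for comparing two retention rates is a two-proportion z-test, sketched below under that assumption. The function name and the counts are hypothetical and are not taken from the project's data.

```python
import math


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Hypothetical helper for illustration; not the project's own code.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: retained players out of all players in each group.
z, p = two_proportion_ztest(1900, 10000, 1800, 10000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A p-value below the chosen significance level (often 0.05) would suggest the retention rates differ. A skewed count such as game rounds played would usually call for a different test.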

Overview

The Sales Forecasting Project utilizes advanced time series analysis techniques—ARIMA and Prophet—to accurately predict weekly sales based on historical data. By analyzing key variables, we provide valuable insights into sales trends and seasonal patterns, empowering businesses to optimize inventory and enhance decision-making.

Key Highlights

Models Used: ARIMA and Prophet for robust sales forecasting.

Overview

Analyzed hotel customer reviews using sentiment analysis to extract insights. The project involved data preprocessing, EDA, and word cloud visualizations for positive and negative sentiments. Key findings revealed strengths in location and staff but highlighted room quality issues, leading to actionable recommendations for improvement.

Tools Used: Python, TextBlob, NLTK

Project Overview

This project uses a pre-trained BERT model (nlptown/bert-base-multilingual-uncased-sentiment) to classify text into sentiment categories and implements a pipeline for analyzing textual reviews:

  • Model: Pre-trained BERT from Hugging Face.
  • Objective: Classify sentiment in customer reviews.
  • Tools: Python, Transformers, BeautifulSoup, and Pandas.

Overview

This project involves advanced SQL analysis on a sample sales dataset, aiming to answer key business questions related to product performance, revenue trends, and category comparisons. Through a series of SQL queries, the analysis uncovers insights such as:

  • Top Revenue-Generating Products: Identifying which products contribute the most to total revenue.

  • Sales Trends: Examining day-over-day revenue growth and seasonal sales trends.

  • Category Comparisons: Analyzing performance differences between Electronics and Clothing categories.

  • Product Demand Analysis: Evaluating unit sales to understand customer demand across various products.

Tools Used: Microsoft SQL Server
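
As a rough illustration of the "top revenue-generating products" question, the sketch below runs a grouped revenue query against an in-memory SQLite database from Python. SQLite stands in for Microsoft SQL Server here only so the example is self-contained. The table name, columns, and rows are all hypothetical, and on SQL Server the row limit would be written with `TOP` rather than `LIMIT`.

```python
import sqlite3

# In-memory database with a hypothetical sales table (not the project's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, category TEXT, units INTEGER, unit_price REAL)"
)

# Made-up rows for illustration only.
rows = [
    ("Laptop", "Electronics", 2, 1000.0),
    ("Laptop", "Electronics", 1, 1000.0),
    ("T-Shirt", "Clothing", 10, 20.0),
    ("Headphones", "Electronics", 5, 100.0),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# Total revenue per product, highest first, limited to the top two.
top_products = conn.execute(
    """
    SELECT product, SUM(units * unit_price) AS revenue
    FROM sales
    GROUP BY product
    ORDER BY revenue DESC
    LIMIT 2
    """
).fetchall()

for product, revenue in top_products:
    print(product, revenue)
```

Grouping by `category` instead of `product` would give the Electronics versus Clothing comparison in the same way.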

Data Science Certifications

IBM Data Science Professional Certificate (Coursera)

WorldQuant Data Science Lab

Connect With Me

LinkedIn: https://www.linkedin.com/in/richardmukechiwa/

Email: [email protected]

Feel free to explore my projects, and let's connect to discuss data science opportunities or collaborations!

“Solving business problems through data-driven insights.”

Pinned

  1. Employee-Attrition-Classification-model (Jupyter Notebook)

  2. Medical_Charges-Prediction-Random-Forest-Regression (Jupyter Notebook)

  3. HealthCare-Dataset-Analysis-with-Microsoft-SQL-server

  4. A-B-Testing-Project (Jupyter Notebook)

  5. SpaceX-SQL-Data-Analysis-