---
title: "WEIGHT LIFTING CLASSIFICATION"
author: "Henry Muhumuza"
date: "2024-07-19"
output: html_document
---
<style>
body {
font-family: 'Times New Roman', Times, serif;
}
h1 {
font-family: 'Courier New', Courier, monospace;
}
</style>

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

Using devices such as Jawbone Up, Nike FuelBand, and Fitbit, it is now possible to collect a large amount of data about personal activity. These devices are part of the quantified self movement. In this project, the goal is to predict the manner in which participants performed weight lifting exercises using accelerometer data. The outcome is the "classe" variable in the training set: Class A corresponds to the specified execution of the exercise, while the other four classes (B to E) correspond to common mistakes.

## Data Loading and Exploration
First, we load the training and test datasets and explore their structure.
```{r}
# Load necessary libraries
library(caret)
library(randomForest)
library(dplyr)
# Load the data
training_data <- read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv")
testing_data <- read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv")
# Display structure of the training data
str(training_data)
```
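
Before cleaning, it can also help to look at the distribution of the outcome classes. The short chunk below is an added illustration and assumes the data loaded successfully.

```{r}
# Distribution of the outcome classes in the raw training data
table(training_data$classe)
```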
## Data Preprocessing
Next, we preprocess the data by removing columns with missing or empty values and ensuring that the `classe` variable is a factor.
```{r}
# Remove columns that contain NA values
training_data <- training_data[, colSums(is.na(training_data)) == 0]
# Remove columns with empty cells
empty_cells <- apply(training_data, 2, function(col) any(col == ""))
training_data <- training_data[, !empty_cells]
# Drop the last column of the testing data (column 160, problem_id)
testing_data <- testing_data[,-160]
# Remove columns that contain NA values
testing_data <- testing_data[, colSums(is.na(testing_data)) == 0]
# Remove columns with empty cells
empty_cells <- apply(testing_data, 2, function(col) any(col == ""))
testing_data <- testing_data[, !empty_cells]
# Ensure the classe variable is a factor
training_data$classe <- as.factor(training_data$classe)
# Display cleaned data structure
str(training_data)
```
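
As a quick sanity check on the cleaning steps above (a small addition, not part of the original workflow), we can confirm that no missing values remain and compare the dimensions of the two cleaned datasets.

```{r}
# Confirm the cleaned datasets contain no NAs and check their dimensions
sum(is.na(training_data))
sum(is.na(testing_data))
dim(training_data)
dim(testing_data)
```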
## Model Building
We split the training data into a training set (70%) and a validation set (30%), then train a Random Forest model.

```{r}
# Split the training data
set.seed(123)
trainIndex <- createDataPartition(training_data$classe, p = .7,
list = FALSE,
times = 1)
trainSet <- training_data[trainIndex,]
validSet <- training_data[-trainIndex,]
# Train a Random Forest model
model_rf <- randomForest(classe ~ ., data = trainSet, importance = TRUE)
# Display the model
print(model_rf)
```
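
Because the model was trained with `importance = TRUE`, we can also inspect which predictors drive the classification. This is an optional sketch using the variable importance plot from the randomForest package.

```{r}
# Plot the most important predictors according to the Random Forest
varImpPlot(model_rf, n.var = 15, main = "Top 15 predictors by importance")
```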
## Model Evaluation
We evaluate the model using the validation set.
```{r}
# Make predictions on the validation set
predictions <- predict(model_rf, validSet)
# Display the confusion matrix
confusionMatrix(predictions, validSet$classe)
```
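
The expected out-of-sample error can be estimated from the validation-set accuracy; the chunk below is an added illustration that extracts it from the confusion matrix.

```{r}
# Estimate the out-of-sample error as 1 - validation accuracy
cm <- confusionMatrix(predictions, validSet$classe)
out_of_sample_error <- 1 - as.numeric(cm$overall["Accuracy"])
out_of_sample_error
```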

## Predictions on Test Data

We use the trained model to predict the classe variable for the test data.

```{r}
# Predict on test data
test_predictions <- predict(model_rf, testing_data)
# Display test predictions
test_predictions
```
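
If the predictions need to be submitted one file per test case, a helper such as the one sketched below could be used; `write_prediction_files` is a hypothetical name introduced here for illustration.

```{r}
# Hypothetical helper: write each test prediction to its own text file
write_prediction_files <- function(preds) {
  for (i in seq_along(preds)) {
    filename <- paste0("problem_id_", i, ".txt")
    write.table(preds[i], file = filename, quote = FALSE,
                row.names = FALSE, col.names = FALSE)
  }
}
# write_prediction_files(test_predictions)
```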

## Conclusion
The Random Forest model performed well, classifying the validation cases with high accuracy (see the confusion matrix above) and giving a correspondingly low estimated out-of-sample error. Random Forests are a strong choice for this task because they handle a large number of correlated sensor predictors with little tuning.
