Car Sales Price Prediction Engine

Overview

This project was created as part of Joe Brady's capstone for his Bachelor's Degree in Computer Science at Western Governors University (WGU).

Topic and Description

Car Sales Price Prediction Engine is a web application that will help a fictional car sales company set prices for the used vehicles they have acquired and wish to sell. The web application uses machine learning and a predictive algorithm to estimate the sales price of a given car based on the mileage. This includes descriptive methods (graphs) to help the client understand the data and a predictive method (a linear regression graph) to help them make educated price-setting decisions.

Project Purpose/Goals

The goal of this project and application is to decrease the time employees spend manually estimating appropriate prices, and, ultimately, help the company increase revenue. Historically, this fictional company has struggled to set appropriate prices for the cars they sell, leading to a loss in revenue. This project will help them to understand the market value of the cars they acquire so they can set appropriate prices to maximize their profit.

Descriptive Methods

The project has the following three descriptive methods (graphs):

  1. A scatter plot comparing the mileage to the sales price of each car.
  2. A bar graph that shows the number of cars sold at various price ranges.
  3. A pie chart showing the percent of cars sold at the price ranges used in the bar graph.
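
The bar graph and pie chart both rely on grouping sale prices into ranges. The following is a minimal sketch of that grouping, not the project's actual code: the names PriceBucket and bucketPrices, the fixed bucket width, and the sample prices are all assumptions made for illustration.

```typescript
// Hypothetical shape: one entry per price range shown in the charts.
interface PriceBucket {
  label: string;   // range label, e.g. "$0 to $3999"
  count: number;   // value for the bar graph
  percent: number; // value for the pie chart
}

// Groups sale prices into fixed-width ranges. The width is an
// illustrative assumption, not a value taken from the project.
function bucketPrices(prices: number[], width: number): PriceBucket[] {
  if (width <= 0) throw new Error("width must be positive");
  // Ignore values that cannot be placed in a range.
  const valid = prices.filter((p) => isFinite(p) && p >= 0);
  const counts: number[] = [];
  for (const price of valid) {
    const index = Math.floor(price / width);
    while (counts.length <= index) counts.push(0); // keep empty ranges
    counts[index] += 1;
  }
  return counts.map((count, index) => ({
    label: `$${index * width} to $${(index + 1) * width - 1}`,
    count,
    percent: valid.length === 0 ? 0 : (count * 100) / valid.length,
  }));
}

// Made-up prices, for illustration only.
const buckets = bucketPrices([1500, 3200, 4100, 7900, 8000], 4000);
console.log(buckets);
```

Because the counts and the percentages come from the same buckets, the bar graph and the pie chart stay consistent with each other.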

Predictive/Prescriptive Method

The project has one predictive method: a scatter plot with a regression line. The regression line is generated using machine learning and lets employees estimate a car's sales price from its mileage.
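
To illustrate how a regression line turns a mileage into a price estimate, the sketch below fits a straight line with ordinary least squares in closed form. This is a stand-in technique chosen to keep the example dependency-free; it is not the project's TensorFlow.js model. The names Point, Line, fitLine, and predictPrice, and the sample data, are hypothetical.

```typescript
interface Point { mileage: number; price: number; }
interface Line { slope: number; intercept: number; }

// Closed-form least squares fit of price = intercept + slope * mileage.
// (Assumption: a stand-in for the trained model, not the project's code.)
function fitLine(points: Point[]): Line {
  const n = points.length;
  if (n < 2) throw new Error("need at least two points");
  let sumX = 0;
  let sumY = 0;
  for (const p of points) {
    sumX += p.mileage;
    sumY += p.price;
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (const p of points) {
    sxy += (p.mileage - meanX) * (p.price - meanY);
    sxx += (p.mileage - meanX) * (p.mileage - meanX);
  }
  if (sxx === 0) throw new Error("all mileages are identical");
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Reads the estimated price off the fitted line.
function predictPrice(line: Line, mileage: number): number {
  return line.intercept + line.slope * mileage;
}

// Made-up sample data, for illustration only.
const sample: Point[] = [
  { mileage: 10000, price: 9000 },
  { mileage: 30000, price: 7000 },
  { mileage: 50000, price: 5000 },
];
const fitted = fitLine(sample);
console.log(predictPrice(fitted, 40000));
```

A negative slope means the estimated price drops as mileage rises, which is the relationship the chart is meant to make visible.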

Environment & Architecture

Technology Used

This project is a web application written with the following technologies:

  • JavaScript/TypeScript
  • Node.js v16.13.2 LTS
  • TensorFlow.js (Node.js library used for machine learning)
  • React.js (JavaScript library for building user interfaces)
  • Next.js (Full-stack React.js framework)
  • Tailwind CSS (Utility-first CSS framework)
  • Chart.js (JavaScript charting library)

Environment Used

The environment used and target platform for this application is Microsoft Windows.

Data Format Used

The data is stored in a CSV file that contains all the data points. The CSV is read into the system and cleaned, and unit conversions are applied as necessary. The dataset was found on Kaggle.com at the following URL:

https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho
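
The read, clean, and convert steps could look roughly like the sketch below. It is an illustration under several assumptions: the column names km_driven and selling_price are assumed to match the CSV header, the conversions are assumed to be kilometers to miles plus a currency conversion, priceRate is a hypothetical exchange rate supplied by the caller, and the naive comma split does not handle quoted fields.

```typescript
interface CarRecord { mileage: number; price: number; }

const KM_TO_MILES = 0.621371;

// Parses CSV text, drops unusable rows, and converts units.
// Column names and the priceRate parameter are assumptions.
function parseCars(csv: string, priceRate: number): CarRecord[] {
  const lines = csv.split(/\r?\n/).filter((l) => l.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",").map((h) => h.trim());
  const kmCol = header.indexOf("km_driven");
  const priceCol = header.indexOf("selling_price");
  if (kmCol < 0 || priceCol < 0) throw new Error("missing expected columns");

  const records: CarRecord[] = [];
  for (const row of lines.slice(1)) {
    const cells = row.split(","); // naive split: no quoted-field support
    const km = Number(cells[kmCol]);
    const price = Number(cells[priceCol]);
    // Cleaning step: skip rows with missing, non-numeric, or non-positive values.
    if (!isFinite(km) || !isFinite(price) || km <= 0 || price <= 0) continue;
    records.push({ mileage: km * KM_TO_MILES, price: price * priceRate });
  }
  return records;
}

// Made-up rows, for illustration only; the second row is dropped.
const cars = parseCars(
  "name,selling_price,km_driven\nCar A,100000,10000\nCar B,,5000",
  0.5
);
console.log(cars);
```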

Prerequisites

Before you can install and run the web application, you must first install the following:

Install Node.js 16

Install Node by downloading and running the Node.js 16 LTS installer from this website: https://nodejs.org/en/

Once installed, confirm that the correct version of Node is installed by running the following command in a terminal/command prompt window:

node --version

Install Yarn 1

After verifying that Node.js 16 is installed correctly, run the following command to install Yarn:

npm install --global yarn

Verify that the correct version of Yarn is installed by running the following command:

yarn --version

Installation & Application Launch

Clone Repository

After verifying that Node.js and Yarn are installed, you are ready to install the application. In a terminal window, navigate to a folder where you'd like to install the application and clone this repository:

git clone https://github.com/JoebradyDev/jb-capstone.git

Install Application

Once the repository is cloned, use a terminal window to navigate to the directory where the application was downloaded:

cd jb-capstone

In the application root directory, install the application's dependencies with Yarn by running this command:

yarn install

Launch Application

Once dependencies have been installed, you are ready to run the application. Run the following command to launch the application server:

yarn dev

Known Bug: Note that you cannot use "yarn start" or "npm start" to start the application. This will result in the charts not loading, so you must use the "yarn dev" command as mentioned above.

Open Application

After the server has been started, use a browser to navigate to the following location and view the application:

http://localhost:3000

Application Usage

Logging In and Logging Out

How to Sign In

After starting the server and navigating to http://localhost:3000, you will be brought to a login page. Please enter the following credentials to log into the application:

  • Username: "john.doe"
  • Password: "wgu1234$"

Login Page


How to Sign Out

If you need to log out of the application, you can click the "Logout" button at the top right of the screen. Note that you will also be automatically logged out after an hour of inactivity.

Logout Link


Dashboard

When you log in, you are immediately brought to the Dashboard page. This has a welcome message and a description of each page of the application.

Dashboard


Sidebar Navigation

On the left side of the screen there is a sidebar with links you can use to navigate to each page of the application.

The sidebar contains the following items:

  • Dashboard (Descriptions of/links to each page)
  • Mileage vs. Price (Scatter plot of the data)
  • Cars Sold by Price (Bar graph of cars sold by price)
  • Percent Sold by Price (Pie chart of cars sold by price)
  • Price Prediction (Regression line chart/scatter plot)

Sidebar


Chart Pages

Clicking any sidebar item other than Dashboard brings you to the corresponding chart page. Note that, at this point, the chart is not yet loaded.

Bar Chart


Chart Actions

Default Chart Actions

Each chart has a default set of chart actions (at the bottom right of the chart), which include:

  • Clear Data (Removes data from the chart)
  • Load Data (Loads relevant data into the chart)

Chart Action Buttons


After clicking the "Load Data" button, each chart will be filled with data. Note that each chart is color-coded and labeled with the appropriate data.

Filled Bar Chart


Price Prediction Chart Action

For the "Price Prediction" chart, there is one additional action:

  • Run Prediction (Runs the machine learning algorithm and loads the result into the chart)

    Note that the machine learning algorithm takes some time to run.


Regression Chart Actions


Selectable Price Ranges

The "Percent Sold by Price" pie chart has a bonus feature that allows the user to select price ranges and recalculate the percentages in real time.

Default Pie Chart


Price ranges are selected by clicking them in the legend at the top of the pie chart. Clicking a price range crosses it off the legend and removes it from the chart. To add the price range back, click the crossed-off item; it becomes uncrossed and reappears in the chart.
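
The recalculation can be thought of as recomputing each remaining range's share of the visible total. The sketch below shows that arithmetic only; it is not Chart.js API code, and the name visiblePercentages, the labels, and the counts are hypothetical.

```typescript
// Recomputes percentages over the price ranges that are still visible.
// Plain arithmetic for illustration; not how Chart.js is invoked.
function visiblePercentages(
  counts: Record<string, number>,
  hidden: string[]
): Record<string, number> {
  const shown = Object.keys(counts).filter((label) => hidden.indexOf(label) === -1);
  let total = 0;
  for (const label of shown) total += counts[label];

  const result: Record<string, number> = {};
  for (const label of shown) {
    result[label] = total === 0 ? 0 : (counts[label] * 100) / total;
  }
  return result;
}

// Made-up counts, for illustration only.
const soldByRange = { "$0 to $3,999": 30, "$4,000 to $7,999": 10, "$8,000 and up": 60 };
console.log(visiblePercentages(soldByRange, ["$8,000 and up"]));
```

Hiding a range shrinks the total, so every remaining range's percentage grows even though its count is unchanged.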

Pie Chart Legend


This is how the chart looks when the user chooses to view only cars sold in the price ranges between $0 and $8,000.

Pie Chart With Only Selected Ranges


Scatter Plot Tooltips

The "Mileage vs. Price" and "Price Prediction" scatter plot charts show tooltips when you hover over each data point.

Scatter Plot With Tooltip


If multiple cars share the same mileage and price, each one is listed in the tooltip.

Tooltip


Hideable Data Points/Regression Line

For the "Price Prediction" chart, the user can toggle whether to display the data points or the regression line.

Regression Chart


The data points or the regression line are hidden by clicking the "Actual" or "Predicted" item in the legend. Clicking an item once crosses it off and removes the corresponding data points or regression line from the chart. Clicking it again brings them back.

Regression Chart Legend

This is how the "Price Prediction" chart looks when the "Actual" data points have been hidden.

Regression Chart Without Data Points

Known Issues

  • This application can only be run in development mode due to a bug involving the integration of Chart.js 3 with Next.js.
  • It would be more appropriate to use logarithmic regression for this dataset, but linear regression was used because it is simpler and was sufficient for the project requirements.
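
For comparison, a logarithmic fit can reuse the same least squares arithmetic as a straight line by regressing price on the natural log of mileage. The sketch below is only an illustration of that idea, not code from this project; fitLogCurve and the sample values are hypothetical.

```typescript
// Fits price = a + b * ln(mileage) by least squares on ln(mileage).
// Hypothetical helper, shown only to illustrate the alternative model.
function fitLogCurve(mileages: number[], prices: number[]): { a: number; b: number } {
  const n = mileages.length;
  if (n !== prices.length || n < 2) throw new Error("need two or more paired values");
  const xs = mileages.map((m) => {
    if (m <= 0) throw new Error("mileage must be positive to take its log");
    return Math.log(m);
  });
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = prices.reduce((sum, y) => sum + y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (prices[i] - meanY);
    sxx += (xs[i] - meanX) * (xs[i] - meanX);
  }
  if (sxx === 0) throw new Error("all mileages are identical");
  const b = sxy / sxx;
  return { a: meanY - b * meanX, b };
}

// Made-up values, for illustration only.
const curve = fitLogCurve([1000, 10000, 100000], [12000, 8000, 4000]);
console.log(curve.a + curve.b * Math.log(50000));
```

With a negative b, this model drops steeply at low mileage and flattens at high mileage, which a straight line cannot represent.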
