Skip to content

Aishwaryajakka/infsci2711_finalproject

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Online Retail Datawarehouse

This project is a multi-database warehouse built using SQL, MongoDB and Neo4j on dataset : https://archive.ics.uci.edu/ml/datasets/Online+Retail+II

Each branch is an implementation of a type of database system.

For more information please contact Team Coordinator : [email protected]

Database Team
MySQL Aishwarya Jakka, Shruti Gupta
Neo4j Soham Bhatnagar, Aishwarya Jakka, Kwesi Alguillera, Kenny Wu
MongoDB Piu Mallick, Reshma Pothen

Other subgroups:

Group task Team
Data Cleaning Kwesi Alguillera
Front-end + Back-end Kenny Wu, Kwesi Alguillera
Documentation Piu mallick, Reshma Pothen , Shruti Gupta

Online Retail Database

This Online Retail II data set (the original dataset is a excel file) contains all the transactions occurring for a UK-based and registered, non-store online retail between 01/12/2009 (1st Dec 2009) and 09/12/2011 (9th Dec 2011). The company mainly sells unique all-occasion giftware. Many customers of the company are wholesalers.

Data Description

The dataset has the following attributes:
1.InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
2.StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
3.Description: Product (item) name. Nominal.
4.Quantity: The quantities of each product (item) per transaction. Numeric.
5.InvoiceDate: Invoice date and time. Numeric. The date and time when a transaction was generated.
6.UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
7.CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
8.Country: Country name. Nominal. The name of the country where a customer resides.

Aggregate Queries

1.What time of the day (which hour of the day) is the sale maximum per country?
2. What is the annual TotalSales per product?
3. What is the top product per year?
4. What is the top product per country?
5.Which item is sold below a certain threshold value? Or, what are the under-performed products based on the average sales last year (the year denotes either 2009 or 2010)?
6.Which customer spends the most (per country/overall)?
7.What is the best-selling month per country? (Given the year range, 2009-2011)
8.What is the best-selling product per month? (Given the year range, 2009-2011)
9.What is the change in TotalSales per country per year (Trend of Sales)?
10.What is the average spending of a customer per country? (TotalSales/Number of customers)
11.What is the frequently purchased item per customer?

Instructions to start the server

Pre-req: please have all databases installed and configured following the steps in their respective folders

(SQL should be at default port 7778 with username and password both set as "root", MongoDB should be at default port 27017 without changing the default username and password, Neo4j should be at default port 7687 with username "neo4j" and password "password" while database name named as "neo4j")
  1. Download and install NodeJS from https://nodejs.org/en/download/
  2. Navigate into the "/infsci2711_finalproject/frontend" folder/directory using cd in command line
  3. Type in "node app.js" to the same command line window/instance.
  4. Goto http://localhost:3000/ on any browser
  5. Voila!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages