This challenge uses Python's Pandas library to work through data analysis tasks modeled on real e-commerce work, with a focus on customer insights, product trends, and profitability.
- Module 4 Challenge files.
- Repository: pandas-challenge-1
- Clone and push changes to GitHub or GitLab.
- Import CSV data.
- Examine column names and statistics.
- Investigate data to answer key questions about item categories, subcategories, and client activity.
- Calculate subtotals, shipping prices, total prices, costs, and profits.
- Apply transformations to enhance data analysis capabilities.
- Verify transformed data against actual order receipts.
- Analyze spending for top clients.
- Summarize findings in a presentation-ready format.
- Write a concise summary of insights gained from the data.
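
The import and exploration steps above can be sketched roughly as follows. This is a minimal illustration, not the challenge solution: the column names (`client_id`, `category`, `subcategory`, `qty`) and the inline sample rows are assumptions standing in for the real CSV, whose schema is not given here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the challenge CSV. The column names and values
# are assumptions for illustration only.
csv_text = """client_id,category,subcategory,qty
101,decor,wall art,5
101,decor,candles,3
102,consumables,pens,10
101,consumables,pens,2
103,decor,wall art,1
"""

# With the real file this would be pd.read_csv(<path to the CSV>).
df = pd.read_csv(io.StringIO(csv_text))

# Column names and descriptive statistics.
print(df.columns.tolist())
print(df.describe())

# Most frequent categories, subcategories, and clients by number of entries.
top_categories = df["category"].value_counts().head(3)
top_subcategories = df["subcategory"].value_counts().head(3)
top_clients = df["client_id"].value_counts().head(5)

# Total quantity ordered by the client with the most entries.
top_client_id = top_clients.index[0]
top_client_qty = df.loc[df["client_id"] == top_client_id, "qty"].sum()
print(top_client_id, top_client_qty)
```

`value_counts()` sorts in descending order by default, so the first index label is the most frequent value.
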
- Consult the Pandas documentation for advanced functions.
- Implement well-named functions to streamline operations and improve code readability.
- Follow the analytical process, defining questions and exploring data thoroughly.
- Display column names and descriptive statistics.
- Identify top item categories and subcategories.
- Determine clients with the most data entries.
- Report the total quantity ordered by the top client.
- Create columns for subtotals, shipping prices, total prices, costs, and profits.
- Ensure accurate calculation of financial metrics.
- Match calculated total prices with provided order receipts.
- Calculate total revenue for top clients.
- Construct a summary DataFrame for top clients with key financial metrics.
- Develop a function to convert currency to millions and format data for presentation.
- Provide a brief summary of analytical findings.
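
One possible shape for the financial columns, the client summary, and the millions conversion is sketched below. Everything specific in it is an assumption: the column names, the sample rows, the flat shipping rate, and the definition of profit as total price minus cost are placeholders, and the challenge files define the real rules.

```python
import pandas as pd

# Hypothetical flat shipping rate per unit of weight (an assumption, not the
# challenge's actual shipping rule).
SHIPPING_RATE_PER_UNIT_WEIGHT = 2.0

# Hypothetical order lines; column names and values are illustrative only.
orders = pd.DataFrame({
    "client_id": [101, 101, 102],
    "unit_price": [10.0, 4.0, 250000.0],
    "unit_cost": [6.0, 2.5, 150000.0],
    "unit_weight": [1.0, 0.5, 2.0],
    "qty": [3, 10, 8],
})


def add_financial_columns(df, shipping_rate):
    """Return a copy of df with subtotal, shipping, total, cost, and profit."""
    out = df.copy()
    out["subtotal"] = out["unit_price"] * out["qty"]
    out["shipping_price"] = out["unit_weight"] * out["qty"] * shipping_rate
    out["total_price"] = out["subtotal"] + out["shipping_price"]
    out["line_cost"] = out["unit_cost"] * out["qty"]
    # Assumed definition: profit is what the client pays minus item cost.
    out["profit"] = out["total_price"] - out["line_cost"]
    return out


def to_millions(amount):
    """Convert a currency amount (scalar or Series) to millions."""
    return amount / 1_000_000


priced = add_financial_columns(orders, SHIPPING_RATE_PER_UNIT_WEIGHT)

# One row per client, sorted so the most profitable client comes first.
summary = (
    priced.groupby("client_id")[["qty", "total_price", "line_cost", "profit"]]
    .sum()
    .sort_values("profit", ascending=False)
)

# Presentation copy with the money columns expressed in millions.
money_columns = ["total_price", "line_cost", "profit"]
summary_millions = summary.copy()
summary_millions[money_columns] = (
    summary_millions[money_columns].apply(to_millions).round(2)
)
print(summary_millions)
```

Keeping the raw `summary` separate from the formatted `summary_millions` means the unrounded figures stay available for checking totals against order receipts.
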
Submit the URL of your GitHub or GitLab repository containing the challenge work for evaluation.