negativenagesh/Lung-Cancer-EDA

Lung Cancer Risk Factors Exploratory Data Analysis

Find my project and dataset on Kaggle - https://www.kaggle.com/datasets/subrahmanya090/lung-cancer. It has over 66 downloads and 452 views as of 12-04-2024. Please visit the link and upvote.

Medium - https://medium.com/@gaonkarsub/lung-cancer-risk-factors-exploratory-data-analysis-9634a218deaf

Video on Risk factors of Lung Cancer - https://youtu.be/0vVRp5eNDlA?feature=shared

Dataset:

GENDER: Gender of the individual (M: Male, F: Female)
AGE: Age of the individual
SMOKING: Smoking status (2: Yes, 1: No)
YELLOW_FINGERS: Presence of yellow fingers (2: Yes, 1: No)
ANXIETY: Anxiety level (2: High, 1: Low)
PEER_PRESSURE: Peer pressure level (2: High, 1: Low)
CHRONIC DISEASE: Presence of chronic disease (2: Yes, 1: No)
FATIGUE: Fatigue level (2: High, 1: Low)
ALLERGY: Allergy status (2: Yes, 1: No)
WHEEZING: Wheezing condition (2: Yes, 1: No)
ALCOHOL CONSUMING: Alcohol consumption status (2: Yes, 1: No)
COUGHING: Presence of coughing (2: Yes, 1: No)
SHORTNESS OF BREATH: Shortness of breath condition (2: Yes, 1: No)
SWALLOWING DIFFICULTY: Difficulty in swallowing (2: Yes, 1: No)
CHEST PAIN: Presence of chest pain (2: Yes, 1: No)
LUNG_CANCER: Lung cancer diagnosis (2: Yes, 1: No)
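The 1/2 coding above is easy to misread in plots and tables, so it can help to decode it into labels first. The following is a minimal sketch, assuming the codes are exactly as listed above; `decode_row`, `HIGH_LOW_COLUMNS`, `UNCODED_COLUMNS`, and the sample record are hypothetical names and values made up for illustration, not part of the project.

```python
# Columns described above as High/Low; every other coded column is Yes/No.
HIGH_LOW_COLUMNS = {"ANXIETY", "PEER_PRESSURE", "FATIGUE"}
# Columns that are not 1/2 coded and are passed through unchanged.
UNCODED_COLUMNS = {"GENDER", "AGE"}


def decode_row(row):
    """Return a copy of one record with its 1/2 codes replaced by labels."""
    decoded = {}
    for column, value in row.items():
        if column in UNCODED_COLUMNS:
            decoded[column] = value
        elif column in HIGH_LOW_COLUMNS:
            decoded[column] = {1: "Low", 2: "High"}[int(value)]
        else:
            decoded[column] = {1: "No", 2: "Yes"}[int(value)]
    return decoded


# Made-up record for illustration only (not taken from the dataset).
sample = {"GENDER": "M", "AGE": 69, "SMOKING": 2, "ANXIETY": 1, "LUNG_CANCER": 2}
decoded_sample = decode_row(sample)
print(decoded_sample)
```

Because the values go through `int(...)`, the same function also accepts the string codes that `csv.DictReader` would produce. A value other than 1 or 2 raises a `KeyError`, which makes unexpected codes visible instead of silently mislabeling them.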

The data has 309 rows and 16 columns of float, integer, and object types. The row index ranges from 0 to 308.
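Figures like these can be checked with pandas. The sketch below uses a tiny made-up frame so that it runs on its own; with the real data, `df` would instead come from something like `pd.read_csv("lung_cancer.csv")`, where the filename is an assumption and not confirmed by this project.

```python
import pandas as pd

# Tiny stand-in frame with made-up values; replace with the real dataset,
# e.g. df = pd.read_csv("lung_cancer.csv")  (hypothetical filename).
df = pd.DataFrame(
    {
        "GENDER": ["M", "F", "M"],
        "AGE": [69, 74, 59],
        "SMOKING": [1, 2, 1],
    }
)

# Number of rows and columns.
n_rows, n_cols = df.shape

# First and last value of the row index.
index_range = (df.index.min(), df.index.max())

print(f"{n_rows} rows, {n_cols} columns, index {index_range[0]} to {index_range[1]}")

# Per-column dtypes and non-null counts in one summary.
df.info()
```

On the full dataset, the same calls should report the row count, column count, index range, and column types stated above.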

Lung cancer is the uncontrollable growth of abnormal cells in one or both of the lungs. Cigarette smoking causes most lung cancers when smoke gets in the lungs. Lung cancer kills 1.8 million people each year, more than any other cancer. It has an 80-90% death rate, and is the leading cause of cancer death in men, and the second leading cause of cancer death in women.

The global cancer burden is estimated to have risen to 18.1 million new cases and 9.6 million deaths in 2018. One in 5 men and one in 6 women worldwide develop cancer during their lifetime, and one in 8 men and one in 11 women die from the disease. Worldwide, the total number of people who are alive within 5 years of a cancer diagnosis, called the 5-year prevalence, is estimated to be 43.8 million.

Necessary Libraries:

!pip install pandas

!pip install numpy

!pip install matplotlib

!pip install seaborn

!pip install missingno

!pip install scikit-learn

!pip install scipy

!pip install plotly
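After installing, it can be useful to confirm that every library is importable before running the analysis. This is a minimal sketch using only the standard library; `REQUIRED` and `missing_packages` are hypothetical names introduced here. Note that the pip name and the import name differ for scikit-learn, which is imported as `sklearn`.

```python
from importlib.util import find_spec

# pip package name -> import name (they differ for scikit-learn).
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "missingno": "missingno",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "plotly": "plotly",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Still to install:", ", ".join(missing))
else:
    print("All listed libraries are available.")
```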