This repository contains a portfolio of data science projects completed by Ian Silantev during training courses at Yandex.Practicum.
The projects are presented as IPython notebooks with accompanying README Markdown files.

Project | Description | Used libraries |
--- | --- | --- |
Shop client age determination | Built a machine learning model that estimates a person's approximate age from a photo, using a set of photos of people labeled with their age. | Pandas, Keras, Matplotlib, Seaborn, Computer Vision, Machine Learning |
Toxic comment classification | Sped up comment moderation by automating toxicity assessment. Trained a model to classify comments as toxic or non-toxic. | Pandas, Python, Machine Learning, NLTK, LightGBM, Sklearn |
Prediction of taxi orders | Trained a machine learning model to predict the number of taxi orders for the next hour. | Pandas, Sklearn, Numpy, LightGBM, Matplotlib, StatsModels, CatBoost, Machine Learning |
Car price determination | Trained a machine learning model to estimate the market value of a car. | Pandas, Sklearn, Numpy, LightGBM, Machine Learning, CatBoost, XGBoost |
Personal information security | Protected (encrypted) the personal information of an insurance company's clients using a data conversion method. | Pandas, Seaborn, Numpy, Sklearn, Machine Learning |
Gold recovery process optimization | Developed a model that predicts the recovery rate of gold from gold ore. The model helps optimize production so that a plant with unprofitable characteristics is not launched. | Pandas, Sklearn, Numpy, Seaborn, Matplotlib, Math, Machine Learning |
Profitable oil drilling sites | Selected the region in which to extract oil. Built a machine learning model that helped determine the region where drilling would bring the most profit with the least risk of loss. | Pandas, Sklearn, Math, Numpy, Seaborn, Matplotlib, SciPy, Bootstrap, Machine Learning |
Bank client churn rate | Analyzed the outflow of clients from a bank to select a strategy (retaining existing clients or attracting new ones). | Machine Learning, Pandas, Matplotlib, Seaborn, Numpy, Sklearn, Math |
Tariff recommendation system | Built a first machine learning model for a classification problem: selecting a suitable tariff. | Machine Learning, Pandas, Numpy, Python, Seaborn |
Game industry research | Using historical data on computer game sales, user and expert ratings, genres, and platforms, I identified patterns that determine the success of a game. | Python, Pandas, Numpy, Matplotlib, Data Preprocessing, Exploratory Data Analysis, Statistics, Statistical Hypothesis Testing, Seaborn, SciPy |
Determination of a profitable plan for a telecom company | Based on data about a mobile operator's customers, I analyzed customer behavior and found the optimal plan for the company. | Python, Pandas, Matplotlib, Numpy, SciPy, Statistics, Statistical Hypothesis Testing, Math, Seaborn |
Apartment research analysis | Using data from the Yandex.Realty service, I determined the market value of real estate properties and the typical parameters of apartments. | Python, Pandas, Matplotlib, Exploratory Data Analysis, Data Visualization, Data Preprocessing, Math |
Borrower reliability study | Based on statistics on clients' solvency, I investigated whether a client's marital status and number of children affect on-time loan repayment. | Data Preprocessing, Python, Pandas, PyMystem3, Lemmatization, SciPy, Matplotlib, Seaborn, Sklearn, Numpy |
Music taste analysis | Compared the musical tastes of Moscow and St. Petersburg: how the music played on the way to work on Monday morning differs from the music played on Wednesday or at the end of the work week. | Python, Pandas, Numpy |