Machine Learning from Scratch

This repo contains all the tutorial Jupyter notebooks used in our Machine Learning from Scratch presentation at Speed@BDD.

If you find a problem or have a question that needs more detail, feel free to open an issue to discuss it.

Details

Machine Learning is reshaping the world. Where machines previously relied on hand-crafted rules and features to solve complex problems, the new paradigm gives them a built-in capacity to learn how to solve these problems automatically. Of the various branches of ML, Deep Learning is currently the most powerful, solving an array of complex problems with elaborate trainable mathematical models called Neural Networks.

While there are plenty of resources on how to build ML models with different frameworks, there is an equally large gap in the literature on how these models’ inputs are generated from raw heterogeneous data. The requirements for deploying these models in a commercial environment are also often glossed over or ignored entirely when model training is discussed.

The aim of this talk is to describe the full life cycle of a commercially viable Machine Learning project, starting from data collection, processing and transformation up to training and deploying models in production. We will take a real-life use case and describe all these phases over two sessions:

Session 1 on Tuesday, June 20, 2017: Data Collection and Transformation.

In this session, we will talk about the different technologies used for collecting, storing and querying large amounts of heterogeneous data. More importantly, we will introduce a methodology for building labeled data from large unlabeled data sets, a critical and frequently encountered problem in any real ML task. Finally, we will demonstrate how to transform this data into a machine-readable format (numerical vectors) that can be processed by Neural Networks.
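To make the last step concrete, here is a minimal sketch of turning heterogeneous records into numerical vectors. It is not taken from the talk's notebooks: the field names (`country`, `age`, `visits`), the helper names and the choice of one-hot encoding are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def build_vocabulary(records: Sequence[dict], field: str) -> Dict[str, int]:
    """Map each distinct categorical value to a stable column index."""
    values = sorted({record[field] for record in records})
    return {value: index for index, value in enumerate(values)}


def vectorize(record: dict, vocab: Dict[str, int],
              cat_field: str, num_fields: Sequence[str]) -> List[float]:
    """One-hot encode one categorical field, then append the numeric fields."""
    one_hot = [0.0] * len(vocab)
    index = vocab.get(record[cat_field])
    if index is not None:  # unseen categories stay all-zero
        one_hot[index] = 1.0
    numeric = [float(record[name]) for name in num_fields]
    return one_hot + numeric


# Hypothetical raw records mixing categorical and numeric fields.
records = [
    {"country": "CA", "age": 31, "visits": 4},
    {"country": "LB", "age": 45, "visits": 1},
    {"country": "CA", "age": 22, "visits": 9},
]

vocab = build_vocabulary(records, "country")
vectors = [vectorize(r, vocab, "country", ["age", "visits"]) for r in records]
print(vectors[0])  # [1.0, 0.0, 31.0, 4.0]
```

In practice the numeric columns would usually also be scaled before being fed to a Neural Network; that step is left out here to keep the sketch short.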

Session 2 on Thursday, June 22, 2017: Neural Network Training and Production Deployment.

Having explained how to properly format and process data, we will go over the fundamentals of sound ML model building and training. We will provide an overview of the advantages and drawbacks of different ML algorithms and will demonstrate the building and training steps using a deep Neural Network. We will explain how to explore the hyperparameter space of a Neural Network, how to evaluate performance, and how to select the best trained model for deployment. Finally, we will show how a model can be deployed in production to process requests in real time using a simple web API.
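The hyperparameter exploration and model selection steps can be sketched as a small grid search scored on a held-out validation set. This is only an illustration under stated assumptions: a one-variable linear model trained by gradient descent stands in for the deep Neural Network used in the talk, and the data, grid values and function names are hypothetical.

```python
import itertools


def train_linear(xs, ys, lr, epochs):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(params, xs, ys):
    """Mean squared error of the model on the given data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, valid, grid):
    """Train one model per hyperparameter combination; keep the best on validation."""
    best = None
    for lr, epochs in itertools.product(grid["lr"], grid["epochs"]):
        params = train_linear(train[0], train[1], lr, epochs)
        score = mse(params, valid[0], valid[1])
        if best is None or score < best["val_mse"]:
            best = {"lr": lr, "epochs": epochs, "params": params, "val_mse": score}
    return best


# Hypothetical data generated from y = 2x + 1, split into training and validation.
train_x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
valid_x = [0.1, 0.5, 0.9]
train = (train_x, [2 * x + 1 for x in train_x])
valid = (valid_x, [2 * x + 1 for x in valid_x])

best = grid_search(train, valid, {"lr": [0.01, 0.1, 0.5], "epochs": [10, 200]})
print(best["lr"], best["epochs"], best["val_mse"])
```

Selecting on validation error rather than training error is the key design choice: the model that gets deployed is the one that generalizes best to data it was not trained on.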

This talk is presented by Salah Rifai, PhD and Fayez Zouheiry from Mantika Inc.