Name		Name	Last commit message	Last commit date
parent directory ..
SentimentAnalysis		SentimentAnalysis
README.md		README.md
SentimentAnalysis.sln		SentimentAnalysis.sln

README.md

Sentiment Analysis for User Reviews

Automated Machine Learning

Automated machine learning (AutoML) automates the end-to-end process of applying machine learning to real-world problems. Given a dataset, AutoML iterates over different data featurizations, machine learning algorithms, hyperparamters, etc. to select the best model.

Problem

This problem is to predict if a Wikipedia comment contains a personal attack or not. We will use small sample of the Wikipedia Detox dataset: one dataset for training and a second to test the model produced by AutoML. Human judges assigned every comment in these datasets a toxicity label:

0 - nice/positive
1 - toxic/negative

We will build a model that will analyze a string and predict a sentiment value of 0 or 1.

ML Task - Binary Classification

The generalized problem of binary classification is to classify items into one of two classes (classifying items into more than two classes is called multiclass classification).

Some examples:

predict if an insurance claim is valid or not
predict if a plane will be delayed or will arrive on time
predict if a face ID (photo) belongs to the owner of a device

For all these examples, the parameter we want to predict can take only one of two values. In other words, this value is represented by boolean type.

Step 1: Load the Data

Load the datasets required to train and test:

 IDataView trainingDataView = mlContext.Data.LoadFromTextFile<SentimentIssue>(TrainDataPath, hasHeader: true);
 IDataView testDataView = mlContext.Data.LoadFromTextFile<SentimentIssue>(TestDataPath, hasHeader: true);

Step 2: Build a Machine Learning Model Using AutoML

Instantiate and run an AutoML experiment. In doing so, specify how long the experiment should run in seconds (ExperimentTime), and set a progress handler that will receive notifications after AutoML trains & evaluates each new model.

// Run AutoML binary classification experiment
ExperimentResult<BinaryClassificationMetrics> experimentResult = mlContext.Auto()
    .CreateBinaryClassificationExperiment(ExperimentTime)
    .Execute(trainingDataView, progressHandler: new BinaryExperimentProgressHandler());

Step 3: Evaluate Model

Grab the best model produced by the AutoML experiment

ITransformer model = experimentResult.BestRun.Model;

and evaluate its quality on a test dataset that was not used in training (wikipedia-detox-250-line-test.tsv).

EvaluateNonCalibrated compares the predicted values for the test dataset to the real values. It produces various metrics, such as accuracy:

var predictions = trainedModel.Transform(testDataView);
var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(predictions, scoreColumnName: "Score");

Step 4: Make Predictions

Using the trained model, call the Predict() API to predict the sentiment for new sample text sampleStatement:

// Create prediction engine related to the loaded trained model
var predEngine = mlContext.Model.CreatePredictionEngine<SentimentIssue, SentimentPrediction>(model);

// Score
var predictedResult = predEngine.Predict(sampleStatement);

predictedResult.PredictionLabel will be either true or false. If true, the predicted sentiment is toxic; if false, non-toxic.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BinaryClassification_AutoML

BinaryClassification_AutoML

README.md

Sentiment Analysis for User Reviews

Automated Machine Learning

Problem

ML Task - Binary Classification

Step 1: Load the Data

Step 2: Build a Machine Learning Model Using AutoML

Step 3: Evaluate Model

Step 4: Make Predictions

Files

BinaryClassification_AutoML

Directory actions

More options

Directory actions

More options

Latest commit

History

BinaryClassification_AutoML

Folders and files

parent directory

README.md

Sentiment Analysis for User Reviews

Automated Machine Learning

Problem

ML Task - Binary Classification

Step 1: Load the Data

Step 2: Build a Machine Learning Model Using AutoML

Step 3: Evaluate Model

Step 4: Make Predictions