Skip to content

Latest commit

 

History

History
126 lines (93 loc) · 3.38 KB

README.md

File metadata and controls

126 lines (93 loc) · 3.38 KB

gomnist

This package lets you to load the MNIST data set for use with gonum package. It is useful when implementing, for example, deep learning using the gonum package.


CircleCI GolangCI Maintainability

What's MNIST ??

THE MNIST DATABASE

The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.

Quick Start

First of all, you should download mnist file. THE MNIST DATABASE

# exmple
.
└── testdata
    ├── t10k-images-idx3-ubyte.gz
    ├── t10k-labels-idx1-ubyte.gz
    ├── train-images-idx3-ubyte.gz
    └── train-labels-idx1-ubyte.gz

Using gomnist, you can get MNIST data as gonum matrix.

package main

import "github.com/po3rin/gomnist"

func main() {
    // first arg is target dir has mnist file.
    l := gomnist.NewLoader("./testdata")

    // Do !!
    mnist, err := l.Load()
    if err != nil {
      // error handling ...
    }

    // type MNIST struct {
    //   TrainData   mat.Matrix
    //   TrainLabels mat.Matrix
    //   TestData    mat.Matrix
    //   TestLabels  mat.Matrix
    // }
    _ = mnist.TrainData.At(0, 135)
    // 55
}

Options

gomnist options is implimented as Functional Option Pattern.

Normalization

Normalization Option is whether to normalize the input image value to a value between 0 and 1 (Default false)

package main

import "github.com/po3rin/gomnist"

func main() {
    l := gomnist.NewLoader("./testdata", gomnist.Normalization(true))
    mnist, err := l.Load()
    if err != nil {
      // error handling ...
    }
    _ = mnist.TrainData.At(0, 135)
    // 0.21568627450980393
}

One-Hot Label

OneHotLabel Option is whether to set one-hot label.

package main

import "github.com/po3rin/gomnist"

func main() {
    l := gomnist.NewLoader("./testdata", gomnist.OneHotLabel(true))
    mnist, err := l.Load()
    if err != nil {
      // error handling ...
    }
    _ = mnist.TrainLabel
    // ⎡0 0 0 0 0 1 0 0 0 0 0⎤
    // ⎢          .          ⎥
    // ⎣          .          ⎦
}

Dimension

Default

(Number of images) * (Total number of pixels : 28*28)

  • trainData: 60000 - 784
  • testData: 10000 - 784

(Number of images) * (Handwritten digits value)

  • trainLabels: 60000 - 1
  • testLabels: 10000 - 1

One-Hot

(Number of images) * (Handwritten digits value)

  • trainLabels: 60000 - 10
  • testLabels: 10000 - 10

TODO

  • Download if mnist file do not exits