Transformed Distribution Matching (TDM) for Missing Value Imputation, ICML 2023

This repository contains the implementation of the ICML 2023 paper [1].

Dependencies

The dependencies are listed in requirements.txt.

Data

  • The input data is an N by D matrix in which missing values are indicated by numpy.nan (N is the number of data samples and D is the number of feature dimensions).

  • The completed data can also be provided (for evaluation only; it is not required), i.e., another N by D matrix with the missing values filled in.

  • We provide one dataset used in our paper, named seeds, in the datasets folder. It is preprocessed from the UCI Datasets.
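
As a rough illustration of the data format described above, the following sketch builds a small N by D matrix with numpy.nan marking the missing entries, together with a completed matrix used only for evaluation. All names here (X_full, X_miss, mask, X_imputed) are hypothetical and are not taken from this repository. The column-mean fill is only a placeholder for an imputer's output; it is not TDM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical completed data: N = 6 samples, D = 3 features.
X_full = rng.normal(size=(6, 3))

# Hypothetical missingness mask: True marks an entry that is missing.
mask = np.zeros(X_full.shape, dtype=bool)
mask[0, 1] = True
mask[2, 0] = True
mask[4, 2] = True
mask[5, 1] = True

# Input data in the format described above: missing entries are numpy.nan.
X_miss = X_full.copy()
X_miss[mask] = np.nan

# Placeholder imputation (column means over observed entries).
# This only stands in for an imputer's output; it is NOT TDM.
col_means = np.nanmean(X_miss, axis=0)
X_imputed = np.where(np.isnan(X_miss), col_means, X_miss)

# Evaluation uses the completed data only on the entries that were missing.
mae = np.abs(X_imputed[mask] - X_full[mask]).mean()
print(f"MAE on missing entries: {mae:.4f}")
```

The mask can always be recovered from the input alone with np.isnan(X_miss), so the completed matrix is needed only when an error metric such as the one above is wanted.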

Run TDM

Simply run demo.py.

Acknowledgment

We gratefully thank the authors of the following software and datasets:

  • UCI Datasets (We used the datasets for evaluation)
  • MissingDataOT (We used the functions of data preparation, mask generation, and evaluation)
  • hyperimpute (We used the implementations of several baselines)
  • FrEIA (We used the implementations of the invertible neural networks)

Reference

[1] He Zhao, Ke Sun, Amir Dezfouli, Edwin V. Bonilla, Transformed Distribution Matching for Missing Value Imputation, ICML 2023.

@inproceedings{zhao2023transformed,
  title={Transformed Distribution Matching for Missing Value Imputation},
  author={Zhao, He and Sun, Ke and Dezfouli, Amir and Bonilla, Edwin V},
  booktitle={International Conference on Machine Learning},
  pages={42159--42186},
  year={2023},
  organization={PMLR}
}

All the authors of the paper are with CSIRO's Data61.

The code comes without support.