
# Interpretable Word Vector Subspaces

Presented at the ICLR 2020 Workshop on Machine Learning in Real Life (ML-IRL) on April 26, 2020. Link to the paper here.

## Abstract

Natural Language Processing relies on high-dimensional word vector representations, which may reflect biases in the training corpus. Identifying these biases by finding the corresponding interpretable subspaces is crucial for fair decision-making. Existing work has adopted Principal Component Analysis (PCA) to identify subspaces such as gender, but it fails to generalize efficiently or to provide a principled methodology. We propose a framework that unifies existing PCA methods and discuss considerations for optimizing them. We also present a novel algorithm that finds topic subspaces more efficiently, and compare it to an existing approach.
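To illustrate the kind of PCA-based subspace identification the abstract refers to, here is a minimal, self-contained sketch in the spirit of prior gender-subspace work. It uses synthetic embeddings with a planted "gender" direction so it runs standalone; the dimensions, pair construction, and noise level are illustrative assumptions, not this paper's actual setup or data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # embedding dimension (real word vectors are typically 100-300-d)

# Plant a ground-truth "gender" direction in the embedding space.
gender_dir = rng.normal(size=d)
gender_dir /= np.linalg.norm(gender_dir)

# Synthetic gendered word pairs: each pair shares a base vector and is
# offset by +/- the planted direction, plus a little noise.
n_pairs = 10
bases = rng.normal(size=(n_pairs, d))
male_vecs = bases + gender_dir + 0.1 * rng.normal(size=(n_pairs, d))
female_vecs = bases - gender_dir + 0.1 * rng.normal(size=(n_pairs, d))

# PCA (via SVD) on the pair-difference vectors: the top principal
# direction should recover the planted gender direction.
diffs = male_vecs - female_vecs
_, _, vt = np.linalg.svd(diffs, full_matrices=False)
top_pc = vt[0]

# Cosine similarity between the recovered and planted directions
# (both are unit vectors; sign of a principal component is arbitrary).
alignment = abs(top_pc @ gender_dir)
```

Taking differences of matched word pairs cancels out pair-specific content and isolates the shared direction of variation, which is why the leading principal component of the differences aligns with the planted subspace.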

## Contents