Pre-trained vector representations in natural language processing often inadvertently encode undesirable social biases. Identifying and removing unwanted biased information from vector representations is an evolving and significant challenge. Our study addresses this issue from the perspective of statistical independence, proposing a framework that reduces bias by transforming vector representations into an unbiased subspace using sufficient projection. The key to our framework lies in its generality: it mitigates bias across both debiasing and fairness tasks, and across various vector representation types, including word embeddings and the output representations of transformer models. Importantly, we establish the connection between debiasing and fairness, offering theoretical guarantees that elucidate our algorithm's efficacy. Through extensive evaluation on intrinsic and extrinsic metrics, our method achieves superior bias reduction while maintaining high task performance, and offers greater computational efficiency.
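To make the core idea concrete, here is a minimal sketch of linear-projection debiasing: removing the component of each embedding that lies in a known bias subspace. This is a simplified illustration, not the paper's full sufficient-projection framework; the function name, the toy embeddings, and the single hand-picked bias direction are all assumptions for demonstration.

```python
import numpy as np

def debias_by_projection(embeddings, bias_directions):
    """Project embeddings onto the orthogonal complement of the
    subspace spanned by the given bias directions.

    A linear-projection sketch only; the paper's method additionally
    enforces statistical independence via sufficient projection.
    """
    # Orthonormalize the bias directions (columns of B) via QR.
    B = np.linalg.qr(np.asarray(bias_directions).T)[0]
    # Projector onto the orthogonal complement of the bias subspace.
    P = np.eye(B.shape[0]) - B @ B.T
    return embeddings @ P.T

# Toy example: five random 4-d "embeddings" and one assumed bias direction.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))
g = np.array([[1.0, 0.0, 0.0, 0.0]])  # hypothetical bias direction
E_debiased = debias_by_projection(E, g)
# After projection, no embedding has a component along g.
print(np.allclose(E_debiased @ g.T, 0.0))  # True
```

Because the projector is idempotent, applying it twice changes nothing; the debiased vectors retain all variance orthogonal to the bias subspace.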

GitHub Link

Paper Link

Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving (AAAI 2022)

With widening deployments of natural language processing (NLP) in daily life, inherited social biases from NLP models...

Balancing gender bias in job advertisements with text-level bias mitigation (Frontiers in Big Data 2022)

Despite progress towards gender equality in the labor market over the past few decades, gender segregation in labor f...

Conformalized Fairness via Quantile Regression (NeurIPS 2022)

Algorithmic fairness has received increased attention in socially sensitive domains. While rich literature on mean fa...

Debiasing with Sufficient Projection: A General Theoretical Framework for Vector Representations (NAACL 2024)