Countless studies have found that “bias” – typically with respect to race and gender – pervades the embeddings and predictions of the black-box models that dominate natural language processing (NLP). For example, the language model GPT-3, of OpenAI fame, can generate racist rants when given the right prompt. Attempts to detect hate speech can themselves harm minority populations, whose dialects are more likely to be flagged as hateful.| Kawin Ethayarajh
Incorporating context into word embeddings – as exemplified by BERT, ELMo, and GPT-2 – has proven to be a watershed idea in NLP. Replacing static vectors (e.g., word2vec) with contextualized word representations has led to significant improvements on virtually every NLP task.| Kawin Ethayarajh
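As a rough illustration of the difference, a minimal sketch follows. It assumes the Hugging Face transformers library, PyTorch, and the bert-base-uncased checkpoint (none of which the excerpt specifies) and shows that the same word receives different vectors in different contexts, unlike a static word2vec lookup:

```python
# Sketch: the same surface word gets different contextual vectors in different
# sentences. Assumes `transformers` and `torch` are installed; the checkpoint
# and example sentences are illustrative choices, not the post's actual setup.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word`'s first subtoken in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    idx = tokens.index(tokenizer.tokenize(word)[0])    # first matching subtoken
    return hidden[idx]

v1 = word_vector("She sat on the river bank.", "bank")
v2 = word_vector("He deposited cash at the bank.", "bank")
# Cosine similarity is well below 1.0: context has changed the representation.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```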
Word vectors are often criticized for capturing undesirable associations such as gender stereotypes. For example, according to the word embedding association test (WEAT), relative to art-related terms, science-related ones are significantly more associated with male attributes.| Kawin Ethayarajh
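For reference, WEAT (Caliskan et al., 2017) measures this kind of differential association with cosine similarity. The sketch below implements the test statistic and effect size over placeholder word lists and random vectors; the word sets and embedding lookup are illustrative assumptions, not the post's actual data:

```python
# Sketch of the WEAT test statistic and effect size (Caliskan et al., 2017),
# using placeholder target/attribute word lists and random vectors.
import numpy as np

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B, emb):
    """s(w, A, B): mean cosine similarity to attribute set A minus that to B."""
    return np.mean([cos(emb[w], emb[a]) for a in A]) - \
           np.mean([cos(emb[w], emb[b]) for b in B])

def weat(X, Y, A, B, emb):
    """Test statistic S(X, Y, A, B) and a Cohen's-d-style effect size."""
    s_X = [assoc(x, A, B, emb) for x in X]
    s_Y = [assoc(y, A, B, emb) for y in Y]
    statistic = sum(s_X) - sum(s_Y)
    effect_size = (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y)
    return statistic, effect_size

# Placeholder sets and random vectors purely for illustration.
X = ["science", "physics"]; Y = ["art", "poetry"]   # target words
A = ["male", "man"];        B = ["female", "woman"] # attribute words
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in X + Y + A + B}
print(weat(X, Y, A, B, emb))
```

With real word2vec or GloVe vectors in place of the random ones, a large positive effect size corresponds to the science/male vs. art/female association the excerpt describes.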
A surprising property of word vectors is that word analogies can often be solved with vector arithmetic. Most famously, king − man + woman ≈ queen.| Kawin Ethayarajh
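The standard recipe (3CosAdd, from Mikolov et al.) adds and subtracts the relevant vectors and returns the nearest neighbour by cosine similarity, excluding the query words. The sketch below uses a placeholder embedding dictionary with random vectors; with real word2vec or GloVe embeddings loaded in its place, the answer would be "queen":

```python
# Sketch of solving "man : king :: woman : ?" by vector arithmetic (3CosAdd).
# The embedding dictionary is a placeholder; real pretrained vectors would be
# loaded from disk instead of generated randomly.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman", "apple"]
emb = {w: rng.normal(size=100) for w in vocab}  # illustrative random vectors

def analogy(a, b, c, emb):
    """Return argmax_d cos(emb[d], emb[b] - emb[a] + emb[c]), excluding a, b, c."""
    target = emb[b] - emb[a] + emb[c]
    target /= np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (a, b, c):      # the query words themselves are excluded
            continue
        sim = np.dot(v, target) / np.linalg.norm(v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman", emb))  # ≈ "queen" with real embeddings
```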