This is an excerpt from today’s issue of my weekly newsletter. | thesephist.com
My desktop wallpaper loops through a handful of screenshots of quotes I’ve collected over the years. These quotes push on my worldview in just the right places to help me approach my work in ways I find encouraging and energizing, so I like to have them in the periphery of my workspace like virtual post-it notes. | thesephist.com
Large Language Models (LLMs) have revolutionized natural language processing but can exhibit biases and may generate toxic content. While alignment techniques like Reinforcement Learning from Human Feedback (RLHF) reduce these issues, their impact on creativity, defined as syntactic and semantic diversity, remains unexplored. We investigate the unintended consequences of RLHF on the creativity of LLMs through three experiments focusing on the Llama-2 series. Our findings reveal that aligned m... | arXiv.org
Thanks to Ian McKenzie and Nicholas Dupuis, collaborators on a related project, for contributing to the ideas and experiments discussed in this post… | www.lesswrong.com