What are the “values” of AI? How do they manifest in conversation? How consistent are they? Can they be manipulated? A study by the Societal Impacts group at Anthropic (maker of Claude) tried to find out. Claude and other models are trained to observe certain rules—human values and etiquette: At Anthropic, we’ve attempted to shape the values of our AI model, Claude, to help keep it aligned with human preferences, make it less likely to engage in dangerous behaviors, and generally make...