As organizations increasingly integrate large language models (LLMs) into their operations, it has become critical to ensure that these systems operate fairly and responsibly across diverse populations. As with any AI or ML application, LLMs in both agentic and non-agentic settings can inadvertently learn and reproduce historical biases present in their training data, creating systems that systematically disadvantage certain groups. Last year, for example, a study at Stanford University found e...