From: Lj Miranda
A lexical view of contrast pairs in preference datasets
https://ljvmiranda921.github.io/notebook/2024/03/12/contrast-pairs/
Tagged with: notebook, openai, llm, rlhf, preference data, shp, berkeley-nest
Can we spot differences between preference pairs just by looking at their word embeddings? In this blog post, I want to share my findings from examining lexical distances between chosen and rejected responses in preference datasets.