Note: I was inspired to write this after discussions with Anil Seth and Jonas Mago on AI consciousness, where, of course, I mostly disagreed with them. As with everything on consciousness, the empirical evidence is extremely sparse so it is mostly a game of conflicting intuitions. Strong opinions lightly held,...
Epistemic status: Obviously speculative, but interesting. Sometime in the next few decades it seems likely that we will have the singularity. If all goes well and we successfully create aligned AI systems, humanity will continue to exist but in an entirely new phase. This will likely include the vindication of...
Epistemic status: Mostly re-litigating old debates, I believe. Hopefully still somewhat interesting. This is just a short post on a small point which took me a worryingly long time to realize. For a while people were claiming that the pretraining next-token prediction objective could directly lead to superintelligence...
Epistemic Status: Fairly sure about this from experience but could be missing crucial considerations. I don’t present any super detailed evidence here so it is theoretically just vibes. When forecasting AI progress, forecasters and modellers often break it down into two components: increased compute and ‘algorithmic progress’. My...
One question which I have occasionally pondered is: assuming that we actually succeed at some kind of robust alignment of AGI, what is the alignment target we should focus on? In general, this question splits into two basic camps. The first is obedience and corrigibility: the AI system should execute...
Epistemic note: Very short point and I’m pretty uncertain on this myself. Trying to work out the arguments in blog format. In the alignment discourse I notice a lot of vaguely described but very real worry along the lines of “Even if we train an AI to be aligned and...
Epistemic status: Early thoughts. Some ideas but no empirical testing or validation as yet. I’ve started thinking a fair bit about reward hacking recently. This is because frontier models are reportedly beginning to show signs of reward hacking, especially on coding tasks. Thus, the era of easy-to-align pretraining-only models appears...
Recently Noumenal Labs announced themselves and I read their white paper. Although pretty light on specifics, it seems pretty clear that their issue with LLMs, and NNs generally, is that they do not properly reflect in their structure the true underlying generative process of reality — effectively that they do...
Epistemic status: Pretty uncertain. This is a model I have been using to think about neural networks for a while, which does have some support, but is not completely rigorous. I hear a lot of people talk about scaling laws as if they are a property of specific models or...
Occasionally I hear people say or believe that NNs are overparametrized and base their intuitions on this idea. Certainly there is a small academic literature around phenomena like double descent which implicitly assume an overparametrized network. However, while overparametrized inference and generalization is certainly a valid regime...
Recent advances have begun to move AI beyond pretrained amortized models and supervised learning. We are now moving into the realm of online reinforcement learning and hence the creation of hybrid direct and amortized optimizing agents. While we have generally found that purely amortized pretrained models are an easy case...
I’ve had this book on my reading list for a while, since it is the classic book everyone cites for predicting the singularity ahead of time and for describing a future ‘merging’ of AI and human minds as a positive singularity. Since I had some time this afternoon, I...
As always, 2024 has been an interesting year marked by extremely rapid and impressive AI progress. Every year since 2020 has felt like a rollercoaster of AI surging past our expectations, which makes you think there is no way it can possibly go any faster, and then the next year...
Active Inference is a theory of adaptive action selection for agents, initially proposed by Karl Friston and since expanded upon by many authors, and it now forms a small academic subfield of research. The core claims of the theory are that action selection and decision-making can be usefully understood as inference problems...
This is a guest post by Max Buckley, a software engineer at Google and fellow AI researcher. By some twist of fate, this blog has become the chronicle of the evolution of integer tokenization. In an earlier post in February of 2023, it was discussed how older models, GPT-2 and...
Recently, I was reading this paper, which demonstrates how to do online RLHF for alignment of LLMs, and a sentence stuck out to me: “We conjecture that this is because the reward model (discriminator) usually generalizes better than the policy (generator).” This is an offhand remark, but it strikes at...
Last year, I wrote a quick post investigating the ‘unconditioned’ distribution of LLMs in the OpenAI API, where the ‘unconditioned distribution’ is simply the distribution of LLM outputs following the empty string – or beginning-of-sequence token. My intuition here was that this gives some idea of what the...
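As a concrete illustration of what sampling this ‘unconditioned’ distribution looks like in practice, here is a minimal sketch using the OpenAI completions API; the model name, sample count, and decoding parameters are illustrative choices of mine and are not taken from the original post.

```python
from openai import OpenAI

# Minimal sketch (assumptions): sample a few completions from the empty prompt
# to peek at a model's 'unconditioned' output distribution. The model name and
# decoding parameters below are illustrative, not those used in the post.
client = OpenAI()

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # any completions-API model would work
    prompt="",                       # empty string: no conditioning context
    max_tokens=32,
    n=5,
    temperature=1.0,
)

for choice in response.choices:
    print(repr(choice.text))
```

Each printed completion is a sample of what the model produces when given no context at all, which is the quantity the post was probing.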
When discussing the future of AI, I semi-often hear an argument along the lines that, in a slow takeoff world, despite AIs automating ever more of the economy, humanity will remain in the driving seat because of its ownership of capital. This posits a world where humanity effectively becomes a...
Synthetic data is a new frontier in AI training. Phi3, Llama3, and other recent models have demonstrated the ability of large amounts of well-tailored synthetic data to significantly improve the performance of small models, bringing them closer to the frontier by cheaply and implicitly distilling from larger, more powerful models....
After spending a lot of time with language models, I have come to the conclusion that tokenization in general is insane and it is a miracle that language models learn anything at all. To drill down into one specific example of silliness which has been bothering me recently, let’s look...
Just over a year ago, I wrote about how integer tokenization using the GPT2 and GPT3 tokenizer was insane. This was because it failed to create a coherent number representation in token space, since many integers were assigned their own single unique token, and even multi-token integers were...
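To see the kind of incoherence being described, here is a minimal sketch using the `tiktoken` library’s `gpt2` encoding; the particular integers are arbitrary examples of my own, chosen to show that some numbers map to a single token while others are split into uneven multi-token chunks.

```python
import tiktoken

# Minimal sketch (assumption): inspect how the GPT-2 BPE splits a few integers.
# The specific numbers are arbitrary examples, not those from the original post.
enc = tiktoken.get_encoding("gpt2")

for n in [7, 42, 512, 2049, 123456]:
    token_ids = enc.encode(str(n))
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{n}: {len(token_ids)} token(s) -> {pieces}")
```

The point is simply that the number of tokens and the chunk boundaries vary erratically with the integer’s value, so numerically close numbers can end up with completely unrelated token-space representations.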