Assumed audience: Mid-career technical researchers considering moving into AI Safety research, career advisors in the EA/AI Safety space, AI Safety employers and grantmakers. tl;dr: AI career advice orgs, prominently 80,000 Hours, encourage career moves into AI risk roles, including mid‑career pivots into roles in AI safety research labs. Without side information, that advice is not credible for mid‑career readers, because it does not have a calibration mechanism. Advice organizations i...
Agent foundations is the branch of AI alignment that tries to answer: if we were to build a superintelligent system from scratch, what clean, mathematical objective could we give it so that it robustly does what we want, even if we cannot understand the system ourselves? Unlike interpretability (which inspects black-box models) or preference learning (which tries to extract human values), agent foundations is about first principles: designing an agent that’s “aligned by construc...
I’m working on some proposals in AI safety at the moment, including this one. I submitted this particular one to the UK AISI Alignment Project. It was not funded. Note that this post is different from many on this blog. It’s highly speculative and yet not that measured; that’s because it’s a pitch, not an analysis. It doesn’t contain a credible amount of detail (there were only two text fields with a 500 word limit to explain the whole thing). I present it here for comment ...
Wherein a failed application is set forth, and two research pathways are outlined: a Bias‑Robust Oversight programme at UTS’s Human Technology Institute, and MCMC estimation of the Local Learning Coefficient with Timaeus’ Murfet.
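For readers unfamiliar with the second pathway: the Local Learning Coefficient (LLC) is typically estimated by sampling a localised, tempered posterior around the trained parameter, e.g. with SGLD. A minimal sketch of the standard estimator, in my own notation rather than anything taken from the proposal itself:

$$
\hat{\lambda}(w^{*}) \;=\; n\beta\left(\mathbb{E}^{\beta}_{w \mid w^{*}}\!\left[L_n(w)\right] - L_n(w^{*})\right),
\qquad \beta = \frac{1}{\log n},
$$

where $L_n$ is the empirical negative log-likelihood over $n$ samples, $w^{*}$ is the trained parameter, and the expectation is over a posterior localised near $w^{*}$.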
Oddesty
Let’s reason backwards from the final destination of civilisation, if such a thing there be. What intelligences persist at the omega point? With what is superintelligence aligned in the big picture? Various authors have tried to put modern AI developments in continuity with historical trends: from less materially-sophisticated societies, through more legible, compute-oriented societies, to some attractor or set of attractors at the end of history. Computational superorganisms. Singularities....
Configuring machine learning experiments with Fiddle
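For context, Fiddle is Google’s Python configuration library for ML experiments: you build a graph of `fdl.Config` objects, mutate it per experiment, then call `fdl.build` to instantiate everything. A minimal sketch, assuming `fiddle` is installed; the `Model` and `Trainer` classes here are hypothetical stand-ins, not from any particular codebase:

```python
# Minimal Fiddle sketch: configure first, override, then build.
import dataclasses

import fiddle as fdl


@dataclasses.dataclass
class Model:
    hidden_size: int
    num_layers: int


@dataclasses.dataclass
class Trainer:
    model: Model
    learning_rate: float


def base_config() -> fdl.Config[Trainer]:
    """Build a config graph without instantiating anything yet."""
    model_cfg = fdl.Config(Model, hidden_size=128, num_layers=2)
    return fdl.Config(Trainer, model=model_cfg, learning_rate=1e-3)


cfg = base_config()
cfg.model.hidden_size = 256        # override before building, as in a sweep
trainer = fdl.build(cfg)           # instantiate the whole object graph
print(trainer.model.hidden_size)   # -> 256
```

The point of the pattern is that experiment variants are plain mutations of the config graph, kept separate from the classes being configured.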
Placeholder while I think about the practicalities and theory of AI agents. Practically, this usually means many agents. See also Multi agent systems. 1 Factored cognition Field of study? Or one company’s marketing term? Factored Cognition | Ought: In this project, we explore whether we can solve difficult problems by composing small and mostly context-free contributions from individual agents who don’t know the big picture. Factored Cognition Primer 2 Incoming Introducing smola...
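To make the factored-cognition idea concrete, here is a toy sketch of my own (not Ought’s code): a question is split into sub-questions, each answered by an “agent” that sees only its own sub-question, and the answers are recombined. The `decompose`, `answer_subquestion`, and `combine` functions are hypothetical stand-ins for whatever model calls a real system would make:

```python
# Toy illustration of factored cognition: decompose a task into sub-tasks,
# answer each with an agent that only sees its own sub-task, then recombine.
from typing import Callable, List


def decompose(question: str) -> List[str]:
    # A real system would ask a model to propose sub-questions;
    # here we just split on a marker for illustration.
    return [q.strip() for q in question.split(" and ")]


def answer_subquestion(subquestion: str) -> str:
    # Each "agent" sees only its sub-question, never the big picture.
    return f"<answer to: {subquestion}>"


def combine(answers: List[str]) -> str:
    return " ; ".join(answers)


def factored_answer(question: str,
                    worker: Callable[[str], str] = answer_subquestion) -> str:
    return combine([worker(sub) for sub in decompose(question)])


print(factored_answer("what is the boiling point of water and who wrote Hamlet"))
```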
1 Incoming Inside the U.K.’s Bold Experiment in AI Safety | TIME Governing with AI | Justin Bullock Deep atheism and AI risk - Joe Carlsmith Wong and Bartlett (2022) we hypothesize that once a planetary civilization transitions into a state that can be described as one virtually connected global city, it will face an ‘asymptotic burnout’, an ultimate crisis where the singularity-interval time scale becomes smaller than the time scale of innovation. If a civilization develo...
Notes on AI Alignment Fast-Track - Losing control to AI 1 Session 1 What is AI alignment? – BlueDot Impact More Is Different for AI Paul Christiano, What failure looks like 👈 my favourite. Cannot believe I hadn’t read this. AI Could Defeat All Of Us Combined Why AI alignment could be hard with modern deep learning Terminology I should have already known but didn’t: Convergent Instrumental Goals. Self-Preservation Goal Preservation Resource Acquisition Self-Improvement Ajeya Cotra’s...