Assumed audience: Mid-career technical researchers considering moving into AI Safety research, career advisors in the EA/AI Safety space, and AI Safety employers and grantmakers. tl;dr: AI career advice orgs, prominently 80,000 Hours, encourage career moves into AI risk roles, including mid‑career pivots into roles in AI safety research labs. Without side information, that advice is not credible for mid‑career readers, because it has no calibration mechanism. Advice organizations i...
I’m working on some proposals in AI safety at the moment, including this one. I submitted this particular one to the UK AISI Alignment Project. It was not funded. Note that this post is different from many on this blog: it’s highly speculative and yet not that measured, because it’s a pitch, not an analysis. It doesn’t contain a credible amount of detail (there were only two text fields with a 500-word limit to explain the whole thing). I present it here for comment ...
Wherein a failed application is set forth, and two research pathways are outlined: a Bias‑Robust Oversight programme at UTS’s Human Technology Institute, and MCMC estimation of the Local Learning Coefficient with Timaeus’ Murfet.
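For readers unfamiliar with the second pathway: the Local Learning Coefficient (LLC) is typically estimated by sampling a localised, tempered posterior around the trained parameter with SGLD, in the style of Lau et al. and Timaeus’ devinterp library. Below is a minimal toy sketch of that estimator, not the proposal’s actual method; the loss is hand-rolled and the hyperparameters (`beta`, `gamma`, `eps`) are illustrative choices, where real estimates run the same recipe on minibatched neural-network losses.

```python
# Hedged sketch: LLC estimation via SGLD on a toy loss with a known minimum.
# Not Murfet's or Timaeus' actual code; hyperparameters are illustrative.

import numpy as np

rng = np.random.default_rng(0)

n = 10_000                      # nominal dataset size
w_star = np.zeros(2)            # the trained parameter we localise around

def L_n(w):
    """Toy empirical loss; a degenerate (singular) quadratic."""
    return w[0] ** 2 + w[0] ** 2 * w[1] ** 2  # minimum value 0 at w_star

def grad_L_n(w):
    return np.array([
        2 * w[0] + 2 * w[0] * w[1] ** 2,
        2 * w[0] ** 2 * w[1],
    ])

beta = 1.0 / np.log(n)          # inverse temperature, beta* = 1 / log n
gamma = 100.0                   # localisation strength pulling w back to w_star
eps = 1e-5                      # SGLD step size
n_steps, burn_in = 20_000, 5_000

# SGLD targets pi(w) ~ exp(-beta * n * L_n(w) - (gamma/2) ||w - w_star||^2)
w = w_star.copy()
losses = []
for t in range(n_steps):
    drift = -(eps / 2) * (beta * n * grad_L_n(w) + gamma * (w - w_star))
    w = w + drift + np.sqrt(eps) * rng.standard_normal(w.shape)
    if t >= burn_in:
        losses.append(L_n(w))

# LLC estimator: lambda_hat = n * beta * (E[L_n(w)] - L_n(w_star))
llc_hat = n * beta * (np.mean(losses) - L_n(w_star))
print(f"estimated LLC: {llc_hat:.3f}")
```

On this toy potential the estimate should land near the true learning coefficient of 1/2, which is the sanity check one would run before trusting the machinery on a real network.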
Oddesty
Let’s reason backwards from the final destination of civilisation, if such a thing there be. What intelligences persist at the omega point? With what is superintelligence aligned in the big picture? Various authors have tried to put modern AI developments in continuity with historical trends from less materially-sophisticated societies, through more legible, compute-oriented societies, to some attractor or set of attractors at the end of history. Computational superorganisms. Singularities....
1 Key Research Directions
Placeholder. Notes on how to implement alignment in AI systems. This is necessarily a fuzzy concept, because Alignment is fuzzy and AI is fuzzy. We need to make peace with the frustrations of this fuzziness and move on. 1 Fine-tuning to do nice stuff: think RLHF, Constitutional AI, etc. I’m not greatly persuaded that these are the right way to go, but they are interesting (see the sketch after this entry). 2 Classifying models as unaligned: I’m familiar only with mechanistic interpretability at the moment; I’m su...
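To make the first bucket concrete, here is a minimal sketch of the pairwise Bradley–Terry loss at the heart of RLHF reward modelling; a generic illustration under my own assumptions, not any lab’s implementation, with dummy reward scores standing in for what would in practice be outputs of a fine-tuned LM head scoring (prompt, response) pairs.

```python
# Hedged sketch: Bradley-Terry preference loss for an RLHF reward model.
# The random tensors below are hypothetical stand-ins for reward-model outputs.

import torch

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r_chosen - r_rejected), averaged over the batch of pairs."""
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Usage with dummy scalar rewards for a batch of 8 preference pairs:
r_chosen = torch.randn(8, requires_grad=True)
r_rejected = torch.randn(8, requires_grad=True)
loss = preference_loss(r_chosen, r_rejected)
loss.backward()  # in a real setup, gradients flow into the reward model's weights
```

The learned reward is then maximised by a policy-gradient step (typically PPO) with a KL penalty back to the base model; that stage is omitted here.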
1 Incoming: Inside the U.K.’s Bold Experiment in AI Safety | TIME; Governing with AI | Justin Bullock; Deep atheism and AI risk - Joe Carlsmith. Wong and Bartlett (2022): we hypothesize that once a planetary civilization transitions into a state that can be described as one virtually connected global city, it will face an ‘asymptotic burnout’, an ultimate crisis where the singularity-interval time scale becomes smaller than the time scale of innovation. If a civilization develo...
Notes on AI Alignment Fast-Track - Losing control to AI. 1 Session 1: What is AI alignment? – BlueDot Impact; More Is Different for AI; Paul Christiano, What failure looks like 👈 my favourite, cannot believe I hadn’t read this; AI Could Defeat All Of Us Combined; Why AI alignment could be hard with modern deep learning. Terminology I should have already known but didn’t: Convergent Instrumental Goals (Self-Preservation, Goal Preservation, Resource Acquisition, Self-Improvement). Ajeya Cotra’s...