Even though Artificial Intelligence (AI) has existed since the 1950s, rapid developments in recent years have turned it into one of the most promising sectors within ICT. There has been enormous growth in investment in and use of AI systems across many sectors such as automotive, health and aeronautics, creating new challenges for both industry and society. To ensure the development of trustworthy AI systems that respect fundamental values and human rights recognized in Europe, standardiza...| CEN-CENELEC
The nation’s leading AI labs treat security as an afterthought. Currently, they’re basically handing the key secrets for AGI to the CCP on a silver platter. Securing the AGI secrets and weights against the state-actor threat will be an immense effort, and we’re not on track.| SITUATIONAL AWARENESS
A 12-week online course covering a range of policy levers for steering AI development. By taking this course, you’ll learn about the risks arising from future AI systems, and proposed governance interventions to address them. You’ll consider interactions between AI and biosecurity, cybersecurity and defence capabilities, and the disempowerment of human decision-makers. We’ll also provide an overview of open technical questions such as the control and alignment problems – which posit th...| BlueDot Impact
Avoiding unhelpful work as a new AI governance researcher| adamjones.me
Why having a human-in-the-loop doesn't solve everything| adamjones.me
AI could bring significant rewards to its creators. However, the average person seems to have wildly inaccurate intuitions about the scale of these rewards. By exploring some conservative estimates of the potential rewards AI companies could expect to see from the automation of human labour, this article tries to convey a grounded sense of ‘woah, this could […]| BlueDot Impact
Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique for steering large language models (LLMs) toward desired behaviours. However, relying on simple human feedback doesn’t work for tasks that are too complex for humans to accurately judge at the scale needed to train AI models. Scalable oversight techniques attempt to address this […]| BlueDot Impact
OHGOOD: A coordination body for compute governance| adamjones.me
This article explains key concepts that come up in the context of AI alignment. These terms are only attempts at gesturing at the underlying ideas, and the ideas are what is important. There is no strict consensus on which name should correspond to which idea, and different people use the terms differently.[[1]] This article explains […]| BlueDot Impact
AI systems already pose many significant risks, including harmful malfunctions, discrimination, reduced social connection, invasions of privacy and disinformation. Training and deploying AI systems can also involve copyright infringement and worker exploitation. Future AI systems could exacerbate anticipated catastrophic risks, including bioterrorism, misuse of concentrated power, and nuclear and conventional war. We might also gradually […]| BlueDot Impact
How are AI companies doing with their voluntary commitments on vulnerability reporting?| adamjones.me
A WIRED investigation shows that the AI-powered search startup that Forbes has accused of stealing its content is surreptitiously scraping—and making things up out of thin air.| WIRED
Perplexity AI claims it sends a user agent and respects robots.txt but it absolutely does not| rknight.me
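The robots.txt compliance claim above can be checked mechanically. A minimal sketch using Python's standard `urllib.robotparser`; the user-agent name "PerplexityBot" and the example rules are assumptions for illustration, not the actual file any site serves:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows a specific crawler entirely.
# The user-agent name "PerplexityBot" is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A compliant crawler would call can_fetch() before every request
# and skip any URL for which it returns False.
print(rp.can_fetch("PerplexityBot", "https://example.com/post"))  # False
print(rp.can_fetch("OtherBot", "https://example.com/post"))       # True
```

Comparing a crawler's actual requests (from server access logs) against what `can_fetch` permits is one way to test whether the crawler honours the file.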
Has Sam Altman told the truth about OpenAI’s NDA scandal? How the OpenAI CEO and other key players treated the question of vested equity, explained.| Vox
Advanced AI systems could have massive impacts on humanity and potentially pose global catastrophic risks. There are opportunities...| 80,000 Hours
In this post, we try to estimate what fraction of all chips were high-end data center AI chips in 2022. When discussing compute governance measures for AI regulation, it is crucial to precisely define the scope of any such measures to prevent regulatory overreach and counterproductive side effects.| Blog - Lennart Heim
Aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models (LLMs)| owasp.org
On July 26, 2024, NIST released NIST-AI-600-1, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile. The profile can help organizations identify unique risks posed by generative AI and proposes actions for generative AI risk management that best align with their goals and priorities.| NIST
The OWASP Top 10 is the reference standard for the most critical web application security risks. Adopting the OWASP Top 10 is perhaps the most effective first step towards changing your software development culture into one focused on producing secure code.| owasp.org
If you thought we might be able to cure cancer in 2200, then I think you ought to expect there’s a good chance we can do it within years of the advent of AI systems that can do the research work humans can do.| Planned Obsolescence