AI Safety Fundamentals: Alignment
Podcast by BlueDot Impact
83 Episodes
Public by Default: How We Manage Information Visibility at Get on Board
Published: 12.05.2024
Writing, Briefly
Published: 12.05.2024
Being the (Pareto) Best in the World
Published: 4.05.2024
How to Succeed as an Early-Stage Researcher: The “Lean Startup” Approach
Published: 23.04.2024
Become a Person who Actually Does Things
Published: 17.04.2024
Planning a High-Impact Career: A Summary of Everything You Need to Know in 7 Points
Published: 16.04.2024
Working in AI Alignment
Published: 14.04.2024
Computing Power and the Governance of AI
Published: 7.04.2024
AI Control: Improving Safety Despite Intentional Subversion
Published: 7.04.2024
Emerging Processes for Frontier AI Safety
Published: 7.04.2024
AI Watermarking Won’t Curb Disinformation
Published: 7.04.2024
Challenges in Evaluating AI Systems
Published: 7.04.2024
Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small
Published: 1.04.2024
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Published: 31.03.2024
Zoom In: An Introduction to Circuits
Published: 31.03.2024
Weak-To-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Published: 26.03.2024
Can We Scale Human Feedback for Complex AI Tasks?
Published: 26.03.2024
Machine Learning for Humans: Supervised Learning
Published: 13.05.2023
Visualizing the Deep Learning Revolution
Published: 13.05.2023
Four Background Claims
Published: 13.05.2023
Listen to resources from the AI Safety Fundamentals: Alignment course! https://aisafetyfundamentals.com/alignment