Best AI papers explained
A podcast by Enoch H. Kang
529 Episodes
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Published: 31.03.2025
Why MCP won
Published: 31.03.2025
SWEET-RL: Training LLM Agents for Collaborative Reasoning
Published: 31.03.2025
TheoryCoder: Bilevel Planning with Synthesized World Models
Published: 30.03.2025
Driving Forces in AI: Scaling to 2025 and Beyond (Jason Wei, OpenAI)
Published: 29.03.2025
Expert Demonstrations for Sequential Decision Making under Heterogeneity
Published: 28.03.2025
TextGrad: Backpropagating Language Model Feedback for Generative AI Optimization
Published: 27.03.2025
MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks
Published: 27.03.2025
RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models
Published: 27.03.2025
Inductive Biases for Exchangeable Sequence Modeling
Published: 26.03.2025
InverseRLignment: LLM Alignment via Inverse Reinforcement Learning
Published: 26.03.2025
Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting
Published: 26.03.2025
Alignment from Demonstrations for Large Language Models
Published: 25.03.2025
Q♯: Distributional RL for Optimal LLM Post-Training
Published: 18.03.2025
Scaling Test-Time Compute Without Verification or RL is Suboptimal
Published: 14.03.2025
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Published: 14.03.2025
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Published: 14.03.2025
Revisiting Superficial Alignment Hypothesis
Published: 14.03.2025
Diagnostic Uncertainty: Teaching Language Models to Describe Open-Ended Uncertainty
Published: 14.03.2025
Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
