Abstracts: NeurIPS 2024 with Weizhu Chen
Microsoft Research Podcast - A podcast by Researchers across the Microsoft research community
 
Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.
