EA - Some research ideas in forecasting by Jaime Sevilla
The Nonlinear Library: EA Forum - Podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some research ideas in forecasting, published by Jaime Sevilla on November 15, 2022 on The Effective Altruism Forum.

In the past, I have researched how we can effectively pool the predictions of many experts. For the most part, I am now focusing on directing Epoch and AI forecasting.

However, I have accumulated a log of research projects related to forecasting. I have the vague intention of working on them at some point, but this will likely be months or years away, and meanwhile I would be elated if someone else took my ideas and developed them.

And with the Million Predictions Hackathon by Metaculus looming, now seems a particularly good moment to write down some of these project ideas.

Compare different aggregation methods
Difficulty: easy

The ultimate arbiter of which aggregation method works is what performs best in practice.

Redoing a comparison of forecast aggregation methods on Metaculus / INFER / etc. questions would provide helpful data for that purpose.

For example, here is a script I wrote to compare some aggregation methods, and the results I obtained:

| Method | Weighted | Brier | -log | Questions |
|---|---|---|---|---|
| Neyman aggregate (p=0.36) | Yes | 0.106 | 0.340 | 899 |
| Extremized mean of logodds (d=1.55) | Yes | 0.111 | 0.350 | 899 |
| Neyman aggregate (p=0.5) | Yes | 0.111 | 0.351 | 899 |
| Extremized mean of probabilities (d=1.60) | Yes | 0.112 | 0.355 | 899 |
| Metaculus prediction | Yes | 0.111 | 0.361 | 774 |
| Mean of logodds | Yes | 0.116 | 0.370 | 899 |
| Neyman aggregate (p=0.36) | No | 0.120 | 0.377 | 899 |
| Median | Yes | 0.121 | 0.381 | 899 |
| Extremized mean of logodds (d=1.50) | No | 0.126 | 0.391 | 899 |
| Mean of probabilities | Yes | 0.122 | 0.392 | 899 |
| Neyman aggregate (o=1.00) | No | 0.126 | 0.393 | 899 |
| Extremized mean of probabilities (d=1.60) | No | 0.127 | 0.399 | 899 |
| Mean of logodds | No | 0.130 | 0.410 | 899 |
| Median | No | 0.134 | 0.418 | 899 |
| Mean of probabilities | No | 0.138 | 0.439 | 899 |
| Baseline (p = 0.36) | N/A | 0.230 | 0.652 | 899 |

It would be straightforward to extend this analysis with new questions that have resolved since then, other datasets or new techniques.

Literature review of weight aggregation
Difficulty: easy

When aggregating forecasts, we usually resort to formulas like $\sum_i a_i \log o_i$, where $o_i$ are the individual predictions (expressed in odds) and $a_i$ are the weights assigned to each prediction.

Right now I have a lot of uncertainty about which theoretical and empirical approaches to assigning weights to predictions are best. The weights could be based on factors like the date of the prediction, the track record of the forecaster or other factors.

The first step would be to do a literature review of schemes for weighting the predictions of experts when aggregating, and to compare them using Metaculus data.

Comparing methods for predicting base rates
Difficulty: medium

Using historical data is always a must when forecasting.

While one can rely on intuition to extract lessons from the past, it is often convenient to have some rules of thumb that inform how to translate historical frequencies into base-rate probabilities.

The classical method in this situation is Laplace's rule of succession. However, we showed that this method gives inconsistent results when applied to observations over a time period, and we proposed a fix:

| Number of observed successes S during time T | Probability of no successes during t time |
|---|---|
| S = 0 | $(1 + t/T)^{-1}$ |
| S > 0 | $(1 + t/T)^{-S}$ if the sampling time period is variable; $(1 + t/T)^{-(S+1)}$ if the sampling time period is fixed |

While theoretically appealing, we did not show that employing this fix actually improves performance, so there is a good research opportunity for someone to collect data and investigate this.

Decay of predictions
Difficulty: medium

Imagine I predict today that no earthquakes will happen in Chile before 2024 with 60% probability. Then in April 2023, if no earthquakes have happened, my implied probability should be higher than 60%, since less time remains in which an earthquake could occur.

Theoretically, we should be able to derive the implied probability under some mild assumptions, for example that the probability was uniform over time, maybe following a framework like the time-...
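
As a rough illustration, the fix to Laplace's rule tabulated above can be written as a small function. This is only a sketch of the three table entries, not the author's own code: the function name `prob_no_success` and the `fixed_period` flag are hypothetical names introduced here, and the example numbers are made up.

```python
def prob_no_success(S: int, T: float, t: float, fixed_period: bool = False) -> float:
    """Probability of seeing no successes during a future window of length t,
    given S successes observed during a past window of length T.

    Follows the table in the text:
      S = 0                         -> (1 + t/T) ** -1
      S > 0, variable sample period -> (1 + t/T) ** -S
      S > 0, fixed sample period    -> (1 + t/T) ** -(S + 1)
    `fixed_period` is a hypothetical flag selecting between the last two rows.
    """
    if S < 0 or T <= 0 or t < 0:
        raise ValueError("need S >= 0, T > 0 and t >= 0")
    if S == 0:
        exponent = 1
    elif fixed_period:
        exponent = S + 1
    else:
        exponent = S
    return (1 + t / T) ** (-exponent)


# Made-up example: 10 years of observation, forecasting the next 10 years.
print(prob_no_success(S=0, T=10, t=10))                     # 0.5
print(prob_no_success(S=2, T=10, t=10))                     # 0.25
print(prob_no_success(S=2, T=10, t=10, fixed_period=True))  # 0.125
```

Testing whether these estimates beat the classical rule would additionally require a dataset of historical event counts, which this sketch does not provide.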

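For readers who want to try the aggregation comparison described earlier, a few of the simpler pooling methods and both scoring rules can be sketched as follows. This is a minimal illustration and not the script referenced in the text: the function names are hypothetical, the forecasts are made up, the Neyman aggregate is omitted, and extremizing is assumed to mean multiplying the pooled log-odds by a factor d.

```python
import math
from statistics import median


def logit(p: float) -> float:
    """Convert a probability (strictly between 0 and 1) into log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def _normalize(weights, n):
    """Return weights that sum to 1 (uniform weights if none are given)."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def mean_of_probabilities(probs, weights=None):
    """Weighted arithmetic mean of the individual probabilities."""
    a = _normalize(weights, len(probs))
    return sum(w * p for w, p in zip(a, probs))


def mean_of_logodds(probs, weights=None, d=1.0):
    """Weighted mean of log-odds, sum_i a_i * log(o_i), scaled by d.

    d = 1 gives the plain mean of log-odds; d > 1 extremizes the result
    (assumed convention, see the lead-in).
    """
    a = _normalize(weights, len(probs))
    pooled = sum(w * logit(p) for w, p in zip(a, probs))
    return sigmoid(d * pooled)


def brier_score(p, outcome):
    """Squared error of a binary forecast (lower is better)."""
    return (p - outcome) ** 2


def log_score(p, outcome):
    """Negative log of the probability given to the outcome (lower is better)."""
    return -math.log(p if outcome == 1 else 1.0 - p)


# Made-up forecasts for one question that resolved positively.
forecasts = [0.6, 0.7, 0.8]
outcome = 1
methods = {
    "Mean of probabilities": mean_of_probabilities(forecasts),
    "Median": median(forecasts),
    "Mean of logodds": mean_of_logodds(forecasts),
    "Extremized mean of logodds (d=1.55)": mean_of_logodds(forecasts, d=1.55),
}
for name, p in methods.items():
    print(f"{name:38s} p={p:.3f} "
          f"Brier={brier_score(p, outcome):.3f} -log={log_score(p, outcome):.3f}")
```

A real comparison would average these scores over many resolved questions, and the weights could encode recency or track record as discussed in the section on weight aggregation.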