Wellbeing Workshop · Beliefs Analysis

WELLBY reliability & funder adoption forecasts · March 16 2026 workshop · The Unjournal · Internal draft updated June 18 2026 — estimates visible, for internal review only

PQ1A: What is your probability that linear WELLBY comparisons are reliable enough for comparing interventions in LMICs? Respondents gave a central estimate (0–100%) and a 90% credible interval.

Individual estimates with 90% credible intervals · linear scale 0–100%

Each shaded bar spans the respondent's 90% CI. Dot = central estimate. Sorted by central estimate (high to low).
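
The chart ordering described above can be sketched as follows. This is a minimal illustration, not the page's actual plotting code: the `Estimate` class, its field names, and the demo values are hypothetical placeholders and are not workshop responses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    label: str      # respondent label (placeholder names below)
    central: float  # central estimate, in percent
    low: float      # lower bound of the 90% credible interval, in percent
    high: float     # upper bound of the 90% credible interval, in percent


def validate(e: Estimate) -> None:
    """Check that the interval brackets the central estimate on the 0-100 scale."""
    if not (0 <= e.low <= e.central <= e.high <= 100):
        raise ValueError(f"inconsistent estimate for {e.label}")


def chart_order(estimates: list[Estimate]) -> list[Estimate]:
    """Return estimates sorted by central estimate, high to low."""
    for e in estimates:
        validate(e)
    return sorted(estimates, key=lambda e: e.central, reverse=True)


# Placeholder values for illustration only; NOT workshop data.
demo = [
    Estimate("A", central=30, low=10, high=60),
    Estimate("B", central=75, low=40, high=90),
    Estimate("C", central=50, low=20, high=80),
]
ordered_labels = [e.label for e in chart_order(demo)]
```

Because Python's `sorted` is stable, respondents with equal central estimates keep their input order.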

June 18 update: Added Miles Kimball's June 17 response. The most visible effects are a higher GiveWell-adoption forecast (new maximum 50%), a higher major-funder adoption forecast (60%), and a much wider DALY/WELLBY conversion range. Kimball gives 1 WELLBY per DALY with an interval of 0.1–10, while stressing that his team has not studied that conversion directly.

Interpretation notes:
  • Caspar Kaiser's central estimate of 100% is conditional: he interpreted "reliable" as relative to other currently available measures, among which linear WELLBY is likely the best. He notes he would give ~0.5% if the comparison were to all possible measures. His wide CI [20–100%] reflects this framing uncertainty.
  • One response shown as "Anon. participant 2" is anonymized at the respondent's request.
  • Dan Benjamin (UCLA), one anonymous respondent, and Miles Kimball (CU Boulder) submitted post-workshop (Mar 17–Jun 17).

PQ1B — Recommended measure for funders

What measure should funders focused on quality-of-life improvements use? 7 of 8 respondents answered.

There is broad agreement that calibration is worth pursuing. The dividing line is whether a calibrated WELLBY alone suffices, or whether a broader composite (multiple wellbeing dimensions plus revealed-preference anchors) is preferable. Plant is the sole advocate for linear WELLBY as the best measure currently available, and does not endorse calibration as a first step.

PQ2 — DALY / WELLBY conversion factor

How many WELLBYs equal 1 DALY? 5 of 8 respondents gave a numeric estimate with 90% CI. Plant declined on principle; Benjamin and Anon. participant 1 left blank.

Plant's objection: "The evidence suggests that for comparing disability weights and wellbeing weights, you get different answers based on the intervention and problem under consideration. So I'd resist doing a simple exchange rate." McGuire similarly advocates separate conversion rates for mental vs. physical health contexts.

PQ3 — Funder adoption forecasts

Point estimates only (no CIs collected for PQ3).

PQ3A — P(GiveWell adopts WELLBY in 3 years)

8 responses · range 5–50% · mean 20% · median 20%

PQ3B — Share of major funders using WELLBY in 5 years

6 responses (no answer from Anon. participant 1 or Anon. participant 2) · range 15–67% · mean 37% · median 30%

PQ3C — P(HLI abandons WELLBY within 3 years)

8 responses · range 14–70% · mean 36% · median 35% · Notable: McGuire (HLI) gives the highest estimate at 70%, citing deworming's limited long-term wellbeing impact.
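
Summary lines of this form (response count, range, mean, median) can be reproduced with a few lines of standard-library Python. The sketch below assumes missing answers are stored as `None`; the `summarize` helper and the demo values are hypothetical and are not the workshop responses.

```python
from statistics import mean, median
from typing import Optional


def summarize(points: list[Optional[float]]) -> dict:
    """Summarize point estimates (in percent), skipping unanswered entries."""
    answered = [p for p in points if p is not None]
    if not answered:
        raise ValueError("no responses to summarize")
    return {
        "n": len(answered),
        "min": min(answered),
        "max": max(answered),
        "mean": mean(answered),
        "median": median(answered),
    }


# Placeholder values for illustration only; NOT workshop data.
demo_points = [10, None, 40, 25, None, 5]
summary = summarize(demo_points)
```

With an even number of answers, `statistics.median` returns the average of the two middle values, which is one reason a reported median need not match any individual response.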

Data collection

Responses collected via Netlify form at uj-wellbeing-workshop.netlify.app/beliefs during and after the March 16, 2026 workshop. Raw data stored in wellbeing-beliefs-elicitation-submissions.json (private repo). One test submission (Claude Test) excluded. 8 substantive responses remain.
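
The exclusion step can be sketched as below. The structure of the submissions file is not documented on this page, so the `name` field and the list-of-objects layout are assumptions, and the inline JSON stands in for the private file.

```python
import json

# Names treated as test submissions (from the note above).
TEST_NAMES = {"Claude Test"}


def substantive(submissions: list[dict], name_field: str = "name") -> list[dict]:
    """Drop test submissions; `name_field` is an assumed key, not a confirmed schema."""
    return [
        s for s in submissions
        if (s.get(name_field) or "").strip() not in TEST_NAMES
    ]


# Inline placeholder records for illustration only; NOT the real submissions file.
raw = json.loads(
    '[{"name": "Claude Test"}, {"name": "Respondent A"}, {"name": "Respondent B"}]'
)
kept = substantive(raw)
```

In practice the list would come from `json.load` on the opened submissions file rather than from an inline string.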

Response timing

Credible intervals

The form asked for a central estimate and a 90% credible interval for PQ1A and PQ2. No CIs were collected for PQ3A, PQ3B, or PQ3C (point estimates only). Not all respondents provided CIs for all questions.

Anonymization

One respondent (shown as "Anon. participant 2," affiliation: funder / evaluator) requested that their individual response not be shared publicly. Their quantitative estimates are included in aggregates and individual CI charts. Their qualitative reasoning is shown in summary form in their response card. This page uses an unlinked internal URL and is not indexed from the main workshop site.

HLI concentration note

3 of the 5 workshop-day named respondents have HLI affiliations (Plant: founder/director; Kaiser: board chair; McGuire: researcher). The Benjamin, Jamison, and Kimball responses provide non-HLI academic perspectives; Benjamin and Kimball are co-authors of related scale-use and multi-dimensional wellbeing work. Interpret aggregate figures accordingly — they do not represent a balanced cross-section of the field.

Completeness

| Question | N responses | Notes |
|---|---|---|
| PQ1A central estimate | 8 | All 8 provided |
| PQ1A 90% CI | 8 | All 8 provided CIs |
| PQ1B measure recommendation | 7 | Anon. participant 1 did not answer |
| PQ2 DALY/WELLBY estimate + CI | 5 | Plant declined (principled); Benjamin and Anon. participant 1 left blank |
| PQ3A GiveWell adoption | 8 | All provided |
| PQ3B funder share | 6 | Anon. participant 2 (ran out of time) and Anon. participant 1 did not answer |
| PQ3C HLI abandons | 8 | All provided |
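
A completeness count like the one above can be derived by counting non-missing answers per question. This is a minimal sketch: the field names (`pq1a_central`, `pq2_estimate`, `pq3b`) and the demo records are hypothetical and do not reflect the real form schema or responses.

```python
def completeness(responses: list[dict], fields: list[str]) -> dict[str, int]:
    """Count, for each field, how many responses hold a non-missing value."""
    return {
        f: sum(1 for r in responses if r.get(f) is not None)
        for f in fields
    }


# Placeholder records for illustration only; NOT workshop data.
demo_responses = [
    {"pq1a_central": 40, "pq2_estimate": 2.0, "pq3b": 30},
    {"pq1a_central": 70, "pq2_estimate": None, "pq3b": 50},
    {"pq1a_central": 55},  # later questions left blank
]
counts = completeness(demo_responses, ["pq1a_central", "pq2_estimate", "pq3b"])
```

Treating both absent keys and explicit `None` values as missing keeps the count robust to either way a blank answer might be stored.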