⚠️ DRAFT: AI-Generated Content — Requires Verification
This page was largely generated with AI assistance (Claude Code + deep research, March 2026) and has not yet been fully fact-checked. It is shared for workshop discussion and review purposes only.
- Conversion factor ranges (e.g., "7 WELLBYs per QALY") are illustrative and require primary-source verification
- Citations to specific papers (Peasgood et al., EEPRU report, etc.) should be checked against originals
- Interactive demos use simplified assumptions — not for production cost-effectiveness analysis
- Framing and emphasis may not reflect consensus views in the field
Please annotate errors or concerns directly on this page. Your feedback will improve the final version.
The conversion problem: what are we trying to do?
Funders and evaluators often face a practical comparison problem: one intervention is evaluated in DALYs averted or QALYs gained (health metrics), while another is evaluated in WELLBYs (life satisfaction point-years). To compare them on a common basis requires some form of translation or mapping.
The focal question for this workshop segment, drawn from the Pivotal Questions database (question codes DALY_01 / DALY_03 / DALY_05), is:[1]
Focal Question (PQ2)
"How should we translate between health measures (DALYs/QALYs) and subjective wellbeing (WELLBYs) for cross-intervention comparison?"
Sub-questions include: What numerical conversion factor should we use? Should it vary by domain? How should we treat uncertainty?
A key insight to internalize up front: a "conversion" is not a single fact like a currency exchange rate. DALYs/QALYs were designed to measure health burden or health-related quality of life, while WELLBYs are anchored to self-reported life evaluation intended to capture welfare across all domains.[2] (WHO methods explicitly note that disability weights are intended to quantify "loss of health" rather than general welfare or social undesirability.) These are different targets.
Workshop framing: The goal is not to find "the one true conversion factor," but to choose (and stress-test) a mapping structure that leads to the least expected decision error given the information available.
The measurement-to-decision pipeline
Intervention A (measured in DALYs) → Mapping / conversion choice ← Intervention B (measured in WELLBYs)

Mapping / conversion choice → Common comparison unit (or multi-metric frame) → Ranking / allocation decision
The mapping choice sits between measurement and decision. Different mappings embed different assumptions—about what welfare is, how metrics relate to it, and what level of precision is appropriate.
Why this is a "forced comparison" rather than a scientific question
Funders cannot wait for perfect measurement. If you must allocate $1M between a malaria program and a mental health program this year, you are implicitly adopting some conversion—even if that conversion is "treat them as incomparable" (which is itself a choice).
The practical question is: given imperfect information, what mapping structure leads to least regret?[17] This is different from asking "what is the true conversion factor?"—a question that may not have a coherent answer.
Who uses DALY↔WELLBY conversions in practice?
- Founders Pledge: Compares interventions across its four "ways of doing good" (lives, DALYs, WELLBYs, income doublings) and explicitly flags DALY↔WELLBY conversion as a key uncertainty.[6]
- Happier Lives Institute: Uses WELLBY-based cost-effectiveness analysis for mental health interventions, with explicit WELLBY→monetary conversion methods controversially applied to StrongMinds.[16]
- GiveWell: Primarily DALY-based but engages with WELLBY evidence when evaluating mental health; its StrongMinds analysis explicitly discusses the WELLBY→DALY mapping problem.[11]
- UK Government: Uses WELLBYs for policy appraisal via the Green Book; HM Treasury (2021) provides an explicit QALY↔WELLBY conversion methodology.[7]
Definitions and notation
Before discussing conversion, we need clear definitions of the objects being converted. These definitions follow primary sources (WHO, NICE, UK Green Book) rather than informal usage.
DALY (Disability-Adjusted Life Year)
A measure of health burden combining:
- YLL (Years of Life Lost): Deaths × years lost to premature mortality
- YLD (Years Lived with Disability): Incidence × duration × disability weight

$$\text{DALY} = \text{YLL} + \text{YLD} = (\text{deaths} \times \text{years lost}) + (\text{incidence} \times \text{duration} \times DW)$$

where $DW \in [0,1]$ is a disability weight: 0 = full health, 1 = death-equivalent.[3] (WHO GHE Methods, 2020. Note that GBD 2010+ moved to simplified prevalence-based YLD and removed discounting/age-weighting.)
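The YLL + YLD arithmetic can be sketched in a few lines; the function name and all figures below are invented for illustration, not GBD estimates:

```python
def daly(deaths, years_lost_per_death, cases, duration_years, disability_weight):
    """DALYs = YLL + YLD (GBD 2010+ style: no discounting or age-weighting)."""
    yll = deaths * years_lost_per_death               # years of life lost
    yld = cases * duration_years * disability_weight  # years lived with disability
    return yll + yld

# A hypothetical condition: 10 deaths (30 years lost each) plus
# 200 non-fatal cases lasting 2 years at DW = 0.2.
burden = daly(deaths=10, years_lost_per_death=30, cases=200,
              duration_years=2, disability_weight=0.2)
# 10×30 = 300 YLL plus 200×2×0.2 = 80 YLD, ≈ 380 DALYs
```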
Where do disability weights come from?
Disability weights are elicited through population surveys using choice-based methods:
- Paired comparison: "Which of these two health states would you consider worse?"
- Population health equivalence: "Which is worse: 1000 people with condition X or 2000 with condition Y?"
The GBD 2019 study collected ~60,000 responses from 9 countries to set disability weights for 234 health states.
Importantly, respondents are asked about health loss specifically—not overall welfare, social functioning, or quality of life. This is a deliberate methodological choice that makes DALYs narrower than some alternatives.
QALY (Quality-Adjusted Life Year)
A measure of health benefit adjusting life-years by health-related quality of life:

$$\text{QALYs} = \sum_{t=1}^{T} q_t$$

where $q_t \in [0,1]$ is the health utility weight in year $t$ (sometimes negative for "worse than death" states). 1 QALY = 1 year in perfect health.[4] (NICE Glossary. QALYs are widely used in health technology assessment.)
DALY vs. QALY: What's the relationship?
DALYs and QALYs are often treated as "opposites" (DALYs measure burden; QALYs measure benefit), but the relationship is more complex:
- Disability weights ≠ 1 − utility weights. The elicitation methods are different, and empirical mappings show poor correspondence at the extremes.
- Reference states differ. DALYs reference "full health"; QALYs allow negative values for "worse than death."
- Aggregation differs. DALYs are population-summed burdens; QALYs are individual-level benefit calculations.
For conversion purposes, 1 DALY averted ≈ 1 QALY gained is a common working assumption, but it's not definitionally true.
WELLBY (Wellbeing-Adjusted Life Year)
A measure of welfare based on self-reported life satisfaction:

$$\text{WELLBYs} = \sum_{i}\sum_{t} \Delta LS_{it}$$

where $LS_{it}$ is reported life satisfaction (0-10 scale) for person $i$ at time $t$, and $\Delta LS_{it}$ is the change in that score attributable to the intervention. One WELLBY = one person experiencing a one-point LS increase for one year.[5] (UK Green Book Wellbeing Guidance, 2021. This definition is also used by OECD and in the World Happiness Report.)
What question produces the LS score?
The standard OECD/ONS question is:
"Overall, how satisfied are you with your life nowadays?"
(0 = "Not at all satisfied" to 10 = "Completely satisfied")
This is an evaluative measure—it asks for a cognitive assessment of one's life, not a measure of current mood or momentary affect.
Other common SWB questions (happiness, worthwhileness, anxiety) capture different constructs and don't combine into WELLBYs the same way.
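The WELLBY definition reduces to summing life-satisfaction point-years. A minimal sketch, with invented figures:

```python
def wellbys(ls_changes, years):
    """Sum of LS point changes (0-10 scale), each sustained for `years`."""
    return sum(delta * years for delta in ls_changes)

# 100 people each gain 0.5 LS points, sustained for 2 years:
gain = wellbys([0.5] * 100, years=2)   # 100 person-point-years = 100 WELLBYs
```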
Why are these different objects?
The metrics have different scopes and measurement architectures:
- DALYs/QALYs focus on health—they do not necessarily capture non-health welfare (income, relationships, meaning).
- WELLBYs aim to capture overall evaluative wellbeing via self-report—they integrate across domains but depend on how people interpret and use response scales.
- DALYs/QALYs use externally-elicited weights (disability weights, health utility tariffs), while WELLBYs use direct self-reported scores.
Important conceptual note: DALYs measure "loss of health," not necessarily welfare
WHO's methods documentation explicitly states that the DALY framework evolved from earlier "welfare/quality of life" framing toward quantifying loss of health (departures from perfect health). Disability weights are intended to reflect health states, not social value, stigma, or general quality-of-life.
This means DALYs and WELLBYs are measuring different targets, even in principle. A "conversion" is really a mapping between proxies, not a unit transformation within the same construct.
Why this matters in practice
The conversion problem is not merely academic. Funders like Founders Pledge, GiveWell, and Open Philanthropy must compare interventions across different outcome spaces. Their four main "ways of doing good" include: lives saved, DALYs averted, WELLBYs generated, and income doublings.[6] (Founders Pledge internal framing, also reflected in EA-adjacent cost-effectiveness analyses.)
Canonical comparison examples
Malaria bednets vs. mental health treatment
Malaria interventions are typically evaluated in DALYs averted (mortality risk reduction + reduced morbidity). Psychotherapy programs like StrongMinds are often evaluated with depression scales or life satisfaction. Comparing them requires mapping across metrics.
Cash transfers vs. health interventions
Cash transfer RCTs often measure consumption, income, and life satisfaction. Health interventions measure DALYs or QALYs. Cost-effectiveness comparison requires choosing how to weight these.
The stakes are real: different conversion assumptions can materially change which intervention appears more cost-effective. This is not a reason to avoid conversion, but a reason to be explicit about what is being assumed.
The StrongMinds / HLI / GiveWell controversy
The Happier Lives Institute (HLI) ranked StrongMinds—a mental health NGO—as potentially more cost-effective than GiveWell's top charities, primarily based on WELLBY-measured impacts from group therapy (HLI 2023).[16]
GiveWell's reanalysis raised several concerns:
- Depression → LS mapping: Studies measured depression severity (PHQ-9), not life satisfaction. Converting requires assumptions about the depression-LS relationship.
- Effect durability: What persistence should we assume for psychotherapy effects? HLI's assumptions were more optimistic.
- Spillover effects: HLI counted benefits to household members; GiveWell was more skeptical of the evidence base.
This controversy illustrates how DALY↔WELLBY mapping issues can materially affect charity recommendations, not just academic debates.
Implicit conversions: what happens when we don't choose?
Refusing to make a conversion explicit doesn't avoid the problem—it just hides it. Common implicit approaches include:
- "Only compare within metrics": This effectively assigns infinite value to one metric and zero to the other across boundaries.
- "Fund both categories proportionally": This implicitly assumes some conversion ratio equal to budget shares, regardless of cost-effectiveness.
- "Use expert judgment case-by-case": This may embed inconsistent conversions across decisions.
Making conversion assumptions explicit—even with wide uncertainty ranges—is almost always preferable to leaving them implicit.
Candidate conversion approaches
There is no single "correct" method for converting DALYs to WELLBYs. Instead, there are several candidate approaches, each with different data requirements, assumptions, and failure modes.
| Approach | How it works | Main strengths | Main limitations |
|---|---|---|---|
| Fixed conversion factor | Assume 1 QALY ≈ X WELLBYs (constant X) | Simple; easy sensitivity analysis; explicit | Hides domain variation; X is contested; may mislead outside calibration range |
| Anchor-span mapping | Use LS span from "full health" to "as bad as death" to define X | Explicit anchors; traceable to UK guidance | Anchors are empirically uncertain; death-equivalence point is contested |
| Monetary peg ratio | Convert each metric to £/$ using WTP values, then take ratio | Leverages existing valuations; policy-consistent | Inherits valuation uncertainties; circular if values are WELLBY-derived |
| SD-equivalence | Treat 1 SD improvement in one metric ≈ 1 SD in another | Standardizes across scales; used in practice | SDs depend on population variance; not measurement-invariant |
| Empirical crosswalk | Estimate LS = f(health utility, covariates) from datasets with both | Data-driven; can estimate domain-specific mappings | Population-specific; may not generalize; requires joint measurement |
| Component decomposition | Convert YLL and YLD separately (mortality vs. morbidity paths) | Localizes disagreements; transparent | More complex; requires separate evidence for each path |
| Multi-metric sensitivity | Present rankings under multiple X values; report robustness | Explicit uncertainty; avoids false precision | Less actionable if rankings are unstable; requires interpretation |
When to use each approach
Fixed factor is appropriate when you need a simple baseline for sensitivity analysis, or when communicating to audiences who need a single number.
Anchor-span is preferred when you have good estimates of LS at health extremes and want a traceable derivation. The UK Green Book makes this approach explicit.
Monetary peg works when both metrics already have established £/$ valuations (e.g., from health technology assessment). Useful for policy consistency.
Empirical crosswalk is ideal when you have datasets measuring both health status and life satisfaction for the same individuals. This allows population-specific calibration.
Multi-metric sensitivity is recommended for final decision-making when conversion is contested. It makes uncertainty visible rather than hidden.
Why "empirical crosswalk" is harder than it sounds
The idea of estimating LS = f(health) from datasets that measure both seems straightforward, but complications include:
- Selection: People measured on both instruments may not be representative (e.g., patients vs. general population).
- Simultaneity: Health affects LS, but LS may also affect health behaviors and reporting.
- Ceiling effects: At high health levels, LS variation may reflect non-health factors.
- Instrument mismatch: EQ-5D measures "today," while LS measures "overall evaluation." Timing differences matter.
EEPRU's work found that generic SWB measures are often less sensitive than disease-specific instruments for physical conditions, complicating simple crosswalks (Mukuria et al. 2016).[10]
The UK Green Book anchor-span approach (explicit example)
The UK Green Book wellbeing guidance (HM Treasury 2021) provides an unusually explicit mapping logic:[7]
- Average LS for those with no health problems ≈ 8 on a 0-10 scale
- Assume the LS level equivalent to "as bad as death" (QALY = 0) ≈ 1
- Therefore: 1 QALY ↔ (8 - 1) = 7 WELLBY
This is valuable not as "the answer" but as an explicit, traceable derivation. The guidance itself flags uncertainty about the low-end anchor and cites alternative evidence suggesting ~2 rather than 1.
What if the death-equivalent anchor is 2 instead of 1?
Peasgood et al. (2018), cited in the UK guidance, found an indifference point around LS = 2. Using this anchor:
$X = 8 - 2 = 6$ WELLBY per QALY
A one-point shift in the anchor changes the conversion factor by ~14%. This illustrates why the neutral/death-equivalence debate is not semantic—it is numerically load-bearing for mortality comparisons.
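The anchor-span derivation and its sensitivity can be written out directly; the function name is ours, and the anchor values follow the text above:

```python
def wellbys_per_qaly(ls_full_health, ls_death_equivalent):
    """Green Book-style span: X = LS at full health minus death-equivalent LS."""
    return ls_full_health - ls_death_equivalent

base = wellbys_per_qaly(8, 1)            # 7 (Green Book working assumption)
alt = wellbys_per_qaly(8, 2)             # 6 (Peasgood et al. anchor)
relative_change = (base - alt) / base    # ≈ 0.14: one anchor point shifts X by ~14%
```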
Core assumptions behind a simple conversion
Any fixed conversion factor (e.g., "1 DALY ≈ X WELLBYs") implicitly assumes several things. Making these explicit helps identify where conversion may be most fragile.
WELLBY cardinality
Equal steps on the 0-10 LS scale correspond to equal welfare changes. A move from 3→4 has the same welfare meaning as 7→8.
If violated: Summing LS points across people/time may distort welfare comparisons.
Interpersonal comparability
A 1-point LS change means the same welfare change for different people (at least approximately).
If violated: Equal reported changes may hide unequal welfare impacts.
Domain invariance
The conversion factor is stable across domains—the same X applies whether the DALY is from malaria, depression, or chronic pain.
If violated: A single factor may systematically over- or under-weight certain domains.
Baseline invariance
The conversion factor doesn't depend on the starting LS or health state of the beneficiary.
If violated: The same health improvement may yield different LS gains at different baselines.
Linearity (no saturation)
Marginal improvements in health produce proportional LS gains across the severity spectrum.
If violated: Linear extrapolation may miss ceiling/floor effects or diminishing returns.
Stable link function
The relationship between health status and life satisfaction is consistent across contexts, populations, and time.
If violated: Cross-study and cross-context comparisons become unreliable.
Plant's Cardinality Thesis decomposition
Plant (2025) provides a useful decomposition of the assumptions needed for cardinal WELLBY use:
- C1: Phenomenal cardinality (subjective experiences have inherent magnitudes)
- C2: Linearity (equal scale steps = equal welfare differences)
- C3: Intertemporal comparability (same person uses scale consistently over time)
- C4: Interpersonal comparability (different people's reports are comparable)
This framework helps locate which assumptions are most load-bearing for a given comparison.
Where linear conversion is most likely to go wrong
A constant DALY↔WELLBY factor may be useful as a temporary decision heuristic, but it can fail in predictable ways. This section maps the main failure modes.
Severity and baseline dependence
The LS impact of a given health improvement may depend on the severity of the condition and the baseline LS of the beneficiary. Evidence suggests that people at very low baselines may show either larger or smaller LS responses to health changes, depending on context.
Mental health vs. physical health
EEPRU research found that SWB measures are generally less sensitive to physical health conditions than EQ-5D/SF-6D, while results for depression/mental health are more mixed. This suggests a single conversion factor may systematically mis-weight mental vs. physical health domains.
Duration and adaptation effects
DALYs treat duration as additive (more years = more burden), but LS may show adaptation effects. Someone who adapts to a chronic condition may report similar LS to a healthy person, even though DALYs continue accumulating.
The adaptation paradox for conversion
Adaptation creates a fundamental tension between DALY and WELLBY accounting:
- DALY perspective: A person with chronic paraplegia accumulates ~0.3-0.4 YLD per year indefinitely (based on disability weights).
- WELLBY perspective: After initial adjustment, LS may return close to pre-injury levels (substantial evidence of hedonic adaptation).
If adaptation is complete, a simple conversion implies the person is "not losing welfare" each year—even though DALYs continue accruing. This is either:
- Evidence that DALYs over-count chronic morbidity (the WELLBY perspective wins), or
- Evidence that LS under-counts genuine welfare losses that people have adapted to accepting (the DALY perspective wins).
The correct interpretation is a substantive philosophical question, not just a measurement issue (Frijters et al. 2024 discuss adaptation as a challenge for WELLBY interpretation).[9]
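The divergence can be made concrete with a toy simulation; the geometric adaptation path and every parameter below are invented assumptions, not empirical estimates:

```python
def cumulative_yld(years, disability_weight=0.35):
    """DALY accounting: YLD accrues linearly for as long as the condition lasts."""
    return disability_weight * years

def cumulative_wellby_loss(years, initial_ls_drop=2.0, recovery_rate=0.5):
    """WELLBY accounting: the LS deficit decays geometrically (hedonic adaptation)."""
    return sum(initial_ls_drop * (1 - recovery_rate) ** t for t in range(years))

daly_burden_10y = cumulative_yld(10)          # ≈ 3.5 YLD, still growing each year
wellby_loss_10y = cumulative_wellby_loss(10)  # ≈ 4.0 point-years, nearly flat after year 5
```

Under these assumptions the two accounts agree roughly over a decade but diverge without bound thereafter: the DALY total keeps climbing while the WELLBY loss plateaus.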
Nonlinearity at extremes
Both scales have ceiling and floor effects:
- LS is bounded at 0 and 10; people at high baselines have limited room to improve
- Disability weights are bounded at 0 and 1; "worse than death" states are controversial
- The relationship between health and LS may be nonlinear, especially at extremes
SD-equivalence fragility
The "1 SD ≈ 1 SD" mapping is particularly vulnerable to variance heterogeneity. If two interventions produce identical welfare impacts but are measured in populations with different baseline variance, they will generate different z-scores. The conversion becomes a function of sample properties, not just treatment effects.
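A two-line illustration of the variance problem, with invented numbers:

```python
def standardized_effect(raw_effect, population_sd):
    """z-style effect size: raw change divided by population SD."""
    return raw_effect / population_sd

# The same 0.5-point LS improvement measured in two populations:
low_var = standardized_effect(0.5, population_sd=1.0)    # 0.50 SD
high_var = standardized_effect(0.5, population_sd=2.0)   # 0.25 SD
# Identical welfare impacts look twice as "large" in the low-variance population.
```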
Where simple conversion may work
- Within-study comparisons using the same instruments
- Similar populations and health domains
- Marginal changes from moderate baselines
- Sensitivity analysis showing robust rankings
Where simple conversion is risky
- Cross-study synthesis with different instruments
- Mortality vs. morbidity comparisons
- Mental health vs. physical health domains
- Extreme severity or extreme baseline LS
- LMIC contexts with limited LS calibration data
Conversion Factor Sensitivity Demo
See how the relative ranking of two interventions changes as you vary the assumed DALY↔WELLBY conversion factor.
Note: This uses simplified assumptions. Real comparisons involve uncertainty in effect sizes, costs, and the conversion factor itself.
Practical guidance for funders now
Given the uncertainties above, what should funders actually do? This section offers a decision-oriented framework, not a single prescription.
Decision framework by data situation
The least harmful decision procedure
Rather than asking "what is the best exact conversion factor?", consider asking: "which mapping structure causes the least expected decision error?"
This reframing suggests:
- Present ranges, not point estimates: A distribution of X values (e.g., 4-10 WELLBYs per DALY) may be more honest than a single number.
- Report ranking robustness: If intervention A beats B under all plausible X values, the comparison is robust. If ranking reverses within the plausible range, flag this as a key uncertainty.
- Consider domain-specific factors: Use different X values for mortality vs. morbidity, or for physical vs. mental health, if evidence supports this.
- Prefer direct measurement where feasible: When possible, measure LS directly rather than converting from DALYs.
Worked example: Finding the "indifference threshold" X*
Suppose you're comparing two interventions:
- Intervention A: $50/person, generates 0.3 WELLBYs/person
- Intervention B: $100/person, averts 0.05 DALYs/person
Cost-effectiveness in WELLBYs per $1000:
- A: 0.3 / $50 × 1000 = 6 WELLBYs/$1000
- B (converted): 0.05 × X / $100 × 1000 = 0.5 × X WELLBYs/$1000
B beats A when 0.5X > 6, i.e., when X > 12.
This tells you: if you believe the conversion factor is below 12, fund A; if above 12, fund B. The "indifference threshold" X* = 12 is the key number for decision-making, not the conversion factor itself.
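The threshold calculation above as code, using the same hypothetical figures (the helper names are ours):

```python
def wellbys_per_1000(effect, cost):
    """Cost-effectiveness in WELLBYs per $1000 spent."""
    return effect / cost * 1000

def indifference_x(a_wellbys, a_cost, b_dalys, b_cost):
    """Conversion factor X* at which B's converted cost-effectiveness equals A's."""
    a_ce = wellbys_per_1000(a_wellbys, a_cost)      # A: 6 WELLBYs/$1000
    b_ce_per_x = wellbys_per_1000(b_dalys, b_cost)  # B: 0.5 × X WELLBYs/$1000
    return a_ce / b_ce_per_x

x_star = indifference_x(a_wellbys=0.3, a_cost=50, b_dalys=0.05, b_cost=100)
# x_star ≈ 12: fund A if your believed X is below it, B if above.
```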
Template: How to report conversion sensitivity in a CEA
When presenting cost-effectiveness analyses that involve DALY↔WELLBY conversion, consider including:
- Base case: State your assumed conversion factor and cite the source (e.g., "7 WELLBYs/QALY per UK Green Book").
- Sensitivity range: Report results at X = 4, 7, and 10 (or whatever range brackets the literature).
- Threshold analysis: Report X* at which ranking reverses. State whether X* falls within the plausible range.
- Robustness statement: "Intervention A is preferred under all conversion factors below [X*]" or "Ranking is sensitive to conversion assumptions."
This template makes your assumptions transparent and allows readers with different priors to interpret your results.
What evidence would reduce uncertainty most?
Priority evidence gaps
- Direct beneficiary tradeoff studies: How do beneficiaries themselves trade off health improvements against LS improvements? Stated preference methods could anchor conversion factors more directly.
- Joint measurement RCTs: Trials that measure both LS and health metrics (EQ-5D, DALYs) for the same intervention, allowing empirical estimation of the LS-health relationship.
- Domain-specific mappings: Evidence on whether mental health → LS mapping differs from physical health → LS mapping, and by how much.
- LMIC scale-use calibration: Cheap methods for identifying and adjusting scale-use heterogeneity in low-resource settings.
- Neutral point studies: Better estimates of the LS level equivalent to "as bad as death" across populations.
- SD interchangeability: Evidence on whether SD changes on mental health instruments (PHQ-9, etc.) correspond to comparable welfare changes as SD changes on LS.
- Conditions under which rankings change: Systematic analysis of when different conversion approaches lead to different top-charity recommendations.
How might these gaps be filled? Practical research designs
1. Beneficiary tradeoff studies:
- Use discrete choice experiments asking beneficiaries to choose between health improvements and income/wellbeing improvements.
- Could be embedded in existing RCT follow-up surveys at low marginal cost.
2. Joint measurement:
- Add LS questions (1-2 items) to health intervention trials that already measure EQ-5D or SF-6D.
- Cost: ~$1-5/participant for additional survey items; high value-of-information.
3. LMIC calibration:
- Benjamin et al.'s (2023, NBER WP 31728) scale-use adjustment methods could be adapted for LMIC populations using vignettes in local languages.[13]
- Could be combined with anchoring vignettes (King et al., 2004) for cross-population comparison.
Neutral Point / Mortality Demo
When comparing mortality-reducing interventions to non-mortality wellbeing programs, the assumed "death-equivalent" LS level becomes central.
Key insight: The UK guidance uses LS = 1 as a working assumption but cites evidence suggesting ~2 may be more accurate. This uncertainty propagates directly into mortality comparisons.
Bottom line
A single, universal DALY↔WELLBY conversion factor probably does not exist in any meaningful sense. The metrics were designed for different purposes, measure different things, and embed different assumptions about what matters for welfare.
The practical goal is not to find "the one true scalar conversion" but to choose the mapping structure that causes the least expected decision error given the information available. Current best practice is likely:
- Multi-method: Use more than one conversion approach and compare results
- Domain-sensitive: Allow different factors for different health domains if evidence supports
- Uncertainty-explicit: Present ranges, scenarios, or distributions rather than single numbers
- Decision-focused: Report whether rankings are robust across the plausible range of conversion factors
This page should be useful even for readers who are skeptical of WELLBYs, or skeptical of DALYs. The underlying question—how do we compare interventions that affect different outcomes?—does not go away by ignoring it. Making the mapping structure explicit is better than leaving it implicit.
A note for WELLBY skeptics
If you believe WELLBYs have fundamental measurement problems (scale-use heterogeneity, demand effects, philosophical objections to hedonism), the conversion framework here still applies—just with much wider uncertainty ranges or higher weight on DALY-based evidence.
The practical value of making conversion explicit is that it lets you:
- See how much your skepticism matters for specific decisions (threshold analysis)
- Communicate your reasoning transparently to others with different priors
- Update systematically as new evidence arrives
"WELLBYs are too unreliable to use" is itself an implicit conversion assumption (X ≈ 0 or undefined). Making it explicit is more honest.
A note for DALY skeptics
If you believe DALYs miss important welfare effects (mental health, non-health domains, adaptation, social context), you might prefer to:
- Use WELLBYs as the primary metric and convert DALYs to WELLBYs (rather than vice versa)
- Apply domain-specific adjustments to DALY-based estimates
- Flag DALY-only evidence as potentially understating welfare effects
The conversion framework accommodates this perspective—just flip the direction and acknowledge that mortality comparisons require additional neutral-point assumptions.
Prompts for workshop discussion
These prompts are designed to elicit participant reasoning and surface disagreements:
1. What conversion factor (or range) do you currently use in practice, and what is the basis for it? Is this explicit or implicit in your cost-effectiveness models?
2. Should the conversion factor vary by domain (mental health vs. physical health, mortality vs. morbidity)? What evidence would convince you to use domain-specific factors?
3. How should we handle the death-equivalence anchor? Is LS = 1, 2, or something else the right assumption? Does this depend on population or context?
4. When mapping depression scale improvements to WELLBYs (as in the StrongMinds analysis), what evidence would make the mapping credible? What is the minimum acceptable standard?
5. Is "minimize expected decision error" the right objective, or should we prioritize other properties (transparency, robustness, theoretical consistency)?
6. What single study or evidence type would most reduce your uncertainty about DALY↔WELLBY conversion?
References
1. The Unjournal Pivotal Questions database, codes DALY_01 / DALY_03 / DALY_05. See also: beliefs elicitation page.
2. WHO (2020). Methods and data sources for global burden of disease estimates. The framework distinguishes "loss of health" from welfare or quality-of-life.
3. WHO (2020). WHO methods and data sources for global burden of disease estimates 2000-2019. Technical paper.
4. NICE. "Quality-adjusted life year (QALY)." NICE Glossary.
5. HM Treasury (2021). Wellbeing Guidance for Appraisal: Supplementary Green Book Guidance.
6. Founders Pledge (2024). Internal cost-effectiveness framework documentation.
7. HM Treasury (2021). The guidance explicitly derives 7 WELLBYs per QALY from an 8-to-1 LS span.
8. Peasgood, T., et al. (2018). "The impact of health on wellbeing: A comparison of SWB and health utility instruments." Cited in UK Green Book guidance.
9. Frijters, P., et al. (2024). "Using wellbeing for public policy: taking stock." Nature Human Behaviour.
10. Mukuria, C., et al. (2016). EEPRU report comparing SWB measures and health measures.
11. GiveWell (2023). "Our Assessment of Happier Lives Institute's Cost-Effectiveness Analysis of StrongMinds."
12. Plant, M. (2025). "A Happy Possibility About Happiness Scales: An Exploration of the Cardinality Assumption." Working paper.
13. Benjamin, D.J., et al. (2023). "Adjusting for Scale-Use Heterogeneity in Self-Reported Well-Being." NBER WP 31728.
14. Bond, T.N., & Lang, K. (2019). "The Sad Truth about Happiness Scales." Journal of Political Economy.
15. OECD (2021). "United Kingdom." Case study in OECD Guidelines on Measuring Subjective Well-being.
16. HLI (2023). "StrongMinds cost-effectiveness analysis." Happier Lives Institute. Available at happierlivesinstitute.org.
17. "Least regret" (or "minimax regret") is a formal criterion from decision theory developed by Leonard Savage (1951). The approach chooses the action that minimizes the maximum regret—the difference between the outcome of the chosen action and the best possible outcome—across all possible states of the world. In the context of DALY↔WELLBY conversion, this means choosing mapping assumptions that minimize expected decision error across plausible true conversion values.