Is Random Forest a Superior Methodology for Predicting Poverty? : An Empirical Assessment

Bibliographic Details
Main Authors: Sohnesen, Thomas Pave; Stender, Niels
Format: Working Paper
Language: English (en_US)
Published: World Bank, Washington, DC 2016
Online Access: http://documents.worldbank.org/curated/en/2016/03/26089791/random-forest-superior-methodology-predicting-poverty-empirical-assessment
http://hdl.handle.net/10986/24154
id okr-10986-24154
recordtype oai_dc
spelling okr-10986-24154 2021-04-23T14:04:19Z 2016-04-26T16:49:08Z 2016-03 CC BY 3.0 IGO http://creativecommons.org/licenses/by/3.0/igo/ Publications & Research :: Policy Research Working Paper
repository_type Digital Repository
institution_category Foreign Institution
institution Digital Repositories
building World Bank Open Knowledge Repository
collection World Bank
language English
en_US
topic PREDICTIONS
POOR HOUSEHOLD
CONSUMPTION EXPENDITURES
HOUSEHOLD SIZE
HOUSEHOLD SURVEY
AGRICULTURAL GROWTH
CONSUMPTION
POVERTY REDUCTION
IMPACT ON POVERTY
POVERTY RATES
ERRORS
FARMER
POVERTY RATE
FOOD CONSUMPTION
INCOME
LINEAR REGRESSION
POVERTY ESTIMATES
ALGORITHMS
HOUSEHOLD SURVEYS
PROGRAMS
CONSUMPTION DATA
HOUSING
AGRICULTURAL PRACTICES
IMPACTS
NATIONAL POVERTY
SAMPLES
RURAL
VARIABLES
MEASUREMENT
COUNTING
HOUSEHOLD BUDGET
CONSUMPTION AGGREGATE
QUALITY
SURVEYS
SOCIAL ASSISTANCE
MEASURES
INSTRUMENTS
TARGETING
RANDOM SAMPLES
CONSUMPTION EXPENDITURE
RURAL AREAS
CROSS‐SECTION DATA
WELFARE MEASURES
WELFARE INDICATORS
PANEL DATA SETS
REGIONS
STATISTICS
EVALUATION
SIGNIFICANCE LEVEL
POOR HOUSEHOLDS
SAMPLING
POVERTY
HOUSEHOLD HEAD
HOUSEHOLD CONSUMPTION
ECONOMETRICS
STANDARD ERRORS
POVERTY STATUS
POOR
PREDICTION
POVERTY ASSESSMENT
LEARNING
INDICATORS
RESEARCH
CONSUMPTION POVERTY
OUTCOMES
SOCIAL INDICATORS
MISSING OBSERVATIONS
INEQUALITY
relation Policy Research Working Paper;No. 7612
description Random forest is a common method for data-driven prediction in many fields of research, but it is rarely used in economics or for predicting poverty. A comparison of out-of-sample predictions from surveys for the same year in six countries shows that random forest is often more accurate than current common practice (multiple imputation with variables selected by stepwise selection and Lasso), suggesting that the method could contribute to better poverty predictions. However, none of the methods consistently provides accurate predictions of poverty over time, which highlights that fitting a model within a single year, by any method, is not always sufficient by itself for accurate predictions of poverty over time.
format Working Paper
author Sohnesen, Thomas Pave
Stender, Niels
title Is Random Forest a Superior Methodology for Predicting Poverty? : An Empirical Assessment
publisher World Bank, Washington, DC
publishDate 2016
url http://documents.worldbank.org/curated/en/2016/03/26089791/random-forest-superior-methodology-predicting-poverty-empirical-assessment
http://hdl.handle.net/10986/24154
_version_ 1764455791054553088