Working paper

Optimal data collection for randomized control trials

In a randomized control trial, the precision of an average treatment effect estimator and the power of the corresponding t-test can be improved either by collecting data on additional individuals or by collecting additional covariates that predict the outcome variable. We propose using pre-experimental data, such as other similar studies, a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, or to maximize the corresponding t-test's power, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple and easy to implement in the presence of a large number of potential covariates, and that does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to reductions of up to 58% in the costs of data collection, or improvements of the same magnitude in the precision of the treatment effect estimator.
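
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' procedure; it is a simplified orthogonal greedy selection under assumptions made up for illustration: each individual costs a fixed `base_cost` plus the cost of every collected covariate, the affordable sample size is the budget divided by that per-individual cost, and the estimator's mean squared error is proxied by the residual variance of the outcome (estimated on pre-experimental data) divided by the sample size. The function name `choose_design` and all parameter names are hypothetical.

```python
import numpy as np


def choose_design(X, y, covariate_costs, base_cost, budget):
    """Pick covariates and a sample size by a simplified orthogonal greedy search.

    X, y            : pre-experimental covariates (n_obs x p) and outcome (n_obs)
    covariate_costs : assumed per-individual cost of collecting each covariate
    base_cost       : assumed per-individual cost of collecting the outcome alone
    budget          : total data collection budget
    Returns a dict, or None if the budget cannot pay for two individuals.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # centering absorbs the intercept
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    p = X.shape[1]

    selected = []
    is_selected = np.zeros(p, dtype=bool)
    residual = yc.copy()
    best = None

    for _ in range(p + 1):
        # Evaluate the current covariate set: more covariates, fewer individuals.
        unit_cost = base_cost + sum(covariate_costs[j] for j in selected)
        n = int(budget // unit_cost)
        if n < 2:
            break
        mse_proxy = residual.var() / n
        if best is None or mse_proxy < best["mse_proxy"]:
            best = {"covariates": list(selected),
                    "sample_size": n,
                    "mse_proxy": mse_proxy}

        # Greedy step: take the covariate most correlated with the residual.
        candidates = (~is_selected) & (norms > 0)
        if not candidates.any():
            break
        scores = np.full(p, -np.inf)
        scores[candidates] = (np.abs(Xc[:, candidates].T @ residual)
                              / norms[candidates])
        j = int(np.argmax(scores))
        selected.append(j)
        is_selected[j] = True

        # Orthogonal step: refit least squares on all selected covariates.
        coef, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ coef

    return best


if __name__ == "__main__":
    # Synthetic stand-in for pre-experimental data (illustrative only).
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(500, 3))
    y_demo = 3.0 * X_demo[:, 0] + rng.normal(size=500)
    print(choose_design(X_demo, y_demo, [0.1, 0.1, 0.1], 1.0, 1000.0))
```

In this sketch a covariate is worth collecting only when the drop in residual variance outweighs the individuals that its cost removes from the sample, which is the budget trade-off the abstract describes.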

Language
English

Bibliographic citation
Series: cemmap working paper ; No. CWP15/17

Classification
Economics
Large Data Sets: Modeling and Analysis
Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
Subject
randomized control trials
big data
data collection
optimal survey design
orthogonal greedy algorithm
survey costs

Event
Intellectual creation
(who)
Carneiro, Pedro M.
Lee, Sokbae
Wilhelm, Daniel
Event
Publication
(who)
Centre for Microdata Methods and Practice (cemmap)
(where)
London
(when)
2017

DOI
doi:10.1920/wp.cem.2017.1517
Last update
10.03.2025, 11:42 AM CET

Data provider

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.

