Complex Survey Data Analysis

A Tidy Introduction with {srvyr} and {survey}

Stephanie A. Zimmer

RTI International

Rebecca J. Powell

Fors Marsh

Isabella C. Velásquez

Posit

2025-08-08

Introduction

About Us

  • Stephanie Zimmer, RTI International

  • Rebecca Powell, Fors Marsh

  • Isabella Velásquez, Posit

Prerequisites - tidyverse familiarity

  • Selecting a set of variables (select(starts_with("TOT")))
  • Creating new variables with mutate()
  • Summarizing data with summarize()
  • Using group_by() with summarize() to create group summaries
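
As a quick refresher, the four prerequisite verbs can be sketched on a small made-up data frame. The data, the column names (TOTCOST, TOTKWH), and the object names below are hypothetical and chosen only to echo the starts_with("TOT") example; they are not the tutorial's data.

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  region  = c("North", "North", "South", "South"),
  TOTCOST = c(10, 20, 30, 40),
  TOTKWH  = c(100, 200, 300, 400),
  other   = c(1, 2, 3, 4)
)

# select(): keep region plus every column whose name starts with "TOT"
tot_only <- toy %>%
  select(region, starts_with("TOT"))

# mutate(): create a new variable from existing ones
with_rate <- tot_only %>%
  mutate(cost_per_kwh = TOTCOST / TOTKWH)

# summarize(): collapse the whole data frame to a single row
overall <- with_rate %>%
  summarize(mean_cost = mean(TOTCOST))

# group_by() + summarize(): one summary row per group
by_region <- with_rate %>%
  group_by(region) %>%
  summarize(mean_cost = mean(TOTCOST))
```

If each of these steps reads naturally, the survey-specific syntax later in the tutorial should feel familiar, since it follows the same verb-and-pipe pattern.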

Background

  • This tutorial largely builds on our book, Exploring Complex Survey Data Analysis Using R
  • The book covers additional topics beyond this tutorial, including:
    • Overview of survey process
    • Linear regression and logistic regression
    • Communication of results (tables and plots)
    • Reproducible research best practices
  • Not covered
    • Weighting (calibration, post-stratification, raking, etc.)
    • Survival analysis
    • Nonlinear models

How is survey analysis different?

  • Data often includes weights, which scale each response up to represent the population of interest
  • Samples are often drawn using a complex design with strata or clusters, which changes how standard errors must be calculated
  • Fortunately, the packages we discuss today do the hard math for you using tidyverse syntax you’re familiar with
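
To make the points above concrete, here is a minimal sketch that compares an unweighted mean with a weighted one, then lets {srvyr} produce the estimate and its standard error. It assumes the apistrat example data that ships with {survey} (a stratified sample of California schools, with strata in stype, weights in pw, and a finite population correction in fpc). That dataset and the object names below are illustrative choices, not necessarily what the tutorial itself uses.

```r
library(dplyr)
library(srvyr)

# Example data bundled with {survey}; loads apistrat among other objects
data(api, package = "survey")

# Unweighted mean: treats every sampled school as equally representative
unweighted <- mean(apistrat$api00)

# Weighted mean by hand: each sampled school stands in for `pw` schools
weighted_by_hand <- sum(apistrat$pw * apistrat$api00) / sum(apistrat$pw)

# Survey design object: records strata, weights, and the finite
# population correction so standard errors reflect the sample design
strat_des <- apistrat %>%
  as_survey_design(strata = stype, weights = pw, fpc = fpc)

# survey_mean() returns the point estimate and its standard error
# (columns api00_mean and api00_mean_se)
est <- strat_des %>%
  summarize(api00_mean = survey_mean(api00))
```

The point estimate from survey_mean() matches the hand-computed weighted mean. What the design object adds is a standard error that accounts for the stratification, which a plain mean() or sd() call cannot provide.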

Overview of tutorial

  • At the end of this tutorial, you should be able to
    • Calculate point estimates and their standard errors with survey data
      • Proportions, totals, and counts
      • Means, quantiles, and correlations
    • Perform t-tests and chi-squared tests
    • Specify a survey design in R to create a survey object

Overview of survey process

flowchart TD
  A[Survey Concept]-->B[Sampling Design]
  A-->C[Questionnaire Design]
  A-->D[Data Collection Planning]
  B-->E[Data Collection]
  C-->E
  D-->E
  E-->F[Post-Survey Processing]
  F-->G[Analysis]
  G-->H[Reporting]
  style G fill:#ff8484

Roadmap for today

  • Getting started
  • Descriptive analysis
  • Statistical testing
  • Survey design objects

Logistics