Inference for a single proportion

IMS, Ch. 16

Smith College

Apr 1, 2026

Inference for a single proportion

Nuclear Arms Reduction Survey

A simple random sample of 1,028 US adults in March 2013 found that 56% support nuclear arms reduction.

library(tidyverse)
library(openintro)
nuclear_survey |>
  group_by(arms_reduction) |>
  summarize(num_responses = n()) |>
  mutate(pct = num_responses / sum(num_responses))
# A tibble: 2 × 3
  arms_reduction num_responses   pct
  <fct>                  <int> <dbl>
1 against                  452 0.440
2 favor                    576 0.560
  • Does a majority support nuclear arms reduction?

Sample proportion (\(\hat{p}\))

library(infer)
p_hat <- nuclear_survey |>
  observe(response = arms_reduction, success = "favor", stat = "prop")
p_hat
Response: arms_reduction (factor)
# A tibble: 1 × 1
   stat
  <dbl>
1 0.560
  • Note: p_hat is a \(1 \times 1\) tibble (data frame)
  • p_hat$stat is the actual (scalar) value

NHST Setup

  • \(H_0: p = 0.5\)
  • \(H_A: p \neq 0.5\)
  • \(\alpha = 0.05\)
n <- nrow(nuclear_survey)
p_0 <- 0.5
alpha <- 0.05

Big Question

How do we construct the null distribution?

Three methods for inference

  1. Use probability theory
    • today
  2. Use normal approximation
    • also today
    • mathematical model (Ch. 16.2)
  3. Use simulations
    • parametric bootstrap (Ch. 16.1)

What do we need to know about the null dist?

  • Center, shape, and spread!
  • Once we have the null distribution…
    • What is the standard error?
    • Where does the test statistic lie in the null distribution?
    • What is the p-value?
    • What is our decision?

Method 3: Simulation

Parametric bootstrap (review)

library(infer)
nuclear_pbstrap <- nuclear_survey |>
  specify(response = arms_reduction, success = "favor") |>
  hypothesize(null = "point", p = p_0) |>
  generate(reps = 2000, type = "draw") |>
  calculate(stat = "prop")

Null distribution (simulated)

null_dist <- nuclear_pbstrap |>
  ggplot(aes(x = stat)) +
  geom_density(fill = "dark gray") + 
  geom_vline(xintercept = p_0, linetype = 3) + 
  geom_vline(xintercept = p_hat$stat, linetype = 2) + 
  geom_errorbarh(
    aes(
      xmax = p_0 + sd(nuclear_pbstrap$stat), 
      xmin = p_0 - sd(nuclear_pbstrap$stat), 
      y = 12
    )
  )

Null distribution (simulated, in black)

null_dist

Standard error and p-value

  • Standard error (\(SE_{\hat{p}}\)):
se_bstrap <- nuclear_pbstrap |>
  summarize(SE = sd(stat)) |>
  pull(SE)
se_bstrap
[1] 0.01570862
  • p-value:
p_bstrap <- nuclear_pbstrap |>
  get_p_value(p_hat$stat, direction = "two-sided") |>
  pull(p_value)
p_bstrap
[1] 0
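A simulated p-value of exactly 0 only means that none of the replicates was as extreme as the observed statistic; the smallest value the simulation can resolve is set by the number of replicates. A quick sketch:

```r
# With 2000 replicates, a reported p-value of 0 should be read as
# "smaller than the simulation can resolve", i.e., p < 1 / 2000.
reps <- 2000
1 / reps
# prints: [1] 5e-04
```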

Your turn

  • Does a majority support nuclear arms reduction?

Why simulate the null?

Pros:

  • intuitive
  • no math
  • few assumptions

Cons:

  • requires computer
  • requires coding ability
  • solution is approximate
  • non-deterministic

Method 1: Probability theory

Probability theory

  • Compute exact null distribution
  • Let \(X \sim Bernoulli(p)\). Then,

    \[ \mathbb{E}[X] = p, \quad Var(X) = p(1-p) \]

  • Let \(Y = X_1 + \cdots + X_n\). Then \(Y \sim Binom(n, p)\) and,

    \[ \mathbb{E}[Y] = np, \quad Var(Y) = np(1-p) \]
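These binomial moment formulas can be checked numerically straight from the pmf. A sketch using the survey's \(n\) (not part of the derivation):

```r
# Verify E[Y] = np and Var(Y) = np(1-p) by summing over the binomial pmf.
n <- 1028
p <- 0.5
y <- 0:n
mean_Y <- sum(y * dbinom(y, n, p))               # should equal n * p = 514
var_Y  <- sum((y - mean_Y)^2 * dbinom(y, n, p))  # should equal n * p * (1 - p) = 257
c(mean_Y, var_Y)
```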

Probability theory (cont’d)

  • \(Z = Y/n\) is a random variable giving the mean of \(n\) draws from \(X\)!

  • Then,

    \[ \mathbb{E}[Z] = p \]

  • And,

    \[ Var(Z) = \frac{1}{n^2} \cdot np(1-p) = \frac{p(1-p)}{n} \]

  • And, \(sd(Z) = \sqrt{\frac{p(1-p)}{n}}\) is the standard error!
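The SE formula can be sanity-checked by simulation. A sketch, with \(n\) and \(p\) restated so the chunk runs on its own:

```r
# Compare the empirical sd of many simulated sample proportions
# to the theoretical value sqrt(p * (1 - p) / n).
set.seed(1)
n <- 1028
p <- 0.5
z <- rbinom(10000, size = n, prob = p) / n  # 10000 draws of Z = Y / n
c(empirical = sd(z), theoretical = sqrt(p * (1 - p) / n))
```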

Properties of null distribution

  • Same shape as binomial distribution curve
  • In our case, under \(H_0\), \(\hat{p}\) has the distribution of \(Z\) with each \(\mathbb{E}[X_i] = p_0\)
  • Thus, \(SE_{\hat{p}} = \sqrt{\frac{p_0(1-p_0)}{n}}\)
se <- sqrt(p_0 * (1 - p_0) / n)

dbinom_p <- function (x, size, prob, log = FALSE) {
  size * dbinom(round(x * size), size, prob, log)
}

null_dist_2 <- null_dist +
  geom_function(
    fun = dbinom_p, color = "red", 
    args = list(size = n, prob = p_0)
  ) + 
  geom_errorbarh(
    aes(xmax = p_0 + se, xmin = p_0 - se, y = 10),
    color = "red"
  )

Null distribution (exact, in red)

null_dist_2

Standard error and p-value

  • Standard error (\(SE_{\hat{p}}\)):
se_math <- se
se_math
[1] 0.01559457
  • p-value:
p_math <- 2 * pbinom(n * p_hat$stat, size = n, prob = p_0, lower.tail = FALSE)
p_math
[1] 9.494704e-05

Your turn

  • Does a majority support nuclear arms reduction?

  • Compare the SE and p-value with those from the previous method. Are they meaningfully different?

Why compute the null?

Pros:

  • correct answer
  • deterministic

Cons:

  • requires doing the math by hand; it was manageable here, but gets much harder in other cases
  • only possible for simple situations
  • for large \(n\), evaluating the binomial distribution can be numerically costly

Method 2: Normal approximation

Normal approximation

  • By CLT, for \(n\) large enough, null distribution is approximately normal

  • If \(np_0(1-p_0) > 10\), then approximation is reasonably good

  • Use \(SE_{\hat{p}} = \sqrt{\frac{p_0(1-p_0)}{n}}\) (from before)

    • 💡💡 Now you know why!! 💡💡
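Checking the stated condition for our data. A sketch, with `n` and `p_0` restated from the earlier slides:

```r
# Large-sample check: n * p_0 * (1 - p_0) should exceed 10.
n <- 1028
p_0 <- 0.5
n * p_0 * (1 - p_0)
# prints: [1] 257
```

At 257, the condition is met comfortably, so the normal approximation should be good here.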

Null distribution (normal approx.)

null_dist_3 <- null_dist_2 +
  geom_function(
    fun = dnorm, color = "blue", 
    args = list(mean = p_0, sd = se)
  ) + 
  geom_errorbarh(
    aes(xmax = p_0 + se, xmin = p_0 - se, y = 8),
    color = "blue"
  )

Null distribution (approx., in blue)

null_dist_3

Standard error and p-value

  • Standard error (\(SE_{\hat{p}}\)):
se_approx <- se
se_approx
[1] 0.01559457
  • p-value:
p_approx <- 2 * pnorm(p_hat$stat, mean = p_0, sd = se, lower.tail = FALSE)
p_approx
[1] 0.0001099777

Z-scores

  • Compute z-score:
z_hat <- (p_hat$stat - p_0) / se
z_hat
[1] 3.867454
  • Compare with the critical value:
z_star <- qnorm(alpha / 2, lower.tail = FALSE)
z_star
[1] 1.959964
  • Is \(\hat{p}\) in the rejection region?
p_0 + z_star * se
[1] 0.5305648
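Putting the pieces together, the rejection region is two-sided. A sketch of the full check, restating the values computed above (the observed proportion 576/1028 comes from the survey):

```r
# Two-sided rejection region for p_hat under H_0, and the decision.
n <- 1028
p_0 <- 0.5
alpha <- 0.05
se <- sqrt(p_0 * (1 - p_0) / n)
z_star <- qnorm(alpha / 2, lower.tail = FALSE)
lower <- p_0 - z_star * se  # ~0.469
upper <- p_0 + z_star * se  # ~0.531
p_hat_obs <- 576 / 1028     # observed sample proportion, ~0.560
p_hat_obs < lower || p_hat_obs > upper
# prints: [1] TRUE  -- p_hat is in the rejection region, so reject H_0
```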

Alternatively, using infer

nuclear_survey |>
  specify(response = arms_reduction, success = "favor") |>
  hypothesize(null = "point", p = p_0) |>
  assume("z") |>
  get_p_value(z_hat, direction = "two-sided")
# A tibble: 1 × 1
   p_value
     <dbl>
1 0.000110

Your turn

  • Does a majority support nuclear arms reduction?

  • Compare the SE and p-value with those from the previous method. Are they meaningfully different?

tibble(
  method = c("bstrap", "math", "approx"),
  se = c(se_bstrap, se_math, se_approx),
  p_value = c(p_bstrap, p_math, p_approx)
)
# A tibble: 3 × 3
  method     se   p_value
  <chr>   <dbl>     <dbl>
1 bstrap 0.0157 0        
2 math   0.0156 0.0000949
3 approx 0.0156 0.000110 

Why approximate the null?

Pros:

  • provable approximation quality
  • no computer required
    • (kind of)
  • deterministic
  • widely used

Cons:

  • solution is approximate, and the approximation can be poor

  • SE formula comes out of nowhere

  • harder to connect to big ideas?