Proposal

Description

Please review the datasets and accompanying data dictionaries on the Project Overview page. Your final project should use data from one of these sources, but beyond that the choice of topic is left up to you. Think about:

  • what questions you’d like to answer with the data
  • what questions you actually can answer with the data

Expect to brainstorm at least half a dozen ideas before you can refine one of them into a mature proposal.

Format

Your final project proposal should be submitted as a PDF file, rendered from a Quarto file (*.qmd).

Please include (at least) the following in your YAML header:

---
title: "Write something interesting here, don't just call it 'Final Project'!!"
author:
  - name: 
      given: First 
      family: Student
    affiliation:
      - id: smith
        name: Smith College
        department: Statistical & Data Sciences
        address: 44 College Lane
        city: Northampton, MA
        country: United States of America
        postal-code: 01063
        url: https://www.smith.edu/
    email: fstudent@smith.edu
  - name:
      given: Second
      family: Student
    affiliation:
      - ref: smith
  - name:
      given: Third
      family: Student
    affiliation:
      - ref: smith
date: 2026-05-08
date-format: medium
editor: source
number-sections: true
format: pdf
execute: 
  echo: true
knitr:
  opts_chunk: 
    message: false
---
  • Please make sure that all members of your group are listed in the author: field in the YAML header in your Quarto document.
  • Please write a descriptive title for your project in the title: field in the YAML header in your Quarto document.

Content

In addition to the relevant information in the YAML header above, your proposal should contain the following content, using section headers appropriately.

Introduction

Describe the general topic/phenomenon you want to study, as well as some focused questions that you hope to answer. Clearly identify one or two (co-)primary hypotheses of interest; also state any other secondary/exploratory hypotheses that you intend to assess.

Data

  1. Describe the data that you plan to use.

  2. Identify the study population. Specify what the observational units are (i.e. the rows of the data frame), and describe the larger population/phenomenon to which you will generalize your results. Briefly discuss any potential limitations, biases, or threats to generalizability that may be present in your sample.

  3. What is the response variable for your primary hypothesis (or hypotheses)? What are its units? Estimate the range of possible values that it may take on.

  4. Describe the explanatory variables that you’ll examine for each observational unit (i.e. the columns of the data frame). Carefully define each variable and describe how each was measured.

    • For categorical variables, list the possible categories.
    • For quantitative variables, specify the units of measurement.

    You may add additional variables to your analysis as the project progresses, but you should have at least two explanatory variables already: at least one quantitative variable and at least one categorical variable.
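
As a starting point for the Data section, a short R chunk along the following lines can help you answer items 2 through 4. This is only a minimal sketch: it uses R's built-in iris data frame as a stand-in (it is not one of the project datasets), and the names proposal_data, response_range, and the chosen columns are placeholders that you would swap for your own data and variables.

```r
# Stand-in data: the built-in `iris` data frame, NOT one of the course datasets.
# Replace this with the data set you chose from the Project Overview page.
proposal_data <- iris

# Item 2: the observational units are the rows (here, one row per flower).
n_units <- nrow(proposal_data)

# Item 3: observed range of the response variable.
# Note that the range of *possible* values may be wider than what you observe.
response_range <- range(proposal_data$Sepal.Length, na.rm = TRUE)

# Item 4: categories of a categorical explanatory variable.
species_levels <- levels(proposal_data$Species)

# Item 4: numeric summary of a quantitative explanatory variable.
petal_summary <- summary(proposal_data$Petal.Length)

# Check that the explanatory variables include at least one quantitative
# and at least one categorical variable, as the proposal requires.
explanatory <- proposal_data[, c("Petal.Length", "Species")]
has_quantitative <- any(sapply(explanatory, is.numeric))
has_categorical <- any(sapply(explanatory, is.factor))

print(n_units)
print(response_range)
print(species_levels)
print(petal_summary)
print(c(quantitative = has_quantitative, categorical = has_categorical))
```

Output like this does not replace the written descriptions: units of measurement and how each variable was measured still have to come from the data dictionary.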

Submission

Please have one member of your group turn in the rendered PDF to Moodle by the due date.