Proposal
Description
Please review the datasets and accompanying data dictionaries on the Project Overview page. Your final project should use data from one of these sources, but beyond that the choice of topic is left up to you. Think about:
- what questions you’d like to answer with the data
- what questions you actually can answer with the data
Count on brainstorming at least half a dozen ideas before you can groom one of them into a mature proposal.
Format
Your final project proposal should be submitted as a PDF file, rendered from a Quarto file (*.qmd).
Please include (at least) the following in your YAML header:
---
title: "Write something interesting here, don't just call it 'Final Project'!!"
author:
  - name:
      given: First
      family: Student
    affiliation:
      - id: smith
        name: Smith College
        department: Statistical & Data Sciences
        address: 44 College Lane
        city: Northampton, MA
        country: United States of America
        postal-code: 01063
        url: https://www.smith.edu/
    email: fstudent@smith.edu
  - name:
      given: Second
      family: Student
    affiliation:
      - ref: smith
  - name:
      given: Third
      family: Student
    affiliation:
      - ref: smith
date: 2026-05-08
date-format: medium
editor: source
number-sections: true
format: pdf
execute:
  echo: true
knitr:
  opts_chunk:
    message: false
---
- Please make sure that all members of your group are listed in the author: field in the YAML header in your Quarto document.
- Please write a descriptive title for your project in the title: field in the YAML header in your Quarto document.
Content
In addition to the relevant information in the YAML header above, your proposal should contain the following content, using section headers appropriately.
Introduction
Describe the general topic/phenomenon you want to study, as well as some focused questions that you hope to answer. Clearly identify one or two (co-)primary hypotheses of interest; also state any other secondary/exploratory hypotheses that you intend to assess.
Data
Describe the data that you plan to use.
Identify the study population. Specify what the observational units are (i.e. the rows of the data frame), and describe the larger population/phenomenon to which you will generalize your results. Briefly discuss any potential limitations, biases, or threats to generalizability that may be present in your sample.
What is the response variable for your primary hypothesis (or hypotheses)? What are its units? Estimate the range of possible values that it may take on.
Describe the explanatory variables that you’ll examine for each observational unit (i.e. the columns of the data frame). Carefully define each variable and describe how each was measured.
- For categorical variables, list the possible categories.
- For quantitative variables, specify the units of measurement.
You may add additional variables to your analysis as the project progresses, but you should have at least two explanatory variables already: at least one quantitative variable and at least one categorical variable.
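Before writing this section, it can help to confirm that the data actually supports the descriptions asked for above. The following is a minimal sketch, not a required method: it uses only Python's standard library and a tiny made-up table. The column names (price, sqft, neighborhood), the values, and the summarize helper are all hypothetical stand-ins for whichever dataset and tools your group chooses.

```python
import csv
import io

# Hypothetical data for illustration only; replace with your chosen dataset.
RAW = """price,sqft,neighborhood
250000,1400,North
310000,1850,South
199000,1100,North
420000,2300,East
"""


def summarize(rows, response, quantitative, categorical):
    """Collect the basic facts the Data section asks you to report."""
    response_values = [float(r[response]) for r in rows]
    quant_values = [float(r[quantitative]) for r in rows]
    return {
        # Each row is one observational unit.
        "n_units": len(rows),
        # Observed (min, max) of the response variable.
        "response_range": (min(response_values), max(response_values)),
        # Observed (min, max) of the quantitative explanatory variable.
        "quantitative_range": (min(quant_values), max(quant_values)),
        # Distinct levels of the categorical explanatory variable.
        "categories": sorted({r[categorical] for r in rows}),
    }


rows = list(csv.DictReader(io.StringIO(RAW)))
summary = summarize(rows, "price", "sqft", "neighborhood")
for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that the observed minimum and maximum are only a starting point: the range of possible values for the response may be wider than what appears in the sample, and the units still have to come from the data dictionary.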
Submission
Please have one member of your group turn in the rendered PDF to Moodle by the due date.