Major Assignment 1: Short Article

Instructions

Write an original piece of investigative data journalism of approximately 500 words. Your piece must include one original source.

Data source

For this story you will pick a data set from the Massachusetts Education-to-Career Research & Data Hub.

For example, you could choose the MCAS Achievement Results data set.

  • Read about the variables in the data set!
  • Click on Actions -> Query data…
  • In this case, we set SY to 2025, DIST_NAME to “Northampton”, and TEST_GRADE to “03”. This limits the results to 107 rows!
  • Click on Export, then:
    • Choose API Endpoint
    • Choose Data Format CSV
    • Select Version SODA2
    • Copy the URL
  • Paste the URL into R and feed it to readr::read_csv()
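The copied URL is percent-encoded, so the query inside it is hard to read. As a minimal sketch using only base R, one filter clause from the URL in this example can be decoded to show what it says. The object names `encoded`, `decoded`, and `round_trip` are placeholders, not part of the assignment.

```r
# One filter clause taken from the example URL, still percent-encoded
encoded <- "caseless_one_of(%60dist_name%60%2C%20%22Northampton%22)"

# URLdecode() turns %60 into a backtick, %2C into a comma,
# %20 into a space, and %22 into a double quote
decoded <- URLdecode(encoded)
cat(decoded, "\n")

# Re-encoding and then decoding again should return the same readable clause
round_trip <- URLdecode(URLencode(decoded, reserved = TRUE))
```

Reading the decoded clause is a quick way to confirm that the filters chosen in the browser are the ones carried by the URL.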
library(tidyverse)
mcas <- read_csv("https://educationtocareer.data.mass.gov/resource/i9w6-niyt.csv?$query=SELECT%0A%20%20%60sy%60%2C%0A%20%20%60dist_code%60%2C%0A%20%20%60dist_name%60%2C%0A%20%20%60org_code%60%2C%0A%20%20%60org_name%60%2C%0A%20%20%60org_type%60%2C%0A%20%20%60test_grade%60%2C%0A%20%20%60subject_code%60%2C%0A%20%20%60stu_grp%60%2C%0A%20%20%60m_plus_e_cnt%60%2C%0A%20%20%60m_plus_e_pct%60%2C%0A%20%20%60e_cnt%60%2C%0A%20%20%60e_pct%60%2C%0A%20%20%60m_cnt%60%2C%0A%20%20%60m_pct%60%2C%0A%20%20%60pm_cnt%60%2C%0A%20%20%60pm_pct%60%2C%0A%20%20%60nm_cnt%60%2C%0A%20%20%60nm_pct%60%2C%0A%20%20%60stu_cnt%60%2C%0A%20%20%60stu_part_pct%60%2C%0A%20%20%60avg_scaled_score%60%2C%0A%20%20%60avg_sgp%60%2C%0A%20%20%60avg_sgp_incl%60%2C%0A%20%20%60ach_percentile%60%2C%0A%20%20%60district_and_school%60%0AWHERE%0A%20%20caseless_one_of(%60dist_name%60%2C%20%22Northampton%22)%0A%20%20AND%20caseless_one_of(%60sy%60%2C%20%222025%22)%0A%20%20AND%20caseless_one_of(%60test_grade%60%2C%20%2203%22)")
glimpse(mcas)
Rows: 107
Columns: 26
$ sy                  <dbl> 2025, 2025, 2025, 2025, 2025, 2025, 2025, 2025, 20…
$ dist_code           <chr> "02100000", "02100000", "02100000", "02100000", "0…
$ dist_name           <chr> "Northampton", "Northampton", "Northampton", "Nort…
$ org_code            <chr> "02100029", "02100029", "02100005", "02100005", "0…
$ org_name            <chr> "R. K. Finn Ryan Road", "R. K. Finn Ryan Road", "B…
$ org_type            <chr> "Public School", "Public School", "Public School",…
$ test_grade          <chr> "03", "03", "03", "03", "03", "03", "03", "03", "0…
$ subject_code        <chr> "ELA", "MATH", "ELA", "MATH", "ELA", "MATH", "ELA"…
$ stu_grp             <chr> "Title I", "Title I", "All Students", "All Student…
$ m_plus_e_cnt        <dbl> 19, 14, 11, 16, 4, 5, 3, 5, 0, 2, 1, 2, 7, 10, 10,…
$ m_plus_e_pct        <dbl> 0.50, 0.36, 0.31, 0.43, 0.29, 0.36, 0.16, 0.25, 0.…
$ e_cnt               <dbl> 3, 2, 2, 5, 1, 2, 1, 2, 0, 1, 0, 0, 1, 3, 2, 5, 1,…
$ e_pct               <dbl> 0.08, 0.05, 0.06, 0.14, 0.07, 0.14, 0.05, 0.10, 0.…
$ m_cnt               <dbl> 16, 12, 9, 11, 3, 3, 2, 3, 0, 1, 1, 2, 6, 7, 8, 9,…
$ m_pct               <dbl> 0.42, 0.31, 0.25, 0.30, 0.21, 0.21, 0.11, 0.15, 0.…
$ pm_cnt              <dbl> 17, 15, 20, 8, 8, 2, 12, 5, 10, 3, 11, 5, 11, 6, 9…
$ pm_pct              <dbl> 0.45, 0.38, 0.56, 0.22, 0.57, 0.14, 0.63, 0.25, 0.…
$ nm_cnt              <dbl> 2, 10, 5, 13, 2, 7, 4, 10, 2, 6, 4, 10, 3, 6, 1, 3…
$ nm_pct              <dbl> 0.05, 0.26, 0.14, 0.35, 0.14, 0.50, 0.21, 0.50, 0.…
$ stu_cnt             <dbl> 38, 39, 36, 37, 14, 14, 19, 20, 12, 11, 16, 17, 21…
$ stu_part_pct        <dbl> 0.98, 1.00, 0.86, 0.91, 0.88, 0.88, 0.77, 0.85, 0.…
$ avg_scaled_score    <dbl> 499, 487, 490, 491, 488, 486, 483, 479, 478, 478, …
$ avg_sgp             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ avg_sgp_incl        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ ach_percentile      <dbl> 90, 44, 34, 35, 27, 29, 37, 19, 30, 33, 19, 5, 43,…
$ district_and_school <chr> "Northampton - R. K. Finn Ryan Road", "Northampton…
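Before writing about any of these numbers, it helps to check how the columns relate to one another. The sketch below retypes the first two rows of the output above by hand and tests three relationships. Reading `e`, `m`, `pm`, and `nm` as the four achievement levels is an assumption based on the column names, so confirm it against the data set's documentation; the object names `first_rows` and `checks` are placeholders.

```r
library(tibble)
library(dplyr)

# First two rows of the glimpse() output, typed in by hand
first_rows <- tribble(
  ~subject_code, ~e_cnt, ~m_cnt, ~pm_cnt, ~nm_cnt, ~m_plus_e_cnt, ~m_plus_e_pct, ~stu_cnt,
  "ELA",              3,     16,      17,       2,            19,          0.50,       38,
  "MATH",             2,     12,      15,      10,            14,          0.36,       39
)

checks <- mutate(
  first_rows,
  # Assumed: m_plus_e_cnt is the sum of the e and m counts
  sum_m_e = e_cnt + m_cnt,
  # Assumed: the four level counts add up to stu_cnt
  sum_all = e_cnt + m_cnt + pm_cnt + nm_cnt,
  # Assumed: m_plus_e_pct is m_plus_e_cnt / stu_cnt, rounded to two places
  pct_check = round(m_plus_e_cnt / stu_cnt, 2)
)

select(checks, subject_code, sum_m_e, m_plus_e_cnt, sum_all, stu_cnt, pct_check, m_plus_e_pct)
```

All three relationships hold for these two rows, which also shows that the `_pct` columns are stored as proportions between 0 and 1 rather than as percentages.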

Then you can make a plot!

ggplot(mcas, aes(y = avg_scaled_score, x = org_name, color = stu_grp, size = stu_cnt)) +
  geom_point() +
  facet_wrap(vars(subject_code)) +
  scale_y_continuous("Average Scaled Score") +
  scale_x_discrete("Elementary School", labels = label_wrap_gen(width = 15))

Recall what you learned in SDS 100 about making data graphics.
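One way to apply that to the plot above is to show a single student group at a time, so that points from different groups do not pile up on each other. This is only a sketch: `plot_one_group()` is a hypothetical helper, not part of any package, and the two rows of `title1` are copied by hand from the output above so the example runs without a download.

```r
library(dplyr)
library(ggplot2)

# Two rows copied by hand from the glimpse() output (Title I group)
title1 <- tibble::tribble(
  ~org_name,              ~subject_code, ~stu_grp,  ~avg_scaled_score, ~stu_cnt,
  "R. K. Finn Ryan Road", "ELA",         "Title I",               499,       38,
  "R. K. Finn Ryan Road", "MATH",        "Title I",               487,       39
)

# Hypothetical helper: keep one student group, then draw the same kind of plot
plot_one_group <- function(data, group) {
  one_group <- filter(data, stu_grp == group)
  ggplot(one_group, aes(x = org_name, y = avg_scaled_score, size = stu_cnt)) +
    geom_point() +
    facet_wrap(vars(subject_code)) +
    labs(
      title = paste("Student group:", group),
      x = "Elementary School",
      y = "Average Scaled Score",
      size = "Student count"
    )
}

p <- plot_one_group(title1, "Title I")
p
```

With the full data frame loaded, a call such as `plot_one_group(mcas, "All Students")` would compare schools on the group that includes every tested student.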

Journalistic style

Use standard journalistic style, as previously described.

  • Introductory Hook: something catchy and current.
  • Nutgraph: the meat of the story, which states the main point or theme (1-3 sentences).
  • Mechanistic Development: sequence of facts, quotes, and analysis that tells the story.
  • Quotes: include quotes from a peer or other sources. Effective quotes employ key details, characterization, and clear, entertaining analogies.
  • Counterargument: usually occurs about two-thirds of the way through the piece.
  • Conclusion: the broader meaning and implications.

Deadlines

  • Sun, 2/15: Draft submission due
  • Tue, 2/17: Peer review in class
  • Sun, 2/22: Final submission due