Introduction

This document is intended to describe how to access data from a MySQL database using R. It utilizes a database of wideband acoustic immitance variables from humans with normal hearing (see http://www.science.smith.edu/wai-database/ and https://projectreporter.nih.gov/project_info_description.cfm?aid=8769352&icde=30039221&ddparam=&ddvalue=&ddsub=&cr=10&csb=default&cs=ASC for more details).

A relevant paper on the topic of data management and databases in R can be found at http://chance.amstat.org/2015/04/setting-the-stage.

Accessing data from a database using SQL commands

First we demonstrate how to access data using SQL (structured query language) commands and the dbGetQuery() function. We begin by setting up a connection to the database.

library(tidyverse)
library(RMySQL)
con <- dbConnect(MySQL(), host = "scidb.smith.edu",
                user = "waiuser", password = "smith_waiDB", 
                dbname = "wai")

Next a series of SQL queries can be sent to the database. These return R dataframes.

dbGetQuery(con, "SHOW TABLES")
##   Tables_in_wai
## 1  Measurements
## 2       PI_Info
## 3       Subject
dbGetQuery(con, "EXPLAIN PI_Info")
##          Field          Type Null Key Default Extra
## 1   Identifier   varchar(20)  YES        <NA>      
## 2      PI_Year       int(11)  YES        <NA>      
## 3           PI  varchar(500)  YES        <NA>      
## 4  Affiliation  varchar(500)  YES        <NA>      
## 5        Email   varchar(30)  YES        <NA>      
## 6        Title  varchar(140)  YES        <NA>      
## 7          Pub   varchar(30)  YES        <NA>      
## 8         Date      char(20)  YES        <NA>      
## 9          URL  varchar(140)  YES        <NA>      
## 10    PI_Notes varchar(1500)  YES        <NA>
ds <- dbGetQuery(con, "SELECT * from Measurements LIMIT 10")
ds
##    Identifier Sub_Number Session Left_Ear MEP Instrument    Freq
## 1   Abur_2014          1       1        0  -5          1 210.938
## 2   Abur_2014          1       1        0  -5          1 234.375
## 3   Abur_2014          1       1        0  -5          1 257.812
## 4   Abur_2014          1       1        0  -5          1 281.250
## 5   Abur_2014          1       1        0  -5          1 304.688
## 6   Abur_2014          1       1        0  -5          1 328.125
## 7   Abur_2014          1       1        0  -5          1 351.562
## 8   Abur_2014          1       1        0  -5          1 375.000
## 9   Abur_2014          1       1        0  -5          1 398.438
## 10  Abur_2014          1       1        0  -5          1 421.875
##    Absorbance      Zmag      Zang
## 1   0.0451375 110638000 -0.228113
## 2   0.0441247 100482000 -0.230561
## 3   0.0495935  90561100 -0.230213
## 4   0.0516088  83515500 -0.230959
## 5   0.0590836  77476800 -0.229652
## 6   0.0628038  71229100 -0.230026
## 7   0.0682962  66615500 -0.229576
## 8   0.0738373  61996200 -0.229327
## 9   0.0794857  58193600 -0.228984
## 10  0.0932373  54845900 -0.226507

Accessing a database using dplyr commands

Alternatively, a connection can be made to the server by creating a series of dplyr table objects.

db <- src_mysql(dbname = "wai", host = "scidb.smith.edu", user = "waiuser", 
                password = "smith_waiDB")
Measurements <- tbl(db, "Measurements")
PI_Info <- tbl(db, "PI_Info")
Subject <- tbl(db, "Subject")

Let’s explore the PI_Info table.

PI_Info %>% 
  summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## #   [waiuser@scidb.smith.edu:/wai]
##   total
##   <dbl>
## 1    10
PI_Info %>% 
  collect() %>%  # collect() is a bad idea when dealing with large tables!
  data.frame()   
##       Identifier PI_Year
## 1  Rosowski_2012    2012
## 2      Abur_2014    2014
## 3   Shahnaz_2006    2006
## 4      Voss_1994    1994
## 5    Werner_2010    2010
## 6       Sun_2016    2016
## 7      Voss_2010    2010
## 8       Liu_2008    2008
## 9    Shaver_2013    2013
## 10    Neely_2015    2015
##                                                                                                         PI
## 1                                                                                         John J. Rosowski
## 2                                                            Defne Abur; Nicholas J. Horton; Susan E. Voss
## 3                                                                                Navid Shahnaz; Karin Bork
## 4                                                                                            Susan E. Voss
## 5                                                                                            Douglas Keefe
## 6                                                                                            Xiao-Ming Sun
## 7                                                                                            Susan E. Voss
## 8  Yi-Wen Liu; Chris A. Sanford; John C. Ellison; Denis F. Fitzpatrick; Michael P. Gorga; Douglas H. Keefe
## 9                                                                            Mark D. Shaver, Xiao-Ming Sun
## 10                                                                                        Stephen T. Neely
##                                                                                                                                                                                                                                                               Affiliation
## 1  Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston; Department of Otology and Laryngology, Harvard Medical School, Boston; Speech and Hearing Bioscience and Technology Program, Harvard-MIT Division of Health Sciences and Technology, Cambridge.
## 2                                                                                                                                                                                                                                                           Smith College
## 3                                                                                                                                                                                                                                          University of British Columbia
## 4                                                                                                                                                                                                                                  Smith College, formerly AT&T Bell Labs
## 5                                                                                                                                                                                                                                             Boys Town National Hospital
## 6                                                                                                                                                                                                                                                Wichita State University
## 7                                                                                                                                                                                                                                                           Smith College
## 8                                                                                                                                                                                                                                    Boys Town National Research Hospital
## 9                                                                                                                                                                                                                                                Wichita State University
## 10                                                                                                                                                                                                                                   Boys Town National Research Hospital
##                             Email
## 1  John_Rosowski@meei.harvard.edu
## 2                 svoss@smith.edu
## 3    nshahnaz@audiospeech.ubc.ca 
## 4                 svoss@smith.edu
## 5      Douglas.Keefe@boystown.org
## 6       xiao-ming.sun@wichita.edu
## 7                 svoss@smith.edu
## 8      Douglas.Keefe@boystown.org
## 9       xiao-ming.sun@wichita.edu
## 10     Stephen.Neely@boystown.org
##                                                                                                                                     Title
## 1                                                         Ear-Canal Reflectance, Umbo Velocity, and Tympanometry in Normal-Hearing Adults
## 2                                                                                           Intrasubject Variability in Power Reflectance
## 3                                                                       Wideband Reflectance Norms for Caucasian and Chinese Young Adults
## 4                                                                Measurement of acoustic impedance and reflectance in the human ear canal
## 5                                             Ear-Canal Wideband Acoustic Transfer Functions of Adults and Two- to Nine-Month-Old Infants
## 6                       Wideband acoustic immittance: Normative study and test-retest reliability of tympanometric measurements in adults
## 7                                                                Posture systematically alters ear-canal reflectance and DPOAE properties
## 8                    Wideband absorbance tympanometry using pressure sweeps: System development and results on adults with normal hearing
## 9  Wideband energy reflectance measurements: Effects of negative middle ear pressure and application of a pressure compensation procedure
## 10                                                                  Non-invasive estimation of middle-ear input impedance and efficiency.
##                               Pub        Date
## 1                   Ear & Hearing  11/06/2015
## 2               J. Am Acad Audiol  08/24/2016
## 3                  Ear & Hearing   08/24/2016
## 4  Journal of the Acoustical Soci  02/16/2017
## 5                 Ear and Hearing    9/1/2017
## 6  Journal of Speech, Language, a  10/31/2017
## 7                Hearing Research  03/02/2010
## 8  The Journal of the Acoustical   09/25/2008
## 9  The Journal of the Acoustical   10/06/2018
## 10 Journal of the Acoustical Soci 10/10/2018.
##                                                                                                               URL
## 1                                                                     http://www.ncbi.nlm.nih.gov/pubmed/21857517
## 2                                                                                                   Not available
## 3   http://journals.lww.com/ear-hearing/Abstract/2006/12000/Wideband_Reflectance_Norms_for_Caucasian_and.15.aspx 
## 4                                                                     http://www.ncbi.nlm.nih.gov/pubmed/21857517
## 5                                                                    https://www.ncbi.nlm.nih.gov/pubmed/20517155
## 6                                                                    https://www.ncbi.nlm.nih.gov/pubmed/27517667
## 7                                                                    https://www.ncbi.nlm.nih.gov/pubmed/20227475
## 8                                                                    https://www.ncbi.nlm.nih.gov/pubmed/19206798
## 9              https://www.ncbi.nlm.nih.gov/pubmed/23862811\nhttps://asa.scitation.org/doi/full/10.1121/1.4807509
## 10                                                            https://asa.scitation.org/doi/abs/10.1121/1.4927408
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              PI_Notes
## 1  HearID (Mimosa Acoustics); \nNormal Criteria as follows: \n(1) There was no history of significant middle ear disease (e.g., otitis media or effusion 2 or more years previously were not considered significant if there were no known residual consequences).\n(2) There was no history of otologic surgery, with the exception of myringotomy or tympanostomy tube placement over 2 yr prior. \n(3) The external ear and TM revealed no abnormalities on otoscopic examination. \n(4) Audiometric measurements had pure-tone thresholds of 20 dB HL or better at octave frequen- cies between 0.250 and 8 kHz. \n(5) Air-bone gaps were no greater than 15 dB at 0.25 kHz and 10 dB between frequencies of 0.5 to 4 kHz. Most subjects had air and bone thresholds between 0 and 10 dB HL with an average near 8 to 9 dB HL at the highest frequencies. \n(6) Tympanograms were Type-A peaked, with peak pressures of 100 to 50 daPa, static compliance of 0.3 to 2.0 cc, total tympanometric volumes (static compliance ear canal volume) between 0.7 and 2.7 cc, and normal-appearing shape that is neither rounded nor sharp. \n(7) All subjects included in the \x93normal hearing\x94 population were required to have two \x93normal\x94 ears (as defined by criteria described earlier).
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Database includes measurements at Position 1 and Channel B only.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   Measurements taken with a system using sysid and the Etymotic ER-2 pressure transducer and ER-7c probe microphone
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Used an ER-1 earphone and ER-7C microphone.  Data provided by Doug Keefe and formatted by Susan Voss with help.  Lynne Werner is retired.
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 N/A
## 7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 N/A
## 8                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 N/A
## 9                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              A research version of Titan (Interacoustics) was used. In this study, a total of five reflectance measurements at ambient pressure were taken per ear (detailed in the article). But, results from only two sessions were reported in this article. Included in this database is the second session (baseline), as the normative data.
## 10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Custom system used for measurement of ear canal impedance designed by Dr. J. H. Siegel at Northwestern University. Ear canal areas were estimated from the surge impedance. Impedance was measured at 2 insertion depths (deep \x96 session 1, shallow \x96 session 2).

Let’s explore the Subjects table.

Subject %>% 
  summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## #   [waiuser@scidb.smith.edu:/wai]
##   total
##   <dbl>
## 1   600
Subject %>% 
  collect()  # be careful with collect() with large tables!
## # A tibble: 600 x 12
##    Identifier Sub_Number Session_Total   Age Female  Race Ethnicity
##    <chr>      <chr>              <int> <int>  <int> <int>     <int>
##  1 Rosowski_… 3                      1    30      1     5         2
##  2 Rosowski_… 6                      1    29      0     5         2
##  3 Rosowski_… 11                     1    64      1     5         2
##  4 Rosowski_… 12                     1    42      1     5         2
##  5 Rosowski_… 14                     1    24      0     5         2
##  6 Rosowski_… 15                     1    32      1     5         2
##  7 Rosowski_… 17                     1    22      0     5         2
##  8 Rosowski_… 18                     1    33      1     5         2
##  9 Rosowski_… 21                     1    36      1     3         2
## 10 Rosowski_… 22                     1    33      0     5         2
## # ... with 590 more rows, and 5 more variables: Left_Ear_Status <int>,
## #   Right_Ear_Status <int>, Left_Ear_Area <dbl>, Right_Ear_Area <dbl>,
## #   Sub_Notes <chr>

Let’s explore the Measurements table.

Measurements %>% 
  summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## #   [waiuser@scidb.smith.edu:/wai]
##    total
##    <dbl>
## 1 251305

Let’s download the data from a given subject

onesubj <- Measurements %>% 
  filter(Identifier == "Rosowski_2012", Sub_Number == 3) %>%
  collect %>%
  mutate(SessionNum = as.factor(Session))
head(onesubj)
## # A tibble: 6 x 11
##   Identifier Sub_Number Session Left_Ear   MEP Instrument  Freq Absorbance
##   <chr>           <int>   <int>    <int> <dbl>      <int> <dbl>      <dbl>
## 1 Rosowski_…          3       1        1    NA          1  211.     0.0852
## 2 Rosowski_…          3       1        1    NA          1  234.     0.0903
## 3 Rosowski_…          3       1        1    NA          1  258.     0.112 
## 4 Rosowski_…          3       1        1    NA          1  281.     0.103 
## 5 Rosowski_…          3       1        1    NA          1  305.     0.129 
## 6 Rosowski_…          3       1        1    NA          1  328.     0.136 
## # ... with 3 more variables: Zmag <dbl>, Zang <dbl>, SessionNum <fct>

Finally we can plot the results

onesubj <- onesubj %>%
  mutate(Ear = ifelse(Left_Ear == 1, "Left", "Right"))

ggplot(data = onesubj, aes(x = Freq, y = Absorbance, colour = Ear)) + 
  geom_point() +
  scale_x_log10() + 
  ggtitle("Absorbance by ear Rosowski subject 3")