This document is intended to describe how to access data from a MySQL database using R. It utilizes a database of wideband acoustic immitance variables from humans with normal hearing (see http://www.science.smith.edu/wai-database/ and https://projectreporter.nih.gov/project_info_description.cfm?aid=8769352&icde=30039221&ddparam=&ddvalue=&ddsub=&cr=10&csb=default&cs=ASC for more details).
A relevant paper on the topic of data management and databases in R can be found at http://chance.amstat.org/2015/04/setting-the-stage.
First we demonstrate how to access data using SQL (structured query language) commands and the dbGetQuery()
function. We begin by setting up a connection to the database.
library(tidyverse)
library(RMySQL)
con <- dbConnect(MySQL(), host = "scidb.smith.edu",
user = "waiuser", password = "smith_waiDB",
dbname = "wai")
Next a series of SQL queries can be sent to the database. These return R dataframes.
dbGetQuery(con, "SHOW TABLES")
## Tables_in_wai
## 1 Measurements
## 2 PI_Info
## 3 Subject
dbGetQuery(con, "EXPLAIN PI_Info")
## Field Type Null Key Default Extra
## 1 Identifier varchar(20) YES <NA>
## 2 PI_Year int(11) YES <NA>
## 3 PI varchar(500) YES <NA>
## 4 Affiliation varchar(500) YES <NA>
## 5 Email varchar(30) YES <NA>
## 6 Title varchar(140) YES <NA>
## 7 Pub varchar(30) YES <NA>
## 8 Date char(20) YES <NA>
## 9 URL varchar(140) YES <NA>
## 10 PI_Notes varchar(1500) YES <NA>
ds <- dbGetQuery(con, "SELECT * from Measurements LIMIT 10")
ds
## Identifier Sub_Number Session Left_Ear MEP Instrument Freq
## 1 Abur_2014 1 1 0 -5 1 210.938
## 2 Abur_2014 1 1 0 -5 1 234.375
## 3 Abur_2014 1 1 0 -5 1 257.812
## 4 Abur_2014 1 1 0 -5 1 281.250
## 5 Abur_2014 1 1 0 -5 1 304.688
## 6 Abur_2014 1 1 0 -5 1 328.125
## 7 Abur_2014 1 1 0 -5 1 351.562
## 8 Abur_2014 1 1 0 -5 1 375.000
## 9 Abur_2014 1 1 0 -5 1 398.438
## 10 Abur_2014 1 1 0 -5 1 421.875
## Absorbance Zmag Zang
## 1 0.0451375 110638000 -0.228113
## 2 0.0441247 100482000 -0.230561
## 3 0.0495935 90561100 -0.230213
## 4 0.0516088 83515500 -0.230959
## 5 0.0590836 77476800 -0.229652
## 6 0.0628038 71229100 -0.230026
## 7 0.0682962 66615500 -0.229576
## 8 0.0738373 61996200 -0.229327
## 9 0.0794857 58193600 -0.228984
## 10 0.0932373 54845900 -0.226507
Alternatively, a connection can be made to the server by creating a series of dplyr table objects.
db <- src_mysql(dbname = "wai", host = "scidb.smith.edu", user = "waiuser",
password = "smith_waiDB")
Measurements <- tbl(db, "Measurements")
PI_Info <- tbl(db, "PI_Info")
Subject <- tbl(db, "Subject")
PI_Info
table.PI_Info %>%
summarise(total = n())
## # Source: lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## # [waiuser@scidb.smith.edu:/wai]
## total
## <dbl>
## 1 10
PI_Info %>%
collect() %>% # collect() is a bad idea when dealing with large tables!
data.frame()
## Identifier PI_Year
## 1 Rosowski_2012 2012
## 2 Abur_2014 2014
## 3 Shahnaz_2006 2006
## 4 Voss_1994 1994
## 5 Werner_2010 2010
## 6 Sun_2016 2016
## 7 Voss_2010 2010
## 8 Liu_2008 2008
## 9 Shaver_2013 2013
## 10 Neely_2015 2015
## PI
## 1 John J. Rosowski
## 2 Defne Abur; Nicholas J. Horton; Susan E. Voss
## 3 Navid Shahnaz; Karin Bork
## 4 Susan E. Voss
## 5 Douglas Keefe
## 6 Xiao-Ming Sun
## 7 Susan E. Voss
## 8 Yi-Wen Liu; Chris A. Sanford; John C. Ellison; Denis F. Fitzpatrick; Michael P. Gorga; Douglas H. Keefe
## 9 Mark D. Shaver, Xiao-Ming Sun
## 10 Stephen T. Neely
## Affiliation
## 1 Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston; Department of Otology and Laryngology, Harvard Medical School, Boston; Speech and Hearing Bioscience and Technology Program, Harvard-MIT Division of Health Sciences and Technology, Cambridge.
## 2 Smith College
## 3 University of British Columbia
## 4 Smith College, formerly AT&T Bell Labs
## 5 Boys Town National Hospital
## 6 Wichita State University
## 7 Smith College
## 8 Boys Town National Research Hospital
## 9 Wichita State University
## 10 Boys Town National Research Hospital
## Email
## 1 John_Rosowski@meei.harvard.edu
## 2 svoss@smith.edu
## 3 nshahnaz@audiospeech.ubc.ca
## 4 svoss@smith.edu
## 5 Douglas.Keefe@boystown.org
## 6 xiao-ming.sun@wichita.edu
## 7 svoss@smith.edu
## 8 Douglas.Keefe@boystown.org
## 9 xiao-ming.sun@wichita.edu
## 10 Stephen.Neely@boystown.org
## Title
## 1 Ear-Canal Reflectance, Umbo Velocity, and Tympanometry in Normal-Hearing Adults
## 2 Intrasubject Variability in Power Reflectance
## 3 Wideband Reflectance Norms for Caucasian and Chinese Young Adults
## 4 Measurement of acoustic impedance and reflectance in the human ear canal
## 5 Ear-Canal Wideband Acoustic Transfer Functions of Adults and Two- to Nine-Month-Old Infants
## 6 Wideband acoustic immittance: Normative study and test-retest reliability of tympanometric measurements in adults
## 7 Posture systematically alters ear-canal reflectance and DPOAE properties
## 8 Wideband absorbance tympanometry using pressure sweeps: System development and results on adults with normal hearing
## 9 Wideband energy reflectance measurements: Effects of negative middle ear pressure and application of a pressure compensation procedure
## 10 Non-invasive estimation of middle-ear input impedance and efficiency.
## Pub Date
## 1 Ear & Hearing 11/06/2015
## 2 J. Am Acad Audiol 08/24/2016
## 3 Ear & Hearing 08/24/2016
## 4 Journal of the Acoustical Soci 02/16/2017
## 5 Ear and Hearing 9/1/2017
## 6 Journal of Speech, Language, a 10/31/2017
## 7 Hearing Research 03/02/2010
## 8 The Journal of the Acoustical 09/25/2008
## 9 The Journal of the Acoustical 10/06/2018
## 10 Journal of the Acoustical Soci 10/10/2018.
## URL
## 1 http://www.ncbi.nlm.nih.gov/pubmed/21857517
## 2 Not available
## 3 http://journals.lww.com/ear-hearing/Abstract/2006/12000/Wideband_Reflectance_Norms_for_Caucasian_and.15.aspx
## 4 http://www.ncbi.nlm.nih.gov/pubmed/21857517
## 5 https://www.ncbi.nlm.nih.gov/pubmed/20517155
## 6 https://www.ncbi.nlm.nih.gov/pubmed/27517667
## 7 https://www.ncbi.nlm.nih.gov/pubmed/20227475
## 8 https://www.ncbi.nlm.nih.gov/pubmed/19206798
## 9 https://www.ncbi.nlm.nih.gov/pubmed/23862811\nhttps://asa.scitation.org/doi/full/10.1121/1.4807509
## 10 https://asa.scitation.org/doi/abs/10.1121/1.4927408
## PI_Notes
## 1 HearID (Mimosa Acoustics); \nNormal Criteria as follows: \n(1) There was no history of significant middle ear disease (e.g., otitis media or effusion 2 or more years previously were not considered significant if there were no known residual consequences).\n(2) There was no history of otologic surgery, with the exception of myringotomy or tympanostomy tube placement over 2 yr prior. \n(3) The external ear and TM revealed no abnormalities on otoscopic examination. \n(4) Audiometric measurements had pure-tone thresholds of 20 dB HL or better at octave frequen- cies between 0.250 and 8 kHz. \n(5) Air-bone gaps were no greater than 15 dB at 0.25 kHz and 10 dB between frequencies of 0.5 to 4 kHz. Most subjects had air and bone thresholds between 0 and 10 dB HL with an average near 8 to 9 dB HL at the highest frequencies. \n(6) Tympanograms were Type-A peaked, with peak pressures of 100 to 50 daPa, static compliance of 0.3 to 2.0 cc, total tympanometric volumes (static compliance ear canal volume) between 0.7 and 2.7 cc, and normal-appearing shape that is neither rounded nor sharp. \n(7) All subjects included in the \x93normal hearing\x94 population were required to have two \x93normal\x94 ears (as defined by criteria described earlier).
## 2 Database includes measurements at Position 1 and Channel B only.
## 3
## 4 Measurements taken with a system using sysid and the Etymotic ER-2 pressure transducer and ER-7c probe microphone
## 5 Used an ER-1 earphone and ER-7C microphone. Data provided by Doug Keefe and formatted by Susan Voss with help. Lynne Werner is retired.
## 6 N/A
## 7 N/A
## 8 N/A
## 9 A research version of Titan (Interacoustics) was used. In this study, a total of five reflectance measurements at ambient pressure were taken per ear (detailed in the article). But, results from only two sessions were reported in this article. Included in this database is the second session (baseline), as the normative data.
## 10 Custom system used for measurement of ear canal impedance designed by Dr. J. H. Siegel at Northwestern University. Ear canal areas were estimated from the surge impedance. Impedance was measured at 2 insertion depths (deep \x96 session 1, shallow \x96 session 2).
Subjects
table.Subject %>%
summarise(total = n())
## # Source: lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## # [waiuser@scidb.smith.edu:/wai]
## total
## <dbl>
## 1 600
Subject %>%
collect() # be careful with collect() with large tables!
## # A tibble: 600 x 12
## Identifier Sub_Number Session_Total Age Female Race Ethnicity
## <chr> <chr> <int> <int> <int> <int> <int>
## 1 Rosowski_… 3 1 30 1 5 2
## 2 Rosowski_… 6 1 29 0 5 2
## 3 Rosowski_… 11 1 64 1 5 2
## 4 Rosowski_… 12 1 42 1 5 2
## 5 Rosowski_… 14 1 24 0 5 2
## 6 Rosowski_… 15 1 32 1 5 2
## 7 Rosowski_… 17 1 22 0 5 2
## 8 Rosowski_… 18 1 33 1 5 2
## 9 Rosowski_… 21 1 36 1 3 2
## 10 Rosowski_… 22 1 33 0 5 2
## # ... with 590 more rows, and 5 more variables: Left_Ear_Status <int>,
## # Right_Ear_Status <int>, Left_Ear_Area <dbl>, Right_Ear_Area <dbl>,
## # Sub_Notes <chr>
Measurements
table.Measurements %>%
summarise(total = n())
## # Source: lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log
## # [waiuser@scidb.smith.edu:/wai]
## total
## <dbl>
## 1 251305
onesubj <- Measurements %>%
filter(Identifier == "Rosowski_2012", Sub_Number == 3) %>%
collect %>%
mutate(SessionNum = as.factor(Session))
head(onesubj)
## # A tibble: 6 x 11
## Identifier Sub_Number Session Left_Ear MEP Instrument Freq Absorbance
## <chr> <int> <int> <int> <dbl> <int> <dbl> <dbl>
## 1 Rosowski_… 3 1 1 NA 1 211. 0.0852
## 2 Rosowski_… 3 1 1 NA 1 234. 0.0903
## 3 Rosowski_… 3 1 1 NA 1 258. 0.112
## 4 Rosowski_… 3 1 1 NA 1 281. 0.103
## 5 Rosowski_… 3 1 1 NA 1 305. 0.129
## 6 Rosowski_… 3 1 1 NA 1 328. 0.136
## # ... with 3 more variables: Zmag <dbl>, Zang <dbl>, SessionNum <fct>
onesubj <- onesubj %>%
mutate(Ear = ifelse(Left_Ear == 1, "Left", "Right"))
ggplot(data = onesubj, aes(x = Freq, y = Absorbance, colour = Ear)) +
geom_point() +
scale_x_log10() +
ggtitle("Absorbance by ear Rosowski subject 3")