+ - 0:00:00
Notes for current slide
Notes for next slide

Database Querying with SQL

IMDB

Ben Baumer

SDS 192
April 20th, 2020
(http://beanumber.github.io/sds192/lectures/mdsr_sql_03-imdb.html)

1 / 13

IMDB

2 / 13

Do we need the MySQL Workbench?

  • Short answer: No
3 / 13

Do we need the MySQL Workbench?

  • Short answer: No

  • Long answer: I find it helpful for:

    • exploring databases with a GUI
    • fiddling around with queries
    • more informative error messages
    • built-in help
    • adding keys and indexes (more on that later)
    • then copy-and-paste working queries into R Markdown
3 / 13

Ways to interact with MySQL

  1. GUI SQL client: MySQL Workbench (previous video)

  2. command line SQL client (next slide)

  3. generic SQL access in R: DBI::dbGetQuery()

  4. RMySQL through R Markdown chunks (see labs & hw)

  5. Lots of other ways...

4 / 13

Ways to interact with MySQL

  1. GUI SQL client: MySQL Workbench (previous video)

  2. command line SQL client (next slide)

  3. generic SQL access in R: DBI::dbGetQuery()

  4. RMySQL through R Markdown chunks (see labs & hw)

  5. Lots of other ways...

I prefer:

  • (1) for SQL development

  • (4) for SQL reporting

4 / 13

Method 2: Using the terminal

mysql -h scidb.smith.edu -u sds192 -pDSismfc@S
SHOW DATABASES;
## +--------------------+
## | Database |
## +--------------------+
## | information_schema |
## | airlines |
## | citibike |
## | fec |
## | imdb |
## | lahman |
## | nyctaxi |
## | yelp |
## +--------------------+
5 / 13

Getting information about databases;

USE imdb;
SHOW TABLES;
## +-----------------+
## | Tables_in_imdb |
## +-----------------+
## | aka_name |
## | aka_title |
## | cast_info |
## | char_name |
## | comp_cast_type |
## | company_name |
## | company_type |
## | complete_cast |
## | info_type |
## | keyword |
## | kind_type |
## | link_type |
## | movie_companies |
## | movie_info |
## | movie_info_idx |
## | movie_keyword |
## | movie_link |
## | name |
## | person_info |
## | role_type |
## | title |
## +-----------------+
6 / 13

Getting information about tables

DESCRIBE movie_info;
## +--------------+---------+------+-----+---------+----------------+
## | Field | Type | Null | Key | Default | Extra |
## +--------------+---------+------+-----+---------+----------------+
## | id | int(11) | NO | PRI | NULL | auto_increment |
## | movie_id | int(11) | NO | MUL | NULL | |
## | info_type_id | int(11) | NO | MUL | NULL | |
## | info | text | NO | | NULL | |
## | note | text | YES | | NULL | |
## +--------------+---------+------+-----+---------+----------------+
7 / 13

People in the IMDB

https://en.wikipedia.org/wiki/Primary_key

  • name: people and their names
    • id column is primary key
8 / 13

People in the IMDB

https://en.wikipedia.org/wiki/Primary_key

  • name: people and their names
    • id column is primary key
  • person_id column references name.id from:
    • person_info
    • cast_info
    • aka_name
8 / 13

Movies in the IMDB

  • title: movies and their titles
    • id column is primary key
9 / 13

Movies in the IMDB

  • title: movies and their titles
    • id column is primary key
  • movie_id column references title.id from:
    • cast_info
    • aka_title
    • movie_companies
    • movie_info
    • movie_info_idx
    • movie_keyword
    • movie_link
9 / 13

What's the deal with table aliases?

Consider:

SELECT title.title,
company_name.name,
title.production_year
FROM movie_companies
JOIN company_name
ON movie_companies.company_id = company_name.id
JOIN title
ON title.id = movie_companies.movie_id
WHERE company_name.name LIKE '%All Rise Films%'
AND title.kind_id = 1
AND movie_companies.company_type_id = 2;
11 / 13

What's the deal with table aliases?

Instead:

SELECT t.title,
cn.name,
t.production_year
FROM movie_companies AS mc
JOIN company_name AS cn
ON mc.company_id = cn.id
JOIN title AS t
ON t.id = mc.movie_id
WHERE cn.name LIKE '%All Rise Films%'
AND t.kind_id = 1
AND mc.company_type_id = 2;
12 / 13

Example: Henry & June

Restrict to English-language films only

SELECT t.title, t.production_year,
mi.info AS rating, mi2.info AS language
FROM title t
JOIN movie_info AS mi
ON mi.movie_id = t.id
JOIN movie_info AS mi2
ON mi2.movie_id = t.id
WHERE mi.info_type_id IN (67, 97)
AND mi.info LIKE '%NC-17%'
AND kind_id = 1
AND mi2.info_type_id = 4
AND mi2.info = 'English'
ORDER BY production_year;
13 / 13

IMDB

2 / 13
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow