This lab provides more practice with SQL.

library(tidyverse)
library(RMySQL)
db <- dbConnect(
  MySQL(),
  host = "scidb.smith.edu",
  user = "sds192",
  password = "DSismfc@S",
  dbname = "imdb"
)
knitr::opts_chunk$set(connection = db, max.print = 20)

Exercises

  1. List all movies that have Spike Lee as a director.
-- Sample solution
-- retrieve Spike Lee's person ID
SELECT * FROM name WHERE name LIKE 'Lee, Spike%';

-- Find all of her roles
SELECT 
  t.production_year AS Year, 
  t.title AS Title
FROM cast_info AS ci
JOIN title AS t ON ci.movie_id = t.id
WHERE ci.person_id = 1365233
  AND t.kind_id = 1
  AND ci.role_id = 8
ORDER BY production_year;
  1. List Alfre Woodard’s complete filmography. Include the columns Year, Title, and Role (like on the linked Wikipedia page).
-- Sample solution
-- retrieve Alfre Woodard's person ID
SELECT * FROM name WHERE name LIKE 'Woodard, Alfre%';

-- Find all of her roles
SELECT 
  t.production_year AS Year, 
  t.title AS Title, 
  cn.name AS Role
FROM cast_info AS ci
JOIN title AS t ON ci.movie_id = t.id
JOIN char_name AS cn ON cn.id = ci.person_role_id
WHERE ci.person_id = 4117290
  AND t.kind_id = 1
  AND ci.role_id = 2
ORDER BY production_year;
  1. How many times has Sean Connery played James Bond?
-- Sample solution
-- retrieve Sean Connery's person ID
SELECT * FROM name WHERE name LIKE 'Connery, Sean%';

-- Find all of his Bond roles
SELECT t.title, t.production_year, cn.name
FROM cast_info AS ci
JOIN title AS t ON ci.movie_id = t.id
JOIN char_name AS cn ON cn.id = ci.person_role_id
WHERE ci.person_id = 472637
  AND t.kind_id = 1
  AND ci.role_id = 1
  AND cn.name LIKE '%bond%'
ORDER BY production_year;
  1. List all of the actors who have played James Bond and how many times they have done so.
-- Sample solution
SELECT ci.person_id, n.name, sum(1) AS N
FROM cast_info AS ci
JOIN title AS t ON ci.movie_id = t.id
JOIN char_name AS cn ON cn.id = ci.person_role_id
JOIN name AS n ON n.id = ci.person_id
WHERE t.kind_id = 1
  AND ci.role_id = 1
  AND cn.name LIKE 'James Bond'
GROUP BY ci.person_id
ORDER BY N desc;
  1. On, November 25, 2020, The New York Times published a list of the The 25 Greatest Actors of the 21st Century (So Far). Choose one of these actors and find their filmography.
-- Sample solution for Michael B. Jordan, using a subquery
SELECT 
  t.production_year AS Year, 
  t.title AS Title, 
  cn.name AS Role
FROM cast_info AS ci
JOIN title AS t ON ci.movie_id = t.id
JOIN char_name AS cn ON cn.id = ci.person_role_id
WHERE ci.person_id = 
  (SELECT id FROM name WHERE name LIKE 'Jordan, Michael B.%')
  AND t.kind_id = 1
  AND ci.role_id = 1
ORDER BY production_year;
  1. Challenge: Among movies with at least 100,000 votes on IMDB, retrieve the list of the top 100 highest-rated movies of all-time.
-- Sample solution
SELECT t.id, t.title, t.production_year, mii.info AS votes, mii2.info AS rating
FROM title AS t
JOIN movie_info_idx AS mii ON mii.movie_id = t.id
JOIN movie_info_idx AS mii2 ON mii2.movie_id = t.id
WHERE t.kind_id = 1
  AND mii.info_type_id = 100
  AND mii2.info_type_id = 101
  AND mii.info > 100000
ORDER BY mii2.info DESC
LIMIT 0, 100;

Your Learning

Please respond to the following prompt on Slack in the #mod-sql channel.

Prompt: What questions do you still have about SQL?