class: center, middle, inverse, title-slide # Database Querying with SQL ## Searching efficiently ### Ben Baumer ### SDS 192April 22th, 2019(
http://beanumber.github.io/sds192/lectures/mdsr_sql_06-explain.html
) --- class: center, middle, inverse # Query plans --- ## Find all movies that were filmed in Northampton .center[![](https://www.visitnorthampton.net/images/image-3.jpg)] --- ## Search `movie_info` ```sql DESCRIBE movie_info; ``` ``` ## +--------------+---------+------+-----+---------+----------------+ ## | Field | Type | Null | Key | Default | Extra | ## +--------------+---------+------+-----+---------+----------------+ ## | id | int(11) | NO | PRI | NULL | auto_increment | ## | movie_id | int(11) | NO | MUL | NULL | | ## | info_type_id | int(11) | NO | MUL | NULL | | ## | info | text | NO | | NULL | | ## | note | text | YES | | NULL | | ## +--------------+---------+------+-----+---------+----------------+ ``` ```sql SHOW INDEXES FROM movie_info; ``` ``` ## Key_name Column_name Cardinality ## 1 PRIMARY id 24620359 ## 2 idx_mid movie_id 4103393 ## 3 movie_info_info_type_id_exists info_type_id 70 ``` --- ## [Query plan](https://en.wikipedia.org/wiki/Query_plan), no index ```sql *EXPLAIN SELECT * FROM movie_info WHERE info LIKE '%Northampton, M%'; ``` ``` ## table type key ref rows ## 1 movie_info ALL <NA> <NA> 24620359 ``` - No index on `info` - Must **scan whole table**! - 24.6 million rows --- ## SQL query optimization .center[![](https://docs.oracle.com/database/121/TGSQL/img/GUID-D0B38295-1289-42A5-94CC-4F1857D00835-default.png)] .footnote[https://en.wikipedia.org/wiki/Query_optimization] --- ## Query plan, with index ```sql EXPLAIN SELECT * FROM movie_info *WHERE info_type_id = 18; ``` ``` ## table type key ref rows ## 1 movie_info ref movie_info_info_type_id_exists const 1222424 ``` - Use index on `info_type_id` - Only need to consult 1.2 million rows --- ## Query plan, no index, with join ```sql EXPLAIN SELECT * FROM movie_info mi JOIN title t ON t.id = mi.movie_id WHERE mi.info LIKE '%Northampton, M%'; ``` ``` ## table type key ref rows ## 1 mi ALL <NA> <NA> 24620359 ## 2 t eq_ref PRIMARY imdb.mi.movie_id 1 ``` -- - `JOIN` on PRIMARY KEY is not expensive --- ## Query plan, indexed `WHERE`, with join ```sql EXPLAIN SELECT * FROM movie_info mi JOIN title t ON t.id = mi.movie_id WHERE mi.info_type_id = 18 AND mi.info LIKE '%Northampton, M%'; ``` ``` ## table type key ref rows ## 1 mi ref movie_info_info_type_id_exists const 1222424 ## 2 t eq_ref PRIMARY imdb.mi.movie_id 1 ``` --- ## Query plan, `WHERE` vs. `HAVING` ```sql EXPLAIN SELECT * FROM movie_info mi JOIN title t ON t.id = mi.movie_id *HAVING mi.info_type_id = 18 AND mi.info LIKE '%Northampton, M%'; ``` ``` ## table type key ref rows ## 1 mi ALL <NA> <NA> 24620359 ## 2 t eq_ref PRIMARY imdb.mi.movie_id 1 ``` - `HAVING` **never** uses an index! --- ## Query plan, `WHERE`, join, and filter ```sql EXPLAIN SELECT * FROM movie_info mi JOIN title t ON t.id = mi.movie_id WHERE mi.info_type_id = 18 AND mi.info LIKE '%Northampton, M%' * AND t.id < 1000; ``` ``` ## table type key ref rows ## 1 mi range idx_mid <NA> 1867 ## 2 t eq_ref PRIMARY imdb.mi.movie_id 1 ``` - Reduce scope of search with `WHERE` clause --- ## Movies filmed in Northampton ``` ## year title location ## 1 NA Homeless in a College Town Northampton, Massachusetts, US ## 2 1930 Coolidges Greet Mr Santa Claus Northampton, Massachusetts, US ## 3 1966 Who's Afraid of Virginia Woolf Smith College - 7 College Lane ## 4 1971 Carnal Knowledge Northampton, Massachusetts, US ## 5 1971 Carnal Knowledge Smith College - 7 College Lane ## 6 1975 The Reincarnation of Peter Pro Northampton, Massachusetts, US ## 7 1993 Malice Northampton, Massachusetts, US ## 8 1993 Malice Smith College - 7 College Lane ## 9 1995 Circuit Northampton, Massachusetts, US ## 10 1999 In Dreams Northampton State Hospital - 1 ## 11 1999 In Dreams Northampton, Massachusetts, US ## 12 1999 In Dreams Smith College - 7 College Lane ## 13 1999 The Cider House Rules Northampton State Hospital - 1 ## 14 1999 The Cider House Rules Northampton, Massachusetts, US ## 15 2001 Goodbye to You Northampton, Massachusetts, US ## 16 2002 Detour Northampton, Massachusetts, US ## 17 2003 Der Schrei der Vergessenen Northampton State Hospital - 1 ## 18 2006 The Nightingale Princess Northampton, Massachusetts, US ## 19 2007 Over the River... The Life of Northampton, Massachusetts, US ## 20 2007 Young @ Heart Aerostat Promotions, Northampt ## 21 2007 Young @ Heart Cooley Dickinson Hospital, 30 ## 22 2007 Young @ Heart Diva's, 492 Pleasant Street, N ## 23 2007 Young @ Heart Florence Community Center, 140 ## 24 2007 Young @ Heart Hampshire County Jail and Hous ## 25 2007 Young @ Heart Northampton, Massachusetts, US ## 26 2008 It's Complicated Northampton, Massachusetts, US ## 27 2009 Le frère qui envoya les Rosen Northampton, Massachusetts, US ## 28 2009 Moon and Star Northampton, Massachusetts, US ## 29 2009 Some Assembly Required Northampton, Massachusetts, US ## 30 2010 Asleep in Child's Park Northampton, Massachusetts, US ## 31 2010 Edge of Darkness Northampton, Massachusetts, US ## 32 2010 Hidden Battles Northampton, Massachusetts, US ## 33 2010 Kip and Vanilla Chip Northampton, Massachusetts, US ## 34 2012 Guam Northampton, Massachusetts, US ## 35 2013 Doomsday Northampton, Massachusetts, US ## 36 2015 The Answer Northampton, Massachusetts, US ## 37 2015 Why Are We Doing This in Front Northampton, Massachusetts, US ## 38 2016 Sonny Side Up Northampton, Massachusetts, US ## 39 2016 Split Costs Northampton, Massachusetts, US ## 40 2016 The Energy Specialist Northampton, Massachusetts, US ## 41 2017 The Detective's Daughter Northampton, Massachusetts, US ``` --- ## Help, my query won't run! - If query times out: - Increase timeout limit in Workbench - All queries should execute within a few minutes! -- - Run `EXPLAIN` on query - Are there millions of rows involved? - Are you doing a table scan on a big table? - Which [`JOIN` type](https://dev.mysql.com/doc/refman/8.0/en/explain-output.html#explain-join-types) is being used? - ~~Would adding an index help?~~ - Can you filter the table first? - Can you aggregate the data? .footnote[https://dev.mysql.com/doc/refman/8.0/en/using-explain.html]