The Art of SQL JOINs: Combining Data from Multiple Tables
JOINs are fundamental to relational databases: they let a single query draw on data stored in several tables. But what exactly is a JOIN, and how does it work?
A JOIN combines rows from two or more tables by matching values in a related column, typically a key in one table against a reference to it in another. The exercises below use three tables: movie (id, title, yr), actor (id, name), and casting (movieid, actorid, ord), where casting links each actor to the films they appeared in.
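To make the idea concrete before the exercises, here is a minimal sketch that builds the three tables in an in-memory SQLite database and joins them. The column names come from the queries in this article; the rows (Film A, Actor X, and so on) and their ids are made up purely for illustration, and the use of Python's sqlite3 module is an assumption about how you might try this locally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema inferred from the exercise queries; the real tables may have more columns.
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
""")

# Hypothetical sample rows: Film A has two cast members, Film B has one.
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)",
                 [(1, "Film A", 1962), (2, "Film B", 1978)])
conn.executemany("INSERT INTO actor VALUES (?, ?)",
                 [(10, "Actor X"), (11, "Actor Y"), (12, "Actor Z")])
conn.executemany("INSERT INTO casting VALUES (?, ?, ?)",
                 [(1, 10, 1), (1, 11, 2), (2, 12, 1)])

# casting is the link table: each JOIN matches a key against its reference.
rows = conn.execute("""
    SELECT movie.title, actor.name
    FROM movie
    JOIN casting ON casting.movieid = movie.id
    JOIN actor   ON actor.id = casting.actorid
    ORDER BY movie.title, casting.ord
""").fetchall()

for title, name in rows:
    print(title, "-", name)
```

Each output row pairs one film with one of its cast members, which is the shape most of the later exercises filter or aggregate.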
List the films where the yr is 1962 [Show id, title]
SELECT id, title
FROM movie
WHERE yr=1962
Give the year of 'Citizen Kane'.
SELECT yr FROM movie
WHERE title='Citizen Kane'
List all of the Star Trek movies, including the id, title and yr (all of these movies include the words Star Trek in the title). Order results by year.
SELECT id, title, yr FROM movie
WHERE title LIKE '%Star Trek%'
ORDER BY yr
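The LIKE pattern decides whether the words must begin the title or may appear anywhere in it. The sketch below compares the two patterns on a tiny in-memory SQLite table; the titles and years are invented for the comparison, and titles_matching is a hypothetical helper, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER)")

# Made-up titles and years, chosen only to show the two patterns.
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [
        (1, "Star Trek: Sample One", 1980),
        (2, "Sample Two: A Star Trek Story", 1990),
        (3, "Unrelated Film", 1985),
    ],
)

def titles_matching(pattern):
    """Return the titles matching a LIKE pattern, ordered by year."""
    cur = conn.execute(
        "SELECT title FROM movie WHERE title LIKE ? ORDER BY yr", (pattern,)
    )
    return [row[0] for row in cur]

prefix_only = titles_matching("Star Trek%")   # title must start with the words
anywhere = titles_matching("%Star Trek%")     # words may appear anywhere

print(prefix_only)
print(anywhere)
```

With a trailing % only, a title that mentions Star Trek later in the string is missed; wrapping the words in % on both sides catches it.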
What id number does the actor 'Glenn Close' have?
SELECT id FROM actor
WHERE name='Glenn Close'
What is the id of the film 'Casablanca'?
SELECT id FROM movie
WHERE title='Casablanca'
Obtain the cast list for 'Casablanca'.
Use movieid=11768 (or whatever value you got from the previous question).
SELECT actor.name
FROM actor
JOIN casting
ON casting.actorid = actor.id
WHERE casting.movieid = 11768
Obtain the cast list for the film 'Alien'.
SELECT actor.name
FROM actor
JOIN casting
ON casting.actorid = actor.id
JOIN movie
ON movie.id = casting.movieid
WHERE movie.title = 'Alien'
List the films in which 'Harrison Ford' has appeared.
SELECT movie.title
FROM movie
JOIN casting
ON casting.movieid = movie.id
JOIN actor
ON casting.actorid = actor.id
WHERE actor.name = 'Harrison Ford'
List the films where 'Harrison Ford' has appeared - but not in the starring role. [Note: the ord field of casting gives the position of the actor. If ord=1 then this actor is in the starring role]
SELECT movie.title
FROM movie
JOIN casting
ON casting.movieid = movie.id
JOIN actor
ON actor.id = casting.actorid
WHERE actor.name = 'Harrison Ford'
AND casting.ord != 1;
List the films together with the leading star for all 1962 films.
SELECT movie.title, actor.name
FROM movie
JOIN casting
ON casting.movieid = movie.id
JOIN actor
ON actor.id = casting.actorid
WHERE movie.yr = 1962
AND casting.ord = 1;
Which were the busiest years for 'Rock Hudson'? Show the year and the number of movies he made, for each year in which he made more than 2 movies.
SELECT movie.yr, COUNT(movie.title)
FROM movie
JOIN casting ON casting.movieid = movie.id
JOIN actor ON actor.id = casting.actorid
WHERE actor.name = 'Rock Hudson'
GROUP BY movie.yr
HAVING COUNT(movie.title) > 2
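WHERE filters individual rows before grouping, while HAVING filters the groups after the counts are computed. The following sketch runs the same shape of query against an in-memory SQLite database; 'Actor X' and the films are hypothetical stand-ins, set up so that one year has three films and another has one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
""")

# Hypothetical data: Actor X appears in three films in 1960 and one in 1961.
conn.execute("INSERT INTO actor VALUES (1, 'Actor X')")
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)",
                 [(1, "Film A", 1960), (2, "Film B", 1960),
                  (3, "Film C", 1960), (4, "Film D", 1961)])
conn.executemany("INSERT INTO casting VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 1)])

busy_years = conn.execute("""
    SELECT movie.yr, COUNT(movie.title) AS films
    FROM movie
    JOIN casting ON casting.movieid = movie.id
    JOIN actor   ON actor.id = casting.actorid
    WHERE actor.name = ?              -- row filter, applied before grouping
    GROUP BY movie.yr
    HAVING COUNT(movie.title) > 2     -- group filter, applied after counting
    ORDER BY movie.yr
""", ("Actor X",)).fetchall()

print(busy_years)
```

Only 1960 survives the HAVING clause, because 1961 has a single film for this actor.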
List the film title and the leading actor for all of the films 'Julie Andrews' played in.
SELECT DISTINCT m.title, a.name
FROM (SELECT movie.*
FROM movie
JOIN casting
ON casting.movieid = movie.id
JOIN actor
ON actor.id = casting.actorid
WHERE actor.name = 'Julie Andrews') AS m
JOIN (SELECT actor.*, casting.movieid AS movieid
FROM actor
JOIN casting
ON casting.actorid = actor.id
WHERE casting.ord = 1) AS a
ON m.id = a.movieid
ORDER BY m.title;
Obtain a list, in alphabetical order, of actors who've had at least 15 starring roles.
SELECT actor.name FROM actor
JOIN casting ON casting.actorid = actor.id
WHERE casting.ord = 1
GROUP BY actor.name
HAVING COUNT(*) >= 15
ORDER BY actor.name
List the films released in the year 1978 ordered by the number of actors in the cast, then by title.
SELECT movie.title, COUNT(*) FROM movie
JOIN casting ON movie.id = casting.movieid
WHERE movie.yr = 1978
GROUP BY movie.id, movie.title
ORDER BY COUNT(*) DESC, movie.title
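Ordering by cast size and then by title needs two sort keys, so that films with the same number of actors come out alphabetically. Here is a small sketch of that ordering on an in-memory SQLite database; the films and actor ids are invented, and the cast_size alias is a name chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
""")

# Hypothetical data: three 1978 films (cast sizes 2, 1, 2) and one 1979 film.
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)",
                 [(1, "Film C", 1978), (2, "Film A", 1978),
                  (3, "Film B", 1978), (4, "Film D", 1979)])
conn.executemany("INSERT INTO casting VALUES (?, ?, ?)",
                 [(1, 10, 1), (1, 11, 2),              # Film C: 2 actors
                  (2, 10, 1),                          # Film A: 1 actor
                  (3, 11, 1), (3, 12, 2),              # Film B: 2 actors
                  (4, 10, 1), (4, 11, 2), (4, 12, 3)]) # Film D: not 1978

by_cast_size = conn.execute("""
    SELECT movie.title, COUNT(*) AS cast_size
    FROM movie
    JOIN casting ON casting.movieid = movie.id
    WHERE movie.yr = 1978
    GROUP BY movie.id, movie.title
    ORDER BY cast_size DESC, movie.title   -- second key breaks ties
""").fetchall()

print(by_cast_size)
```

Film B and Film C both have two cast members, so the title key decides their order; Film D is excluded by the year filter despite having the largest cast.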
List all the people who have worked with 'Art Garfunkel'.
SELECT DISTINCT a.name
FROM (SELECT movie.*
FROM movie
JOIN casting
ON casting.movieid = movie.id
JOIN actor
ON actor.id = casting.actorid
WHERE actor.name = 'Art Garfunkel') AS m
JOIN (SELECT actor.*, casting.movieid
FROM actor
JOIN casting
ON casting.actorid = actor.id
WHERE actor.name != 'Art Garfunkel') AS a
ON m.id = a.movieid;
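The same "who worked with whom" question can also be answered without subqueries by joining casting to itself on movieid. This is an alternative formulation, not the query used above, shown as a sketch on an in-memory SQLite database with hypothetical actors; the aliases target, target_role, other_role and other are names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
""")

# Hypothetical data: Actor X shares film 100 with Y and film 101 with Y and Z.
# Actor W appears only in film 102, never alongside Actor X.
conn.executemany("INSERT INTO actor VALUES (?, ?)",
                 [(1, "Actor X"), (2, "Actor Y"), (3, "Actor Z"), (4, "Actor W")])
conn.executemany("INSERT INTO casting VALUES (?, ?, ?)",
                 [(100, 1, 1), (100, 2, 2),
                  (101, 1, 1), (101, 2, 2), (101, 3, 3),
                  (102, 4, 1)])

co_stars = conn.execute("""
    SELECT DISTINCT other.name
    FROM actor AS target
    JOIN casting AS target_role ON target_role.actorid = target.id
    JOIN casting AS other_role  ON other_role.movieid = target_role.movieid
    JOIN actor   AS other       ON other.id = other_role.actorid
    WHERE target.name = ?
      AND other.id != target.id      -- exclude the target actor
    ORDER BY other.name
""", ("Actor X",)).fetchall()

print([name for (name,) in co_stars])
```

DISTINCT matters here: Actor Y shares two films with Actor X and would otherwise be listed twice.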
In conclusion, JOINs let a single query combine data from multiple tables into one result set. The exercises above start with single-table lookups and build up to multi-table joins combined with row filters, GROUP BY and HAVING, and subqueries, which covers most of what is needed to analyze data spread across related tables.