SQL exercises from the Stanford DB course
These SQL tables and the questions asked of them are from the famed Stanford DB Course ( older link).
Solutions are my own, in PostgreSQL.
I learn by example and like to write code by starting from a previously-working template. The goal of this article is to have a reproducible example for later personal reference (and to make note of some personal corrections/updates).
Contents
- Introduction
-
Question 1
Find the names of all students who are friends with someone named Gabriel. -
Question 2
For every student who likes someone two or more grades younger than themselves, return that student's name and grade, and the name and grade of the student they like. -
Question 3
For every pair of students who both like each other, return the name and grade of both students. Include each pair only once, with the two names in alphabetical order. -
Question 4
For each student A who likes student B where the two are not friends, find if they have a friend C in common (who can introduce them!). For all such trios, return the name and grade of A, B, and C. -
Question 5
Find the name and grade of all students who are liked by more than one other student. - Appendix: Code to create tables
Introduction
Students at your hometown high school have decided to organize their social network using databases. So far, they have collected information about sixteen students in four grades, 9-12. The schema is as follows.
- highschooler
-
(id, name, grade)
Interpretation There is a high school student with a unique ID and a given first name is a certain grade. - friend
-
(id1, id2)
Interpretation The student with id1 is friends with the student with id2. Friendship is mutual, so if(123, 456)
is in thefriend
table, so is(456, 123)
. - likes
-
(id1, id2)
Interpretation The student with id1 likes the student with id2. Liking someone is not necessarily mutual, so if(123, 456)
is in thelikes
table, there is no guarantee that(456, 123)
is also present.
On the right is a graph showing the various connections between the students in the database. 9th graders are blue, 10th graders are green, 11th graders are yellow, and 12th graders are purple. Undirected black edges indicate friendships, and directed red edges indicate that one student likes another student. Complete contents of the database can be viewed in the appendix.
Questions
The following questions have the respondent writing SQL queries to produce the solution.
Question 1 Find the names of all students who are friends with someone named Gabriel.
Expected output
Alexis |
Andrew |
Cassandra |
Jessica |
Jordan |
Solution 1 This problem can be solved with a semi-join. In other words, we'll decide what values to keep in the left table by using values in the right.
/* LOGIC
- fix id1 as Gabriel
- we want to find all the id2 with id1 as Gabriel
- we want to find the names associated with those id2
*/
SELECT h.name
FROM friend f
LEFT JOIN highschooler h
ON f.id2 = h.id
WHERE f.id1 IN (
SELECT id
FROM highschooler
WHERE name = 'Gabriel'
)
Question 2 For every student who likes someone two or more grades younger than themselves, return that student's name and grade, and the name and grade of the student they like.
Expected output
John | 12 | Haley | 10 |
Solution 2 We want to select rows of the likes
table, filtered by whether the grade of the liker (e.g., John) is at least 2 grades more than the grade of the likee (Haley).
/* LOGIC
- for every pair in the likes table,
pull their name and grade
- then filter for pairs where student1's grade is more
then 2 levels greater than student 2
*/
SELECT
h1.name AS student1,
h1.grade AS student1_grade,
h2.name AS student2,
h2.grade AS student2_grade
FROM likes l
LEFT JOIN highschooler h1
ON h1.id = l.id1
LEFT JOIN highschooler h2
ON h2.id = l.id2
WHERE h1.grade >= h2.grade + 2
Question 3 For every pair of students who both like each other, return the name and grade of both students. Include each pair only once, with the two names in alphabetical order.
Expected output
Cassandra | 9 | Gabriel | 9 |
Jessica | 11 | Kyle | 12 |
Solution 3 We'll do a self-join with the likes table to identify pairs of students that like each other.
/* LOGIC
- (in a subquery, w/ a self-join)
first select pairs of students
that like each other, by their ID numbers
- in the outer query, attach names and grades to these IDs
- filter duplicates / alphabetize in a WHERE statement
*/
SELECT
h1.name AS student1,
h1.grade AS student1_grade,
h2.name AS student2,
h2.grade AS student2_grade
FROM (
-- select pairs of students that like each other
SELECT
l.id1,
l.id2
FROM likes l
INNER JOIN likes as ll
ON ll.id1 = l.id2
AND ll.id2 = l.id1
) AS soln
LEFT JOIN highschooler h1
ON h1.id = soln.id1
LEFT JOIN highschooler h2
ON h2.id = soln.id2
WHERE h1.name < h2.name
ORDER BY student1
Question 4 For each student A who likes student B where the two are not friends, find if they have a friend C in common (who can introduce them!). For all such trios, return the name and grade of A, B, and C.
Expected output
Andrew | 10 | Cassandra | 9 | Gabriel | 9 |
Austin | 11 | Jordan | 12 | Andrew | 10 |
Austin | 11 | Jordan | 12 | Kyle | 11 |
Solution 4 We know who likes who (unidirectional relationship) from the likes table. We'll filter down these pairs through a series of joins.
-
For every pair in the likes table, we want to identify which pairs are not friends.
- We'll do this by left joining friends to likes on both like-pair IDs.
- There will be null values for friends columns associated with like-pairs that are not friends.
- For pairs A and B, we want to find out if A and B have a friend in common, and identify that friend.
- Finally, we attach names and grades to those ID numbers.
/* LOGIC
- identify which pairs in the likes table are not friends
- identify if any of these pairs have a friend in common
- attach names and grades to those ID numbers
*/
SELECT
h1.name, h1.grade,
h2.name, h2.grade,
h3.name, h3.grade
FROM (
SELECT
-- student A
l.id1 AS student_A,
-- student B
l.id2 AS student_B,
-- friends of student A
-- (will later filter such that these are also
-- friends of B)
(CASE WHEN l.id1 = f.id1
THEN f.id2 END) AS student_C
FROM likes l
-- table f: friends of student A
LEFT JOIN friend f
ON l.id1 = f.id1
-- table f2: identifies whether A & B are friends
-- (not friends if f2.id* is null)
LEFT JOIN friend f2
ON l.id1 = f2.id1 -- student A
AND l.id2 = f2.id2 -- student B
-- table f3: identifies student C
-- i.e., friends of B who are also friends of A
INNER JOIN friend f3
ON l.id2 = f3.id1 -- student B
AND f3.id2 = f.id2 -- B's friend
WHERE f2.id1 IS NULL -- filter 'like' pairs
-- that are not friends
) AS students
LEFT JOIN highschooler h1
ON student_A = h1.id
LEFT JOIN highschooler h2
ON student_B = h2.id
LEFT JOIN highschooler h3
ON student_C = h3.id
Question 5 Find the name and grade of all students who are liked by more than one other student.
Expected output
Cassandra | 9 |
Kris | 10 |
Solution 5 We'll aggregate the likes table and filter out those students that have less than two admirers.
SELECT h.name, h.grade
FROM (
SELECT id2
FROM likes
GROUP BY id2
HAVING COUNT(id1) > 1
) AS liked
LEFT JOIN highschooler h
ON h.id = liked.id2
Appendix: Code to create tables
I used the code below to populate the highschooler, friend, and likes tables in a PostgreSQL CoderPad sandbox.