Mon 12 Jun 2006
An SQL Puzzle
Posted at 2:40 +1000
A quick little SQL problem that I came across again recently. Read no further if you don't like databases.
This problem is probably well-known to many people and easy for people with lots of SQL experience. But it is the prelude to a post I will write in a couple of days about writing custom SQL in Django and is based on a real-world problem that came up on the Django mailing lists a month or so ago. It's kind of fun to think about, too, if you have not done so before, too.
Aside
A dilemma I am going to have fairly frequently here is how to juggle presenting some of the more technical things I want to discuss with keeping it interesting for a general audience. At least for now, I think I will have every entry appear on the front page, but any really specialised article will have the bulk of the content on another page. Hopefully this will mean that reading the front-page will still be of some interest even to people who don't have my precise set of interests.
The Setup
Suppose I have a many-to-many relationship modelled in some database tables like the following (think of this as a way of allocating students to multiple classes). I have omitted everything but the primary and foreign keys:
CREATE TABLE Class (
id integer NOT NULL PRIMARY KEY,
...
);
CREATE TABLE Student (
id integer NOT NULL PRIMARY KEY,
...
);
CREATE TABLE Reln_Class_Student (
class_id integer NOT NULL REFERENCES Class(id),
student_id integer NOT NULL REFERENCES Student(id)
);
The Problem
If I am given a list of students (in particular, a list of student ID values), how can I find all the classes that contain all of those students?
In case it is not clear: each class in the result can contain more students than just those in the list, but each class must contain all of the students in the list.
Justification
This is not an uncommon request. It could also be disguised as "find all blog posts containing all of these tags", "find books covering these subjects", "find the tennis players who have played all of these opponents" and so forth.
The goal here is not to just find a solution, but find a neat solution. A single SQL query suffices to extract the information we need. Efficiency would be nice, but not a real concern. This isn't going to scale to Flickr-like volumes of data in any case.
In a reasonable world, if I was going to post homework problems like this, I would have a commenting system in place so that people could post answers. This isn't a reasonable world. For now, you'll just have to write down your solution on a napkin and keep it around for a day or two. One day a commenting system will be rolled out here. Today is not that day.
Topics: software/databases, software/django