Let's say I have three tables (MySQL): one for users, one for tags, and one joining them (a many-to-many relation):
create table user_tag(
user_id int unsigned not null,
tag_id int unsigned not null,
primary key (user_id, tag_id)
);
I want to fetch a complete list of users (or at least many of them) from the database, along with their associated tags. I use PHP on the server side.
So my question is, is it better to perform one SQL query fetching all the information like this:
select user.name, user.image, ..., tag.name from user, tag, user_tag
where user.user_id = user_tag.user_id and tag.tag_id = user_tag.tag_id;
or is it better to perform one first query for the users and afterwards fetch their tags:
select user.name, user.image, ... from user;
and then for each user:
select tag.name from tag, user_tag where tag.tag_id = user_tag.tag_id
and user_tag.user_id = $USERID;
I feel like the second is the better option, but I am afraid it could mean too many queries for the database (note that this is a general design example; the same pattern can appear many times across the database with different tables).
Which is better? Pros and Cons? Other ways?
Thanks
PS: Please don't take the SQL syntax into account. I haven't checked it against a real database; this is only a design question. Thanks.
I would only consider two options:
One query:
SELECT user.user_id, user.name, user.image, ..., tag.tag_id, tag.name
FROM user
LEFT JOIN user_tag ON user.user_id = user_tag.user_id
LEFT JOIN tag ON user_tag.tag_id=tag.tag_id
Two queries:
SELECT user_id, name, image, ...
FROM user;
SELECT user_tag.user_id, tag.tag_id, tag.name
FROM user_tag
INNER JOIN tag ON user_tag.tag_id=tag.tag_id;
If you have 100 users, there's no point in issuing an identical query 100 times.
I normally go for #1 when I'm only fetching a few columns and I go for #2 otherwise. In this case, you seem to be retrieving a pretty complete user profile so #2 sounds good.
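For what it's worth, here is a minimal sketch of how the two result sets from #2 might be stitched together in application code. It is written in Python with an in-memory SQLite database purely as a stand-in for PHP and MySQL; the sample rows and the `users` dictionary layout are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT, image TEXT);
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tag (user_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
                           PRIMARY KEY (user_id, tag_id));
    INSERT INTO user VALUES (1, 'alice', 'a.png'), (2, 'bob', 'b.png'),
                            (3, 'carol', 'c.png');
    INSERT INTO tag VALUES (10, 'php'), (11, 'mysql');
    INSERT INTO user_tag VALUES (1, 10), (1, 11), (2, 11);
""")

# Query 1: every user exactly once, with the profile columns.
users = {
    user_id: {"name": name, "image": image, "tags": []}
    for user_id, name, image in conn.execute(
        "SELECT user_id, name, image FROM user"
    )
}

# Query 2: every (user, tag) pair; attach each tag to its user in memory.
pairs = conn.execute("""
    SELECT user_tag.user_id, tag.tag_id, tag.name
    FROM user_tag
    INNER JOIN tag ON user_tag.tag_id = tag.tag_id
""")
for user_id, tag_id, tag_name in pairs:
    users[user_id]["tags"].append(tag_name)

print(users)
```

The profile columns travel over the wire once per user instead of once per tag, which is why this shape tends to suit wide user rows.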
This really depends on many specifics: database size, data size, usage patterns, number of concurrent queries, and so on.
I suggest that you measure the performance of each method and use the one that best fits your requirements.
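A rough way to try this out could look like the sketch below. It uses Python and an in-memory SQLite database as stand-ins for PHP and MySQL, with invented data (200 users, 5 tags); the function names `per_user_queries`, `single_join` and `timed` are hypothetical helpers. In-memory SQLite has no network round trip, so the numbers will understate the per-query cost you would see against a real MySQL server.

```python
import sqlite3
import time

# Assumption: in-memory SQLite stands in for MySQL; all data is generated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tag (user_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
                           PRIMARY KEY (user_id, tag_id));
""")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [(u, f"user{u}") for u in range(1, 201)])
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [(t, f"tag{t}") for t in range(1, 6)])
conn.executemany("INSERT INTO user_tag VALUES (?, ?)",
                 [(u, t) for u in range(1, 201) for t in range(1, 6)
                  if (u + t) % 2 == 0])

def per_user_queries(conn):
    """One query for the users, then one tag query per user."""
    result = {}
    for (user_id,) in conn.execute("SELECT user_id FROM user").fetchall():
        tags = conn.execute(
            "SELECT tag.name FROM tag "
            "JOIN user_tag ON tag.tag_id = user_tag.tag_id "
            "WHERE user_tag.user_id = ? ORDER BY tag.name",
            (user_id,),
        ).fetchall()
        result[user_id] = [name for (name,) in tags]
    return result

def single_join(conn):
    """One joined query, grouped by user in application code."""
    result = {}
    rows = conn.execute("""
        SELECT user.user_id, tag.name
        FROM user
        LEFT JOIN user_tag ON user.user_id = user_tag.user_id
        LEFT JOIN tag ON user_tag.tag_id = tag.tag_id
        ORDER BY user.user_id, tag.name
    """)
    for user_id, tag_name in rows:
        tags = result.setdefault(user_id, [])
        if tag_name is not None:
            tags.append(tag_name)
    return result

def timed(fn, repeat=5):
    """Best wall-clock time of several runs, in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(conn)
        best = min(best, time.perf_counter() - start)
    return best

print(f"per-user queries: {timed(per_user_queries):.4f}s")
print(f"single join:      {timed(single_join):.4f}s")
```

Both functions build the same mapping, so the comparison is about how the data is fetched rather than what is fetched.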
I would use a single query 99% of the time. If you have all of the relevant indexes in place, IMO it's far quicker to send a single query: quicker in terms of coding and server response, and it also makes things a little easier to read. You can understand the database structure because you can see the joins, whereas separate queries could use different aliases, which might look confusing.
There's no easy answer to this, as much will depend on your architecture. The best way to find out would be to benchmark the two solutions in your architecture. However, given the choice between one query and many, as long as your tables are indexed correctly I would go with the one-query solution. That way you will save the overhead of the database connection and file reads.
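If you do go with one query, the rows come back with the user columns repeated once per tag, so the application has to regroup them. A minimal sketch of that step, assuming Python and in-memory SQLite as stand-ins for PHP and MySQL and using invented sample rows:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tag (user_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
                           PRIMARY KEY (user_id, tag_id));
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO tag VALUES (10, 'php'), (11, 'mysql');
    INSERT INTO user_tag VALUES (1, 10), (1, 11), (2, 11);
""")

# LEFT JOINs keep users that have no tags; their tag column comes back NULL.
rows = conn.execute("""
    SELECT user.user_id, user.name, tag.name
    FROM user
    LEFT JOIN user_tag ON user.user_id = user_tag.user_id
    LEFT JOIN tag ON user_tag.tag_id = tag.tag_id
    ORDER BY user.user_id, tag.name
""").fetchall()

# Collapse the repeated user columns into one entry per user.
users = {}
for user_id, user_name, tag_name in rows:
    entry = users.setdefault(user_id, {"name": user_name, "tags": []})
    if tag_name is not None:  # skip the NULL row of a tagless user
        entry["tags"].append(tag_name)

print(users)
```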
More options:
Option 3: Two queries. First fetch all users:
select user.user_id, user.name, user.image, ...
from user
ORDER BY user_id;
and then get tags for all users (and combine the 2 queries in PHP):
select user_tag.user_id, tag.name
from tag
JOIN user_tag
ON tag.tag_id = user_tag.tag_id
ORDER BY user_tag.user_id
, tag.name;
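The combining step could be sketched roughly as follows. This is Python with in-memory SQLite standing in for PHP and MySQL, and the sample rows are invented. The tag query here has no GROUP BY, so every (user, tag) row is returned, and it is sorted by `user_tag.user_id`, which is what lets `itertools.groupby` split it into per-user runs.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Assumption: in-memory SQLite stands in for MySQL; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tag (user_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
                           PRIMARY KEY (user_id, tag_id));
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO tag VALUES (10, 'php'), (11, 'mysql');
    INSERT INTO user_tag VALUES (1, 10), (1, 11), (2, 11);
""")

user_rows = conn.execute(
    "SELECT user_id, name FROM user ORDER BY user_id"
).fetchall()

tag_rows = conn.execute("""
    SELECT user_tag.user_id, tag.name
    FROM tag
    JOIN user_tag ON tag.tag_id = user_tag.tag_id
    ORDER BY user_tag.user_id, tag.name
""").fetchall()

# groupby only merges adjacent rows, so it relies on the ORDER BY above.
tags_by_user = {
    user_id: [name for _, name in group]
    for user_id, group in groupby(tag_rows, key=itemgetter(0))
}

combined = [
    {"user_id": user_id, "name": name, "tags": tags_by_user.get(user_id, [])}
    for user_id, name in user_rows
]
print(combined)
```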
Option 4: Get everything in one query, one row per user:
SELECT user.name, user.image, ...
, GROUP_CONCAT(tag.name ORDER BY tag.name SEPARATOR ' ') AS tags
FROM user
LEFT JOIN user_tag
ON user_tag.user_id = user.user_id
LEFT JOIN tag
ON tag.tag_id = user_tag.tag_id
GROUP BY user.user_id
ORDER BY user.user_id;
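On the application side, the concatenated column then has to be split back into a list. A minimal sketch, again using Python and in-memory SQLite as stand-ins for PHP and MySQL with invented sample rows. One difference to be aware of: SQLite's `group_concat` takes the separator as a second argument rather than MySQL's `SEPARATOR` keyword, so the tags are sorted in Python here instead of inside the aggregate.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tag (user_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
                           PRIMARY KEY (user_id, tag_id));
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO tag VALUES (10, 'php'), (11, 'mysql');
    INSERT INTO user_tag VALUES (1, 10), (1, 11), (2, 11);
""")

# One row per user; users without tags get NULL in the tags column.
rows = conn.execute("""
    SELECT user.user_id, user.name, GROUP_CONCAT(tag.name, ' ') AS tags
    FROM user
    LEFT JOIN user_tag ON user_tag.user_id = user.user_id
    LEFT JOIN tag ON tag.tag_id = user_tag.tag_id
    GROUP BY user.user_id
    ORDER BY user.user_id
""").fetchall()

# Split the space-separated string back into a sorted list of tag names.
users = [
    {"user_id": user_id, "name": name,
     "tags": sorted(tags.split(" ")) if tags else []}
    for user_id, name, tags in rows
]
print(users)
```

This only works if the separator can never occur inside a tag name. In MySQL the concatenated string is also truncated at `group_concat_max_len` (1024 by default), so users with many tags may need that limit raised.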