I have a product_table
with a product_id
among several other fields.
I am creating a tag_table
with 2 fields, tag_id
and tag_name
.
The tags could be anything like "silver" or "penn state" etc.
I am also creating a product_tag_map
table with product_id
and tag_id
in order to map a product to any number of tags.
If I wanted to create a category that contained all products tagged as "silver", "necklace", "diamond". But also exclude any product tagged as "initial", "journey"
(Obviously I would use the tag_id not tag_name, so products with tags [2,4,5] that do not have tags [3,6])
How could a create a temporary product table and populate it with the matching products?
here is my product_tag_map table:
CREATE TABLE `product_tag_map` (
`product_tag_map_id` int(11) NOT NULL auto_increment,
`tag_id` int(11) NOT NULL,
`product_id` int(11) NOT NULL,
PRIMARY KEY (`product_tag_map_id`),
UNIQUE KEY `tag_id` (`tag_id`,`product_id`),
KEY `tag_id_2` (`tag_id`)
) ENGINE=MyISAM AUTO_INCREMENT=7897 DEFAULT CHARSET=utf8
Note: I am not using product_tag_map_id
for anything. I am just in the habit of giving each table a primary key like so. So If I should remove it thats fine.
The question is, what would you need a temp table for? Why not query it directly?
SELECT
p.product_id,
p.product_name
FROM
product_table p
WHERE
EXISTS (
SELECT 1
FROM product_tag_map
WHERE product_id = p.product_id AND tag_id IN (2,4,5)
)
AND NOT EXISTS (
SELECT 1
FROM product_tag_map
WHERE product_id = p.product_id AND tag_id IN (3,6)
)
Create appropriate indexes (one multi-column index over (product_tag_map.product_id, product_tag_map.tag_id)
and one separate over (product_tag_map.tag_id)
, in addition to the "normal" PK/FK indexes) and this should be plenty fast.
EDIT: A cacheable (as far as query plans go) and more dynamic variant of the above would be:
Create a user_searches
table (search_session_id, tag_id, include)
with a multi_column index over (search_session_id, include)
and a separate index over tag_id
. Then fill it as the user selects criteria:
search_session_id tag_id include
...
4711 2 1
4711 4 1
4711 5 1
4711 3 0
4711 6 0
...
And query like this:
SELECT
p.product_id,
p.product_name
FROM
product_table p
WHERE
EXISTS (
SELECT 1
FROM product_tag_map m INNER JOIN user_searches s ON s.tag_id = m.tag_id
WHERE m.product_id = p.product_id
AND s.search_session_id = 4711 /* this should be a parameter */
AND s.include = 1
)
AND NOT EXISTS (
SELECT 1
FROM product_tag_map m INNER JOIN user_searches s ON s.tag_id = m.tag_id
WHERE m.product_id = p.product_id
AND s.search_session_id = 4711 /* this should be a parameter */
AND s.include = 0
)
You can create a view on a select statement.
(A view is a basically a virtual table that you can query easily but that is popuplated with data from a complex statement. Read the manual, its not so trivial when it comes to performance and read/write behaviour.)
However your query could look like this:
SELECT
product_id, count(*) as total
FROM tag_map WHERE tag_id IN (2,4,5)
AND total = 3
AND product_id NOT IN (
SELECT product_id, count(*) as total FROM tag_map WHERE tag_id IN (3,6)
WHERE total = 2 GROUP BY ( product_id )
)
GROUP BY ( product_id )
you can also do joins but i think it will be slower.