So we have a Drupal site with nodes tagged in 4 vocabularies. In SQL terms, these are all simply joins on a base table. In simplified terms, it looks roughly like this:
SELECT a.name AS location, b.name AS sector, c.name AS tag, d.name AS status
FROM node n
LEFT JOIN a ON a.id = n.id
LEFT JOIN b ON b.id = n.id
LEFT JOIN c ON c.id = n.id
LEFT JOIN d ON d.id = n.id
WHERE n.type = 'X'
GROUP BY a.name, b.name, c.name, d.name
We have a listing page where you can filter by A, B, C and D; each has a list of all the values and an "All" option.
What we need is a list of distinct combinations of A, B, C and D which have results. This is to generate a sitemap and obviously, we don't want to guide bots to empty result pages.
I have tried using WITH ROLLUP with partial success. This gives me a/b/c/d
, a/b/c/*
, a/b/*/*
, a/*/*/*
and */*/*/*
. However it does not do */*/*/d
, */b/*/*
or */b/*/d
(along with all like-combinations).
Any suggestions on if this is possible using SQL with this approach? I wouldn't be aversed to using subqueries.
Our alternative theory is to loop through all the nodes and build an array in PHP with all the combinations that node appears under. Eg: if Node 1 was in a/b/c/d
, then our array would have the following added to it
[
'a/b/c/d',
'a/b/c/*',
'a/b/*/d',
'a/*/c/*',
'a/*/c/d',
'a/*/*/*',
'*/*/*/d',
'*/b/c/d',
'*/b/c/*',
'*/b/*/d',
'*/*/c/*',
'*/*/c/d',
'*/*/*/*',
'*/*/*/d',
'*/*/*/*',
]
(I think thats all the combinations).
The unique set of those, at the end, will be all available paths/options without any empty ones. I think.
This is obviously a bit of a brute-force approach. It feels like it would work, but it feels less elegant than doing it with SQL.
EDIT: This doesn't work, I'm pretty sure it's not all the results
Hmm... just tried this out and it looks like you might get all possible results by using 2 queries, grouping in each direction:
ie GROUP BY a.name, b.name, c.name, d.name
then GROUP BY d.name, c.name, b.name, a.name
But then i tried adding both those grouping directions in one query and it looks quite promising:
ie GROUP BY a.name, b.name, c.name, d.name, c.name, b.name, a.name
I think this works but there could be a better way.
I'm using 4 queries to get all results, like this:
SELECT a.name AS location, b.name AS sector, c.name AS tag, d.name AS status
FROM node n
LEFT JOIN a ON a.id = n.id
LEFT JOIN b ON b.id = n.id
LEFT JOIN c ON c.id = n.id
LEFT JOIN d ON d.id = n.id
WHERE n.type = 'X'
GROUP BY a.name, b.name, c.name, d.name
SELECT b.name AS sector, c.name AS tag, d.name AS status
FROM node n
LEFT JOIN b ON b.id = n.id
LEFT JOIN c ON c.id = n.id
LEFT JOIN d ON d.id = n.id
WHERE n.type = 'X'
GROUP BY b.name, c.name, d.name
SELECT c.name AS tag, d.name AS status
FROM node n
LEFT JOIN c ON c.id = n.id
LEFT JOIN d ON d.id = n.id
WHERE n.type = 'X'
GROUP BY c.name, d.name
SELECT d.name AS status
FROM node n
LEFT JOIN d ON d.id = n.id
WHERE n.type = 'X'
GROUP BY d.name
So the first query returns all results where all 4 categories has a value, ie a/b/c/d and the roll up adds them for a/b/c/-, a/b/-/-, a/-/-/- (the the hyphens are for All filter value, asterisks vanish when i type them for some reason)
Then the second query returns everything for */b/c/d and the roll up adds -/b/c/-, -/b/-/-
The third returns -/-/c/d and the roll up adds -/-/c/-
then the fourth adds the -/-/-/d and -/-/-/-
Which i think is everything