There is a project that I've to work on, the main transaction table and some sub transaction tables called provider. Each provider has its own table. The main table just keeps amount (as sub ones keep too), date and some essential data, also reference id of subtable. I want to join sub tables by based on provider id. If things can go messy, I can keep table names as an associative array. What makes me confused is each provider's table has a different primary key name.
Provider tables are pretty much identical excepts some columns. What I really try to achieve is performing a search in all of these 3 tables as one.
One other question, is this some silly idea, if so which approach would be better? Daily 400-500 records are expected. Also note, more provider tables can be added in future. This structure is designed by someone more experienced than me, I couldn't convince anyone this is bad.
Transaction
+-----+-----+-----+-----+
| id | ref | prv | date|
+-----+-----+-----+-----+
| 1 | 4 | 2 | .. |
+-----+-----+-----+-----+
| 2 | 4 | 3 | .. |
+-----+-----+-----+-----+
| 3 | 5 | 2 | .. |
+-----+-----+-----+-----+
| 4 | 7 | 1 | .. |
+-----+-----+-----+-----+
| 5 | 22 | 3 | .. |
+-----+-----+-----+-----+
Providers (prv value)
+-----+---------------+-----+
| pID | providerName | .. |
+-----+---------------+-----+
| 1 | providerA | .. |
+-----+---------------+-----+
| 2 | providerB | .. |
+-----+---------------+-----+
| 3 | providerC | .. |
+-----+---------------+-----+
p_providerA (ref value)
+-----+--------+------+-----+
| aID | amount | name | .. |
+-----+--------+------+-----+
| 1 | 90.20 | alf | .. |
+-----+--------+------+-----+
| 2 | 70.00 |willie| .. |
+-----+--------+------+-----+
| 3 | 43.10 | kate | .. |
+-----+--------+------+-----+
p_providerB (ref value)
+-----+--------+------+-----+
| bID | amount | name | .. |
+-----+--------+------+-----+
| 3 | 65.20 | jane | .. |
+-----+--------+------+-----+
| 4 | 72.00 | al | .. |
+-----+--------+------+-----+
| 5 | 84.10 | bundy| .. |
+-----+--------+------+-----+
p_providerC (ref value)
+-----+--------+------+-----+
| bID | amount | name | .. |
+-----+--------+------+-----+
| 3 | 10.20 | mike | .. |
+-----+--------+------+-----+
| 4 | 40.00 | kitt | .. |
+-----+--------+------+-----+
| 6 | 27.60 | devon| .. |
+-----+--------+------+-----+
Expected Result
+-----+-----+-----+-----+----+--------+------+-----+
| id | ref | prv | date| | | | |
+-----+-----+-----+-----+----+--------+------+-----+
| 1 | 4 | 2 | .. | 4 | 72.00 | al | .. | (from prv. b)
+-----+-----+-----+-----+----+--------+------+-----+
| 2 | 4 | 3 | .. | 4 | 40.00 | kitt | .. | (from prv. c)
+-----+-----+-----+-----+----+--------+------+-----+
Given the current table design, one of the ways to get the desired result is to "break down" the Transaction table into separate queries, and combine those with a UNION ALL
The rows from Transaction
table could be returned like this:
SELECT t.* FROM Transaction t WHERE t.prv = 1
UNION ALL
SELECT t.* FROM Transaction t WHERE t.prv = 2
UNION ALL
SELECT t.* FROM Transaction t WHERE t.prv = 3
UNION ALL
...
Now each of those SELECT can implement a join to the appropriate provider table
SELECT t.*, pa.amount, pa.name
FROM Transaction t
JOIN p_providerA pa ON pa.aid = t.ref
WHERE t.prv = 1
UNION ALL
SELECT t.*, pb.amount, pb.name
FROM Transaction t
JOIN p_providerB pb ON pb.bid = t.ref
WHERE t.prv = 2
UNION ALL
...
The other option is almost equally ugly
SELECT t.*
, CASE t.prv
WHEN 1 THEN pa.amount
WHEN 2 THEN pb.amount
WHEN 3 THEN pc.amount
END AS `p_amount`
, CASE t.prv
WHEN 1 THEN pa.name
WHEN 2 THEN pb.name
WHEN 3 THEN pc.name
END AS `p_name`
FROM Transaction t
LEFT JOIN p_providerA pa ON pa.aid = t.ref AND t.prv = 1
LEFT JOIN p_providerB pb ON pb.bid = t.ref AND t.prv = 2
LEFT JOIN p_providerC pc ON pc.cid = t.ref AND t.prv = 3
Bottom line... there's no way to dynamically use of the Providers
table in a single query. We could make use of that information in a pre-query, to get back a resultset that helps us create the statement we need to run.
Another option (if the p_providerX tables aren't too large) would be to concatenate all of those together in an inline view, and the join to that. (This could be expensive for large sets; the derived table might get an index created on it...)
SELECT t.*
, p.amount AS p_amount
, p.name AS p_name
FROM `Transaction` t
JOIN (
SELECT 1 AS pID, pa.aid AS rID, pa.amount, pa.name FROM p_providerA
UNION ALL
SELECT 2 , pb.bid , pb.amount, pb.name FROM p_providerB
UNION ALL
SELECT 3 , pc.cid , pc.amount, pc.name FROM p_providerC
UNION ALL
...
) p
ON p.pID = t.pID
AND p.rID = t.ref
If we are going to be repeatedly running queries like that, we could materialize that inline view into a table... I'm just guessing at the datatypes here...
CREATE TABLE p_provider
( pID BIGINT UNSIGNED NOT NULL
, rID BIGINT UNSIGNED NOT NULL
, amount DECIMAL(20,2)
, name VARCHAR(255)
, PRIMARY KEY (pID,id)
);
INSERT INTO p_provider (pID, rID, amount, name)
SELECT 1 AS pID, pa.aid AS rID, pa.amount, pa.name FROM p_providerA
;
INSERT INTO p_provider (pID, rID, amount, name)
SELECT 2 AS pID, pb.aid AS rID, pb.amount, pb.name FROM p_providerB
;
INSERT INTO p_provider (pID, rID, amount, name)
SELECT 3 AS pID, pc.aid AS rID, pc.amount, pc.name FROM p_providerC
;
...
And then reference the new table
SELECT ...
FROM `Transaction` t
JOIN `p_provider` p
ON p.piD = t.pID
AND p.rID = t.ref
Of course that new p_provider
table is going to be out-of-sync when changes are made to p_providerA
, p_providerB
, et al.