We have some properties whose owners and area may change over time. I want to show states (default and last state) of properties which state changed during time (default v last state differs = default v last row in table differs).
Having table states
:
stateID propertyID owner area timestamp stateYear
1 1 David Smith 20 123456789 2017
2 2 Amanda Green 74 123456799 2017
3 1 David Smith 19 123456999 2017
1 3 Amanda Green 12 123459999 2018
2 1 David Smith 20 123499999 2018
3 4 Jack Stone 62 123999999 2018
4 2 Amanda Green 73 129999999 2018
What is default state of each property is inserted in table properties
:
propertyID userID defaultStateID defaultStateYear
1 1 1 2017
2 1 2 2017
3 2 1 2018
4 3 3 2018
Here is the code I tried to use:
SELECT s.*
FROM properties p JOIN
(SELECT s2.propertyID, s2.owner as owner, s2.area as area, max(s2.timestamp) as maxtimestamp
FROM states s2
GROUP BY s2.propertyID
HAVING COUNT(DISTINCT owner) >= 2 OR/** where owner changed **/
COUNT(DISTINCT area) >= 2/** or where area changed **/
) sp
ON sp.propertyID = p.propertyID JOIN
states s
ON s.propertyID = p.propertyID AND
(/** default state + last state **/
(s.stateYear = p.defaultStateYear AND
s.stateId = p.defaultStateID) OR
s.timestamp = sp.maxtimestamp
)
WHERE p.userID = 1/** only properties of user 1 **/
ORDER BY sp.propertyID ASC, s.stateYear ASC, s.stateID ASC;
SQL Fiddle with my code.
What I expect to get is couple of states (default + last state) for each property whose states changed during time (default against last state) for specified user.
However, my code shows states even for properties where changes occurred in the meantime between default and last state (but default and last state are the same) - which I don't want to get.
Expected result:
stateID propertyID owner area timestamp stateYear
2 2 Amanda Green 74 123456799 2017
4 2 Amanda Green 73 129999999 2018
I'm not sure if that is correct, even though it returns the expected result (due to non complete sample data).
However - The following query will return the default state and last state but in in a single row per propertyID
:
SELECT p.propertyID
, ds.stateID as defaultStateId
, ds.stateYear as defaultStateYear
, ls.stateID as lastStateId
, ls.stateYear as lastStateYear
FROM properties p
JOIN states ds -- default state
ON ds.propertyID = p.propertyID
AND ds.stateID = p.defaultStateID
JOIN states ls -- last state
ON ls.propertyID = p.propertyID
AND ls.timestamp = (
SELECT MAX(s.timestamp)
FROM states s
WHERE s.propertyID = p.propertyID
)
WHERE p.userID = 1
http://sqlfiddle.com/#!9/5b3a1/21
Result:
| propertyID | defaultStateId | defaultOwner | defaultArea | defaultTimestamp | defaultStateYear | lastStateId | lastOwner | lastArea | lastTimestamp | lastStateYear |
|------------|----------------|--------------|-------------|------------------|------------------|-------------|--------------|----------|---------------|---------------|
| 1 | 1 | David Smith | 20 | 123456789 | 2017 | 2 | David Smith | 20 | 123499999 | 2018 |
| 2 | 2 | Amanda Green | 74 | 123456799 | 2017 | 4 | Amanda Green | 73 | 129999999 | 2018 |
Finding the row with default state is kind of trivial ds.stateID = p.defaultStateID
- but there should only be one corresponding entry.
For the "last" row we can use a SELECT MAX(s.timestamp)
subquery in the ON clause.
Now - having this - we can add additional filters in the WHERE clause.
Last state must be different from default state:
AND ls.stateID <> p.defaultStateID
Area and owner must be different in the two selected rows:
AND ls.area <> ds.area
AND ls.owner <> ds.owner
But note that this is not the same as "has changed over time". Because they might be equal in the two rows, but have another value in a row between them. And that will return no rows for your sample data. So I now need to guess your requerements. If you mean "Area or owner must be different" - then it would be
AND (ls.area <> ds.area OR ls.owner <> ds.owner)
Now it returns the expected result. But you might also just be lucky.
If you need the data in two separate rows, you can add another join to the query
JOIN states s
ON (s.stateID, s.stateYear) IN (
(ds.stateID, ds.stateYear),
(ls.stateID, ls.stateYear)
)
You should use the primary key here to identify the default and the last state. Another way for the join condition would be
JOIN states s
ON (s.stateID = ds.stateID AND s.stateYear = ds.stateYear)
OR (s.stateID = ls.stateID AND s.stateYear = ls.stateYear)
I'm not sure if MySQL is able to optimize one of of those conditions.
The final query might be something like
SELECT s.*
FROM properties p
JOIN states ds -- default state
ON ds.propertyID = p.propertyID
AND ds.stateID = p.defaultStateID
JOIN states ls -- last state
ON ls.propertyID = p.propertyID
AND ls.timestamp = (
SELECT MAX(s.timestamp)
FROM states s
WHERE s.propertyID = p.propertyID
)
JOIN states s
ON (s.stateID = ds.stateID AND s.stateYear = ds.stateYear)
OR (s.stateID = ls.stateID AND s.stateYear = ls.stateYear)
WHERE p.userID = 1
AND ls.stateID <> p.defaultStateID
AND (ls.area <> ds.area OR ls.owner <> ds.owner)
ORDER BY s.propertyID ASC, s.stateYear ASC, s.stateID ASC;
returning the expected result
| stateID | propertyID | owner | area | timestamp | stateYear |
|---------|------------|--------------|------|-----------|-----------|
| 2 | 2 | Amanda Green | 74 | 123456799 | 2017 |
| 4 | 2 | Amanda Green | 73 | 129999999 | 2018 |