My tables:
hourly_weather
meter | time_read  | temp
------+------------+-----
1     | 1316044800 | 55
1     | 1316138400 | 56
(...)

electrical_readings
meter | time       | kwh
------+------------+------
1     | 1316136250 | 19.24
1     | 1316044320 | 18.29
(...)
I want to retrieve two important values from this data:
1) The total kWh for a given day
2) The max temperature for that day
The query I'm using takes way too long to run, but I can't think of another way to do it: it needs several hours for 100,000 rows of data in each table.
SELECT * FROM (
SELECT * , SUM(kwh) AS sumkwh,
DATE( FROM_UNIXTIME( r.time_read ) ) AS datex,
UNIX_TIMESTAMP( DATE( FROM_UNIXTIME( r.time_read ) ) ) AS datey,
(
SELECT MAX( temp )
FROM hourly_weather hw
WHERE hw.meter = 1
AND time_read >= datey
AND time_read < datey + 86400
) AS temp
FROM electrical_readings r
WHERE id = 1
GROUP BY datex
) as t1
WHERE t1.temp != '';
SELECT DATE(FROM_UNIXTIME(r.time_read)) AS datex,
SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
ON DATE(FROM_UNIXTIME(r.time_read)) = DATE(FROM_UNIXTIME(hw.time_read))
AND hw.meter = 1
WHERE r.id = 1
GROUP BY datex
HAVING temp IS NOT NULL
This will still be a problem for performance, because the join condition uses expressions. An ordinary index on time_read cannot be used to evaluate them, so MySQL has to read every row of both tables and compute the expressions before it can tell whether the join is satisfied.
It would therefore be much better if you could add an extra column to both tables for the date (with no time) and index those columns.
ALTER TABLE electrical_readings ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE electrical_readings SET date_read = DATE(FROM_UNIXTIME(time_read));
ALTER TABLE hourly_weather ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE hourly_weather SET date_read = DATE(FROM_UNIXTIME(time_read));
SELECT r.date_read,
SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
ON r.date_read = hw.date_read
AND hw.meter = 1
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL
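To see why the stored date column helps, here is a minimal sketch that compares query plans for the two kinds of filter. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; SQLite's planner is not MySQL's, the index names and the `plan` helper are made up for illustration, and `date(time_read, 'unixepoch')` is SQLite's rough equivalent of `DATE(FROM_UNIXTIME(time_read))`. On MySQL you would check the same thing with EXPLAIN.

```python
import sqlite3


def plan(conn, sql):
    """Return SQLite's query-plan text for a statement (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE electrical_readings ("
    " id INTEGER, time_read INTEGER, kwh REAL, date_read TEXT)"
)
# Hypothetical index names, one per column we might filter on.
conn.execute("CREATE INDEX idx_time_read ON electrical_readings (time_read)")
conn.execute("CREATE INDEX idx_date_read ON electrical_readings (date_read)")

# Filter on an expression over time_read: the plain index on time_read
# does not help, so every row has to be visited.
expr_plan = plan(
    conn,
    "SELECT kwh FROM electrical_readings"
    " WHERE date(time_read, 'unixepoch') = '2011-09-15'",
)

# Filter on the stored date column: the index on date_read can be used.
col_plan = plan(
    conn,
    "SELECT kwh FROM electrical_readings WHERE date_read = '2011-09-15'",
)

print(expr_plan)
print(col_plan)
```

The first plan should report a full scan and the second a search through idx_date_read; the exact wording differs between SQLite versions.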
In any case, adding SELECT * to either of these queries is not a good idea: with GROUP BY, the values returned for columns that are neither grouped nor aggregated are arbitrary.
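Re your comment: sorry, the sum is multiplied by the number of matching rows in hourly_weather, because each electrical_readings row is joined to every weather row for the same day.
We can compensate by doing the aggregate for hourly_weather in a derived table subquery, so that at most one weather row per day takes part in the join.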
SELECT r.date_read,
SUM(r.kwh) AS sumkwh, hw.temp
FROM electrical_readings r
LEFT OUTER JOIN (
SELECT date_read, MAX(temp) AS temp
FROM hourly_weather
WHERE meter = 1
GROUP BY date_read) AS hw
ON r.date_read = hw.date_read
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL
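The row multiplication is easy to reproduce on a toy data set. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL, with made-up values (two readings and three weather rows for one day) and tables cut down to the columns the two queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE electrical_readings (id INTEGER, date_read TEXT, kwh REAL);
CREATE TABLE hourly_weather (meter INTEGER, date_read TEXT, temp INTEGER);
-- Made-up sample: two readings and three weather rows on the same day.
INSERT INTO electrical_readings VALUES
    (1, '2011-09-15', 10.0), (1, '2011-09-15', 5.0);
INSERT INTO hourly_weather VALUES
    (1, '2011-09-15', 55), (1, '2011-09-15', 56), (1, '2011-09-15', 54);
""")

# Join first, aggregate afterwards: each reading is repeated once per
# weather row, so the kWh sum is inflated by a factor of three.
naive = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), MAX(hw.temp)
    FROM electrical_readings r
    LEFT OUTER JOIN hourly_weather hw
      ON r.date_read = hw.date_read AND hw.meter = 1
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchall()

# Aggregate the weather in a derived table first: one weather row per day,
# so each reading is counted exactly once.
fixed = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), hw.temp
    FROM electrical_readings r
    LEFT OUTER JOIN (
        SELECT date_read, MAX(temp) AS temp
        FROM hourly_weather
        WHERE meter = 1
        GROUP BY date_read) AS hw
      ON r.date_read = hw.date_read
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchall()

print(naive)  # [('2011-09-15', 45.0, 56)]
print(fixed)  # [('2011-09-15', 15.0, 56)]
```

The true total is 15 kWh; the naive join reports 45 because each of the two readings is counted three times.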
It would be good to create an index on hourly_weather:
ALTER TABLE hourly_weather ADD KEY (date_read, meter, temp);
I think it would be simpler to calculate both values in separate queries and then join the resulting data sets. You can even define temporary variables and tables to make things easier:
# Temp variables for the dates
set @t0 = cast('2013-02-01' as date);
set @t1 = cast('2013-02-02' as date);
# Temporary table 1: Sum of KWH
create temporary table temp_sum_kw
select
    date(from_unixtime(time_read)) as `date`, sum(kwh) as sum_kwh
from
    electrical_readings
where
    time_read >= unix_timestamp(@t0)
    and time_read < unix_timestamp(date_add(@t1, interval +1 day))
group by
    date(from_unixtime(time_read));
alter table temp_sum_kw
add index idx_date(`date`);
# Temporary table 2: Max temp
create temporary table temp_max_temperature
select
    date(from_unixtime(time_read)) as `date`, max(temp) as max_temp
from
    hourly_weather
where
    # time_read holds a Unix timestamp, so convert the date bounds before comparing
    (time_read >= unix_timestamp(@t0)
     and time_read < unix_timestamp(date_add(@t1, interval +1 day)))
    and meter = 1
group by
    date(from_unixtime(time_read));
alter table temp_max_temperature
add index idx_date(`date`);
# Put it all together
select
m.*, t.max_temp
from
temp_sum_kw as m
inner join temp_max_temperature as t on m.`date` = t.`date`;
The reason for using a half-open range in the where condition, from the start of @t0 up to (but not including) the start of the day after @t1, is to include everything that happens until the last moment of @t1.
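As a quick check of that boundary logic, here is a small Python sketch that computes the same bounds as Unix timestamps. It assumes the timestamps are in UTC (MySQL's unix_timestamp() uses the session time zone, so your bounds may be shifted), and `day_range_utc` and `in_range` are names made up for this illustration.

```python
from datetime import date, datetime, timedelta, timezone


def day_range_utc(first_day, last_day):
    """Return (start, end) Unix timestamps covering first_day..last_day inclusive.

    start is midnight UTC on first_day; end is midnight UTC on the day
    after last_day and is meant to be used as an exclusive upper bound.
    """
    end_day = last_day + timedelta(days=1)
    start = datetime(first_day.year, first_day.month, first_day.day,
                     tzinfo=timezone.utc)
    end = datetime(end_day.year, end_day.month, end_day.day,
                   tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())


def in_range(ts, bounds):
    """Same test as: time_read >= start AND time_read < end."""
    start, end = bounds
    return start <= ts < end


bounds = day_range_utc(date(2011, 9, 15), date(2011, 9, 15))
print(bounds)                        # (1316044800, 1316131200)
print(in_range(1316131199, bounds))  # True: last second of the day
print(in_range(1316131200, bounds))  # False: first second of the next day
```

With a closed upper bound at midnight of @t1 itself, every reading taken during the last day would be dropped; with `<=` on the next midnight, one second of the following day would leak in.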
Hope this helps you