I currently have a large table (millions of rows) with two columns: timestamp (ms) and value.
I want to be able to generate CSV files in PHP using the most recent value of an instrument at fixed time increments. To provide an example:
Table: data
timestamp value
50 1
700 2
1500 3
2100 4
3100 5
3900 6
Assuming the fixed time increment is 1 second (1000 ms), the output should look like this:
timestamp value
1000 2
2000 3
3000 4
4000 6
I am currently issuing many queries of the following form:
SELECT * FROM `data` WHERE timestamp<=2000 ORDER BY `timestamp` DESC LIMIT 1
I then output each row to php://output and the user's browser downloads the file as it is created. However, this method of generating the data in the first place is turning out to be excessively slow given all the queries.
How can I speed up this process? I assume I should probably move most of the processing to MySQL to reduce the interprocess communication, but I'm not sure how to go about doing that.
You can also round up each timestamp and then get the maximum TimeStamp of each rounded result.
SELECT
a.RoundedTimeStamp,
t.Value
FROM(SELECT
CEIL(TimeStamp/1000)*1000 as RoundedTimeStamp,
MAX(TimeStamp) as TimeStamp
FROM tablename
GROUP BY CEIL(TimeStamp/1000)*1000
) a
JOIN tablename t
ON t.TimeStamp = a.TimeStamp
I don't know how performant this is going to be, because math and no indexes, but you can use a query like this:
select q1.nearest, q1.value
from
(select t,
if(mod(t, 1000), (floor(t/1000) + 1) * 1000, t) nearest,
if(mod(t, 1000), (floor(t/1000) + 1) * 1000, t) -t as diff,
value
from data
) q1
left join
(select t,
if(mod(t, 1000), (floor(t/1000) + 1) * 1000, t) nearest,
if(mod(t, 1000), (floor(t/1000) + 1) * 1000, t) -t as diff,
value
from data
) q2
on q1.nearest = q2.nearest
and q1.diff > q2.diff
where q2.diff is null
with demo fiddle here: http://sqlfiddle.com/#!9/5a199/13
Note this assumes no two timestamps will be the same.
You want the values on the largest timestamp equal to or less than multiples of the increment 1000. You can automate your process:
select 1000 * ceil(lt.timestamp / 1000) as timestamp, value
from data lt
where not exists (select 1
from data lt2
where lt2.timestamp > lt.timestamp and
lt2.timestamp <= 1000 * ceil(lt.timestamp / 1000)
);
An index on data(timestamp)
should be helpful.
You can't quite do your formulation in MySQL, because MySQL does not support limit
in these types of subqueries.