I am using PHP and MYSQL to graph call concurenncy from an Asterisk CDR database,
I currently use the following prepared statement:
$query=$cdrdb->prepare('select count(acctid) from cdr where calldate between ? and ? or DATE_ADD(calldate, INTERVAL duration SECOND) between ? and ?');
and then the following foreach loop to enter the variables:
foreach ($timerange as $startdatetime){
$start=$startdatetime->format("Y-m-d H:i:s");
$enddatetime=new DateTime($start);
$enddatetime->Add($interval);
$end=$enddatetime->format("Y-m-d H:i:s");
if(!$query->execute(array($start, $end, $start, $end))){
echo "Execute failed: (" . $stmt->errno . ") " . $stmt->error;
}
if (!($res = $query->fetchall())) {
echo "Getting result set failed: ";
}
array_push($callsperinterval,$res[0][0]);
}
Timerange can be every hour for a day, every day for a month or every week for a year.
the calldate column is marked as an index column.
The table currently holds 122000 records.
the result of running an EXPLAIN on the query:
mysql> explain select count(acctid) from cdr where calldate between '2014-10-02 23:30:00' and '2014-11-03 00:00:00' or DATE_ADD(calldate, INTERVAL duration SECOND) between '2014-10-02 23:30:00' and '2014-11-03 00:00:00';
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | cdr | ALL | calldate | NULL | NULL | NULL | 123152 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
A single run of the query takes around 0.14s so for a 24 hour period with an hourly interval the script should finish in about 3.36 seconds, but it ends up taking about 12 seconds
Currently the whole process can take up to 20 seconds to run for a 24 hour period,can anyone please help me to improve the speed of this query?
This part is the bottleneck in your query:
DATE_ADD(calldate, INTERVAL duration SECOND)
This is because MySQL is performing "math" on each row of the first subset determined from your first WHERE
condition every row on your entire table that didn't match the first part of your WHERE
statement since you are using WHERE OR
, not WHERE AND
.
I assumed your table looks something a little like:
acctid | calldate | duration
========================================
1 | 2014-12-01 17:55:00 | 300
... etc.
Consider rewriting your schema such that you are not using intervals that MySQL must calculate for each row, but full DateTime columns that MySQL can perform immediate comparisons on:
acctid | calldate | duration_end
==================================================
1 | 2014-12-01 17:55:00 | 2014-12-01 18:00:00
To rewrite this schema, you can make that new column and then do (this may take a while to process but will serve you well in the long run):
UPDATE cdr SET duration_end = DATE_ADD(calldate, INTERVAL duration SECOND);
Then scrap the duration
column and rewrite your application to save into the new column!
Your resulting query will be:
select count(acctid) from cdr where calldate > ? and (calldate < ? or duration_end between ? and ?)
Assuming that nothing can change in the schema, then you're stuck with that function. However, you can try having MySQL work with subsets so that it's not doing math on so many rows:
select
count(acctid)
from
cdr
where
calldate > ? and
(calldate < ? or DATE_ADD(calldate, INTERVAL duration SECOND) between ? and ?)
I can't guarantee much of a performance increase from this solution although it may be a noticeable one depending on your data set.
For asterisk cdrs you can just do like this
Let say you used:
$query=$cdrdb->prepare('select count(acctid) from cdr where calldate between ? and ? or DATE_ADD(calldate, INTERVAL duration SECOND) between ? and ?');
$query->execute(array($start, $end, $start, $end))
You have use like this
$query=$cdrdb->prepare('select count(acctid) from cdr where calldate between ? and DATE_ADD(?, interval ? SECOND) and (calldate between ? and ? or DATE_ADD(calldate, INTERVAL duration SECOND) between ? and ?)
');
$MAX_CALL_LENGHT_POSIBLE = 60*60*10; # usualy 10 hr is not reachable on most calls. If you limit it in call, you can decrease to even less values
$query->execute(array($start, $end,$MAX_CALL_LENGHT_POSIBLE,$start,$end $start, $end))
So just first limit query to interval where that stop_time can be.
But much simple will be add column call_end_time and create trigger
DROP TRIGGER IF EXISTS cdr_insert_trigger;
DELIMITER //
CREATE TRIGGER cdr_insert_trigger BEFORE INSERT ON cdr
FOR EACH ROW BEGIN
Set NEW.call_end_time=DATE_ADD(OLD.calldate,interval OLD.duration second);
END//
DELIMITER ;
Sure you need create index on BOTH calldate and call_end_time column and use Union instead of OR(otherwise one part will not use index)
If disk space is less important than speed, try:
ALTER TABLE cdr ROW_FORMAT = FIXED;