I have 80k plus customers and having 4 groups. Now I want to find 2 groups users with a query in mysql. My query is like below:
select c.customers_firstname as recipient_firstname,
c.customers_lastname as recipient_lastname,
c.customers_id as recipient_id,
c.customers_email_address as recipient_email_address
from customers c
where customers_group_id = '1' OR customers_group_id = '3'
When I run this query in phpmyadmin I got the result : Showing rows 0 - 29 ( 59,815 total, Query took 0.0034 sec)
But when I added order by ORDER BY recipient_firstname ASC
in this query the result time is : Showing rows 0 - 29 ( 59,815 total, Query took 0.2607 sec)
the order by query is taking too much time for the result.
I want to reduce the time of order by query.
Please help if there is another way to get the same result in less time.
You need an index on the recipient_firstname
field (so really customers.customers_firstname). An index allows for an ordered, linear time iteration over the result set.
If you don't have an index, the result set must be aggregated and then sorted. This sorting is going to be n log n
. That's obviously pretty slow for large sets, and if it can't fit into memory (and 60k records might not depending on configuration), it's going to do a very slow file based sort.
tl;dr You need an index. An index on recipient_firstname
will make the query extremely close in performance to the non ORDER BY
version.
By the way, if customers_group_id is an integral field, use integer literals, not strings. It likely won't make a difference, but it's misleading, and there are actually a few situations where it matters.
Depending on the situation, it's probably also worth putting an index on the group id. For small sets, the results can just be filtered as the set is built, but for large result sets, that will end up requiring a rather disk heavy full table scan.
You need to create index on column on which order by clause is applied.
CREATE INDEX index_name ON customers (customers_firstname);
Try to create index (customers_group_id, customers_firstname)
- this should work.
You have to index on the customers_firstname
field: this will speed up the ORDER BY
, but will also slow down the WHERE
(which is probably indexed now).
So the index must be customers_group_id, customers_firstname
in this order.
CREATE INDEX my_query_ndx
ON customers ( customers_group_id, customers_firstname );
In theory you might enlarge the index to be a covering index and contain, after the two key fields, all other fields you require in the SELECT
. Maintaining this kind of index is expensive, though; you'll have to balance advantages and drawbacks. If the table is very "wide", it might be advantageous to index on group id, firstname, lastname, id and email.
where customers_group_id = '1' OR customers_group_id = '3'
This can be rewritten for clarity (it changes nothing) as
WHERE customers_group_id IN ('1','3')
But now, either customer_group_id
is an integer field, or it isn't. If it is, then it's better to treat is as such:
WHERE customers_group_id IN (1, 3)
In some cases, you can plan ahead your IDs so that for example group 3 is actually group 2, i.e., the groups you might be interested in are contiguous. That way, you can rewrite the query as variable < value
or variable > value
or variable BETWEEN
, which is twice as fast as an OR
. With large OR
sets you can get 4x speedups easily.
If it is not an integer field, then by all means strive to make it one. Integer performance (and index size) will benefit greatly (note, however, that with strings, '3' is greater than '12', just as 'C' is greater than 'AB'; so, type conversion is not necessarily without side effects).