I have a query:
SELECT users.name,
COUNT(DISTINCT leads.id) as leads,
COUNT(DISTINCT properties.id) as properties,
COUNT(DISTINCT buyers.id) as buyers,
null
FROM users
LEFT JOIN (SELECT * from leads WHERE date(created_at) BETWEEN '2019-04-13' AND'2019-04-13') leads ON users.id = leads.user_id
LEFT JOIN (SELECT * from properties WHERE date(contract_start_date) BETWEEN '2019-04-13' AND'2019-04-13') as properties ON users.id = properties.user_id
LEFT JOIN (SELECT * from buyers WHERE date(created_at) BETWEEN '2019-04-13' AND'2019-04-13') as buyers ON users.id = buyers.user_id
LEFT JOIN buyer_searches ON buyers.id = buyer_searches.buyer_id
GROUP BY users.id
UNION
SELECT users.name,
null, null, null,
COUNT(DISTINCT buyer_searches.id) as buyer_searches
FROM users
LEFT JOIN buyers ON users.id = buyers.user_id
LEFT JOIN (SELECT * from buyer_searches WHERE date(created_at) BETWEEN '2019-04-08' AND'2019-04-13') as buyer_searches ON buyers.id = buyer_searches.buyer_id
GROUP BY users.id
I get a table with duplicate users of course, but with the correct params:
users field1 field2 field3 etc....
user1 data . null . null etc...
user2 data . null . null etc...
user1 null . data . data etc...
user2 null . data .data etc...
etc...
I would like concatenate the rows by users.
Use subqueries in the SELECT
for this:
SELECT u.name,
(SELECT COUNT(*)
FROM leads l
WHERE l.user_id = u.id AND
l.created_at >= '2019-04-13' AND
l.created_at < '2019-04-14'
) as num_leads,
(SELECT COUNT(*)
FROM properties p
WHERE p.user_id = u.id AND
p.created_at >= '2019-04-13' AND
p.created_at < '2019-04-14'
) as num_buyers,
(SELECT COUNT(*)
FROM buyers b
WHERE b.user_id = u.id AND
b.created_at >= '2019-04-13' AND
b.created_at < '2019-04-14'
) as num_buyers,
(SELECT COUNT(*)
FROM buyers b JOIN
buyer_searches bs
ON b2.buyer_id = b.id AND
WHERE b.user_id = u.id AND
bs.created_at >= '2019-04-13' AND
bs.created_at < '2019-04-14'
) as num_buyer_searches
FROM users u;
By JOIN
ing the tables together you are creating a (potentially) vast Cartesian product for each user. Then you are sifting through this to count distinct ids. It is more efficient to just count them directly.
This should have one row for each user.
Note that I've changed the date comparisons. Functions on columns almost always preclude the use of indexes. This version of the queries can make use of index on the tables -- such as leads(user_id, created_at)
-- if they are available.