I need to return multiple value in a single SQL query on my postgresql database. Here is my query so far:
SELECT AVG("percent"), MIN("percent"), MAX("percent"), AVG("profit"), MIN("profit"), MAX("profit")
FROM public.log_analyticss
WHERE "buyPlatform" = 'platA' AND
"date" >= '1526356073.6126819'
Data
date sellPlatform profit percent
---------- ------------ ---------- ----------
1526356073.61 platA 0 10.1
1526356073.62 platA 22 11
1526356073.63 platA 3 7
1526356073.64 platA 1 8
1526356073.65 platA 11 9
1526356073.66 platA 12 10
1526356073.67 platA 13 15
Desired Result
date sellPlatform profit percent
---------- ------------ ---------- ----------
1526356073.61 platA 0 10.1 //MIN Profit
1526356073.62 platA 22 11 //MAX Profit
1526356073.63 platA 3 7 //MIN Perc
1526356073.67 platA 13 15 //MAX Perc
//Then somehow I want it return AVG as well if that is every possible. Otherwise, I don't mind running another query to do that.
The issue is that I don't just want the MIN and MAX values. I want the entire row of data from which the MIN and MAX value is taken.
I understand that I am asking for an AVG and a MIN/MAX value which is going to return data in two different formats. I know this is might not be possible. But any help on how to most efficiently do this would be extremely helpful.
Right now I am just pulling the entire dataset into my code and calculating the mean, min and max within my code which I know if very bad and also very very slow. The table has about 8 million rows and the dataset that I am grabbing is about 9000 rows so it is very slow the way I am doing it now.
The most efficient way to retrieve the row associated with the minimum / maximum generally doesn't involve the MIN()
/MAX()
aggregates at all; instead, you can just attach an ORDER BY
to your query, and add a LIMIT 1
to fetch only the first record.
This means you need four SELECT
statements with four different orderings, but you can factor out the expensive part (the fetch from log_analyticss
) into a temp table or a CTE, e.g.:
WITH Data AS (
SELECT *
FROM public.log_analyticss
WHERE "buyPlatform" = 'platA' AND
"date" >= '1526356073.6126819'
)
(SELECT 'Min percent', * FROM Data ORDER BY "percent" ASC LIMIT 1)
UNION ALL
(SELECT 'Max percent', * FROM Data ORDER BY "percent" DESC LIMIT 1)
UNION ALL
(SELECT 'Min profit', * FROM Data ORDER BY "profit" ASC LIMIT 1)
UNION ALL
(SELECT 'Max profit', * FROM Data ORDER BY "profit" DESC LIMIT 1)
In your case, a temp table might be better than a CTE, as you can re-use it to compute the averages.
Note that if one of these maximum/minimum values is shared by two different rows, this query would only return one of them. The chosen row is selected effectively at random, but you can attach more fields to the ORDER BY
clause to serve as a tie-breaker.
If you actually want both records in this case, you'll need something more like Auston's or Radim's answer, i.e. compute the aggregates first, then join back to the data on the profit
and percent
columns. You can still make use of a temp table / CTE here to avoid hitting log_analyticss
more than once.
I think the best way to do this is by two queries: The first retrieve the metrics, like you have done; The second query retrieve the sample registers.
Or you can try run over a temporary table (automatic dropped after end session):
CREATE TEMP TABLE statistics AS
SELECT AVG(percent) as perc_avg, MIN(percent) as perc_avg, MAX(percent) as perc_max, AVG(profit) as pro_avg, MIN(profit) as pro_min, MAX(profit) as pro_max
FROM public.log_analyticss
WHERE buyPlatform = 'platA' AND
sellPlatform = 'platB' AND
productId = '183948' AND
date >= '1526356073.6126819'
;
SELECT date, sellPlatform, profit, percent
FROM public.log_analyticss a join statistics s
on (a.profit = s.pro_max or a.profit = s.pro_min or
a.percent = s.perc_max or a.percent = s.perc_min)
WHERE buyPlatform = 'platA' AND
sellPlatform = 'platB' AND
productId = '183948' AND
date >= '1526356073.6126819';
Reference to temporary tables: http://www.postgresql.org/docs/9.2/static/sql-createtable.html
You need something along these lines:
SELECT a.*
FROM public.log_analyticss a
JOIN
(
SELECT
MIN("percent") min_percent,
MAX("percent") max_percent,
MIN("profit") min_profit,
MAX("profit") max_profit
FROM public.log_analyticss
) t ON a.date = t.date AND
a.sellPlatform = t.sellPlatform AND
(a.profit = minprofit OR
a.profit = maxprofit OR
a.percent = minpercent OR
a.percent = maxpercent)