AVG的最后三个条目按日期

Table

I have this table below my goal is to take the average of columns perDem,perGop,perInd but only the last three based on the day.How can I write such a query I tried to use order by,group by but am not able to achieve the results. I need an avg of each column but only the last three entries based on the day. so for Alabama for example average perDem,perGop,PerInd but only three rows which are last if ordered by day. This is what I have so far but I need this to only take the avg of the last three based on the day submitted for that particular state.

select polls.state,evotes,avg(perDem),avg(perGOP),avg(perInd)
             from polls,electoral
             where electoral.state=polls.state
             group by electoral.state,polls.state";

One possible approach is to make use of MySQL user-defined variables as a way to emulate analytic/windowing functions available in other databases.

SELECT v.state
     , e.evotes
     , AVG(IF(v.n<=3,v.perDem,NULL)) AS perDem_avg_last_3_day
     , AVG(IF(v.n<=3,v.perGOP,NULL)) AS perGOP_avg_last_3_day
     , AVG(IF(v.n<=3,v.perInd,NULL)) AS perInd_avg_last_3_day
  FROM ( SELECT @i := IF(p.state = @p_state,@i+1,1) AS n
              , @p_state := p.state AS `state`
              , p.perDem
              , p.perGOP
              , p.perInd
           FROM polls p
          CROSS
           JOIN ( SELECT @p_state = '', @i := 0 ) i
          ORDER
             BY p.state DESC
              , p.day DESC
       ) v
  JOIN electoral e
    ON e.state = v.state
 GROUP BY v.state, e.evotes
 ORDER BY v.state

NOTE: The MySQL Reference manual specifically warns against this type of usage of user-defined variables; but up through MySQL 5.6, we observe consisted behavior with carefully constructed SELECT statements.)

From a performance standpoint, this approach is suboptimal for large sets (materializing the inline view, and likely requiring a "Using filesort" operation.)

Please try the following...

set @rowNum := 0;
set @state := '';

SELECT state,
       AVG( perDem ) AS avgPerDem,
       AVG( perGOP ) AS avgPerGOP,
       AVG( perInd ) AS avgPerInd
FROM
(
    SELECT day AS day,
           state AS state,
           perDem AS perDem,
           perGOP AS perGOP,
           perInd AS perInd,
           @rowNum := if ( @state = state,
                           @rowNum + 1,
                           1 ) AS rowNum,
           @state := state AS valueHolder
    FROM polls
    ORDER BY state,
             day DESC
) rowNumGenerator
WHERE rowNum <= 3
GROUP BY state
ORDER BY state;

Here the inner query returns sorts the contents sorts the contents of polls by state in alphabetical order and for each state it further sorts the rows by day, from largest to smallest.

Once this is done it returns the fields of interest (day, state, perDem, perGOP and perInd) as well as a row number that it generates based on the following pattern...

Where a new state is encountered it gives this first row a row number of 1.

Each subsequent row for that state is given the next available row number.

This effectively assigns a number to each record's position within the ordered list, relative to the first record for that state.

The outer query selects only those rows whose position / row number places it in the top three for its state.

The records of the resulting dataset are then grouped by state. Please note that the outer SELECT statement does not know the grouping used by the inner one, and can not safely assume it. Thus it will assume that there is no grouping unless otherwise instructed.

The mean average for each state's per fields are then calculated and all specified fields are returned to the user.

If you have any questions or comments, then please feel free to post a Comment accordingly.