如何优化MySQL游戏排行榜 - 大规模子查询问题

I have a huge bottle neck on my Game Server for the following query which is used to store the current leaderboard.

I am currently only calling this query via cron once every 5 mins, but would love to optimise it enough to be called every minute or when needed.

The query is taking 30 seconds and currently only ~2000 users and 7000 games played (stored in Games and TopPlayerScores). I'm afraid it will only get worse!! Please Help me Overflow-Kenobi! Your my only hope!

SET @rank=0;
INSERT INTO Board (TopScorePKID, GamePKID, UserPKID, UniquePlayerID, PlayerName, TopPlayerScore, Position, Date)
(SELECT bad.ID AS TopScorePKID, bad.GamePKID, bad.UserPKID, bad.UniquePlayerID, bad.PlayerName, bad.TopPlayerScore, @rank:=@rank+1 AS Position, bad.Date
FROM (
    SELECT g.GamePKID, g.TopPlayerScore, l.ID,  l.UserPKID, u.UniquePlayerID, u.PlayerName, (l.Date) AS Date
    FROM Games g, TopPlayerScores l, UserDetails u
    WHERE l.GamePKID = g.GamePKID
    AND u.UserPKID = l.UserPKID
    AND u.SECRET_DETAIL = 0 
    AND g.TopPlayerScore >= (SELECT DISTINCT k.TopPlayerScore AS Highest 
        FROM Games k, TopPlayerScores t 
        WHERE t.UserPKID = l.UserPKID
        AND k.GamePKID = t.GamePKID
        ORDER BY k.TopPlayerScore DESC
        LIMIT 1) 
    GROUP BY l.UserPKID
    ORDER BY g.TopPlayerScore DESC, Date ASC) 
AS bad);

Please someone help!! Should I break it up onto views? Or use Inner Join keywords? What is the best approach?

Thanks so much for even looking at this mess :D!

UPDATED 1.0 : EXPLAIN EXTENDED Results:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   PRIMARY   ALL NULL    NULL    NULL    NULL    1521    100.00  
2   DERIVED l   ALL NULL    NULL    NULL    NULL    6923    100.00  Using temporary; Using filesort
2   DERIVED u   eq_ref  PRIMARY PRIMARY 4   DBNAME.l.UserPKID   1   100.00  Using where
2   DERIVED k   eq_ref  PRIMARY PRIMARY 4   DBNAME.l.GamePKID   1   100.00  Using where
3   DEPENDENT SUBQUERY  t   ALL NULL    NULL    NULL    NULL    6923    100.00  Using where; Using temporary; Using filesort
3   DEPENDENT SUBQUERY  g   eq_ref  PRIMARY PRIMARY 4   DBNAME.t.GamePKID   1   100.00  Using where

UPDATED 2.0: Limited Schema for querying tables

Using the Games to store game scores and other information about a game

`Games` (
  `GamePKID` int(11) NOT NULL AUTO_INCREMENT,
  `TopPlayerScore` int(11) NOT NULL,
  `OTHER_MISC_STUFF_REMOVED` int(11) NOT NULL
  PRIMARY KEY (`GamePKID`)
)

Using the following to link the user to the Game and store the time/date

`TopPlayerScores` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `UserPKID` int(11) NOT NULL,
  `GamePKID` int(11) NOT NULL,
  `Date` datetime NOT NULL,
  PRIMARY KEY (`ID`)
)

Used to store each unique player

`UserDetails` (
  `UserPKID` int(11) NOT NULL AUTO_INCREMENT,
  `UniquePlayerID` char(40) NOT NULL,
  `PlayerName` char(96) NOT NULL,
  `SECRET_DETAIL` tinyint(1) NOT NULL DEFAULT '0',
  `isPlayer` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`UserPKID`)
)

The first thing I would note, although this will not improve performance is that the JOIN syntax you are using was replaced more than 20 years ago by the ANSI 92 expcict join syntax, it is entirely subject of course, but Aaron Bertrand explains some very good reasons to switch to the newer syntax.

The second thing to note is that your results will be non deterministic. You are selecting columns not contained in either an aggregate or a group by. While MySQL allows this, you are not using the feature as MySQL intended. The MySQL docs state:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

However, some of the columns you have included (g.GamePKID, g.TopPlayerScore, l.ID, l.Date), do not satisfy the condition are the same for each group, therefore, as stated, MySQL is free to choose whichever values it likes, and even though you have ORDER BY g.TopPlayerScore DESC, Date ASC this does not influence the single row for each group MySQL chooses.

Thirdly, MySQL has limitations with correlated subqueries and it this can hinder performance. If you can change these to JOINs you should see a performance improvement.

With all this in mind I would rewrite your query as so:

SET @rank=0;
INSERT INTO Board (TopScorePKID, GamePKID, UserPKID, UniquePlayerID, PlayerName, TopPlayerScore, Position, Date)
SELECT  bad.ID AS TopScorePKID, 
        bad.GamePKID, 
        bad.UserPKID, 
        bad.UniquePlayerID, 
        bad.PlayerName, 
        bad.TopPlayerScore, 
        @rank:=@rank+1 AS Position, 
        bad.Date
FROM    (   SELECT  g.GamePKID, 
                    g.TopPlayerScore, 
                    l.ID,  
                    l.UserPKID, 
                    u.UniquePlayerID, 
                    u.PlayerName, 
                    l.Date
            FROM    Games g
                    INNER JOIN TopPlayerScores l
                        ON l.GamePKID = g.GamePKID
                    INNER JOIN UserDetails u
                        ON u.UserPKID = l.UserPKID
                    INNER JOIN
                    (   SELECT  TopPlayerScores.UserPKID, MAX(games.TopPlayerScore) AS MaxPlayerScore
                        FROM    TopPlayerScores
                                INNER JOIN Games
                                    ON Games.GamePKID = TopPlayerScores.GamePKID
                        GROUP BY TopPlayerScores.UserPKID
                    ) MaxScore
                        ON MaxScore.UserPKID = l.UserPKID
                        AND MaxScore.MaxPlayerScore = g.TopPlayerScore
            WHERE   u.SECRET_DETAIL = 0 
        ) AS bad
ORDER BY bad.TopPlayerScore DESC, bad.Date ASC;

Example on SQL Fiddle

The subquery MaxScore should have the effect of limiting the results to one row per player (only their highest score), although additional logic may be required to handle ties (e.g. where the player has the same topscore in more than one game). Without knowing exact requirements I can't correct this.

EDIT

In order to remove duplicates where a player has the same top score on 2 or more games, and make it truly deterministic you need to add a further subquery:

SET @rank=0;

SELECT  bad.ID AS TopScorePKID, 
        bad.GamePKID, 
        bad.UserPKID, 
        bad.UniquePlayerID, 
        bad.PlayerName, 
        bad.TopPlayerScore, 
        @rank:=@rank+1 AS Position, 
        bad.Date
FROM    (   SELECT  Games.GamePKID, 
                    Games.TopPlayerScore, 
                    TopPlayerScores.ID,  
                    TopPlayerScores.UserPKID, 
                    UserDetails.UniquePlayerID, 
                    UserDetails.PlayerName, 
                    TopPlayerScores.Date
            FROM    Games
                    INNER JOIN TopPlayerScores
                        ON TopPlayerScores.GamePKID = Games.GamePKID
                    INNER JOIN UserDetails
                        ON UserDetails.UserPKID = TopPlayerScores.UserPKID
                    INNER JOIN
                    (   SELECT  TopPlayerScores.UserPKID, MAX(games.TopPlayerScore) AS TopPlayerScore
                        FROM    TopPlayerScores
                                INNER JOIN Games
                                    ON Games.GamePKID = TopPlayerScores.GamePKID
                        GROUP BY TopPlayerScores.UserPKID
                    ) MaxScore
                        ON MaxScore.UserPKID = TopPlayerScores.UserPKID
                        AND MaxScore.TopPlayerScore = Games.TopPlayerScore
                    INNER JOIN
                    (   SELECT  TopPlayerScores.UserPKID, games.TopPlayerScore, MAX(Date) AS Date
                        FROM    TopPlayerScores
                                INNER JOIN Games
                                    ON Games.GamePKID = TopPlayerScores.GamePKID
                        GROUP BY TopPlayerScores.UserPKID, games.TopPlayerScore
                    ) MaxScoreDate
                        ON MaxScoreDate.UserPKID = TopPlayerScores.UserPKID
                        AND MaxScoreDate.TopPlayerScore = Games.TopPlayerScore
                        AND MaxScoreDate.Date = TopPlayerScores.Date
            WHERE   UserDetails.SECRET_DETAIL = 0 
        ) AS bad
ORDER BY bad.TopPlayerScore DESC, bad.Date ASC;

Example on SQL Fiddle


N.B. This query will become much simpler if/when MySQL introduce analytical functions such as ROW_NUMBER(), or if you switch to a DBMS that already supports them, so just in case either of these things happen, here is a solution using ROW_NUMBER()`

SELECT  bad.ID AS TopScorePKID, 
        bad.GamePKID, 
        bad.UserPKID, 
        bad.UniquePlayerID, 
        bad.PlayerName, 
        bad.TopPlayerScore, 
        ROW_NUMBER() OVER(ORDER BY TopPlayerScore DESC) AS Position, 
        bad.Date
FROM    (   SELECT  Games.GamePKID, 
                    Games.TopPlayerScore, 
                    TopPlayerScores.ID,  
                    TopPlayerScores.UserPKID, 
                    UserDetails.UniquePlayerID, 
                    UserDetails.PlayerName, 
                    TopPlayerScores.Date,
                    ROW_NUMBER(PARTITION BY UserDetails.UserPKID 
                                ORDER BY Games.TopPlayerScore DESC,
                                        TopPlayerScores.Date DESC) AS RN
            FROM    Games
                    INNER JOIN TopPlayerScores
                        ON TopPlayerScores.GamePKID = Games.GamePKID
                    INNER JOIN UserDetails
                        ON UserDetails.UserPKID = TopPlayerScores.UserPKID
            WHERE   UserDetails.SECRET_DETAIL = 0 
        ) AS bad
WHERE   bad.RN = 1
ORDER BY bad.TopPlayerScore DESC, bad.Date ASC;

Example on SQL Fiddle using ROW_NUMBER()