选择一个不同的列,I.E Finetuning选择查询

I use the bellow query to merge 1 (or possibly sevral tables and is generated from a php script I have written) on Id's that are not the same but parts of Id in first table is part of the second Id so thats why I do a Substring on it. In the most perfect world I would just left join on id in both tables if there were the same but they're not.

select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id, t1.Time
from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
Inner JOIN 
(select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
        on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) ORDER BY t1.Time DESC

Here I get Alot of fields that are the same except for StopTime, see example bellow:

       StopTime                   Time
2015-04-01 23:59:00.000    2015-04-18 23:00:01
2015-04-02 23:59:00.000    2015-04-18 23:00:01
2015-04-03 23:59:00.000    2015-04-18 23:00:01
2015-04-04 23:59:00.000    2015-04-18 23:00:01
2015-04-05 23:59:00.000    2015-04-18 23:00:01
2015-04-06 23:59:00.000    2015-04-18 23:00:01
2015-04-07 23:59:00.000    2015-04-18 23:00:01
2015-04-08 23:59:00.000    2015-04-18 23:00:01
2015-04-09 23:59:00.000    2015-04-18 23:00:01
2015-04-10 23:59:00.000    2015-04-18 23:00:01
2015-04-11 23:59:00.000    2015-04-18 23:00:01
2015-04-12 23:59:00.000    2015-04-18 23:00:01
2015-04-13 23:59:00.000    2015-04-18 23:00:01
2015-04-14 23:59:00.000    2015-04-18 23:00:01
2015-04-15 23:59:00.000    2015-04-18 23:00:01
2015-04-16 23:59:00.000    2015-04-18 23:00:01
2015-04-17 23:59:00.000    2015-04-18 23:00:01
2015-04-18 23:59:00.000    2015-04-18 23:00:01

But here I only want unique Time, is it possible to get a row with the unique Time and the latest StopTime?

Like the bellow?

        StopTime                   Time
2015-04-01 23:59:00.000    2015-04-18 23:00:01

I tried with a group by statement inside the second select statement like:

(select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content] group by StopTime)

But I get a syntax error

Column 'dbcust.dbo._Content.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Perhaps you guys can help me with finetuning my select query in order to speed up the data gathering? =)

Thanks in advance.

You can do this:

WITH YourQuery
AS
(
    select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id, t1.Time
    from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
    Inner JOIN 
    (select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
            on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) 
), Ranked
AS
(
   select Id,
     CustomerName,
     Region,
     Country,
     StopTime,
     CustomerId,
     Id, 
     Time,
     ROW_NUMBER() OVER(PARTITION BY Time ORDER BY StopTime DESC) AS R1
    from  YourQuery
)
SELECT Id,
     CustomerName,
     Region,
     Country,
     StopTime,
     CustomerId,
     Id, 
     Time
FROM Ranked
WHERE RN = 1;

The ROW_NUMBER function will give a ranking number for each time, so selecting where rn = 1 will give you the latest stoptime.


For the query you tried:

select distinct Id,CustomerName,Region,Country, 
   StopTime,CustomerId 
from [dbcust].[dbo].[_Content] 
group by StopTime

In sql server, when you group by a column, you can't select any column unless it is in the group by clause or in an aggregate function, so in order to write it correctly it should be like this:

select stopTime, MIN(CustomerId) -- just an example
from [dbcust].[dbo].[_Content] 
group by StopTime

Or using your full query you can do this:

SELECT stoptime, MAX(Time) AS LatestTime
FROM
(
    select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id, t1.Time
            from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
            Inner JOIN 
            (select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
                    on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) 
) AS t
GROUP BY stoptime

This will give you the exact result that you are looking for, but only the stoptime and the latest time.

So to select the other columns as will as the two columns I used the ranking function.