In SAS, you can first sort the dataset and then use first. or last. statement to select the first/last row within a group. There’s no such command in SQL. Below is my way to select the first/last row within a group in SQL.
Code:
select a.* into table_out
from table_in a,
(select id, min(rank) as minrank from table_in group by id) b
where a.id = b.id and a.rank = b.minrank
I first create table b which is equivalent to sort the table_in and then match to itself. If I want to select the last row, I can then use max() function.
If min rank is not unique within each ID group, the above code will then choose all the rows with a rank equal to the min rank. To only select the first/last row within a group in SQL, we can create a global unique identifier and then match it. Below is the code:
select *, identity(int,1,1) as rowid into #temp from table_in
where condition_Expression order by id, rank
select a.*
into table_out
from table_in a,
(select id, min(rowid) as minrowid from #temp group by id) b
where a.rowid = b.minrowid
In this example, we first use the indentity() function to create a new globally unique identifier rowid. Then we pick the row with minimum rowid. Be careful that you should sort your data first.
Added: March 9, 2009
We can use this to dedup the data within a group.