在Perl中高效sort——Guttman-Rosler转换(GRT)

The Guttman-Rosler Transform is a technique from Uri Guttman and Larry Rosler for improving sort speed in perl.

The Guttman-Rosler transform runs faster than the orcish or Schwartzian transforms by avoiding the use of custom search subroutines. This is accomplished via pre and post transformation of the list. This allows the native optimizations of the perl sort function to shine.

Synopsis:

  • prefix each element with a synthetic key (string or numeric).
  • sort.
  • remove the prefix key from each element.

Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
my %colors = (
    "Granny Smith"     ="green",
    "Golden Delicious" ="yellow",
    "Pink Lady"        ="pink",
);
 
#A) standard sort:
my @sorted_by_color = sort { $colors{$a} cmp $colors{$b} } keys %colors;
 
#B) GRT sort:
# 1. Encode
my @sortkeys;
while ( my ( $variety, $color ) = each %colors ) {
    push @sortkeys, sprintf( "%-6s%-16s", $color, $variety );
}
# @sortkeys = (
#     "green Granny Smith    ",
#     "yellowGolden Delicious",
#     "pink  Pink Lady       ",
# );
 
# 2. Sort
my @sorted_by_color = sort @sortkeys;    # no sortsub!
 
# 3. Decode
@sorted_by_color = map { substr( $_, 6 ) } @sorted_by_color;

参考资料:

A Fresh Look at Efficient Perl Sorting
GRT : Guttman Rosler Transform
Data Munging with Perl_sort 之Guttman-Rosler transform
简简单单讲sort(仙子译创)
Advanced Sorting – GRT – Guttman Rosler Transform