在Perl中高效sort——Schwartzian变换

In computer science, the Schwartzian transform is a Perl programming idiom used to improve the efficiency of sorting a list of items. This idiom is appropriate for comparison-based sorting when the ordering is actually based on the ordering of a certain property (the key) of the elements, where computing that property is an intensive operation that should be performed a minimal number of times. The Schwartzian Transform is notable in that it does not use named temporary arrays.

The general form of the Schwartzian Transform is:

1
2
3
4
@sorted = map  { $_-[0] }
          sort { $a-[1] cmp $b-[1] }
          map  { [$_, foo($_)] }
               @unsorted;

Where foo($_) represents an expression that takes $_ (each item of the list in turn) and produces the corresponding value that is to be compared in its stead.

Reading from right to left (or from the bottom to the top):

  • the original list @unsorted is fed into a map operation that wraps each item into a (reference to an anonymous 2-element) array consisting of itself and the calculated value that will determine its sort order (list of item becomes a list of [item=>value]);
  • then the list of lists produced by map is fed into sort, which sorts it according to the values previously calculated (list of [item, value] => sorted list of [item, value]);
  • finally, another map operation unwraps the values (from the anonymous array) used for the sorting, producing the items of the original list in the sorted order (sorted list of [item, value] => sorted list of item).

The use of anonymous arrays ensures that memory will be reclaimed by the Perl garbage collector immediately after the sorting is done.

参考资料:

Schwartzian transform
A Fresh Look at Efficient Perl Sorting
Data Munging with Perl_sort 之Schwartzian transform
简简单单讲sort(仙子译创)
对sort排序的加速 — Schwartzian变换
多重排序--Schwartzian转换