By default, when a for
loop traverses an array, the order
is undefined, meaning that the awk implementation
determines the order in which the array is traversed.
This order is usually based on the internal implementation of arrays
and will vary from one version of awk to the next.
Often, though, you may wish to do something simple, such as “traverse the array by comparing the indices in ascending order,” or “traverse the array by comparing the values in descending order.” gawk provides two mechanisms that give you this control:
* Set PROCINFO["sorted_in"] to one of a set of predefined values.
  We describe this now.
* Set PROCINFO["sorted_in"] to the name of a user-defined function
  to be used for comparison of array elements. This advanced feature
  is described later, in Array Sorting.
The following special values for PROCINFO["sorted_in"]
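are available:
"@unsorted"
    Array elements are processed in arbitrary order, which is the
    default awk behavior.
"@ind_str_asc"
    Order by indices in ascending order, compared as strings; this is
    the most basic sort. (Internally, array indices are always strings,
    so with a[2*5] = 1 the index is "10" rather than numeric 10.)
"@ind_num_asc"
    Order by indices in ascending order, but force them to be treated
    as numbers in the process. Any index with a non-numeric value is
    positioned as if it were zero.
"@val_type_asc"
    Order by element values in ascending order (rather than by
    indices). Ordering is by the type assigned to the element: all
    numeric values come before all string values, which in turn come
    before all subarrays.
"@val_str_asc"
    Order by element values in ascending order (rather than by
    indices). Scalar values are compared as strings. Subarrays, if
    present, come out last.
"@val_num_asc"
    Order by element values in ascending order (rather than by
    indices). Scalar values are compared as numbers. Subarrays, if
    present, come out last. When numeric values are equal, the string
    values are used to provide an ordering: this guarantees consistent
    results across different versions of the C qsort()
    function,[1] which gawk uses internally
    to perform the sorting.
"@ind_str_desc"
    Like "@ind_str_asc", but the string indices are ordered from high
    to low.
"@ind_num_desc"
    Like "@ind_num_asc", but the numeric indices are ordered from high
    to low.
"@val_type_desc"
    Like "@val_type_asc", but the element values, based on type, are
    ordered from high to low. Subarrays, if present, come out first.
"@val_str_desc"
    Like "@val_str_asc", but the element values, treated as strings,
    are ordered from high to low. Subarrays, if present, come out first.
"@val_num_desc"
    Like "@val_num_asc", but the element values, treated as numbers,
    are ordered from high to low. Subarrays, if present, come out first.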
The array traversal order is determined before the for
loop
starts to run. Changing PROCINFO["sorted_in"]
in the loop body
will not affect the loop.
For example:
    $ gawk 'BEGIN {
    >    a[4] = 4
    >    a[3] = 3
    >    for (i in a)
    >        print i, a[i]
    > }'
    -| 4 4
    -| 3 3
    $ gawk 'BEGIN {
    >    PROCINFO["sorted_in"] = "@ind_str_asc"
    >    a[4] = 4
    >    a[3] = 3
    >    for (i in a)
    >        print i, a[i]
    > }'
    -| 3 3
    -| 4 4
When sorting an array by element values, if a value happens to be a subarray then it is considered to be greater than any string or numeric value, regardless of what the subarray itself contains, and all subarrays are treated as being equal to each other. Their order relative to each other is determined by their index strings.
Here are some additional things to bear in mind about sorted array traversal.
* The value of PROCINFO["sorted_in"] is global. That is, it affects
  all array traversal for loops. If you need to change it within your
  own code, you should see if it's defined and save and restore the value:

      ...
      if ("sorted_in" in PROCINFO)
          save_sorted = PROCINFO["sorted_in"]

      PROCINFO["sorted_in"] = "@val_str_desc" # or whatever
      ...
      if (save_sorted)
          PROCINFO["sorted_in"] = save_sorted
      else
          delete PROCINFO["sorted_in"]

* The default array traversal order is represented by "@unsorted".
  You can also get the default behavior by assigning
  the null string to PROCINFO["sorted_in"]
  or by just deleting the "sorted_in" element from the PROCINFO
  array with the delete statement.
  (The delete statement hasn't been described yet; see Delete.)
In addition, gawk provides built-in functions for sorting arrays; see Array Sorting Functions.
[1] When two elements
compare as equal, the C qsort()
function does not guarantee
that they will maintain their original relative order after sorting.
Using the string value to provide a unique ordering when the numeric
values are equal ensures that gawk behaves consistently
across different environments.