When dealing with a file that has a large number of entries, sorting the contents in a particular order can make it much easier to work with. In Linux, the sort command, which employs the merge sort algorithm, is used for this purpose.

The sort command compares all lines from the given files and sorts them in the specified order based on the sort keys. The -k option, in particular, is used to sort on a certain column.


In conjunction with the -k option, you’ll also need to know about other options like -n or -r to efficiently sort by column. We’ve detailed these and other related topics in the sections below.

Sort By Column

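As its man page shows, the base syntax for the sort command is sort [OPTION]... [FILE]... (options first, then the files to sort). To apply this, let’s start with an example. Let’s say we have a contacts file in which each line holds four space-separated columns: name, surname, nationality, and age.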

Just using the sort contacts command would sort the entries in alphabetical order, comparing each line from its first column onward. To sort by a specific column, or in another order, you’ll have to use various options, which we’ve listed below.
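To make the default behavior concrete, here is a minimal sketch. The four entries are hypothetical sample data invented for illustration (they are not the original file), and LC_ALL=C is set only as an assumption so that the comparison is byte-wise and the result does not depend on the system locale.

```shell
#!/bin/sh
# Byte-wise comparison keeps the result identical across systems.
export LC_ALL=C

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
John Smith US 34
Maria Garcia Spain 51
Anna Smith France 9
Liam Brown Canada 19
EOF

# No options: lines are compared as whole strings, left to right.
sorted=$(sort "$workdir/contacts")
printf '%s\n' "$sorted"

rm -rf "$workdir"
```

With this sample, the lines come out ordered by first name, since that is where each line starts.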


Option -k

As stated, the --key, or -k, option is used to sort on a specific column or field. For instance, in our example, if you wanted to sort by nationality, you’d use the following command to sort on the third column: sort -k3 contacts

This command takes all fields from the third to the end of the line into account. Alternatively, sort -k3,4 contacts would restrict the key to the third through fourth fields.
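The effect of the key's end position is easiest to see when the start field ties. The sketch below uses hypothetical three-line sample data and, because a four-column file makes -k3 and -k3,4 cover the same fields, it contrasts -k3 with -k3,3 instead. LC_ALL=C is an assumption made to keep the ordering byte-wise.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
Anna Smith US 34
Maria Garcia Spain 51
John Jones US 19
EOF

# Key runs from field 3 to the end of the line, so the age text
# breaks the tie between the two US entries.
to_end=$(sort -k3 "$workdir/contacts")

# Key is field 3 only; tied lines fall back to a whole-line comparison.
field_only=$(sort -k3,3 "$workdir/contacts")

printf '%s\n\n%s\n' "$to_end" "$field_only"
rm -rf "$workdir"
```

In the first result John (19) precedes Anna (34); in the second, Anna precedes John because the tie is resolved by comparing the full lines.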


If you were trying to sort by surname in the second column with sort -k2 contacts, keep in mind that the key runs from the second field to the end of the line. As there are identical entries (Smith), sort will use the next field as the tiebreaker, i.e., in the case of the two Smiths, France would be sorted above US.

But what if you needed to sort by surname, then name, and only then nationality? In such cases, you can control the sort order by specifying multiple keys as such: sort -k2,2 -k1,1 contacts
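As a rough illustration of the difference between a single open-ended key and multiple bounded keys, consider the sketch below. The entries are hypothetical sample data chosen so that the two commands disagree on the order of the two Smiths, and LC_ALL=C is an assumption to keep the comparison byte-wise.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
Anna Smith US 9
Maria Garcia Spain 51
John Smith France 34
Liam Brown Canada 19
EOF

# Single key from field 2 onward: nationality breaks the Smith tie.
by_surname=$(sort -k2 "$workdir/contacts")

# Surname first, then first name as the second key.
by_surname_then_name=$(sort -k2,2 -k1,1 "$workdir/contacts")

printf '%s\n\n%s\n' "$by_surname" "$by_surname_then_name"
rm -rf "$workdir"
```

The first command lists John Smith (France) before Anna Smith (US), while the second lists Anna before John.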


Option -n

If you use the commands shown above to sort by age on the fourth column, the results may look wrong. As sort compares keys lexicographically by default, numbers are compared character by character, so an age of 9 would be placed after 19.

To sort a column by numbers, you have to instead specify the -n option as such: sort -n -k4 contacts
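The following sketch compares the lexicographic and numeric results side by side. The sample entries are hypothetical, and LC_ALL=C is an assumption used to keep the lexicographic comparison byte-wise.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
John Smith US 34
Maria Garcia Spain 51
Anna Smith France 9
Liam Brown Canada 19
EOF

# Lexicographic: ages are compared as text, so 9 lands after 51.
lexical=$(sort -k4 "$workdir/contacts")

# Numeric: ages are compared by value.
numeric=$(sort -n -k4 "$workdir/contacts")

printf '%s\n\n%s\n' "$lexical" "$numeric"
rm -rf "$workdir"
```

Only the second result lists the ages in ascending numeric order (9, 19, 34, 51).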


Option -r

If you were trying to sort entries in reverse order, you would use the -r option. This works for both alphabetical and numeric values, as in the following two commands: sort -r -k1,1 contacts and sort -r -k3,4 contacts

By combining the basic options we’ve listed so far, you could perform complex sorting operations like reverse sorting on the second column and numerically sorting on the fourth column at once: sort -k2,2r -k4,4n contacts
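Here is a small sketch of both a plain reverse sort and the combined per-key form. The sample entries are hypothetical, and LC_ALL=C is an assumption to keep the ordering byte-wise.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
John Smith US 34
Maria Garcia Spain 51
Anna Smith France 9
Liam Brown Canada 19
EOF

# Reverse alphabetical order on the first column.
reverse_name=$(sort -r -k1,1 "$workdir/contacts")

# Surname descending (r applies to key 2 only), then age ascending
# by numeric value (n applies to key 4 only).
combined=$(sort -k2,2r -k4,4n "$workdir/contacts")

printf '%s\n\n%s\n' "$reverse_name" "$combined"
rm -rf "$workdir"
```

Appending r or n to a key definition, rather than passing -r or -n globally, limits the modifier to that key, which is what lets one key be reversed while another is numeric.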


Option -t

A blank space is the default field separator, i.e., delimiter, with sort. But CSV files use commas (,) to separate values. Whatever delimiter is used, you can specify it with the -t option. For instance, if : is the delimiter, you would specify it as such: sort -t ':' -k2 contacts
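A quick sketch of the delimiter option, assuming a hypothetical colon-separated version of the contacts data (the file contents are invented for illustration). LC_ALL=C is again an assumption to keep the ordering byte-wise.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical colon-delimited data (name:surname:nationality:age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
John:Smith:US:34
Maria:Garcia:Spain:51
Liam:Brown:Canada:19
EOF

# -t ':' makes the colon the field separator, so -k2 starts at the surname.
by_surname=$(sort -t ':' -k2 "$workdir/contacts")
printf '%s\n' "$by_surname"

rm -rf "$workdir"
```

Quoting the delimiter is a good habit, since characters such as ; or | would otherwise be interpreted by the shell.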

Option -o

You can use the -o option to save the sorted output into a specified file as such: sort -k2 contacts -o sortedcontacts
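The sketch below writes the sorted result to a file and reads it back. The sample entries are hypothetical, LC_ALL=C is an assumption to keep the ordering byte-wise, and -o is placed before the input file here purely as a conservative choice for portability.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: byte-wise ordering, independent of locale

# Hypothetical sample data (columns: name, surname, nationality, age).
workdir=$(mktemp -d)
cat > "$workdir/contacts" <<'EOF'
John Smith US 34
Maria Garcia Spain 51
Anna Smith France 9
Liam Brown Canada 19
EOF

# Nothing is printed; the sorted lines go to sortedcontacts instead.
sort -k2 -o "$workdir/sortedcontacts" "$workdir/contacts"

# Read the file back to confirm what was saved.
saved=$(cat "$workdir/sortedcontacts")
printf '%s\n' "$saved"

rm -rf "$workdir"
```

Unlike a shell redirection such as sort file > file, which truncates the input before sort reads it, -o may safely name the input file itself as the destination.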

Additional Options

The options detailed above are the most commonly used ones. But sort has many other flags that could be useful in niche scenarios, like -b to ignore leading blanks when comparing, -M to sort the contents by calendar month, or -c to check if data is already sorted. As there are too many to list here, we recommend referring to the sort man page for the full list of such options.
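As a brief sketch of two of these flags, the example below sorts hypothetical month abbreviations with -M and uses the exit status of -c to test whether input is already sorted. LC_ALL=C is an assumption so that the English month abbreviations are recognized regardless of the system locale.

```shell
#!/bin/sh
export LC_ALL=C  # assumption: English month names, byte-wise ordering

# -M orders by calendar month rather than alphabetically.
months=$(printf 'Mar\nJan\nDec\n' | sort -M)
printf '%s\n' "$months"

# -c prints nothing on success; it exits non-zero at the first
# out-of-order line and reports it on stderr (silenced here).
if printf 'b\na\n' | sort -c 2>/dev/null; then
    unsorted_check=sorted
else
    unsorted_check=unsorted
fi

if printf 'a\nb\n' | sort -c 2>/dev/null; then
    sorted_check=sorted
else
    sorted_check=unsorted
fi

printf '%s %s\n' "$unsorted_check" "$sorted_check"
```

Because -c communicates through its exit status, it slots naturally into if statements and && chains in scripts.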
