Content from 2015-08
Let's go back to the GNU Coreutils list of tools and take `ls` as an example.
Usually the user will have aliased `ls` to something other than the plain
invocation, either to enable highlighting (`--color`), sorting (`--sort`), or
to add more information than just the filenames (e.g. `--format`). There is
even integration with Emacs (`--dired`).
The question then is: how much of the functionality of `ls` is actually
devoted to secondary formatting instead of listing files? And shouldn't this
functionality be moved into separate tools? Since the output is intended for
multiple kinds of recipients, additional data creeps in and complicates the
tools a lot.
Alternatively, we could imagine using `ls` only to get unformatted and unsorted
output, which would then be passed through to a `sort` command and a `fmt`
command of sorts. Of course this all takes more time (re-parsing of output
etc.), so it's understandable that, in the interest of performance, the
traditional Unix shell doesn't do this.
However, let's assume a more sophisticated shell. Assuming `ls` is limited to
listing files, the user will alias `ls` to a pipeline instead, namely
something akin to `ls | sort | fmt`. Then again, formatting is part of the
user interface, not the functionality, so it should rather be part of the
shell's internal formatting, possibly exposed as separate filters as well.
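The split into listing, sorting and formatting stages described above can be sketched as three small functions. This is only an illustration, written in Python rather than shell; the names `list_entries`, `sort_entries` and `format_columns` are made up for this sketch and do not correspond to any existing tool.

```python
import os
from typing import Iterable, Iterator, List


def list_entries(path: str = ".") -> Iterator[str]:
    """Yield entry names as the OS returns them: unsorted, unformatted.

    Unlike plain `ls`, this also yields names starting with a dot.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def sort_entries(names: Iterable[str]) -> List[str]:
    """Sorting is its own stage, so another ordering can be swapped in."""
    return sorted(names)


def format_columns(names: Iterable[str], width: int = 80) -> str:
    """Lay names out in fixed-width columns; purely presentational."""
    names = list(names)
    if not names:
        return ""
    column = max(len(name) for name in names) + 2
    per_row = max(1, width // column)
    rows = []
    for start in range(0, len(names), per_row):
        chunk = names[start:start + per_row]
        rows.append("".join(name.ljust(column) for name in chunk).rstrip())
    return "\n".join(rows)


if __name__ == "__main__":
    # Rough analogue of the hypothetical `ls | sort | fmt` pipeline.
    print(format_columns(sort_entries(list_entries("."))))
```

Because only the last stage produces text, the first two stages can pass plain names along without any re-parsing.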
The result of `ls` is a (possibly nested) directory listing. Regardless of
post-processing, this "object" should still be available for further
investigation. This means that while sorting may be applied destructively,
formatting may not, unless specifically requested, in which case the result
would be a kind of "formatted" object (text, GUI widget) instead.
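One way to picture the distinction between destructive sorting and non-destructive formatting is the following sketch. The classes `Listing` and `FormattedListing` and the function `format_listing` are hypothetical names chosen for illustration, and the listing is kept flat for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Listing:
    """A directory listing kept as data, not as text."""
    path: str
    entries: List[str]

    def sort(self) -> "Listing":
        # Destructive: reorders the entries of this very object.
        self.entries.sort()
        return self


@dataclass(frozen=True)
class FormattedListing:
    """Result of an explicit formatting request; keeps a link to its source."""
    source: Listing
    text: str


def format_listing(listing: Listing) -> FormattedListing:
    # Non-destructive: the listing itself is left untouched.
    return FormattedListing(source=listing, text="\n".join(listing.entries))


if __name__ == "__main__":
    listing = Listing(path=".", entries=["b.txt", "a.txt"])
    formatted = format_listing(listing.sort())
    print(formatted.text)
    print(formatted.source.entries)  # the underlying object is still available
```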
In other words, the user should be able to refer to the last results
immediately, instead of rerunning the whole pipeline. For example, in Common
Lisp the variables `*` to `***` store the last three results for interactive
use. In the shell then, `ls` would set `*` to the generated directory listing;
since the listing is most likely also printed to the screen, the full listing
would be kept (in that object) to be used again if e.g. `*` is requested
again. Rerunning the command, on the other hand, may generate a different
directory listing, as files may have changed in the meantime, so there is an
immediate difference between the two forms.
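A minimal sketch of such a result history, modelled loosely on the Common Lisp variables, might look as follows. The class `ResultHistory` and its methods are assumptions made for this example, not part of any existing shell.

```python
from collections import deque
from typing import Any, Optional


class ResultHistory:
    """Keeps the last three results, newest first."""

    _NAMES = {"*": 0, "**": 1, "***": 2}

    def __init__(self) -> None:
        self._results: deque = deque(maxlen=3)

    def record(self, value: Any) -> Any:
        # The newest result becomes `*`; the oldest of three falls off the end.
        self._results.appendleft(value)
        return value

    def get(self, name: str) -> Optional[Any]:
        index = self._NAMES[name]
        if index < len(self._results):
            return self._results[index]
        return None  # not enough results recorded yet


if __name__ == "__main__":
    history = ResultHistory()
    history.record(["a.txt", "b.txt"])           # e.g. the result of a first `ls`
    history.record(["a.txt", "b.txt", "c.txt"])  # a later run saw a new file
    print(history.get("*"))   # the stored listing, without rerunning anything
    print(history.get("**"))  # the older snapshot remains available too
```

Asking for `*` returns the stored snapshot, whereas rerunning the command would consult the file system again and might return something else.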
Examples
The pipeline `ls | wc -l` is (at least for me) often used to get the number of
files in the (current) directory. Unfortunately there is no way to get this
number directly, other than enumerating the entries in the directory (under
Linux, that is).
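For comparison, here is the same count done by enumerating the directory entries directly, without going through formatted text. This is a sketch in Python; `count_entries` and its `include_hidden` parameter are names invented for this example. By default it skips names starting with a dot, as plain `ls` does.

```python
import os


def count_entries(path: str = ".", include_hidden: bool = False) -> int:
    """Count directory entries by enumerating them once.

    os.scandir never yields "." and "..", so they are not counted.
    """
    with os.scandir(path) as entries:
        return sum(
            1
            for entry in entries
            if include_hidden or not entry.name.startswith(".")
        )


if __name__ == "__main__":
    print(count_entries("."))
```

Counting entries rather than lines also sidesteps filenames containing newlines, which `ls | wc -l` would count more than once.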