Print Newline, Word and Byte Counts with wc

wc counts the number of bytes, characters, whitespace-separated words, and newlines in each given file. When no file is given, or when a file is given as -, wc reads standard input. Synopsis:

wc [option]… [file]…

wc prints one line of counts for each file, and if the file was given as an argument, it prints the file name following the counts. If more than one file is given, wc prints a final line containing the cumulative counts, with the file name total. The counts are printed in this order: newlines, words, characters, bytes, maximum line length. Each count is printed right-justified in a field with at least one space between fields so that the numbers and file names normally line up nicely in columns. The width of the count fields varies depending on the inputs, so you should not depend on a particular field width. However, as a GNU extension, if only one count is printed, it is guaranteed to be printed without leading spaces.
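The per-file lines and the final total line described above can be seen with a small sketch. The directory and the file names a.txt and b.txt are made up for illustration, and the last command relies on the GNU behavior of printing a single count without leading spaces.

```shell
# Create two throwaway files in a temporary directory (names are hypothetical).
tmp=$(mktemp -d)
printf 'line 1\nline 2\n' > "$tmp/a.txt"
printf 'one two three\n' > "$tmp/b.txt"

# Two file arguments: one line of counts per file, then a "total" line.
out=$(cd "$tmp" && wc a.txt b.txt)
echo "$out"

# Reading from standard input: no file name is printed after the count.
n=$(wc -l < "$tmp/a.txt")
echo "$n"

rm -r "$tmp"
```

The exact spacing between the columns depends on the inputs, so scripts that parse this output should split on whitespace rather than rely on fixed positions.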

By default, wc prints three counts: the newline, word, and byte counts. Options can specify that only certain counts be printed. Options do not undo others previously given, so

wc --bytes --words

prints both the byte counts and the word counts.
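As a quick check of this behavior, the sketch below feeds two short lines to wc on standard input. It assumes GNU wc, since the long option names are used. Note that the word count is printed before the byte count even though --bytes comes first on the command line, because the output order is fixed.

```shell
# Two options, two counts: words first, then bytes, regardless of option order.
counts=$(printf 'line 1\nline 2\n' | wc --bytes --words)
echo "$counts"
```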

With the --max-line-length option, wc prints the length of the longest line per file, and if there is more than one file it prints the maximum (not the sum) of those lengths. The line lengths here are measured in screen columns, according to the current locale and assuming tab positions in every 8th column.
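The difference between screen columns and bytes shows up as soon as a line contains a tab. In the sketch below (assuming GNU wc), the input line is only four bytes long, but the tab moves the cursor to the next tab stop before b is placed, so the display width is larger than the byte count.

```shell
# 'a', a tab, 'b', newline: 4 bytes in total.
bytes=$(printf 'a\tb\n' | wc -c)

# The tab advances from column 1 to column 8, so 'b' ends at column 9.
width=$(printf 'a\tb\n' | wc -L)

echo "bytes=$bytes width=$width"
```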

The program accepts the following options.

-c or --bytes prints only the byte counts.

-m or --chars prints only the character counts.

-w or --words prints only the word counts.

-l or --lines prints only newline counts.

-L or --max-line-length prints only the maximum display widths. Tabs are set at every 8th column. Display widths of wide characters are considered. Non-printable characters are given 0 width.

--files0-from=file disallows processing files named on the command line and instead processes those named in the file file, each name being terminated by a zero byte (ASCII NUL). This is useful when the list of file names is so long that it may exceed a command line length limitation. In such cases, running wc via xargs is undesirable because it splits the list into pieces and makes wc print a total for each sublist rather than for the entire list. One way to produce a list of ASCII NUL terminated file names is with GNU find, using its -print0 predicate. If file is - then the ASCII NUL terminated file names are read from standard input.

For example, to find the length of the longest line in any .c or .h file in the current hierarchy, do this:

find . -name '*.[ch]' -print0 | wc -L --files0-from=- | tail -n1
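To try this pipeline without touching a real source tree, here is a minimal sketch that builds a throwaway directory first. The file names a.c, b.h, and c.txt are invented for the example, and GNU find and GNU wc are assumed for -print0 and --files0-from.

```shell
# Build a small temporary tree; only the .c and .h files should be counted.
tmp=$(mktemp -d)
printf 'short\n' > "$tmp/a.c"
printf 'a much longer line\n' > "$tmp/b.h"
printf 'this line is skipped because of the file extension\n' > "$tmp/c.txt"

# The last output line is the "total" line, which holds the maximum width.
longest=$(find "$tmp" -name '*.[ch]' -print0 | wc -L --files0-from=- | tail -n1)
echo "$longest"

rm -r "$tmp"
```

If only one file matches, wc prints no total line, and tail -n1 then returns that single file's line instead.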

Consider a file test.txt with the following content:
line 1
line 2

Invoking wc on it gives the following output:

$ wc test.txt
2 4 14 test.txt

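To run wc on every regular file under a directory, pass a NUL-terminated list of names from find to xargs:

find . -type f -print0 | xargs -0 wc

For very long file lists, xargs may split the list and make wc print more than one total line. The --files0-from option described above avoids that.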
