KevCaz's Website

Yesterday, I was trying to find a way to count the lines of a set of R files, excluding comments and empty lines, and found this answer on Stack Overflow:

cat R/* | sed '/^\s*#/d;/^\s*$/d' | wc -l

Exactly what I needed! A few explanations are in order:

  • cat reads files sequentially;
  • "|" is the pipe operator, which sends the output of one command to the input of the next;
  • sed parses and transforms text;
  • wc counts words, lines and more!

Basically, counting lines of a file is done with the option -l of wc (option -m for characters, see man wc to learn more):

wc -l file

cat reads the set of files and passes their concatenated content to wc through the pipe:

cat R/* | wc -l
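
To see what cat brings here, the sketch below builds a throwaway directory with two small files (the file names and contents are made up for illustration). Calling wc -l directly on several files prints one count per file plus a total line, whereas piping through cat gives a single number:

```shell
# Throwaway directory with two small hypothetical R files.
tmp=$(mktemp -d)
mkdir "$tmp/R"
printf 'a <- 1\nb <- 2\n' > "$tmp/R/one.R"
printf '# comment\n\nc <- 3\n' > "$tmp/R/two.R"

# Per-file counts, followed by a "total" line.
wc -l "$tmp"/R/*

# A single number for the whole set of files.
total=$(cat "$tmp"/R/* | wc -l)
echo "$total"

# Clean up.
rm -r "$tmp"
```

The single number is what makes the result easy to feed into the rest of the pipeline.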

To remove the unwanted lines (here, comments and blank lines), we use the stream editor sed as follows:

sed '/^\s*#/d;/^\s*$/d'

There are two instructions separated by a semicolon:

  1. /^\s*#/d
  2. /^\s*$/d

The characters between the two slashes / / form the pattern that selects lines, and the d at the end of the instruction means “delete the selected lines”. ^\s*# means “lines starting with (^) an arbitrary number (*) of whitespace characters (\s) followed by a # (the character for comments)”, so basically comment lines. The second instruction is similar, but $ indicates the end of the line, so /^\s*$/d reads “lines made up only of an arbitrary number (possibly zero) of whitespace characters”, that is, blank lines!
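
To see the two instructions at work, here is a minimal sketch that runs the filter on a few made-up lines. The sample content is hypothetical, and \s assumes GNU sed:

```shell
# Hypothetical sample: 2 comment lines, 2 blank lines, 2 lines of code.
sample=$(printf '%s\n' \
  '# a comment' \
  '   # an indented comment' \
  '' \
  '   ' \
  'x <- 1  # trailing comments do not hide the code' \
  'y <- x + 1')

# Keep only the lines that neither instruction deletes.
kept=$(printf '%s\n' "$sample" | sed '/^\s*#/d;/^\s*$/d')
printf '%s\n' "$kept"
```

Only the last two lines survive. A comment placed after some code on the same line does not get the line deleted, because that line does not start with #.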

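Putting it all together, the two commands below give the same count; the second one simply runs each instruction in its own sed call:
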
cat R/* | sed '/^\s*#/d;/^\s*$/d' | wc -l
cat R/* | sed '/^\s*#/d' | sed '/^\s*$/d' | wc -l
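
If the command is needed often, it could be wrapped in a small function. The following is only a sketch: the name count_code_lines is made up, and it swaps \s (a GNU extension) for the POSIX class [[:space:]], on the assumption that portability to other sed implementations matters:

```shell
# Hypothetical helper: count non-comment, non-blank lines in the given files.
# [[:space:]] is the POSIX character class; \s is a GNU extension.
count_code_lines() {
  cat -- "$@" | sed '/^[[:space:]]*#/d;/^[[:space:]]*$/d' | wc -l
}

# Try it on a throwaway file: 1 comment, 1 blank line, 2 lines of code.
tmp=$(mktemp)
printf '# setup\n\nx <- 1\n\ty <- 2\n' > "$tmp"
result=$(count_code_lines "$tmp")
echo "$result"
rm "$tmp"
```

On a real project it would be called as count_code_lines R/*.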

Stream editors are very powerful tools. If you are interested in learning more about one of them, have a careful look at the documentation of GNU sed.