stupid_idiot
Group: Members
Posts: 344
Joined: Oct. 2006
Posted: Dec. 07 2007,12:41
Big Fat Warning:
Hi all,
None of the methods described here actually works. I haven't found a workable solution yet; everything I've posted is just a preliminary attempt. Please do not use ANY of these examples in their unmodified form on any source code!!!
Problems:
-- No reliable way to preserve directives ("#define", "#ifdef", "#endif", etc).
-- "Missing semicolon" compilation errors after deleting comment blocks (the comment blocks were playing the role of semicolons by marking the end of a function).
-- Most probably, there are many more problems which I haven't run into yet.
I am very new to sed and regular expressions, and I am not even sure sed is the proper tool for this purpose (stripping source code). I am also a total Perl newbie: I have only just started reading the 'Llama Book' (AKA 'Learning Perl, Third Edition' from O'Reilly), which I wholeheartedly recommend to anyone who is new to Perl -- it is very well written! IMHO, someone who is versed in Perl may be able to produce a better solution in Perl.
Code Sample |
find ./ -name "*.h" | \
while read i; do \
cpp -P -fpreprocessed "$i" > "$i.tmp" \
&& mv "$i.tmp" "$i"; done
find ./ -name "*.h" | \
while read i; do \
sed -i '/^ *$/d' "$i"; done
|
This will recursively strip all comments and empty lines from any C headers under the current directory. This would help reduce the size of any '-dev' extensions.
Any improvements are most welcome! Thanks.
Explanation:
Code Sample |
cpp -P -fpreprocessed "$i" > "$i.tmp" && mv "$i.tmp" "$i"
|
1. Process each '.h' file with 'cpp' and, if that succeeds, overwrite the original file with the processed output.
Code Sample |
sed -i '/^ *$/d' "$i"
|
2. '/^ *$/d' -- Remove lines that consist of zero or more spaces and nothing else (i.e. empty lines and lines that contain only spaces).
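The blank-line step explained above could also be sketched with 'find -exec' instead of 'while read', so that file names containing spaces or backslashes survive. This is only a sketch, not a tested solution: 'strip_blank_lines' is a made-up name, and it uses '[[:space:]]' rather than ' ' so that lines holding only TABs are removed too (which is an assumption about what is wanted).

```shell
#!/bin/sh
# Hypothetical helper: delete blank lines in every *.h file under directory "$1".
strip_blank_lines() {
    find "$1" -type f -name '*.h' -exec sh -c '
        for f in "$@"; do
            # Write to a temporary file first; replace the original only on success.
            sed "/^[[:space:]]*\$/d" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        done
    ' sh {} +
}
```

e.g. 'strip_blank_lines .' for the current directory. As with the original loop, try it on a copy of the headers first.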
Possibly useful also:
1. 'whitespace_stripper.sh'
Code Sample |
for i in "$@"; do sed -i '/^ *$/d' "$i"; done
|
e.g. 'whitespace_stripper.sh file1 file2 file3 file4 [...]'
2. 'stripsh' (strip shell scripts)
Code Sample |
for i in "$@"; do sed -i -e 's/\t*#.*//g' \
-e 's/\ *#.*//g' \
-e '/^#/d' \
-e '/^\t*#/d' \
-e '/^ *#/d' \
-e '/^ *$/d' \
"$i"; done
|
e.g. 'stripsh file1 file2 file3 file4 [...]'
NOTE: This script has a problem -- it deletes the first line of any script, i.e. the line that begins with "#!". You have to put the line back again after running the script.
Question for everyone: How do we make 'sed' ignore lines that begin with "#!"?
Thank you very much!
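Explanation of above script:
's/\t*#.*//g' -- (Substitution) Pattern: any number of TABs ("\t"), including none, followed by a "#", followed by any number of any character (".*"). Replacement: Null (i.e. deletes the matching part of the line -- does NOT mean deleting the entire line). Because the pattern is not anchored to the start of the line, it also matches the "#!" on the first line of a script; that line is emptied and then removed by the final '/^ *$/d'.
's/\ *#.*//g' -- (Substitution) The same, with SPACEs in place of TABs.
'/^#/d' -- (Deletion) Delete lines that begin with a "#".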
'/^\t*#/d' -- (Deletion) Delete lines that: begin with any number of TABs, followed by a "#" (i.e. bash and perl comments).
'/^ *#/d' -- (Deletion) Delete lines that: begin with any number of SPACEs, followed by a "#" (i.e. bash and perl comments).
'/^ *$/d' -- (Deletion) Delete lines that: begin with any number of SPACEs, followed by an end-of-line ("$") -- i.e. delete empty lines and lines that contain only SPACEs (lines that contain only TABs are not matched).
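On the question of making 'sed' ignore lines that begin with "#!": one possible approach is to put a negated address in front of the substitution, so that it runs only on lines that do NOT match '/^#!/'. The following is a minimal sketch under that assumption, not a complete solution. 'strip_sh_comments' is a made-up name; it reads a script on stdin and writes to stdout instead of using '-i', and it collapses the separate TAB and SPACE expressions into '[[:space:]]'. It still blindly cuts at the first "#" on a line, so a "#" inside a quoted string, "$#", or "${var#prefix}" will be mangled.

```shell
#!/bin/sh
# Hypothetical helper: strip comments and blank lines from a shell script
# read on stdin, leaving any line that begins with "#!" untouched.
strip_sh_comments() {
    # 1st expression: on lines NOT starting with "#!", delete everything from
    #                 the first "#" (and any whitespace just before it) to end of line.
    # 2nd expression: delete lines that are now empty or whitespace-only.
    sed -e '/^#!/!s/[[:space:]]*#.*//' \
        -e '/^[[:space:]]*$/d'
}
```

e.g. 'strip_sh_comments < script.sh > script.stripped', then compare the two files before replacing anything.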