Comments Stripping Framework

NOTE: The Comments Stripping Framework is not yet part of SCons. The current code is viewable here. Creating the Comments Stripping Framework is part of my Google Summer of Code project.

NOTE: This documentation refers to the CommentsRE module, not to the Comments.py module. Even where I refer to the SCons.Comments module, I really mean the SCons.CommentsRE module.

An installation package is also available for download:

The Comments Stripping Framework (CSF) is responsible for stripping out comments (or any other content not regarded as significant in a particular case) and computing checksums only over the significant content.

When only a comment changes in a C program, there is no need to recompile the source code. Likewise, when only the code changes, there is no need to regenerate the Doxygen documentation.
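The idea can be illustrated with a minimal stand-alone sketch. It does not use the framework itself: the C_COMMENTS pattern and the significant_checksum() helper are assumptions made up for this illustration, and the naive pattern ignores string literals. It only shows that two sources differing in comments alone yield the same checksum.

```python
import hashlib
import re

# Assumed pattern for C block comments and C++ one-line comments.
# The patterns used by the framework itself may differ.
C_COMMENTS = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

def significant_checksum(text):
    """Return an MD5 hex digest of 'text' computed without its comments."""
    code = C_COMMENTS.sub("", text)
    # Collapse whitespace so a removed comment leaves no trace in the layout.
    normalized = " ".join(code.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

before = "int main() { return 0; } /* old comment */\n"
after = "int main() { return 0; } /* new, longer comment */\n"
changed = "int main() { return 1; } /* old comment */\n"

# Only the comment differs: same checksum, so no rebuild would be needed.
print(significant_checksum(before) == significant_checksum(after))
# The code differs: the checksum changes.
print(significant_checksum(before) == significant_checksum(changed))
```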

At the moment CSF is turned on by default, and the only way to turn it off is to modify the source code.

GenericStripCode() function

GenericStripCode(filename, patterns) - this function lets you build your own code-stripping functions (i.e. functions that strip the code and return only the comments from a file).

Let's say you need a C-like code-stripping function that also recognizes multiline comments starting with /@ and ending with @/.

You can use the GenericStripCode() function from the SCons.CommentsRE module along with two predefined regular expressions (c_comment for /* ... */ comments and cxx_comment for one-line // ... comments) and a function for creating your own multiline regular expressions: multiline_comment_regexp().

Sample SConstruct file:

    from SCons.CommentsRE import GenericStripCode, c_comment, cxx_comment, multiline_comment_regexp

    def StripAtCode(filename):
        """Strips source code from file 'filename'.
        Returns traditional C comments ("/* ... */"),
        C++ one-line comments ("// ...") and at-multiline-comments
        ("/@ ... @/")."""

        # Raw strings keep the backslashes intact for the regexp engine.
        at_comment = multiline_comment_regexp(r'/\@', r'\@/')
        return GenericStripCode(filename,
                                (c_comment, cxx_comment, at_comment))

    # StripAtCode() is a valid stripper now.
    # You can add it to your builder using
    # add_stripper() method.
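To make the "return only the comments" behaviour concrete, here is a stand-alone sketch that works on a string rather than on a file. It is not the SCons.CommentsRE implementation: multiline_comment_pattern(), strip_code() and the three pattern constants are hypothetical stand-ins written for this illustration.

```python
import re

def multiline_comment_pattern(start, end):
    """Hypothetical stand-in for multiline_comment_regexp():
    literal delimiters around a non-greedy body."""
    return re.escape(start) + r".*?" + re.escape(end)

def strip_code(text, patterns):
    """Return only the parts of 'text' that match any of 'patterns',
    one match per line."""
    combined = re.compile("|".join("(?:%s)" % p for p in patterns), re.DOTALL)
    return "\n".join(m.group(0) for m in combined.finditer(text))

# Assumed equivalents of c_comment and cxx_comment.
C_COMMENT = r"/\*.*?\*/"
CXX_COMMENT = r"//[^\n]*"
AT_COMMENT = multiline_comment_pattern("/@", "@/")

source = "int x; /* one */\nint y; // two\n/@ three\nlines @/ int z;\n"
comments = strip_code(source, (C_COMMENT, CXX_COMMENT, AT_COMMENT))
print(comments)
```

With the sample input above, only the three comments survive; all declarations are dropped.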

Predefined wrappers for GenericStripCode():

GenericStripComments() function

GenericStripComments(filename, patterns, quotings = ('"', "'"), comment_first_chars = ('//','/\*'), preprocessor = False) - this function lets you build your own comment-stripping functions (i.e. functions that strip the comments and return only the code from a file).

GenericStripComments() returns the contents of the file named filename, except for the strings that match the regular expressions defined in patterns.

patterns may be a string, list or tuple containing regular expression strings ready to be compiled with the re.compile() function.

quotings is a tuple of characters, each of which marks the beginning (and the end) of a string. Matches for the patterns argument found between quoting characters won't be stripped.

comment_first_chars is a tuple that defines the character sequences that comments start with. For C-like comments comment_first_chars is equal to ('//', '/\*').

When preprocessor is True, GenericStripComments() won't strip whitespace from lines that start with the # sign.
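The role of quotings can be shown with a stand-alone sketch that works on a string. This is an assumption about how such protection could be implemented, not the actual GenericStripComments() code: strip_comments() is a hypothetical helper, and it does not model comment_first_chars or preprocessor.

```python
import re

def strip_comments(text, patterns, quotings=('"', "'")):
    """Remove matches of 'patterns' from 'text', leaving quoted strings alone."""
    # One alternative per quoting character: opening quote, then escaped
    # characters or anything that is not the quote, then closing quote.
    string_alts = []
    for quote in quotings:
        q = re.escape(quote)
        string_alts.append(q + r"(?:\\.|[^" + q + r"\\])*" + q)
    string_re = "|".join(string_alts)
    comment_re = "|".join("(?:%s)" % p for p in patterns)
    # Strings are tried first, so comment markers inside them are never
    # seen as the start of a comment.
    combined = re.compile("(?P<string>%s)|(?P<comment>%s)" % (string_re, comment_re),
                          re.DOTALL)

    def keep_strings(match):
        # Put string literals back unchanged; drop comments.
        if match.group("string") is not None:
            return match.group("string")
        return ""

    return combined.sub(keep_strings, text)

# Assumed equivalents of c_comment and cxx_comment.
C_COMMENT = r"/\*.*?\*/"
CXX_COMMENT = r"//[^\n]*"

source = 'char *url = "http://example.com"; // homepage\nint n = 1; /* count */\n'
stripped = strip_comments(source, (C_COMMENT, CXX_COMMENT))
print(stripped)
```

The // inside the string literal is preserved, while both real comments are removed.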

Let's say you need a C-like comment-stripping function that can also strip one-line comments starting with the $ sign and ending with a newline.

You can use the GenericStripComments() function from the SCons.CommentsRE module along with two predefined regular expressions (c_comment for '/* ... */' comments and cxx_comment for one-line '// ...' comments) and a function for creating your own one-line regular expressions: oneline_comment_regexp().

    from SCons.CommentsRE import GenericStripComments, c_comment, cxx_comment, oneline_comment_regexp

    def StripDollarComments(filename):
        """Strips comments from file 'filename'.
        Returns source code.
        Works for traditional C comments ("/* ... */"),
        C++ one-line comments ("// ...") and dollar-oneline-comments
        ("$ ... ")."""

        # Raw strings keep the backslashes intact for the regexp engine.
        dollar_comment = oneline_comment_regexp(r'\$')
        return GenericStripComments(filename,
                                    (c_comment,
                                     cxx_comment,
                                     dollar_comment),
                                    comment_first_chars = ('//', r'/\*', r'\$'))

    # StripDollarComments() is a valid stripper now.
    # You can add it to your builder using
    # add_stripper() method.

Predefined wrappers for GenericStripComments():

Creating new strippers

Let's say that you want to use your XYZ compiler to compile the hello.xyz file. The XYZ language uses the % sign to start a comment (comments end with a newline). You often change and recompile your packages written in XYZ, so you don't want SCons to rebuild when only comments have changed.

All you have to do is create your builder and add a stripper to it. You can use the SCons.CommentsRE.GenericStripComments() function to create the function you need.

The SConstruct file could look like this:

    import os
    from SCons.CommentsRE import GenericStripComments, oneline_comment_regexp

    def StripPercentComments(filename):
        # Raw strings keep the backslashes intact for the regexp engine.
        percent_comment = oneline_comment_regexp(r'\%')
        return GenericStripComments(filename,
                                    (percent_comment,),  # trailing comma makes this a tuple
                                    comment_first_chars = (r'\%',))

    def XYZBuilder(target, source, env):
        # A non-zero return value tells SCons that the action failed.
        return os.system("xyz %s" % str(source[0]))

    # Builder and Environment are provided by SCons inside an SConstruct file.
    xyz_builder = Builder(action = XYZBuilder)
    xyz_builder.add_stripper('.xyz', StripPercentComments)
    env = Environment(BUILDERS = { 'XYZ': xyz_builder })
    env.XYZ('hello.xyz')

If you need another kind of stripping function, you can use one of the predefined functions or write your own. Your function should take one argument, the name of the file to strip comments or code from, and return the stripped contents of the file.
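A stripper written from scratch only has to honour that contract: one file name in, stripped contents out. The following stand-alone sketch shows such a function for the % comments of the XYZ example. It uses only the re module; strip_percent_comments() and the sample file contents are hypothetical, and the sketch ignores % signs inside string literals.

```python
import os
import re
import tempfile

# Assumed pattern: a % sign and everything up to the end of the line.
PERCENT_COMMENT = re.compile(r"%[^\n]*")

def strip_percent_comments(filename):
    """Stripper sketch: takes a file name and returns the file's
    contents without % comments."""
    with open(filename) as f:
        contents = f.read()
    return PERCENT_COMMENT.sub("", contents)

# Write a small made-up XYZ source file and strip it.
fd, path = tempfile.mkstemp(suffix=".xyz")
with os.fdopen(fd, "w") as f:
    f.write("print hello % greet the user\n% a full-line comment\nprint bye\n")
try:
    stripped = strip_percent_comments(path)
finally:
    os.remove(path)

print(repr(stripped))
```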

GSoC2008/MatiGruca/StrippingFramework (last edited 2008-07-31 14:53:33 by apn-77-114-173-108)