Chapter 20. Writing Scanners

SCons has built-in scanners that know how to look in C, Fortran and IDL source files for information about other files that targets built from those files depend on--for example, in the case of files that use the C preprocessor, the .h files that are specified using #include lines in the source. You can use the same mechanisms that SCons uses to create its built-in scanners to write scanners of your own for file types that SCons does not know how to scan "out of the box."

20.1. A Simple Scanner Example

Suppose, for example, that we want to create a simple scanner for .foo files. A .foo file contains some text that will be processed, and can include other files on lines that begin with include followed by a file name:


      include filename.foo
    

Scanning a file will be handled by a Python function that you must supply. Here is a function that will use the Python re module to scan for the include lines in our example:


      import re
      
      include_re = re.compile(r'^include\s+(\S+)$', re.M)
      
      def kfile_scan(node, env, path, arg):
          contents = node.get_text_contents()
          return env.File(include_re.findall(contents))
    

It is important to note that you have to return a list of File nodes from the scanner function, simple strings for the file names won't do. As in the examples we are showing here, you can use the File function of your current Environment in order to create nodes on the fly from a sequence of file names with relative paths.

The scanner function must accept the four specified arguments and return a list of implicit dependencies. Presumably, these would be dependencies found from examining the contents of the file, although the function can perform any manipulation at all to generate the list of dependencies.

node

An SCons node object representing the file being scanned. The path name to the file can be used by converting the node to a string using the str() function, or an internal SCons get_text_contents() object method can be used to fetch the contents.

env

The construction environment in effect for this scan. The scanner function may choose to use construction variables from this environment to affect its behavior.

path

A list of directories that form the search path for included files for this scanner. This is how SCons handles the $CPPPATH and $LIBPATH variables.

arg

An optional argument that you can choose to have passed to this scanner function by various scanner instances.

A Scanner object is created using the Scanner function, which typically takes an skeys argument to associate the type of file suffix with this scanner. The Scanner object must then be associated with the $SCANNERS construction variable of a construction environment, typically by using the Append method:


       kscan = Scanner(function = kfile_scan,
                       skeys = ['.k'])
       env.Append(SCANNERS = kscan)
    

When we put it all together, it looks like:


        import re

        include_re = re.compile(r'^include\s+(\S+)$', re.M)

        def kfile_scan(node, env, path):
            contents = node.get_text_contents()
            includes = include_re.findall(contents)
            return env.File(includes)

        kscan = Scanner(function = kfile_scan,
                        skeys = ['.k'])

        env = Environment(ENV = {'PATH' : '/usr/local/bin'})
        env.Append(SCANNERS = kscan)

        env.Command('foo', 'foo.k', 'kprocess < $SOURCES > $TARGET')