CDT Build Model

 

Build systems for C and C++ vary widely, but most of them have some features in common. This build model aims to support only those common features. If someone has other requirements that the build model doesn’t meet, we can either extend the build model or provide a separate build system.

 

This document first gives a quick description of the standard make build system that exists in the CDT today. It then describes in detail the managed build system we are proposing.

Standard Make Build

The original build system in the CDT is the standard make system. This system is very simple. The user provides a makefile in the project directory and the build system by default invokes make in that directory. A facility is provided to override the command that gets invoked. A facility is also provided to define make targets, which are simply passed to the make command on the command line. For example, a make target of “debug” will result in the build command “make debug”.

 

At the back end of the build system are the error parsers, which are registered through an extension point. The build system processes the output of the build line by line, passing each line to each error parser in turn until one of them handles it. That error parser is then moved to the head of the list to improve performance for the next error, which will likely be handled by the same error parser. The error parser parses the output line, creates a marker for the error, and places it in the task list. There is obviously a chance of contention for errors when tool chains for multiple compilers are registered.
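The dispatch loop described above can be sketched as follows. This is only an illustration, not the CDT implementation: the parser functions, the regular expressions, and the dictionary used as a marker are all hypothetical stand-ins.

```python
import re

def gcc_style_parser(line):
    """Hypothetical parser for lines of the form 'file:line: message'."""
    m = re.match(r"^(?P<file>[^:\s]+):(?P<line>\d+):\s*(?P<msg>.+)$", line)
    if m:
        return {"file": m.group("file"), "line": int(m.group("line")),
                "message": m.group("msg")}
    return None

def make_style_parser(line):
    """Hypothetical parser for lines of the form 'make: *** message'."""
    m = re.match(r"^make(\[\d+\])?: \*\*\* (?P<msg>.+)$", line)
    if m:
        return {"file": None, "line": None, "message": m.group("msg")}
    return None

def process_output(lines, parsers):
    """Offer each output line to the parsers in order.

    The first parser that handles a line is moved to the head of the
    list, so it is tried first for the next line.
    """
    markers = []
    for line in lines:
        for i, parser in enumerate(parsers):
            marker = parser(line)
            if marker is not None:
                markers.append(marker)
                parsers.insert(0, parsers.pop(i))  # move to front
                break
    return markers
```

Because the first parser that accepts a line wins, the order of the list decides which parser claims an ambiguous line. That is the contention mentioned above.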

 

And that’s about all it does.

Managed Build Overview

The main objective of the managed build system is to relieve the user of having to manage their makefiles manually. To do this we first need to understand the underlying semantics of make and how it figures out what commands to execute. Then we must build a model that simplifies the job of feeding this information to make.

Semantics of Make

The main job of make is to update target files when their source files are newer than the targets. When this happens, make runs a list of commands, and these commands are expected to update the target file. The source/target relationship together with the commands to run is known as a rule. A common scenario is to define a pseudo target that isn’t intended to exist on the file system but is used to capture a list of commands to be run manually (e.g., make clean runs a list of commands to clean up files).
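The out-of-date test at the heart of these semantics can be sketched in a few lines. This is a simplification for illustration only: the function name is made up, and real make also handles missing sources, chains of rules, and pseudo targets.

```python
import os

def needs_update(target, sources):
    """Return True if the target is missing or older than any source."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(source) > target_time for source in sources)
```

A pseudo target such as clean never exists on the file system, so under this test it is always out of date and its commands always run.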

 

To simplify specifying dependencies in a makefile for large projects, make supports implicit rules. The files involved can be specified either by file extension or using wild cards. With file extensions, for example, .c.o means that a file with the extension .o (a.o) depends on a file with the same base name but with the extension .c (a.c). With wild cards, for example %.o: %.c, the % is the wild card, and whatever it matches in the target name (a in a.o) is substituted for the % in the source pattern (to produce a.c). The list of commands then uses special macros, such as $@ and $<, to fill in the names of the files that were actually matched.
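The wild card matching can be sketched as below. The function names are hypothetical, and the sketch assumes patterns with a single % and a non-empty match, as in GNU make pattern rules.

```python
def match_stem(pattern, name):
    """Return the text matched by % in pattern, or None if name does not match."""
    prefix, _, suffix = pattern.partition("%")
    if (name.startswith(prefix) and name.endswith(suffix)
            and len(name) > len(prefix) + len(suffix)):
        return name[len(prefix):len(name) - len(suffix)]
    return None

def source_for(target, target_pattern, source_pattern):
    """Apply an implicit rule 'target_pattern: source_pattern' to a target name."""
    stem = match_stem(target_pattern, target)
    if stem is None:
        return None
    return source_pattern.replace("%", stem, 1)
```

For instance, source_for("a.o", "%.o", "%.c") yields "a.c", while a target such as "a.txt" does not match the rule at all.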

 

Make also has macros, which are generally just a way to insert variable text into rules.

The Managed Build Model

The build model attempts to generalize the concepts involved in a makefile. The main purpose of doing this is to be able to automatically generate a user interface that will allow the user to create and edit the information that goes into a makefile. This interface must allow the user to work at a higher level of abstraction than the text that goes in the makefile. It also must not be so high level that it prevents the user from doing things they are accustomed to doing with makefiles. These are somewhat conflicting requirements, so a delicate balance between ease of use and expressiveness needs to be achieved.

 

The build model is as follows:

The following sections describe each class and its relationships in more detail.

Target

A target represents an execution platform. From the build perspective, it is expected that the build results are intended to execute on such a platform. A target defines a set of tools that produce target specific output or work in some target specific way, including extracting information from target specific files.

 

Examples of targets include: linux-x86, cygwin, win32, embedded-os-X.

 

Some of these tools are build related, some are not. As such, these concepts can be considered part of a target model that transcends the build model and is applicable to all build systems.

Machine

A machine is a particular instance of a target. It represents an actual physical machine that the CDT will interact with. Machines are used to provide settings for launch configurations.

 

Examples of machines include: dschaefer2 (an instance of linux-x86), dschaefer1-cygwin (an instance of cygwin), dschaefer1-win32 (an instance of win32).

 

Issue: the above example raises a question: is it possible for a given machine to be an instance of multiple targets? Or does a machine represent something more logical?

Tool

A tool represents some sort of executable component that can take a set of inputs and produce a set of outputs. Either or both of these sets can be empty or can be considered “uninteresting” (i.e., doesn’t feed into any other tools).

 

Issue: this sounds close to the External Tool concept in Eclipse. Are there opportunities for reuse?

 

A tool can be target specific or target independent.  If a tool is target specific, it can only be used to build for that target.

 

Issue: can a tool be specific to more than one target?

 

A tool can have an associated error parser which is responsible for parsing the output of the tool and producing appropriate task items.

 

A tool can have an associated dependency generator that can be used to figure out dynamically what other files serve as implicit inputs to the tool. This is similar to the functionality provided by the makedepend tool or the -M option of gcc and other compilers.

 

Certain components of the CDT and its add-ons require certain information from tool settings. To support this while keeping the build model tool independent, a tool has a tag attribute which can contain standardized identifying information. For example, a tool can have a “compiler” tag to signify that this tool is a compiler.

 

Issue: how do we manage the standardization process to make sure that everyone follows it and that the appropriate tags are defined?

 

For command line based tools, the tool has an associated script generator that produces the actual command line that will appear in the makefile. There will be a standard generic script generator that uses the script fragment information in the various components of the build model to produce the command line. This generator can be used in most cases, but can be overridden as necessary.
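One way a generic script generator might assemble a command line from script fragments is sketched below. The class names, the fields, and the "{value}" placeholder convention are assumptions made for this example; they are not defined by the build model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Option:
    name: str
    fragment: str         # hypothetical script fragment, e.g. "-g" or "-D{value}"
    value: object = None  # True/False for flags, a string or a list otherwise

@dataclass
class Tool:
    command: str
    options: List[Option] = field(default_factory=list)
    io_fragment: str = ""  # hypothetical fragment naming outputs and inputs

def generate_command(tool):
    """Build the command line for a tool from its option fragments."""
    parts = [tool.command]
    for option in tool.options:
        if option.value is True:
            parts.append(option.fragment)
        elif isinstance(option.value, str):
            parts.append(option.fragment.format(value=option.value))
        elif isinstance(option.value, (list, tuple)):
            parts.extend(option.fragment.format(value=v) for v in option.value)
        # An option whose value is False or None is not emitted.
    if tool.io_fragment:
        parts.append(tool.io_fragment)
    return " ".join(parts)
```

With a debug flag set, a single define of DOUG, and an io_fragment of "-c -o $@ $<", this produces a command such as "g++ -g -DDOUG -c -o $@ $<". A tool with unusual command line syntax would replace generate_command with its own generator.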

Option

Tools have a set of options. These options often manifest themselves as command line options, such as -g for emitting symbolic debugging information into the object file. The option concept is formalized to allow for the automatic generation of GUI elements for options and to allow the appropriate settings to be extracted and generated into the makefiles.

 

As with tools, options have tags to allow for identification.  For example, the inclusion paths option of the “compiler” tool can be tagged “inclusionPaths” to allow for other CDT components to find out the inclusion paths used by the compiler.

Option Enum

An option has a type.  If the type requires a list of possible choices for its value, a set of Option Enums can be defined.

Option Category

A tool can have a large number of options. To help organize the user interface for these options, a hierarchical set of option categories can be defined.

Configuration

A configuration represents a makefile.

 

A project may have multiple configurations. In a given user’s workspace, each project selects one configuration as the active configuration, which is the one used for the Eclipse build calls.

 

A configuration builds for a given target and uses tools specific to that target or tools that are target independent.

 

A configuration lists a result, which is one or more files that are produced by the build. It also lists a clean command, which is the command executed during rebuild operations (as part of make clean).

 

A configuration can also list a build directory.  The makefile is generated to this directory and the build is run from that directory.  The makefile generator will figure out the paths from the build directory to the source files so that they can be found by the build tools.  It is expected that the output of the build tools will go into the build directory, or some hierarchical structure below the build directory.  If the build directory is not specified, the build will be run from the project root directory.
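The path calculation the makefile generator has to perform can be sketched with the standard library. The function name and the directory layout in the usage note are made up for illustration.

```python
import os

def relative_to_build_dir(path, build_dir):
    """Return path as seen from build_dir, using forward slashes for make."""
    return os.path.relpath(path, build_dir).replace(os.sep, "/")
```

For a hypothetical project in /work/client with a build directory /work/client/debug, the project root is ".." as seen from the build directory and main.cpp is "../main.cpp". When no build directory is specified, the build directory is the project root and the root is simply ".".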

 

Issue: running the build from the root directory is problematic if multiple configurations are defined. Although it is a bad idea, are there ways we can support it, such as forcing a clean when the active configuration changes?

 

A configuration defines a set of rules.

Rule

A rule pretty much maps directly to a make rule.  The rule defines a set of results which forms the target of the rule as well as a set of sources which forms the explicit dependencies for the rule.  A rule defines a list of commands which execute when the targets are older than their dependencies.

 

Rules can be either implicit or explicit. Implicit rules use wild cards to specify the results and sources. The GNU make syntax for wild cards is supported. Explicit rules name the results and sources directly.

 

Issue: if the user is not using GNU make, then a conversion needs to happen at makefile generation time.

 

To help define the rules, two macros are generated into the makefile. ROOT specifies the location of the root of the project relative to the build directory. RESOURCES lists every resource in the project. GNU make syntax may then be used to process the values in these macros. For example, to generate the list of object files produced by a C++ compiler, one may use ‘$(RESOURCES:$(ROOT)/%.cpp=%.o)’.
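To show what that substitution evaluates to, here is a small emulation of the GNU make pattern substitution. The function name is borrowed from make's patsubst function, but the code is only a sketch: it assumes a pattern containing a single %, and it would also be a starting point for a makefile generator that has to expand these macros itself.

```python
def patsubst(pattern, replacement, text):
    """Emulate a GNU make substitution reference with a single % wild card.

    Words that match the pattern are rewritten; all other words are
    passed through unchanged, as GNU make does.
    """
    prefix, percent, suffix = pattern.partition("%")
    result = []
    for word in text.split():
        if (percent and word.startswith(prefix) and word.endswith(suffix)
                and len(word) >= len(prefix) + len(suffix)):
            stem = word[len(prefix):len(word) - len(suffix)]
            result.append(replacement.replace("%", stem, 1))
        else:
            result.append(word)
    return " ".join(result)
```

With ROOT set to "..", patsubst("../%.cpp", "%.o", "../client.cpp ../main.cpp") gives "client.o main.o". Note that a resource that is not a .cpp file would pass through unchanged, so the contents of RESOURCES matter when it is used this way.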

 

Issue: again, if the user is not using GNU make, these macros will need to be processed by the makefile generator.

Command and Option Values

A command represents the invocation of a tool to satisfy a rule.  The command may override the default option values for the tool for the configuration containing the rule.

Makefile Generation

Given the way the build model is defined, makefile generation should be straightforward, especially for GNU make. The following is an example makefile:

 

ROOT = ..
RESOURCES = $(ROOT)/client.cpp $(ROOT)/main.cpp

# Rule 1
all:  client.exe

# Rule 2
clean:
	rm -f client.exe $(RESOURCES:$(ROOT)/%.cpp=%.o)

# Rule 3
%.o:  $(ROOT)/%.cpp
	g++ -g -c -o $@ $<

# Rule 4
client.o:   $(ROOT)/client.cpp
	g++ -g -DDOUG -c -o $@ $<

# Rule 5
client.exe: $(RESOURCES:$(ROOT)/%.cpp=%.o)
	g++ -g -o $@ $^ -L /usr/X11R6/lib -lX11

# Generated dependencies
client.o: /usr/include/sys/types.h /usr/include/_ansi.h /usr/include/newlib.h
client.o: /usr/include/sys/config.h /usr/include/machine/ieeefp.h
client.o: /usr/include/sys/_types.h /usr/include/machine/types.h
client.o: /usr/include/sys/socket.h /usr/include/features.h
client.o: /usr/include/sys/features.h /usr/include/cygwin/socket.h
client.o: /usr/include/asm/socket.h /usr/include/cygwin/if.h
client.o: /usr/include/cygwin/sockios.h /usr/include/cygwin/uio.h
client.o: /usr/include/sys/time.h /usr/include/netinet/in.h
client.o: /usr/include/cygwin/in.h /usr/include/cygwin/types.h
client.o: /usr/include/sys/sysmacros.h /usr/include/asm/byteorder.h
client.o: /usr/include/stdio.h /usr/include/sys/reent.h
client.o: /usr/include/sys/stdio.h /usr/include/string.h

 

The makefile generator generates one makefile for each configuration. At the top of the makefile is the ROOT macro, which points at the root directory of the project. In this case we are building in a subdirectory of the root, so it points to the parent directory. Following this macro is the RESOURCES macro, which lists all of the resources in the project (Issue: we might want to put in filters to reduce the contents to only those resources we would be interested in).

 

This is followed by the rules.  There is one make rule for each Rule in the Configuration.  The target and dependencies are extracted directly from the fields in the Rule.  The makefile generator calls the script generator on the tools to generate the command lines with the appropriate option settings for the rules.
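The generation step can be sketched as follows. The Rule and Configuration classes here are simplified stand-ins for the model classes, with field names chosen for this example, and the command lines are passed in as ready-made strings where the real generator would call the script generator of each tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Rule:
    results: List[str]
    sources: List[str]
    commands: List[str] = field(default_factory=list)

@dataclass
class Configuration:
    root: str             # path from the build directory to the project root
    resources: List[str]  # project resources, relative to the root
    results: List[str]
    clean_command: str
    rules: List[Rule] = field(default_factory=list)

def generate_makefile(config):
    """Return the text of a GNU makefile for one configuration."""
    lines = [
        "ROOT = " + config.root,
        "RESOURCES = " + " ".join("$(ROOT)/" + r for r in config.resources),
        "",
        "all: " + " ".join(config.results),
        "",
        "clean:",
        "\t" + config.clean_command,
    ]
    for rule in config.rules:
        lines.append("")
        lines.append(" ".join(rule.results) + ": " + " ".join(rule.sources))
        lines.extend("\t" + command for command in rule.commands)
    return "\n".join(lines) + "\n"
```

Each command line is prefixed with a tab character, which make requires in front of the commands of a rule.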

 

The following sections describe each rule in more detail.

Rule 1

all:  client.exe

 

‘all’ is the main target of the makefile.  It lists as dependencies the results field of the Configuration.

Rule 2

clean:
	rm -f client.exe $(RESOURCES:$(ROOT)/%.cpp=%.o)

 

‘clean’ is called as part of the rebuild command to remove any previous build results.  Since it is hard to tell which ones they are, the Configuration provides the complete command to be run for this rule.

 

Issue: these first two rules are pseudo targets and should be marked as such.

Rule 3

%.o:  $(ROOT)/%.cpp
	g++ -g -c -o $@ $<

 

This is the standard implicit rule for compiling a C++ file with the cpp file extension to an object file.

Rule 4

client.o:   $(ROOT)/client.cpp
	g++ -g -DDOUG -c -o $@ $<

 

This is an explicit rule which does pretty much the same thing but with a special define.

Rule 5

client.exe: $(RESOURCES:$(ROOT)/%.cpp=%.o)
	g++ -g -o $@ $^ -L /usr/X11R6/lib -lX11

 

This is an explicit rule which defines the final step in making the result for the configuration.

Generated Dependencies

This contains the output from the dependency generator when run on the client.cpp resource.

 

Issue: if the dependency generator is associated with the tool, how would this work with implicit rules? How does it know what resources to look at? Or does the CDT determine which dependency generators to run based on the tools mentioned in the configuration and run them at the project level?

Creating New Configurations

To simplify the task of creating new configurations and to get new projects up and running in a hurry, the CDT provides for the creation of Configuration Templates.  Configuration Templates are Configurations not associated with Projects (and probably defined by extension points) that are copied into Projects.  The configuration templates should contain implicit rules in general with explicit rules for standard results (like libraries and executables).

 

Issues: how do the explicit rules in the template figure out the names of the results (e.g. client.exe from the above example)? Do we need to define an expansion wizard class for the template that is called to get the information required to fill out the template? We could probably provide a set of standard ones for executables, libraries, etc.

 

Once copied into a project, the configuration can then be changed as the user deems necessary.