Copyright © 2007 International Business Machines Corp.

 

 

Customizing UML:

Which Technique is Right for You?

 

Summary

Extending or restricting UML to suit a particular domain is not a simple task.  Several options are available, each of which presents its own set of challenges.  This article walks the reader through the decision-making process of choosing a technique.  The mechanics of actually extending UML to suit a domain will be handled in future articles.

 

By James Bruck and Kenn Hussey, IBM
June 19, 2008


 

Glossary

A few terms and abbreviations will be used throughout this document.

Term

Definition

DSL

Domain Specific Language.  A language designed to be useful for a specific set of tasks.  This is in contrast to a general-purpose modeling language such as UML.  A DSL is created specifically to solve problems in a particular domain and is not intended to solve problems outside of it.  DSLs are usually small, more declarative and less expressive than a general-purpose language.

 

MOF

Meta Object Facility.  MOF is an OMG standard for Model Driven Engineering.  MOF is designed as a four-layer architecture (M0 through M3).  UML is considered a layer 2 (M2) MOF model.  M3 is the layer at which MOF provides the language used to build meta-models.

 

OMG

Object Management Group.  A consortium that promotes the adoption of standards.

 

UML Testing Profile

The UML testing profile can be found at: http://www.omg.org/technology/documents/formal/test_profile.htm.  It is a demonstration of creating a DSL using both a profile and MOF based approach.

 

UML

UML2 2.1.  This refers to the latest version of the UML API, which is based on the UML 2.1.2 specification defined by the OMG.

The latest draft of the UML Superstructure Specification can be found at: http://www.omg.org/technology/documents/formal/uml.htm

 

 

Introduction

The ability to customize UML for a specific domain is one of its great features.  Customization lets you leverage the existing modeling tools and conventions defined by the UML specification while making modeling easier (and possibly less abstract) for the end user.  The type of customization required often depends on the nature of the domain and on how the extended model is intended to be used.  If you only want to make simple customizations, such as adding new properties to existing UML meta-classes, then a lightweight extension is the way to go.  However, if you want to extend the behavior of UML, add restrictions on certain collections, or take advantage of the more complex features of UML such as redefinition, then a heavyweight extension is the way to go.  As the Superstructure Specification further points out, there are several reasons why one may want to customize a meta-model:

 

  • Give a terminology that is adapted to a particular domain.
  • Give a syntax for constructs that do not have a notation.
  • Give a different notation for already existing symbols
  • Add semantics that is left unspecified in the meta-model.
  • Add semantics that do not exist in the meta-model.
  • Add constraints that restrict the way you use the meta-model.
  • Add information that can be used when transforming a model to another model or code.

Decisions, Decisions …

Creating your own Domain Specific Language has several advantages:   

  • DSLs allow solutions to be expressed at the level of abstraction of the problem domain.  Domain experts can therefore understand, validate and modify them easily.
  • DSLs enhance quality, productivity, maintainability and reusability.

 

And some disadvantages:

  • Cost of designing, implementing and maintaining a DSL.
  • Difficulty in balancing between domain-specific and general-purpose language constructs.

 

Where do you go from here?

The first decision to make is whether you want to build on top of existing UML concepts.  That is to say, do you intend to extend or restrict existing UML concepts and meta-types?  You can answer this by analyzing the domain space of your DSL.  If there is much overlap between the concepts in UML and those within your DSL, then favor extending UML.  If there is little overlap, then favor a MOF based solution.

Figure 1:  UML Extension vs. MOF based

 

An example of MOF based vs. profile based approaches to developing a DSL is demonstrated with the UML testing profile.   In that specification, the same DSL is developed using two different techniques. 

 

 Have a look at the UML testing profile.

 

This article focuses on extending concepts from UML.  If you wish to take the MOF based approach, you can stop reading here and have a look at the OMG’s MOF standard.  Although the preferred approach to modeling languages is to use UML with customizations, there is great value in creating custom languages in specialized circumstances.  UML itself is defined using MOF, and many other languages are defined in MOF terms.  In fact, the Eclipse Modeling Framework (EMF), the component on which the UML2 component is based, can generate a Java implementation for working with MOF-defined languages.

 

Techniques for extending UML

You have several options when it comes to extending/constraining UML in your quest to create a DSL.  Choosing the correct method is critical for the success of your project because you will have to commit to your decision and invest time and effort addressing the hurdles associated with each.  To that end, a general description of each technique and then a table summarizing the features of each approach will be provided.

 As you continue to read about the techniques for extending UML, you should keep the following in mind: 

  • As much as possible, you should be favoring the “lightweight approach” or the use of profiles.
  • As much as possible, you should be leaning away from the use of “middleweight” extensions.

 

Featherweight extension

Featherweight extensions involve adding keywords.  Keywords are reserved words that normally appear as text annotations attached to a UML element.  The superstructure specification describes the use of keywords and maintains a list of predefined keywords (see Superstructure Specification Annex B).  Keywords are always enclosed in guillemets, which serve as visual cues that make it easier to see when a keyword is being used.

 Keywords are case sensitive.

Example:

The meta-type  uml::Interface has a similar appearance to uml::Class.   For this reason, the keyword <<interface>> is used to distinguish interfaces from other classifiers.

        

Figure 2: Use of keywords on meta-type

Example:

In the following example, it becomes immediately obvious that keywords are critical for distinguishing between different types of relationships.

Figure 3: Use of keyword on relationships

Keywords are used for four different purposes:

  1. To distinguish a particular UML concept (meta-class) from others sharing the same general graphical form.
  2. To distinguish a particular kind of relationship between UML concepts from other relationships sharing the same graphical form.
  3. To specify the value of some modifier attached to a UML concept (meta-attribute value).   Thus the keyword <<singleExecution>> appearing within an Activity signifies that the isSingleExecution attribute of the Activity is true.
  4. To indicate a standard stereotype.  For example, the <<modelLibrary>> keyword attached to a package identifies that the package contains a set of model elements intended to be shared by multiple models.

Keywords created in this manner will appear in the labels of the item when viewed with the existing UML editor.  When keywords get created, they are added as an annotation as shown below. 

Figure 4: Keywords in a model

It is important to note that the use of annotations in this manner is non-standard and therefore not directly supported by the UML editor.   If you decide to export your model with keywords to XMI, the annotations would be moved into an XMI extension.   Consumers of the XMI format could conceivably continue to use your keywords if they know how to work with the newly created XMI elements. 

 

The UML2 API does support the use of keywords:

  • Element#addKeyword()
  • Element#getKeywords()
  • Element#removeKeyword()

 

The use of keywords is somewhat limited in that one cannot attach constraints or add properties to existing meta-types.  Keywords are strictly used as tags to visually distinguish similar items.  Another downside is that there is no formally defined concept of a dictionary of keywords.  As a result, anyone who defines keywords for a particular domain cannot directly share them with others.  The set of appropriate keywords has to be agreed upon and applied by the end users.  The modeling tool you are using also has to support a means of displaying keywords consistently.
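The limitations above can be seen in a small sketch.  The following is a plain-Java stand-in, not the UML2 API: the class TaggedElement and its label() method are hypothetical, and only the add/has/remove method names are borrowed from the style of the keyword operations listed earlier.  It shows that keywords are just case-sensitive strings with no dictionary and no validation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a model element; this is NOT the UML2 Element interface.
class TaggedElement {
    private final String name;
    // Keywords are plain strings: there is no dictionary and no validation.
    private final Set<String> keywords = new LinkedHashSet<>();

    TaggedElement(String name) {
        this.name = name;
    }

    boolean addKeyword(String keyword) {
        return keywords.add(keyword);
    }

    boolean hasKeyword(String keyword) {
        return keywords.contains(keyword); // exact, case-sensitive match
    }

    boolean removeKeyword(String keyword) {
        return keywords.remove(keyword);
    }

    // Builds a label the way a diagram might render it, each keyword in << >>.
    String label() {
        StringBuilder sb = new StringBuilder();
        for (String k : keywords) {
            sb.append("<<").append(k).append(">> ");
        }
        return sb.append(name).toString();
    }
}
```

Because nothing checks a keyword against an agreed vocabulary, a misspelled keyword such as "interfase" is accepted just as readily as "interface", which is why the set of keywords has to be agreed upon by convention.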

 

Pros

  • Adding keywords is trivial
  • Great to visually distinguish between similar looking items.

 

Cons

  • Limited functionality.
  • No concept of dictionary of keywords to share common keywords.
  • No way to validate application of keywords.
  • Not standardized or formally defined by superstructure specification.
  • No way to truly extend existing meta-types by adding attributes or operations

 

 

Lightweight extension

Lightweight extensions involve using profiles.  Profiles are described in detail in the UML Superstructure specification chapter 18.

 

Profiles should be your first instinctive choice when deciding to extend or customize UML.  A profile defines limited extensions to a reference meta-model with the purpose of adapting the meta-model to a specific platform domain.  The primary extension construct is the Stereotype, which is defined as part of a Profile.  Stereotypes can be used to add:  keywords, constraints, images, and properties (tagged values) to model elements. 

 

The profile mechanism is not a first-class mechanism; that is to say, it does not allow for specialization through inheritance of meta-types from the referenced meta-model.  Rather, the intention of a profile is to give a straightforward mechanism for adapting an existing meta-model with constructs that are specific to a particular domain.  Each such adaptation is grouped in a profile.

Figure 5: Profile as defined by UML meta-model
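To make the idea of a stereotype adding properties (tagged values) to an existing meta-type more concrete, here is a minimal plain-Java sketch.  It is not the UML2 API: SketchStereotype and SketchElement are hypothetical classes, and the string-keyed setValue/getValue accessors are an assumption chosen to illustrate why programmatic use of profiles tends to feel clumsy compared with generated, typed accessors.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of a stereotype; not the UML2 API.
class SketchStereotype {
    final String name;
    final String extendedMetaclass;                 // e.g. "Class"
    final Set<String> propertyNames = new HashSet<>();

    SketchStereotype(String name, String extendedMetaclass, String... propertyNames) {
        this.name = name;
        this.extendedMetaclass = extendedMetaclass;
        for (String p : propertyNames) {
            this.propertyNames.add(p);
        }
    }
}

// Hypothetical model element identified only by the name of its metaclass.
class SketchElement {
    final String metaclass;
    // Tagged values live beside the element, keyed by stereotype and property name.
    private final Map<SketchStereotype, Map<String, Object>> applied = new HashMap<>();

    SketchElement(String metaclass) {
        this.metaclass = metaclass;
    }

    // A stereotype can only be applied to the metaclass it extends.
    void applyStereotype(SketchStereotype s) {
        if (!s.extendedMetaclass.equals(metaclass)) {
            throw new IllegalArgumentException(s.name + " does not extend " + metaclass);
        }
        applied.putIfAbsent(s, new HashMap<>());
    }

    void setValue(SketchStereotype s, String property, Object value) {
        values(s, property).put(property, value);
    }

    Object getValue(SketchStereotype s, String property) {
        return values(s, property).get(property);
    }

    // Looks up the tagged values, rejecting unapplied stereotypes and unknown properties.
    private Map<String, Object> values(SketchStereotype s, String property) {
        Map<String, Object> v = applied.get(s);
        if (v == null) {
            throw new IllegalStateException(s.name + " is not applied");
        }
        if (!s.propertyNames.contains(property)) {
            throw new IllegalArgumentException("No property " + property + " on " + s.name);
        }
        return v;
    }
}
```

Note that the element's own type never changes: the stereotype adds data alongside an existing meta-type rather than specializing it, which is the sense in which profiles are not a first-class extension mechanism.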

Two new concepts have been introduced into the UML2 2.1 API.  These concepts will be mentioned here but will be explored in more detail in future articles.    

  • Static Profile Definition:   Newly introduced in UML2 2.1 is the ability to create statically defined profiles.   Users now have the option to generate code from their profile and provide implementations for operations and derived features.

 

  • OCL Integration:  Users can specify invariant constraints or operation bodies in OCL and have code generated from the expressions entered in the UML model.   Validation of constraints created on stereotypes is possible after the stereotype has been applied.

 

 

Newly introduced in UML 2.0 is the notion of “strict” application of a profile.  This is a means of specifying the kinds of meta-types that your DSL is concerned with.  For example, say you were really only interested in Classes, Properties and Operations, but you only wanted to specify a stereotype for Class.  You could create meta-class references to Property and Operation.  Then, when applying your profile, you could specify a “strict” application.  UML editors should respect the strict attribute of the profile application by removing all other UML concepts from palettes and the like, leaving only those that your DSL is concerned with.  This feature may or may not be supported by a given tool.  Without a strict application of a profile, all other UML concepts remain available.
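As a rough illustration of what a tool might do with a strict application, the following plain-Java sketch filters a palette down to the referenced meta-classes.  The class PaletteSketch, its tiny list of meta-class names, and the visible() method are all hypothetical; real tools may implement this differently or not at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical palette filter; a sketch of the idea, not any tool's actual behavior.
class PaletteSketch {
    // Meta-classes the tool would normally offer (a tiny illustrative subset).
    static final List<String> ALL =
            Arrays.asList("Class", "Property", "Operation", "Interface", "Actor");

    // With a strict application, only the referenced meta-classes remain visible.
    // Without it, every UML concept stays available.
    static List<String> visible(List<String> referencedMetaclasses, boolean strict) {
        List<String> result = new ArrayList<>(ALL);
        if (strict) {
            result.retainAll(referencedMetaclasses);
        }
        return result;
    }
}
```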

 

Pros

  • Easy to create such extensions
  • Well described with documentation in Superstructure Specification
  • Standard means to define icons
  • Well defined display options.
  • Application of profiles and how to use them is well defined.
  • Can add structure
  • Low development cost
  • Leverage existing UML editors
  • Ease of deployment.

 

Cons

  • Cannot specify behavior
  • Only possible to add new constraints, not to remove constraints.
  • Clumsy programmatic usage
  • Cannot modify existing structures

 

 

Middleweight extension

The middleweight extension mechanism extends UML through specialization of UML meta-types.  Middleweight extensions can be considered a first class extension mechanism and as such expose advanced building block concepts such as sub-setting and redefinition (concepts explored later in this document).

Middleweight extensions have two main components:

  1. Extend by referencing UML.metamodel.uml (the merged UML metamodel).
  2. To that merged set of meta-types, add your own domain specific types.

We make the distinction between middleweight and heavyweight based on what gets extended.

Even though middleweight extensions are initially easier to create (because you omit the Language unit merge step in heavyweight extensions), middleweight extensions are discouraged for three main reasons.

  • You create dependency on a specific version of UML.
  • You extend all of UML even if you may only be interested in certain aspects of it.
  • Implementation classes in the specialized meta-model reference internal UML implementation classes and result in compiler warnings.

 

If you require the flexibility of defining behavior, sub-classing or redefining existing collections, then you should favor heavyweight extensions.   If your DSL on the other hand has much overlap with UML, you could consider using middleweight extensions.

 

It is important to stress that when using such extensions, you will be creating a dependency on the open-source UML implementation and the internal .implementation classes.  That is to say, if UML changes in some way, your newly introduced model elements that extend them might also have to change.
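The shape of a middleweight extension can be sketched in plain Java.  Everything here is hypothetical: BaseClassifier and BaseFactory stand in for types from the referenced meta-model (the role UML plays), while ServiceClassifier and ServiceFactory stand in for a domain-specific extension.  The sketch only illustrates specialization and the resulting pair of factories; it does not use generated UML2 code.

```java
// Stand-ins for a referenced meta-model (playing the role of UML).
class BaseClassifier {
    String name;
}

class BaseFactory {
    BaseClassifier createClassifier() {
        return new BaseClassifier();
    }
}

// Middleweight-style extension: specialize a type from the referenced meta-model.
// Any change to BaseClassifier ripples into this subclass.
class ServiceClassifier extends BaseClassifier {
    String endpoint; // domain-specific property with an ordinary typed field
}

// The extension needs its own factory, so users end up working with two.
class ServiceFactory {
    ServiceClassifier createService() {
        return new ServiceClassifier();
    }
}
```

A client that wants a plain classifier uses BaseFactory, while one that wants the domain type uses ServiceFactory; the subclass relationship is also what ties the extension to one specific version of the base types.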

 

Pros

  • Easier than heavyweight to create initially.
  • Easy for end user to use programmatically.
  • Can add and modify behavior
  • Can add and modify structure
  • Can add and modify constraints
  • API is domain specific
  • First class extension mechanism

 

Cons

  • Creates dependence on specific version of UML.
  • Difficult to maintain especially if UML specification changes.
  • High development cost
  • File format is non-standard
  • End users must know about 2 factories for creating elements: One factory for creating types from the extended meta-model and one defined by the extension.

 

 

Heavyweight extension

Heavyweight extensions involve reuse by copy and merge instead of reuse by specializing types from a referenced meta-model as with middleweight extensions.

 

Creating heavyweight extensions involves two steps:

  1. Select the language units you wish to extend and merge them.  
  2. To that merged set of meta-types, add your own domain specific types.

 

Although the concept and terminology of “heavyweight” extension are not mentioned in the official UML specification, the notion of using package merge to define languages is mentioned and forms the basis for constructing UML itself.

 

So why exactly would you even consider a heavyweight extension?  The simple answer is: “ability to customize and specify behavior”.  With heavyweight extensions you have access to advanced concepts such as sub-setting and redefinition that are used to create UML itself.  Support for these concepts is handled through the customized UML code generator.  The details will be explored in a future article, but the basics are covered in the “First Class” extension concepts section below.

 

Heavyweight extensions are great if you want to allow users to work programmatically with your code by presenting a very clean API or if you want to add or customize behavior.  The downside to heavyweight extension is that it is the most costly approach:  it is the most difficult to develop and its overall usefulness might be limited to those who have intimate knowledge of your API and customizations. 

 

Deploying heavyweight extensions so that users may programmatically create instances of your DSL involves deploying all the plugins involved, as opposed to deploying a single profile as with lightweight extensions.

 

Pros

  • Easy for end users to use programmatically (only 1 meta-model in the end)
  • Ability to override or customize operations and behavior
  • Selectively reuse UML concepts as required.
  • Constraints are enforced at compile time ( type safety )
  • API is domain specific
  • Can add behavior/constraints/structure
  • Isolate yourself from changes to UML meta-model (you merge a copy into your meta-model)
  • You can stratify your own specialized meta-model for different levels of abstraction and DSL concerns.

 

Cons

  • Costly development (more complex than profiles)
  • Difficult to develop such an extension
  • Difficult to maintain (re-merge)
  • Slightly more difficult to deploy.  Involves deploying all plugins involved.
  • Cannot modify existing behavior
  • File format is non-standard
  • Lose interoperability with other UML based tools since we have a new meta-model.

 

 

“First Class” extension concepts

Subsets

The UML specification describes subsetting in the following way:

“Subsetting represents the familiar set-theoretic concept. It is applicable to the collections represented by association ends, not to the association itself. It means that the subsetting association end is a collection that is either equal to the collection that it is subsetting or a proper subset of that collection. (Proper subsetting implies that the superset is not empty and that the subset has fewer members.) Subsetting is a relationship in the domain of extensional semantics. Specialization is, in contrast to subsetting, a relationship in the domain of intentional semantics, which is to say it characterized the criteria whereby membership in the collection is defined, not by the membership. One classifier may specialize another by adding or redefining features; a set cannot specialize another set. A naïve but popular and useful view has it that as the classifier becomes more specialized, the extent of the collection(s) of classified objects narrows. In the case of associations, subsetting ends, according to this view, correlates positively with specializing the association. This view falls down because it ignores the case of classifiers which, for whatever reason, denote the empty set. Adding new criteria for membership does not narrow the extent if the classifier already has a null denotation.”

 

A property may be marked as the subset of another, as long as every element in the context of the subsetting property conforms to the corresponding element in the context of the subsetted property.  The UML2 API and code generator provide support for Java code to enforce subset constraints.

Subsetting comes in two basic flavors:

  • Derived subsets

If a property is a derived subset, then its value or values can be computed from the value of one or more other properties.  That is to say that the values exist in some form somewhere else in the meta-model and that those values can be computed.  Since the values for derived subsets are calculated, users cannot add directly to such collections.

 

Derived properties are often specified to be read-only.  

 

Example:

Package::nestedPackage is a derived subset of Package::packagedElement.  In this case derivation is based on some aspect of the property, namely, the type.
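The Package example above can be approximated in plain Java as follows.  This is a sketch under stated assumptions, not generated UML2 code: PkgElement and Pkg are hypothetical classes, and the derived subset is modeled as a read-only list computed by filtering the superset collection on type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for any element that a package can contain.
class PkgElement {
    final String name;

    PkgElement(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for a package; a package is itself a containable element.
class Pkg extends PkgElement {
    // The superset collection: everything the package contains.
    private final List<PkgElement> containedElements = new ArrayList<>();

    Pkg(String name) {
        super(name);
    }

    List<PkgElement> getContainedElements() {
        return containedElements;
    }

    // Derived subset: computed on demand from the superset by filtering on type.
    // Because it is calculated, callers cannot add to it directly.
    List<Pkg> getNestedPackages() {
        List<Pkg> nested = new ArrayList<>();
        for (PkgElement e : containedElements) {
            if (e instanceof Pkg) {
                nested.add((Pkg) e);
            }
        }
        return Collections.unmodifiableList(nested);
    }
}
```

To nest a package, a caller adds it to the superset collection; it then shows up in the derived subset automatically.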

 

 

  • Non-derived subsets

Non-derived subsets also apply to properties but such properties contain values that cannot be calculated directly from existing features. 

 

 Non-derived subsets must be writable.

 

Example:  

Operation::precondition subsets Namespace::ownedRule but must be populated explicitly.  Items newly added to Operation::precondition also get added to the Namespace::ownedRule collection.  A precondition has a “subset-superset” implementation, meaning that a constraint added to the owned rule collection is not added to the precondition list, whereas a constraint added to the precondition collection is also added to the owned rule collection.
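The one-way propagation described above can be sketched in plain Java.  OperationSketch is a hypothetical class, constraints are represented as simple strings for brevity, and the generated UML2 lists are not used; the sketch only demonstrates that adding to the subset also adds to the superset, but not the reverse.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an operation that owns rules and preconditions.
class OperationSketch {
    private final List<String> ownedRules = new ArrayList<>();     // superset
    private final List<String> preconditions = new ArrayList<>();  // non-derived subset

    // Adding to the superset leaves the subset untouched.
    void addOwnedRule(String constraint) {
        if (!ownedRules.contains(constraint)) {
            ownedRules.add(constraint);
        }
    }

    // Adding to the subset also adds to the superset, so that every
    // precondition is always an owned rule as well.
    void addPrecondition(String constraint) {
        if (!preconditions.contains(constraint)) {
            preconditions.add(constraint);
        }
        addOwnedRule(constraint);
    }

    List<String> getOwnedRules() {
        return Collections.unmodifiableList(ownedRules);
    }

    List<String> getPreconditions() {
        return Collections.unmodifiableList(preconditions);
    }
}
```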

 

Redefinition

The UML specification defines redefinition in the following way:

“Redefinition is a relationship between features of classifiers within a specialization hierarchy. Redefinition may be used to change the definition of a feature, and thereby introduce a specialized classifier in place of the original featuring classifier, but this usage is incidental. The difference in domain (that redefinition applies to features) differentiates redefinition from specialization.”

 

More simply put, redefinition is a way to narrow the scope of a property or to constrain it.  If you wish to restrict what can be added to an existing collection, you should use redefinition.  In effect, redefinition replaces an existing property.  You cannot widen the scope of a property with redefinition.  Redefinition only makes sense in the context of specialization.  That is to say, for example, that a redefining property must be owned by a uml::Class that is a specialization of another uml::Class which owns the redefined property.

 

Redefinition can be used to narrow the type of a property by referring to a more specific type.  Redefinition of features which are lists of items replaces the entire list.  That is, any items contributed via inheritance will be disregarded and the redefined list will be recalculated.   If you wish to contribute items to an existing list, you can use a derived union. The name and visibility of a property are not required to match those of any property it redefines. 

 

It should be pointed out that the detailed semantics of redefinition vary for each specialization of RedefinableElement in UML, and that the name of a property which has been redefined does not have to match the one redefining it.

Example:

In the following example, Class::superClass redefines Classifier::general.  In this case, the name of the redefining property has changed as well as the type!

Figure 6: Redefinition
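A rough plain-Java analogy for this kind of redefinition is shown below.  ClassifierSketch and ClassSketch are hypothetical classes, not the generated UML2 types.  The specialized type exposes the same underlying collection under a new name with a narrower element type, and it restricts what may be added.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical general type owning the redefined property.
class ClassifierSketch {
    final String name;
    protected final List<ClassifierSketch> generals = new ArrayList<>();

    ClassifierSketch(String name) {
        this.name = name;
    }

    // The redefined property: typed by the general type.
    List<ClassifierSketch> getGenerals() {
        return Collections.unmodifiableList(generals);
    }

    void addGeneral(ClassifierSketch general) {
        generals.add(general);
    }
}

// Hypothetical specialization owning the redefining property.
class ClassSketch extends ClassifierSketch {
    ClassSketch(String name) {
        super(name);
    }

    // The redefining property: a different name and a narrower type.
    List<ClassSketch> getSuperClasses() {
        List<ClassSketch> result = new ArrayList<>();
        for (ClassifierSketch g : generals) {
            result.add((ClassSketch) g); // safe: addGeneral only admits ClassSketch
        }
        return Collections.unmodifiableList(result);
    }

    // Redefinition narrows, never widens: only classes may be added here.
    @Override
    void addGeneral(ClassifierSketch general) {
        if (!(general instanceof ClassSketch)) {
            throw new IllegalArgumentException("a class may only specialize classes");
        }
        super.addGeneral(general);
    }
}
```

The analogy is imperfect, since Java has no direct notion of property redefinition; the generated UML2 code handles this through its customized code generator.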

 

Derived Unions

The UML specification defines derived unions in the following way:

“A property may be marked as being a derived union. This means that the collection of values denoted by the property in some context is derived by being the strict union of all of the values denoted, in the same context, by properties defined to subset it. If the property has a multiplicity upper bound of 1, then this means that the values of all the subsets must be null or the same.”

 

More simply put, a derived union indicates that a feature is the union of one or more collections or scalars.  Derived unions are analogous to abstract methods in Java.  As with abstract methods, derived unions really only make sense in the context of a concrete type that defines what is contributed.  Derived unions are useful if you wish to indicate that a particular meta-type defines a feature but that the feature is to be populated within the context of other meta-types.  Users of a derived union create a property which subsets the property marked as a derived union, and that property then contributes more objects to the collection.  A derived union is typically applied to properties on abstract types high up in the inheritance hierarchy.

 

  • A derived union is read only. 
  • A derived union is derived.
  • A derived union is a derived subset, but not necessarily vice versa.
  • Concrete types make derived unions useful by contributing subsets.

Example:

In this example we see that Element::ownedElement is a derived union.  Element contributes Element::ownedComment to that collection.  Package contributes Package::ownedTemplateSignature and Package::profileApplication amongst others.
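The example above can be approximated in plain Java.  The classes UnionElement, UnionComment, UnionProfileApplication and UnionPackage are hypothetical, and the hook method contributeOwnedElements() is an assumption made for this sketch; generated UML2 code is structured differently.  The point is that the union is read-only and computed from the collections that subset it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical abstract root type that declares the derived union.
abstract class UnionElement {
    // A subset contributed by the root type itself.
    final List<UnionElement> ownedComments = new ArrayList<>();

    // Derived union: read-only and computed from the subsets on every call.
    final List<UnionElement> getOwnedElements() {
        List<UnionElement> all = new ArrayList<>(ownedComments);
        contributeOwnedElements(all);
        return Collections.unmodifiableList(all);
    }

    // Hook (an assumption of this sketch): subtypes add the collections
    // that subset the union.
    protected void contributeOwnedElements(List<UnionElement> sink) {
        // no further contributions by default
    }
}

class UnionComment extends UnionElement {
}

class UnionProfileApplication extends UnionElement {
}

// A concrete type makes the union useful by contributing another subset.
class UnionPackage extends UnionElement {
    final List<UnionElement> profileApplications = new ArrayList<>();

    @Override
    protected void contributeOwnedElements(List<UnionElement> sink) {
        sink.addAll(profileApplications);
    }
}
```

Callers populate the subsets (ownedComments, profileApplications) and only ever read the union.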

Option summary

If you have read this far, you will have a sense that deciding how to extend UML is neither straightforward nor simple.  The following summary compares the four techniques, one criterion at a time.

Keyword support

  • Featherweight: Yes.  Simply used as a tag to highlight certain characteristics.
  • Lightweight: Yes.  Part of Stereotype definition in Superstructure Specification.
  • Middleweight: Yes*.  Not directly supported, although it can be accomplished if the implementer customizes display code.
  • Heavyweight: Yes*.  Not directly supported, although it can be accomplished if the implementer customizes display code.

Icon support

  • Featherweight: No.
  • Lightweight: Yes.  Profiles directly support applying and displaying icons.
  • Middleweight: Yes*.  Not directly supported, although it can be accomplished if the implementer customizes display code.
  • Heavyweight: Yes*.  Not directly supported, although it can be accomplished if the implementer customizes display code.

Restrict or constrain existing types

  • Featherweight: No.
  • Lightweight: Yes.  Add additional OCL constraints.  Some OCL integration support.
  • Middleweight: Yes.  Add additional OCL constraints.  Some OCL integration support.
  • Heavyweight: Yes.  Add additional OCL constraints.  Some OCL integration support.

Extend existing types

  • Featherweight: No.
  • Lightweight: Yes.  Extend existing types through Stereotype Generalization.
  • Middleweight: Yes.  Extend existing types through generalizations.
  • Heavyweight: Yes.  Extend existing types through generalizations.

Add new types that do not extend an existing type

  • Featherweight: No.
  • Lightweight: No.  Stereotypes must be applied to some existing UML concept to come into existence.
  • Middleweight: Yes.  Create any type you want without generalization and use freely.
  • Heavyweight: Yes.  Create any type you want without generalization and use freely.

Remove existing type

  • Featherweight: No.
  • Lightweight: No*.  You are applying your newly created Stereotypes to the UML domain.  Use of “strict” application of profile.
  • Middleweight: No.  All of UML pulled in.  Your DSL extends all of UML.  Simply extend UML.metamodel.uml.
  • Heavyweight: Yes.  Your DSL extension would extend only the merged language units that you would be interested in.  Development is complicated because of this merge step.

Add new properties

  • Featherweight: No.
  • Lightweight: Yes.  Apply Profile, apply Stereotype that adds property.  Use getValue() and setValue() to get and set the property.
  • Middleweight: Yes.  Getters and setters for newly added properties are created automatically.
  • Heavyweight: Yes.  Getters and setters for newly added properties are created automatically.

Remove existing properties

  • Featherweight: No.
  • Lightweight: No.
  • Middleweight: No*.  Can only exclude entire types depending on granularity of Language Unit.
  • Heavyweight: No.

Restrict/constrain existing properties

  • Featherweight: No.
  • Lightweight: Yes.  OCL constraints.
  • Middleweight: Yes.  Directly by use of Sub-setting and Redefinition.  Indirectly by adding additional constraints, in which case code is generated.
  • Heavyweight: Yes.  Directly by use of Sub-setting and Redefinition.  Indirectly by adding additional constraints, in which case code is generated.

Add new operations

  • Featherweight: No.
  • Lightweight: No.  Not able to specify behavior in a profile currently.  In the future, you might be able to add behavior through OCL constraints.
  • Middleweight: Yes.  You can specify operations and behavior.
  • Heavyweight: Yes.  You can specify operations and behavior.

Remove existing operations

  • Featherweight: No.
  • Lightweight: No.
  • Middleweight: No*.
  • Heavyweight: Yes*.  Since the merged language unit code gets regenerated it is possible to completely customize.

Restrict/constrain existing operations

  • Featherweight: No.
  • Lightweight: Yes.  OCL constraints can be attached to operations.
  • Middleweight: Yes.
  • Heavyweight: Yes.

Reuse UML concepts

  • Featherweight: Yes.
  • Lightweight: Yes.
  • Middleweight: Yes.
  • Heavyweight: Yes.  Possible to customize exactly which concepts to reuse by merging your own Language Units.

Restrict multiplicity

  • Featherweight: No.
  • Lightweight: Yes.
  • Middleweight: Yes.
  • Heavyweight: Yes.

Remove existing constraints

  • Featherweight: No.
  • Lightweight: No.
  • Middleweight: Yes.
  • Heavyweight: Yes.

Add new constraints

  • Featherweight: No.
  • Lightweight: Yes.  Add OCL constraints.  Currently no runtime checks.
  • Middleweight: Yes.  Add operations that directly restrict operations.  Type safety at compile time.
  • Heavyweight: Yes.  Add operations that directly restrict operations.  Type safety at compile time.

First class extensibility

  • Featherweight: No.
  • Lightweight: No.
  • Middleweight: Yes.
  • Heavyweight: Yes.

Validation

  • Featherweight: No.
  • Lightweight: No.
  • Middleweight: Yes.  Custom validation stubs are automatically generated.
  • Heavyweight: Yes.  Custom validation stubs are automatically generated.

Programmatic usage

  • Featherweight: Easy.  Simply apply tag.
  • Lightweight: Awkward.  (Programmatic usage is covered in a future document.)
  • Middleweight: Easy.  Concepts are straightforward and familiar to developers.  (Programmatic usage is covered in a future document.)
  • Heavyweight: Easy.

Efficiency of running code

  • Featherweight: Very efficient.
  • Lightweight: Not optimal.
  • Middleweight: Very efficient.
  • Heavyweight: Very efficient.

Cost of development

  • Featherweight: Low.  Trivial development.
  • Lightweight: Medium.  Extension technique is well defined and documented.
  • Middleweight: High.  Developer must understand complex concepts such as redefinition, subsetting and derived unions.
  • Heavyweight: Highest.  Must merge language units, and developer must understand complex concepts such as redefinition, subsetting and derived unions.

Ability to evolve

  • Featherweight: Easy.
  • Lightweight: Easy.  Just modify profile and reapply.
  • Middleweight: Difficult.  Regenerate code.
  • Heavyweight: Difficult.  Regenerate code.

Deploy so end users can work with

  • Featherweight: N/A.
  • Lightweight: Easy.  Deploy profile; user must apply profile and apply stereotype.
  • Middleweight: Easy.  Deploy plugins.
  • Heavyweight: Easy.  Deploy plugins.

Dependency on UML implementation

  • Featherweight: No.
  • Lightweight: No*.  Unless the meta-type is removed altogether your extension will be valid.
  • Middleweight: Yes.  Generator model references UML.metamodel.uml, which causes dependency on a particular version of UML2.  Depends on uml2 implementation.
  • Heavyweight: No*.  You will be dependent on your merged model based on language units, but you control that.

 

Conclusion

Congratulations! You’ve made it this far, so you’ve gone through the thought process of selecting the extension technique that is right for you.  Hopefully enough information has been provided to let you decide on an approach with confidence.

We have explored some of the pros and cons of each type of extension and provided a summary of all the techniques.  How to create heavyweight extensions will be explored in more detail in future articles.

The main conclusion of this article is: The use of Middleweight extensions is discouraged and Lightweight extensions should be your first instinct.   Heavyweight extensions should be used in rare cases where much control is required.
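The overall guidance can be condensed into a small decision helper.  This is only one possible reading of the article's advice: the Technique enum, the TechniqueChooser class and its three boolean inputs are hypothetical simplifications, and a real decision will weigh many more of the criteria from the option summary.

```java
// Hypothetical summary of the techniques discussed; middleweight is deliberately
// never recommended because the article discourages it.
enum Technique { MOF_BASED, FEATHERWEIGHT, LIGHTWEIGHT, HEAVYWEIGHT }

class TechniqueChooser {
    // overlapsUml: the DSL shares many concepts with UML
    // needsBehaviorOrRedefinition: the DSL must add behavior, sub-setting or redefinition
    // onlyVisualTags: the DSL only needs to tag elements visually
    static Technique choose(boolean overlapsUml,
                            boolean needsBehaviorOrRedefinition,
                            boolean onlyVisualTags) {
        if (!overlapsUml) {
            return Technique.MOF_BASED;      // little overlap: define a language with MOF
        }
        if (needsBehaviorOrRedefinition) {
            return Technique.HEAVYWEIGHT;    // rare cases where much control is required
        }
        if (onlyVisualTags) {
            return Technique.FEATHERWEIGHT;  // keywords are enough
        }
        return Technique.LIGHTWEIGHT;        // profiles: the first instinct
    }
}
```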

For more information on UML2, visit the home page or join the newsgroup.

References

[1] K. Hussey. “Getting Started with UML2”. International Business Machines Corp., 2004, 2006.

[2] K. Hussey. “Introduction to UML2 Profiles”. International Business Machines Corp., 2004, 2006.

[3] UML Testing Profile. Version 1.0, formal/05-07-07. OMG., July 2005.

 

 

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.