Simplifying product line development using UCM streams

UML Software Engineering Organization

2008-08-29 Author: Jason Leonard Source: IBM

In this article:

Managing Change
Setup for Variant Development
Controlling Changes Across Multiple Variants
Example: Developing a GPS Tracking System
The Problem
The UCM Solution
Tracking the Work of Multiple Developers
Scaling Up
Example: GPS Tracking System with New Capability
Quality Assurance
Streams for Phased Integration
Streams for Efficiency
References
Notes

from The Rational Edge: Concentrating on the configuration management discipline, this article explains how the Unified Change Management (UCM) concept of streams can help organizations support multiple product family projects by reducing start-up timeframes, tracking changes, managing dependencies between product variants, and propagating changes between product variants.

Many organizations struggle with the complexity inherent in developing a product family. This complexity can be partially managed by maintaining a strong focus on system architecture and program management techniques.

In typical product line development, several variants of a product are under development simultaneously. This can happen for any -- or sometimes all -- of the following reasons:

Products with the same basic architecture are needed to provide different end-user services/functionality.
The same product needs to be deployed on different target platforms (hardware or software).
Older versions of the system need maintenance support.
Development timeframes for multiple versions overlap; for example, Version 2 might still be under development when development of Version 3 is slated to begin.

The simultaneous development of multiple variants presents a significant challenge to the configuration manager: There is a delicate balancing act between the need to share desirable changes across all variants and the need to keep firm control over the configuration of a given product.

In the modern software development environment, the scope, frequency, and volume of change, coupled with continually increasing system size and complexity, make change tracking extremely difficult, even for a single project. With variants, this task becomes even more difficult. One way to tackle it is by adopting the advanced configuration management practices detailed in Rational's Unified Change Management (UCM)¹ approach.

Based on studies of software development projects in a range of highly effective organizations, UCM was developed to help codify and automate change and configuration management best practices. It is implemented with the Rational® ClearCase® software configuration management tool. By applying the UCM concept of streams, organizations can efficiently develop variants, even in large-scale systems. Streams allow different project teams to work on a common set of artifacts independently, sharing changes only when it is desirable to do so.

Managing Change

In a software development environment, we can classify changes according to two levels of granularity:

Microlevel, day-to-day changes that individual developers make to a component or set of components (e.g., bug fixes or small enhancement requests).
Macrolevel changes that add significant new product capabilities to the system.

When planning development of a new product variant, it is convenient to think in terms of extending capabilities. We would specify a variant in terms of adding a new set of capabilities to the existing set of capabilities provided by a product framework. We may distinguish between the macrolevel and microlevel in a number of ways, but it is perhaps easiest to categorize macrolevel changes as those that are visible from a project management or end-user viewpoint: key features, performance improvements, and so on.

That said, management of microlevel changes is still very important. Customers hate nothing more than to see old defects find their way back into a release. You also want to be efficient about ensuring that bug fixes and small enhancements are propagated across all product variants. A common practice within more advanced software development environments is to create and maintain an architecture with a set of components that do not need modification from one product variant to another. This is usually realized by a layered architecture: Perhaps a "business" layer provides components that automate business logic common across each variant. Similarly, a "system" layer might provide a set of components that perform lower-level functionality (e.g., operating system interfaces, hardware interfaces, middleware interfaces, etc.) that is invariant from one project to another. Unfortunately, it is still possible -- and even likely -- that a component might need to be modified for some, or even every, variant in any of the layers. For instance, there might be subtle modifications in the business logic between projects, or a requirement to support new computing hardware. In this situation, it is much better to automate (or at least partially automate) the propagation of required changes to the relevant variant, than to have to make these changes manually.

So, in summary, the goals of change management for developing product variants are:

Specify variants in terms of adding new capabilities to an existing set of selected capabilities.
Manage change at the microlevel and macrolevel ("product capability").
Streamline the management of microlevel changes.

Let's explore how to use UCM (see definitions in the sidebar) by working through some scenarios.

Setup for Variant Development

Imagine that a development organization wants to port a system from a UNIX platform to MS-Windows, while continuing to support their UNIX-based customers. This is an example of a variant.

Project setup for variant development is a major hurdle for many software organizations. The first issue is often that they do not understand the need for robust configuration management practices to support variants. The organization might maintain multiple repositories (one per variant), each with slight differences -- and no one really understands what these differences are. Under these circumstances, the only way to propagate changes across variants is through "Copy and Paste." Changes to copied data then need to be maintained in each repository. But no one knows which repositories already have the change and which do not. Ensuring a high-quality product release will require a large amount of tedious and time-consuming inspection and adjustments.

A second issue is that, to start development of a variant with an existing set of artifacts, the team leader and/or architect must define a starting point for the development team and in many instances they choose component versions, rather than versions of individual files. This complicates project setup: It is not sufficient to track only component versions, because each variant might modify the same component. For example, specifying Version 5 of a file management component is meaningless if there is a Version 5 for the UNIX platform variant and a different Version 5 for the Windows platform variant.

Before starting work, development teams need to consider the following:

What is the quality of the initial component versions? Preferably, only well-tested and approved changes should be included in the foundation for the new project, so that the project team can avoid spending time on resolving broken builds. So, the team needs a way to quickly select just those component versions that have undergone appropriate levels of quality assurance.
What are the relationships between component versions? A component often has dependency relationships with several other components, so an incorrect set of component versions could cause integration problems. This means that a configuration management approach that focuses exclusively on components while neglecting their relationships is sub-optimal.

UCM Definitions ²

Activity: A unit of work performed by an individual. In UCM, an activity tracks a change set, that is, a list of versions of files created to perform the work.
Stream: A UCM construct that tracks activities. Work within a stream is always performed one activity at a time. Because streams track activities, and activities track file versions, at any given point in time a stream selects a single version of each file that corresponds to all work undertaken for the activities within that stream.
Baseline: Data that identifies the versions of the components, files, and directories worked on. Baselines provide a consistent foundation for a team to work from, including the versions of components, files, and directories that constitute a previous release. Baselines can be applied to streams to identify their configuration at a point in time.
Project: A UCM construct that groups configuration management information, such as which components will be worked on during the project. A project contains at least one stream.
View: A virtual working area showing artifacts (documents, code, directories, etc.). A view is associated with a stream, and the stream specifies the configuration of the view; that is, the stream selects the version of the artifacts the user will see.
Component: A nontrivial, nearly independent, and replaceable part of a system that fulfills a clear function in the context of a well-defined architecture³. In the context of configuration management, a component groups a set of related directory and file elements. Typically, elements that make up a component are developed, integrated, and released together.
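
The relationship between activities, streams, and baselines can be sketched as a toy model. This is only an illustration of the definitions above, not the ClearCase data model or API; the class names, file names, and integer version numbers are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Activity:
    """A unit of work; the change set maps each file to the versions created."""
    name: str
    change_set: dict[str, list[int]] = field(default_factory=dict)


@dataclass
class Stream:
    """Tracks activities on top of a foundation (file -> starting version)."""
    name: str
    foundation: dict[str, int]
    activities: list[Activity] = field(default_factory=list)

    def select(self) -> dict[str, int]:
        """Return the single version of each file that this stream selects."""
        config = dict(self.foundation)
        for activity in self.activities:
            for path, versions in activity.change_set.items():
                # Toy rule: the highest version number recorded so far wins.
                config[path] = max([config.get(path, 0), *versions])
        return config

    def make_baseline(self) -> dict[str, int]:
        """Freeze the stream's current configuration as a baseline."""
        return self.select()


# Hypothetical file names and version numbers.
stream = Stream("unix_variant", {"filemgr.c": 5, "netio.c": 2})
stream.activities.append(Activity("Fix path handling", {"filemgr.c": [6, 7]}))
baseline = stream.make_baseline()
print(baseline)
```

Because the baseline is a snapshot, later activities change what the stream selects without altering a baseline that has already been recorded.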

Large projects can spend an enormous amount of time determining the answers to these questions, so what is needed is a streamlined approach.

Using UCM streams represents a very efficient approach. Each stream specifies the right version of every file. This greatly reduces the need to track file or component versions, and the dependencies between these versions, because the stream represents the foundation for a project in terms of a single entity.

Of course, a developer might not want the latest changes made in the stream if they have not been tested. UCM streams have a baseline to record their configuration at a point in time, so the team lead can select not only a stream, but also a baseline within that stream, to provide a stable foundation for team members to work from. The stream concept represents a change in mindset for many experienced software configuration managers, most of whom are used to thinking of baselines as versions of components. But UCM extends the concept to improve efficiency: It defines a baseline for a stream, which in turn defines baselines for the components in that stream.

With this stream-based approach, larger projects can use UCM components -- rather than a myriad of small logical components -- to represent large subsystems.

Controlling Changes Across Multiple Variants

Now, let's imagine that we have a microlevel change (e.g., a bug fix or small enhancement) to make. If we are developing variants, there are three possibilities:

Case 1: The change is required by just one variant of our product; for example, the change might be specific to software running on the MS-Windows platform.

Case 2: The change is required by all variants; for example, the change might relate to how a calculation is performed, which should be platform independent.

Case 3: The change is required by selected variants; for example, a change might be required only for the variants created for Japanese and Chinese customers.

Case 1 is relatively easy to deal with. So is Case 2, provided that the project team has a good architect; the calculation could be compartmentalized into a single component that is shared among all variants. Case 3 requires selectively propagating the change to multiple variants, which is where UCM can really help out.

Let's review in detail how UCM assists teams in each of these situations.

Under the UCM model, the team would walk through the following steps:

Step 1: When starting on a project, each team member "joins" the project. This creates a view for the person to work in. The configuration of that view is based on a new stream that tracks the changes made by the team member.

Step 2: An activity is created (perhaps by the team leader) and assigned to the team member.

Step 3: This team member accepts the activity within the context of the stream and commences work. As the stream has preconfigured the working area (view), development can start immediately.

Step 4: Eventually, the change is completed. At this point, the team member could either take on other related work or baseline the stream to capture the state of the artifacts at that point in time.

Step 4a: In Case 1 (change required by just one variant), there might be integration required with others working on the project, but the work is otherwise finished.

Step 4b: In Case 2 or Case 3 (change required by some or all variants), we would also need to go through integration. This time, we could make use of the automation provided by the tool (Rational ClearCase) to deliver changes to multiple destinations (product variants).
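
Steps 4a and 4b amount to choosing delivery destinations from the set of variants that need the change. A minimal sketch of that decision follows; the function name and the variant names are hypothetical, and the code only classifies the change according to the three cases, leaving the delivery itself to the tool.

```python
def plan_delivery(required_by, all_variants):
    """Return (case number, sorted destination variant streams) for a change."""
    required_by, all_variants = set(required_by), set(all_variants)
    if not required_by or not required_by <= all_variants:
        raise ValueError("required_by must be a non-empty subset of all_variants")
    if len(required_by) == 1:
        case = 1  # change needed by just one variant
    elif required_by == all_variants:
        case = 2  # change needed by every variant
    else:
        case = 3  # change needed by selected variants only
    return case, sorted(required_by)


VARIANTS = {"unix", "windows", "japanese", "chinese"}  # hypothetical variant streams
print(plan_delivery({"windows"}, VARIANTS))
print(plan_delivery(VARIANTS, VARIANTS))
print(plan_delivery({"japanese", "chinese"}, VARIANTS))
```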

Example: Developing a GPS Tracking System

Step 4b clearly needs additional explanation; let's do this by walking through another example: the development of a GPS tracking system originally created for in-car use. Let's suppose that, although the original system is successful, it has not been widely adopted in the prestige car market. To break into this market, a number of new features are required (i.e., a variant), which in turn require changes to existing software components. In addition, we want to create variants for trucks and hand-held devices.

The Problem

For the hand-held device, the graphics software for the GPS tracking system needs to be modified to deal with the screens used in these devices. The team working on the prestige car project also needs these changes, as they want to adopt similar hardware to reduce sun glare.

The UCM Solution

The UCM solution to this problem is sketched out in Figure 1. To solve this problem, the team used multiple streams:

A stream for each variant.
A stream to group activities common to multiple variants.

If the team had made modifications required to support the screens directly within the hand-held variant stream (by checking out files from a workspace [view] associated with that stream), it would have been difficult to share these modifications without also sharing the activities specific to the hand-held variant. So instead, the team used a "Graphics Changes" stream to capture all the development changes and then "deliver"⁴ (to use more UCM terminology) these changes to each variant that needed them -- in this case, the hand-held and prestige car streams.

Figure 1: Using Streams to Manage Variants

Let's get down to the "nuts and bolts" of setting up this mechanism. First, the developer creates a new view to provide a workspace to modify files and directories, and to build and test the changes. He can either select the Graphics Changes stream or request that another stream be created. For the moment, let's assume he uses the Graphics Changes stream.

To modify a file (e.g., graphicsdriver.cpp), the developer must check it out from the workspace. Of course, he needs to check out the correct version. With UCM, he knows it is the correct version because he is working in a ClearCase view (i.e., a workspace), and every ClearCase view is associated with a stream, which automatically configures the view with the latest set of file and directory versions. In response to a check-out request, the stream selects the file version that reflects the baseline (tested and reviewed component versions) the stream is founded upon, plus changes to this baseline.

When the developer checks out a file, UCM forces him to specify the context for his modifications -- that is, the activity to which he has been assigned. Checking in a file creates a new version for that file, and the activity's change set is updated to record this new version. ClearCase also automatically updates the configuration of versions in the developer's view so that he will select this new version. In our example, the developer modifies graphicsdriver.cpp to support new graphics hardware, so the activity might be called "Activity 263: Support New Graphics Hardware model 654 from VeryGoodGraphicsHardware Corporation."

Once the developer makes all the changes, he (or the integrator, depending on the organization's size and preferences) can perform delivery as follows:

The developer (or integrator) selects the view with the changes, clicks a "Deliver" button, and selects the destination -- in this case the hand-held variant stream.
ClearCase finds the activities that have not been previously delivered.⁵ Then, using the change set for each of these activities, it determines the changes, at a file and directory level, that should be incorporated into the destination stream.
The developer/integrator may review the activities (e.g., Activity 263) and changes (e.g., versions 1, 2, and 3 created on the Graphics Changes stream of graphicsdriver.cpp) to be delivered, and then either proceed or perform further quality assurance procedures, such as peer reviews of these versions.
When the developer/integrator decides to proceed, the destination (hand-held) stream is updated to include all activities created or modified (such as Activity 263) within the Graphics Changes stream since the last delivery. At a file level, this means:
EITHER: ClearCase will update the hand-held variant stream so that a different version of graphicsdriver.cpp will be selected in views that are configured by this stream,

OR: If graphicsdriver.cpp has already been modified in the hand-held variant stream, ClearCase will merge the modifications made by the Graphics Changes stream with the modifications already present. Note that this merging procedure is quite robust for many file types (text, HTML, XML, etc.) but is problematic for binary files. In either case, the user may choose to be informed of the changes made and review the automated modifications before committing to them.
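
The delivery steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not how ClearCase implements "Deliver": the classes, the version labels such as "gfx/3", and the file screen.cpp are all hypothetical. It shows the two outcomes per file (the destination simply selects the new version, or a merge is needed) and that a repeated delivery with no new versions does nothing.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    name: str
    change_set: dict  # file -> versions created in the source stream, oldest first


@dataclass
class Stream:
    name: str
    selected: dict                                      # file -> version the stream selects
    locally_modified: set = field(default_factory=set)  # files also changed in this stream
    delivered: set = field(default_factory=set)         # (file, version) pairs already received


def deliver(activities, target):
    """Deliver undelivered versions to `target`; return file -> 'select' or 'merge'."""
    actions = {}
    for activity in activities:
        for path, versions in activity.change_set.items():
            new = [v for v in versions if (path, v) not in target.delivered]
            if not new:
                continue  # nothing new for this file since the last delivery
            if path in target.locally_modified:
                actions[path] = "merge"  # both streams changed the file
            else:
                actions[path] = "select"  # the stream simply selects the newest version
                target.selected[path] = new[-1]
            target.delivered.update((path, v) for v in new)
    return actions


activity_263 = Activity(
    "Activity 263",
    {"graphicsdriver.cpp": ["gfx/1", "gfx/2", "gfx/3"], "screen.cpp": ["gfx/1"]},
)
handheld = Stream(
    "handheld",
    {"graphicsdriver.cpp": "base/1", "screen.cpp": "handheld/1"},
    locally_modified={"screen.cpp"},
)
first = deliver([activity_263], handheld)
second = deliver([activity_263], handheld)
print(first, second)
```

Tracking delivered versions rather than delivered activity names lets the same activity be delivered again when work continues on it, which matches the redelivery behavior described in note 5.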

Tracking the Work of Multiple Developers

If the graphics changes required are quite significant, several developers might be assigned to the task. If all of these developers work on the same Graphics Changes stream, even with separate workspaces, issues can arise that impact productivity.

Let's suppose that Bob and Wendy are both assigned to the graphics work and have split the work between them: Bob does Activity 263, and Wendy does Activity 264: Improve Graphics Rendering Performance (Figure 2). Wendy determines that Activity 264 requires modifications to the graphicsrendering.cpp file, so she checks it out and makes modifications. Wendy, being the conscientious developer she is, decides to test these changes. To perform the test, she tries to build the software. Sadly for Wendy, Bob previously checked in his changes (to graphicsdriver.cpp) without testing or even compiling them. As Bob's and Wendy's views share the same configuration, when Bob checks in his changes to graphicsdriver.cpp, Wendy's view may be updated to select the new version. Because Wendy can test her system only if she compiles both graphicsrendering.cpp and graphicsdriver.cpp, her work could be halted by Bob's lack of attention to quality: graphicsdriver.cpp might not compile, or it might cause the system to fail immediately upon execution (Figure 3).

Figure 2: Initial Configuration Before Two Developers Commence Their Separate Activities

Figure 3: Configuration After Two Developers Check in Their Changes

To address this situation, Wendy and Bob could create individual streams, based on the same foundation as the Graphics Changes stream. Their individual changes would then be entirely isolated from each other, and Wendy's productivity would not be hampered by Bob's lack of attention to detail (Figure 4).

Figure 4: Two Developers Using Separate Streams
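
The difference between one shared stream and one stream per developer can be shown with a small sketch. It is a deliberately simplified model, assuming that a view shows exactly what its stream selects; the class, the stream names, and the version labels are hypothetical.

```python
class Stream:
    """Toy stream: a view attached to it sees whatever the stream selects."""

    def __init__(self, name, foundation):
        self.name = name
        self.versions = dict(foundation)  # file -> selected version

    def check_in(self, path, version):
        self.versions[path] = version

    def view(self):
        return dict(self.versions)


foundation = {"graphicsdriver.cpp": "base", "graphicsrendering.cpp": "base"}

# One shared stream: Bob's untested check-in shows up in Wendy's view.
shared = Stream("graphics_changes", foundation)
shared.check_in("graphicsdriver.cpp", "bob-untested")
shared.check_in("graphicsrendering.cpp", "wendy-1")
wendy_shared_view = shared.view()

# Separate streams on the same foundation: Wendy sees only her own change.
bob = Stream("bob_dev", foundation)
wendy = Stream("wendy_dev", foundation)
bob.check_in("graphicsdriver.cpp", "bob-untested")
wendy.check_in("graphicsrendering.cpp", "wendy-1")
wendy_isolated_view = wendy.view()

print(wendy_shared_view)
print(wendy_isolated_view)
```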

That covers the detail; let's take a quick step back and look again at the big picture. We have seen that the big advantages to using streams and UCM are that you can:

Quickly propagate a bug fix or enhancement to all product variants that need it.
Improve the productivity of individual developers.

However, we have been using simplified examples that don't touch upon some of the complexities of larger software projects. We'll discuss scaling up in the next section.

Scaling Up

If the projects working to create the product variants are large both in duration and staff size, the variants are likely to move farther and farther from one another and from their common base. This makes it more difficult to craft modifications that can be delivered to multiple variants automatically.

A solution for scaling up is to use streams in an additional role: to manage capabilities. You can control divergence between variants by building them out of more fundamental building blocks. Note that this is different from conventional configuration management practices, which build systems from baselined configuration items; streams use baselines to group activities. The UCM approach provides the rigor of component-centric baselines, but more readily provides for development of variants.

In larger projects, streams can be useful on three levels:

Capturing microlevel changes. Once they are reviewed and tested, these changes can be baselined.⁶
Capturing macrolevel functionality (i.e., capabilities). The microlevel streams start from a baseline within these capability streams and are delivered back to them. Again, after QA, the capability would be baselined.
Building up product variants via deliveries from capability streams. After QA, baselines of these variants would be delivered to the customer.
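
The three levels can be pictured as a delivery graph: microlevel streams deliver to capability streams, which in turn deliver to variant streams. The sketch below assumes such a graph is written down as a plain dictionary (the stream names are hypothetical) and computes which variants a change can eventually reach.

```python
def variants_reached(stream, delivers_to):
    """Return the sorted terminal streams (variants) a change in `stream` can reach."""
    seen, stack, variants = set(), [stream], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        targets = delivers_to.get(current, [])
        if not targets:
            variants.add(current)  # nothing downstream: this is a variant stream
        stack.extend(targets)
    return sorted(variants)


# Hypothetical delivery graph: microlevel -> capability -> variant.
DELIVERS_TO = {
    "fix_glare": ["graphics_capability"],
    "graphics_capability": ["handheld_variant", "prestige_car_variant"],
    "verbal_directions_capability": ["prestige_car_variant"],
}
print(variants_reached("fix_glare", DELIVERS_TO))
```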

Example: GPS Tracking System with New Capability

Let's continue to follow the GPS Tracking System example, this time assuming that we need to add an entirely new capability (macrolevel change) requirement for the prestige car market: giving verbal directions to the user.

The new capability is developed within the confines of a stream. This means that even though it was originally developed with only prestige cars in mind, it was developed in isolation; so it can be delivered to another stream -- say the stream used for the truck variant -- or another project.

Some questions arise when implementing such a macrolevel change:

What if this shared software component depends on other components? How can we make sure that these are also included?

ClearCase does not need to read or understand the contents of the files it stores to understand dependencies between the files. Recall that activities record change sets; that is, an activity remembers the name and version of each file created as a result of that activity.

Therefore, when the development activities that were carried out within the (prestige car) stream are delivered to the truck project, we get not only the verbal directions component, but also the correct configuration of all other required files. This greatly simplifies reuse compared with the alternative: tracking the correct baseline of each and every required file or component.

How do we know that the versions of these files are ready for use, and have been approved according to our quality procedures?

Activities do not just track modified files; they may also have associated workflow and data that will help ensure that quality procedures are met. For example, if the activity should be signed off before it is delivered to other streams, this can be enforced. This is where Rational ClearQuest comes in; basically, ClearQuest is a configurable workflow engine.

OK, but what if the truck project wanted to "tweak" this software component to fit their specific needs?

That would be a microlevel change. Again, the developers could create a stream to capture the changes in isolation so that they could be integrated with other changes for the truck project, and perhaps also delivered back to the prestige car project if the changes were useful to that team.

What if the verbal directions component(s) depended on other components already in use by the truck project?

We have four possibilities:
1. The truck project uses the same version of these components. No problem.
2. The truck project uses a more recent version of these components. The shared component (verbal directions) would have to be retested to ensure it is not broken by the more recent components it depends upon.
3. The truck project uses an older version of the components. The components in use by the truck project would be updated (automatically) to match those required by the verbal directions component. Retesting the truck project might be required, because other parts of the truck system might be affected adversely by these new component versions.
4. The truck project uses a different version of the components (i.e., it has also modified the components). This might result in automated file merges if the same files or directories were modified. If this happens, ClearCase will attempt to retain the modifications made by both projects. Human intervention is required and prompted for if conflicts occur, such as when both projects change exactly the same line of a text file.
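
The four possibilities correspond to how two versions relate in a component's version history. A minimal sketch follows, assuming the history is available as a simple child-to-parent mapping; the mapping, the version names, and the result labels are hypothetical and are not ClearCase output.

```python
def ancestors(version, parent_of):
    """Return every version that `version` descends from."""
    found = set()
    current = parent_of.get(version)
    while current is not None:
        found.add(current)
        current = parent_of.get(current)
    return found


def classify(required, in_use, parent_of):
    """Compare the version the shared component needs with the one the project uses."""
    if required == in_use:
        return "same"            # possibility 1: no problem
    if required in ancestors(in_use, parent_of):
        return "project_newer"   # possibility 2: retest the shared component
    if in_use in ancestors(required, parent_of):
        return "project_older"   # possibility 3: update the project, then retest it
    return "diverged"            # possibility 4: merge, resolving conflicts by hand


# Hypothetical history: v1 -> v2 -> v3, with truck1 branching from v2.
PARENT_OF = {"v1": None, "v2": "v1", "v3": "v2", "truck1": "v2"}
for in_use in ("v3", "v2", "v1", "truck1"):
    print(in_use, classify("v3", in_use, PARENT_OF))
```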

Reuse clearly remains a non-trivial task, but this scenario demonstrates that UCM streams make this goal achievable by allowing teams to focus on capabilities, not individual files or components.

Quality Assurance

All this talk of streams and automated deliveries from one stream to another is well and good, but it doesn't stop problems from creeping in through various sources, including:

A change that is delivered to integration before unit testing or other quality assurance procedures are completed.
A change that depends on other changes is delivered out of sequence and stalls integration.
A change that is delivered to the wrong stream; for example, the verbal directions capability might be delivered to the hand-held variant by mistake.

Another challenge that arises in larger projects is simply tracking all the changes that have been delivered and integrated. For example, how can we easily generate release notes detailing bugs fixed and added enhancements? UCM provides options to help out. For example, the deliver function can be scripted to enforce any given process, as in the following cases:

The project manager might need to move change requests into an "approved" state before the delivery can take place.
Change requests might have links to other dependent change requests; the deliver script could check that all dependencies have been delivered first.
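
The two checks above could be expressed as a small policy function that a deliver script calls before proceeding. The sketch below shows only the decision logic; the record layout, the state names, and the request identifiers are assumptions, and it does not use any ClearCase or ClearQuest API.

```python
def blocking_reasons(request_id, requests, delivered):
    """Return the reasons why this change request must not be delivered yet."""
    request = requests[request_id]
    reasons = []
    if request["state"] != "approved":
        reasons.append(f"{request_id} is in state '{request['state']}', not 'approved'")
    for dependency in request.get("depends_on", []):
        if dependency not in delivered:
            reasons.append(f"dependency {dependency} has not been delivered")
    return reasons


# Hypothetical change requests.
REQUESTS = {
    "CR-1": {"state": "approved", "depends_on": []},
    "CR-2": {"state": "submitted", "depends_on": ["CR-1", "CR-3"]},
    "CR-3": {"state": "approved", "depends_on": ["CR-1"]},
}
for reason in blocking_reasons("CR-2", REQUESTS, delivered={"CR-1"}):
    print(reason)
```

An empty list means the delivery may go ahead; returning the reasons rather than a plain yes or no gives the developer something to act on.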

Because UCM is activity focused, it automatically provides a list of change requests completed between baselines, so generation of release notes and baseline auditing can be partially automated. This automation can then, in turn, provide a project manager or release engineer with sufficient data to decide whether a given baseline should be approved for release.
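
As a rough illustration of how release notes fall out of activity tracking, the sketch below treats each baseline as the set of activity numbers it contains and lists what was added between two baselines. The data layout is an assumption, and activity 262 is a hypothetical entry added for the example.

```python
def release_notes(previous_baseline, current_baseline, headlines):
    """List the activities present in the current baseline but not the previous one."""
    completed = sorted(set(current_baseline) - set(previous_baseline))
    return [f"Activity {number}: {headlines[number]}" for number in completed]


HEADLINES = {
    262: "Correct map scaling",  # hypothetical activity
    263: "Support New Graphics Hardware model 654",
    264: "Improve Graphics Rendering Performance",
}
notes = release_notes({262}, {262, 263, 264}, HEADLINES)
for line in notes:
    print(line)
```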

Streams for Phased Integration

Some organizations use streams to implement phased integration, even if they are not developing variants. For example, a project might be divided into teams focusing on particular system areas, such as capabilities or logical subsystems. Individuals within a team deliver to a stream for "local" integration. When the team has integrated successfully, there is a second delivery to a system integration stream. This allows the project to employ different levels of formality: Small teams can have a low-ceremony configuration management process, promoting collaboration wherever possible but employing more rigor where it matters -- at a system integration level.

Streams for Efficiency

Using UCM streams to capture product capabilities is a logical and efficient way to manage the complexities inherent in developing product variants in both large- and small-scale projects. Using streams, the configuration manager can minimize project startup and integration time by automating delivery of a correct and consistent configuration of the versions of components that provide these product capabilities. This alleviates the need to manually track the dependencies of individual components and keeps the project running smoothly and efficiently.

References

Brian White and Geoffrey Clemm, Software Configuration Management Strategies and Rational ClearCase: A Practical Introduction. Addison Wesley, 2000.

Notes

¹ As implemented in Rational ClearCase v2002.

² I've simplified definitions to fit the context of this article. The Glossary for Rational ClearCase provides more formal definitions.

³ RUP v2002 definition.

⁴ "Deliver" is an automated function in the ClearCase implementation of UCM.

⁵ The Deliver operation may also redeliver activities. An activity might be delivered more than once if work continues on the activity after the initial delivery. At a lower level, this means that extra file or directory versions have been created since the initial delivery.

⁶ This is true in UCM as implemented in ClearCase v2002. Previous versions of UCM were more restrictive with respect to where baselines could be applied.