Pruning the Tree of Trees: The Evaluation of Notations for Domain Modeling

Mark Simos and Alan F. Blackwell

To appear in the Proceedings of the 10th Workshop of the Psychology of Programming Interest Group

Introduction

Green's Cognitive Dimensions of Notations can be applied not only to programming environments (as described by Green and Petre (1996)), but also to a broader range of notational classes. It is not yet clear how far from the programming domain an activity can be removed before the cognitive dimensions become irrelevant, but there are many interesting notation systems at the boundaries of software design.

One such boundary is the field of domain analysis, which provides methods for describing the context of application for a set of reusable software components. Organization Domain Modeling (ODM) is a domain analysis method that was developed in full awareness of its potential for application well beyond the software engineering context. This paper describes some key features of ODM which have significant implications for notations and visual interface requirements. It then describes some intuitions regarding elements of suitable notations for ODM tasks such as modeling commonality and variability and accessing classified collections. We explore the degree to which these intuitions can be evaluated using Cognitive Dimensions, although the framework for this analysis seems potentially quite complex compared to most software user interfaces. We close with some speculations regarding the nature of this complexity, and how cognitive dimensions might be adapted to explain even ontologically complex intuitions regarding the nature of knowledge representation.

Background

Systematic Reuse and Domain Engineering.

The software reuse community has spent the past two decades or so puzzling over the technical and non-technical challenges involved in helping software engineering organizations learn the "art of not reinventing the wheel" in application development. Systematic software reuse (SSR) is a catch-phrase used in this community to denote a set of technical and organizational processes and disciplines that result in a pervasive "reuse culture" within software development. It is usually contrasted to ad hoc reuse, the informal state of practice in which software work products (implementations, designs, etc.) created in one specific system context are adapted and applied in new contexts (often with less than successful results). In systematic reuse, software components (or, more generally, assets) are specifically designed for "multi-use", i.e., reuse within a given multi-system scope or domain. The process of developing reusable assets (i.e., "development for reuse") is generally termed domain engineering and involves both development and cognitive tasks quite different in nature from conventional application development.

Domain engineering is only one component of the overall set of processes required for a sustainable systematic reuse culture. Application development with reusable assets (i.e., "development with reuse") also changes the typical development life cycle, as it includes the steps of systematically considering the reuse of available assets, generating requirements for new assets, and capturing experience to fold into new assets. In addition, an "asset brokerage" role is required to mediate between asset creators and asset utilizers. This overall process model for reuse has been documented extensively in (STARS 1993).

A domain-specific focus is critical to the success of this overall organizational model. Assets are aggregated in collections or asset bases organized around specific domains. In the context of systematic reuse, domains can cross-cut specific systems or projects in complex ways: domains are characterized by analysis of multiple systems, while a single system will generally encompass the scope of concern of multiple domains (including non-structural aspects such as security requirements). Domain engineering is generally viewed as a key enabling technology for systematic reuse.

ODM

Synquiry Technologies, Ltd. has been at the forefront of research work in the area of software reuse and domain engineering. In particular, we have developed a comprehensive and systematic methodology and process model for domain engineering called Organization Domain Modeling (ODM). ODM has been described in detail elsewhere (Simos et al. 1996). In this paper we will focus on certain challenges in the representation and notation of conceptual information presented by the ODM approach to domain engineering.

In the past decade or more, a number of both informal and formal methodologies for domain engineering have been developed (Prieto-Diaz & Arango 1991). ODM differs significantly from these other approaches, most of which intermix process aspects addressing distinct domain engineering concerns with binding commitments to specific system modeling representations or methods. In ODM, the domain engineering life cycle is separate from and orthogonal to the system engineering life cycle, and there is also a clear distinction between the system models that describe the behavior of software systems within the domain and the domain models that describe the overall semantics of the domain itself. Achieving this separation in practice involves complex issues in notation, representation, and interface design that have, to date, been significant barriers to successful adoption of domain engineering techniques within the software industry.

Before discussing a proposed framework for evaluating domain modeling notations and representations, it will be helpful to clarify some of ODM’s distinctive features. The following concepts are central to the ODM approach:

Stakeholder context: domains are not defined in the same way as individual systems. Establishing criteria for selecting, bounding and scoping domains requires careful analysis of the organizational or stakeholder context in which the domain modeling activity is being performed. This stakeholder analysis and contextual grounding are a formal part of the ODM process.
Exemplars: domain modeling in ODM involves comparative analysis of specific example systems in the domain, or exemplars. Working with an explicit exemplar set provides an extensional definition for the domain and helps control the process.
Commonality and Variability: in ODM, domains are characterized by a set of multiple exemplars, and the defining criterion of a domain model is a scoped model of commonality and variability across such exemplars.
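
As a rough illustration of the commonality and variability concept, the following minimal sketch reduces each exemplar to a flat set of feature names and computes what is shared and what varies. The exemplar names, the features, and the functions commonality and variability are all hypothetical; ODM itself does not prescribe this representation, and real domain models are considerably richer than flat sets.

```python
# Hypothetical exemplars, each reduced to a flat set of feature names.
exemplars = {
    "editor_a": {"open_file", "save_file", "spell_check"},
    "editor_b": {"open_file", "save_file", "macros"},
    "editor_c": {"open_file", "save_file", "spell_check", "macros"},
}


def commonality(exemplars):
    """Features shared by every exemplar in the set."""
    return set.intersection(*exemplars.values())


def variability(exemplars):
    """Map each non-common feature to the exemplars that exhibit it."""
    common = commonality(exemplars)
    varying = set.union(*exemplars.values()) - common
    return {
        feature: sorted(
            name for name, feats in exemplars.items() if feature in feats
        )
        for feature in sorted(varying)
    }


print(sorted(commonality(exemplars)))
print(variability(exemplars))
```

Keeping the link from each varying feature back to the exemplars that exhibit it reflects the role of the explicit exemplar set as an extensional definition of the domain.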

The nature of this cognitive task – discovering and modeling commonality and variability across a set of exemplars, representing a contextually bounded domain – makes domain modeling a potentially broadly applicable discipline. In fact, although historically ODM was developed in the context of software reuse and includes many details specifically relevant to software domains, the heart of ODM is a general modeling discipline which could be applied to any knowledge domain.

This discipline relies, in turn, on basic conceptual modeling capabilities which also have relevance outside a strict domain engineering process context. As a simple, non-software example, imagine a detective trying to organize facts and theories about a particular case. The detective might want to track various motives and opportunities of different suspects, who knew each other in the past, etc. This knowledge about the specifics of the case (the "case model" so to speak) could involve interplay of multiple domain models, e.g., the domain of "the range of potential motives recognized by law enforcement officers" or "ballistics knowledge." Both domain modeling and "case modeling" require certain conceptual manipulation capabilities which could be amenable to automated support. Synquiry’s overall technology development mission is in creating supporting infrastructure for a new paradigm of application construction based on this type of modeling.

Requirements for Representations

An ODM-based domain modeling discipline creates a number of distinct notational problems and particular criteria for adequate notations and representations. Some of these are inherent in the cognitive task of modeling commonality and variability: the domain model creates a level of abstraction or reflection in relation to the exemplars. We need visibility into the structure of the exemplars themselves in order to elicit or discover salient features of domain semantics. We also need ways of manipulating the domain model constructs directly as a design activity. Furthermore, our vision is that domain modeling is inherently collaborative (hence our company name Synquiry, suggesting "synergistic inquiry"). Domain models represent the explicit or (taking a social construction of knowledge viewpoint) at least tacit collaboration of multiple domain practitioners (and other stakeholders).

This creates many subtle and complex issues for notations and visual interface design. The task is similar enough to programming in some respects that we believe insights from the psychology of programming, diagrammatic reasoning, and visual programming research communities will be essential points of reference. At the same time, because the cognitive tasks to be supported are novel applications of computer-supported work, some interesting new interface issues may surface.

Interface issues are compounded when the domain of interest is, in fact, a software-specific domain. In these cases, the exemplars or artifacts being analyzed are themselves system descriptions (or fragments thereof). Thus representational issues arise concerning the relationship between system and domain level descriptions. In particular, because system notations (e.g., OO notations such as UML) provide constructs for describing variability within a single system, there are both advantages and risks in "overloading" such a notation used at a meta-level to describe variability among systems. The problem is not just shifting a level of abstraction, but the potential similarity of notation at the two levels. Many difficulties that have plagued both would-be domain modelers and domain analysis methodologists themselves center on this potential confusion between domain and system level descriptions.

This was a key insight that emerged from a working group on Representing Commonality and Variability in Domain Modeling held at WISR7, the Seventh Annual Workshop on Institutionalizing Software Reuse, in 1995 (Simos 1995). In this working group, we distinguished approaches to domain modeling in terms of their relative separation of system model from domain model. In some cases, a single system model was treated as, effectively, a "prototype" for a domain model. In other cases, a system model stripped of all features that varied across exemplars would be used to capture the "commonality only" view. Other approaches used formal representations (e.g., semantic networks, faceted schemes) for the domain description with varying levels of linkage to relevant portions of system artifacts.

It became clear during this workshop that a systematic rather than ad hoc evaluative framework for the notations being discussed would be of great value. The need for such a framework has become more critical at Synquiry, where we are investigating ways of using computer technology to help people in conceptual and qualitative reasoning tasks such as domain modeling.

The problem is not one of simply finding the "right" interface style for domain modeling as a whole. Different cognitive tasks throughout the domain modeling life cycle impose very different interface requirements. As a simple example, in constructing a domain model the user might be an expert, very familiar with the domain of focus. In using a domain model as a navigable classification scheme for an asset base, the user might be a novice attempting to gain a better understanding of the domain. Would the same interface be suitable for these two tasks?

In addition, we need to better understand the possible influence of the specific domains being modeled on the qualitative impact of the interface and representations used. Particularly if the notation and representation support is intended to help us discover salient features and points of comparison between exemplars, then we will want the ability to see both the detailed structure of the exemplars (e.g., system artifacts) and the comparative model. This means that the specific characteristics of notations used for the domain content could interact in a variety of ways with the domain-level interface.

An Evaluation Matrix for Domain Notation Intuitions

This requirement for an evaluation framework can potentially be addressed in terms of Green's Cognitive Dimensions (Green & Petre 1996). Such an analysis constitutes a significant challenge for the cognitive dimensions, however, because of the range of activities that must be supported by notations in a domain modeling context. We propose that the issues involved in this analysis be envisaged as a multidimensional matrix along four axes, adding Tasks, Interface Strategies and Domains to the cognitive dimensions.

Simos et al. (1996) have made a systematic description of the "Tasks" that are involved in domain modeling. The cognitive implications of these tasks are often implicit in their descriptions, and it is clear that notational support is necessary for many of them. As described below, domain modeling experts have many intuitions regarding appropriate elements of notations for representing and manipulating domain models. These elements can be described as intuitive "Interface Strategies". Although ODM defines an overall framework for the tasks that will be involved in domain modeling, there are also modeling activities that will be specific to particular domains, so it is necessary to recognize a domain-specific axis of variability in the evaluation matrix.

The cognitive dimensions have been described elsewhere (Green & Petre 1996). In the current paper, we simply propose that the relationship between notational strategies and each of the ODM tasks (in a domain context) be analyzed fully in terms of performance aspects measured by the dimensions. (Of course, we realize that the cognitive dimensions in their own right represent a 13-dimensional analysis space. The four axes of this evaluative matrix are in fact meta-dimensions describing the set of multidimensional design spaces in which notations for domain modeling can be discussed.)
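
One way to picture the proposed matrix is as a sparse table keyed by the four axes. The sketch below is only an assumption about how such a matrix might be recorded; the class names Cell and EvaluationMatrix are hypothetical. The strategy, dimension, and note entries paraphrase the workshop observations reported later in this paper, while the task label and the "unspecified" domain are placeholders of our own.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One coordinate in the four-axis evaluation matrix."""
    task: str
    strategy: str
    domain: str
    dimension: str


class EvaluationMatrix:
    """Sparse four-axis matrix of qualitative assessments."""

    def __init__(self):
        self._cells = {}

    def record(self, task, strategy, domain, dimension, note):
        # Later notes for the same coordinate overwrite earlier ones.
        self._cells[Cell(task, strategy, domain, dimension)] = note

    def by_strategy(self, strategy):
        """All recorded assessments for one interface strategy."""
        return {c: n for c, n in self._cells.items() if c.strategy == strategy}


matrix = EvaluationMatrix()
# Task and domain values are placeholders, not taken from the workshop report.
matrix.record("modeling commonality and variability", "hypercube",
              "unspecified", "hard mental operations",
              "too difficult to visualize")
matrix.record("modeling commonality and variability", "2-D matrix",
              "unspecified", "error-proneness",
              "redundant labels, many dead cells")

for cell, note in matrix.by_strategy("hypercube").items():
    print(cell.dimension, "->", note)
```

A sparse mapping seems a natural fit here, since most combinations of task, strategy, domain, and dimension will never be assessed.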

Constituent Tasks in Domain Modeling

To give a flavour of the broad categories of task encompassed by ODM, we must first note that each phase of the domain modeling life cycle raises its own notational challenges. The notations that are selected must support several different phases of cognitive activity:

the initial selection, examination and reflection on the exemplars themselves;
the definition or discovery of the salient concepts and features for the domain, and the linkage of these as appropriate to the exemplar set;
the direct manipulation of the resulting domain model itself as an artifact of analysis and collaborative design;
the use of the domain model in creating assets specifically designed for reuse within the domain;
and the use of some classification scheme derived at least in part from the domain model in accessing assets and integrating them into new systems.

This last task has already been investigated by Green, Gilmore et al. (1992), who described the problems of locating and comprehending code for reuse, based on a cognitive dimensions analysis of support for opportunistic design. They proposed that a "description level" be attached to program code, in which arbitrary attributes and relationships can be recorded in a browsable form. They noted that programmers often avoid comprehension of the code they use, and proposed that expert knowledge be represented explicitly in order to assist exploratory design. This knowledge must be encoded in a way that assists users in locating code (for example through faceted classification schemes, multiple views, and construction of analogical links between semantic descriptions). That encoding must, however, minimize viscosity and premature commitment during the creation and modification of the knowledge representation.
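
To make the idea of locating code through a faceted classification scheme concrete, here is a minimal sketch. The asset names, the facet names, and the function find_assets are hypothetical, and this is not the browser design of Green, Gilmore et al. (1992); it only illustrates attributes recorded alongside code rather than inside it.

```python
# Hypothetical asset base: each asset is described by facet -> value pairs.
assets = {
    "quick_sort": {"function": "sort", "object": "array", "medium": "memory"},
    "merge_sort": {"function": "sort", "object": "file", "medium": "disk"},
    "bin_search": {"function": "search", "object": "array", "medium": "memory"},
}


def find_assets(assets, **wanted):
    """Return names of assets matching every requested facet value."""
    return sorted(
        name
        for name, facets in assets.items()
        if all(facets.get(facet) == value for facet, value in wanted.items())
    )


print(find_assets(assets, function="sort"))
print(find_assets(assets, object="array", medium="memory"))
```

Because each description is an open mapping, a new facet can be added to one asset without restructuring the others, which is one plausible way to keep viscosity and premature commitment low.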

Intuitive Notational Design Strategies

There is no emerging set of universally recognized notations for domain analysis (as there is, for example, in the case of object oriented design). Nevertheless, both domain analysis methodologists and practitioners have strong intuitions regarding potentially suitable notations for domain analysis, as revealed in the WISR workshop cited earlier (Simos 1995). The existence of such intuitions should come as no surprise, in the light of studies by Petre and Blackwell (1997) and by Whitley and Blackwell (1997) showing that software designers and programmers have similarly strong intuitions regarding visual representations of their domains of interest.

In the remainder of this paper, we discuss the ways in which such intuitions can be evaluated for suitability as notational tools, in the light of the structures of the ODM method, as well as the critical apparatus of the Cognitive Dimensions. To give a flavour of these interface design intuitions, consider the following examples, gleaned from the WISR workshop and from other informal sources:

Hypercube: in the words of a workshop participant, "We tried to represent some of the features as a 'hypercube' or matrixed representation. We quickly abandoned this effort as too difficult to visualize." In terms of the cognitive dimensions, this might be considered an example of a notation that requires unreasonably hard mental operations.

2-D Matrix Representation: "Next we tried to compress multiple factors into a two-dimensional matrix with hierarchically arranged rows and columns. This led to a good deal of confusion because of redundancy in the labels chosen and a large number of dead or non-applicable cells." This notation would appear to be error-prone, as well as having poor closeness of mapping to the task.
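
The problem of dead cells can be illustrated with a small sketch. The hierarchical row and column labels and the applicability set below are entirely hypothetical and are not taken from the workshop report; the point is only that crossing two hierarchies generates every combination, whether or not it is meaningful.

```python
from itertools import product

# Hypothetical hierarchical row and column labels, as (outer, inner) pairs.
rows = [("input", "keyboard"), ("input", "mouse"), ("output", "screen")]
cols = [("text", "plain"), ("text", "rich"), ("graphics", "vector")]

# Hypothetical applicability: any cell missing from this set is "dead".
applicable = {
    (("input", "keyboard"), ("text", "plain")),
    (("input", "keyboard"), ("text", "rich")),
    (("input", "mouse"), ("graphics", "vector")),
    (("output", "screen"), ("text", "plain")),
}

# A full matrix forces a cell for every row/column combination.
cells = list(product(rows, cols))
dead = [cell for cell in cells if cell not in applicable]
print(f"{len(dead)} of {len(cells)} cells are dead")
```

Note also that the outer labels repeat across rows and columns, which is one way the label redundancy mentioned by the participant can arise.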

Side by Side Views and Thumbnail Views: These presentations of different aspects of the domain model try to address both the cognitive dimensions of visibility and diffuseness, with the expected trade-offs being found between them.

Stacking Views: stacking occludes some part of each view, which has a negative impact on the cognitive dimension of visibility. We might also gain reinforcement of analogical relations resulting from overlap in space; this could enhance the cognitive dimension of role-expressiveness.

Toggling Views (alternating, or cycling through multiple states): When displays of multiple images come fast enough, we gain the illusion of movement. Differences between analogous but distinct situations can therefore be represented in some of the same ways as differences resulting from change over time. This is related to the "blink comparators" used in astronomy to help one see what is the same and what is different between pictures. These motion perception strategies may provide another kind of visibility, but could also be considered to increase closeness of mapping between the representation and the comparison task.

The Tree Of Trees as an Interface Strategy

This section provides a more extended treatment of an example interface strategy that could aid in the classification of a set of exemplars within a given domain. Within the domain modeling life cycle, it could be used in either creating or accessing domain models. The strategy is of interest in part because it derives the metaphor for an interface specifically from the domain of focus for the collection to be accessed. We tentatively call this interface strategy (or feature of an interface) congruence.

As an example, suppose you were looking at biological taxonomies for trees and shrubs. Suppose this taxonomy itself were depicted in a manner suggestive of the structure of a tree or shrub. In effect, you would be looking at a "tree of trees." What conceptual tasks would be aided by such a device (if any), and which hindered? To clarify the generality of the proposed feature (given that trees are so closely related to classification itself): if you were looking at a butterfly collection the presentation would be structured in some manner analogous to a butterfly (e.g., two wings, etc.); similarly for clouds, cars, etc.

This strategy is discussed here primarily as a "test case" for the evaluation matrix proposed earlier. The strategy is intuitively appealing (at least to its originator!) in part because of the "lure of meta-ness" to which so many programmers are prone. It thus has interesting implications for at least one of the cited challenges in domain modeling interfaces, that is, negotiating the shift between levels of abstraction. It also reflects the philosophical speculation that conceptual information about various domains might have a natural tendency to fit well within domain-specific constructs as organizing structures. Nevertheless, one could immediately start to build a critique of this proposal in terms of cognitive dimensions. However, it is interesting to note that the cognitive dimensions do not address the most immediately intuitive arguments either for or against this idea, which are addressed below:

Metaphor, for what purpose?

Firstly, has the tree metaphor actually provided anything of benefit? One of us has proposed that physical metaphors help to understand the "virtual machine" behind diagrams (Blackwell 1996). But unpublished experimental work carried out by Blackwell since then has shown that although metaphors seem to facilitate memory for notational conventions, they give little assistance in complex problem-solving. What is more, the metaphor can interfere with comprehension of the domain. There is evidence that apt metaphors may require increased dissimilarity rather than similarity between source and target domains (Tourangeau & Sternberg 1982).

Meta, for what purpose?

Secondly, we might consider why the tree of trees idea seemed like an intuitively promising candidate to its proposer in the first place. One factor is that self-referential systems are very popular amongst computer scientists. Consider the fate of the many AI projects that disappeared into their own internals as a result of speculations on the ontology of ontologies, or on how to carry out planning and problem solving using test cases from the domain of planning and problem solving programme design. Even children enjoy self-reference, as do adult philosophers (consider Epimenides' paradox, or Hofstadter's sentence "this sentence no verb"). Perhaps people just enjoy the frisson of potential infinite regress, especially if they hold computational theories of mind: it is scary to think about things that might leave lesser brains than one's own stuck in a loop.

There may be a more reasonable, if subtle, justification for the appeal of the meta-level in the tree of trees. The cognitive dimensions include the concept of an abstraction gradient whose slope can be either an obstacle to use of a notation (if too steep, and hence difficult to use) or a limit to its effectiveness (if too shallow, and hence unwieldy). What if the intuitive geometries of naive or qualitative physics (Kuipers 1986) applied in abstraction space? If a notation describes itself, this would create a second order rather than a first order abstraction function, and this second order gradient should (in terms of this hypothetical naive physics of abstraction space) have an exponential shape, so that it is both shallow for the newcomer and powerfully steep for the meta-notation expert.

Conclusions

Domain modeling is an extremely challenging intellectual activity that seems to attempt the most difficult aspects of the software production process, without the fixed constraints on abstraction provided by software engineering and programming language conventions. Given this situation, it would appear that domain modeling has much to gain from the use of diagrams, for the same reason that Stenning and Oberlander (1995) recommend the use of graphical representations: they are Limited Abstraction Representational Systems. Domain modeling methodologists and practitioners, moreover, apply their established techniques for the meta-analysis of any body of knowledge when designing notations. This awareness of analytical frameworks makes them very receptive to systems of evaluation such as Green's Cognitive Dimensions. In this paper we have described an evaluative matrix for measuring the suitability of notational intuitions in terms of the cognitive dimensions, but have also shown that proposed notations may fail on intuitive grounds no different from those by which they are proposed. Surveying the boundary between the two approaches returns easily to theories of abstract cognition, but with a possible explanation in terms of the Cognitive Dimensions.

Acknowledgments

Alan Blackwell’s research is funded by a collaborative studentship from the Medical Research Council and Hitachi Europe Ltd. He is grateful to the Advanced Software Centre of Hitachi Europe for their support.

Mark Simos's research is funded in part by a Synquiry Technologies, Ltd. grant from the National Institute of Standards and Technology (NIST) Advanced Technology Program (ATP) in the focus area of Component-Based Software.

References

Blackwell, A.F. (1996). Metaphor or analogy: How should we see programming abstractions? In P. Vanneste, K. Bertels, B. De Decker & J.-M. Jaques (Eds.), Proceedings of the 8th Annual Workshop of the Psychology of Programming Interest Group, pp. 105-113.

Green, T.R.G. & Petre, M. (1996). Usability analysis of visual programming environments: a 'cognitive dimensions' approach. Journal of Visual Languages and Computing 7, 131-174.

Green, T.R.G., Gilmore, D.J., Blumenthal, B.B., Davies, S. & Winder, R. (1992). Towards a cognitive browser for OOPS. International Journal of Human-Computer Interaction 4(1), 1-34.

Kuipers, B.J. (1986). Qualitative simulation. Artificial Intelligence 29(3), 289-338.

Petre, M. & Blackwell, A.F. (1997). A glimpse of expert programmers' mental imagery. In S. Wiedenbeck & J. Scholtz (Eds.), Proceedings of the 7th Workshop on Empirical Studies of Programmers, pp. 109-123.

Prieto-Diaz, R. & Arango, G. (1991). Domain Analysis and Software Systems Modeling. IEEE Computer Society Press.

Simos, M. (1995). Domain Model Representations Strategies: Towards a Comparative Framework. Working Group Report, Seventh Annual Workshop on Institutionalizing Software Reuse. Available at: http://www.umcs.maine.edu/~ftp/wisr/wisr7/dawg-nps/dawg-nps.html

Simos, M., Creps, R., Klingler, C., Levine, L. & Allemang, D. (1996). Organization Domain Modeling (ODM) Guidebook, Version 2.0. STARS Technical Report STARS-VC-A025/001/00, Lockheed Martin Tactical Defense Systems, Manassas VA, June 1996.

STARS (1993). STARS Conceptual Framework for Reuse Processes. Unisys STARS Technical Report. STARS-VC-A018/001.

Stenning, K. & Oberlander, J. (1995). A cognitive theory of graphical and linguistic reasoning: logic and implementation. Cognitive Science 19, 97-140.

Tourangeau, R. & Sternberg, R.J. (1982). Understanding and appreciating metaphors. Cognition 11(3), 203-244.

Whitley, K.N. & Blackwell, A.F. (1997). Visual programming: the outlook from academia and industry. In S. Wiedenbeck & J. Scholtz (Eds.), Proceedings of the 7th Workshop on Empirical Studies of Programmers, pp. 180-208.
