Human Factors in Visualizing Complex Datasets

 

M.M.Taylor
369 Castlefield Avenue
Toronto, Ontario,
Canada M3M 3B9
mmt@mmtaylor.net


Introduction

It is often written in human factors literature that one of the problems of electronic display is clutter. There is too much on the screen. It is said that screens full of information confuse the viewer, and should be avoided wherever possible. But humans and other living creatures have evolved in a complex and dangerous world. At almost every moment we are confronted with myriads of fluctuating patches of colour that we see as things, such as people, trees, books. Moreover, we see the relations among these things as situations that represent opportunity or danger, or that we can safely and unconsciously ignore. We see these things; we do not consciously interpret them. Our objective for electronic display should be to allow the user to see the situations that are important or useful at the moment, not to require the user to work them out.

We do not usually worry about display clutter in the everyday world, a world with much more potential for display complexity than can be achieved on any display screen. Why, then, is clutter a problem in electronic display? Why do we have difficulty designing displays that allow people to see what is going on in complex datasets? What principles should we follow in trying to improve the electronic presentation of data? I believe that a large part of the problem is that experimental research has concentrated on what subjects consciously perceive and report, which is necessarily focal material and therefore subject to problems of clutter. The rest of the problem has several parts: we tend to display items that demand attention, we tend to display things symbolically, and displays are often generated with little regard for the ability of the user to discriminate among colours, shapes, and patterns. This last aspect is frequently treated in cookbooks of human factors, and I shall not worry about it in this note.

The problem of how to deal with complex and changing data is not new; it is merely manifest in a new guise, in an electronic world unavailable to our eyes, ears and other sensors that have evolved to handle complex and changing data. Our sensors have evolved to help us deal with a world with which we interact in real time, not with a world that we must interpret for the archives in such a way that it may later be reconstituted. We influence the world in which we have evolved while it influences us, and we try to influence it in ways favorable to us. We have not evolved to be passive onlookers. So the question of how to deal with the electronic display of complex data sets can be addressed from two angles:

How has evolution resolved the problem in living organisms, and do the answers apply to the electronic world?

How can we create the kinds of possibilities for humans to interact with the electronic world that they have when they interact with the tangible world?

To answer the first question requires an examination of what is required of a living organism that is to survive and reproduce; all the ancestors of every organism now alive, whether it be a bacterium, a tree, or a person, have passed that test. To answer the second question requires first an examination of the perceptual, cognitive, and motor capabilities of humans, and only later a consideration of the technologies that might take advantage of those capabilities. It is a mistake to consider the available technology first; technologies change, the abilities of people do not.

The problem of making sense of large, complex, and dynamically changing data cannot be divorced from the use that is to be made of the data. What does the user want to see in the data, and why? For any non-trivial set of data, there may be as many uses as there are users and occasions. But no matter what the use, it is not data that the human wants; it is information: an understanding of some aspect of a complex world that the user may want to affect. Is the user concerned with making money from the complex changes in international currency trading rates, with monitoring the financial activities of international terrorists to determine their possible intentions, with evaluating appropriate strategies for introducing a new product to a marketplace, or with predicting the possible effects of proposed taxation policies? The same data sets might be valuable for each purpose, but what the user needs to see will be quite different in each case.

 

Existing in a complex world: modes of perception

We can learn something about the use of complex data by examining how evolution has addressed the same problem in the design of living organisms, specifically, mobile animals such as ourselves. By looking at the general nature of these solutions, we may see ways we can ourselves address the new problem of the electronic world.

All life exists in a turbulent world that buffets it and would destroy even its molecular integrity if not resisted. Animals are able to move out of the way of danger, to acquire resources, and to act on their environment so as to stabilize it, such as by building defensive homes (whether they be shells or castles) or by applying counterforce to disable an antagonist or to compensate for a wind gust that might push a car off a road. To apply counterforce, they require sensor and analysis (perceiving) systems that allow them to assess the state of the world, and they require some kind of systems (such as muscles) that allow them to act on their world.

Nothing in the world can be acted upon unless it is perceived, but always an animal can act on less of the world than it can perceive. The world is far more complex than any animal's musculature can deal with all at once. Sensor systems are less costly in resources than are action systems. This last statement is as true for commercial or military organisations as it is for individual animals. It is the fundamental truth that determines the shape of effective information systems. Neither the animal nor the organization can at any moment be acting on all the aspects of the world that it can perceive.

Since the environment of an animal is more complex than its ability to act, the things perceptible in the world can be divided into two classes: those the animal is currently acting on, and those it is not acting on but might. (There are, of course, other classes. A third is things the animal could not affect; these are important only insofar as they relate to things the animal might affect, and can otherwise be ignored. A fourth is things the animal cannot sense but that might affect it; these must be irrelevant to the behaviour of the animal.)

The class of perceptions of things on which the animal is acting (which we call "controlling/monitoring") is far smaller than the class of perceptions of things on which it is not acting at any moment. There must, therefore, be a mechanism that alerts the animal to things that might require immediate action. We might call these things "dangers and opportunities" (the DAO of life), and the mechanism "Alerting." In humans, at least one of the alerting mechanisms is immediately apparent: if one sees a movement "out of the corner of one's eye," one normally looks without having been aware of having seen the movement. The movement might signal a need to act; its detection is done by an alerting system provided by evolution.

Another familiar example of Alerting is the startle you can get if you are immersed in some task and fail to detect the approach of someone who touches you. An unexpected touch could signal extreme danger, and the source of the touch should normally be brought immediately into the set of perceptions about which you might be doing something (the controlling/monitoring set), replacing the interrupted work. In the everyday world, the occurrence of an Alert means that your sensors can be brought to focus on the situation that led to the alert, and that your muscles can be used to do something about it. Likewise in an electronic display system, if some unattended situation should be part of the DAO, the electronic system should allow the user to bring it into focus: to zoom into the relevant data in its proper context.

There are two other ways in which perceptual information is used, to which we give the names "Searching" and "Exploring." Both involve changing the range of sensory data available for use in current or future actions, and look quite similar on the surface, but they are quite different. A simple pair of example questions may illustrate the contrast. "Where has that pencil gone?" leads to "searching," with a possible answer "Into that box," whereas "What is in that box?" leads to "exploring," with a possible answer "The pencil."

Both searching for the pencil and exploring might have led to looking in the box, and the result of looking in the box might be to find the pencil, but searching and exploring are very different. In the box one might have found anything at all, which is potentially valuable information if one is exploring, but not if one is searching for a pencil that is not in the box. When one is "searching," it is so that one can affect some presently controlled perception. When searching, it is of no value to know what is in the box if the pencil one needs is not there. Exploring supports no present need except the need to be prepared for unknown future eventualities.

Humans not only can see the existing situation but also can imagine other situations. Both exploring and searching can be done in imagination, exploring the remembered world or the imagined world of invention. Alerting cannot be done in imagination, because it is the unexpected danger or opportunity in the environment that causes the alert. Controlling/monitoring normally is also done through the outer environment, but "controlling" the imaginary world is what we might call "planning." In planning, one imagines what the results of actions might be if they were to be carried out in the real world.

There are four characteristically different modes of using perception. Monitoring is a sub-mode of Controlling; together they represent time-multiplexed control of the perceptions being monitored and controlled. Planning is a sub-mode of Exploring, because it can be seen as exploring the effects of actions in a space that simulates the world in which the actions might later occur. The modes of perception are listed in Table 1.

 

Type              Range      Focus               Effect
1. Controlling    Limited    Directed            Action on real world
1a. Monitoring    Limited    Directed            Action in imagination on real-world data, preparing for possible shift to "controlling"
2. Alerting       Unlimited  Undirected          Passive reaction to events in real world; may move a perception into Controlling/Monitoring
3. Searching      Limited    Sensor direction    Support for Controlling
4. Exploring      Limited    Sensor direction    Passive acceptance of data from real world
4a. Planning      Limited    Imagined effectors  Passive acceptance of results of imagined action

Table 1. Modes of using perception

 

Now let us consider the electronic world in light of these four modes of using data. The electronic world can be at least as complex and as dynamic as the tangible real world, and it may even represent some aspects of the changing tangible world. In military command, the electronic world may include all the information from all the sensors and all the spies over a whole continent or more. That, potentially, is far more than would be available to a human in the world in which our ancestors evolved. And yet the human must perceive and act in this electronic world using only the same sensors and effectors that have evolved to suit the tangible world. Those sensors and effectors cannot contact the electronic world directly. They need to act through surrogates we call "Input-Output Devices": displays, keyboards, drawing tablets, loudspeakers, gloves sensitive to finger position, and the like.

The key to effective interaction between the human and the electronic world is to provide the human with effective means of controlling/monitoring, searching and exploring/planning, and to take proper advantage of the human's alerting systems in directing attention to DAO events in the electronic world. The human user must be able to interact with the view onto the data, to change at least its focus and its detail, and probably its manner of representation.

 
Information in analogue and symbolic representation

If humans can perceive the world only through the systems evolved for the tangible world, then it is important that the useful information in the electronic world be transformed into a form suited to those tangible-world systems. The world is relentlessly analogue, but we talk about it as if it contains discrete, labelled objects, with precisely labelled categories. What we see is guided by these linguistic, symbolic labels, but is much richer than any language can convey. It would seem to make sense, then, to retain whatever analogue values are inherent in the complex datasets to be displayed, rather than trying to symbolize them and display the discrete representation.

I first came across this problem in the early 1970s, in connection with the question of how best to present multispectral remote sensing data to the eye. The satellite was Landsat 1, which provided data that could be turned into pictures of a swath of the Earth about 160 km wide, in each of four spectral bands: green, red, and two in the near infrared.

At that time, one accepted way to display the data was to use pattern-matching techniques to label each pixel or region as being, say, wheat field, granite rock, or muddy water, and to assign each pixel a colour associated with its label. This technique is fine, if the colours are discriminable, the distinctions among the categories clear (no pixels of rocks in water, for example), and the categories precisely those wanted by the user at a particular moment.

Alternatively, a frequently used analogue display, called a "false colour" image, combined the four bands into a single colour picture by taking three of them and displaying them overlaid, one in blue, one in green, and one in red. Over vegetated areas, the result was usually a picture dominated by reddish brown hues, with some blue tones representing bare rock and human artifacts. People could learn to interpret bright red as vegetation, and could become quite good at understanding the pictures, but better pictures were clearly possible. A winter scene in the Canadian Shield, for example, might show only shades of brownish-grey.

My approach to the problem of displaying Landsat imagery was to ask about the analogue information in the set of four images, noting that as black-and-white pictures, the four often looked alike, very much so in the winter Shield imagery. One could correlate the four, and do a statistical "principal components" analysis to generate four new images, one of which represented the common effects in the four pictures, and the others of which represented three independent kinds of differences among them in order of informational importance. The last of these four derived pictures often looked much like random noise, and was discarded.

The human visual system has three kinds of colour sensors, whose outputs are combined in different ways to form three informationally independent information channels, which we can call "brightness," "red-green contrast," and "blue-yellow contrast" in decreasing order of information handling capacity. So the natural thing to do with Landsat images was to map the first three principal components images onto the three information channels of the visual system, thus maximizing the actual visual information available to the brain for further processing. The result was very effective and was incorporated into commercially available image processing hardware (and some 5 years later was independently rediscovered and publicized in the USA). When this technique was used on the winter Shield imagery, the resulting picture was brightly coloured in ways that allowed immediate identification of the different vegetation, and of different qualities in the ice and snow, none of which could be seen in the ordinary false-colour image.
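
To make the procedure concrete, here is a minimal sketch in Python/NumPy, assuming a co-registered 4-band image held in an (H, W, 4) array; the simple opponent-colour model used to invert luminance, red-green, and blue-yellow values back to RGB is an illustrative stand-in, not the transform actually used in the 1970s hardware.

    import numpy as np

    def pca_to_opponent_rgb(bands):
        """bands: (H, W, 4) array of co-registered spectral bands.

        Returns an (H, W, 3) RGB image whose luminance, red-green, and
        blue-yellow axes carry the first three principal components,
        in decreasing order of variance."""
        h, w, n = bands.shape
        x = bands.reshape(-1, n).astype(float)
        x -= x.mean(axis=0)                    # centre each band
        cov = np.cov(x, rowvar=False)          # 4x4 band covariance
        vals, vecs = np.linalg.eigh(cov)       # ascending eigenvalues
        order = np.argsort(vals)[::-1]         # most informative first
        pcs = x @ vecs[:, order[:3]]           # first three components

        def scale(v, lo, hi):
            v = (v - v.min()) / (np.ptp(v) + 1e-12)
            return lo + v * (hi - lo)

        lum = scale(pcs[:, 0], 0.0, 1.0)    # brightness: highest capacity
        rg = scale(pcs[:, 1], -0.5, 0.5)    # red-green contrast
        by = scale(pcs[:, 2], -0.5, 0.5)    # blue-yellow contrast

        # Invert a simple opponent model: L=(R+G+B)/3, RG=R-G, BY=B-(R+G)/2.
        s = 2.0 * lum - (2.0 / 3.0) * by    # R + G
        r = (s + rg) / 2.0
        g = (s - rg) / 2.0
        b = lum + (2.0 / 3.0) * by
        rgb = np.stack([r, g, b], axis=1).clip(0.0, 1.0)
        return rgb.reshape(h, w, 3)

Mapping the largest-variance component to luminance respects the decreasing information-handling capacity of the three visual channels.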

It is worth dwelling a little on the Landsat problem, because it illustrates an important point. The same information, and perhaps a little more, is available to the eye when the four black-and-white images are shown side by side as when a three-band false-colour overlay is shown, but the brain perceives much more in the false-colour image. The principal-components image allows the brain to perceive a great deal more still, even though its mapping to the human visual processing system is sub-optimal. Even better images could be made, but the principal components image is an incomparably better way for the human to make sense of the electronic world of Landsat than are the individual spectral images or the false-colour overlay that offer (formally) the same or more information to the eye.

The coded image based on recognizing and labelling the pixels by category has the virtue of allowing the user to say immediately "Here is wheat." But it does not allow the user to see that the quality of the wheat grades from north to south, and it does not show the conceptual context of any region. It does not show richer or poorer soil, mixed crops, or stony gradation into the forest surrounds unless a label and a colour have been provided for such a characteristic. If one stands back from some of these images, one can turn them back into a bastardized analogue form, seeing large-scale gradients; but since the coloration is symbolic and therefore arbitrary, the mixtures of different pairs of colours may look the same when the underlying data is quite different, or different when the underlying data is the same. Information is always lost by premature symbolization of data.

 

Time and dimension: Interacting with the dataspace

Much has been heard in recent years about "virtual reality" (VR) as a means for humans to interact with large complex databases. Where does VR fit into the context of this discussion? It fits in two ways. Firstly, the ability of the human to move freely in the virtual space provides the kind of control required for exploring and searching, in that different regions of the data space may be brought readily into focus. Secondly, permitting the user to appear to move in a 3-D space that seems similar to the usual tangible world provides both familiarity and the possibility of presenting a larger amount of data than could be shown in a 2-D space. This latter point may seem paradoxical, in that the display surfaces are still 2-D, but the ability of the user to change viewpoint means that material in the background may be readily related to material in the foreground, and that changes in viewpoint can alter dramatically the 2-D projection of the 3-D space. Humans have evolved 3-D visualization using these effects, and even though the actual display may be 2-D, the effect is to allow the human to perceive the full 3-D space. Let us expand on these points.

The amount of data that can be displayed in a 1-dimensional space (along a line) is much less than can be displayed in a 2-D space (an area). The limitation in both cases is the ability of the user to discriminate the brightness and colour of one pixel from its near and far neighbours, and to locate each pixel with respect to its neighbours. The user must be able to answer the question "what is where?" before the question "what does it mean to me?" can be answered.

The amount of data that could, in principle, be displayed in a 3-D space is as much greater than that in a 2-D space as the 2-D space exceeds a 1-D space. But it is not immediately possible to take advantage of this potential increase, because our 3-D perception is based on two 2-D representations in which foreground objects can obscure background objects. The overlay in depth also severely restricts our ability to answer the question "where is it?" We overcome this problem to some extent by stereopsis, better by allowing the representation to change over time (a fourth dimension) as the apparent viewpoint changes, and more particularly by allowing the changes of viewpoint to be influenced by the user directly, as by allowing movement in a VR space.

Just as a 3-D representation can display more than a 2-D one can, so a 4-D representation can display more than a 3-D one, and this is true whether or not the user's influence is what causes the changes in the display. Temporal changes in the display do not overlay one another, as do variations in the depth of particular data points. There are, of course, problems: of discriminating which event preceded which other, of distinguishing one long time-interval from another, of remembering what happened over a long time sequence. But despite these problems, animation brings displays nearer to what our sensory systems have evolved to handle, and animations with which the user interacts can be even better suited to our ancestral abilities. But a bad 4-D analogue display might be even more confusing for a user than a bad, complex, tabular written display of the same data.

Allowing the human to perceive a 3-D space is not of much value unless there are visual clues to relate location within the space to some interesting aspect of the data. In the tangible world, the location of a thing is an important element of what it is. Trees do not usually grow upside down on a house roof. In the electronic world, location is arbitrary. Anything can be located anywhere, and the locations at which something is displayed are mutable at the whim of the designer of the display. Navigation is a central and often overlooked part of surviving in the tangible world. It permits exploration to be directed to places that are likely to be of interest in the future, and it permits search to be directed to places likely to hold the desired information. One does not usually search for the misplaced pencil high in the branches of a tree. The link between "what" and "where" should be high among the priorities of the designer of a VR system for data access.

A VR space need not have the Euclidean geometry of our tangible space. It is easy to design a space in which the room one leaves through a door is not the room to which one returns through the "same" door. It is equally easy to provide a multidimensional maze of interlinked "rooms" containing data of particular kinds, or views of different kinds onto the same data. But is it beneficial to do so? Sometimes it may be, sometimes not. The same kind of considerations apply as in the case of the Landsat data. The VR space must be matched to the human's need.
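
A toy sketch of such a non-Euclidean room graph follows, with hypothetical room and door names; because each (room, door) pair maps independently to a destination, walking back through the "same" door need not undo the previous move.

    # Hypothetical maze of data "rooms": each (room, door) pair maps to
    # a destination, with no requirement that doors pair up symmetrically.
    doors = {
        ("overview", "north"): "finance",
        ("finance", "south"): "personnel",   # not back to "overview"
        ("personnel", "north"): "overview",
        ("finance", "north"): "finance",     # a door can even loop to itself
    }

    def walk(room, path):
        for door in path:
            room = doors.get((room, door), room)  # unknown doors do nothing
            print(f"through {door!r} -> {room}")
        return room

    walk("overview", ["north", "south", "north"])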

The essential fact of life is that the actions of a living organism stabilize its environment, as represented by its perceptions. It controls its perceptions, and that means that there must be ways in which the user of a complex dataspace can manipulate the content of the space, even if only by altering the way it is perceived. The trivial change from a 2-D to a 3-D view has to be accompanied by ways of changing the 3-D viewpoint. It is much less trivial to change a commander's view of a battlefield from a simulation of a bird's-eye picture to an image of the state-of-readiness of friendly and hostile troops, to a listing of available armaments, to a visualization of anticipated transit times and supply security, and so on. The views can be readily changed, but how does the commander find out that some particular view would be useful, even if he does have the mechanism to achieve it? This kind of problem rarely arises in the tangible world that provides the model for a VR world.

 

Alerting

We have, directly or indirectly, addressed some of the more obvious issues of three of the uses of information: controlling/monitoring, searching, and exploring. Alerting provides both the greatest opportunity and the greatest challenge to the technology of the electronic world. The user can be, but normally is not, involved in detecting the alerting (DAO) event, inasmuch as the part of the electronic world in which the DAO event occurred is probably not immediately accessible to the user, and if it is, there is a good possibility that the DAO event happens in a part to which the user is not paying attention. In the tangible world, evolution has provided us with some solutions. We look at only a very small portion of the world that is visible, an area perhaps 1 degree wide centred on the direction of gaze. But if something happens within perhaps 60 to 90 degrees of the direction of gaze, we may look at it, if only for a flash. Rapid visual changes in an otherwise quiet environment are alerting. But even that covers less than half of our surroundings, so we have ears that can hear the approximate direction of sounds, at least well enough to allow us to turn and look for the source of the unexpected noise.

We can think of using a similar approach in our electronic world. Several different interfaces have used sound as an alerting mechanism with visual presentation of detailed information. In the "Media Room" at MIT in the 1970s, for example, one wall showed many windows into which the user could "zoom" for more detailed information, but the whole dataspace was much larger than the wall could accommodate. The user could shift the "direction of gaze," bringing new windows onto the wall from one side while dropping them off the other. Windows on or near the wall produced characteristic sounds (a crowd scene window might make crowd noises, an industrial one factory noises, and so forth). The user might be able to determine from the available noises whether a window of interest was nearby, even though it was not visible. This idea could easily be adapted to a full sphere of available space, with the "direction of gaze" specified by the user pointing with a finger, or perhaps by head movements.
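
The following Python fragment sketches one way the audible-neighbourhood idea might work; the window names, their angular positions, and the 90-degree audible span are hypothetical choices, not a record of the MIT implementation.

    # Hypothetical windows placed around a circle of directions (degrees),
    # each with a characteristic sound; gain falls off with angular
    # distance from the "direction of gaze", so nearby but off-screen
    # windows are audible before they become visible.
    windows = {"crowd": 20.0, "factory": 75.0, "harbour": 200.0}

    def gains(gaze_deg, audible_span=90.0):
        out = {}
        for name, angle in windows.items():
            d = abs((angle - gaze_deg + 180.0) % 360.0 - 180.0)  # wrap-around
            out[name] = max(0.0, 1.0 - d / audible_span)
        return out

    print(gains(gaze_deg=30.0))  # crowd loud, factory faint, harbour silent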

Auditory alerting was used by William Gaver in a simulated control room for a bottling factory. Two "controllers" in communication with each other were asked to keep the "factory" running smoothly by assuring that the right supplies were delivered at the right times and in the right quantities, that the flow lines and machines did not fail, and so forth. All the necessary information was available in mimic diagrams on the operators' screens, but performance was much better when clanking, gurgling, dripping, grinding, or thumping noises were introduced to simulate the sounds of the different parts of the factory. When the sound of the factory changed, the controllers detected the change much more rapidly, and were able to correct faults more effectively than when the noises were absent.

In these cases, the electronic world did not act so as to provide specific alerting signals. Rather, some varying state of the electronic world was displayed in such a way as to be useable by one of the natural human alerting mechanisms. In a more complex electronic world, this simple use of the alerting mechanism may not be possible. It may become necessary to use electronic agents, corresponding to an assistant whom you might ask to warn you when some event occurs. The electronic agent monitors the electronic world for events that suggest that the user should be alerted, and when the event happens, provides a display suited to the user's alerting mechanism, such as flashing something on a screen, making a noise, or changing some colour. In a really complex electronic world, there may be thousands or millions of such agents, whose job is to shield the user from the vagaries of the electronic world except when they might be of interest. The user may look at any aspect of the electronic world, but need not look at any more than the part of current interest, if it is assured that "interesting" conditions elsewhere will be brought to notice.

 

Continuous versus discrete variables

In acting on the world, there are two kinds of possible result: a variable changes its magnitude, or a condition changes from one state to another. The first kind is exemplified by a change in the setting of a thermostat; the room temperature may go up or down by any amount, large or small. The second kind is exemplified by a door being locked or unlocked. Even though the position of the bolt is continuously variable, the state is discrete: it determines whether one needs a key in order to pass through the doorway.

It was pointed out earlier that the act of labelling involved a drastic reduction in the information available in a display. But the converse is that labelling can bring out aspects of a situation that are hard to represent in any other way. Why, after all, do we use language so much? If we could readily communicate in analogue ways that convey more detailed information, why do we so often use discrete, symbolic, language instead? And why is this RSG so concerned with ways to visualize what has been said in vast streams of written or spoken language?

The answer to most of these questions is that language is much better than analogue display for communicating specific things. It is easier to understand a written "X is growing too much faster than Y" than it is to see, in a complex display, that this relationship is the important perception in respect of which some alert has occurred. There is usually too much else in the display that might be relevant.

One of the beneficial aspects of using computers is that they can be used to check huge changing datasets for the occurrence of certain conditions. They can supplement the human perceptual alerting system, which might be able to detect significant conditions in whatever is displayed at the moment, but cannot do so in the portions of the data that are not displayed. When one of a myriad of automated alerting detectors determines that an alerting condition exists, it has to communicate to the human user what that condition is. Since there might be an indefinitely large number of possible alerting conditions, of many different kinds, communication is very important. There may be little option but to do most of it symbolically, and probably linguistically (including visual "gestural" display). The "alerting agent" must first draw the user's attention by displaying to the human's alerting system, and then communicate to the user the reason for the alert.
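
As a sketch of how such a detector might be structured, the following Python fragment is illustrative only; the predicate, the alert message, and the signal_user display hook are hypothetical stand-ins for whatever detection and display machinery a real system would use. Note the two-step pattern: first seize the user's alerting system, then state the reason symbolically.

    from dataclasses import dataclass
    from typing import Callable

    def signal_user(source: str):
        """Stand-in for a display suited to the human alerting system."""
        print(f"*** flash/beep from {source} ***")

    @dataclass
    class AlertingAgent:
        """One of many agents, each watching for a single DAO condition."""
        name: str
        condition: Callable[[dict], bool]  # predicate over the dataset
        message: str                       # symbolic reason for the alert

        def check(self, data: dict):
            if self.condition(data):
                signal_user(self.name)           # step 1: draw attention
                return f"{self.name}: {self.message}"  # step 2: say why
            return None

    # Echoing the example above: alert when X grows much faster than Y.
    agents = [
        AlertingAgent("growth-watch",
                      lambda d: d["x_growth"] > 2 * d["y_growth"],
                      "X is growing too much faster than Y"),
    ]
    data = {"x_growth": 0.9, "y_growth": 0.3}
    for agent in agents:
        alert = agent.check(data)
        if alert:
            print(alert)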

Situation displays also may require some kind of symbolic representation. It may be essential for a commander to know which specific battalion is represented on a display, and even the fact that it is a battalion is a symbolic matter. A battalion is not just a certain number of soldiers with certain equipment, it is an organizational structure. The fact of the structure is hard to represent in an analogue way.

The facts that are easy to indicate symbolically but not in analogue form are usually facts imposed by human fiat, not facts organized by nature. Nature knows no battalions, no names. And if the structure of our language is any guide, we find problems in dealing with relations among more than three things. A critical triplet relationship we find in language, and around which our computer programs are built, is "If X then A, else B." But we do not have any corresponding four-way relationship, and it is hard to imagine what such a relationship might be. Another suggestive aspect of at least the English language is that we have verbs that relate one, two or three obligatory noun forms, but no verbs with a higher number of links. (Of course, any one of the noun form items related by a verb may itself be a relationship.)

 
Speculation: Limitations on Relationships

There may be a deep reason why we deal with relationships of no more than three items, or it may just be a limitation of our brain capacity. Either way, our displays should help us to take advantage of three-way relationships. Here is a speculation as to a deep reason for a limit.

Imagine all the nouns of a language as points in some space, disconnected from one another. A verb provides links among the nouns. A verb with one obligatory noun form makes no link between nouns, but a verb with two obligatory noun forms provides a single link between two nouns. A verb with three obligatory noun forms provides two links for each of the three nouns, forming a triangle in the space. We can treat the nouns and the verbs that link them as a graph. The graph is active, in the sense that when a noun is used in conjunction with a verb, something happens in the brain, changing the perceived state of whatever is represented by the noun. That change may affect the states of other items that might be represented by nouns linked to it by possible verbs, in a cascading chain of effect. Any one of these changes could be verbalized, could form part of a longer text, could form part of a dialogue, or might simply occur in nonverbal imagination.

Stuart Kauffman (At Home in the Universe, Oxford University Press, 1995) has shown that the number of links to each item in such a network is critical in determining its behaviour. If the number is greater than two, the network almost inevitably behaves chaotically, everything affecting everything else. If it is less, the activity quickly dies away. There is a "phase transition" at about two links per item. At the phase transition the network behaviour is sustained, but constrained to relatively small portions of the network.
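
Kauffman's observation can be probed with a toy simulation, sketched below in Python; the network size, the step budget, and the use of attractor-cycle length as a crude proxy for ordered versus chaotic behaviour are all illustrative assumptions.

    import random

    def random_boolean_network(n, k, steps=2000, seed=0):
        """Iterate a random N-K Boolean network (Kauffman, 1995) and
        return the length of the attractor cycle it settles into.
        Short cycles suggest ordered behaviour; failure to cycle within
        the step budget suggests effectively chaotic behaviour."""
        rng = random.Random(seed)
        inputs = [rng.sample(range(n), k) for _ in range(n)]
        tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
        state = tuple(rng.randrange(2) for _ in range(n))
        seen = {}
        for t in range(steps):
            if state in seen:
                return t - seen[state]       # attractor cycle length
            seen[state] = t
            state = tuple(
                tables[i][sum(state[j] << b for b, j in enumerate(inputs[i]))]
                for i in range(n)
            )
        return None                          # no cycle found within budget

    # Typical runs: K=1 and K=2 nets fall into short cycles quickly,
    # while K=3 nets wander through very long, effectively chaotic orbits.
    for k in (1, 2, 3):
        print(k, random_boolean_network(n=60, k=k, seed=1))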

The analogy to the networks described by Kauffman may be a little strained. But it is not unreasonable to suppose that there is a similar possibility of a phase transition into chaotic thinking occurring if we have any substantial number of four-way relationships in our conceptual networks. We have survived with a linguistic limit of three, perhaps because three is necessary if we are to consider conceptual possibilities (if-then-else), and because more is dangerous if used promiscuously. All this is speculation, but it suggests that the display of three-way relationships may be an important aspect of data display, whereas the display of more complex relationships probably is not.

In any dataset one can conceive of possible relationships among any pair or triplet of data items. The number of two-way relationships grows as N^2, of three-way relationships as N^3, and of all relationships as e^N. Of these, almost all are of no interest whatever to any specific user, but any one of them, at least up to a three-way relationship, may be of interest to some user. And here there are opportunities for analogue display. We can, perhaps, display the magnitude of a relationship, such as the degree of danger to friendly aircraft from ground fire, as a colour in a 3-D airspace. In such a display, the value of the relationship is intrinsically linked to a point in a real 3-D space (making a three-way relationship among aircraft, ground fire, and location), but there is no need for it to be so. The "location" in the space can be the value of any variable at all, even the name of stocks whose price is being related to some element of management strategy. Such displays are used to good effect by Visible Decisions for financial and stock market data (http://www.vdi.com).

 

Symbolic and analogue display

One of the characteristics of symbolic display is that the symbol requires attention of some kind. Perception of a symbol is ordinarily of the controlling/monitoring kind, with a rather small limit on the number that can be dealt with at any moment. The "problem of clutter" is a problem of symbolic representation. If, in some way, one can convert a particular symbolic representation into an analogue representation, then there is a chance one can display it in some form that allows the user to take advantage of the evolved sensor systems to see important things in a useful context, to search, to explore, and most importantly, to be alerted passively to Danger and Opportunity (DAO) regions of the dataspace. That is the essential core of systems like those embodied in the Galaxy project, in which the conceptual content of documents is represented as points or volumes in a multidimensional space, displayed as points, vectors, landscapes, and colours.

Starting with an analogue representation of the approximate conceptual content of myriads of documents, a user might be able to perceive patterns among the documents, or to zoom into regions of ever more precise analogue representations, until the actual symbolic content represented was small enough to be viewed directly. An analogy might be a geographical map, seen first at continent scale, and then progressively zoomed into country scale, city scale, street-block scale, and then into an alphanumeric listing of the usage of municipal utilities by a single household.

The ability to zoom into an analogue representation of symbolic material can support either Search or Explore perceptual modes. It does not directly support Alerting, though an Alert should be able to pinpoint a region of the conceptual space wherein the Alerting event occurred, provided the event is not widely distributed in the representation being used. Zooming does not support controlling/monitoring, at least not of the symbolic content itself. Nevertheless, in many military situations Searching and Exploring/Planning are the more critical modes, since it is the uncertainty of situations (missing information) that creates the most difficulty. Appropriate analogue displays can ease the problem of finding the critical information in a Search, or of studying the conceptual terrain in an Exploration.

 

Briefing

To this point, we have considered the extraction of information from raw data. But the RSG also has taken displays for briefing to be part of its mandate. The requirements here are quite different. From the massive dataset, someone has extracted information thought to be valuable to the prime user, and the problem is to communicate this information. Formally, the issue is like that of Alerting; an autonomous agent has analyzed some aspect of the data and found it likely to be in the DAO class, worthy of being reported. But in practice the two requirements for communication are rather different. An autonomous Alerting agent is one of many, each looking for a specific kind of DAO event, whereas a briefing agent supports either the Search or the Explore perceptual mode, reporting the answer to a "what is there" question.

There is a similarity between the Alerting agent and the Briefing agent, however, in that each needs to communicate to the prime user the results of interpreting the massive dataset. This implies that symbolic presentation is likely to be useful, or even required. There is a difference, in that when a DAO event is observed by an Alerting agent, the agent reports only the location and nature of the event. The prime user must then interpret the raw data in context, shifting the relevant region of the data momentarily into Monitoring perceptual mode to determine whether the DAO event warrants fully shifting the focus of the Controlling/Monitoring perceptual mode. In contrast, in a briefing, the prime user is never expected to be confronted with the raw data. Instead, the interpreted data is displayed, and the prime user usually has the opportunity to query the briefing officer if a question about it arises. To ask questions requires that the questioner have a context for the matters in question. The briefing display therefore needs to provide the interpreted information in context.

A simple example of a briefing display that incorporates context might be a trip-routing map provided by an automobile club. The actual information is "take this highway; turn right at that highway after so many kilometres; ..." but the trip routing map displays much more. It shows towns on the route, other intersections with turnings not to be taken, and possibly sights and sites of interest along the way. All these provide context, which makes the actual briefing information much easier to understand and to use during the trip.

Likewise, a largely linguistically presented briefing display may be cast on a background of the raw data from which the information was extracted. A financial briefing pointing out investment opportunities might well be done with actual figures and numbers, but displayed against a background landscape of investments not recommended. The prime user might then query the adviser as to why other apparently profitable investments had not been recommended, and in any case would have a better understanding of the briefing than if the symbolic material had been presented alone and unsupported. Context matters a great deal in understanding situations, particularly evolving situations. Analogue displays are often better at providing context than are linguistic/symbolic ones.

 
Conclusion: Some ergonomic aphorisms

The bottom line is: don't be afraid of putting too much into a display; be afraid of putting too many items into it that a user has to consider individually. It is patterns in the data that the user usually wants to see, and from the patterns the user may want to see individual items. People have evolved to see spatially bounded objects, and are good at seeing even obscured patterns that might be objects. Colours in the world usually indicate the nature of an object, and should be used in displays for that purpose. Rapid changes tend to be alerting, as do brightly contrasting colours if they are few and far between.