Background and progress of Panel 8/RSG-30

M. M. Taylor

970415

Sent to N/X members, as background reading.

 

Nature of the problem

 

RSG-30 was initiated largely because of the interests of Canada (CSE) in discovering militarily and politically important information in textual data sources. The data sources might be archival or they might be incoming streams of open-source (newswire, etc.) and other textual data. The volume of the data source material can be very large (millions of Megabytes) and the "interesting" information might be both subtly distributed among many sources and varying according to changing political and military requirements.

 

There are no known algorithms for extracting information of the kinds required from such data sources, and CSE employs highly skilled analysts who read and evaluate the material. The volume of material is such that they cannot see more than a tiny fraction of it, using their intuition to determine which items should at least be given a preliminary glance. CSE hoped that there might be a technological approach to assisting the skilled analysts to discover which items are the "interesting" ones. This is a subtle and difficult problem. To one analyst at one moment, the price of corn futures, the weather in Karachi, and the launching of a spacecraft may be closely connected and interesting, whereas to another analyst or at a different moment, they are totally unrelated.

 

CSE determined that other countries had similar concerns, and an Exploratory Group (EG) was formed. The work of this EG showed that the problem went well beyond the extraction of information from textual data sources, and that several directions of technological approach seemed promising. (For example, software development and maintenance is a high-priority topic; the relationships among the components of a large software system are prime candidates for visualization techniques.)

 

A workshop was held in Brussels (at which I was the keynote speaker). The workshop considered some technologies under study, and some potential applications, but it largely concerned the work of the future RSG. The Report of the 1994 Workshop is now "out of print", but a limited number of copies

 

One of the agreements that came out of the Brussels workshop was that although technological development was essential if the serious problem was to be solved, the problem itself (extracting information out of masses of data) was essentially a human factors problem. It was clear to all participants that "Visualization of complex data" did not mean "Screen display of complex data." It means "Getting the picture in one's head" or "understanding the import" of the data.

 

National participation

 

At the Brussels workshop, the US participants were quite active in addressing the structure and work of the future RSG. There are, however, political turf battles in the US, and nobody is able to assert responsibility for US participation in the work itself. The person with the official authority to allow the US to participate seemed unwilling to do so. Consequently, despite the obvious enthusiasm and interest shown by the US participants (some quite senior) at the technical level, at the political level the US is not at present participating in the RSG. They are, however, kept informed.

 

The official participating nations are Canada (lead nation), UK, Denmark, Belgium, and Spain. Other nations have expressed interest; one has said that when it drops out of another RSG, it will probably join RSG-30. At the meetings to date, Canada and the UK have seemed to be the dominant participants. The Spanish delegate has not yet attended, but promises to do so at the Third meeting, in the UK, May 12-16, 1997.

 

Method of work

 

RSG-30 has a method of work I have not seen used by any other RSG, but which promises to be quite powerful. It has set up an auxiliary body called a "Network of Experts" (N/X). The N/X has no official status within the NATO structure, but serves as a body of unofficial advisors to RSG-30. Membership in the N/X is by invitation. All participants in RSG-30 are members of the N/X. Otherwise, it is open to citizens of NATO countries who are invited by existing members, and (I believe) to citizens of TTCP nations. Citizens of other countries can be invited to be members, but those invitations are subject to the approval of the general membership of the N/X.

 

The N/X performs its work largely by electronic interaction among the members. By general agreement, RSG-30 proposes to the N/X issues that the N/X might like to address. The N/X is free to do what it wants, as are the individual members of the N/X, but it is generally accepted that the N/X exists to try to help the RSG to fulfil its objectives. The N/X meets in a face-to-face workshop in conjunction with each Spring meeting of the RSG. The second such workshop will be held along with the May 1997 meeting of the RSG in the UK.

 

Information exchange and site visits

 

As with most RSGs, the work is in part an exercise in mutual information exchange, which allows the participants to gain leverage and save on costs by exploiting each other's knowledge of the problem area. This is in part done through site visits to laboratories doing relevant work. In the case of RSG-30, these site visits are expected to be very limited during meetings that are associated with a meeting of the N/X. To date, there have been almost no site visits in a host country, because the first meeting was in Ottawa along with an N/X meeting, and the second was in Brussels.

 

During the second meeting, the opportunity was taken to visit the Royal Military Academy, but apart from some work on image understanding, very little of what was shown had much relevance to RSG-30. The other site visits were in the Netherlands, at the invitation of NCCCA and of TNO-TM (Soesterberg) and -FEL (Scheveningen) to determine the interests of those establishments in joining or at least communicating with RSG-30. At these institutions we spent more time discussing issues of common interest than in demonstrations of ongoing research or application. So RSG-30 really has not had much by way of site visits to date.

 

Applications and Technologies Document

As you are aware, I was involved in the initial stages of Panel 3/RSG-10 on Automatic Speech Processing, and was its long-time secretary until my retirement. One of the products of RSG-10 that proved very useful, both on its own and as a guide to the work of the group, was a document that evaluated the available technologies, the applications for which speech technology might be useful, the probable value and difficulty of applying the technology to each application, and the research requirements that promised to have the best payoff. A new document of this kind was produced approximately every five years, and was published in the peer-reviewed open literature as well as being a NATO report. RSG-30 decided at its second meeting to construct such a document for visualization technologies and applications.

The reason that a technology and application document is valuable is that it enables a potential user to see how existing or near-future technologies might apply to a real problem, or possibly to see that existing technology provides no cost-effective solution. Likewise, it enables a researcher to evaluate which research issues are key, having potentially high payoff in a number of areas, or which have a direct possibility of permitting the implementation of specific applications. The document does more than this, however, if it is well constructed. It allows for researchers and potential users in all the NATO countries to evaluate what work is being done where, permitting the development of synergistic efforts rather more readily than does the discussion and information exchange around the meeting tables.

 

Joint experiments

 

The core objectives of RSG-30 involve joint experiments that test the application of available (research-level) technology to the kinds of datasets for which they supposedly can be used. In most cases, these experiments will be done in more than one nation simultaneously, to ensure that the results truly are repeatable and trans-national.

 

The first phase is the collection of representative data sets (or of ongoing and available data sources), and of technologies that various nations are prepared to submit for test (or that can otherwise be obtained, the RSG having agreed that resources are not available for direct technological development for this purpose). Datasets to be used in the experiments may have been collected by other institutions for other purposes. An example might be a dataset of Internet traffic that contains instances of attempted intrusion (Denmark has insisted that this dataset be among those used). Another example is the Canadian Hansard, which has the unique distinction of being a large textual database containing diverse topics and being available electronically in an official translation in two languages. Strong techniques ought to be able to point the human analyst to the same interesting items in both languages. Newswire sources and publicly disseminated Internet mailing lists are potential sources of unilingual textual material from sources that continually remain current (it having been decided that no classified material will be used). The source code for large software systems provides a database of an entirely different character, and visualization problems of quite a different nature.

 

Any experiments to be carried out in this program necessarily involve human factors. The technological questions do not involve the ability of computational techniques to extract information so much as the ability of computational techniques to select and display information in ways that assist humans to make sense of it for their ever-changing purposes.

 

Experiments do not have to be conducted by the RSG member institutions or nations. It is anticipated that the RSG will recommend experiments to the N/X, and that members of the N/X will carry them out (remembering that all RSG-30 members are also members of the N/X). Of course, this depends on the goodwill of the N/X, and if that goodwill fails, it will be incumbent on RSG-30 members to find institutions for which conducting the experiments will be in their own interest, or to find money to contract for the experiments to be done. Since the main premise of any RSG is that the work is in the interest of the members, this last possibility may, in some nations, be the first to be considered.

The current (Nov 96) state of the RSG is that database suggestions and technique proposals are being collected. A draft outline of the Techniques and Applications document has been accepted. Responsibility for the earlier sections of the document has been assigned, and some paragraphs have been drafted. It is hoped that the Third meeting, with its parallel N/X meeting, will result in substantial progress in the development of this document. The document in turn will assist the RSG to focus its attention and that of the N/X onto experiments of greatest military and technical relevance.