Landwich, "Model for Digital Library User Interfaces supporting Visual Information Dialog Services", TCDL Bulletin 5.2 (2009)

Model for Digital Library User Interfaces
supporting Visual Information Dialog Services

Paul Landwich

Distance University of Hagen
Workgroup for Multimedia and Internet Applications
Paul.Landwich@FernUni-Hagen.de

This dissertation presents a new approach to the design of user interfaces for the search and retrieval modules of a Digital Library. The central idea is that only a dialog with the system can establish the information context necessary to satisfy an information need. A conceptual model is introduced to emphasize the importance of this dialog within a search. In the next step the elementary information sets and activities are derived. The applicability and utility of this formalization are then shown by example: a cognitive walk-through of the ACM Digital Library. The example emphasises the great challenges for future systems and the possibilities for improving existing systems.

1 Introduction

The requirements definition of an interactive information retrieval system can be viewed from two perspectives: First by the existence of an information deficit corresponding to an information need with respect to the user [2]; and second by the mediated provision of stored knowledge through the underlying access support system (i. e. database management, digital library or information retrieval systems). Under optimal conditions the information deficit of a user can be resolved through the use of the provided interactive information retrieval system.

In classical information retrieval (IR) research, the system-oriented view dominated in the past. This view stipulates that a user formulates a query and then judges the found elements according to their relevance. This kind of static IR process was improved and refined in the past. Particularly within the fields of IR engines and result presentation, many innovations were developed in this special setting.

This static setting does not always correspond to the communication and interaction needs of humans. IR systems should explicitly support the cognitive abilities of users in order to realize a dynamic dialog between the user and the system. An information dialog that not only supports an individual query but also the complete search process is necessary. Only this model will support the possibility of satisfying an information need.

The model shown in Figure 1 displays the key elements required for satisfactory information dialog [16]. A central part of this conceptual model is the interactive information visualization cycle. The interactive information visualization cycle supports an information dialog with an information retrieval system, starting with a first query and ending after n cycles with an information status on the users' side that has either eliminated or at least reduced the information deficit. It is important that a dialog is made available through the system in order to activate suitable event handling. The outcome is a new dialog cycle (or a part of it) with changed parameters in the query, navigation, relevance or information visualization. In this way, interactive information processes can be supported. All activities and user derived information sets fulfill the information-dialog context and reduce the information need of the user.

Figure 1: Conceptual Model

In order to fully understand the basis for modeling of interactive IR systems, the complete information seeking process must be understood.

This dissertation argues that user interfaces offering assistance in the form of a set of dialogs can be developed only if the essential activities and conditions of a search task are defined. To engage the cognitive abilities of the user, it is reasonable to visualize not only results but also components of the dialog. This makes it possible to represent the information context in all its facets.

2 Related work

Cooper [3] had already distinguished measures for the quality of information systems based on relevance and usefulness. The aim of increasing not only the relevance but also the usefulness of results was presented in the "information behavior model" by Wilson [26], which opened two related fields of research. The first went in the direction of user interfaces of information systems and the second in the direction of information visualization.

Visual presentation, perception and cognitive interpretation of information have long been considered [17] an efficient communication method for informational situations. For this purpose, tools were developed that visualize the exploration by means of query construction and (re)formulation [23] and the result presentation [13], [4], [7]. Consequently, visualization is not only representation but also a stimulus for interaction and dialog. First approaches in this direction have already been implemented and studied, e.g. by the LyberWorld [10], VisMeB [18] and prefuse [8] systems.

Motivated by the idea of supporting dialog, [11] introduced a cognitive model of the information dialog and a model of an interactive information visualization cycle supporting an IR dialog ([9]). [16] took up this approach and introduced a cognitively enhanced model of IR, which led to a model for visually direct-manipulative information retrieval dialogs.

In order to optimise information systems, models were developed in which human users do not merely form a part of the system (e.g. by providing input) but become an important component of it, and even its centre. [15] investigated how information seeking and the corresponding access can be understood. Consequently, cognitively oriented models and approaches to support the concept of information seeking were introduced. Information needs often change during a seeking process due to changes in user awareness. The understanding of the concept of information access was extended, and with [1] the first so-called information strategies were identified and investigated. Many other works ([19], [21], [27]) showed the complexity of the information search.

All of these research areas flow into the design of user interfaces. Different works ([12], [20], [4]) investigate this problem and examine the challenges for the optimisation of the human-computer interface.

3 Formal description of the information dialog towards a framework

Results of existing research have been used to design user interfaces, yet the resulting interfaces are often not practicable. As [24] wrote: "Most user interfaces are unknown, in that they have grown from adding features.... Instead, build simple, explicit interaction frameworks to lay the foundation for clear interaction structures." Following this basic principle, we return to the core element of interfaces, the dialog. [16] showed that the dialog is extracted from the information dialog context, as the basis for managing and coupling the states of, for example, the database management system or the retrieval engine in their consecutive operations. It supports users in reducing their expressed uncertain state of information need.

In order to obtain a first simple, explicit interaction framework, this research analyzes the basic elements of the dialog of IR systems. In doing so, we have to ask ourselves the following questions:

What are the possible activities?
How do activities affect our information dialog context?

To answer these questions, this study will describe the different sets of information objects which are present, as well as those that could be created in a sequence of activities. Furthermore, a formal description of the activities is provided.

This formalization is illustrated by a cognitive walk-through. A series of representative examples, based on a typical retrieval system of a digital library, simulates a search and establishes a reference to practice.

3.1 Scenario and possible sets of information objects

In order to clarify the definitions, the study outlines a scenario within the ACM Digital Library. In this library we find a large number of articles. So we define:

Content set: Let O be an information object, e.g. a digital paper. Then the content set C of a data source a user has access to is the set of all contained information objects.

On the other hand, a certain number of articles exist that would help us to solve our information problem. But these do not necessarily have to be contained within the ACM Digital Library.

Interest set: The interest set I exists as a set of concrete information objects that are able to reduce an information deficit. For simplicity we assume I is constant.

So the set of available and relevant articles is defined as

Relevance set: The relevance set R is the intersection of the content set C and the interest set I (R = C ∩ I).

In search of relevant articles, we access the ACM Digital Library and a result set is returned as an answer.

Result set: For a query q onto the data source we receive the result set r_q, which is a subset of the content set C (r_q ⊆ C).

Figure 2 illustrates the targets of our formal definitions by means of a single query so far.

Figure 2: Dialog State after a First Explorative Query
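The four sets defined in this section can be sketched with ordinary set operations. The following is a minimal illustration only: the article identifiers, the keyword index and the helper result_set are hypothetical and do not reflect how the ACM Digital Library evaluates queries.

```python
# Hypothetical article identifiers and keywords, assumed for illustration only.
index = {
    "a1": {"system", "effectiveness"},
    "a2": {"system", "effectiveness", "search", "tasks"},
    "a3": {"search", "tasks"},
    "a4": {"visualization"},
    "a5": {"search", "tasks", "evaluation"},
}

C = set(index)            # content set: every object the data source contains
I = {"a2", "a5", "x9"}    # interest set: "x9" is relevant but not in the library
R = C & I                 # relevance set: available and relevant objects


def result_set(terms):
    """Return r_q: all objects whose keywords contain every query term."""
    return {obj for obj, keywords in index.items() if terms <= keywords}


r_q = result_set({"search", "tasks"})
```

Here r_q contains a2, a3 and a5, so it is a subset of C but neither a subset nor a superset of I, which corresponds to the partial overlap shown for a single query.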

3.2 Cognitive walk-through and possible activities

Beginning with the defined sets (see 3.1) we start our seeking process. The user has to find a specific article to satisfy his information need. With the walk-through we want to introduce the activities EXPLORATION, FOCUS, NAVIGATION, INSPECTION, EVALUATION and STORE.

In our simulated search the user is looking for a specific article, but he is also interested in articles which deal with the same research field. The specific article concerns the topics "system effectiveness" and "search tasks". He starts a first query with the keywords "system effectiveness search tasks". With this starting point the first activity, EXPLORATION, is started.

EXPLORATION: The access to the content set C in the form of a query q, together with the visualization and perception of the produced result set r_q, defines the EXPLORATION. A change (e.g. an enlargement) of the informational context is caused by this EXPLORATION.

For this first query, the result set r_q contains many information objects. But only a certain part of r_q will be a part of the interest set I and is defined as the

Explored recall set: The intersection of the result set r_q of a query q and the interest set I defines the explored recall set k_e,q.

Due to the size of the visualized result set, only a part of the articles can be captured cognitively at any one time.

For this we define:

FOCUS: The focus set f_n,q represents the subset of information objects O_i of a result set r_q which reach the field of vision of the user through a visualisation and is the result of the activity FOCUS.

If there are relevant information objects within the focus set we can define:

Focused recall set: The intersection of the relevance set R and the focus set f_n,q defines the focused recall set k_f,q.
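The relation between the explored and the focused recall set can be made concrete with a small sketch. It assumes, purely for illustration, that the focus is a window of consecutive entries in a ranked result list; the identifiers and the window size are hypothetical.

```python
# Hypothetical ranked result list for one query q (illustrative identifiers).
ranked = ["a6", "a3", "a2", "a5"]
r_q = set(ranked)

C = r_q | {"a1", "a4"}     # content set
I = {"a2", "a5", "x9"}     # interest set
R = C & I                  # relevance set


def focus(ranked_list, start, size):
    """f_n,q: the objects currently inside the user's field of vision."""
    return set(ranked_list[start:start + size])


k_e = r_q & I              # explored recall set
f = focus(ranked, start=0, size=3)
k_f = R & f                # focused recall set
```

With a window of three entries the user sees a6, a3 and a2, so only a2 is in the focused recall set although a5 is also in the explored recall set. Because every focus set is a subset of r_q, k_f is always a subset of k_e.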

In order to identify elements of the explored recall set k_e,q, the user will scroll through the result list and consider single articles in detail. These activities are defined as:

NAVIGATION: The movement within a set of information objects (information room) or between different information rooms. This causes a change of the focus f_n,q.
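As a minimal sketch of this definition, NAVIGATION can be modeled as any operation that changes which objects fall into the focus, for instance scrolling or re-ordering a result list. The class ResultView, its window size and the publication years below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResultView:
    """A hypothetical view onto one result list (one information room)."""
    ranked: list
    start: int = 0
    size: int = 2

    def focus(self):
        """Return the focus set for the current window."""
        return set(self.ranked[self.start:self.start + self.size])

    def scroll(self, step):
        """Move the window, clamped to the bounds of the list."""
        last = max(0, len(self.ranked) - self.size)
        self.start = min(max(0, self.start + step), last)
        return self.focus()

    def resort(self, key):
        """Re-order the list and return to its top; this also changes the focus."""
        self.ranked = sorted(self.ranked, key=key)
        self.start = 0
        return self.focus()


# Illustrative publication years (hypothetical data).
years = {"a6": 2004, "a3": 2006, "a2": 2006, "a5": 2005}

view = ResultView(["a6", "a3", "a2", "a5"])
before = view.focus()
after_scroll = view.scroll(2)
# Put articles from 2006 first, then order by identifier.
after_resort = view.resort(key=lambda obj: (years[obj] != 2006, obj))
```

Each call returns a different focus set over the same result set, which is the effect the definition attributes to NAVIGATION.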

Our user remembers that he is looking for articles which were published in the year 2006. He therefore re-sorts the result set, which also causes a change of the focus. This is also a NAVIGATION activity. When the user finds information objects of interest, he will try to get more detailed information about these articles. This activity is called:

INSPECTION: INSPECTION is used for the cognitive determination of the state of an information object.

In the ACM Digital Library the user can open the summary of an article and also open the full text. Following the INSPECTION the user himself classifies the information object referring to his assessment of relevance. This activity is defined as:

EVALUATION: EVALUATION gives the system feedback on the user's understanding of relevance and determines the verified recall set.

Verified recall set: Within different INSPECTIONS the user identifies relevant information objects. This set of information objects is defined as the verified recall set k_v for this query.

This activity is not implemented in the user interface of the ACM Digital Library, and the next activity is only realized in a rudimentary way.

STORE: This activity allows the user to store found documents. It happens either logically, in the form of a storage box on the user interface, or physically, when a document is downloaded or printed.

Stored recall set: The stored recall set k_s,q represents the subset of information objects O_i of the focused recall set k_f,q.
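The two sets produced by EVALUATION and STORE can be sketched as follows. The judgement dictionary and the storage box are hypothetical data structures; the only constraint taken from the definitions is that the stored recall set lies inside the focused recall set.

```python
def verified_recall(judgements):
    """k_v: objects the user judged relevant after INSPECTION."""
    return {obj for obj, relevant in judgements.items() if relevant}


def stored_recall(stored, k_f):
    """k_s: stored objects that belong to the focused recall set k_f."""
    return stored & k_f


# Illustrative state after a few INSPECTION steps (hypothetical data).
k_f = {"a2", "a5"}                                   # focused recall set
judgements = {"a2": True, "a3": False, "a5": True}   # explicit EVALUATION
storage_box = {"a2", "a3"}                           # everything the user stored

k_v = verified_recall(judgements)
k_s = stored_recall(storage_box, k_f)
```

In this example the user stored a3 although it is not relevant, so it does not count towards k_s, and a5 was verified but never stored.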

Even if the user can browse in a result list, he will not always find a result due to a potentially large quantity of found articles. He will be forced to rephrase and run more queries to narrow the search.

Figure 3 displays the result of three separate queries at the times t₁, t₂ and t₃. It becomes clear that every query causes a change of the explored recall set. If we project each query onto the projection plane, the union of the sets becomes visible.

Context: Context r_total is defined as the union of all result sets r_q and spans the information room in which the user moves.

So as the context r_total increases, the explored, the focused and the verified recall sets (k_e,total, k_f,total, k_v,total) can also grow with every new exploration. Most information retrieval systems, however, cannot make use of this constant extension of the context, so with every new query the previously gathered information leaves the user's field of vision. But with the presented definitions we are now able to analyze existing systems. This makes it possible for us to improve existing systems or to define goals for new systems.

Figure 3: Sequence of Separate Queries
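The growth of the context over a sequence of queries can be sketched as a running union. The three result sets below are hypothetical stand-ins for the queries at t₁, t₂ and t₃.

```python
def context(result_sets):
    """r_total: the union of all result sets obtained so far."""
    return set().union(*result_sets)


I = {"a2", "a5", "a7", "x9"}      # interest set, assumed constant

# Hypothetical result sets of three separate queries at t1, t2 and t3.
results = [
    {"a1", "a2", "a3"},
    {"a2", "a5"},
    {"a5", "a6", "a7"},
]

explored_sizes = []
for t in range(1, len(results) + 1):
    r_total = context(results[:t])
    k_e_total = r_total & I       # cumulative explored recall set
    explored_sizes.append(len(k_e_total))
```

The cumulative explored recall set never shrinks in this sketch, whereas a system that keeps only the latest result set would show just the objects of the last query.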

3.3 Analysis

In Figures 2 and 3, three interactive modes can be identified:

The mode of Access, which changes the informational context set through access to the database by means of a query.
The mode of Orientation, which changes the view onto the informational context set and represents a movement between information objects and rooms.
The mode of Assessment, which identifies information objects of the interest set.

Every mode is self-contained and has its own activities. The first mode is Access. Within this mode there is only one activity, EXPLORATION. After the first EXPLORATION the user changes into the second mode, Orientation. Activities for this mode are NAVIGATION, FOCUS and INSPECTION. The user now has the ability to change the visual as well as the informational focal point in an information visualisation of the dialog context. The mode Assessment is reached if the user finds objects of interest during his inspection. For this mode the activities EVALUATION and STORE are available. They help to express the user's assessment of relevance and to define the identified recall set.
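The assignment of activities to modes can be written down as a small lookup structure. The sketch below assumes that a search is recorded as a plain list of activity names; the helper functions are illustrative and not part of any existing system.

```python
from enum import Enum


class Mode(Enum):
    ACCESS = "Access"
    ORIENTATION = "Orientation"
    ASSESSMENT = "Assessment"


# Activities available in each mode, as listed in the analysis above.
ACTIVITIES = {
    Mode.ACCESS: {"EXPLORATION"},
    Mode.ORIENTATION: {"NAVIGATION", "FOCUS", "INSPECTION"},
    Mode.ASSESSMENT: {"EVALUATION", "STORE"},
}


def mode_of(activity):
    """Return the mode an activity belongs to."""
    for mode, activities in ACTIVITIES.items():
        if activity in activities:
            return mode
    raise ValueError(f"unknown activity: {activity}")


def modes_visited(log):
    """Collapse an activity log into the sequence of modes passed through."""
    visited = []
    for activity in log:
        mode = mode_of(activity)
        if not visited or visited[-1] is not mode:
            visited.append(mode)
    return visited


log = ["EXPLORATION", "FOCUS", "NAVIGATION", "INSPECTION",
       "EVALUATION", "STORE", "EXPLORATION"]
path = modes_visited(log)
```

For this log the user passes through Access, Orientation and Assessment and then returns to Access with a new query.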

The projection displays the weak points of traditional information retrieval systems very well. Every exploration produces only a snapshot of the complete seeking process, and navigation is only possible within a single exploration step. However, only through a sequence of such interactions are users able to identify and manage the totality of relevant information objects. Therefore, tools supporting this demand in the user interface, as well as utilising it within the underlying retrieval engine, are of great potential.

To support all these types of activities, the IR system has to provide appropriate interaction tools for navigating not only within a single exploration step but also in the projection plane. If an optimal intersection between focus and recall set is achieved through such interactions, further tools become necessary to support the narrowing down of the focus (down to single objects again) in order to support a more detailed inspection, analysis and assessment of the information objects.

It also has to be investigated how an optimised focus of an information dialog step can be transferred to the next step, e.g. to speed up the focusing process in that step. Furthermore, the sequence of dialog steps can provide insight into the type of information behaviour that users are performing and, in the ideal case, an information strategy can be identified and utilized to derive successive dialog steps from a given starting point. By applying corresponding visualisation and interaction tools, the users are enabled to strategically optimise their information behaviour and satisfy their information need faster.

Following the formal description of the information dialog and given the demands, this research will use the DAFFODIL system as an experimental system for further development and evaluation of the above described framework.

DAFFODIL is a virtual digital library system targeted at strategic support of users during the information seeking and retrieval process ([6], [14]). It provides not only basic but also sophisticated search functions for exploring and managing digital library objects, including metadata annotations, over a federation of heterogeneous digital libraries. It already matches the above named demands to a high degree and can be used directly to verify the new framework.

4 Challenges

With the presented formal description of sets and activities (EXPLORATION, FOCUS, NAVIGATION, INSPECTION, EVALUATION and STORE) it is now possible to illustrate, store and exploit the search process in a series of finite steps. Through the set-oriented description and the derivation of definite sets it becomes clear how the search process for one or more queries satisfies the user's need for information.

With our framework, the following approaches for an implementation in DAFFODIL are possible:

Search strategy: With the help of the user or by monitoring the activities, the system must provide different search strategies to raise efficiency.

Social Database: By monitoring many searches in the form of sets of activities, it may be possible to support a user through recommendations. By analyzing a new search from its beginning, the system could identify similar stored search processes. If this knowledge is visualized for the user, he could benefit from it in his own search.
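How similar stored search processes might be identified is left open here. One conceivable approach, shown purely as an assumption, is to record each search as a sequence of activity names and compare sequences with a generic sequence-similarity measure such as the one in Python's standard difflib module. The stored logs and their names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(log_a, log_b):
    """Similarity of two activity logs in [0, 1] (an assumed measure)."""
    return SequenceMatcher(None, log_a, log_b).ratio()


def most_similar(new_log, stored_logs):
    """Return (name, score) of the stored search closest to new_log."""
    return max(
        ((name, similarity(new_log, log)) for name, log in stored_logs.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical stored search processes.
stored_logs = {
    "s1": ["EXPLORATION", "FOCUS", "NAVIGATION", "INSPECTION", "STORE"],
    "s2": ["EXPLORATION", "EXPLORATION", "EXPLORATION"],
}

new_log = ["EXPLORATION", "FOCUS", "INSPECTION", "STORE"]
best_name, best_score = most_similar(new_log, stored_logs)
```

A measure of this kind only compares the order of activities. A realistic recommendation would presumably also need the sets involved in each step, which this sketch deliberately ignores.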

The concept of relevance feedback is the most important challenge. The users' relevance assessments must be captured in explicit and implicit form and need to be processed and analyzed to be used for recommendations.

For implicit feedback on the detail view, several features to be measured have already been identified, e.g.:

number of clicks within the detail view,
time spent reading the detail,
how the detail was accessed (incoming link), e.g.

via the result list,
via related terms,
via links from within the detail
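The listed features could be captured in a simple record per visit to a detail view. The following sketch assumes hypothetical field names and only aggregates the raw measurements per information object; how a relevance estimate is derived from them is not specified here.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from enum import Enum


class AccessPath(Enum):
    RESULT_LIST = "result list"
    RELATED_TERMS = "related terms"
    DETAIL_LINK = "link from within the detail"


@dataclass(frozen=True)
class DetailViewObservation:
    """One visit to a detail view (field names are assumptions)."""
    object_id: str
    clicks: int
    reading_seconds: float
    accessed_via: AccessPath


def summarize(observations):
    """Aggregate the measured features per information object."""
    summary = defaultdict(
        lambda: {"clicks": 0, "reading_seconds": 0.0, "paths": Counter()}
    )
    for obs in observations:
        entry = summary[obs.object_id]
        entry["clicks"] += obs.clicks
        entry["reading_seconds"] += obs.reading_seconds
        entry["paths"][obs.accessed_via] += 1
    return dict(summary)


# Illustrative observations (hypothetical values).
observations = [
    DetailViewObservation("a2", 3, 40.0, AccessPath.RESULT_LIST),
    DetailViewObservation("a2", 1, 20.0, AccessPath.RELATED_TERMS),
    DetailViewObservation("a5", 0, 5.0, AccessPath.DETAIL_LINK),
]
summary = summarize(observations)
```

Keeping the raw counts separate from any scoring leaves the choice of a weighting scheme to a later analysis of the measured values.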

All this upcoming functionality needs to be presented to the user through new, innovative and user-friendly visualizations.

4.1 Visualization

Four different visualization areas for improvement are identified. Generally, every visualization tool needs to be analyzed and alternatives need to be developed and evaluated, but the focus will be on the following:

Search history: The user must be able to get the full control of his search history and the developed information context.

Relevance Feedback: The user must have the ability to notify and review his assessments of relevance through the user interface in order to understand and change them.

Result list: The visualization of results must go beyond the usual measure. The user needs a portfolio of visualization tools, which approach his cognitive abilities. These visualization tools should be highly interactive so the user can capture the results under different visual angles.

Search pattern/strategies: With a good visualization the user can recognize that he follows a specific strategy. This awareness could lead to new directions in the further search.

5 Next steps

Next steps in this research will focus on:

Theory: On the theoretical side, the study will deepen the new framework and relate it to existing models of the information seeking process.

Relevance feedback: An exact analysis of DAFFODIL is necessary to specify possible measured values. A conclusion for implicit relevance feedback can then be derived from the measured values. Based on this analysis, an implementation for DAFFODIL is the aim of this step.

Visualization: The user must be able to get full control of his search history and the developed information context. Furthermore, the user must be able to see his expressed appraisal of relevance. To satisfy this demand and to illustrate the above described sets, it is planned to realize a visualization in the form of Venn diagrams (2-dimensional) or information object clouds (3-dimensional).

Evaluation: To evaluate the framework and the implemented components an adequate method has to be found. Several models (e.g. [25], [5], [22]) are available. In this step these models will be analysed and compared before one of them is selected.

References

[1]	N. J. Belkin, C. Cool, A. Stein, and U. Thiel. Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. In Arbeitspapiere der GMD. GMD, Sankt Augustin, November 1994.

[2]	I. Campbell. Supporting information needs by ostensive definition in an adaptive information space, MIRO '95. Electronic Workshops in Computing, Springer Verlag, 1995.

[3]	W. S. Cooper. A definition of relevance for information retrieval. Information Storage and Retrieval, 7(1):19–37, 1971.

[4]	L. Davis. Designing a search user interface for a digital library. J. Am. Soc. Inf. Sci. Technol., 57(6):788–791, 2006.

[5]	N. Fuhr, P. Hansen, M. Mabe, A. Micsik, and I. Sølvberg. Digital libraries: A generic classification and evaluation scheme. Lecture Notes in Computer Science, 2163:187–??, 2001.

[6]	N. Fuhr, C.-P. Klas, A. Schaefer, and P. Mutschke. Daffodil: An integrated desktop for supporting high-level search activities in federated digital libraries. In Research and Advanced Technology for Digital Libraries. 6th European Conference, ECDL 2002, pages 597–612. Springer, 2002.

[7]	D. J. Harper and D. Kelly. Contextual relevance feedback. In Proceedings of the 1st international conference on Information interaction in context, pages 129–137, New York, NY, USA, 2006. ACM Press.

[8]	J. Heer, S. K. Card, and J. A. Landay. prefuse: a toolkit for interactive information visualization. In G. C. van der Veer and C. Gale, editors, Proceedings of the 2005 Conference on Human Factors in Computing Systems, CHI, pages 421–430, Portland, Oregon, USA, April 2-7 2005. ACM Press.

[9]	M. Hemmje. Unterstützung von Information-Retrieval-Dialogen mit Informationssystemen durch interaktive Informationsvisualisierung. Dissertation, Darmstadt, 1999.

[10]	M. Hemmje, C. Kunkel, and A. Willett. Lyberworld a visualization user interface supporting fulltext retrieval. In Proceedings of SIGIR ’94, pages 249–259, New York, NY, USA, 1994. ACM, Springer-Verlag New York, Inc.

[11]	M. Hemmje, A. Stein, and H.-D. Boecker. A multidimensional categorization of information activities for differential design and evaluation of information systems. In GMD-Studien, number 1036 in Arbeitspapiere der GMD. GMD - Forschungszentrum Informationstechnik, Sankt Augustin, Dezember 1996.

[12]	C.-P. Klas, S. Kriewel, A. Schaefer, and G. Fischer. Das daffodil system - strategische literaturrecherche in digitale bibliotheken. In 4.ter HIER Workshop. UVK, 2005.

[13]	A. Komlodi, D. Soergel, and G. Marchionini. Search histories for user support in user interfaces. J. Am. Soc. Inf. Sci. Technol., 57(6):803–807, 2006.

[14]	S. Kriewel, C.-P. Klas, A. Schaefer, and N. Fuhr. Daffodil - strategic support for user-oriented access to heterogeneous digital libraries. D-Lib Magazine, 10(6), June 2004. http://www.dlib.org/dlib/june04/kriewel/06kriewel.html.

[15]	C. C. Kuhlthau. Longitudinal case studies of the information search process of users in libraries. Library & Information Science Research, 10:257–304, 1988.

[16]	P. Landwich, M. Hemmje, and N. Fuhr. Ansatz zu einem konzeptionellen modell für interaktive information-retrieval-systeme mit unterstützung von informationsvisualisierung. In Proceedings ISI, pages 327–332, 2007.

[17]	J. H. Larkin and H. A. Simon. Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11:65–100, 1987.

[18]	F. Müller, P. Klein, T. Limbach, and H. Reiterer. Visualization and interaction techniques of the visual metadata browser vismeb. In I-Know, 2003.

[19]	N. Pharo. A new model of information behaviour based on the search situation transition schema. Inf. Res., 10(1), 2004.

[20]	M. L. Resnick and M. W. Vaughan. Best practices and future visions for search user interfaces. J. Am. Soc. Inf. Sci. Technol., 57(6):781–787, 2006.

[21]	D. E. Rose. Reconciling information-seeking behavior with search user interfaces for the web. J. Am. Soc. Inf. Sci. Technol., 57(6):797–799, 2006.

[22]	T. Saracevic and L. Covi. Challenges for digital library evaluations. In Proceedings of the American Society for Information Science, volume 37, pages 341–350, 2000.

[23]	A. Schaefer, M. Jordan, C.-P. Klas, and N. Fuhr. Active support for query formulation in virtual digital libraries: A case study with DAFFODIL. In A. Rauber, S. Christodoulakis, and A. M. Tjoa, editors, Research and Advanced Technology for Digital Libraries, 9th European Conference, ECDL 2005, volume 3652 of Lecture Notes in Computer Science, Vienna, Austria, September 18-23 2005. Springer.

[24]	H. Thimbleby. Press On: The principles of interaction programming. The MIT Press, Cambridge, Massachusetts, 2007.

[25]	P. Thomas and D. Hawking. Evaluation by comparing result sets in context. In Proceedings of the 15th ACM international conference on Information and knowledge management, pages 94–101. ACM Press, 2006.

[26]	T. Wilson. Human information behavior. Informing Science: The International Journal of an Emerging Transdiscipline, 3(2):49–56, 2000.

[27]	Y. Xu. The dynamics of interactive information retrieval behavior, Part I: An activity theory perspective. J. Am. Soc. Inf. Sci. Technol., 58(7):958–970, 2007.