Finding and Using Strategies for Search Situations in Digital Libraries
The problem of supporting end users in finding and deriving good strategies for satisfying their information need is well known. Depending on the level of system involvement, a search system might suggest useful strategies for completing a specific task, either upon user request or pro-actively, or might offer to automatically execute strategies for very common search problems with little or no user intervention. This paper proposes research on finding search strategies by analysing search logs and on using these findings to suggest situationally appropriate strategies to the user.
The problem of supporting end users in finding good strategies for satisfying their information need is well known. Despite the advances in making information search technology available to the larger public instead of just search professionals, the effective use of these information retrieval technologies remains a challenge [1, 2, 3]. All information search systems support low-level search actions, but in  three higher levels of abstraction for categorising search functionalities were introduced, namely tactics, stratagems, and strategies. While there has been extensive work done in supporting the user in executing specific moves, tactics, and even stratagems, there still exists a real need for support on the level of strategies [5, 6,7].
DAFFODIL (http://www.daffodil.de) is a federated digital library system aimed at providing strategic support of the user . Evaluations of the system have shown that users mostly employ only simple or non-effective strategies and don't know how to escape from critical situations, or how to start a search when the search goal is unclear and only vaguely defined . They often don't even realize their strategic problems and missed search opportunities.
Depending on the level of system involvement, an ideal search system would suggest useful strategies for completing a specific task, either upon user request or pro-actively, or might offer to automatically execute strategies for very common search problems with little user intervention. In addition, to help users improve their search experience and results, the system should also raise awareness of the strategic aspect of searching, providing users with the procedural knowledge to form appropriately organised sub-goals, and help them with planning a search. Thus, it should explain the reasoning behind strategic suggestions, i.e., show users why a strategy might be a good idea, and how they can use these suggestions to create better search strategies themselves.
For common search tasks, like the task of gathering information for reviewing a paper, it would be useful to provide ready-made strategies that users can adapt to their own needs, or that guide novice users through their searches. In other cases, users might arrive at points where they cannot reach their search goals on their own, and would profit from a system that can recognise the situation and suggest strategies that will help users to extricate themselves from the search problem.
Two interesting challenges present themselves immediately: (1) the task of finding "good" strategies, and (2) the task of selecting the most appropriate strategy for the user in his or her current situation. Starting from this realization I want to concentrate my research on the following questions:
For this, I want to analyse the search sessions of experienced DAFFODIL users, expecting to find common paths or patterns in the extensive existing search logs collected from hundreds of search sessions. These will be used as building blocks for suggested strategies. It will also be necessary to clearly define the search situation of an end user, so that suggestions can be based upon it.
I will first describe the current DAFFODIL system, specifically the tools and stratagems supported by the system, as well as the existing logging framework. Following that, I will present a first definition of search situations and detail some of their important aspects. In section 4, I will describe search paths and how search strategies can be found and used. After this I will outline the work plan for this thesis, and conclude with a discussion of related work.
Using DAFFODIL as a research tool
DAFFODIL is a front-end to federated, heterogeneous digital libraries aiming for strategic support of users during the information-seeking process by offering a variety of functions for searching, exploring and managing digital library objects [10, 11]. The strategic support is currently implemented in the form of high-level search functions, so-called stratagems. Through the tight integration of the various stratagems within the tool-based search desktop, DAFFODIL already supports users during their search activities.
From the perspective of this research proposal two main aspects of DAFFODIL are of interest: the large number of possible actions that can be combined into complex search plans, and the high level of logging support, which is necessary for finding paths from collected search sessions.
In the following I will briefly cover the main tools that will form the building blocks of search strategies, and then provide an overview of the logging format used by DAFFODIL.
Tools and Stratagems
DAFFODIL combines browsing and searching approaches to information seeking. During their search, users are free to choose a search strategy and currently receive no help in formulating one, but the system provides access to well-known search tactics and stratagems, available through the various tools of the search system.
The search tool allows for querying a number of federated digital libraries through a simple search form, or by using Boolean queries. The results are presented in a unified, ranked list, with each result or the entire list as a possible starting point for further actions.
By using the extraction tool on a result list, users are provided with lists of new query terms, important authors, journals, or conferences related to their query. Results can be filtered using extracted terms or terms provided by the user.
All the digital library objects, including metadata, authors, journals, or terms, can be stored temporarily in a search basket, or stored persistently in a personal or shared library.
While this survey of possible tool combinations, tactics and stratagems supported by DAFFODIL is not exhaustive, it shows that search strategies within the framework of this system can become quite complex. An important application of finding useful search plans would be to provide users with the ability to successfully navigate through this system, structuring their search plans into sub-goals, and helping them to find appropriate tool combinations to reach those.
DAFFODIL does a large amount of logging, including but not limited to the users' actions during a search session. These logs follow an XML logging schema that will hopefully allow for easy analysis. Not all parts of the schema are already implemented, as this is a work in progress.
Within this logging schema, a search session is composed of a sequence of events contained in distinct search contexts. In theory a search session can contain several contexts, with a context shift indicating a change in the fundamental parameters of a search session, e.g., a shift in the search task, the addition of a second searcher to the session, or a similar change. Currently each session has only one context.
Log events are differentiated by their type, including search and browse events, store events, display events, and others. The event logs further detail the exact tool used in the event, query or filter terms, results, input source (e.g., from which other tool a query term originated), search times, and other important information.
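The session, context and event structure described above can be pictured with a small data model. This is only an illustrative sketch: the class and field names are my own assumptions and do not reproduce the actual DAFFODIL XML logging schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical field names; the real XML schema is not reproduced here.
@dataclass
class LogEvent:
    event_type: str                        # e.g. "search", "browse", "store", "display"
    tool: str                              # tool used in the event
    terms: List[str] = field(default_factory=list)  # query or filter terms
    result_count: Optional[int] = None     # number of results, if any
    input_source: Optional[str] = None     # tool from which a term originated
    duration_s: float = 0.0                # time spent on the event


@dataclass
class SearchContext:
    events: List[LogEvent] = field(default_factory=list)


@dataclass
class SearchSession:
    session_id: str
    contexts: List[SearchContext] = field(default_factory=list)

    def events(self) -> List[LogEvent]:
        """Return all events of the session in logged order, across contexts."""
        return [e for c in self.contexts for e in c.events]


# A session with a single context, as is currently the case.
session = SearchSession("s1", [SearchContext([
    LogEvent("search", "search tool", ["digital", "library"], result_count=42),
    LogEvent("store", "personal library", input_source="search tool"),
])])
event_types = [e.event_type for e in session.events()]
```

Flattening the contexts into one event sequence is a convenience for later path analysis; with multiple contexts per session, one would probably analyse each context separately.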
Log Event Types:
Search situations have been defined in previous works, e.g., in [13, 14, 6]. I will use the term search situation to describe a specific point during a user's search that was reached by a sequence of user and system actions. A situation is composed of the current state of the search and the history of the session.
A search state is characterised by system and user parameters. Some of the parameters collected in  are:
Depending on the search system or the service, it might also be interesting to consider the number and kind of search terms used, the sources or collections selected, the results from the last action, or time spent on the last action. As system parameters can be logged easily within the logging framework of DAFFODIL, and methods and tools for analysing them already exist, I plan to concentrate on these without losing sight of the user.
In a specific situation, several more generic situations might hold, i.e., different sub-sets of the situational parameters indicate different situations. Strategic suggestions could be given based on any or all of them, they can be combined into a more complex suggestion, or provided in a (ranked) list, leaving the user to choose between several of them.
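The idea that several generic situations can hold at once, each defined over a sub-set of the situational parameters, can be sketched as a set of predicates. The situation names, parameter keys, suggestion texts and weights below are hypothetical and serve only to illustrate the combination into a ranked list.

```python
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]

# Hypothetical generic situations: each looks only at a sub-set of parameters.
SITUATIONS: Dict[str, Callable[[Params], bool]] = {
    "empty_result": lambda p: p.get("last_action") == "search"
    and p.get("result_count") == 0,
    "long_query": lambda p: len(p.get("terms", [])) > 5,
    "many_results": lambda p: p.get("result_count", 0) > 1000,
}

# Hypothetical suggestions with hand-assigned weights used for ranking.
SUGGESTIONS: Dict[str, List[Tuple[str, float]]] = {
    "empty_result": [("broaden the query with related terms", 0.8)],
    "long_query": [("drop the least specific terms", 0.6)],
    "many_results": [("filter the result list with extracted terms", 0.7)],
}


def matching_situations(params: Params) -> List[str]:
    """Names of all generic situations that hold for the given parameters."""
    return sorted(name for name, holds in SITUATIONS.items() if holds(params))


def ranked_suggestions(params: Params) -> List[str]:
    """Suggestions of all matching situations, highest weight first."""
    candidates = [s for name in matching_situations(params)
                  for s in SUGGESTIONS[name]]
    return [text for text, _ in sorted(candidates, key=lambda s: -s[1])]


params = {"last_action": "search", "result_count": 0,
          "terms": ["a", "b", "c", "d", "e", "f"]}
```

Here both `empty_result` and `long_query` hold for `params`, so the user would be offered two suggestions, ordered by weight.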
An example for such a generic situation in DAFFODIL might be one where the last action of the user was a search action that returned no results. This situation could then be further defined by other parameters or the situations that led to it, as any number of reasons can be the cause:
An additional difficulty is introduced when regarding the situation at the beginning of a search session. At that time most of the parameters that define a search situation are the same for all users. While some insight in the specific situation of the user might be gained from looking at previous searches, or by user modelling, this will only help to a limited extent, especially if the user is new to the system, or previously used the system for different purposes.
When considering the user's side of a situation, either at the beginning or during a search, important aspects are:
These aspects, if deduced by the system or made known to the system by the user through a questionnaire, can substitute for or enhance the system parameters harvested during a search. A search that takes a long time is not in itself an indicator that the user is in a problematic situation, if he or she is willing or expecting to spend this time in pursuit of a search goal.
Certainly the user's information need is an important factor for defining the situation, and it will be necessary to find a way to specify it. Users have differing types of information need depending on what they are trying to find and why they're trying to find it. If the most common information needs for users of a specific search system can be determined, this could also be used as a basis for finding and suggesting common useful strategies to address them [15, 16].
As a first step, plans for some common search tasks could be offered, with the user asked to specify the search using a short questionnaire. Common search tasks might be the search for a known item, searching to complete bibliographic data, or explorative searching to gain a first orientation. For these tasks simple, predefined strategies could then be suggested. Various classifications of search tasks and work tasks, as well as strategies for them, exist in the literature, and there are attempts to build a unified framework [1, 17].
Search Paths and Strategies
The sequence of search situations and the transitional actions during a search session can be seen as a path along which the user navigates towards the search goal . Every information search can be described by a specific search path. This holds especially true for web information search systems, where these search paths have been researched and mined to some extent [19, 20, 21].
But these paths also exist in search systems like DAFFODIL that provide several different search services, and where an information flow between these services exists and can be observed, i.e., where results or information from the use of one service can be used in another service.
I want to show that over a large number of successful search sessions of experienced users clear patterns will emerge, which could be likened to dirt trails that result from many people treading along the same path. Further, I want to show that these "trails" can be used as strategic suggestions to less experienced or novice users. These trails could be found from the collected search paths using pattern mining techniques, as described in [20, 21].
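As a rough illustration of what mining such trails could look like, the following sketch counts contiguous action sequences of a fixed length and keeps those that occur in a minimum number of sessions. It is a deliberately simple frequency count under my own assumptions, not the algorithms of [20, 21], and the action names are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Trail = Tuple[str, ...]


def frequent_trails(paths: Sequence[Sequence[str]], length: int,
                    min_support: int) -> Dict[Trail, int]:
    """Contiguous action sequences of the given length that occur in at
    least min_support search paths (each path counts at most once)."""
    support: Counter = Counter()
    for path in paths:
        # Collect the distinct windows of this path, then count them once.
        windows = {tuple(path[i:i + length])
                   for i in range(len(path) - length + 1)}
        support.update(windows)
    return {trail: n for trail, n in support.items() if n >= min_support}


# Hypothetical search paths, one list of actions per session.
paths: List[List[str]] = [
    ["search", "extract", "filter", "store"],
    ["search", "extract", "filter", "browse"],
    ["browse", "search", "extract", "store"],
]
trails = frequent_trails(paths, length=2, min_support=2)
```

For these three paths, the pair search then extract is found in all three sessions and extract then filter in two. Real trails would likely need variable lengths and gaps between actions, which this sketch does not handle.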
After finding patterns, it will be the next task to describe the resulting trails as strategies. In  a framework for describing and executing search strategies for the DAFFODIL system has been developed that can be used for suggesting not only a next step during the search, but presenting a user with a fully modelled search plan or strategy that can be exploited to reach the search goal.
A mechanism has to be found for suggesting the most promising strategies among those applicable to a specific search situation. Possibilities include a knowledge base of hand-crafted or semi-automatically acquired rules, or a case-based reasoning (CBR) approach. CBR seems like an attractive solution for finding an appropriate strategy based on a person's previous interaction with the system (the case), and has been suggested for this purpose in .
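The retrieval step of a case-based approach can be sketched as a nearest-case lookup. The example below is only a sketch under stated assumptions: the situation features, the stored strategies and the simple overlap similarity are hypothetical, and a full CBR cycle would also adapt and retain cases.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Case:
    situation: Dict[str, str]   # hypothetical situational features
    strategy: str               # strategy that was used in this case


def similarity(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Fraction of features (over both situations) with identical values."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    same = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return same / len(keys)


def retrieve(case_base: List[Case], query: Dict[str, str],
             k: int = 1) -> List[Case]:
    """The k stored cases most similar to the query situation."""
    ranked = sorted(case_base,
                    key=lambda c: similarity(c.situation, query),
                    reverse=True)
    return ranked[:k]


case_base = [
    Case({"task": "known_item", "last_action": "search", "results": "none"},
         "check spelling, then search by author"),
    Case({"task": "exploratory", "last_action": "search", "results": "many"},
         "extract frequent terms and filter"),
]
query = {"task": "exploratory", "last_action": "extract", "results": "many"}
best = retrieve(case_base, query)[0]
```

The query shares two of three feature values with the second case and none with the first, so the second case's strategy would be proposed.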
To get a baseline against which to compare strategy suggestions from previous users' searches, and also to evaluate the general usefulness of suggesting strategies, I am collecting strategies from the literature for specific search tasks.
A questionnaire tool is currently being integrated into DAFFODIL that will be used to collect data about the user and his or her current search task. One of several prepared strategies can then be presented to the user based on the answers to the questionnaire. These first strategies will initially be presented in the form of a textual, advisory overview or a clickable list of search steps. Later on, they will be modelled using the existing strategy framework, which will allow for (semi-)automatic execution of these search plans or parts of them.
Based on these rather static strategy suggestions, I want to evaluate the general acceptance of such suggestions and their potential use by the end users of DAFFODIL.
Parallel to the evaluation, strategic suggestions will be introduced for easily recognised problem situations, in addition and as a complement to the search-task-based suggestions derived from the users' answers to the questionnaire. By monitoring the log data collected by DAFFODIL, situational parameters as described in section 3 will be compared to a set of (initially hand-crafted) rules, and suggestions will be made based upon these rules. Usually more than one possible suggestion will apply in any situation. In this case they will all be presented to the user to choose from.
I plan to use a parameter-learning agent that is going to be developed within a student project to derive better rules, which can then suggest the most appropriate strategy for the user's current situation. It should learn which of several possible suggestions in a situation were successfully employed by other users and which were not.
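One very simple way to learn which suggestions work in a situation is to keep success counts per situation and suggestion and rank by a smoothed success rate. The sketch below makes no claim about the planned agent: the class, the notion of "success" passed to `record`, and the example situation and suggestion names are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (situation, suggestion)


class SuggestionStats:
    """Tracks how often a suggestion was shown and successfully employed."""

    def __init__(self) -> None:
        self._shown: Dict[Key, int] = defaultdict(int)
        self._successes: Dict[Key, int] = defaultdict(int)

    def record(self, situation: str, suggestion: str, success: bool) -> None:
        key = (situation, suggestion)
        self._shown[key] += 1
        if success:
            self._successes[key] += 1

    def score(self, situation: str, suggestion: str) -> float:
        """Laplace-smoothed success rate; unseen suggestions score 0.5."""
        key = (situation, suggestion)
        return (self._successes.get(key, 0) + 1) / (self._shown.get(key, 0) + 2)

    def rank(self, situation: str, suggestions: List[str]) -> List[str]:
        return sorted(suggestions,
                      key=lambda s: self.score(situation, s), reverse=True)


stats = SuggestionStats()
stats.record("empty_result", "broaden the query", True)
stats.record("empty_result", "broaden the query", True)
stats.record("empty_result", "check spelling", False)
ranking = stats.rank("empty_result",
                     ["check spelling", "try another source", "broaden the query"])
```

The smoothing keeps a suggestion that has never been shown (score 0.5) ahead of one that has only failed, so new suggestions still get a chance to be presented.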
In a next step, branching will be added to the existing strategies, making them more flexible and allowing more experienced users to adapt them to their own needs.
With this general framework in place, I want to use the trails that are mined from the search paths contained in DAFFODIL's session logs (using, of course, logs from sessions without suggestions). The trails will be fed into the suggestion tool and their success compared to that of the human-designed strategic suggestions. In this way, I hope to show the usefulness of commonly used search paths as a basis for search strategies.
It will be necessary to evaluate if guidance derived in such a way is accomplishing the goal of better search results for the user and an improved search experience. Can users learn to build better strategies on their own by using or benefitting from strategic suggestions derived from the trails of other users? In addition to user studies by interviews or questionnaires, this evaluation could be supported by quantitative measures, like search time or number of actions, number of saved or viewed results, or number of aborted searches.
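The quantitative measures mentioned above could be computed directly from a session's event log. The following is a minimal sketch assuming each event is a dictionary with a hypothetical `type` and `duration_s` key; counting store and display events as saved and viewed results is my own simplification.

```python
from typing import Any, Dict, List


def session_measures(events: List[Dict[str, Any]]) -> Dict[str, float]:
    """Simple per-session measures derived from a list of logged events."""
    return {
        "search_time_s": sum(e.get("duration_s", 0.0) for e in events),
        "actions": len(events),
        "saved": sum(1 for e in events if e["type"] == "store"),
        "viewed": sum(1 for e in events if e["type"] == "display"),
    }


# Hypothetical event log of one session.
events = [
    {"type": "search", "duration_s": 12.0},
    {"type": "display", "duration_s": 30.0},
    {"type": "display", "duration_s": 20.0},
    {"type": "store", "duration_s": 1.0},
]
measures = session_measures(events)
```

Comparing such measures between sessions with and without suggestions would complement, not replace, the interviews and questionnaires.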
In  Bhavnani et al. present the idea of a Strategy Hub. A strategy hub provides domain specific search procedures (or strategies) for web searching. They noted the importance of procedural search knowledge in addition to system and domain knowledge, and the need to make this knowledge explicit within a search system. In collaboration with domain experts they identified critical search procedures and made them available to users. They succeeded in showing that these search procedures can be generalised, and that they could improve the efficiency, effectiveness and satisfaction of users.
Belkin et al. describe information retrieval (IR) in terms of information-seeking behaviours or information-seeking strategies (ISSs) [24, 25]. They present a characterisation of such behaviours using a small set of dimensions: goal and method of interaction, retrieval mode, and type of resource. In the course of a single information-seeking episode, users engage in several such ISSs, moving from one to the next. Within DAFFODIL many of the ISSs described correspond to specific tools, and the movement of the user between the various tools is similar to the movement from one ISS to another.
An IR system is suggested that provides mixed-initiative support for specific procedures during information-seeking episodes, composed of complex combinations of ISSs, which change and branch as the user interacts with the system. In  the MERIT system based on these concepts is presented. This system uses scripts derived by case-based reasoning techniques to guide users through an information-seeking episode. The search process is presented as a conversational interaction between system and user.
Another approach was taken by Brajnik et al. in . Based on research that has shown that non-expert users only use simple actions, don't know how to react in difficult search situations, and fail to use information retrieval systems to their full potential, they integrated a strategic help module into FIRE, a user-interface to a Boolean IR system. This module takes a collaborative coaching approach where the interface can be viewed as a problem-solving partner of the user.
The rules used by the system to give strategic help are hand-crafted, as are the suggestions given. In addition, the suggested actions are usually on the level of tactics or (rarely) stratagems according to the classification by Bates . They do not show a larger picture of the search, or suggest a search plan beyond the next step. The system was evaluated and the general usefulness of strategic help suggestions has been demonstrated.
In  Xie presents the results of a study on the shifts in user intentions and information-seeking strategies during an information-seeking episode. It is demonstrated that users engage in different behaviours during a single information search. Four types of intention shifts and three types of shifts of ISSs are derived from the collected data, and developmental directions for IR systems are suggested based on the findings. In particular:
In  Pharo presents the search situation transition schema as a model for information-seeking behaviour. Search transitions are defined as periods of meta-data interaction (search actions), which alternate with search situations in which the user interacts with actual data or documents. An entire search process is then defined as an alternating series of transitions and situations, starting with a transition and ending with a search situation. While Pharo shows the usefulness of this methodical tool for analysing information seeking behaviour in the context of web search systems , it could also be used in Digital Library systems like DAFFODIL that offer a richer set of interactions.
 Bhavnani, S.K., Drabenstott, K., Radev, D., Towards a unified framework of IR tasks and strategies. In Proceedings of ASIST. (2001) 340-354.
 Drabenstott, K.M., Do nondomain experts enlist the strategies of domain experts? Journal of the American Society for Information Science and Technology 54 (2003) 836-854.
 Wildemuth, B.M., The effects of domain knowledge on search tactic formulation. Journal of the American Society for Information Science and Technology 55 (2004) 246-258.
 Bates, M.J., Where should the person stop and the information search interface start? Information Processing and Management 26 (1990) 575-591.
 Brajnik, G., Mizzaro, S., Tasso, C., Evaluating user interfaces to information retrieval systems: A case study on user support. In Frei, H.P., Harman, D., Schäuble, P., Wilkinson, R., eds., Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, ACM (1996) 128-136.
 Brajnik, G., Mizzaro, S., Tasso, C., Venuti, F., Strategic help in user interfaces for information retrieval. Journal of the American Society for Information Science and Technology 53 (2002) 343-358.
 Hsieh-Yee, I., Effects of search experience and subject knowledge on online search behavior: Measuring the search tactics of novice and experienced searchers. Journal of the American Society for Information Science 44 (1993) 161-174.
 Klas, C.P., Fuhr, N., Schaefer, A., Evaluating strategic support for information access in the DAFFODIL system. In Heery, R., Lyon, L., eds., Research and Advanced Technology for Digital Libraries. Proc. European Conference on Digital Libraries (ECDL 2004). Lecture Notes in Computer Science, Heidelberg et al., Springer (2004).
 Schaefer, A., Jordan, M., Klas, C.P., Fuhr, N., Active support for query formulation in virtual digital libraries: A case study with DAFFODIL. In Rauber, A., Christodoulakis, C., Tjoa, A.M., eds., Research and Advanced Technology for Digital Libraries. Proc. European Conference on Digital Libraries (ECDL 2005). Lecture Notes in Computer Science, Heidelberg et al., Springer (2005).
 Gövert, N., Fuhr, N., Klas, C.P., Daffodil: Distributed agents for user-friendly access of digital libraries. In Borbinha, J., Baker, T., eds., Research and Advanced Technology for Digital Libraries. Proc. European Conference on Digital Libraries (ECDL 2000). Volume 1923 of Lecture Notes in Computer Science, Heidelberg et al., Springer (2000) 352-355.
 Kriewel, S., Klas, C.P., Schaefer, A., Fuhr, N., DAFFODIL - strategic support for user-oriented access to heterogeneous digital libraries. D-Lib Magazine 10 (2004). <doi:10.1045/june2004-kriewel>.
 Fuhr, N., Klas, C.P., Schaefer, A., Mutschke, P., Daffodil: An integrated desktop for supporting high-level search activities in federated digital libraries. In Research and Advanced Technology for Digital Libraries. 6th European Conference, ECDL 2002, Heidelberg et al., Springer (2002) 597-612.
 Pharo, N., The Search Situation and Transition method schema: a tool for analyzing Web information search processes. Ph.D. thesis, University of Tampere, Finland (2002) <http://acta.uta.fi/pdf/951-44-5355-7.pdf>.
 Pharo, N., A new model of information behaviour based on the search situation transition schema. Information Research 10 (2004) paper 203, available at <http://InformationR.net/ir/10-1/paper203.html>.
 Bhavnani, S.K., Christopher, B.K., Johnson, T.M., Little, R.J., Peck, F.A., Schwartz, J.L., Strecher, V.J., Strategy hubs: next-generation domain portals with search procedures. In Proceedings of the conference on Human factors in computing systems, ACM Press (2003) 393-400.
 Rosenfeld, L.B., Information needs analysis. Web article (2002) <http://louisrosenfeld.com/home/bloug_archive/000139.html> [2005-06-24, 16:06].
 Järvelin, K., Ingwersen, P., Information seeking research needs extension towards tasks and technology. Information Research 10 (2004) paper 212, available at <http://InformationR.net/ir/10-1/paper212.html>.
 Lalmas, M., Ruthven, I., A framework for investigating the interaction in information retrieval. In Proceedings of the 9th European-Japanese Conference on Information Modelling and Knowledge Bases (EJ-IMKB), Iwate, Japan (1999).
 Catledge, L.D., Pitkow, J.E., Characterizing browsing strategies in the World-Wide Web. Computer Networks and ISDN Systems 27 (1995) 1065-1073.
 Chen, M.S., Park, J.S., Yu, P.S., Data mining for path traversal patterns in a web environment. In Sixteenth International Conference on Distributed Computing Systems. (1996) 385-392.
 Cooley, R., Srivastava, J., Mobasher, B., Web mining: Information and pattern discovery on the world wide web. In Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97). (1997).
 Frankmölle, S., Strategien zur Suche in Digitalen Bibliotheken. Master's thesis, Universität Dortmund, FB Informatik (2004).
 Belkin, N.J., Cool, C., Stein, A., Thiel, U., Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems with Applications 9 (1995) 1-30, <http://www.cmu.edu/~antoine/11743/eswa.pdf>.
 Belkin, N.J., Interaction with texts: Information retrieval as information seeking behavior. In Knorz, G., Krause, J., Womser-Hacker, C., eds., Information Retrieval '93. Von der Modellierung zur Anwendung. Proc. d. 1. Tagung Information Retrieval, Konstanz (1993) 55-66.
 Belkin, N.J., Marchetti, P., Cool, C., BRAQUE: Design of an interface to support user interaction in information retrieval. Information Processing and Management 29 (1993) 325-344.
 Xie, H.I., Shifts of interactive intentions and information-seeking strategies in interactive information retrieval. Journal of the American Society for Information Science 51 (2000) 841-857.
Sascha Kriewel is a research assistant and Ph.D. student at the University of Duisburg-Essen, Germany, where he is part of the information systems workgroup. He is currently working in the DAFFODIL project, where his interests lie in information search strategies and information visualisation. Sascha has a diploma degree in computer science from the University of Dortmund, Germany.
University of Duisburg-Essen,