Volume 5 Issue 3
Winter 2009
ISSN 1937-7266

Personalizing Information Retrieval Using Task Features,
Topic Knowledge, and Task Products

Jingjing Liu

School of Communication and Information
Rutgers University
4 Huntington Street, New Brunswick, NJ 08901
jingjing@eden.rutgers.edu

ABSTRACT

Personalization of information retrieval tailors search towards individual users to meet their particular information needs. Personalization systems obtain additional information about users and their contexts beyond the queries they submit, and use this information to bring the desired documents to the top ranks. Such additional information can come from many explicit or implicit sources, including relevance feedback, user behaviors, contextual factors, and so on. This study looks specifically at the following factors, which are likely to provide significant evidence for personalization: the features of users’ work tasks, users’ degrees of familiarity with work task topics, and the task products that users generate in accomplishing their tasks. The study explores whether or not taking account of task features and topic familiarity helps predict a document’s usefulness from its display time. The study also investigates whether or not expanding queries with significant terms extracted from task products and used/saved pages improves search performance. To these ends, a controlled lab experiment was conducted to gather the relevant data. Twenty-four participants were recruited, each coming three times within a two-week period and working on three sub-tasks of a general work task. Data were collected by two major means: logging software that records user-system interactions, and questionnaires that elicit users’ background information and their perceptions of a number of aspects of the study. The results are expected to provide significant evidence on personalizing search by taking account of contextual factors and thus to extend the literature on the personalization of information retrieval.

Categories and Subject Descriptors

H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – relevance feedback, search process

General Terms

Performance, Design, Experimentation, Human Factors

Keywords

Personalization of IR, contextual factors in IR, desktop repository

1. INTRODUCTION

As the amount of information on the Web grows, it becomes increasingly difficult for people to find documents that meet their particular needs. Traditional search engines provide the same search results for the same query no matter who submits it, and under what circumstances. Documents that are most relevant to a specific user may not appear at the top of his/her search result list. Looking for the desired document beyond the first result page, or issuing new queries, requires additional time and effort, which often frustrates users. Furthermore, people are often unable to articulate appropriate queries, especially when they have little or no knowledge about their search topics.

Personalization is a solution to such challenges. Since the late 1990s, much research effort has been devoted to applying personalization in operational information retrieval (IR) system design. Taking account of differences among users and their contexts, personalization systems adapt search results to particular users by placing their desired documents at the top of the ranking. The ultimate goal of personalization is to make users’ interaction with search systems “as effective and pleasurable as possible” [2].

Personalization of information retrieval can be applied mainly in three aspects: search result content, interface, and interaction mode [2], of which content personalization is the most frequently seen. Many types of evidence, providing additional information about the user and/or his/her context, can be used for personalization. The sources of such evidence include explicit user preferences, user behaviors from which implicit user interests can be inferred, and contextual factors that help elicit information about user interests [2]. Because it does not interrupt users in their search and does not require additional user effort, the implicit approach has been adopted by most current personalization systems, in which user interests are inferred from behavioral or contextual sources, for example, dwell time (e.g., [10, 11]), browsing history (e.g., [10]), query history (e.g., [20]), and desktop repositories (e.g., [4, 5, 22]), to name a few.

In everyday life, what leads a person to information search is usually an Anomalous State of Knowledge [1] arising in the pursuit of a certain goal [19]. A great deal of information search is associated with people’s work tasks, the activities that they perform to fulfill their work responsibilities [16]. A work task environment contains rich information that could help interpret the user’s information need, but many aspects of this environment have not been studied. The current research hopes to contribute to content personalization by looking at the following three types of contextual factors in a work task environment: 1) task features, 2) the knowledge that the user has of the search task topic, and 3) the desktop information that the user has generated for his/her work task, which is essentially the work task product(s). More specifically, this study aims to explore how these factors may help implicitly predict document usefulness, in particular by examining the possible interactions of these factors with user behaviors in performing personalization.

2. RELATED WORK

2.1 Task and Personalization

Task has been found in previous research to be helpful in predicting document usefulness from user behaviors, such as dwell time (or display time, i.e., the time a user spends on an information object). Kelly & Belkin [11] found that display time averaged over a group of users is not likely to predict document usefulness, nor is display time for a single user when contextual factors are not taken into account. In particular, display time differed significantly across specific tasks and specific users. This demonstrated that inferring document usefulness from dwell time should be tailored towards individual tasks and/or users. Yet, their study did not examine how to incorporate contextual factors.

This pending problem was addressed by White & Kelly [28]. By examining the interactions between dwell time and the two factors of user and task, they explored whether additional information from the user and/or the task helps reliably establish a dwell time threshold to predict document usefulness. They found that tailoring the display time threshold based on task information improved implicit relevance feedback (IRF) performance. In other words, display time was shown to successfully predict document usefulness when task information is considered. This study is a successful case of taking the interaction between contextual factors and dwell time into consideration when predicting document usefulness.
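The core idea can be sketched as follows: instead of one global dwell-time threshold for treating a viewed page as implicit positive feedback, a separate threshold is derived per task. This is a minimal sketch assuming a log of (task, dwell time) observations; using the per-task mean as the threshold is an illustrative assumption, not the exact procedure in [28].

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

def per_task_thresholds(views: List[Tuple[str, float]]) -> Dict[str, float]:
    """views: (task_id, dwell_seconds) pairs taken from interaction logs.
    Returns one dwell-time threshold per task (here, the per-task mean)."""
    by_task: Dict[str, List[float]] = defaultdict(list)
    for task_id, dwell in views:
        by_task[task_id].append(dwell)
    return {task_id: mean(dwells) for task_id, dwells in by_task.items()}

def implicitly_relevant(task_id: str, dwell: float,
                        thresholds: Dict[str, float]) -> bool:
    """Treat a page view as implicit positive feedback if its dwell time
    reaches the threshold learned for that task."""
    return dwell >= thresholds[task_id]
```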

Nevertheless, there are still issues calling for further research. White & Kelly [28] used only 7 participants, and the tasks were all self-identified by the participants, since it was a naturalistic study. In classifying the users’ tasks, the authors collapsed them into several categories, such as online shopping, emailing, and researching, according to task content. However, different tasks cannot be used for personalization in a more general sense unless they are classified according to certain generic features, for example, task complexity, difficulty, task product, etc.

One such feature is the stage of the task. It has attracted much attention in studies of information seekers’ affective, cognitive, and physical changes during information seeking (e.g., [13], [15]). Previous studies (e.g., [25]) have found that as the task stage progressed, users’ search tactics changed: they were less likely to start their initial queries with all the search terms, were more likely to enter only a fraction of them, and tended to use more synonyms and parallel terms. It would be beneficial if systems could be designed to provide tools that help users build their conceptual structures in the initial stage of a task. However, more research effort is needed to study how task stage can help adapt search to different users.

It is not always easy to accurately split task stages in empirical research, because such stages do not often or necessarily have apparent boundaries. In their study analyzing the effect of relevance feedback, White, Ruthven, & Jose [29] divided tasks, based on the logged user-system interaction data, into three stages of equal time length: “start”, “middle”, and “end”. They found that IRF was used more in the middle of the search than at the beginning or the end, whereas ERF (explicit relevance feedback) was used more towards the end. They further found that task complexity affected when the user reached the most interactive point in using the IRF-based system, indicating that for more complex tasks, users may spend more time initially interpreting search results before interacting with them. Another way to operationalize the stage of task is found in Lin [18], in which the user’s task was manipulated as different sub-tasks to be completed in different search sessions. The task scenario in his study required participants to make a vacation plan through three steps in three sessions: identifying candidate places, comparing candidate places and choosing one to go to, and making a plan for the trip. Both ways to operationalize task stage ([29] and [18]) are arbitrary to some extent, but the latter is closer to the situation in people’s everyday life when solving complex tasks.

If the work task consists of multiple sub-tasks, the relationship between the sub-tasks also seems necessary to take into account, because the order of these sub-tasks may vary during the process of task completion. One such work was Toms et al. [24], which classified tasks based on the conceptual structures of their sub-tasks. The two types of tasks were: the parallel, where the search uses multiple concepts that exist on the same level in a conceptual hierarchy, and the hierarchical, where the search uses a single concept for which multiple attributes or characteristics are sought. As with other task features, further approaches are needed to better utilize the information that task structure can provide for personalization.

2.2 Domain/Topic Knowledge and Personalization

Many studies have examined the effect of domain knowledge or topic knowledge/familiarity on search tactics. They have found that users with different levels of domain or topic knowledge show behavioral differences, and these findings often have implications for various aspects of personalization system design.

A number of these studies demonstrate that domain knowledge affects search tactics. Hsieh-Yee [8] found that, once users have a certain amount of search experience, subject knowledge affects their search tactics: when working with a less familiar topic, they used the thesaurus more for term suggestions, monitored the search more closely, included more synonyms, and tried out more term combinations than when they searched a familiar subject area. Moreover, domain knowledge was found to be associated with people’s ability to choose appropriate search terms [26], efficient selection of concepts to include in the search [30], and the ability to better utilize assistance from a thesaurus [21].

These findings raise two points. First, with the increasing popularity of the Internet and search engines, the majority of web users nowadays are experienced; hence, subject knowledge should affect search tactics in a wider population than before, and becomes an increasingly important factor worth further investigation. Second, such differences imply that it would be beneficial to provide users who have different levels of domain knowledge with different systems or system features that support query formulation and reformulation. More research is needed on how to make use of such behavioral differences to personalize search, for example, to infer users’ knowledge or expertise from their search behaviors (e.g., [27]), or to adapt suitable system features to their knowledge levels to assist their search.

Some other studies looked at the relations between users’ topic familiarity and their behaviors, such as dwell time, viewing, and saving behaviors, and investigated whether such relations can be used for personalization. Kelly & Cool [12] found that as one’s familiarity with a search topic increases, his/her reading time decreases while search efficacy (the ratio of saved documents to total viewed documents) increases. This indicates that it may be possible to infer topic familiarity implicitly from search behavior. Taking a different approach, Kumaran, Jones, & Madani [16] proposed a method to differentiate, by document features, introductory from advanced documents that match different levels of topic familiarity. Certain document features (e.g., stop-word usage, line length) could predict whether a document is introductory or advanced, and hence whether a user who reads it is likely to have low or high familiarity with the topic. This could be useful in implicitly inferring one’s topic familiarity. Conversely, this differentiating method can be effective in biasing search result ranking for people with different levels of familiarity with search topics when such familiarity levels are known.

Another study addressing users’ topic familiarity is Kelly & Belkin [11], which proposed a user modeling system that accounts for contextual factors. They pointed out that topic familiarity may affect the types of information search and behaviors exhibited by the user. They illustrated a likely way in which topic familiarity may affect a user’s reading time on a document: the relationship between the reading times of relevant and non-relevant documents is not simply linear; rather, it could vary in two very different ways according to topic familiarity. For those with a low degree of familiarity, reading times for relevant and non-relevant documents may be similar, but for those with a high degree of familiarity, reading times for relevant and non-relevant documents may be very different. Their concept is intuitively sensible, but no further research hypothesis has been developed, and effort is needed to verify this relationship in a systematic way.

2.3 Desktop Repository as an IRF Source for Personalization

A number of studies have attempted to use the desktop repository for search personalization by treating it as a source of user interest from which useful terms can be extracted for query expansion. Teevan, Dumais, & Horvitz [22] looked at users’ previously issued queries and previously visited Web pages, as well as desktop information such as documents and emails the users had read and created. An evaluation study showed that such an approach improved search performance over a non-personalized baseline system. In terms of the desktop information specifically, using the user’s entire desktop index yielded the best performance, followed by using only the recently indexed content (within the last month) and then only the indexed Web page content. The authors argued that the richness of the information used for representing user interest is important in achieving the best performance. In addition, this study revealed that combining the Web ranking and the personalized ranking yielded a significant improvement over either ranking method alone.

Chirita, Firan, & Nejdl [4] attempted three different desktop-oriented approaches to capturing user interests: summarizing the entire desktop data, summarizing only the desktop documents relevant to each user query, and applying natural language processing techniques to extract compounds from relevant desktop resources. An evaluation study found that query expansion terms generated by summarizing only the documents relevant to each user query were more effective than those generated by summarizing the entire desktop data. Similarly, through the investigation of five techniques for generating query expansion keywords from personal collections, Chirita, Firan, & Nejdl [5] obtained performance improvements with some techniques, especially on ambiguous queries. Overall, simple desktop term frequency and lexical compounds with local desktop analysis performed best. These results seem to be inconsistent with Teevan, Dumais, & Horvitz’s [22] finding that the richer the representation, the better the performance. A closer review shows that the studies used different subsets of desktop information. Teevan and colleagues [22] considered the desktop information from the last month and the viewed Web pages only, which can be viewed as a “partial” but not necessarily “relevant” desktop repository; comparatively, Chirita and colleagues ([4], [5]) looked only at resources relevant to the user’s query. It is therefore not surprising that they found the relevant fraction outperformed the entire desktop repository as an IRF source.
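To make the contrast concrete, the following is a minimal sketch of the “relevant subset” strategy: desktop documents are first filtered by a simple term-overlap score against the current query, and expansion terms are then drawn only from that subset. The tokenizer, scoring rule, and parameter values are illustrative assumptions, not the algorithms actually used in [4] or [5].

```python
import re
from collections import Counter
from typing import List

def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())

def relevant_subset(query: str, desktop_docs: List[str], k: int = 5) -> List[str]:
    """Keep the k desktop documents that share the most terms with the query."""
    q_terms = set(tokenize(query))
    ranked = sorted(desktop_docs,
                    key=lambda d: len(q_terms & set(tokenize(d))),
                    reverse=True)
    return ranked[:k]

def expansion_terms(query: str, desktop_docs: List[str], n: int = 10) -> List[str]:
    """Most frequent non-query terms in the query-relevant desktop subset."""
    q_terms = set(tokenize(query))
    tf = Counter(t for doc in relevant_subset(query, desktop_docs)
                 for t in tokenize(doc) if t not in q_terms and len(t) > 2)
    return [term for term, _ in tf.most_common(n)]
```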

While these approaches have generally achieved promising results, it has been suggested that it is critical to decide when to apply personalization, since it is not always needed, for example, when queries are less ambiguous or more navigational [7, 23]. One way to address this is to detect under what circumstances personalization should be used; meanwhile, it should also be useful to study which personalization algorithms are less vulnerable to circumstances, or more robust in improving search performance. This calls for further research to identify how desktop information can be better utilized in order to build more robust personalization systems and to obtain better personalization performance.

3. RESEARCH MODEL AND RESEARCH QUESTIONS

3.1 A Research Model

As previously mentioned, people often search in everyday life to meet information needs arising from work tasks pursued under certain goals. Based on this concept, a general model of IR in work task environments can be built, consisting of four types of components (illustrated by the columns in Figure 1): goal, various contextual factors and their specific situational values, user behaviors, and document usefulness. Figure 1 shows the relationships among these four sets of components (the factors examined in this study are in grey boxes and italicized).

Unlike many other models in IR, the proposed one considers information seeking in the general work task context. Under a certain goal, an information seeker conducts activities such as work tasks. Meanwhile, he/she carries various types of contextual factors, such as task and knowledge, which convey rich information about the person and his/her search interest. These contextual factors have various details, which we call specific situational values (context being a more general concept than situation, c.f. [6]). For example, the general, umbrella concept of task (as a contextual factor) includes many features: task complexity, stage of task, task product, etc., all of which are specific situational values. Another set of components in this model is user behavior related to the information search, which could include query history, dwell time, printing and saving behaviors, etc. Finally, the ultimate target of information retrieval, and also of personalization, is document usefulness.

Various types of relationships exist among these factors. User behaviors may be able to predict document usefulness (e.g., [10]), or they may not reliably do so (e.g., [11]). User behaviors may also be affected by contextual factors/situational values, or may be used to predict situational values (e.g., [16]). In addition, user behaviors may interact with contextual factors/situational values and predict document usefulness (e.g., [28]).



Figure 1. A research model: Relationships between factors in IR

3.2 Research Questions

Within this general model, the current study focuses on the interaction effects of users’ contexts/situations and their behaviors on perceived document usefulness. More specifically, the user behaviors explored in this study are dwell time and saving and using behaviors. The contextual factors/situational values examined are users’ knowledge of/familiarity with search topics, the stage of work tasks, work task interdependence, and task products. In terms of work task interdependence, two types of tasks are considered, comparable to the parallel and the hierarchical tasks in [24]. In this study, the two task types are: parallel, in which the sub-tasks are parallel and independent of each other; and dependent, in which the sub-tasks have a logical order and some sub-tasks depend on the completion of others. In short, the study seeks to examine the effects of work task features, users’ topic knowledge, and the product(s) generated during the course of the search, on personalizing search results. To this end, three general research questions (RQs) are developed, as follows.

RQ1: Does the stage of users’ work tasks help predict document usefulness from dwell time in the parallel and in the dependent tasks, respectively?

RQ2: Does users’ topic knowledge help predict document usefulness from dwell time in the parallel and the dependent tasks, respectively?

RQ3: Can effective query expansion terms be extracted from documents that users have created, saved, and/or used in accomplishing their work tasks?

RQs 1 and 2 examine the interaction effects of task stage, topic knowledge and dwell time on document usefulness. RQ3 looks at a specific personalization technique that extracts terms for query expansion from work task products and user behaviors, operationalized as user saved and/or used documents. In other words, RQ3 examines the performance of a query expansion technique which extracts terms for suggestion from the documents that users have created, as well as those that users have saved and/or used.

4. METHODOLOGY

4.1 General Study Design

The study was a 2×2 factorial design with four conditions along two dimensions (Table 1). One dimension was task type (parallel or dependent), and the other was the search system (query expansion (QE) or non-query-expansion (NQE)).

4.2 Tasks, Stages, and Sub-task Orders

For ease of control, the study was a three-session experiment similar to Lin’s [18] approach. Each participant came three times, each time working on one sub-task, and the three sessions were treated as task stages 1, 2, and 3, respectively.

Journalists’ assignments were used as tasks mainly because they could relatively easily be set up as realistic tasks in different search domains. Two tasks, one parallel and one dependent, were designed, each having three sub-tasks. As mentioned, the logical sub-task order in the parallel task is not supposed to be fixed, while that in the dependent task is. To maintain consistency, the sub-task orders in both assignments were freely chosen by the participants. The sub-task orders that appeared in the task descriptions were rotated following a Latin square design, with a total of six orders for each task.

Each task described a general work task episode, to be finished by the participant in three steps, and asked the participant to look for information and to write and submit a report. Each task also asked the participant to submit a sub-report for each sub-task in each session. These reports, together with any notes or drafts that users created, constituted their task products. Users’ task products and saved web pages, etc., were treated as the participants’ personal desktop repositories and were used for extracting suggested query terms.

4.3 Participants

Twenty-four participants (see Table 1 for the assignment of their task and system conditions) were recruited from Journalism/Media Studies undergraduates at Rutgers University. A recruitment email was sent to the student listserv. Each participant came three times, at time slots convenient to them within a two-week period, and received remuneration. To encourage them to search as much and to write as good reports as possible, the study employed an incentive system, offering a bonus to the top participants who submitted the most detailed reports.

Table 1. Participants’ task assignment

                    System versions
Tasks           Non-Query Expansion     Query Expansion
Dependent       Participants 1-6        Participants 13-18
Parallel        Participants 7-12       Participants 19-24

4.4 Physical Equipment

The study was conducted in an on-campus interaction lab. The participants were provided with a desktop computer with a high-speed Internet connection. They were free to choose which search engines to use.

In the NQE condition, the normal Internet Explorer (IE) version 6.0 was used in all sessions. In the QE condition, the regular IE was used in session 1, but a special interface was provided for sessions 2 and 3 to support query expansion (Figure 2). The QE version of the system included two parts: on the right was the normal IE, and on the left was a list of terms extracted from the user’s previous session(s). The study employed an explicit means of term suggestion, since expanding queries in a purely hidden way on the open and dynamic Web imposes greater technical requirements. If term extraction in the proposed way proves effective, future studies will seek ways to apply query expansion implicitly to reduce user effort.



Figure 2. Search interface for QE condition

The suggested terms were extracted from two sources: the saved and/or used documents, and the user-generated product(s), including their submitted task reports and notes. Significant terms were selected based on term frequency (TF) (c.f. [5]), which was obtained using the indexing function of the Indri toolkit (http://www.lemurproject.org/indri/). The number of terms for each session was 10-20, depending on the cutoff applied to term frequencies; this range is in accordance with previous studies [3, 14]. Participants were encouraged to use these terms at least in their first search.
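As a rough illustration of this step, the sketch below ranks terms by raw frequency across a participant’s desktop documents and keeps the top 10-20. The tokenizer, stop-word list, and cutoff values are illustrative assumptions and stand in for the Indri-based indexing actually used in the study.

```python
import re
from collections import Counter
from typing import List

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "on",
              "is", "are", "was", "were", "with", "that", "this", "it", "as"}

def suggest_terms(documents: List[str], max_terms: int = 20,
                  min_tf: int = 3) -> List[str]:
    """Rank terms by raw term frequency across the desktop documents and
    return at most max_terms terms whose frequency reaches min_tf."""
    tf = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        tf.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    ranked = [term for term, count in tf.most_common() if count >= min_tf]
    return ranked[:max_terms]

# e.g., suggest_terms([report_text, notes_text, saved_page_text])
```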

4.5 Experiment Procedures

Upon arrival at session 1, each participant was first given a consent form. After he/she signed it, an entry questionnaire was administered to collect his/her background information. He/she was then assigned a work task and asked to complete a pre-task questionnaire about his/her knowledge of the topic and the expected difficulty of the task. The participant was then asked to choose a sub-task to work on in session 1, which was limited to 40 minutes. Pre- and post-session questionnaires were completed before and after the sub-task, eliciting the user’s knowledge and the expected or perceived difficulty of the sub-task. After searching, the participant was asked to save the product(s) generated in the process. The participant then judged the usefulness of the documents he/she had viewed, in the order in which they were viewed in the session.

The above-described process was repeated for each sub-task in all three sessions. After the post-session questionnaire in session 3, the participant completed a post-task questionnaire to gather his/her knowledge of, and perceived difficulty of, the general task after working on the whole task. Finally, an exit interview was conducted, asking the participant about his/her overall knowledge acquisition on the task and for other comments. The whole experiment was logged by the Morae logging software.

4.6 Data Analysis Method

A General Linear Model (GLM) was used to test RQs 1 and 2 for interaction effects among the contextual and behavioral factors considered in this study. The results are expected to reveal the relations among dwell time, task stage, sub-task interdependence, users’ familiarity with search topics, and perceived document usefulness.
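The following is a minimal sketch of such a test, assuming the logged page views have been exported to a long-format table with one row per view; the file and column names (page_views.csv, decision_time, stage, useful) are illustrative, not the study’s actual variable names.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per page view: dwell/decision time plus the factors of interest.
views = pd.read_csv("page_views.csv")  # hypothetical export of the logs

# Linear model with main effects of task stage and judged usefulness plus
# their interaction; the C(stage):C(useful) coefficients test the
# interaction effect examined in RQ1.
model = smf.ols("decision_time ~ C(stage) * C(useful)", data=views).fit()
print(model.summary())
```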

RQ3 will involve a t-test on session-based Discounted Cumulated Gain (sDCG) [9], comparing the system’s performance in the QE and NQE conditions. The results of this RQ are expected to reveal the relations among users’ saving and using behaviors, their desktop repository information, and search result ranking performance (as a form of search result document usefulness).
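A minimal sketch of the sDCG computation follows the idea in [9]: gains are discounted by rank within each query, and later queries in a session receive an additional position-based discount. The discount bases (b=2, bq=4) and the graded relevance scale in the example are illustrative assumptions. Since QE and NQE are assigned to different participants, the comparison would use an independent-samples t-test on the per-session scores.

```python
import math
from typing import List

def dcg(gains: List[float], b: float = 2.0) -> float:
    """Rank-discounted cumulated gain for one result list (no discount
    before rank b)."""
    total = 0.0
    for i, g in enumerate(gains, start=1):
        total += g / max(1.0, math.log(i, b))
    return total

def sdcg(session: List[List[float]], b: float = 2.0, bq: float = 4.0) -> float:
    """Session DCG: per-query DCGs, additionally discounted by the query's
    position in the session."""
    return sum(dcg(gains, b) / (1.0 + math.log(j, bq))
               for j, gains in enumerate(session, start=1))

# Example: one session with two queries; gains are graded usefulness scores.
print(sdcg([[3, 2, 0, 1], [0, 2, 1]]))
# The QE vs. NQE comparison could then use scipy.stats.ttest_ind on the
# per-session sDCG scores of the two participant groups.
```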

Finally, a model incorporating all of these relations will be built based on the results (to confirm the model shown in Figure 1). This will not only summarize the findings of this study, but will also provide a basis for future efforts and directions in the personalization of IR.

5. RESULTS THUS FAR

Data analysis has been conducted for RQ1. Results showed that, in general, the total time that a document (almost always a Web page in the current experiment) was displayed in a session was positively correlated with the perceived usefulness that the user reported in the evaluation. The total time that the user read a page in a session was also positively correlated with its perceived usefulness. At first glance, these results seemed to be inconsistent with [11], but a closer examination showed that the two studies had different settings and requirements for participants. While [11] was a naturalistic study in which users were not necessarily asked to generate any outcome, the current study asked users to write reports. Users usually worked in Microsoft Word while leaving Web pages open, or switched between the Word and IE windows while they worked on the reports, copying or memorizing the contents of the Web pages and pasting or typing them into the Word document. The Web pages kept open in parallel with the Word documents were most likely the useful ones, and their total display time, or the total time that users spent with them, was longer than it would have been had users not been generating the reports.

Meanwhile, we can consider only the first duration of time that the user read a page before leaving it (e.g., switching to Word or to another Web page), by which time users had usually made some judgment about the usefulness of the page. For example, starting to work in Word may indicate that the page was useful, while leaving (closing) a page and going to a different page may indicate that it was not. This duration can be defined as decision time. Data analysis showed an interaction effect (p<.01) of task stage and document usefulness on decision time (Figure 3). This indicates that task stage can indeed play a role in predicting document usefulness from decision time. More effort is needed to explore how to make use of this finding, as well as to answer the remaining RQs.
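To illustrate how decision time can be derived from the interaction logs, the sketch below takes a chronologically ordered list of window-focus events and records, for each window, the duration from its first focus until the user first switches away. The event format is an illustrative assumption, not Morae’s actual export schema.

```python
from typing import Dict, List, Tuple

def decision_times(events: List[Tuple[float, str]]) -> Dict[str, float]:
    """events: (timestamp_seconds, window_id) pairs ordered in time.
    Returns, per window, the first continuous focus duration (decision time)."""
    first_seen: Dict[str, float] = {}
    decided: Dict[str, float] = {}
    prev_win = None
    for ts, win in events:
        if prev_win is not None and win != prev_win and prev_win not in decided:
            # the user has switched away from prev_win for the first time
            decided[prev_win] = ts - first_seen[prev_win]
        if win not in first_seen:
            first_seen[win] = ts
        prev_win = win
    return decided

log = [(0.0, "page_A"), (42.5, "word_doc"), (60.0, "page_B"), (95.0, "page_A")]
print(decision_times(log))  # {'page_A': 42.5, 'word_doc': 17.5, 'page_B': 35.0}
```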



Figure 3. Decision time, stage, and document usefulness

6. ACKNOWLEDGMENTS

I would like to thank my dissertation advisor, Nick Belkin, as well as my committee members, Jacek Gwizdka, Xiangmin Zhang, and Diane Kelly, for their contributions to this work. I would also like to thank the reviewers and organizers of JCDL 2009 Doctoral Consortium for their valuable feedback and support. This research is sponsored by IMLS grant LG#06-07-0105.

7. REFERENCES

[1] Belkin, N.J. (1980) Anomalous states of knowledge as a basis for information retrieval. Canadian Journal of Information Science, 5, 133-143.
 
[2] Belkin, N.J. (2006). Getting personal: Personalization of support for interaction with information. Talk in The 1st International Workshop on Adaptive Information Retrieval, October 2006, Glasgow, UK.
 
[3] Belkin, N.J., Cole, M., Gwizdka, J., Li, Y.-L., Liu, J.-J., Muresan, G., Roussinov, D., Smith, C.A., Taylor, A., & Yuan, X.-J. (2005). Rutgers Information Interaction Lab at TREC 2005: Trying HARD. In Proceedings of TREC 2005.
 
[4] Chirita, P.-A., Firan, C.S., & Nejdl, W. (2006). Summarizing local context to personalize global web search. In Proceedings of CIKM '06, 287-296.
 
[5] Chirita, P.-A., Firan, C.S., & Nejdl, W. (2007). Personalized query expansion for the Web. In Proceedings of SIGIR 2007, 7-14.
 
[6] Cool, C. (2001). The concept of situation in information science. In M. E. Williams (Ed.), Annual review of information science and technology (Vol. 35) (pp. 5-42). Medford, NJ: Learned Information.
 
[7] Dou, Z., Song, R., & Wen, J.-R. (2007). A large-scale evaluation and analysis of personalized search strategies. In Proceedings of WWW '07, 581-590.
 
[8] Hsieh-Yee, I. (1993). Effects of search experience and subject knowledge on the search tactics of novice and experienced searchers. Journal of the American Society for Information Science, 44(3), 161-174.
 
[9] Järvelin, K., Price, S.L., Delcambre, L.M.L., & Nielsen, M.L. (2008). Discounted Cumulated Gain based evaluation of multiple-query IR sessions. In C. Macdonald et al. (Eds.): ECIR 2008, LNCS 4956, pp. 4–15, 2008.
 
[10] Joachims, T., Granka, L., Pang, B., Hembrooke, H., & Gay, G. (2005). Accurately interpreting clickthrough data as implicit feedback. In Proceedings of SIGIR '05, 154-161.
 
[11] Kelly, D., & Belkin, N.J. (2004). Display time as implicit feedback: Understanding task effects. In Proceedings of SIGIR '04, 377-384.
 
[12] Kelly, D., & Cool, C. (2002). The effects of topic familiarity on information search behavior. In Proceedings of JCDL '02, 74-75.
 
[13] Kelly, G.A. (1963). A theory of personality: The psychology of personal constructs. New York: Norton.
 
[14] Koenemann, J. & Belkin, N.J. (1996). A case for interaction: A study of interactive information retrieval behavior and effectiveness. In Proceedings of CHI '96, 205-212.
 
[15] Kuhlthau, C.C. (1991). Inside the search process: information seeking from the user’s perspective. Journal of the American Society for Information Science, 42, 361-371.
 
[16] Kumaran, G., Jones, R., & Madani, O. (2005). Biasing Web search results for topic familiarity. In Proceedings of CIKM '05, 271-271.
 
[17] Li, Y. (2008). Relationships among work tasks, search tasks, and interactive information searching behavior. Unpublished dissertation. Rutgers University.
 
[18] Lin, S.-J. (2001). Modeling and Supporting Multiple Information Seeking Episodes over the Web. Unpublished dissertation. Rutgers University.
 
[19] Schutz, A., & Luckmann, T. (1973). The structures of the life-world (R. M. Zaner & H. T. Engelhardt, Jr., Trans.). Evanston, IL: Northwestern University Press.
 
[20] Shen, X., Tan, B., & Zhai, C. (2005). Context-sensitive information retrieval with implicit feedback. In Proceedings of SIGIR '05, 43-50.
 
[21] Sihvonen, A., & Vakkari, P. (2004). Subject knowledge improves interactive query expansion assisted by a thesaurus. Journal of Documentation, 60(6), 673-690.
 
[22] Teevan, J., Dumais, S.T., & Horvitz, E. (2005). Personalizing search via automated analysis of interests and activities. In Proceedings of SIGIR '05, 449-456.
 
[23] Teevan, J., Dumais, S.T., & Liebling, D.J. (2008). To personalize or not to personalize: Modeling queries with variation in user intent. In Proceedings of SIGIR '08, 163-170.
 
[24] Toms, E., MacKenzie, T., Jordan, C., O’Brien, H., Freund, L., Toze, S., Dawe, E., & MacNutt, A. (2007). How task affects information search. In N. Fuhr, M. Lalmas, & A. Trotman (Eds.), Workshop Pre-proceedings of the Initiative for the Evaluation of XML Retrieval (INEX) 2007, 337-341.
 
[25] Vakkari, P., & Hakala, N. (2000). Changes in relevance criteria and problem stages in task performance. Journal of Documentation, 56(5), 540-562.
 
[26] Vakkari, P., Pennanen, M., & Serola, S. (2003). Changes of search terms and tactics while writing a research proposal: A longitudinal research. Information Processing & Management, 39(3), 445-463.
 
[27] White, R.W., Dumais, S., & Teevan, J. (2009). Characterizing the influence of domain expertise on Web search behavior. In Proceedings of WSDM '09, 132-141.
 
[28] White, R., & Kelly, D. (2006). A study of the effects of personalization and task information on implicit feedback performance. In Proceedings of CIKM ’06, 297-306.
 
[29] White, R.W., Ruthven, I., & Jose, J.M. (2005). A study of factors affecting the utility of implicit relevance feedback. In Proceedings of SIGIR '05, 35-42.
 
[30] Wildemuth, B. (2004). The effects of domain knowledge on search tactic formulation. Journal of the American Society for Information Science and Technology, 55(3), 246-258.