IEEE TCDL Bulletin
 
Volume 3   Issue 2
Summer 2007

 

A Curated Harvesting Approach to Establishing
a Multi-Protocol Online Subject Portal

Robert Sanderson
University of Liverpool
Liverpool, L693DA
United Kingdom
+44 151 7954252
<azaroth@liv.ac.uk>
John Harrison
University of Liverpool
Liverpool, L693DA
United Kingdom
+44 151 7943142
<john.harrison@liv.ac.uk>
Clare Llewellyn
University of Liverpool
Liverpool, L693DA
United Kingdom
+44 151 7943142
<clare.llewellyn@liv.ac.uk>
 

The focus of this work was to create a single point of access to information concerning Minerals Heritage in the UK, regardless of its location or the protocol needed to access it. In a previous phase, an online metasearch engine was configured to cross-search a number of pre-determined remote services. This prototype cross-searched 15 individual databases in parallel and returned the full results of the searches to a portal via the SRU protocol. This approach was considered to return too many false positives, due to the wide range of both content and quality of data in the remote systems, which were typically library OPACs and databases of archival finding aids.
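To make the first-phase approach concrete, the following is a minimal sketch of fanning one query out to several targets in parallel. It is not the authors' implementation. It assumes, for illustration only, that every target exposes an SRU 1.1 endpoint (the source says only that results were returned to the portal via SRU), and the names `build_sru_url`, `cross_search` and the `example.org` addresses are hypothetical. The HTTP call is passed in as a `fetch` function so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, max_records=10, start_record=1):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


def cross_search(targets, cql_query, fetch, max_workers=8):
    """Send the same query to every target in parallel.

    targets: dict mapping a (hypothetical) target name to its SRU base URL.
    fetch:   callable taking a URL and returning the response body.
    Returns a dict mapping each target name to whatever fetch returned.
    """
    names = list(targets)
    urls = [build_sru_url(targets[name], cql_query) for name in names]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so responses line up with names.
        responses = list(pool.map(fetch, urls))
    return dict(zip(names, responses))


if __name__ == "__main__":
    # Hypothetical targets; a real fetch would perform an HTTP GET.
    demo_targets = {
        "opac": "http://opac.example.org/sru",
        "archive": "http://archive.example.org/sru",
    }
    results = cross_search(demo_targets, 'dc.title = "lead mining"',
                           fetch=lambda url: url)
    for name, body in results.items():
        print(name, body)
```

Because every record from every target is returned unfiltered, a sketch like this also shows where the false positives come from: nothing between the remote service and the end user judges relevance.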

For the second phase, it was decided that administrator intervention was necessary to achieve and maintain the expected quality of service. To facilitate data curation by non-technical users, a graphical user interface was needed that abstracted away many of the problems inherent in any metasearch engine.

We describe a curated harvesting approach to creating and maintaining a subject portal composed of selected records, which are harvested automatically from remote services via information retrieval standards such as SRW/U, Z39.50 and OAI. The result is a web-based data curation interface in which administrators configure access to remote resources and the queries to be performed against them, and reviewers decide which records should be included for the end user to search.
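The reviewer side of this workflow can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the system described here: the class names `HarvestedRecord` and `CurationQueue`, the three review states, and the rule that a re-harvested record keeps its earlier decision are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # harvested, awaiting a reviewer's decision
    ACCEPTED = "accepted"    # visible to end users in the portal
    REJECTED = "rejected"    # judged a false positive, hidden from search


@dataclass
class HarvestedRecord:
    source: str              # name of the remote service (hypothetical)
    identifier: str          # record identifier within that service
    title: str
    status: Status = Status.PENDING


class CurationQueue:
    """Holds harvested records and the reviewers' decisions about them."""

    def __init__(self):
        self._records = {}

    def add_harvested(self, record):
        key = (record.source, record.identifier)
        # Assumption: a record seen before keeps its earlier decision,
        # so repeated harvests do not undo a reviewer's work.
        if key not in self._records:
            self._records[key] = record

    def review(self, source, identifier, accept):
        record = self._records[(source, identifier)]
        record.status = Status.ACCEPTED if accept else Status.REJECTED

    def pending(self):
        return [r for r in self._records.values() if r.status is Status.PENDING]

    def portal_records(self):
        # Only accepted records are exposed to end-user search.
        return [r for r in self._records.values() if r.status is Status.ACCEPTED]


if __name__ == "__main__":
    queue = CurationQueue()
    queue.add_harvested(HarvestedRecord("opac", "1", "Lead mining in Derbyshire"))
    queue.add_harvested(HarvestedRecord("opac", "2", "Data mining techniques"))
    queue.review("opac", "1", accept=True)
    queue.review("opac", "2", accept=False)
    print([r.title for r in queue.portal_records()])
```

Keeping the decision separate from the harvest step is what lets the automated queries run repeatedly while the portal's content stays under reviewer control.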

Figure 1: Poster (image not reproduced here).

 

© Copyright 2007 Robert Sanderson, John Harrison, and Clare Llewellyn
Some or all of these materials were previously published in the Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries, ACM 1-59593-354-9.
