Getting the most out of the project: Survey of M25 Members

Followers of the SEARCH25 project blog will be aware of the primary project objective to significantly enhance the existing InforM25 discovery services. Benefits will accrue through a new architecture; inclusion of records derived from Copac; aggregation and enhanced matching of multiple resource records; provision of live holdings and circulation data; and a platform to deliver enriched services in the future.

It is a priority to ensure that the project builds on its formal outcomes by working with the M25 membership to exploit immediate and emerging opportunities arising from the work. Put simply, we’re looking for SEARCH25 to act as a catalyst for M25 service development rather than as an end in itself.

Following on from the SEARCH25 Open Metadata Licensing workshop (reported here in an earlier post) and meetings with the AIM25 archival team, we have developed a short survey in order to assess M25 member opinions and appetite for actions potentially arising from the project. Some of the proposed steps are simple and small but may be the start of a significant journey for the consortium and for institutions, raising challenges and opportunities around the real utility of our services and our metadata – as highlighted by Chris Banks in her recent reflection on practice at the University of Aberdeen.

The workshop group therefore agreed we should get member feedback on the following:

  • Assess appetite within the consortium for jointly adopting an open licence for bibliographic metadata delivered through the SEARCH25 service in line with the JISC Discovery principles (http://discovery.ac.uk/principles)
  • Consider the potential relationship between M25 Consortium and a National Union Catalogue service
  • Understand whether and how the Z39.50 based SEARCH25 service fits with new LMS choices at an institutional level
  • Explore potential opportunities to make cross-domain links, notably with the AIM25 archival consortium

We now have an online survey covering those topics, so we should be grateful if members would complete a single management response by close of Monday 4 June. The questions are divided into four sections as set out above and should take no more than 15 minutes to complete. An introduction to the survey plus a printable copy of the questions can be found at www.sero.co.uk/search25-survey.

The Focus Groups: FAQs

We have now completed all of the seven scheduled focus groups, which encompass the initial stage of the research process. I feel it may be useful to feed back in the form of some frequently asked questions, regarding the thought processes involved in setting up a series of focus groups – rather than say a set of semi-structured interviews – and furthermore to relay some of the common threads that were woven through our choice of participants, questioning, and data gathering. I will also allow a sneak preview of the findings!

1. Why focus groups? For the obvious reason that they provide a broad, often uncontrollable, sometimes unexpected mass of opinions. They also seem to self-regulate: a group consensus is often reached, marked by a collective silence after a few minutes of discussion. When this does not happen and a throw-away comment is made, it often becomes the pivot for the next question, regularly from another focus group attendee. A focus group simply enables you to find out a lot in a very short space of time, and despite it seemingly being only a group of people in a room, the data gleaned after doing a number of them is robust.

2. What choices were made regarding participants? We decided upon having different focus groups for different types of potential users – the main separation being library staff and researchers/students. While it could be argued that this could curtail disagreement and debate, what it allowed for was a general accord (obviously not all of the time), permitting us to infer the overall mood of each user type – whether at a busy help-desk in a consortium library or in a bedroom full of empty cans of Red Bull.

3. What questions were asked? It was relatively relaxed, although the same sorts of questions were asked each time so that we could compare and analyse across focus groups. It began broadly with a question about library catalogues, before drilling down into the minute details of the old InforM25 service and the new SEARCH25 test version, finally returning to more open-ended questions about potential users and marketing. We allowed time for responses and made sure to wait before jumping in with the next question – this made for a good wide-ranging discussion. We knew that once the participants were relaxed a group dynamic would take shape and they would begin to try to get to the bottom of things, working almost as a team.

4. How did you gather the data and what do you plan to do with it? We recorded the discussion and also took notes. All of the points raised were recorded anonymously and no names or institutions will be reproduced in the final report (we will though note whether the point was raised by, for example, a subject librarian or a postgraduate taught student). The audio recording and notes will be written up into a final report by the University of Sheffield Information School, specifically Paula Goodale. This report will directly influence the way the user will experience SEARCH25, which is why it was important to undertake a series of focus groups whilst the service was still under development, in order to make it as straightforward as possible for a user to navigate through what is a wealth of materials. Analysing the hours of Dictaphone recordings and reams of notes will be done through a method of coding: working out what the common responses to questions were and giving each a specific code, highlighting interesting comments that expounded something we perhaps had not given enough thought to, and then going back through and analysing the frequency of each code using some basic statistics. This is to provide an element of supposed quantitative rigour to what is essentially a qualitative, subjective exercise. That said, both Paula Goodale and I have a sense of what happened in each focus group and gained a general feeling from all the different types of user: this is important, and will undoubtedly reach the final report of suggested directions in which to take SEARCH25.
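As a rough illustration of the coding step described above, the sketch below tallies codes and works out each code's share of all coded comments. The participant types, code names and data are invented purely for illustration and are not project findings; the real analysis may well use different tools.

```python
from collections import Counter

# Hypothetical coded segments: (participant type, code) pairs.
# These values are made up for the example, not taken from the focus groups.
coded_segments = [
    ("subject librarian", "wants-multi-facet"),
    ("postgraduate taught student", "wants-single-search-box"),
    ("subject librarian", "wants-single-search-box"),
    ("academic", "frustrated-no-access"),
    ("postgraduate taught student", "wants-multi-facet"),
    ("academic", "wants-single-search-box"),
]


def tally_codes(segments):
    """Count how often each code occurs across all coded segments."""
    return Counter(code for _, code in segments)


def code_shares(segments):
    """Return each code's share of all coded segments, as a fraction."""
    counts = tally_codes(segments)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


counts = tally_codes(coded_segments)
shares = code_shares(coded_segments)
for code, n in counts.most_common():
    print(f"{code}: {n} ({shares[code]:.0%})")
```

Keeping the participant type alongside each code means the same tally could later be split by user type without re-reading the notes.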

5. Was it worth it and did we find out what we expected to? To tie all of the aforementioned questions together somewhat: focus groups allowed for a lot of stuff to be got at instantly, whereas the survey and log data analysis that we ran alongside them require a heftier chunk of quantitative analysis. It was therefore a useful exercise for relaying a vast amount of findings quickly back to the development process. So, in that sense it was the right choice of research method. Further to that, findings… We realised from an early stage that SEARCH25, even in its current unfinished state, is a cleaner, sleeker, more modern regional union catalogue than InforM25 (the results that are dragged back from each institution are instantly more understandable for a first-time user). And added to that, the focus groups worked to re-affirm some of our own ideas regarding development. Indeed, we had envisaged most of the responses and had an idea that the final product should incorporate most of what the focus group participants wanted it to.

6. What next? Transcribing and coding the data will allow us to make succinct suggestions for the final interface design. There are still some big questions that need thinking about in detail:

  • Geographical delineation and access rights: where and when the user should narrow geographically or search only for borrowable items, and how this should be done. We know that people hate it when they find a book but cannot access it, so there is a case for putting some basic information in at the start, such as home institution and status; this may then feed in to the geographical filtering.
  • Basic search/advanced search: it has become clear that both are necessary, and that a single box with keyword as default, plus an option for a more advanced search, is preferable, as sometimes you are searching broadly and sometimes for a known item.
  • Merging: an obvious vast improvement here upon the old service, but there are still some minor technical issues.
  • Faceting: the general consensus was for an ability to select more than one facet, so say two institutions you are prepared to visit, or two subject areas that you were looking to narrow down by.
  • Speed and fetched/found: it was clear that once you give the user the information that there are more books available by an author at an institution, they would then like to view all of them. Some form of slider here, so the user has control over speed or volume, may be the answer.
  • E-journals and e-books: the service is primarily aimed at finding hard copy, so how to clarify when a book or journal is electronic, and then where to link them to, is a question that we need to work through.
  • ‘Closing the loop’ for the user, which, as Graham Seaman notes in his previous blog post, is a priority.
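To make the multi-select faceting request concrete, here is a minimal sketch of one possible filtering rule: a record passes if it matches any selected value within a facet and does so for every facet that has a selection. That OR-within, AND-across rule is an assumption for illustration, not a decision the project has taken, and the records and field names are hypothetical.

```python
def matches(record, selected):
    """True if the record matches at least one chosen value in every facet
    that has a selection (OR within a facet, AND across facets)."""
    return all(
        record.get(facet) in values
        for facet, values in selected.items()
        if values  # a facet with nothing selected does not filter
    )


def apply_facets(records, selected):
    """Keep only the records that satisfy the current facet selection."""
    return [r for r in records if matches(r, selected)]


# Hypothetical result set, invented for the example.
records = [
    {"title": "Plays", "institution": "Birkbeck", "subject": "Drama"},
    {"title": "Democratic schools", "institution": "LSE", "subject": "Education"},
    {"title": "Teaching economics to undergraduates", "institution": "Brunel", "subject": "Economics"},
    {"title": "Teaching economics to undergraduates", "institution": "LSE", "subject": "Economics"},
]

# Two institutions the user is prepared to visit, two subject areas of interest.
selected = {
    "institution": {"Birkbeck", "LSE"},
    "subject": {"Drama", "Economics"},
}

filtered = apply_facets(records, selected)
for r in filtered:
    print(r["institution"], "-", r["title"])
```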

But, and this is a big but, I think that the focus groups have given us a clearer understanding of what it is that users want from SEARCH25 and where to go from here – and as a quick aside, due to asking the wide ranging questions at the start and finish, we now have some interesting thoughts to ponder over with regards to the ultimate potential of a regional search, across a remarkable range of world-renowned institutions, densely packed in a city that is easily traversable (if I had known InforM25 had existed whilst writing my thesis I would have certainly used it – SEARCH25 is a vast improvement upon the original service and should become a site oft visited by students and academics, without the need for a librarian as a go-between).

Finding the way home

Once someone has identified an item they want in Search25, they need to know how to access it. At this point, Search25 can show them the bibliographic details for the book; a link to the holding library; information about access rights for the library (that bit’s coming soon); and if the library provides it over Z39.50, some basic holdings and perhaps circulation information.

It would be good to ‘close the loop’ on this so the user can also see the record in its home context, back in the originating library’s catalogue, where there may well be additional information. You’d think nothing could be easier: we have a Marc record derived from the library, and all we need to do is link back to the library display of the same record. But how to make the link? The obvious thing to do is use the record number (Marc 001). But that may not be exposed at all, as it isn’t really meant for public consumption. Or we could use the classmark, which might not get us to the exact item, but should – given the full classmark – get us back to a very short list of possible and related candidates. What we don’t want to do is repeat the original search from Search25 on the source library – there may be hundreds of results, and our user has just gone to the effort of filtering this down to one particular one they’re interested in.

So this is a first pass at working out what’s possible. It’s done on a tiny sample – one record from one or two libraries using each of the most common LMS systems. But even that took time. If you use one of these LMSes and know how to do it better, please, please let me know. A quick summary of the results first, then the details.

Results Summary

What’s possible depends not only on the underlying LMS, but also on whatever is being used as a front-end: for example, Vufind in front of Voyager (as used by the University of Kent and the LSE) is more straightforward to deal with than Voyager alone.

Given that, the open source options are the simplest to link to. Koha allows you to link straight to a record by its record number. So do the Vufind implementations. If Vufind is configured for it (ie. the classmark has been indexed in Solr), then it is also very simple to link using the classmark. I don’t know if this is also true for Koha – the one example I have of a Koha library doesn’t use classmarks, but it looks like they should work too.

ExLibris Aleph allows you to link to a record by record number provided you know the internal library database name. This link may or may not be exposed to normal users, but once reverse engineered from one site it seems to carry over to others (assuming you know the database name). Some libraries, however, put other systems in front of their Aleph system: the raw Aleph page then does not display well, while the fronting system blocks use of the record link. Aleph systems do generally allow access by classmark, but through a clumsy browse system and ignoring the Cutter, which makes the target record hard to rediscover.

ExLibris Voyager appears to use the record number for some purposes (eg for user export of records for citation), but unlike Aleph I was unable to find any URL incorporating the record number. On the other hand, it is possible to do a Keyword search ANDing all terms using the classmark (including Cutter) to get almost straight to the record.

Innovative Interfaces’ Innopac (Millennium) – at least for the two samples tested – provides no way to access a record directly from the information given in the exported Marc record. The classmark search again drops the Cutter, potentially leading to very long lists of candidates, and although records do have an identifying bibId it is not one that is in the Marc record.

SirsiDynix Horizon also uses various strings as IDs for records, none of which are in the Marc record. However, the ‘shelfmark’ search uses the whole classmark and so generally gets very close to the desired record.

SirsiDynix Unicorn also uses identifiers for records which are not contained in the Marc record. In the one library sampled, there was no shelfmark search.

Talis Prism also includes no way to link back to an individual record. It does allow a classmark search, which given the full classmark does a good job of retrieving the desired record – but in the example records/library chosen the Cutter was missing from the exported Marc record making this impossible for Search25 to use.

In short, linking back to particular records in library catalogues is way harder than you’d expect – if you don’t know how they work. I have a sneaking suspicion that anyone who works with a particular one of these catalogues will know exactly how to do it. Let me know!
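The summary above amounts to a fallback rule: link by record number where the catalogue allows it, fall back to a full-classmark search where that is precise enough, and otherwise give up. The sketch below encodes the findings from this tiny sample in a lookup table; the table layout and the function name `choose_strategy` are illustrative assumptions, and the entries would need revising as more libraries are tested.

```python
from enum import Enum


class LinkBack(Enum):
    RECORD_NUMBER = "link by record number (Marc 001)"
    CLASSMARK = "search by full classmark"
    NONE = "no reliable link found"


# (record-number link works, full-classmark search is precise enough),
# as observed in the small sample summarised above.
CAPABILITIES = {
    "Koha": (True, False),          # classmark search untested: sample library has no classmarks
    "Vufind": (True, True),         # classmark only if it has been indexed in Solr
    "Aleph": (True, False),         # needs the database name; browse ignores the Cutter
    "Voyager": (False, True),       # ANDed keyword search on classmark including Cutter
    "Innopac": (False, False),      # bibId not in Marc; classmark search drops the Cutter
    "Horizon": (False, True),       # shelfmark search uses the whole classmark
    "Unicorn": (False, False),      # no shelfmark search in the library sampled
    "Talis Prism": (False, True),   # works only if the exported Marc carries the Cutter
}


def choose_strategy(lms, record_number=None, full_classmark=None):
    """Pick the most direct link-back method available for this record."""
    record_link, classmark_search = CAPABILITIES.get(lms, (False, False))
    if record_link and record_number:
        return LinkBack.RECORD_NUMBER
    if classmark_search and full_classmark:
        return LinkBack.CLASSMARK
    return LinkBack.NONE


print(choose_strategy("Koha", record_number="2245").value)
print(choose_strategy("Voyager", "63509", "552.5809422 MOR").value)
print(choose_strategy("Talis Prism", "818009").value)
```

Passing `full_classmark=None` when the exported record lacks the Cutter keeps the Talis case honest: the search exists, but Search25 has nothing precise to feed it.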

You probably don’t want to read past this point…


The Gory Details

These are the actual tests (reduced to one example per system)

1. Anglia Ruskin – Aleph, returning Opac format records (ie Marc21+holdings)

Brecht Caucasian Chalk Circle 1963
001 000336428
Callnumber: 832.912 BRE

Link to full record using record number:
http://libweb.anglia.ac.uk/catalogue/search_results.html?http://oscar0.anglia.ac.uk/F/B45J4RAMLQS5SNAYVKNCESM3MV5ICQG4364XIFPP3Y41KH8839-04099?func=direct&doc_library=APU01&doc_number=000336428

Unfortunately this doesn’t work without the session id. Drop the libweb part and you can use the remaining URL without a session number, but then get an unformatted display (no CSS). No solution was found for this. To generalise this to other libraries, you need to know the name of the database (here it’s APU01).
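As a sketch of how the session-free form of this link might be generated, the function below rebuilds the query string from the database name and the Marc 001 value. The parameter names (`func`, `doc_library`, `doc_number`) are taken from the URL above; the `/F/?` path used when the session id is dropped is an assumption and should be checked against each site.

```python
from urllib.parse import urlencode


def aleph_direct_url(base_url, doc_library, record_number):
    """Build a session-free Aleph 'direct' link from the Marc 001 value.

    base_url      -- the raw Aleph host, without any fronting page
    doc_library   -- the internal database name, which must be known per library
    record_number -- the Marc 001 value, passed through unchanged
    """
    query = urlencode({
        "func": "direct",
        "doc_library": doc_library,
        "doc_number": record_number,
    })
    # Assumption: the path is /F/ with the session id simply omitted.
    return f"{base_url.rstrip('/')}/F/?{query}"


url = aleph_direct_url("http://oscar0.anglia.ac.uk", "APU01", "000336428")
print(url)
```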

2. University of Bedfordshire, Innopac, returning Opac format records

Mortimore, Chalk of Sussex and Kent
001: 0900717831
callNumber: 552.58 MOR

This is the page for the record and holdings:
http://library.beds.ac.uk/record=b1073514~S20
Unfortunately, the number b1073514 doesn’t appear anywhere in the Marc record (S20 is the database name) so it’s not possible to link back to this record.

3. Birkbeck, Horizon, Opac format (incomplete – missing classmark and itemids in some records at least)

Brecht, Plays, 1987
001 DYNIX70607
975 00 $a 836 BRE 85 MET

Again, the numbers in the URL for the full record don’t seem to have any connection with any values in the Marc record;
http://ipac.lib.bbk.ac.uk/ipac20/ipac.jsp?full=3100001~!2739~!0

But it is possible to search on the full classmark, here:

http://ipac.lib.bbk.ac.uk/ipac20/ipac.jsp?npp=1&ipp=20&spp=20&profile=ms&ri=&term=&index=.AW&aspect=advanced&term=&index=.TW&term=&index=.GW&term=&index=SUBJECT&term=&index=ISBN&term=836+BRE+85+MET&index=CALLDD&term=&index=SERIES&term=&index=VTITL&term=&index=BIB

Oddly, the classmark is stuffed into the ISBN field.

4. Brighton, Talis, Marc21

Red Chalk, Cole, 2001
001 818009
CLA $a 370.111 $b 370.111

The actual shelfmark is 370.111 RED, but RED is not in the exported Marc anywhere

I couldn’t find any way to search on the record number or use it in a url. Searching on the full shelfmark works fine;

http://prism.talis.com/brighton-ac/items?query=370.111+RED&adv=t

but as mentioned the full shelfmark wasn’t available in this Marc record.

5. Brunel, Unicorn, Marc21

Teaching economics to Undergraduates, Becker, 1998
001 ocm40159060
926 $c HB74.8.T43

The record page is:

http://library.brunel.ac.uk/uhtbin/cgisirsi/Ust8waPsDs/UXBRIDGE/234890078/88

but none of these numbers are in the exported Marc record.

I was unable to search by shelfmark.

6. LSE, Vufind as front-end to Voyager, Opac format

Democratic schools, Apple and Beane, 1999
001 593801
callNumber: LC1049.5 D38
itemID: 704282 etc

The record page is:
https://catalogue.lse.ac.uk/Record/593801

And searching by classmark:
https://catalogue.lse.ac.uk/Search/Results?lookfor=LC1049.5+D38&type=CallNumber&submit=Find

Perfect!
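Both links are regular enough to generate directly from the exported record. The sketch below does so using only the URL patterns shown above; whether other Vufind sites share the `CallNumber` search type depends on their configuration, so treat that as an assumption outside the LSE example.

```python
from urllib.parse import quote, urlencode


def vufind_record_url(base_url, record_number):
    """Link straight to a record page using the Marc 001 value."""
    return f"{base_url.rstrip('/')}/Record/{quote(record_number, safe='')}"


def vufind_classmark_url(base_url, classmark):
    """Link to a classmark search; assumes the classmark is indexed as 'CallNumber'."""
    query = urlencode({"lookfor": classmark, "type": "CallNumber", "submit": "Find"})
    return f"{base_url.rstrip('/')}/Search/Results?{query}"


record_url = vufind_record_url("https://catalogue.lse.ac.uk", "593801")
classmark_url = vufind_classmark_url("https://catalogue.lse.ac.uk", "LC1049.5 D38")
print(record_url)
print(classmark_url)
```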

7. Open University in London, Voyager, Opac format

The chalk of Sussex and Kent, Mortimore
001 63509
callNumber: 552.5809422 MOR

Export shows a bibId of 63509, the 001 value, but I couldn’t find a URL that uses this to get the record.

Searching by classmark as ANDed keywords does work:

http://bulbul.open.ac.uk/vwebv/search?searchArg1=552.5809422+MOR&argType1=phrase&searchCode1=GKEY&combine2=and&searchArg2=&argType2=any&searchCode2=GKEY&combine3=and&searchArg3=&argType3=any&searchCode3=GKEY&year=2011-2012&fromYear=&toYear=&location=all&place=all&type=all&status=all&medium=all&language=all&recCount=10&searchType=2&page.search.search.button=Search

8. Institute of Development Studies, Koha, Marc21

War and Refugees, Lawless
001 2245
No shelf numbers used in this library

Link direct to record number:
http://blds.ids.ac.uk/directory/catalogue-record/rn/2245

Pilot Focus Group

Last Thursday, as part of our user testing, we conducted a pilot focus group, the first of seven, to gather comprehensive feedback on SEARCH25. It took place in Sheffield with eight chatty postgraduate students, all studying either Librarianship or Digital Library Management. They were, for this reason, a very good and knowledgeable group to begin with, being able to think as both a helpless undergraduate student and a busy librarian at the help-desk. Paula Goodale led the discussion, while Amy Warner and I observed and took notes. The students present were taken through the various features of the working prototype of SEARCH25. Initially the questions were broad, about library catalogues in general, potential users, and whether it was something they would recommend to others, before we got on to the prototype itself. We did some known-item searching and some broad keyword searches before showing them the various faceting options. This drove the conversation along nicely, and they were happy to talk away, critically reflecting on the potential of such a service and its current state prior to launch. They often posed questions to each other, so interested were they in the intricacies of interface design, talking about tabs, merging, number of items per page, citations, location maps, and a jump to last page option, amongst other things. We found the pilot focus group to be an excellent way of getting instant critical reflection on SEARCH25. There were things which came out of it that were illuminating for all sorts of reasons, giving us lots to think about before the next batch of focus groups. Most of all, I think we went away after the pilot focus group knowing that SEARCH25 is an important service, and that more students and academics should know about it.

The first batch of focus groups continues as follows:

  • Monday 16th April, Birkbeck, University of London: academics and postdoctoral researchers
  • Tuesday 17th April, CSSD: e-strategy, library IT, systems managers, and director staff
  • Tuesday 17th April, Birkbeck, University of London: liaison, subject, and user services librarians
  • Wednesday 18th April, LSHTM: undergraduates and postgraduates

If you are interested in attending any of the focus groups, please email me:

james.riding@rhul.ac.uk

Hopefully they will prove to be as successful as the pilot focus group!

Opening up M25 bibliographic data

Thoughts from the project workshop – 19 March 2012 - David Kay, Sero

It is an ambition of the Search25 project to determine whether and how open access bibliographic metadata might be provided through consortium and / or member institution services in line with the JISC/RLUK Discovery principles (http://discovery.ac.uk/businesscase/principles/).

Immediate project opportunity – The project certainly represents an opportunity for M25 institutions to stake out their commitment to open metadata signified by use of an open licence. The new end user functionality previewed at the workshop provides a facility to accumulate ‘Saved Records’ that can be exported in a variety of citation formats, for which open licensing could be explicitly adopted in order to make the user’s position clear. As far as any data goes, ‘Say nothing’ is an unhelpful position to adopt, so it was suggested that this would provide clarity as well as acting as a useful catalyst for the consortium to think through the implications and value of such a step. The discussion of immediate project opportunities to explore the implications of open re-use also covered the project’s re-use of Copac metadata in search results; in this respect, David Prosser (RLUK Exec Director) emphasised that there is an openness to discuss any new service options arising from the project.

The bigger picture – It was recognised, however, that the citation use case only scratches the surface in terms of the potential value of open metadata licensing. The workshop participants reviewed the 17 use cases set out in the Open Bibliographic Data guide (http://obd.jisc.ac.uk) to identify those of relevance to libraries individually and the M25 consortium more broadly. Very high levels of interest were expressed in the following use cases for open bibliographic metadata:

  • Supply open data to a physical or virtual union catalogue
  • Allow a union catalogue to publish open data
  • Contribute open data to global search engines such as Google Scholar
  • Supply open holdings data for shared Collection Management; for example, a ‘UK Research Reserve for books’
  • Expose holdings and availability data for ‘closest copy’ location (a key purpose of Search25)
  • Share open data for collaborative cataloguing

A national Union Catalogue? – It was noted that the use cases that were highly valued and which open metadata would enhance generally point to the opportunities that might be afforded by a UK union catalogue. Whilst this idea was discarded at the turn of the century (see the UKNUC study, which led to a focus on serials not monographs), it was generally agreed that times and use cases have changed and therefore that such a model should be revisited.

Open licensing options - The workshop also involved a discussion of open licensing options in the context of a wider vision for the M25 Consortium. The discussion highlighted:

Next Steps - The following steps were recommended for consideration at the end of the workshop. It was however noted that only Step 1 falls within the scope of the Search25 project deliverables. Whilst the other items would offer valuable and relevant outcomes, as well as being important more broadly to M25 development, the Project Board will need to determine how they might be followed up.

  • Develop an open licensing position paper which articulates a shared vision for the M25 Consortium
  • Test the appetite within the consortium for adopting an open licence for bibliographic metadata
  • Articulate broadly how the M25 Consortium would fit within a UK union catalogue proposition

SEARCH25 (Strengthening Electronic Access to Research Consortial Holdings in M25)

On behalf of John Tuck (Project Director)

With the warmth of the sun on my back on a March Monday morning, I have decided to review the nature of the various agreements we have put in place to underpin the start-up and progress of SEARCH25. There are many unsung movers and shakers sitting in HR offices, Research and Enterprise suites, Finance Departments and Administration clusters, all beavering away to common goals – helping make projects work. They did their collective bit for us in the frantic build up to application in September 2011, lay dormant for a while in autumn, and, given the green light to proceed, came to life again. For the nine-month SEARCH25 project, I rifle through the pile of papers on my desk. (I still struggle reading legal agreements on a screen.)

1) The collaboration agreement between the named partners: Royal Holloway and Bedford New College, Institute of Development Studies, Birkbeck, Central School of Speech and Drama, London School of Hygiene and Tropical Medicine, M25 Consortium of Academic Libraries. Real signatures have been applied and gathered in across the South Downs and the smoking metropolis. As a consequence, we have and are all happy with our 14-page collaboration agreement.

2) This is one number lower than the 15 individuals who played their part in the management and compilation of the secondment agreement, enabling our Developer, Graham Seaman, to be seconded from Royal Holloway to SEARCH25 for the duration of the project. The actual agreement runs to a brief 9 pages.

3) The two agreements with sub-contracted partners, SERO Consulting Ltd and the University of Sheffield, are much shorter pieces of work, just a couple of pages each but carefully constructed and appropriately signed off under guidance from contracts officers.

4) The agreement with the London School of Economics is a model of clarity and concision, just over a page focussing on the main issue of use of space utilised by the M25 Consortium within and with the agreement of the LSE Library.

Of course, the above catalogue does not include the important documentation which underpins all aspects of recruitment. In conjunction with HR departments, the job descriptions, adverts, person specifications have all been compiled, approved and used as part of the recruitment of both the Developer (on secondment) and the Project Officer. SEARCH25 is pleased, in this latter case, to announce that James Riding has been appointed to this post after a recent round of interviews.

This blog post is not intended to make light of the documentation required to make projects work. A proper legal and administrative framework is essential to the success of any collaborative project. It is really intended as a signal that each project needs to be aware of the administrative tasks ahead when embarking on collaboration between bodies eager to share but, perhaps, each possessing certain of their own idiosyncrasies or ways of doing things. Things may take a little longer than first expected. It is also intended to show what can be achieved in a short space of time by colleagues across institutions, all grasping and committed to the shared services and collaborative nettle. As our IP and Contracts Manager, Research and Enterprise, said to me just the other day, thrusting the completed bundle of agreements into my palm: “It’s really nice to work on a project where everybody is pulling in the same direction (which is not always the case).”

Let’s keep our fingers crossed!

March 2012 Update

A brief update as we work towards our next set of deliverables. The Project Team is now complete with the appointment of a Project Officer:

  • John Tuck – Project Director
  • John Gilby – Project Manager
  • Graham Seaman – Project Developer
  • James Riding – Project Officer
  • Amy Warner – User Testing Advisor
  • Glyn Price – Technical Advisor

Sheffield University Information School (Dr Paul Clough & Paula Goodale) are gathering data about user search behaviour via an online questionnaire and InforM25 usage logs, and will also lead the user focus groups planned for April.

Sero Consulting Ltd. are also working with the project team to establish some guidelines on Open Access data for M25 Consortium resource discovery services.

We intend to post more details on our activities in the near future.

Evaluating Open Source Options

The SEARCH25 Technical Specification required us to evaluate potentially useful Open Source applications in terms of both their functionality and a set of criteria relating to quality and community. One of the aims of the project is to remove dependence on locally written and unsupported code, and use of widely used and supported open source applications is intended to achieve this.

We need two types of application: a back-end capable of searching multiple Z39.50 targets in parallel, and a front-end providing not only communication with the back-end but also the ability to add extra user functions as required.
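The back-end's job can be pictured as a parallel fan-out followed by a merge. The sketch below shows that general pattern only: it is not how any particular product works internally, the "targets" are in-memory stand-ins rather than real Z39.50 connections, and the merge key (case-folded title plus author) is a deliberately naive assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_target(name, records):
    """Return a hypothetical search function standing in for one library catalogue."""
    def search(query):
        q = query.lower()
        return [dict(r, library=name) for r in records if q in r["title"].lower()]
    return search


# Invented sample data; a real back-end would query remote catalogues instead.
TARGETS = {
    "Library A": make_target("Library A", [
        {"title": "The Caucasian Chalk Circle", "author": "Brecht"},
    ]),
    "Library B": make_target("Library B", [
        {"title": "The Caucasian chalk circle", "author": "Brecht"},
        {"title": "Red Chalk", "author": "Cole"},
    ]),
}


def merge_key(record):
    """Naive matching key: case-folded title and author."""
    return (record["title"].casefold().strip(), record["author"].casefold().strip())


def search_all(query, targets, timeout=10.0):
    """Query every target in parallel and merge records that share a key."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = [pool.submit(search, query) for search in targets.values()]
        for future in as_completed(futures, timeout=timeout):
            try:
                records = future.result()
            except Exception:
                continue  # one failing target should not sink the whole search
            for record in records:
                entry = merged.setdefault(merge_key(record), {
                    "title": record["title"],
                    "author": record["author"],
                    "libraries": [],
                })
                entry["libraries"].append(record["library"])
    for entry in merged.values():
        entry["libraries"].sort()
    return sorted(merged.values(), key=lambda e: e["title"].casefold())


results = search_all("chalk", TARGETS)
for entry in results:
    print(entry["author"], "-", entry["title"], "-", ", ".join(entry["libraries"]))
```

Swallowing per-target errors is a design choice worth noting: with many independent catalogues, a slow or broken one is the normal case rather than the exception.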

The Back-end

As mentioned, our back-end is tied to Z39.50. There is no practical alternative to Z39.50 for searching a distributed set of widely different library catalogues. But Z39.50 is not a fashionable technology, even though it is in very widespread use. As a result, many open source Z39.50 based projects have faded away from lack of developer interest. Some continue based on work by a few dedicated supporters and may yet revive, but the scope of the SEARCH25 project limits us to packages which will need as little support as possible in the future. The exceptions are the real survivors, which now dominate the field.

These are largely divided by language: C and Perl on the one hand (represented by Index Data and applications built on Index Data software), Java on the other (represented by Knowledge Integration, as well as a few academic projects). Index Data and Knowledge Integration are both small companies, but have each survived successfully for many years. The current InforM25 search back-end is actually based on the Index Data YAZ client, combined with Perl wrapping packages from CPAN (most provided by Index Data, some by others). While Index Data has continued and strengthened its YAZ client over the years, support for some of the non-Index Data CPAN modules has fallen away, and the additional work required to build directly on YAZ meant that the option to continue this combination was not seriously considered. However, Index Data now supply a new product, Pazpar2, which will run parallel searches through the YAZ client. Pazpar2 will also search indexes, such as Solr or Zebra.

Knowledge Integration supply a similar product to Pazpar2, JzKit3, which works with a range of plugins, from Z39.50 through Solr to database search.

In terms of five of our criteria (dependencies, distribution, continuity, quality, and licensing) there were no obvious grounds for selecting one product over the other. Pazpar2 is available in Linux packages (both Debian and RedHat formats), which means it benefits from the automatic installation of dependencies provided by the packaging systems. JzKit3 is available with its dependencies as binary or source from a Subversion repository, and is built using the widely used Maven project management system. Both Pazpar2 and JzKit3 are supported by companies which have a history dating from the 90s. In both cases the code, to a casual inspection, seems to be well structured and organized but to have little informative commenting (the Java code has the edge here, due to Javadoc). Both are available under variants of the GPL.

The remaining three criteria relate to the community around the software. Here Index Data clearly has the edge. They provide mailing lists, which Knowledge Integration do not, and both company members and other developers will respond to queries. Index Data do however make it clear, as is reasonable, that paying clients have priority in responding to development requests. In spite of this there is a steady trickle of bug-fix updates to the software made available to all. JzKit3 in contrast seems not to have been updated since 2008. Finally, Index Data have made a large amount of documentation available for their products; the Knowledge Integration products (as of January 2011) include only some outlines of potential documentation. Given the above, as well as prior familiarity with Index Data products, the decision was made to adopt Pazpar2 as the back-end system.

The Front-end

Index Data have developed their own proprietary front-end for Pazpar2, MasterKey, which includes support for several of the aspects we needed in our own application. Unfortunately, as the project has no funding for purchasing software licences or external support, this could not be considered.

Pazpar2 itself comes with an interface library written in JavaScript, and a demonstration user-facing front end. There is also one known external front-end which communicates with Pazpar2 using this interface, mkdru, which was developed for the Drupal CMS by Index Data with funding from the Dansk Bibliotekscenter. Drupal is not specifically intended as library software, but is used as the front end for some library systems (notably the eXtensible Catalog, XC) and is used as a general purpose CMS in many academic libraries, particularly in the USA. Mkdru itself is a small interface consisting of only a few files, most of the work being done by the Pazpar2 JavaScript library.

Neither the demonstration interface nor mkdru is adequate for our needs as they currently stand. Expanding the demonstration interface would result in a completely new system which would be unsupported at the end of the project, contrary to the project goals. Mkdru plus Drupal would be closer, but still limited: we would need to expand mkdru to create a new Drupal module, which would in turn be unsupported. A better alternative might be to work on a Pazpar2 interface for the eXtensible Catalog Drupal Toolkit.

No existing open source discovery systems yet support Pazpar2. The most widely used open source discovery systems are currently VuFind, written in PHP, and Blacklight, written in Ruby and running on the Rails framework. A new version of VuFind (VuFind 2.0) is currently being created; this will run on the Zend framework. VuFind and Blacklight are otherwise similar, both supporting a range of library functions (recommendations, reviews, etc) as plugins in addition to their core search functions. However, neither of them has existing adapters to work with an asynchronous search system such as Pazpar2. A third system, Xerxes, was originally created to work with just such a system, the proprietary Metalib from ExLibris. The original Xerxes has few additional functions and is tied to Metalib alone. A new version of Xerxes (Xerxes 2.0) is now under development: like the new VuFind, this will be based on Zend, and include adapters for a wide range of library systems, including Metalib.
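The difficulty with an asynchronous back-end is that results keep arriving after the search has been submitted, so the front-end has to poll. The sketch below shows the general shape of such a loop in Python, using the init, search and show commands of the Pazpar2 web service and the activeclients count in the show response. The host, port, polling interval and the injected fetch function are assumptions for illustration; this is not the adapter that the project will produce.

```python
import time
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://localhost:9004/search.pz2"  # hypothetical host and port

def command_url(command, **params):
    """Build the URL for one Pazpar2 web service command."""
    return BASE_URL + "?" + urlencode({"command": command, **params})

def run_search(fetch, query, poll_interval=0.5, max_polls=20):
    """Start a search and poll until no targets are still working.

    fetch is any callable taking a URL and returning the response body as an
    XML string (for example a thin wrapper around urllib.request.urlopen).
    Returns the parsed XML of the last show response, or None if max_polls
    is zero.
    """
    session = ET.fromstring(fetch(command_url("init"))).findtext("session")
    fetch(command_url("search", session=session, query=query))
    show = None
    for _ in range(max_polls):
        show = ET.fromstring(
            fetch(command_url("show", session=session, start=0, num=20)))
        # A front end would redraw the partial result list at this point.
        if show.findtext("activeclients") == "0":
            break
        time.sleep(poll_interval)
    return show
```

A synchronous discovery system expects one request and one complete answer; the loop above is the extra behaviour that an adapter for such a system would have to hide or expose.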

In terms of our evaluation criteria for open source software there is little difference between these four systems: the XC Drupal Toolkit, VuFind, Blacklight, and Xerxes are all potential candidates for the front-end system. All have a good community-based pool of developers, are well documented, are GPL licensed, and have a good track record.

The fact that two (VuFind and Xerxes) are in the middle of a major rewrite is both a problem and an opportunity: a problem in that there is no absolute guarantee that a stable release will be ready in time to fit the SEARCH25 timetable; and an opportunity, in that this is the best time to submit new code to them which implements the SEARCH25 requirements but may be supported in future as part of the overall application.

The underlying framework is an additional consideration: Zend is currently being rewritten as Zend 2.0, and it is this version that both Xerxes and VuFind have chosen to use. The new Zend, however, is already in beta testing and seems quite stable enough to build on. The XC Drupal Toolkit is just beginning to be rewritten for Drupal 7. Blacklight is being modified to run on Rails 3.1/3.2. In each case changes in the framework imply additional work, but the SEARCH25 team should be largely buffered from these changes by the developer team for each application.

In the end, given the limited time available for experimentation, the decision was largely based on the experience of the SEARCH25 developer, who had previously used both VuFind and Xerxes. After discussion with the main developer for each application, it was concluded that Xerxes was a closer fit. Now that both Xerxes and VuFind are converging on the same framework, it is likely that it will not be too difficult to port an adapter for Pazpar2 from one to the other in any case. The immediate goal for the SEARCH25 team is to produce a Pazpar2 adapter and as many generic functions as possible for Xerxes, which may then become part of the standard Xerxes distribution. There will still be some features which are SEARCH25 specific; these will necessarily be unsupported extensions to Xerxes, but their extent will be minimized as far as possible.
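The split between an engine-specific adapter and generic functions can be sketched as follows. This is an illustration in Python only: Xerxes itself is written in PHP, and every class and function name here is hypothetical rather than taken from the Xerxes code base. A toy in-memory adapter stands in for a real Pazpar2 adapter.

```python
from abc import ABC, abstractmethod

class SearchAdapter(ABC):
    """What the front end needs from any search back-end (hypothetical)."""

    @abstractmethod
    def search(self, query):
        """Return a list of dicts, each with at least a 'title' key."""

class InMemoryAdapter(SearchAdapter):
    """Toy back-end used here in place of a real Pazpar2 adapter."""

    def __init__(self, records):
        self.records = records

    def search(self, query):
        q = query.lower()
        return [r for r in self.records if q in r["title"].lower()]

def results_page(adapter, query, page=1, per_page=10):
    """Generic front-end function: depends only on the adapter interface."""
    hits = adapter.search(query)
    start = (page - 1) * per_page
    return {"total": len(hits), "page": page,
            "records": hits[start:start + per_page]}
```

Code written like results_page is the kind that could be contributed to the standard distribution, because it does not care which back-end is answering; only the adapter class needs to know about Pazpar2.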


Footnotes

  1. Our criteria, defined in the SEARCH25 Technical Specification, are:

    1. Are the software dependencies easy to satisfy?
    2. How reliable is the distribution method for updates?
    3. Is the development team large enough to ensure continuity?
    4. What is the quality of the software?
    5. How responsive is the development team to feature requests and bug reports?
    6. How responsive is the community to requests for help?
    7. What is the quality of the documentation?
    8. Are there any licensing or trademark issues?

  2. Applications which were very relevant but failed to match sufficient criteria include:
    • DbWiz, a Perl front-end to the YAZ client
    • JAFER, a light-weight Z39.50 toolkit in Java
    • LibraryFind, a federated search system built on the YAZ client and running on Ruby on Rails
    • Sesat, a generic federated search toolkit written in Java and running on the Velocity framework

Technical Specification Delivered

We recently delivered our Technical Specification to JISC and the document is available. It contains a description of the system architecture that we are attempting to build, our plans for development and details of our methodology for evaluating suitable Open Source applications for use in the project.

We would be interested in any feedback, particularly if anyone has suggestions on what sorts of things should be logged to monitor and inform on system usage.