Library Without Walls Anticipates Its Next Generations

This article is one in a series of interviews BITS is conducting with CIC managers to get their views of the "big picture" as it relates to their work and the Laboratory mission. These people have also been asked to do a little forecasting as it applies to their business. BITS invites readers to join in the spirit of these interviews, treating the forecasts as a sort of informed speculation without holding anyone's "feet to the fire" to make the predictions come true.

 The Library Without Walls (LWW) of the future may enable researchers to search across databases, manipulate computer models from remote locations and archive the results, search indexed television news coverage, and model and simulate language and ideas as they do now with mathematical problems. These capabilities will be extensions of the ways researchers presently use the LWW.
 
 

The First Generation

In the March issue of BITS, Research Library Group Leader Rick Luce stated that enabling scientific collaboration is a prime focus of the project. The LWW is a framework connecting a number of powerful tools that make such collaboration more convenient for users. "In the first generation of LWW," Luce explains, "we delivered science citation databases on-line through the Web, we digitized LA reports (http://library.lanl.gov/cgi-bin/getfile?la-pubs.htm) and made them available in PDF form, and we established a Web interface to our on-line catalog (http://library.lanl.gov/catalog). By the end of this fiscal year we will be able to deliver about 900 of our 1,600 scientific journals to our customers' desktops (http://library.lanl.gov/ejournals/). By the end of the calendar year the LWW will have roughly 42 million scientific citation records (with duplicates in different databases)."

 The first generation of the LWW has taken about three years to establish. It includes four citation databases:
 
 

As a SciSearch subscriber, a researcher can create an individual profile that, in turn, initiates customized search strategies for checking the 18,000 new weekly citations added to the database. E-mail notification then alerts the researcher to items of pertinent interest: new papers, citations of the researcher's papers, or citations of other important papers in his/her field.
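The alerting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SciSearch implementation: the `Profile` and `Citation` classes, their fields, and the functions `match_reasons` and `weekly_alert` are all hypothetical names invented for the sketch. It shows a stored profile being checked against a week's new citations, with one alert message built from the matches.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    """One new citation record (hypothetical fields)."""
    title: str
    authors: list[str]
    cited_works: list[str]   # identifiers of the works this paper cites
    keywords: set[str]


@dataclass
class Profile:
    """A researcher's stored interests (hypothetical fields)."""
    researcher: str
    email: str
    keywords: set[str] = field(default_factory=set)
    own_papers: set[str] = field(default_factory=set)
    watched_papers: set[str] = field(default_factory=set)


def match_reasons(profile, citation):
    """Return the reasons, if any, that a citation is of interest to a profile."""
    reasons = []
    cited = set(citation.cited_works)
    if profile.keywords & citation.keywords:
        reasons.append("new paper on a topic of interest")
    if profile.own_papers & cited:
        reasons.append("cites one of your papers")
    if profile.watched_papers & cited:
        reasons.append("cites a paper you are watching")
    return reasons


def weekly_alert(profile, new_citations):
    """Build the text of one alert message, or return None if nothing matched."""
    lines = []
    for citation in new_citations:
        reasons = match_reasons(profile, citation)
        if reasons:
            lines.append(f"- {citation.title} ({'; '.join(reasons)})")
    if not lines:
        return None
    return f"To: {profile.email}\n" + "\n".join(lines)
```

In this sketch a researcher with no matches in a given week receives no message at all, which keeps the alerts limited to items of pertinent interest.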

 The Laboratory's expertise and reputation in computer security underlie its agreements with these major database publishers, which license their data to the Laboratory for on-line delivery through the Web. Leveraging this technology, the Research Library now has eight external institutions that subscribe to LWW databases. Some journals allow researchers to follow Web links from the database to an on-line copy of the article. In the future, however, the researcher should not have to guess which database to search, nor should he/she have to take the extra steps of going to the journal publisher, the journal, and the table of contents before going to the article.
 
 

The Second Generation

The LWW plans to make the database search path more useful by combining databases so the researcher begins a database search in only one place. In addition, the researcher should find that some half a million articles are linked directly to this megadatabase. The Laboratory is negotiating with publishers to obtain licenses to link additional electronic journal articles. Of course, the Laboratory has to pay for the privilege.

 In addition to these copyright negotiations, the LWW is challenged with technical problems in integrating the databases. Every publisher handles communication algorithms differently. Luce says, "Even though there is supposed to be a standard across databases for the terms and protocols used in making entries, it is rarely followed." In addition, there are semantic difficulties; databases may not use the same term for a given concept. A further challenge is the size of the integrated databases, which will create huge files. SciSearch alone comprises over 90 gigabytes of data to manage. Today the digital library data represent roughly 1.75 terabytes, and this is expected to double in the next 18 months. Even given these issues, Luce says, "'They' say it can't be done, but we are confident we will be able to integrate these databases in stages over the next two years or so."
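A rough sketch in Python may make the integration problems concrete. Everything in it is an assumption made for illustration: the database names `db_a` and `db_b`, their field names, the synonym table, and the rule of treating records with the same normalized title and year as duplicates. It is not the LWW's method. It shows records from two differently structured sources being mapped onto one schema, their subject terms reconciled, and duplicates merged.

```python
import re

# Hypothetical field names used by two databases, mapped onto one common schema.
FIELD_MAPS = {
    "db_a": {"TI": "title", "PY": "year", "DE": "terms"},
    "db_b": {"Title": "title", "Year": "year", "Subjects": "terms"},
}

# Hypothetical synonym table: different terms that denote the same concept.
PREFERRED_TERM = {
    "hts": "high-temperature superconductors",
    "high tc superconductors": "high-temperature superconductors",
}


def normalize(record, source):
    """Rename fields to the common schema and map terms to preferred forms."""
    mapping = FIELD_MAPS[source]
    out = {mapping[k]: v for k, v in record.items() if k in mapping}
    out["year"] = int(out["year"])
    out["terms"] = {PREFERRED_TERM.get(t.lower(), t.lower()) for t in out["terms"]}
    out["sources"] = {source}
    return out


def dedup_key(rec):
    """Assumed duplicate test: same title (ignoring case and punctuation) and year."""
    title = re.sub(r"[^a-z0-9]+", " ", rec["title"].lower()).strip()
    return (title, rec["year"])


def integrate(batches):
    """Merge records from all sources, combining duplicates into one entry."""
    merged = {}
    for source, records in batches.items():
        for raw in records:
            rec = normalize(raw, source)
            key = dedup_key(rec)
            if key in merged:
                merged[key]["terms"] |= rec["terms"]
                merged[key]["sources"] |= rec["sources"]
            else:
                merged[key] = rec
    return list(merged.values())
```

Even this toy version depends on hand-built mapping tables for every source, which suggests why integration in stages is the realistic approach when publishers do not follow a common standard.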

 Over the coming months, the LWW will increase the opportunities to browse electronic journals. From citation databases, the researcher will find some half a million articles on-line. Articles will also be available electronically through the library's on-line catalog and its traditional paths: author, title, subject, date, and so on. The LWW aims to maintain a rich browsing environment, i.e., as many pathways as possible from researcher to data.

 In the more immediate time frame, weekly alerting services will be added to BIOSIS at LANL and INSPEC at LANL. Recently, customers using the electronic services have been required to submit their Z numbers, and library personnel now use these data to plan future customer services. Requiring users to submit personal information raises issues of privacy and confidentiality; customers are assured that the Research Library will not share this information with anyone, either inside or outside the Laboratory.

 Also in the second generation, the LWW hopes to expand the collection of digitized LA reports and brochures. Researchers are encouraged to add to this database by submitting the articles they have published in journals, papers presented to conferences, and materials written for the public.
 
 

The Third Generation

"It's very clear to us in the library world that science is becoming more cross-disciplinary, so we need to look for relevant connections with a wider lens," Luce says. "Look how physics interrelates with bioscience, creating biophysics; people are moving between disciplines. The frontiers of new science are in the intersections between the disciplines." Over the next 10-20 years we should be able to visualize language, ideas, and concepts, to model these as mathematical constructs are modeled now. Then, when someone moves from one discipline to another, he/she can look at the models and see how the specialized terms map over from one discipline to another.

 Picture, if you will, a three-dimensional computer graphic showing mountains rising up on an island. The statistical algorithm used to create the mountains lays out the citations found for given terms along the horizontal (X) axis, and the height of each mountain along the vertical (Y) axis represents the frequency of citations for that term. The next sequence of the visualization tool names the concepts or terms of interest and maps them onto the graphic as labels on the mountains. The user can then select one of the mountains, say the one labeled "physics," and zoom into it with a few keystrokes to see a graphical representation of related terms overlying another set of mountains. The user can zoom in again on one of these, say the one labeled "biophysics," and so on. Users will then have the choice of going directly to a citation or list of citations.
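The data behind such a landscape could be as simple as a tree of terms with citation lists attached. The Python sketch below is a hypothetical illustration only, not a description of any LWW tool: `TermNode`, `landscape`, and `zoom` are invented names, and the choice to let a mountain's height include the citations of its related terms is an assumption. It shows how each zoom level would yield a set of labeled mountain heights and, at the bottom, a list of citations.

```python
from dataclasses import dataclass, field


@dataclass
class TermNode:
    """A term (one 'mountain'), its citations, and its related narrower terms."""
    label: str
    citations: list[str] = field(default_factory=list)
    children: dict[str, "TermNode"] = field(default_factory=dict)

    def height(self):
        """Assumed height: citations on this term plus those on terms beneath it."""
        return len(self.citations) + sum(c.height() for c in self.children.values())


def landscape(node):
    """Return (label, height) pairs for the mountains at this zoom level, tallest first."""
    peaks = ((child.label, child.height()) for child in node.children.values())
    return sorted(peaks, key=lambda pair: pair[1], reverse=True)


def zoom(node, *labels):
    """Follow a path of selected labels, e.g. zoom(root, 'physics', 'biophysics')."""
    for label in labels:
        node = node.children[label]
    return node
```

A display layer would draw the pairs returned by `landscape` as peaks; selecting a peak corresponds to calling `zoom` with its label, and the `citations` list of the node reached is what the user would be offered at the end.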

 But we're probably closer to achieving another multimedia model, that of having remote users access information on different servers, combine it, and create a new document. (See Figure 1.) In this model users can access a code for a simulation on one server anywhere in the world, combine it with an explanation of the experiment found on a document server somewhere else (technical report or journal article), and manipulate the code to create a new simulation. You then have, in two different places, two versions of the code, and other researchers can manipulate either one. Librarians are asking themselves, "How can we archive something like this that's dynamic?" And of course, there are computer hardware and software problems such as the amount of bandwidth used when these files are shipped from one remote server to another.

 Figure 1: Users will be able to access simulation code remotely, manipulate it, and create new simulations to be placed on widely distributed servers. Other researchers can access either the old or the new code remotely and manipulate it to create new simulations. Accessing dynamic forward and backward links and figuring out how to archive these different versions of a simulation, all in different places, are problems to be worked through.

 Also under the category of "blue sky" ideas, Luce muses on what it would be like to be able to search an indexed file of all the CNN footage of one day during the Gulf War without "fast forwarding" through an entire six hours of footage. He says that even without such multimedia, the physical and electronic space that science literature takes up doubles every five to six years. "It's due both to the 'publish-or-perish' mandate in the academic world, coupled with the fact that scientific communication is happening faster, and the fact that science itself is happening faster," he concludes.

 Any advances made in the management of knowledge will simply be tools to enable collaboration among researchers. The LWW will apply these tools in the management of its data to better serve its Laboratory customers and their collaborators around the world.

 Richard Luce is the Research Library Director at Los Alamos National Laboratory and the Project Leader of the Library Without Walls. Before coming to LANL, he was known nationally for his pioneering work in linking heterogeneous library systems in Colorado and Florida. Today, the Library Without Walls project is noted as one of the more pioneering and successful digital library efforts to date.

 Ann Mauzy, mauzy@lanl.gov, (505) 667-5387
Communications Arts and Services (CIC-1)