Overview of Digital Library Components and Developments


Edward A. Fox

Department of Computer Science, Virginia Tech

Blacksburg, VA 24061 USA - fox@vt.edu - http://fox.cs.vt.edu


Abstract

Digital libraries are being built upon a firm foundation of prior work as the high-end information systems of the future. A component approach is becoming popular, with well-established support for key components like the repository, especially through the Open Archives Initiative. Thus, we separately consider digital objects, metadata, harvesting, indexing, searching, browsing, rights management, linking, and powerful interfaces. Flexible integration will be possible through a variety of architectures, using buses, agents, and other technologies. The field as a whole is undergoing rapid growth, supported by advances in storage, processing, networking, algorithms, and interaction. There are many initiatives and developments, including those supporting education, and these will certainly be of benefit in Latin America.



1. Introduction


Digital libraries extend and integrate approaches adopted in traditional libraries, as well as in distributed information systems, to yield high-end information systems, services, and institutions. Here we explore some of the parts or components of digital libraries and discuss several of the developments in this emerging field.


Comprehensive digital libraries will help users manage all phases of the information lifecycle. This is illustrated in Figure 1, which summarizes much of the discussion of a US National Science Foundation funded workshop on Social Aspects of Digital Libraries (Borgman, 1996). It is particularly important to simplify the authoring and creation processes so that wider populations can participate, adding all types of multimedia content directly into digital libraries. Downstream access allows readers to benefit from this type of computer-mediated communication, across time and space. Ultimately, it is hoped that knowledge will be shared and then lead to additional cycles of discovery, authoring, and utilization that are facilitated by digital libraries.


Digital libraries are distinguished in that they afford services connected with each of the phases of the lifecycle. They integrate technologies from a variety of disciplines to help realize the designs articulated by early visionaries.


1.1 Foundations


Vannevar Bush was one of the first to clearly describe problems related to the modern explosion of information and to appeal to technology to help us meet our needs regarding scholarly communication (Bush, 1945). Twenty years later, Licklider painted a more complete picture, identifying the needs for better distributed-processing, human-computer interaction, document management, and retrieval (Licklider, 1965). Salton helped launch the modern era of automatic indexing and search through some 30 years of laboratory research (Salton, 1968). But real development of digital libraries per se only began in the early 1990s, drawing upon such visions, as well as statements of needs and requirements from prospective users (Fox et al., 1993b; Heath et al., 1995). Work on early projects like TULIP (Dougherty & Fox, 1995), ongoing efforts to reach consensus and establish standards like the Dublin Core Metadata Initiative (Dublin Core Community, 1999; Weibel, 1999), and several rounds of research funding (Lesk, 1999), have all helped lay a firm foundation as a clearer understanding of the scope of the field has emerged.



Figure 1. Information Life Cycle: Diagram from Workshop Report on Social Aspects of Digital Libraries, http://www-lis.gseis.ucla.edu/DL/



1.2 Definitions


The scope and definition of the field of digital libraries has been the subject of intensive debate, which is well summarized in (Borgman, 1999). Here we simply remind the reader of the integrative nature of the field through three definitions that show such combinations:


However, to help clarify our subsequent analysis, we add our own favorite definition, drawing upon the 5S framework (Fox, 1999b), with its 5 key constructs: societies, scenarios/services, spaces, structures, and streams. Thus, digital libraries are complex systems that:

  1. help satisfy information needs of users (societies),

  2. provide information services (scenarios),

  3. locate and present information in usable ways (spaces),

  4. organize information in usable ways (structures), and

  5. communicate information with users and computers (streams).


2. Components


To build digital libraries we must ensure that each of the "S" constructs is addressed, and so can use 5S as a checklist or guideline. In operational terms, however, many digital libraries are built out of components that are integrated into a production quality system. Figure 2 highlights some of the most important such components.




Figure 2. Components of a Digital Library


In the following subsections we explore issues and sub-components related to this figure.


2.1 Digital objects


The actual content of digital libraries is made up of a number of digital objects. In some cases these may be thought of as data sets (e.g., a table of results, the genomic information for an individual). In others they may be multimedia information, such as an image, graphic, animation, sound, musical performance, or video. Many can be thought of as documents, which carry content in some structure or structures, perhaps made up of logical or physical divisions such as sections or pages. Some of the objects will be "born digital", such as this paper, while others may be representations of some physical object (such as a painting that is shown through a digital image) that result from some type of digitization process. Thus, into the foreseeable future, digital libraries will be hybrid constructs, where paper, microforms, and other media carry much of the content that is of interest and only the metadata is in digital form.


2.2 Metadata


Digital objects are described, structured, summarized, managed, and otherwise manipulated in surrogate form through the use of "metadata", which literally means data about data. Three types of metadata are often distinguished: descriptive, structural, and administrative. Metadata is usually produced through a process called "cataloging" that is often carried out by trained librarians. Collections of such information are commonly stored in "catalogs". In computerized environments, metadata may be automatically or semi-automatically extracted or derived from the original content, or the "full-text" may simply be indexed and searched without involving metadata (as happens on the WWW when search engines are employed). Nevertheless, if metadata is available and can be used along with content terms derived from full-text documents, the result is better than if only one source of evidence is employed (Fox, 1983). Yet, if only metadata is available in computer form to describe a digital object, it must be used in digital libraries. Hence, metadata should be used whenever available in a digital library, and is an important aspect in many such systems.
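The three types of metadata can be pictured as parts of one record that stands in for a digital object. The following Python sketch is illustrative only: the class name MetadataRecord, its field names, and the sample values are our assumptions and do not come from any cataloging standard.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical record grouping the three commonly distinguished metadata types."""
    descriptive: dict = field(default_factory=dict)     # e.g., title, creator, subject
    structural: dict = field(default_factory=dict)      # e.g., sections, page order
    administrative: dict = field(default_factory=dict)  # e.g., file format, rights

    def surrogate(self) -> str:
        """Build a short textual surrogate from the descriptive part only."""
        return "; ".join(f"{k}: {v}" for k, v in sorted(self.descriptive.items()))


record = MetadataRecord(
    descriptive={"title": "Sample thesis", "creator": "A. Student"},
    structural={"sections": ["Introduction", "Results"]},
    administrative={"format": "application/pdf"},
)
summary = record.surrogate()
```

A system could index or display the surrogate even when the object itself is not available in digital form.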


2.3 Repositories and Harvesting


As can be seen in Figure 3, we can think of digital libraries as containing a collection of digital objects (DOs), each of which has one or more sets of metadata objects (MDOs) associated. This "repository" part of a digital library may, as is the case in the Open Archives Initiative (Van de Sompel, 2000), follow certain conventions (Van de Sompel & Lagoze, 2000). In particular, according to the latest specifications, an "Open Archive" (OA) is a computer system with a WWW server that behaves according to an OA protocol to allow other computers to harvest metadata from it. That protocol supports requests to, for example:




Figure 3. Open Archives Repository


In particular, every OA must be able to return MDOs that conform to the Dublin Core and are coded in XML. They must have documented policies about what DOs are included, what "archiving" is in place, and what terms and conditions apply to use, if any. Having a repository component that is an OA means that access to digital library content is open, ensuring widespread interoperability at one important level.
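As a rough illustration of what a harvester does with a returned MDO, the sketch below parses a Dublin Core record coded in XML using Python's standard library. The sample record, its values, and the wrapping record element are made up for this example; only the Dublin Core element namespace is taken from the Dublin Core specification, and nothing here reproduces the OA protocol itself.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

# Made-up record; the <record> wrapper element is an assumption for this sketch.
SAMPLE = f"""<record xmlns:dc="{DC_NS}">
  <dc:title>Sample Thesis</dc:title>
  <dc:creator>Student, A.</dc:creator>
  <dc:date>2000</dc:date>
  <dc:identifier>example:etd-1</dc:identifier>
</record>"""


def parse_dublin_core(xml_text: str) -> dict:
    """Collect Dublin Core elements into a dict of lists (elements may repeat)."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


record = parse_dublin_core(SAMPLE)
```

Values are kept as lists because Dublin Core elements are repeatable, for example when a work has several creators.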


At a slightly deeper level, interoperability among digital libraries requires that digital objects be accessible through some scheme for universal identifier names (e.g., URNs). MDOs thus have a URN that facilitates access (if authorized) to the corresponding DO. A likely policy for an OA is to use some specific type of URN (e.g., OCLC's PURL or CNRI's handle - also employed for Digital Object Identifiers, DOIs). In today's ubiquitous WWW environment, it is presumed that users and computers will be able to retrieve a desired DO if its identifier is known.
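The value of such names is that they stay fixed while the location of the DO may change. A minimal sketch of that indirection follows; the Resolver class, the name urn:example:etd-1, and the example.org locations are all hypothetical, and real PURL or handle services work through their own servers rather than an in-memory table.

```python
class Resolver:
    """Hypothetical resolver that maps persistent names to current locations."""

    def __init__(self):
        self._table = {}

    def register(self, name: str, location: str) -> None:
        """Record (or update) where the named DO can currently be retrieved."""
        self._table[name] = location

    def resolve(self, name: str) -> str:
        """Return the current location, or raise KeyError for an unknown name."""
        if name not in self._table:
            raise KeyError(f"unknown identifier: {name}")
        return self._table[name]


resolver = Resolver()
resolver.register("urn:example:etd-1", "http://repository.example.org/etd-1.pdf")
# The DO moves; only the resolver entry changes, while the name stored in the MDO stays the same.
resolver.register("urn:example:etd-1", "http://mirror.example.org/etd-1.pdf")
location = resolver.resolve("urn:example:etd-1")
```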


2.4 Rights Management


Considering again Figure 2, we note that the layer above data, multimedia information, and repositories is the "rights manager", which must protect intellectual property rights. In the trivial case, which fortunately is common, content is freely available so nothing is needed here. In some cases too, where content is encrypted, content management is outside the scope of the digital library, since secure objects are stored and retrieved and the steps of encryption and decryption occur remotely. Similarly, some content may have "watermarks" added in a way that makes removal difficult, so that subsequent access outside the digital library can be monitored or controlled.


However, as e-commerce, e-government, and other movements spread, it will be crucial for many digital libraries to manage rights. This typically involves a number of steps:

  1. The digital library should include policies and rules specifying the management required.

  2. The users of the digital library should be authenticated in some way so they are known.

  3. The content of the digital library should be shown to be authentic.

  4. Payment should be made if access requires that in a particular case.

  5. Users who are authorized to access a DO are allowed to do so.

  6. Subsequent use of the DO may take place after retrieval to a user's site.


IBM has developed sophisticated technology and digital library systems with support for rights management (Gladney, 1998). Xerox staff has developed a Digital Property Rights Language that relates to step 1 above. Password schemes and systems like Kerberos relate to step 2. Hashing with MD5 or use of digital signatures can support step 3. E-commerce mechanisms, especially those enabling micro-payments or subscriptions, relate to step 4. Rules processing relates to step 5, as was done at Case Western Reserve University - employing Prolog to work with users, user classes, documents, and document classes - in order to find the lowest cost access solution. In step 6, there usually is little control, unless a scheme like IBM's Cryptolope mechanism is involved wherein the DO is encapsulated with code that limits access (e.g., prohibits printing).
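Two of these steps are easy to sketch. The Python fragment below illustrates step 3 with an MD5 fingerprint and step 5 with a small rule table that finds the lowest cost access. It is not the Case Western Reserve University system, which employed Prolog; the rule table, the class names, and the costs are invented for illustration.

```python
import hashlib


def md5_digest(content: bytes) -> str:
    """Step 3: fingerprint content so later copies can be checked against it."""
    return hashlib.md5(content).hexdigest()


def is_authentic(content: bytes, recorded_digest: str) -> bool:
    """Compare the digest of a retrieved copy with the digest recorded earlier."""
    return md5_digest(content) == recorded_digest


# Step 5: hypothetical rules of the form (user class, document class, cost).
RULES = [
    ("student", "thesis", 0.0),
    ("public", "thesis", 0.0),
    ("public", "journal", 5.0),
    ("subscriber", "journal", 0.0),
]


def lowest_cost(user_classes: set, doc_classes: set):
    """Return the cheapest cost among matching rules, or None if no rule authorizes access."""
    costs = [cost for u, d, cost in RULES if u in user_classes and d in doc_classes]
    return min(costs) if costs else None


original = b"full text of a digital object"
digest = md5_digest(original)
```

MD5 appears here because the text names it; it is no longer considered collision-resistant, so a system built today would more likely use a stronger hash such as SHA-256 (hashlib.sha256).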


2.5 Indexing, Resource Discovery, Searching, and Retrieving


Considering Figure 2 further, we clearly must support finding DOs, directly or through MDOs, so that they can be identified, retrieved, and used. Often, DOs and/or MDOs are automatically indexed so that some index structure is built to speed up search. Such indexing may build upon any manual indexing carried out by authors, other creators, or indexers. Automatic indexing also may involve first classifying DOs, such as when OCLC's CORC project software suggests Dewey Decimal Classification entries for a WWW page that is being cataloged.


Indexes may be centralized or distributed. They may be two-level, allowing a resource discovery phase to proceed to find what source(s) should be included in the subsequent (lower) level search. Indexes also may have multiple parts, such as when a document has text, image, audio, or video content. Content-based indexing of multimedia information generally involves identifying and assessing features that characterize the DOs, whether they involve concepts, n-grams, words, keywords, descriptors, phonemes, textures, color histograms, eigenvalues, links, or user ratings.
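To make the two-level idea concrete, the following sketch builds a small inverted index for each of two sources, plus an upper-level index that records which sources contain each word. A search first consults the upper level (resource discovery) and then searches only the selected sources. The source names, documents, and function names are assumptions made for this example, and only whole words are indexed.

```python
from collections import defaultdict


def build_inverted_index(docs: dict) -> dict:
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical sources, each with its own (lower-level) index.
SOURCES = {
    "theses": {"t1": "digital library architecture", "t2": "agent communication"},
    "images": {"i1": "butterfly image collection", "i2": "digital image texture"},
}
lower = {name: build_inverted_index(docs) for name, docs in SOURCES.items()}

# Upper level: which sources contain a word at all (resource discovery).
upper = defaultdict(set)
for name, index in lower.items():
    for word in index:
        upper[word].add(name)


def two_level_search(term: str) -> dict:
    """Pick sources via the upper index, then search only those sources."""
    term = term.lower()
    return {name: sorted(lower[name][term]) for name in sorted(upper.get(term, ()))}
```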


Most commonly, searching in a digital library involves an information retrieval system or search engine. In some cases a database management system is used instead or underlies the retrieval system. In any case, retrieval will be more effective if a suitable scheme is used to combine the various types of evidence available (Belkin, Kantor, Fox, & Shaw, 1995), to indicate if a DO may be relevant with respect to the query that is used to express the user's information need.
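One simple way to combine evidence is to normalize the scores from each source and sum them per DO, a scheme often called CombSUM in the retrieval literature. The text does not say which combination scheme a given system uses, so the sketch below is only one possibility; the two score tables are made up.

```python
def normalize(scores: dict) -> dict:
    """Min-max scale scores to [0, 1] so that different sources are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_sum(*runs: dict) -> list:
    """Sum normalized scores per document and rank by the total (ties broken by id)."""
    totals = {}
    for run in runs:
        for doc, score in normalize(run).items():
            totals[doc] = totals.get(doc, 0.0) + score
    return sorted(totals, key=lambda doc: (-totals[doc], doc))


# Made-up scores from a metadata search and a full-text search for the same query.
metadata_run = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
fulltext_run = {"d2": 9.0, "d3": 6.0, "d4": 3.0}
ranking = combine_sum(metadata_run, fulltext_run)
```

Here d2 ranks first because it receives support from both sources, although neither the metadata search alone nor a DO found by only one source would have placed it there with the same margin.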


2.6 Linking, Annotating, and Browsing


Once a DO is found, it often is appropriate to follow links from it to cited works (or vice versa). Further, notes can be recorded as annotations and linked back to the works, so they can be recalled later or shared with colleagues as part of collaborative activities. If suitable clustering is in place, other DOs that are "near" a given work may be examined. Or, using a classification system appropriate for the content domain, users may browse around in "concept space" and link at any point between concepts and related DOs. Browsing also can proceed based on any of the elements in the MDO. Thus, dates, locations, publishers, contributing artists, language, and other aspects may be considered to explore the collection or refine a search.
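Browsing by MDO elements amounts to grouping records under the values of a chosen element, and refining a search amounts to filtering on one or more of those values. A minimal sketch, with made-up records and hypothetical function names:

```python
from collections import defaultdict

# Made-up metadata records; the element names follow the aspects named above.
RECORDS = [
    {"id": "r1", "language": "es", "publisher": "Press A", "date": "1998"},
    {"id": "r2", "language": "en", "publisher": "Press A", "date": "1999"},
    {"id": "r3", "language": "es", "publisher": "Press B", "date": "1999"},
]


def browse_by(records: list, element: str) -> dict:
    """Group record ids under each value of one metadata element."""
    groups = defaultdict(list)
    for rec in records:
        if element in rec:
            groups[rec[element]].append(rec["id"])
    return dict(groups)


def refine(records: list, **constraints: str) -> list:
    """Keep only records whose elements match every given constraint."""
    return [r for r in records
            if all(r.get(k) == v for k, v in constraints.items())]


by_language = browse_by(RECORDS, "language")
spanish_1999 = refine(RECORDS, language="es", date="1999")
```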


2.7 Interfaces and Interaction


Ultimately, users will connect through a human-computer interface and interact with the digital library, though in some cases the digital library may be an embedded system that is seen only indirectly (e.g., through a word processor that allows one to search for a quotation). Most commonly, a digital library has an interface for users to search, browse, follow links, retrieve, and read documents. As can be seen in Figure 4, that interface may be specialized according to what roles the user will play.




Figure 4. Users Direct


For example, the Computer Science Teaching Center (CSTC), a digital library of courseware about computing, requests that all users, except those just browsing or searching, log in to identify themselves. It then tailors the interface so that suitably authorized users can submit, review, or edit works. Further, CSTC encourages users to submit courseware they have developed that others might download, and supports their entering suitable metadata as well as uploading their applets, demonstrations, laboratory exercises, interactive multimedia training resources, etc. After the work has been entered by the creator, it can be improved through peer review and "certified" for public use, or even accepted for publication in the ACM Journal of Educational Resources in Computing (JERIC) (Cassel & Fox, 2000). Thus, instead of requiring the current complex and expensive chains of processing for journal submissions, handling with digital libraries may be implemented through this "users direct" model.


Following retrieval, users examine a list of results or else work with more sophisticated schemes for visualizing and managing results (Heath et al., 1995; Nowell & Hix, 1992; Nowell & Hix, 1993). Many other types of rich interaction are possible through innovative digital library interfaces (Rao et al., 1995).


In particular, as can be seen in Figure 2, digital libraries may use special software to present or render multimedia, hypertext, and hypermedia content. Such software may be launched from a WWW browser, which typically only supports a limited range of hypermedia. Real-time requirements must be satisfied to ensure adequate quality of service when streaming content is involved, especially if multiple streams are involved.


Also related to interaction with the digital library is the matter of workflow management. Especially with the "users direct" model, the various people interacting with a particular DO may leave some record of approval or review, which triggers others to continue with processing. For example, universities that are part of the Networked Digital Library of Theses and Dissertations (Fox, 1997; Fox, 2000; Fox, 1999c) allow students to upload their works, make changes, secure approval by the graduate school, have cataloging information added by librarians, and ultimately release the suitably enhanced information for widespread access.
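Such a workflow can be modeled as a small state machine in which each action is allowed only in a given state and only for a given role. The sketch below follows the thesis example loosely; the state names, role names, and the choice of who releases a work are our assumptions, not a description of any NDLTD member's software.

```python
# (current state, action) -> (role allowed to act, next state); all names hypothetical.
TRANSITIONS = {
    ("uploaded", "revise"): ("student", "uploaded"),
    ("uploaded", "approve"): ("graduate_school", "approved"),
    ("approved", "catalog"): ("librarian", "cataloged"),
    ("cataloged", "release"): ("librarian", "released"),
}


class Submission:
    """A work moving through review; each step leaves a record that enables the next."""

    def __init__(self, title: str):
        self.title = title
        self.state = "uploaded"
        self.history = []  # (role, action) pairs, in the order they occurred

    def act(self, role: str, action: str) -> None:
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not allowed in state {self.state!r}")
        required_role, next_state = TRANSITIONS[key]
        if role != required_role:
            raise PermissionError(f"{role!r} may not {action!r}")
        self.history.append((role, action))
        self.state = next_state


etd = Submission("Sample thesis")
for role, action in [("student", "revise"), ("graduate_school", "approve"),
                     ("librarian", "catalog"), ("librarian", "release")]:
    etd.act(role, action)
```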


Finally, from Figure 2 we see that many digital libraries will have their own user interfaces. In addition, they may support gateway connections, using protocols like Dienst (Lagoze & Davis, 1995) or Z39.50 (International Standard Maintenance Agency Z39.50, 2000). As we work toward world digital libraries, we must ensure that our interfaces support multimedia and multilingual access, as well as various schemes for interconnection with other digital libraries and resources.


2.8 Architectures and Interconnecting


Since the field of digital libraries is young, there still is active investigation regarding architecture, interconnection, and interoperability (Paepcke, Chang, Garcia-Molina, & Winograd, 1998). Figure 2 shows one, rather high-level, decomposition of a digital library into components. Given the range of legacy systems that are used today as parts of digital libraries, the actual situation often is more complex.


To simplify matters, several interconnection strategies have been explored. At Stanford, a bus approach has been used (Baldonado, Chang, Gravano, & Paepcke, 1997; Melnik, Garcia-Molina, & Paepcke, 2000; Paepcke, 1999). Mediation code "wraps" around various collections or resources to make suitable conversions to representations supported by the bus and the other services connected to it.
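The wrapping idea can be sketched in a few lines: each legacy collection keeps its own native interface, and a wrapper converts its results to the one representation that the bus understands. This is a toy illustration under our own assumptions, not the Stanford implementation; all class names and the common record format (a dict with source and title) are hypothetical.

```python
class TupleCollection:
    """Hypothetical legacy source that stores (title, year) tuples."""

    def __init__(self, rows):
        self.rows = rows

    def find(self, word):
        return [r for r in self.rows if word.lower() in r[0].lower()]


class KeyedCollection:
    """Hypothetical legacy source that stores records as {'name': ...} dicts."""

    def __init__(self, items):
        self.items = items

    def lookup(self, word):
        return [i for i in self.items if word.lower() in i["name"].lower()]


class TupleWrapper:
    """Mediation code: converts tuple results to the common representation."""

    def __init__(self, collection):
        self.collection = collection

    def search(self, query):
        return [{"title": title} for title, _year in self.collection.find(query)]


class KeyedWrapper:
    """Mediation code: converts keyed results to the common representation."""

    def __init__(self, collection):
        self.collection = collection

    def search(self, query):
        return [{"title": item["name"]} for item in self.collection.lookup(query)]


class Bus:
    """Routes one query to every attached wrapper and labels results by source."""

    def __init__(self):
        self.wrappers = {}

    def attach(self, name, wrapper):
        self.wrappers[name] = wrapper

    def search(self, query):
        results = []
        for name, wrapper in self.wrappers.items():
            for rec in wrapper.search(query):
                results.append({"source": name, **rec})
        return results


bus = Bus()
bus.attach("theses", TupleWrapper(TupleCollection(
    [("Digital Library Design", 1999), ("Agent Systems", 1998)])))
bus.attach("courseware", KeyedWrapper(KeyedCollection(
    [{"name": "Digital Logic Lab"}])))
hits = bus.search("digital")
```

Adding a new collection then requires only a new wrapper; the bus and the services connected to it are unchanged.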


Agents provide another interconnection mechanism (Birmingham, 1995; Nicholas, Crowder, & Soboroff, 2000; Sánchez, Flores, & Schnase, 1999; Sánchez, Leggett, & Schnase, 1997; Sánchez & Leggett, 1997; Sánchez, Lopez, & Schnase, 1998b). Many agent-based systems use KQML as the language for transporting knowledge constructs (Barceinas, Sánchez, & Schnase, 1998). Yet another approach is the distributed scheme supporting federated search that underlies the Dienst system (Lagoze & Davis, 1995; Lagoze & Payette, 1998; Payette, Blanchi, Lagoze, & Overly, 1999).
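To give a flavor of agent communication, the sketch below renders a message in the general KQML style, that is, a performative followed by keyword parameters inside parentheses. It is a simplified illustration: the helper function, the agent names, and the content expression are made up, and no parsing or validation against the KQML specification is attempted.

```python
def kqml_message(performative: str, **params: str) -> str:
    """Render a KQML-style message of the form (performative :key value ...)."""
    parts = [performative]
    for key, value in params.items():
        # Python identifiers use underscores; KQML keywords use hyphens.
        parts.append(f":{key.replace('_', '-')} {value}")
    return "(" + " ".join(parts) + ")"


# Hypothetical exchange: a user agent asks a collection agent for one matching work.
query = kqml_message(
    "ask-one",
    sender="user-agent",
    receiver="collection-agent",
    reply_with="q1",
    content='"(title ?doc digital-libraries)"',
)
reply = kqml_message(
    "tell",
    sender="collection-agent",
    receiver="user-agent",
    in_reply_to="q1",
    content='"(title doc-42 digital-libraries)"',
)
```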


Nevertheless, currently it is not possible to identify the best architecture(s) for digital libraries. We must look to future technological developments and actual deployments as well as standardized performance testing to resolve such questions.


3. Developments


Since the early 1990s, the digital library field has emerged as an important area of research and development. There are hundreds of projects and thousands of reports/publications describing them. One of the active research programs is in the USA, funded by the National Science Foundation. A summary of that Digital Libraries Initiative makes clear the broad scope of work underway (Fox, 1999a). NSF has deliberately selected a diverse set of content areas, genres, media, and user communities in an effort to rapidly develop the field (Lesk, 1999). In the following sections we explore the overall process of such development.


3.1 Technology


Work on digital libraries has been facilitated by technical advances in many areas. For the first time, storage systems are readily affordable that can handle enormous text collections, very large image collections, and large audio or video collections. Fast processors, supercomputers, cluster-computers, networks of workstations, and other computational aids have provided ample processing capacity to handle user communities operating on a global scale. Increases in network speeds and bandwidth have made it possible to build distributed systems that perform well and have high reliability. Computationally expensive algorithms have been refined so that useful techniques such as LSI can help with multilingual retrieval and other applications (Dumais, 1998). High-end graphics systems and virtual environments also have evolved to be usable for information visualization as well as interfacing with digital libraries (Das Neves & Fox, 2000). Representation schemes like PDF and XML have made digital documents easy to produce and share, facilitating information interchange and encouraging further digitization.


An example of the effects of technology can be seen with regard to the MARIAN digital library system developed at Virginia Tech (Can, Fox, Snavely, & France, 1995; Fox, France, Sahle, Daoud, & Cline, 1993a; France, 2000; Zhao, 1999). Over the last decade, coding has switched from C to C++ to Java. Hardware has switched from mainframes to minis to PCs. Current work to enhance performance includes development of new algorithms to manipulate inverted files (Sornil, 2000) on the Virginia Tech PetaPlex system, which has 100 processors and 2.5 terabytes of disk storage capacity. Earlier studies of performance (Zhao, 1999) demonstrated that the architecture is scalable, and showed that, for example, changing some of the internal communication from TCP to UDP would lead to substantive improvement.


3.2 Economic, Social, and Legal Issues


According to the 5S framework, the topmost level deals with "societies". Though technology has made possible many advances in digital libraries, all such efforts are situated in a social context, as can be seen in Figure 1. Many of the key social issues were identified in an NSF-funded workshop on this topic (Borgman, 1996). A good explanation of social issues in constructing and evaluating digital libraries appears in (Kilker & Gay, 1998).


Underlying work on the Networked Digital Library of Theses and Dissertations (discussed further in Chapter 5) is a strong social and educational rationale, to prepare the next generation of scholars for the Information Age (Fox, 1997; Fox, 1998a; Fox et al., 1996; Fox, Hall, & Kipp, 1997a; Fox et al., 1997b). Its aims - of encouraging discussion about intellectual property rights among students and faculty, of building awareness and infrastructure about digital libraries on campuses, and of developing a new genre for communication among graduate students and researchers - are largely being met. The impact of economics is remarkable, in that making works available for free leads to hundreds or thousands of downloads per work per year. This contrasts with interlibrary loan or buying copies for roughly $50, which very rarely led to more than 5 accesses per year.


Social and technical issues often relate. For example, the culture and social atmosphere determine how electronic theses and dissertations will be managed on a particular campus, or across broader boundaries involving states or nations. Universities with advanced infrastructure, like Caltech, MIT, and Virginia Tech, have their own services. On the other hand, there are regional/national projects associated with NDLTD in Ohio, Catalunya, Australia, Germany, India, Portugal, and South Africa. Because of this arrangement, a federated search mechanism was implemented (Powell & Fox, 1998), though future plans call for use of Open Archives and regular harvesting. Groups that own content or have a tradition of managing it can continue to do so, while at the same time technical approaches can allow comprehensive search across such distributed collections.


Legal issues also become more visible with digital libraries. There often was little concern over copyright issues when preparing a dissertation that would largely go unread, sitting on a shelf in a local library. However, with the potential of thousands of downloads from around the world, authors must be very careful not to include a copyrighted image or other content without permission.


Other economic, social, and legal issues have come to the fore with digital libraries. In many states, like Virginia, a single university library can run a service for the whole state, further extending the benefits of volume purchasing. On the other hand, many such services involve a contract with an information provider that often has a complex set of terms and conditions, meaning that libraries now require more legal counsel.


Digital libraries allow new groups to assemble around new collections of interest. In many countries, digital libraries show promise regarding preserving cultural, historic, and linguistic records. In a number of situations, they have the potential of aiding economic development by supporting localized and distance education. We explore such cases among those discussed in the next subsection.


3.3 Initiatives and Projects


There are hundreds of digital library efforts underway around the world. A good place to find out about many of these is in D-Lib Magazine (Arms, 2000a), which by the end of 2000 has about 1000 entries.


As mentioned earlier, the National Science Foundation in the USA has funded the Digital Libraries Initiative, with awards granted in 1994, 1998, and 1999. Well over $50M has gone to a wide range of research projects. NSF support to Stanford University through this program has led in part to the appearance of Google, one of the most popular search services on the WWW. Other technical advances have led to methods for processing and searching collections of images, music, and video. A great expansion in work on geographic information systems and other spatial data has also resulted. Microsoft's TerraServer (http://terraserver.microsoft.com/) demonstrates how very large data collections can serve huge user communities in spite of using relatively modest hardware resources.


The museum community has demonstrated methods of accessing distributed cultural heritage information (Moen, 1998). Sacred and/or precious resources, including antiquities, have become available from sources such as the Vatican (Gladney, Mintzer, Schiattarella, Bescos, & Treu, 1998; Mintzer et al., 1996). Special collections, such as of butterflies (Hong, Chen, & Hsiang, 2000) or floristic information (Amavizca, Sánchez, & Abascal, 1999; Sánchez, Fernández, & Schnase, 1998a; Sánchez et al., 1999; Sánchez et al., 1998b), have become accessible through digital libraries.


In some ways the broadest and most influential digital library efforts have to do with education. For example, in the USA, after years of study, it was decided that NSF should support development of a digital library to support learning by students (Arms, 1999). This has led to over $50M committed to the National Science (and Mathematics, Engineering, and Technology Education) Digital Library, which should be launched by the end of 2002 (www.smete.org, 2000). Many thousands of teachers, and millions of students, will benefit from this resource as it aims to enhance learning through dynamic interaction, visualization, simulation, and other computer-related devices.



4. Conclusion


In conclusion, we try to put digital libraries in perspective by answering key questions.


Why? We build digital libraries for many reasons. They can help us preserve our linguistic, literary, historical, and cultural heritage. They make access simpler and cheaper. They lower the costs of disseminating information. They help us establish new communities around new collections that can now become available. They support teaching and learning, especially in the context of distance or lifelong learning. They allow rich media types to be included and managed effectively. They encourage authors to create and share, and others to collaborate and quickly build on newly discovered knowledge.


How? Digital libraries provide services comparable to or better than those of traditional libraries, at relatively low cost, using existing networks and tools like those used for the WWW. They are built in most cases upon prior technologies, such as library automation packages or records management systems. Research systems like MARIAN and Dienst may become more widely deployed. Certainly, lightweight software such as that used for Open Archives will be commonly used. Connecting a URN scheme, a repository mechanism, search engine(s), and a WWW interface can lead to a relatively simple system in short order.


What? All types of content will be available through digital libraries. They will become the high-end of the information systems world, not only including bibliographic and full-text content but also images, music, and video. Included will be medical images, images of scanned pages, engineering drawings, satellite images, educational videos, courseware, oral histories, 3D renderings of museum objects, musical performances, etc.


Where? Digital libraries are appearing all around the world. DLs are managed by universities, publishers, government agencies, and public libraries. Some companies are shifting to use them for records and reports as well as for customer access. In the context of this booklet, we are assured that they will be widely used throughout Latin America!



5. Resource List


There is a great deal of information available regarding digital libraries. An easy way to get started is to consult our online courseware (Fox, 1998b). In particular, follow links to Resources and to References (and thence to Repositories for a set of other Web sites to consider).


Early events in the field are summarized in a 1993 Sourcebook (Fox, 1993). A handy reference work for those building digital libraries is an extensive white paper prepared for Sun Microsystems (Noerr, 2000). The first two single-author books providing an overview appeared in 1997 (Lesk, 1997) and 2000 (Arms, 2000b). D-Lib Magazine (Arms, 2000a) is an online publication to which many in the field send news, early results, or helpful summaries. There is one journal (Springer-Verlag, 2000), but numerous special issues on the topic have appeared in other journals (Chen & Fox, 1996; Fox, 1999a; Fox, Akscyn, Furuta, & Leggett, 1995; Fox & Lunin, 1993; Fox & Marchionini, 1998; Marchionini & Fox, 1999). Numerous conferences have been sponsored by ACM (Fox & Marchionini, 1996; Fox & Rowe, 1999) or by other organizations. In 2001 the first ACM / IEEE-CS joint conference will take place (Borgman & Fox, 2001). There also are annual regional conferences in Europe and Asia, in addition to national events in countries like Japan and Russia. It is hoped that the reader will explore these various resources further, and participate in some of the many workshops and conferences.



6. References


Amavizca, M., Sánchez, J. A., & Abascal, R. (1999). 3DTree: Visualization of large and complex information spaces in the Floristic Digital Library, Proceedings of Segundo Encuentro de Computación (ENC'99). Pachuca, Hidalgo, México.

Arms, W. (1999). Report of the NSF Science, Mathematics, Engineering, and Technology Education Library Workshop, July 21-23, 1998 (NSF 99-112): National Science Foundation, Division of Undergraduate Education. [Online at]: http://www.dlib.org/smete/public/report.html

Arms, W. (2000a). D-Lib Magazine. [Online at]: http://www.dlib.org

Arms, W. Y. (2000b). Digital Libraries. Cambridge, MA: MIT Press.

Baldonado, M., Chang, C.-C. K., Gravano, L., & Paepcke, A. (1997). The Stanford Digital Library Metadata Architecture. International Journal on Digital Libraries, 1(2), 108-121.

Barceinas, A., Sánchez, J. A., & Schnase, J. L. (1998). MICK: A KQML inter-agent communication framework in a digital library, Memorias del Simposium Internacional de Computación (CIC'98) (pp. 66-79). Mexico City.

Belkin, N. J., Kantor, P., Fox, E. A., & Shaw, J. A. (1995). Combining the Evidence of Multiple Query Representations for Information Retrieval. Information Processing and Management, 31(3), 431-448.

Birmingham, W. (1995). University of Michigan Digital Library Project. [Homepage online at]: http://http2.sils.umich.edu/UMDL/

Borgman, C. (1996). Social Aspects of Digital Libraries (NSF Workshop Report). Los Angeles: UCLA. Feb. 16-17, 1996. [Online at]: http://www-lis.gseis.ucla.edu/DL/

Borgman, C., & Fox, E. A. (2001). Proceedings of the First Joint Conference on Digital Libraries. June 24-28, 2001. New York: ACM Press. [Online at]: http://www.jcdl.org

Borgman, C. L. (1999). What are digital libraries? Competing visions. Information Processing and Management, 35(3), 227-243.

Bush, V. (1945). As We May Think. Atlantic Monthly, 176, 101-108.

Can, F., Fox, E., Snavely, C., & France, R. (1995). Incremental Clustering for Very Large Document Databases: Initial MARIAN Experience. Information Sciences, 84, 101-114.

Cassel, L., & Fox, E. A. (2000). ACM Journal of Education Resources in Computing. [Online at]: http://purl.org/net/JERIC/

Chen, S., & Fox, E. A. (1996). Guest Editors' Introduction to Special Issue on Digital Libraries. Journal of Visual Communication and Image Representation, 7, 1.

Das Neves, F. A., & Fox, E. A. (2000). A study of user behavior in an immersive Virtual Environment for digital libraries, Proceedings of the Fifth ACM Conference on Digital Libraries: DL '00, June 2-7, 2000, San Antonio, TX (pp. 103-111). New York: ACM Press.

Dougherty, W. C., & Fox, E. A. (1995). TULIP at Virginia Tech. Library Hi Tech, 13(4), 54-60.

Dublin Core Community. (1999). Dublin Core Metadata Initiative. [Online at]: http://purl.org/dc

Dumais, S. T. (1998). References to LSI Papers. [Online at]: http://superbook.bellcore.com/~std/lsiPapers.html

Fox, E. (1997). Networked Digital Library of Theses and Dissertations: An International Collaboration Promoting Scholarship. ICSTI Forum, Quarterly Newsletter of the International Council for Scientific and Technical Information, 26, 8-9. [Online at]: http://www.icsti.org/icsti/forum/fo9711.html#ndltd

Fox, E. (1999a). The Digital Libraries Initiative: Update and Discussion: Guest editor's introduction to Special Section. Bulletin of the American Society for Information Science, 26(1), 7-11.

Fox, E. (2000). NDLTD: Networked Digital Library of Theses and Dissertations. [Online at]: http://www.ndltd.org

Fox, E., France, R., Sahle, E., Daoud, A., & Cline, B. (1993a). Development of a Modern OPAC: From REVTOLC to MARIAN, Proceedings 16th Annual International ACM SIGIR Conference on R&D in Information Retrieval, SIGIR '93 (pp. 248-259). Pittsburgh: ACM Press.

Fox, E. A. (1983). Some Considerations for Implementing the SMART Information Retrieval System under UNIX (Technical report TR 83-560). Ithaca, NY: Cornell University, Computer Science Department.

Fox, E. A. (1993). Sourcebook on Digital Libraries: Report for the National Science Foundation. (Technical Report TR-93-35). Blacksburg, VA: Dept. of Computer Science, Virginia Tech. [Online at]: fox.cs.vt.edu

Fox, E. A. (1998a). Digital Libraries: Preparing the Next Generation of Scholars. Paper presented at NFAIS'98, Philadelphia, PA. Feb. 24, 1998. [Online at]: http://www.ndltd.org/talks/NFAIS98.ppt

Fox, E. A. (1998b). Digital Library Courseware. Virginia Tech, Department of Computer Science: Blacksburg, VA. [Online at]: http://ei.cs.vt.edu/~dlib/

Fox, E. A. (1999b). The 5S Framework for Digital Libraries and Two Case Studies: NDLTD and CSTC, Proceedings NIT'99. Taipei, Taiwan. [Online at]: http://www.ndltd.org/pubs/nit99fox.doc

Fox, E. A. (1999c). Networked Digital Library of Theses and Dissertations, Proceedings DLW15. Japan: ULIS. [Online at]: http://www.ndltd.org/pubs/dlw15.doc

Fox, E. A., Akscyn, R., Furuta, R., & Leggett, J. (1995). Guest Editors' Introduction to Digital Libraries. Communications of the ACM, 38, 22-28.

Fox, E. A., Eaton, J., McMillan, G., Kipp, N., Weiss, L., Arce, E., & Guyer, S. (1996). National Digital Library of Theses and Dissertations: A Scalable and Sustainable Approach to Unlock University Resources. D-Lib Magazine, 2. [Online at]: http://www.dlib.org/dlib/september96/theses/09fox.html

Fox, E. A., Hall, R., & Kipp, N. (1997a). NDLTD: Preparing the Next Generation of Scholars for the Information Age. The New Review of Information Networking (NRIN), 3, 59-76. [Online at]: http://www.ndltd.org/pubs/nrin.pdf

Fox, E. A., Hall, R., Kipp, N. A., Eaton, J. L., McMillan, G., & Mather, P. (1997b). NDLTD: Encouraging International Collaboration in the Academy. Special Issue on Digital Libraries of DESIDOC Bulletin of Information Technology (DBIT), 17(6), 45-56. [Online at]: http://www.ndltd.org/pubs/dbit.pdf

Fox, E. A., Hix, D., Nowell, L., Brueni, D., Wake, W., Heath, L., & Rao, D. (1993b). Users, User Interfaces, and Objects: Envision, a Digital Library. Journal of the American Society for Information Science, 44, 480-491.

Fox, E. A., & Lunin, L. (1993). Introduction and Overview to Perspectives on Digital Libraries; guest editor's introduction to special issue. Journal of the American Society for Information Science, 44, 441-443.

Fox, E. A., & Marchionini, G. (1996). Proceedings of the First ACM International Conference on Digital Libraries, DL '96. Bethesda, MD; New York: ACM.

Fox, E. A., & Marchionini, G. (1998). Toward a Worldwide Digital Library; Guest Editors' Introduction to Special Section on Digital Libraries: Global Scope, Unlimited Access. Communications of the ACM, 41, 28-32. [Online at]: http://purl.lib.vt.edu/dlib/pubs/CACM199804

Fox, E. A., & Rowe, N. (1999). Proceedings of The Fourth ACM Conference on Digital Libraries, DL '99. Berkeley: ACM.

France, R. K. (2000). MARIAN Digital Library Information System. [Online at]: http://www.dlib.vt.edu/products/marian.html

Gladney, H. M. (1998). Safeguarding Digital Library Contents and Users: Interim Retrospect and Prospects. D-Lib Magazine, 4(7). [Online at]: http://www.dlib.org/dlib/july98/gladney/07gladney.html

Gladney, H. M., Mintzer, F., Schiattarella, F., Bescos, J., & Treu, M. (1998). Digital Access to Antiquities. Communications of the ACM, 41, 49-57.

Heath, L., Hix, D., Nowell, L., Wake, W., Averboch, G., & Fox, E. A. (1995). Envision: A User-Centered Database from the Computer Science Literature. Communications of the ACM, 38, 52-53.

Hong, J.-S., Chen, H.-Y., & Hsiang, J. (2000). A Digital Museum of Taiwanese Butterflies, Proceedings of the Fifth ACM Conference on Digital Libraries: DL '00, June 2-7, 2000, San Antonio, TX (pp. 260-261). New York: ACM Press. [Online at]: http://digimuse.nmns.edu.tw

International Standard Maintenance Agency Z39.50. (2000). International Standard Maintenance Agency Z39.50. The Library of Congress Network Development and MARC Standards Office, Washington, DC. [Online at]: http://lcweb.loc.gov/z3950/agency/, September 9, 2000

Kilker, J., & Gay, G. (1998). The Social Construction of a Digital Library: A Case Study Examining Implications for Evaluation. Information Technology and Libraries, 17, 60-70. [Online at]: http://www.lita.org/ital/ital1702.htm

Lagoze, C., & Davis, J. R. (1995). Dienst: An Architecture for Distributed Document Libraries. Communications of the ACM, 38, 47.

Lagoze, C., & Payette, S. (1998). An Infrastructure for Open-Architecture Digital Libraries (TR98-1690): Cornell University, Computer Science.

Lesk, M. (1997). Practical Digital Libraries: Books, Bytes and Bucks. San Francisco: Morgan Kaufmann Publishers.

Lesk, M. (1999). Perspectives on DLI-2 - Growing the Field. D-Lib Magazine, 5(7/8). [Online at]: http://www.dlib.org/dlib/july99/07lesk.html

Licklider, J. C. R. (1965). Libraries of the Future. Cambridge, MA: MIT Press.

Marchionini, G., & Fox, E. A. (1999). Progress toward digital libraries: Augmentation through integration; Guest Editor's Introduction to Special Issue on Digital Libraries. Information Processing and Management, 35, 219-225.

Melnik, S., Garcia-Molina, H., & Paepcke, A. (2000). A Mediation Infrastructure for Digital Library Services, Proceedings of the Fifth ACM Conference on Digital Libraries: DL '00, June 2-7, 2000, San Antonio, TX. New York: ACM Press.

Mintzer, F., Boyle, L., Cazes, A., Christian, B., Cox, S., Giordano, F., Gladney, H., Lee, J., Kelmanson, M., Lirani, A., Magerlein, K., Pavani, A., & Schiattarella, F. (1996). Towards online worldwide access to Vatican Library materials. IBM Journal of Research and Development, 40, 139-162.

Moen, W. E. (1998). Accessing Distributed Cultural Heritage Information. Communications of the ACM, 41, 45-48.

Nicholas, C., Crowder, G., & Soboroff, I. (2000). CARROT: An Agent-Based Architecture for Large-Scale Document Information Systems (TR CS-2000-01). Baltimore: UMBC.

Noerr, P. (Ed.). (2000). The Digital Library Toolkit (2nd ed.). Palo Alto, CA: Sun Microsystems, Inc.

Nowell, L., & Hix, D. (1992). User interface design for the project Envision database of computer science literature. Twenty-second Annual Virginia Computer Users Conference (pp. 29-33). Blacksburg, VA.

Nowell, L., & Hix, D. (1993). Visualizing search results: User interface development for the project Envision database of computer science literature. Advances in Human Factors/Ergonomics, Proceedings of HCI International '93, 5th International Conference on Human Computer Interaction (Vol. 19B, Human-Computer Interaction: Software and Hardware Interfaces, pp. 56-61). Elsevier.

Paepcke, A. (1999). Using the InfoBus. Palo Alto: Stanford University Digital Libraries Project. [Online at]: http://www-diglib.stanford.edu/diglib/pub/userinfo.html

Paepcke, A., Chang, C.-C. K., Garcia-Molina, H., & Winograd, T. (1998). Interoperability for Digital Libraries Worldwide. Communications of the ACM, 41, 33-43.

Payette, S., Blanchi, C., Lagoze, C., & Overly, E. A. (1999). Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments. D-Lib Magazine, 5(5). [Online at]: http://www.dlib.org/dlib/may99/payette/05payette.html

Powell, J., & Fox, E. (1998). Multilingual Federated Searching Across Heterogeneous Collections. D-Lib Magazine, 4(8). [Online at]: http://www.dlib.org/dlib/september98/powell/09powell.html

Rao, R., Pedersen, J. O., Hearst, M. A., Mackinlay, J. D., Card, S. K., Masinter, L., Halvorsen, P.-K., & Robertson, G. G. (1995). Rich Interaction in the Digital Library. Communications of the ACM, 38, 29-39.

Salton, G. (1968). Automatic Information Organization and Retrieval. New York: McGraw-Hill.

Sánchez, J. A., Fernández, L., & Schnase, J. L. (1998a). Agora: Enhancing awareness and collaboration in floristic digital libraries, Proceedings of the Fourth CYTED-RITOS Workshop on Groupware (CRIWG'98). Rio de Janeiro: CYTED.

Sánchez, J. A., Flores, C. A., & Schnase, J. L. (1999). Mutant: Agents as guides for multiple taxonomies in the Floristic Digital Library, Proceedings of the Fourth ACM Conference on Digital Libraries (DL'99) (pp. 244-245). Berkeley: ACM.

Sánchez, J. A., Leggett, J. J., & Schnase, J. L. (1997). AGS: Introducing agents as services provided by digital libraries, Proceedings of the Second ACM International Conference on Digital Libraries (DL'97) (pp. 75-82). Philadelphia: ACM.

Sánchez, J. A., & Leggett, J. J. (1997). Agent services for users of digital libraries. Journal of Network and Computer Applications, 20, 45-58.

Sánchez, J. A., Lopez, C. A., & Schnase, J. L. (1998b). An agent-based approach to the construction of floristic digital libraries, Proceedings of the Third ACM International Conference on Digital Libraries (DL'98) (pp. 210-216). Pittsburgh: ACM.

Sornil, O. (2000). A Distributed Inverted Index for a Large-Scale, Dynamic Digital Library. Unpublished Ph.D. Dissertation Draft, Virginia Tech, Blacksburg, VA: Department of Computer Science.

Springer-Verlag. (2000). International Journal on Digital Libraries, Springer-Verlag. [Online at]: http://link.springer-ny.com/link/service/journals/00799/

Van de Sompel, H. (2000). Open Archives Initiative. WWW site: Universiteit van Ghent: OAI Group. [Online at]: http://www.openarchives.org

Van de Sompel, H., & Lagoze, C. (2000). The Santa Fe Convention of the Open Archives Initiative. D-Lib Magazine, 6(2). [Online at]: http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html

Weibel, S. (1999). The State of the Dublin Core Metadata Initiative: April 1999. D-Lib Magazine, 5(4). [Online at]: http://www.dlib.org/dlib/april99/04weibel.html

www.smete.org. (2000). Information Portal: A Digital Library for Science, Mathematics, Engineering, and Technology Education. NEEDS (Ed.). [Online at]: www.smete.org (home page)

Zhao, J. (1999). Making Digital Libraries Flexible, Scalable, and Reliable: Reengineering the MARIAN System in JAVA. Unpublished Master of Science Thesis, Virginia Tech, Blacksburg, VA: Department of Computer Science. [Online at]: http://scholar.lib.vt.edu/theses/available/etd-070499-204531/unrestricted/SGML-etd/