Exploring the Academic Invisible Web

Lewandowski, Dirk and Mayr, Philipp. Exploring the Academic Invisible Web, 2006 [Preprint]

Warning: There is a more recent version of this item available.

English abstract

Purpose: To provide a critical review of Bergman’s 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far.

Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web.

Findings: Bergman’s size estimation of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimation is given.

Research limitations/implications: The precision of our estimation is limited due to small sample size and lack of reliable data.

Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task.

Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web.

Item type: Preprint
Keywords: Search engines, Worldwide Web, Indexing, Scholarly content, Digital library
Subjects: L. Information technology and library technology > LS. Search engines.
L. Information technology and library technology > LC. Internet, including WWW.
Depositing user: Dirk Lewandowski
Date deposited: 16 Apr 2006
Last modified: 02 Oct 2014 12:03
URI: http://hdl.handle.net/10760/7447

Bergman, M.K. (2001). The Deep Web: Surfacing Hidden Value. Journal of Electronic Publishing, 7(1).

Brophy, J., & Bawden, D. (2005). Is Google enough? Comparison of an internet search engine with academic library resources. Aslib Proceedings, 57(6), 498-512.

Jacsó, P. (2005). Google Scholar: The pros and cons. Online Information Review, 29(2), 208-214.

Lawrence, S., & Giles, C.L. (1999). Accessibility of Information on the web. Nature, 400(8), 107-109.

Lewandowski, D. (2005a). Google Scholar - Aufbau und strategische Ausrichtung des Angebots sowie Auswirkung auf andere Angebote im Bereich der wissenschaftlichen Suchmaschinen. Retrieved 13.12.2005, from http://www.durchdenken.de/lewandowski/doc/Expertise_Google-Scholar.pdf

Lewandowski, D. (2005b). Web Information Retrieval: Technologien zur Informationssuche im Internet. Frankfurt am Main: DGI.

Lewandowski, D. (2005c). Yahoo - Zweifel an den Angaben zur Indexgröße, Suche in mehreren Sprachen. Password, 20(9), 21-22.

Lewandowski, D. (2006). Suchmaschinen als Konkurrenten der Bibliothekskataloge: Wie Bibliotheken ihre Angebote durch Suchmaschinentechnologie attraktiver und durch Öffnung für die allgemeinen Suchmaschinen populärer machen können. Zeitschrift für Bibliothekswesen und Bibliographie, 53(2).

Lossau, N. (2004). Search Engine Technology and Digital Libraries: Libraries Need to Discover the Academic Internet. D-Lib Magazine, 10(6).

Lyman, P., Varian, H.R., Swearingen, K., Charles, P., Good, N., Jordan, L.L., et al. (2003). How Much Information 2003?, from http://www.sims.berkeley.edu/research/projects/how-much-info-2003/

Mayr, P., & Walter, A.-K. (2005). Google Scholar - Wie tief gräbt diese Suchmaschine? Paper presented at In die Zukunft publizieren: Herausforderungen an das Publizieren und die Informationsversorgung in den Wissenschaften, Bonn.

McKiernan, G. (2005). E-profile: Scirus: For Scientific Information Only. Library Hi Tech News, 22(3), 18-25.

Notess, G.R. (2005). Scholarly Web searching: Google Scholar and Scirus. Online, 29(4), 39-41.

Scirus White Paper: How Scirus works. (2004). Retrieved 8.3.2006, from http://www.scirus.com/press/pdf/WhitePaper_Scirus.pdf

Sherman, C. (2001). Search for the Invisible Web. Retrieved 8.3.2006, from http://www.guardian.co.uk/online/story/0,3605,547140,00.html

Sherman, C., & Price, G. (2001). The Invisible Web: Uncovering Information Sources Search Engines Can't See. Medford, NJ: Information Today.

Stock, W.G. (2003). Weltregionen des Internet: Digitale Informationen im WWW und via WWW. Password, 18(2), 26-28.

Williams, M.E. (2005). The State of Databases Today: 2005. In Gale Directory of Databases (Vol. 2, pp. XV-XXV). Detroit, Mich.: Gale Group.

