Exploring the academic invisible web

Lewandowski, Dirk and Mayr, Philipp. Exploring the academic invisible web. Library Hi Tech, 2006, vol. 24, n. 4, pp. 529-539. [Journal article (Paginated)]

This is the latest version of this item.

English abstract

Purpose: To provide a critical review of Bergman’s 2001 study on the deep web. In addition, we bring a new concept into the discussion, the academic invisible web (AIW). We define the academic invisible web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the invisible web is central to scientific search engines. We provide an overview of approaches followed thus far.

Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the invisible web.

Findings: Bergman’s size estimate of the invisible web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given.

Research limitations/implications: The precision of our estimate is limited due to a small sample size and lack of reliable data.

Practical implications: We can show that no single library alone will be able to index the academic invisible web. We suggest collaboration to accomplish this task.

Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the academic invisible web.

Item type: Journal article (Paginated)
Keywords: digital libraries, indexing, search engines, worldwide web
Subjects: L. Information technology and library technology > LS. Search engines.
L. Information technology and library technology > LC. Internet, including WWW.
Depositing user: Emma McCulloch
Date deposited: 24 Mar 2007
Last modified: 02 Oct 2014 12:07
URI: http://hdl.handle.net/10760/9203

References

Bergman, M.K. (2001), "The deep web: surfacing hidden value", Journal of Electronic Publishing, Vol. 7, No. 1, available at: www.press.umich.edu/jep/07-01/bergman.html (accessed 6 April 2006).

Brophy, J. and Bawden, D. (2005), "Is Google enough? Comparison of an internet search engine with academic library resources", Aslib Proceedings, Vol. 57, No. 6, pp. 498-512.

Jacsó, P. (2005), "Google Scholar: the pros and cons", Online Information Review, Vol. 29, No. 2, pp. 208-214.

Lawrence, S. and Giles, C.L. (1999), "Accessibility of information on the web", Nature, Vol. 400, No. 6740, pp. 107-109.

Lewandowski, D. (2005a), "Google Scholar - Aufbau und strategische Ausrichtung des Angebots sowie Auswirkung auf andere Angebote im Bereich der wissenschaftlichen Suchmaschinen", available at: www.durchdenken.de/lewandowski/doc/Expertise_Google-Scholar.pdf (accessed 13 December 2005).

Lewandowski, D. (2005b), Web Information Retrieval: Technologien zur Informationssuche im Internet, DGI, Frankfurt am Main.

Lewandowski, D. (2005c), "Yahoo - Zweifel an den Angaben zur Indexgröße, Suche in mehreren Sprachen", Password, Vol. 20, No. 9, pp. 21-22.

Lewandowski, D. (2006), "Suchmaschinen als Konkurrenten der Bibliothekskataloge: Wie Bibliotheken ihre Angebote durch Suchmaschinentechnologie attraktiver und durch Öffnung für die allgemeinen Suchmaschinen populärer machen können", Zeitschrift für Bibliothekswesen und Bibliographie, Vol. 53, No. 2, pp. 71-78.

Lossau, N. (2004), "Search engine technology and digital libraries: libraries need to discover the academic internet", D-Lib Magazine, Vol. 10, No. 6, available at: www.dlib.org/dlib/june04/lossau/06lossau.html (accessed 6 April 2006).

Lyman, P., Varian, H.R., Swearingen, K., Charles, P., Good, N., Jordan, L.L., et al. (2003), "How much information 2003?", available at: www.sims.berkeley.edu/research/projects/how-much-info-2003/ (accessed 6 April 2006).

Mayr, P. and Walter, A.-K. (2005), "Google Scholar - Wie tief gräbt diese Suchmaschine?", paper presented at the 11. IuK-Jahrestagung: In die Zukunft publizieren: Herausforderungen an das Publizieren und die Informationsversorgung in den Wissenschaften, Bonn, Germany, 9-11 May 2005, available at: www.ib.hu-berlin.de/~mayr/arbeiten/Mayr_Walter05-preprint.pdf (accessed 6 April 2006).

McKiernan, G. (2005), "E-profile: Scirus: for scientific information only", Library Hi Tech News, Vol. 22, No. 3, pp. 18-25.

Notess, G.R. (2005), "Scholarly Web searching: Google Scholar and Scirus", Online, Vol. 29, No. 4, pp. 39-41.

Ru, Y. and Horowitz, E. (2005), "Indexing the invisible web: a survey", Online Information Review, Vol. 29, No. 3, pp. 249-265.

"Scirus White Paper: how Scirus works" (2004), available at: www.scirus.com/press/pdf/WhitePaper_Scirus.pdf (accessed 6 April 2006).

Sherman, C. (2001), "Search for the Invisible Web", available at: www.guardian.co.uk/online/story/0,3605,547140,00.html (accessed 8 March 2006).

Sherman, C. and Price, G. (2001), The Invisible Web: Uncovering Information Sources Search Engines Can't See, Information Today, Medford, NJ.

Stock, W.G. (2003), "Weltregionen des Internet: Digitale Informationen im WWW und via WWW", Password, Vol. 18, No. 2, pp. 26-28.

Williams, M.E. (2005), "The state of databases today: 2005", in Gale Directory of Databases, Vol. 2, pp. XV-XXV, Gale Group, Detroit, MI.
