An exploratory study of Google Scholar

Mayr, Philipp and Walter, Anne-Kathrin. An exploratory study of Google Scholar, 2007 [Preprint]



English abstract

Purpose – This paper discusses the new scientific search service Google Scholar (GS). This search engine, intended for searching exclusively scholarly documents, is described in terms of its most important functionality and then tested empirically. The focus is on an exploratory study that investigates the coverage of scientific serials in GS.
Design/methodology/approach – The study is based on queries against different journal lists: international scientific journals from Thomson Scientific (SCI, SSCI, AH), Open Access journals from the DOAJ list, and journals of the German social sciences literature database SOLIS, as well as on the analysis of result data from GS. All data gathering took place in August 2006.
Findings – The study shows deficiencies in the coverage and up-to-dateness of the GS index. Furthermore, the study identifies which web servers are the most important data providers for this search service and which information sources are highly represented. We can show that there is a relatively large gap in Google Scholar’s coverage of German literature as well as weaknesses in the accessibility of Open Access content. Major commercial academic publishers are currently the main data providers.
Research limitations/implications – Five different journal lists were analyzed, including approximately 9,500 single titles. The lists are from different fields and of various sizes, which limits comparability. There were also some problems matching the journal titles of the original lists to the journal title data provided by Google Scholar. We were only able to analyze the top 100 Google Scholar hits per journal.
Practical implications – We conclude that Google Scholar has some interesting advantages (such as citation analysis and free materials), but the service cannot be seen as a substitute for the use of special abstracting and indexing databases and library catalogues due to various weaknesses (such as transparency, coverage and up-to-dateness).
Originality/value – We do not know of any other study using such a brute force approach and such a large empirical basis. Our study can be considered brute force in the sense that we gathered a large amount of data from Google and then analyzed it in a macroscopic way.

Item type: Preprint
Keywords: Search engines, Digital libraries, Worldwide Web, Serials, Electronic journals
Subjects: H. Information sources, supports, channels. > HN. e-journals.
L. Information technology and library technology > LS. Search engines.
H. Information sources, supports, channels. > HS. Repositories.
Depositing user: Philipp Mayr
Date deposited: 21 Aug 2007
Last modified: 02 Oct 2014 12:09


