An exploratory study of Google Scholar

Mayr, Philipp and Walter, Anne-Kathrin. An exploratory study of Google Scholar, 2007. [Preprint]


English abstract

Purpose – This paper discusses the new scientific search service Google Scholar (GS). This search engine, intended exclusively for searching scholarly documents, is described in terms of its most important functionality and then tested empirically. The focus is on an exploratory study that investigates the coverage of scientific serials in GS.
Design/methodology/approach – The study is based on queries against different journal lists: international scientific journals from Thomson Scientific (SCI, SSCI, AH), Open Access journals from the DOAJ list, and journals of the German social sciences literature database SOLIS, as well as on an analysis of the result data from GS. All data gathering took place in August 2006.
Findings – The study shows deficiencies in the coverage and up-to-dateness of the GS index. It also shows which web servers are the most important data providers for this search service and which information sources are highly represented. We can show that there is a relatively large gap in Google Scholar's coverage of German literature, as well as weaknesses in the accessibility of Open Access content. Major commercial academic publishers are currently the main data providers.
Research limitations/implications – Five different journal lists were analyzed, comprising approximately 9,500 individual titles. The lists come from different fields and vary in size, which limits comparability. There were also some problems matching the journal titles of the original lists to the journal title data provided by Google Scholar. We were only able to analyze the top 100 Google Scholar hits per journal.
Practical implications – We conclude that Google Scholar has some interesting advantages (such as citation analysis and free materials), but the service cannot be seen as a substitute for special abstracting and indexing databases and library catalogues because of various weaknesses (such as limited transparency, coverage and up-to-dateness).
Originality/value – We do not know of any other study that uses such a brute-force approach on such a large empirical basis. The approach is brute force in the sense that we gathered a large amount of data from Google and then analyzed it macroscopically.

Item type: Preprint
Keywords: Search engines, Digital libraries, Worldwide Web, Serials, Electronic journals
Subjects: H. Information sources, supports, channels. > HN. e-journals.
L. Information technology and library technology > LS. Search engines.
H. Information sources, supports, channels. > HS. Repositories.
Depositing user: Philipp Mayr
Date deposited: 21 Aug 2007
Last modified: 02 Oct 2014 12:09


