Optimising metadata to make high-value content more accessible to Google users

Dawson, Alan and Hamilton, Val. Optimising metadata to make high-value content more accessible to Google users. Journal of Documentation, 2006, vol. 62, n. 3, pp. 307-327. [Journal article (Paginated)]

English abstract

This paper shows how information in digital collections that have been catalogued using high-quality metadata can be retrieved more easily by users of search engines such as Google. The research and proposals described arose from an investigation into the observed phenomenon that pages from the Glasgow Digital Library (gdl.cdlr.strath.ac.uk) were regularly appearing near the top of Google search results shortly after publication, without any deliberate effort to achieve this. The reasons for this phenomenon are now well understood and are described in the second part of the paper. The first part provides context with a review of the impact of Google and a summary of recent initiatives by commercial publishers to make their content more visible to search engines.

Item type: Journal article (Paginated)
Keywords: digital libraries, metadata, search engine optimization
Subjects: L. Information technology and library technology > LS. Search engines.
I. Information treatment for information services > IE. Data and metadata structures.
Depositing user: Emma McCulloch
Date deposited: 27 Aug 2009
Last modified: 02 Oct 2014 12:15
URI: http://hdl.handle.net/10760/13440


