Lewandowski, D. Web searching, search engines and Information Retrieval, 2005. In Information Services & Use. IOS Press. (In Press) [Journal Article (On-line/Unpaginated)].
Citable URI:
http://hdl.handle.net/10760/6702
| Author(s): | Lewandowski, Dirk |
| Title: | Web searching, search engines and Information Retrieval |
| Subjects: | L. Information technology and library technology > LS. Search engines |
| Date: | 2005 |
| Abstract: | This article discusses Web search engines, focusing on the challenges of indexing the World Wide Web, user behaviour, and the ranking factors these engines use. Ranking factors are divided into query-dependent and query-independent factors; the latter have become increasingly important in recent years. The potential of these factors is limited, especially for those based on the widely used link popularity measures. The article concludes with an overview of factors that should be considered when determining the quality of Web search engines. |
| Publication: | Information Services & Use |
| Volume: | 18 |
| Number: | 3 |
| Publisher: | IOS Press |
| Alternative Locations: | http://www.durchdenken.de/lewandowski/doc/isu2005.php |
| Keywords: | search engines; Information Retrieval |
| Country: | Germany |
| Type: | Journal Article (On-line/Unpaginated) |
| Rights: | http://eprints.rclis.org/copyright/ |