Techniques for Specialized Search Engines
2001, Proceedings of the International Conference on Internet Computing
Abstract
It is becoming clear that the major search engines struggle to provide a comprehensive and up-to-date search service for the Web. Even the largest search engines index only a small proportion of static Web pages, and they do not search the Web’s backend databases, which are estimated to be 500 times larger than the static Web. Searching at this scale raises both technical and economic problems. Moreover, users often cannot retrieve the information they want because the major search engines offer only a simple, generic search interface. A necessary response to these problems is the creation of specialized search engines, which search only for information on a particular topic or in a particular category on the Web. Such search engines have smaller, more manageable indexes and can offer a powerful domain-specific search interface. This paper discusses the issues in this area and gives an overview of the techniques for building specialized search engines.