Verifying Search Results Over Web Collections
2012
Abstract
Web search is one of the most frequently performed computations over the Internet and one of the most important applications of outsourced computing, and its results critically affect users' decision-making. Verifying the integrity of Internet-based searches over vast amounts of web content is therefore essential. In this paper, we provide the first solution to this general security problem. We introduce the concept of an authenticated web crawler and present its design and prototype implementation. An authenticated web crawler is a trusted program that computes a special "signature" S over the collection of web content it visits. Given this signature, the results of web searches can be verified for integrity. The signature serves more advanced purposes than content verification alone: it allows the verification of complicated queries on web pages, such as conjunctive keyword searches, which are vital to the functionality of today's web search engines. In our solution, the search engine returns a cryptographic proof along with the web pages that satisfy a given search query. This proof, together with the signature S, enables any user to efficiently verify that no legitimate web pages are omitted from the result computed by the search engine and that no pages that fail to satisfy the query are included in it. An important property of our solution is that the proof size and the verification time are proportional only to the sizes of the query description and the query result, and do not depend on the number or sizes of the web pages over which the search is performed. Our authentication protocols are based on standard Merkle trees and the more involved bilinear-map accumulators.
As we experimentally demonstrate, our prototype implementation incurs low communication overhead between the search engine and the user and allows fast verification of the returned results on the user side.
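To make the Merkle-tree ingredient concrete, the following is a minimal sketch of how a digest over a set of crawled pages can be computed and how a single page can later be checked against it. It is not the paper's protocol: the function names (`build_levels`, `prove`, `verify`), the use of SHA-256, the leaf/node prefix bytes, and the duplicate-last-node padding are all assumptions made for illustration. A plain Merkle proof also grows logarithmically with the number of pages, so it covers only content integrity; the query-level guarantees described above rely on the bilinear-map accumulators, which this sketch does not attempt to model.

```python
import hashlib


def _h(data: bytes) -> bytes:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(data).digest()


def leaf_hash(page: bytes) -> bytes:
    # Distinct prefixes keep leaf hashes and internal-node hashes apart.
    return _h(b"\x00" + page)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(pages):
    """Return all tree levels, leaves first; the root is levels[-1][0]."""
    if not pages:
        raise ValueError("need at least one page")
    level = [leaf_hash(p) for p in pages]
    levels = []
    while True:
        # Pad odd-sized levels by duplicating the last hash (illustrative choice).
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        # Record whether the sibling sits to the left of the current node.
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root, page, proof):
    """Recompute the root from a page and its proof, then compare."""
    acc = leaf_hash(page)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


if __name__ == "__main__":
    pages = [b"page-a", b"page-b", b"page-c"]  # hypothetical crawled contents
    levels = build_levels(pages)
    root = levels[-1][0]  # plays the role of the trusted digest
    print(verify(root, pages[2], prove(levels, 2)))
    print(verify(root, b"tampered", prove(levels, 2)))
```

In this sketch the root hash stands in for the trusted digest: a user who holds only the root can check any returned page using the page itself plus one sibling hash per tree level.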