Editors: Roberto Grossi and Moshe Lewenstein

https://doi.org/10.4230/LIPICS.CPM.2016.2

Abstract

Let S and S′ be two strings, having the same length, over a totally-ordered alphabet. We consider the following two variants of string matching. Parameterized Matching: The characters of S and S′ are partitioned into static characters and parameterized characters. The strings are a parameterized match iff the static characters match exactly, and there exists a one-to-one function which renames the parameterized characters in S to those in S′. Order-Preserving Matching: The strings are an order-preserving match iff for any two integers i, j ∈ [1, |S|], S[i] ≼ S[j] ⟺ S′[i] ≼ S′[j], where ≼ denotes the precedence order of the alphabet. Let P be a collection of d patterns {P_1, P_2, ..., P_d} of total length n characters, which are chosen from a totally-ordered alphabet Σ. Given a text T, also over Σ, we consider the dictionary indexing problem under the above definitions of string matching. Specifically, the task is to index P, such that we can report all positions j (called occurrences) ...
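The two matching definitions can be made concrete with a short sketch. The Python below is only an illustration of the definitions, not the compact index the paper is about. The function names, the `static` argument (the set of static characters), and the use of Python's built-in `<=` as the alphabet's precedence order are assumptions made for this example.

```python
from typing import Dict, Hashable, Sequence, Set


def is_parameterized_match(s: Sequence[Hashable], t: Sequence[Hashable],
                           static: Set[Hashable]) -> bool:
    """Return True iff s and t are a parameterized match.

    `static` is the assumed set of static characters; every other
    character is treated as a parameterized character.
    """
    if len(s) != len(t):
        return False
    forward: Dict[Hashable, Hashable] = {}   # renaming of characters in s to t
    backward: Dict[Hashable, Hashable] = {}  # inverse map, enforces one-to-one
    for a, b in zip(s, t):
        if a in static or b in static:
            if a != b:  # static characters must match exactly
                return False
            continue
        # setdefault records a new pairing or returns the existing one.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def is_order_preserving_match(s: Sequence, t: Sequence) -> bool:
    """Return True iff s[i] <= s[j] exactly when t[i] <= t[j], for all i, j.

    This checks the definition directly over all pairs, so it takes
    quadratic time; it is meant for clarity, not efficiency.
    """
    if len(s) != len(t):
        return False
    n = len(s)
    return all((s[i] <= s[j]) == (t[i] <= t[j])
               for i in range(n) for j in range(n))
```

For instance, with static characters `{"a", "b"}`, the strings `"axbyx"` and `"aybxy"` are a parameterized match (rename x to y and y to x), while `[10, 30, 20]` and `[1, 5, 3]` are an order-preserving match because their elements have the same relative order.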

A. Ganguly, W.-K. Hon, K. Sadakane, R. Shah, S. V. Thankachan, and Y. Yang. Compact Parameterized and Order-Preserving Dictionaries. CPM 2016.