Spatial Weighted Outlier Detection

2006, Proceedings of the 2006 SIAM International Conference on Data Mining

https://doi.org/10.1137/1.9781611972764.71

Abstract

Spatial outliers are spatial objects whose features differ markedly from those of their surrounding neighbors. Detecting spatial outliers helps reveal valuable information in large spatial data sets. In many real applications, spatial objects cannot simply be abstracted as isolated points: they differ in boundary, size, volume, and location. These spatial properties affect the impact of a spatial object on its neighbors and should be taken into consideration. In this paper, we propose two spatial outlier detection methods that integrate the impact of spatial properties into the outlierness measurement. Experimental results on a real data set demonstrate the effectiveness of the proposed algorithms.

FAQs

What distinguishes spatial outliers from traditional outliers?

Spatial outlier detection compares each object with its local neighborhood, while traditional outlier detection considers the data set as a whole. Spatial methods also handle complex spatial data formats, such as 3D objects, which traditional methods do not.

How do the proposed algorithms improve outlier detection accuracy?

The algorithms use weighted neighborhood comparisons, assigning different impacts based on spatial attributes. For instance, they consider factors like distance and common border length.
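
The weighting idea can be illustrated with a minimal sketch. The exact weighting formula is not given on this page, so everything below is an assumption made for illustration: the function name `neighbor_weights`, the mixing parameter `alpha`, and the choice to blend each neighbor's share of inverse distance with its share of common border length.

```python
def neighbor_weights(distances, border_lengths, alpha=0.5):
    """Return one weight per neighbor; the weights sum to 1.

    Hypothetical scheme (not taken from the paper): each weight blends the
    neighbor's share of inverse center distance with its share of the
    common border length. `alpha` controls the blend.
    """
    if not distances or len(distances) != len(border_lengths):
        raise ValueError("need one distance and one border length per neighbor")
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")

    inverse = [1.0 / d for d in distances]
    inverse_total = sum(inverse)
    border_total = sum(border_lengths)
    n = len(distances)

    weights = []
    for inv, border in zip(inverse, border_lengths):
        distance_share = inv / inverse_total
        # Fall back to equal shares when no neighbor shares a border.
        border_share = border / border_total if border_total > 0 else 1.0 / n
        weights.append(alpha * distance_share + (1.0 - alpha) * border_share)
    return weights


# Two neighbors: the first is closer and shares a longer border.
example_weights = neighbor_weights([1.0, 2.0], [3.0, 1.0])
```

Under this assumed scheme, a neighbor that is both closer and shares a longer border receives the larger weight, which matches the intuition described in the answer above.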

What was the primary dataset used for algorithm validation?

The validation was conducted on the West Nile virus data from the U.S. CDC, covering veterinary cases in 2003. This dataset provided a spatial context for the outlier detection.

How does the AvgDiff algorithm differ from the weighted z value approach?

AvgDiff computes the weighted average of absolute differences, capturing variance among neighbors. In contrast, the weighted z value approach averages neighbor attribute values before comparison.
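
The contrast between the two measures can be sketched as follows. This is an illustrative reading of the description above, not the paper's implementation: the function names are hypothetical, the weights are assumed to sum to 1 for each object, and the z value is assumed to be the standardized difference between an object's value and the weighted average of its neighbors.

```python
from statistics import mean, pstdev


def avg_diff(value, neighbor_values, weights):
    """Weighted average of absolute differences to each neighbor."""
    return sum(w * abs(value - v) for v, w in zip(neighbor_values, weights))


def weighted_neighbor_diff(value, neighbor_values, weights):
    """Difference between a value and the weighted average of its neighbors."""
    weighted_avg = sum(w * v for v, w in zip(neighbor_values, weights))
    return value - weighted_avg


def weighted_z_values(values, neighbors, weights):
    """Standardize the weighted neighbor differences over all objects.

    `neighbors[i]` lists the indices of object i's neighbors and
    `weights[i]` holds the matching weights (assumed to sum to 1).
    """
    diffs = [
        weighted_neighbor_diff(values[i], [values[j] for j in neighbors[i]], weights[i])
        for i in range(len(values))
    ]
    mu, sigma = mean(diffs), pstdev(diffs)
    if sigma == 0:
        return [0.0 for _ in diffs]
    return [abs((d - mu) / sigma) for d in diffs]


# An object with value 5 sits between neighbors with values 0 and 10.
# Averaging the neighbors first hides the disagreement; AvgDiff keeps it.
hidden = weighted_neighbor_diff(5.0, [0.0, 10.0], [0.5, 0.5])  # 0.0
visible = avg_diff(5.0, [0.0, 10.0], [0.5, 0.5])               # 5.0
```

The small example shows why the order of operations matters: when neighbors differ strongly from the object in opposite directions, their average can coincide with the object's value, while the averaged absolute differences remain large.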

What future improvements are planned for the spatial outlier detection methods?

Plans include extending the algorithms to detect outliers with multiple attributes and developing a classification-based training method. The training method is intended to assess the importance of spatial features and their influence.
