Abstract
Tableau Research
Figure 1: Visualizations comparing bins for data on per-country life expectancy (left) and per-U.S.-county obesity rates (right). (a) Default binning scheme. (b) Binning scheme recommended by OSCAR. The top-row bins are computed from statistical properties, while the bottom-row bins are computed by OSCAR. Semantic bins improve legibility, reduce the number of bins (i.e., the visual complexity of the map or histogram), and exploit non-uniformity to either highlight areas of interest or compress long tails of the distribution into single bins.