A methodology for semi-automatic classification schema building

2009, Arxiv preprint arXiv: …

Abstract

This paper describes a methodology for semi-automatic classification schema definition (a classification schema is a taxonomy of categories used for automatic document classification). The methodology is based on: (i) an extensional approach, which creates a typology starting from a document base, and (ii) an intensional approach, which builds the classification schema starting from that typology. The extensional approach uses clustering techniques to group documents together on the basis of a similarity measure, whereas the intensional approach uses different operations (aggregation, reduction, generalization, specialization) to define classes.
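The two steps can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract names neither the similarity measure nor the clustering algorithm, so the cosine similarity over word counts, the single-link grouping, the `threshold` value, the sample `docs`, and the function names `cosine`, `cluster`, and `aggregate` are all assumptions chosen for illustration. `cluster` stands in for the extensional step (documents to typology) and `aggregate` for one intensional operation (merging two types into a class).

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs: list[str], threshold: float = 0.3) -> list[list[int]]:
    """Extensional step: group document indices whose similarity reaches
    the threshold, using single-link grouping via union-find."""
    vectors = [Counter(d.lower().split()) for d in docs]
    parent = list(range(len(docs)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups: dict[int, list[int]] = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def aggregate(typology: list[list[int]], a: int, b: int) -> list[list[int]]:
    """Intensional step (aggregation only): merge types a and b into one class."""
    merged = sorted(typology[a] + typology[b])
    rest = [g for k, g in enumerate(typology) if k not in (a, b)]
    return sorted(rest + [merged])


# Hypothetical document base, invented for this example.
docs = [
    "stock market prices fall",
    "market prices rise on stock news",
    "football team wins match",
    "team wins football cup",
]
typology = cluster(docs)             # two types: finance-like and sport-like
schema = aggregate(typology, 0, 1)   # a human editor decides to merge them
```

The semi-automatic character of the methodology shows up in the split: the clustering runs unattended, while the choice of which intensional operation to apply, and to which types, is left to a person.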
