Key research themes
1. How can semi-supervised learning algorithms effectively leverage both labeled and unlabeled data to improve classification accuracy?
This research theme focuses on the development and evaluation of semi-supervised learning (SSL) methods that integrate small amounts of labeled data with large pools of unlabeled data to enhance classification performance. The challenge lies in designing algorithms that utilize unlabeled data beneficially without degrading accuracy, especially given that labeled data are costly or hard to procure. Various algorithmic frameworks including self-training, co-training, graph-based methods, and boosting-based approaches are explored to optimize learning when labels are scarce, addressing scenarios across text, image, and multidomain data classification.
2. What methodologies enable effective integration of pairwise constraints, soft label estimation, and discriminant embeddings for enhancing semi-supervised classification performance?
This theme investigates advanced semi-supervised classification techniques that go beyond conventional label propagation by incorporating pairwise constraints, soft labeling strategies, and discriminant feature embedding. These methods aim to encode relational information and data structure more explicitly, fostering better class separability and improved decision boundaries in both binary and multiclass contexts. Such integration is crucial for domains where label information is scarce but relational or intrinsic geometric information among data points is available or can be inferred.
3. How can feature selection and evaluation metrics be adapted or redefined to capitalize on unlabeled data and better assess semi-supervised classification models?
This research area explores semi-supervised feature selection frameworks and advanced performance evaluation metrics specially designed for semi-supervised learning scenarios. Considering limited labeled data, enhancing feature selection by incorporating unlabeled examples can improve classifier generalization and robustness. Simultaneously, assessing model quality in imbalanced or multi-class contexts necessitates more interpretable and customizable metrics that reflect class distributions, skew-sensitivity, and training progress effectively, thus facilitating trustworthy semi-supervised classification evaluation.