The solution of eigenvalue problems can be based on inverting matrices built from regularized vectors. The regularization parameters are equal to the eigenvalues of the given matrix after fitting in accordance with the collinearity models. In this approach, the eigenvectors are equal to the columns of the inverted matrix.
Risk factors in patients with a particular disease can be identified in clinical data sets by using feature selection procedures from pattern recognition and data mining. The applicability of the relaxed linear separability (RLS) method of feature subset selection was checked for high-dimensional, mixed-type (genetic and phenotypic) clinical data of patients with end-stage renal disease. The RLS method allowed for a substantial reduction of dimensionality by omitting redundant features while maintaining the linear separability of the data sets of patients with high and low levels of an inflammatory biomarker. The synergy between genetic and phenotypic features in differentiating between these two subgroups was demonstrated.
A learning algorithm for multilayer threshold nets without feedback is proposed. The net is built of threshold elements with binary inputs. During the learning process, each input vector x is accompanied by a teacher's decision ω (ω ∈ {1, ..., M}). The pairs (x[n], ω[n]) appear in successive steps independently, according to some unknown stationary distribution p(x, ω). The problem of learning a threshold net has been decomposed into a series of problems of learning the individual threshold elements. The proposed learning algorithm for the threshold elements has a perceptron-like form. It was proven that the decision rule of the threshold net stabilizes after a finite number of steps. For certain classes {p(x, ω)}* of distributions p(x, ω), an optimal decision rule stabilizes after a finite number of steps. These classes {p(x, ω)}* also contain distributions describing learning processes with perturbations.
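To illustrate what a perceptron-like rule for a single threshold element with binary inputs can look like, here is a minimal sketch. It uses the classic fixed-increment perceptron update on a two-class problem, which is an assumption made for illustration and not the algorithm from the paper; the function names and the AND example are hypothetical.

```python
def threshold_output(w, theta, x):
    """Threshold element: returns 1 when the weighted sum reaches the threshold."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= theta else 0


def perceptron_train(samples, n_inputs, max_epochs=100):
    """Classic fixed-increment perceptron rule (illustrative assumption).

    samples: list of (x, target) pairs with binary x and target in {0, 1}.
    Weights change only when the element's decision disagrees with the teacher.
    """
    w = [0.0] * n_inputs
    theta = 0.0
    for _ in range(max_epochs):
        changed = False
        for x, target in samples:
            err = target - threshold_output(w, theta, x)
            if err != 0:
                # Move the weights towards (err = +1) or away from (err = -1) x.
                w = [wi + err * xi for wi, xi in zip(w, x)]
                theta -= err
                changed = True
        if not changed:
            break  # the decision rule has stabilized on this sample set
    return w, theta


# Hypothetical example: teach one threshold element the logical AND.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = perceptron_train(AND_SAMPLES, n_inputs=2)
```

On a linearly separable sample set such as this one, the loop stops changing the weights after a finite number of corrections, which mirrors the stabilization property stated in the abstract for the single-element case only.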
Abstract: Prognostic procedures can be based on ranked linear models. Ranked regression-type models are designed on the basis of feature vectors combined with a set of relations defined on selected pairs of these vectors. Feature vectors are composed of numerical results of measurements on particular objects or events. Ranked relations defined on selected pairs of feature vectors represent additional knowledge and can reflect experts' opinions about the objects considered. Ranked models have the form of linear transformations of feature vectors onto a line that preserve a given set of relations as well as possible. Ranked models can be designed through the minimization of a special type of convex and piecewise-linear (CPL) criterion function. Some sets of ranked relations cannot be well represented by one ranked model. Decomposing the global model into a family of local ranked models could improve the representation. A procedure of ranked models decomposition is described in thi...
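The idea of a linear transformation onto a line that preserves ranked relations can be sketched as a penalty on violated pairs. The hinge-type penalty, the unit margin, and all names below are assumptions chosen for illustration; the abstract does not specify the exact form of the CPL criterion.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def ranked_cpl_criterion(w, vectors, relations, margin=1.0):
    """Sum of hinge penalties over ranked pairs (illustrative assumption).

    relations: list of (i, j) meaning vectors[i] should be placed before
    vectors[j] on the line, i.e. w.x_i + margin <= w.x_j.
    Each term is convex and piecewise linear in w, and so is the sum.
    """
    total = 0.0
    for i, j in relations:
        diff = [xj - xi for xi, xj in zip(vectors[i], vectors[j])]
        total += max(0.0, margin - dot(w, diff))
    return total


# Hypothetical data: three feature vectors and the ordering 0 < 1 < 2.
VECTORS = [(1.0, 0.0), (2.0, 1.0), (4.0, 1.0)]
RELATIONS = [(0, 1), (1, 2)]

good = ranked_cpl_criterion((1.0, 0.0), VECTORS, RELATIONS)   # preserves both relations
bad = ranked_cpl_criterion((-1.0, 0.0), VECTORS, RELATIONS)   # reverses both relations
```

A criterion value of zero means that the given `w` places every related pair in the required order with at least the chosen margin; a positive value measures how badly the relations are violated.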
The main challenges in data mining are related to large, multi-dimensional data sets. There is a need to develop algorithms that are precise and efficient enough to deal with big data problems. The Simplex algorithm from linear programming can be seen as an example of a successful tool for solving big data problems. According to the fundamental theorem of linear programming, the solution of the optimization problem can be found in one of the vertices in the parameter space. The basis exchange algorithms also search for the optimal solution among a finite number of vertices in the parameter space. Basis exchange algorithms enable the design of complex layers of classifiers or predictive models based on a small number of multivariate data vectors.
Studies in Health Technology and Informatics, 2001
The "Hepar" system comprises a clinical database and a shell of procedures for data analysis and diagnosis support. The database consists of hepatological patient cases. Each case is described by about 200 medical findings and a histopathologically verified diagnosis. The diagnosis-supporting rules of "Hepar" are based on visualizing data transformations and on similarity-based techniques. The applied linear visualizing transformations of data sets onto the plane aim to separate the groups of patients associated with different diseases. Visual inspection of the resulting diagnostic maps makes it possible to find cases in the database that are similar to previously diagnosed patients. This paper examines the combination of data transformations with nearest-neighbor techniques in the support of diagnosis. We report the results of experimental comparisons of different decision rules, including the feature selection procedure.
The feature selection problem arises where a large number of features constrains effective data analysis and processing. Identification of the most important feature subsets is a crucial challenge in many important applications. For example, a basic question in bioinformatics, the identification of gene functions, can be formulated and answered as a problem of this kind. Identification of the most important feature subsets through minimisation of a convex and piecewise-linear (CPL) criterion function is described and analysed in the paper. This approach is combined with relaxation of the linear separability assumption.
Linear separability of data sets is one of the basic concepts in the theory of neural networks and pattern recognition. Data sets are often linearly separable because of their high dimensionality. Such is the case of genomic data, in which a small number of cases is represented in a space with extremely high dimensionality. An evaluation of linear separability of two data sets can be combined with feature selection and carried out through minimisation of a convex and piecewise-linear (CPL) criterion function. The perceptron criterion function belongs to the CPL family. The basis exchange algorithms allow us to find minimal values of CPL functions efficiently, even in the case of large, multidimensional data sets.
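As a rough illustration of how a perceptron-type CPL criterion relates to linear separability, the sketch below only evaluates such a function at a given parameter vector. The unit margin, the equal weighting of all terms, the augmented-vector convention, and all names are assumptions; the basis exchange minimisation itself is not shown.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def perceptron_criterion(v, positives, negatives, margin=1.0):
    """Perceptron-type CPL criterion (illustrative assumption).

    Vectors are augmented: the last component is 1, so the last entry
    of v acts as the (negated) threshold of the separating hyperplane.
    Every term is a maximum of two linear functions of v, hence the sum
    is convex and piecewise linear in v.
    """
    pos = sum(max(0.0, margin - dot(v, y)) for y in positives)
    neg = sum(max(0.0, margin + dot(v, y)) for y in negatives)
    return pos + neg


# Hypothetical augmented data vectors for two small sets.
POSITIVES = [(2.0, 2.0, 1.0), (3.0, 1.0, 1.0)]
NEGATIVES = [(0.0, 0.0, 1.0), (-1.0, 1.0, 1.0)]

separating = perceptron_criterion((1.0, 1.0, -2.0), POSITIVES, NEGATIVES)
at_origin = perceptron_criterion((0.0, 0.0, 0.0), POSITIVES, NEGATIVES)
```

In this formulation, a parameter vector with criterion value zero defines a hyperplane that separates the two sets with the chosen margin, so finding such a vector shows that the sets are linearly separable.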
Abstract: Task scheduling problems arise, among other places, in the context of the feasibility of large computational processes and their optimization. Ranked regression methods can be used to resolve problems of this type. For the purpose of constructing ranked regression models, individual computational tasks are characterized by multidimensional dependency vectors. Dependency vectors make it possible to determine whether a given task can be carried out only after certain other tasks have been completed. Ranked regression involves constructing linear mappings from the multidimensional dependency space onto a one-dimensional space (the time line) that reflect the dependencies between tasks as closely as possible. Keywords: scheduling of computational tasks, ranked model, convex and piecewise-linear (CPL) criterion function
Papers by L. Bobrowski