Papers by Alexander Mozeika

arXiv (Cornell University), Dec 4, 2023
We use statistical mechanics techniques, viz. the replica method, to model the effect of censoring on overfitting in Cox's proportional hazards model, the dominant regression method for time-to-event data. In the overfitting regime, maximum likelihood (ML) parameter estimators are known to be biased already for small values of the ratio of the number of covariates over the number of samples. The inclusion of censoring was avoided in previous overfitting analyses for mathematical convenience, but is vital to make any theory applicable to real-world medical data, where censoring is ubiquitous. Upon constructing efficient algorithms for solving the new (and more complex) replica-symmetric (RS) equations and comparing the solutions with numerical simulation data, we find excellent agreement, even for large censoring rates. We then address the practical problem of using the theory to correct the biased ML estimators without knowledge of the data-generating distribution. This is achieved via a novel numerical algorithm that self-consistently approximates all relevant parameters of the data-generating distribution while simultaneously solving the RS equations. We investigate numerically the statistics of the corrected estimators, and show that the proposed new algorithm indeed succeeds in removing the bias of the ML estimators, for both the association parameters and the cumulative hazard.
Replica analysis of overfitting in regression models for time to event data: the impact of censoring
Journal of physics. A, Mathematical and theoretical, Feb 28, 2024

arXiv (Cornell University), Apr 14, 2020
Nearly all statistical inference methods were developed for the regime where the number N of data samples is much larger than the data dimension p. Inference protocols such as maximum likelihood (ML) or maximum a posteriori probability (MAP) are unreliable if p = O(N), due to overfitting. For many disciplines with increasingly high-dimensional data, this limitation has become a serious bottleneck. We recently showed that in Cox regression for time-to-event data the overfitting errors are not just noise but take mostly the form of a bias, and how with the replica method from statistical physics one can model and predict this bias and the noise statistics. Here we extend our approach to arbitrary generalized linear regression models (GLM), with possibly correlated covariates. We analyse overfitting in ML/MAP inference without having to specify data types or regression models, relying only on the GLM form, and derive generic order parameter equations for the case of L2 priors. We also derive the probabilistic relationship between true and inferred regression coefficients in GLMs, and show that, for the relevant hyperparameter scaling and correlated covariates, the L2 regularization causes a predictable direction change of the coefficient vector. Our results, illustrated by application to linear, logistic, and Cox regression, enable one to correct ML and MAP inferences in GLMs systematically for overfitting bias, and thus extend their applicability into the hitherto forbidden regime p = O(N).

arXiv (Cornell University), Mar 24, 2023
Blockchains facilitate decentralization, security, identity, and data management in cyber-physical systems. However, consensus protocols used in blockchains are prone to high message and computational complexity costs and are not suitable to be used in IoT. One way to reduce message complexity is to randomly assign network nodes into committees or shards. Keeping committee sizes small is then desirable in order to achieve lower message complexity, but this comes with a penalty of reduced reliability as there is a higher probability that a large number of faulty nodes will end up in a committee.
In this work, we study the problem of estimating a probability of a failure in randomly sharded networks. We provide new results and improve existing bounds on the failure probability. Thus, our framework also paves the way to reduce committee sizes without reducing reliability.
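As a rough illustration of the quantity discussed in this abstract, the sketch below computes the exact probability that one committee, drawn uniformly at random without replacement, contains at least a given number of faulty nodes. The function name `committee_failure_probability`, the "more than one third faulty" criterion, and the example sizes are assumptions chosen for illustration; the abstract specifies neither the failure criterion nor the bounds derived in the paper.

```python
from math import comb


def committee_failure_probability(n_nodes: int, n_faulty: int,
                                  committee_size: int, threshold: int) -> float:
    """Probability that a random committee holds at least `threshold` faulty nodes.

    The committee is sampled uniformly without replacement, so the number of
    faulty members follows a hypergeometric distribution.
    """
    if not 0 <= n_faulty <= n_nodes or not 0 <= committee_size <= n_nodes:
        raise ValueError("need 0 <= n_faulty, committee_size <= n_nodes")
    total = comb(n_nodes, committee_size)
    upper = min(n_faulty, committee_size)
    # Sum the hypergeometric weights over all "bad" outcomes k >= threshold.
    favourable = sum(
        comb(n_faulty, k) * comb(n_nodes - n_faulty, committee_size - k)
        for k in range(max(threshold, 0), upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical network: 1000 nodes, 200 of them faulty.
    for size in (30, 100):
        bad = size // 3 + 1  # assumed criterion: more than one third faulty
        p = committee_failure_probability(1000, 200, size, bad)
        print(f"committee size {size}: failure probability {p:.3e}")
```

Under these assumptions the larger committee is less likely to fail, which is the size versus reliability trade-off the abstract describes.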

Journal of Physics A: Mathematical and Theoretical
We analyse the equilibrium behaviour and non-equilibrium dynamics of sparse Boolean networks with self-interactions that evolve according to synchronous Glauber dynamics. Equilibrium analysis is achieved via a novel application of the cavity method to the temperature-dependent pseudo-Hamiltonian that characterizes the equilibrium state of systems with parallel dynamics. Similarly, the non-equilibrium dynamics can be analysed by using the dynamical version of the cavity method. It is well known, however, that when self-interactions are present, direct application of the dynamical cavity method is cumbersome, due to the presence of strong memory effects, which prevent explicit analysis of the dynamics beyond a few time steps. To overcome this difficulty, we show that it is possible to map a system of N variables to an equivalent bipartite system of 2N variables, for which the dynamical cavity method can be used under the usual one-time approximation scheme. This substantial technical ...

We study the stochastic dynamics of Ising spin models with random bonds, interacting on finitely connected Poissonian random graphs. We use the dynamical replica method to derive closed dynamical equations for the joint spin-field probability distribution, and solve these within the replica-symmetry ansatz. Although the theory is developed in a general setting, with a view to future applications in various other fields, in this paper we apply it mainly to the dynamics of the Glauber algorithm (extended with cooling schedules) when running on the so-called vertex cover optimization problem. Our theoretical predictions are tested against both Monte Carlo simulations and known results from equilibrium studies. In contrast to previous dynamical analyses based on deriving closed equations for only a small number of scalar order parameters, the agreement between theory and experiment in the present study is nearly perfect.

The dynamics of Boolean networks (BN) with quenched disorder and thermal noise is studied via the generating functional method. A general formulation, suitable for BN with any distribution of Boolean functions, is developed. It provides exact solutions and insight into the evolution of order parameters and properties of the stationary states, which are inaccessible via existing methodology. We identify cases where the commonly used annealed approximation is valid and others where it breaks down. Broader links between BN and general Boolean formulas are highlighted.
We study noisy computation in randomly generated k-ary Boolean formulas. We establish bounds on the noise level above which the results of computation by random formulas are not reliable. This bound is saturated by formulas constructed from a single majority-like gate. We show that these gates can be used to compute any Boolean function reliably below the noise bound.
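The idea of a noise threshold can be illustrated with a textbook special case rather than the k-ary analysis of the paper: a balanced tree of 3-input majority gates with independent inputs, where every gate flips its output independently with probability `eps`. This simplified setting, and the names `majority3_error_map` and `iterate_error`, are assumptions made for this sketch only.

```python
def majority3_error_map(delta: float, eps: float) -> float:
    """Output error of a noisy 3-input majority gate.

    `delta` is the error probability of each (independent) input and `eps` is
    the probability that the gate flips its output.
    """
    # A noiseless majority is wrong when at least two of three inputs are wrong.
    wrong_majority = 3 * delta**2 * (1 - delta) + delta**3
    # Gate noise then flips the outcome with probability eps.
    return eps * (1 - wrong_majority) + (1 - eps) * wrong_majority


def iterate_error(eps: float, delta0: float = 0.0, depth: int = 500) -> float:
    """Error probability at the root after `depth` layers of noisy gates."""
    delta = delta0
    for _ in range(depth):
        delta = majority3_error_map(delta, eps)
    return delta


if __name__ == "__main__":
    for eps in (0.05, 0.10, 0.20):
        print(f"eps={eps:.2f}  error after many layers = {iterate_error(eps):.4f}")
```

In this toy recursion the error settles well below 1/2 for small `eps` and drifts to 1/2, where the output carries no information, once `eps` exceeds 1/6, the classical threshold for 3-input majority gates.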

We show that model-based Bayesian clustering, the probabilistically most systematic approach to the partitioning of data, can be mapped into a statistical physics problem for a gas of particles, and as a result becomes amenable to a detailed quantitative analysis. A central role in the resulting statistical physics framework is played by an entropy function. We demonstrate that there is a relevant parameter regime where mean-field analysis of this function is exact, and that, under natural assumptions, the lowest entropy state of the hypothetical gas corresponds to the optimal clustering of data. The byproduct of our analysis is a simple but effective clustering algorithm, which infers both the most plausible number of clusters in the data and the corresponding partitions. Describing Bayesian clustering in statistical mechanical terms is found to be natural and surprisingly effective.
We study the Glauber dynamics of Ising spin models with random bonds, on finitely connected random graphs. We generalize a recent dynamical replica theory with which to predict the evolution of the joint spin-field distribution, to include random graphs with arbitrary degree distributions. The theory is applied to Ising ferromagnets on randomly diluted Bethe lattices, where we study the evolution of the magnetization and the internal energy. It predicts a prominent slowing down of the flow in the Griffiths phase, it suggests a further dynamical transition at lower temperatures within the Griffiths phase, and it is verified quantitatively by the results of Monte Carlo simulations.
Using population dynamics algorithm to cluster 10d data
Physical Review Letters, 2009
Computing circuits composed of noisy logical gates and their ability to represent arbitrary Boolean functions with a given level of error are investigated within a statistical mechanics setting. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding typical-case phase transitions. This framework paves the way for obtaining new results on error-rates, function-depth and sensitivity, and their dependence on the gate-type and noise model used.
Physical Review E, 2013
Many natural, technological and social systems are inherently not in equilibrium. We show, by detailed analysis of exemplar models, the emergence of equilibrium-like behavior in localized or nonlocalized domains within non-equilibrium Ising spin systems. Equilibrium domains are shown to emerge either abruptly or gradually depending on the system parameters and disappear, becoming indistinguishable from the remainder of the system for other parameter values.

Physical Review E, 2010
Typical properties of computing circuits composed of noisy logical gates are studied using the statistical physics methodology. A growth model that gives rise to typical random Boolean functions is mapped onto a layered Ising spin system, which facilitates the study of their ability to represent arbitrary formulae with a given level of error, the tolerable level of gate-noise, and its dependence on the formulae depth and complexity, the gates used and properties of the function inputs. Bounds on their performance, derived in the information theory literature via specific gates, are straightforwardly retrieved, generalized and identified as the corresponding typical-case phase transitions. The framework is employed for deriving results on error-rates, function-depth and sensitivity, and their dependence on the gate-type and noise model used that are difficult to obtain via the traditional methods used in this field.

Philosophical Magazine, 2012
The generating functional method is employed to investigate the synchronous dynamics of Boolean networks (BN), providing an exact result for the system dynamics via a set of macroscopic order parameters. The topology of the networks studied and its constituent Boolean functions represent the system's quenched disorder and are sampled from a given distribution. The framework accommodates a variety of topologies and Boolean function distributions and can be used to study both the noisy and noiseless regimes; it enables one to calculate correlation functions at different times that are inaccessible via commonly used approximations. It is also used to determine conditions for the annealed approximation to be valid, explore phases of the system under different levels of noise and obtain results for models with strong memory effects, where existing approximations break down. Links between BN and general Boolean formulas are identified and results common to both system types are highlighted.
Journal of Physics A: Mathematical and Theoretical, 2014
Thermal noise in a cellular automaton refers to a random perturbation to its function which eventually leads this automaton to an equilibrium state controlled by a temperature parameter. We study the one-dimensional majority-3 cellular automaton under this model of noise. Without noise, each cell in this automaton decides its next state by majority voting among itself and its left and right neighbour cells. Transfer matrix analysis shows that the automaton always reaches a state in which every cell is in one of its two states with probability 1/2 and thus cannot remember even one bit of information. Numerical experiments, however, support the possibility of reliable computation for a long but finite time.
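The update rule described in this abstract can be sketched as follows. The noiseless step implements the majority vote among a cell and its two neighbours on a ring. The noisy step is an assumed Glauber-type rule with inverse temperature `beta`, chosen for illustration; the abstract does not give the paper's exact noise model. The function names and the ±1 encoding of cell states are likewise assumptions.

```python
import math
import random


def majority3_step(cells: list[int]) -> list[int]:
    """Noiseless synchronous update on a ring; cells take values +1 or -1."""
    n = len(cells)
    return [
        1 if cells[i - 1] + cells[i] + cells[(i + 1) % n] > 0 else -1
        for i in range(n)
    ]


def noisy_majority3_step(cells: list[int], beta: float,
                         rng: random.Random) -> list[int]:
    """Assumed thermal-noise variant of the majority-3 rule.

    Each cell becomes +1 with probability exp(beta*h) / (2*cosh(beta*h)),
    where h is the sum of the cell and its two neighbours.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        h = cells[i - 1] + cells[i] + cells[(i + 1) % n]
        # (1 + tanh(x)) / 2 equals exp(x) / (2*cosh(x)) and avoids overflow.
        p_up = 0.5 * (1.0 + math.tanh(beta * h))
        new_cells.append(1 if rng.random() < p_up else -1)
    return new_cells


if __name__ == "__main__":
    rng = random.Random(0)
    state = [1] * 50  # start from the all-(+1) configuration
    for _ in range(200):
        state = noisy_majority3_step(state, beta=1.0, rng=rng)
    print("magnetization after 200 noisy steps:", sum(state) / len(state))
```

For large `beta` the noisy rule reduces to the deterministic majority vote, and for `beta = 0` every cell is set to +1 or -1 with equal probability.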

Journal of Physics A: Mathematical and Theoretical, 2008
We study the stochastic dynamics of Ising spin models with random bonds, interacting on finitely connected Poissonian random graphs. We use the dynamical replica method to derive closed dynamical equations for the joint spin-field probability distribution, and solve these within the replica-symmetry ansatz. Although the theory is developed in a general setting, with a view to future applications in various other fields, in this paper we apply it mainly to the dynamics of the Glauber algorithm (extended with cooling schedules) when running on the so-called vertex cover optimization problem. Our theoretical predictions are tested against both Monte Carlo simulations and known results from equilibrium studies. In contrast to previous dynamical analyses based on deriving closed equations for only a small number of scalar order parameters, the agreement between theory and experiment in the present study is nearly perfect.
Journal of Physics: Conference Series, 2010
Random Boolean formulae, generated by a growth process of noisy logical gates, are analyzed using the generating functional methodology of statistical physics. We study the type of functions generated for different input distributions, their robustness for a given level of gate error and its dependence on the formulae depth and complexity and the gates used. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding typical-case phase transitions. Results for error-rates, function-depth and sensitivity of the generated functions are obtained for various gate-type and noise models.