Papers by Daniela De Canditiis
arXiv (Cornell University), Feb 28, 2024
We propose AFTNet, a novel network-constraint survival analysis method based on the Weibull accelerated failure time (AFT) model solved by a penalized likelihood approach for variable selection and estimation. When using the log-linear representation, the inference problem becomes a structured sparse regression problem for which we explicitly incorporate the correlation patterns among predictors using a double penalty that promotes both sparsity and grouping effect. Moreover, we establish the theoretical consistency for the AFTNet estimator and present an efficient iterative computational algorithm based on the proximal gradient descent method. Finally, we evaluate AFTNet performance both on synthetic and real data examples.
jewel: Graphical Models Estimation from Multiple Sources
Journal of Industrial Mathematics, 2016
This paper presents a methodology for assessing and monitoring the cleaning state of a heating, ventilation, and air conditioning (HVAC) system of a building. It consists of a noninvasive method for measuring the amount of dust in the whole ventilation system, that is, the set of filters and air ducts. Specifically, it defines the minimum number of measurements, their timetable, locations, and acquisition conditions. The proposed method promotes early intervention on the system and it guarantees high indoor air quality and proper HVAC working conditions. The effectiveness of the method is demonstrated by experimental results on different case studies.

arXiv (Cornell University), Jul 10, 2023
In this paper we provide explicit upper bounds on some distances between the (law of the) output of a random Gaussian neural network and (the law of) a random Gaussian vector. Our main results concern deep random Gaussian neural networks, with a rather general activation function. The upper bounds show how the widths of the layers, the activation function and other architecture parameters affect the Gaussian approximation of the output. Our techniques, relying on Stein's method and integration by parts formulas for the Gaussian law, yield estimates on distances which are indeed integral probability metrics, and include the convex distance. This latter metric is defined by testing against indicator functions of measurable convex sets, and so allows for accurate estimates of the probability that the output is localized in some region of the space. Such estimates have a significant interest both from a practitioner's and a theorist's perspective.
On the probability of (falsely) connecting two distinct components when learning a GGM
Communications in Statistics - Theory and Methods
In this paper, we extend the result on the probability of (falsely) connecting two distinct components when learning a GGM (Gaussian Graphical Model) by the joint regression-based technique. While the classical regression-based technique learns the neighbours of each node one at a time through a Lasso penalized regression, its joint modification, considered here, learns the neighbours of each node simultaneously through a group Lasso penalized regression.
Cardiac function in adolescents and young adults with 22q11.2 deletion syndrome without congenital heart disease
European Journal of Medical Genetics

Mathematics
In this paper, we consider the problem of estimating the graphs of conditional dependencies between variables (i.e., graphical models) from multiple datasets under Gaussian settings. We present jewel 2.0, which improves our previous method jewel 1.0 by modeling commonality and class-specific differences in the graph structures and better estimating graphs with hubs, making this new approach more appealing for biological data applications. We introduce these two improvements by modifying the regression-based problem formulation and the corresponding minimization algorithm. We also present, for the first time in the multiple graphs setting, a stability selection procedure to reduce the number of false positives in the estimated graphs. Finally, we illustrate the performance of jewel 2.0 through simulated and real data examples. The method is implemented in the new version of the R package jewel.

Journal of Clinical Medicine
Cow’s milk allergy (CMA) is a common condition in the pediatric population. CMA can induce a diverse range of symptoms of variable intensity. It occurs mainly in the first year of life, and if the child is not breastfed, hypoallergenic formula is the dietary treatment. Extensively hydrolyzed cow’s milk formulas (eHF) with documented hypo-allergenicity can be recommended as the first choice, while amino acid-based formulas (AAF) are recommended for patients with more severe symptoms. Hydrolyzed rice-based formulas (HRFs) are a suitable alternative for infants with CMA who cannot tolerate or do not like eHF and for infants with severe forms of CMA. In the present paper, we reviewed the nutritional composition of HRFs as well as studies regarding their efficacy and tolerance in children, and we provided an updated overview of the recent evidence on the use of HRFs in CMA. The available studies provide evidence that HRFs exhibit excellent efficacy and tolerance and seem to be adequate i...

Using frames in statistical signal recovering
Overcomplete representations such as wavelet and windowed Fourier expansions have become mainstays of modern statistical data analysis. The need for overcomplete representations stems from the fact that many real signals are not efficiently represented by a single orthonormal basis, so the use of frames offers a valid alternative. In the present work, frame-based statistical signal recovery methods are presented with particular attention to tight frames. For both the general and the tight frames, a set of practically implementable signal recovery techniques is presented, taking frame-induced correlation structures into account. In particular, the Wiener filter design and its empirical version for statistical signal estimation are generalized to the case of frame operators. The Wiener filter is formally derived as the best linear diagonal estimator in the MSE (Mean Squared Error) sense. However, it turns out that the implementable Wiener filter is neither diagonal nor li...
A resting state EEG study on depressed persons with suicidal ideation
IBRO Neuroscience Reports
Model selection for inferring Gaussian graphical models
Communications in Statistics - Simulation and Computation, 2021

Italian Journal of Pediatrics
Background: The COVID-19 lockdown imposed by governments all over the world caused sudden changes in people’s lifestyles. We aimed to evaluate the impact of lockdown on body mass index (BMI) in a cohort of allergic children and adolescents. Methods: From the first of June until the end of October 2020, we administered a written questionnaire to all patients who, after lockdown, attended a visit at the Pediatric Allergy Unit of the Department of Mother-Child, Urological Science, Sapienza University of Rome. The questionnaire was composed of 10 questions referring to the changes in their daily activities. Data were extracted from the questionnaire and then analyzed considering six variables: BMI before and BMI after lockdown, sugar intake, sport, screens, sleep, and anxiety. Results: One hundred fifty-three patients agreed to answer our questionnaire. Results showed a statistically significant increase in the BMI after lockdown (20.97 kg/m2 ± 2...

Lecture Notes in Computer Science, 2012
In this paper we present a functional Bayesian method for detecting genes which are temporally differentially expressed between several conditions. We identify the nature of differential expression (e.g., a gene is differentially expressed between the first and the second sample but is not differentially expressed between the second and the third) and subsequently we estimate gene expression temporal profiles. The proposed procedure deals successfully with various technical difficulties which arise in microarray time-course experiments, such as a small number of observations, non-uniform sampling intervals and the presence of missing data or repeated measurements. The procedure makes it possible to account for various types of errors, thus offering a good compromise between nonparametric and normality-assumption-based techniques. In addition, all evaluations are carried out using analytic expressions; hence, the entire procedure requires very small computational effort. The performance of the procedure is studied using simulated data.
Convergence of Fourier regularization for smoothing data
na.iac.cnr.it
The classical smoothing data problem is considered in a Sobolev space under the assumption of white noise. A Fourier series method based on regularization plus Generalized Cross Validation (GCV) is considered to approximate the unknown function. This estimator is globally ...
Statistica Sinica, 2007
The problem of estimating the log-spectrum of a stationary time series by Bayesian shrinkage of empirical wavelet coefficients is studied. A model in the wavelet domain that accounts for distributional properties of the log-periodogram at levels of fine detail and approximate normality at coarse levels in the wavelet decomposition, is proposed. The smoothing procedure, called BAMS-LP (Bayesian Adaptive Multiscale Shrinker of Log-Periodogram), ensures that the reconstructed log-spectrum is sufficiently noise-free. It is also shown that the resulting Bayes estimators are asymptotically optimal (in the mean-squared error sense). Comparisons with non-wavelet and wavelet-non-Bayesian methods are discussed.

The conceptual framework for modeling the inertial subrange is strongly influenced by the Kolmogorov cascade phenomenon, which is nowadays the subject of significant reinterpretation. It has been argued that the effects of boundary conditions influence large-scale motion and that direct interaction between large and small scales is possible by means other than passing sequentially through the full cascade. Using longitudinal (u) and vertical (w) velocity and temperature (T) time series measurements collected in the atmospheric surface layer (ASL), we evaluate whether the inertial subrange multifractal function (f(a)) of all three flow variables is influenced by atmospheric stability (x), which is a bulk measure of the effect of boundary conditions on large-scale flow properties for ASL turbulence. This study is the first to demonstrate that x significantly influences f(a) for all three flow variables. Here, statistical significance is evaluated using a novel wavelet-based Functional Analysis of Variance (FANOVA) approach that explicitly considers different classes of x, the flow variable type, and possible interactions between x and the three flow variables.
In this paper we propose a block shrinkage method in the wavelet domain for estimating an unknown function in the presence of Gaussian noise. This shrinkage utilizes an empirical Bayes, block-adaptive approach that accounts for the sparseness of the representation of the unknown function. The modeling is accomplished by using a mixture of two normal-inverse gamma distributions as a joint prior on wavelet coefficients and noise variance in each block at a particular resolution level. This method results in explicit and fast rules. An automatic, level dependent choice for the prior hyperparameters is also suggested. Finally, the performance of the proposed method, BBS (Bayesian Block Shrinkage), is illustrated on the battery of standard test functions and compared to some standard wavelet-based denoising methods.

Electronic Journal of Statistics, 2018
We consider a general statistical linear inverse problem, where the solution is represented via a known (possibly overcomplete) dictionary that allows its sparse representation. We propose two different approaches. A model selection estimator selects a single model by minimizing the penalized empirical risk over all possible models. By contrast with direct problems, the penalty depends on the model itself rather than on its size only as for complexity penalties. A Q-aggregate estimator averages over the entire collection of estimators with properly chosen weights. Under mild conditions on the dictionary, we establish oracle inequalities both with high probability and in expectation for the two estimators. Moreover, for the latter estimator these inequalities are sharp. The proposed procedures are implemented numerically and their performance is assessed by a simulation study.
In this paper the Wiener estimator for signal-denoising is generalized to finite frame operators. In particular, a two-stage procedure which results in a non-linear and non-diagonal estimator is proposed. Advantages and disadvantages with respect to the classical Wiener estimator used with orthonormal basis operator are discussed showing results on standard and real test signals.

A technique for the restoration of the low resonance component and the high resonance component of K independently measured signals is presented. The definition of low and high resonance components is given by the Rational Dilation Wavelet Transform (RADWT), a particular kind of finite frame that provides sparse representations of functions with different oscillation persistence. It is assumed that the signals are measured simultaneously on several independent channels and that in each channel the underlying signal is the sum of two components: the low resonance component and the high resonance component, both sharing some common characteristics between the channels. Component restoration is performed by means of a lasso-type penalty and a back-fitting algorithm. Numerical experiments show the performance of the proposed method in different synthetic scenarios, highlighting the advantage of estimating the two components separately rather than together.