Academia.edu

Statistical Regression

1,772 papers
0 followers
About this topic
Statistical regression is a statistical method used to model and analyze the relationships between a dependent variable and one or more independent variables. It estimates the conditional expectation of the dependent variable given the independent variables, allowing for predictions and insights into the nature of the relationships.
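
As a minimal sketch of the idea in the description above, the following fits a straight line by ordinary least squares and uses it to estimate the conditional expectation of the dependent variable at a new value of the independent variable. The function names (`fit_simple_ols`, `predict`) and the small `hours`/`score` data set are hypothetical, chosen only for illustration; they do not come from any paper listed on this page.

```python
from typing import Sequence, Tuple


def fit_simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict(intercept: float, slope: float, x_new: float) -> float:
    """Estimated conditional mean of y given x = x_new under the fitted line."""
    return intercept + slope * x_new


# Hypothetical illustration data (assumed, not taken from any listed paper).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]

intercept, slope = fit_simple_ols(hours, score)
estimate_at_6 = predict(intercept, slope, 6.0)
```

Here `slope` estimates the change in the expected response per unit change in the predictor, and `estimate_at_6` is the fitted conditional mean at a predictor value of 6. Models with several independent variables follow the same principle but are usually fitted with matrix methods.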

Key research themes

1. Which regression methods currently offer the best predictive performance across diverse datasets?

This theme investigates empirical comparisons among a wide variety of regression techniques, aiming to identify which models provide superior predictive accuracy and computational efficiency across standardized benchmark datasets. Understanding the relative performance aids researchers in selecting appropriate regression tools for novel problems.

Key finding: This comprehensive empirical study compares 77 regression models from 19 distinct families on 83 datasets from the UCI repository, identifying the M5 rule-based model (cubist) and the gradient boosted machine as top... Read more

2. How can widely used regression models be understood and selected for medical research and practical applications?

This theme explores the landscape of regression models relevant in medical and applied research, focusing on the methods' assumptions, computational implementations, and domains of application. It emphasizes practical guidelines, methodological considerations such as collinearity, and software implementation to aid researchers in choosing appropriate regression techniques.

Key finding: This tutorial systematically introduces a broad spectrum of regression models—linear, generalized linear, non-linear, ridge, lasso, Bayesian, support vector, quantile, Cox, among others—highlighting their assumptions,... Read more
Key finding: This work offers a unique exposition of regression analysis techniques with novel contributions including regression decomposition models, set-theoretic approaches to assess regressor contributions, and hidden Markov chain... Read more

3. What strategies help ensure the validity and reliability of regression coefficients and associated sample size estimation?

This theme examines statistical methodologies focused on accurately estimating regression coefficients, particularly how to choose appropriate sample sizes to guarantee that sample regression coefficients reliably estimate population parameters. It extends beyond traditional power analysis by offering probabilistic precision measures backed by simulation and real data.

Key finding: This paper extends the A Priori Procedure (APP), a framework devised to determine sample sizes needed for obtaining sample statistics that are precise estimators of population parameters, to regression coefficients in linear... Read more

All papers in Statistical Regression

In this paper we propose accurate parameter and over-identification tests for indirect inference. Under the null hypothesis the new tests are asymptotically χ²-distributed with a relative error of order n⁻¹. They exhibit better finite... more
This paper introduces two practical constructs for robust one-dimensional distributional summaries: the S M Nazmuz Sakib Quantile Envelope Principle (the Sakib Principle) and the S M Nazmuz Sakib Median-of-Quartiles Theorem (the Sakib... more
The Exponentially Weighted Moving Average (EWMA) control chart is widely implemented in applications in various fields, such as finance, medicine, engineering, and others. In real-world applications such as hospital admissions, share... more
This paper proposes consistent estimators for transformation parameters in semiparametric models. The problem is to find the optimal transformation into the space of models with a predetermined regression structure like additive or... more
A common problem in applied regression analysis is that covariate values may be missing for some observations but imputed values may be available. This situation generates a trade-off between bias and precision: the complete cases are... more
In this paper D- and V-optimal population designs for the quadratic regression model with a random intercept term and with values of the explanatory variable taken from a set of equally spaced, non-repeated time points are considered.... more
Missing covariate values are a common problem in survival data research. The aim of this study is to compare the use of the multiple imputation (MI) and last observation carried forward (LOCF) methods for handling missing covariate... more
This study aims to obtain empirical evidence of the influence of subjective norms and attitudes on whistleblowing intentions. The population of this study is employees of the accounting and finance department of Universitas Negeri Semarang. The... more
In this paper the problem of estimating the scale matrix in a complex elliptically contoured distribution (complex ECD) is addressed. An extended Haff-Stein identity for this model is derived. It is shown that the minimax estimators of the... more
This paper uses model symmetries in the instrumental variable (IV) regression to derive an invariant test for the causal structural parameter. Contrary to popular belief, we show that there exist model symmetries when equation errors are... more
Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected... more
Five robustifications of L2 Boosting for linear regression with various robustness properties are considered. The first two use the Huber loss as implementing loss function for boosting and the second two use robust simple linear... more
We consider post-selection inference for high-dimensional (generalized) linear models. Data carving is a promising technique to perform this task. However, it suffers from the instability of the model selector and hence may lead to poor... more
We study a sieve bootstrap procedure for time series with a deterministic trend. The sieve for constructing the bootstrap is based on autoregressive approximation. Given time series data, one would first use a preliminary estimate of... more
The objective of this doctoral thesis is to apply and develop statistical inference techniques for varying coefficient models. On the one hand, statistical inference techniques based on empirical likelihood are investigated for... more
We consider Bayesian inference using an extension of the family of skew-elliptical distributions studied by . This new class is referred to as bimodal skew-elliptical (BSE) distributions. The elements of the BSE class can take quite... more
When both variables are subject to error in a regression model, the least squares estimators are biased and inconsistent. The measurement error model is more appropriate to fit the data. This study focuses on the problem of how to construct... more
The Bayesian additive regression trees (BART) model is an ensemble method extensively and successfully used in regression tasks due to its consistently strong predictive performance and its ability to quantify uncertainty. BART combines... more
Summary: The design of experiments for generalized non-linear models is investigated and applied to an optical process for characterizing interfaces which is widely used in the physical and natural sciences. Design strategies for overcoming... more
We investigate optimal designs for discriminating between exponential regression models of different complexity, which are widely used in the biological sciences; see, e.g., Landaw (1995) or . We discuss different approaches for the... more
In many applications, there are multiple time series that are hierarchically organized and can be aggregated at several different levels in groups based on products, geography or some other features. We call these "hierarchical time... more
Empirical evidence suggests that many macroeconometric and financial models are subject to both instability and identification problems. We address both issues under the unified framework of time-varying information, which includes changes... more
We study the asymptotic properties of the standard GMM estimator when additional moment restrictions, weaker than the original ones, are available. We provide conditions under which these additional weaker restrictions improve the... more
Given m time series regression models, linear or not, with additive noise components, it is shown how to estimate semiparametrically the predictive probability distribution of one of the time series conditional on past random covariate... more
When discussing non-Gaussian spatially correlated variables, generalized linear mixed models have enough flexibility for modeling various data types. However, the maximum likelihood methods are plagued with substantial calculations for... more
It is more and more frequently the case in applications that the data we observe come from one or more random variables taking values in an infinite dimensional space, e.g. curves. The need to have tools adapted to the nature of these... more
We propose a new regression method to estimate the impact of explanatory variables on quantiles of the unconditional (marginal) distribution of an outcome variable. The proposed method consists of running a regression of the (recentered)... more
In this paper, we attempt to address the robustness issues in circular-circular regression. We consider the Möbius transformation based circular-circular regression model of Kato et al. (2008). Then, we discuss the robustness issue of the... more
We consider a continuous-time stochastic volatility model. The model contains a stationary volatility process, the multivariate density of the finite dimensional distributions of which we aim to estimate. We assume that we observe the... more
The Cube method proposed by enables the selection of balanced samples: that is, samples such that the Horvitz-Thompson estimators of auxiliary variables match the known totals of those variables. As an exact balanced sampling design... more
Maximum likelihood estimation is investigated in the context of linear regression models under partial independence restrictions. These restrictions aim to assume a kind of completeness of a set of predictors Z in the sense that they are... more
A basic assumption of the general linear regression model is that there is no correlation (that is, no multicollinearity) between the explanatory variables. When this assumption is not satisfied, the least squares estimators have large... more
This dissertation is the result of work carried out by myself between October 2009 and October 2012. It includes nothing which is the outcome of work done in collaboration with others, except as specified in the text. Signed:
fitting. Their sampling properties such as bias and variance and asymptotic distributions can be derived. Estimators of their biases and variances can easily be formulated. The global modeling and the local modeling approach both have... more
In this research, a comparison was made between two methods for estimating a semiparametric regression model with the presence of an autocorrelation problem, based on a semiparametric partial linear regression model, which contains a... more
In this paper we propose a new framework for Bayesian nonparametric modelling with continuous covariates. In particular, we allow the nonparametric distribution to depend on covariates through ordering the random variables building the... more
The vast majority of work done on inventory systems is based on the critical assumption of fully observed inventory level dynamics and demand. Modern technology, like the internet, offers a tremendous number of opportunities to... more
This state-level analysis of 2022 vehicle death rates across all 50 states (and the District of Columbia) finds that income inequality and educational attainment are key factors, with contributions also from seat belt use, urbanicity, and... more
Semi-parametric Gaussian mixtures of non-parametric regressions (SPGMNRs) are a flexible extension of Gaussian mixtures of linear regressions (GMLRs). The model assumes that the component regression functions (CRFs) are non-parametric... more
The paper studies GMM inference for subvector hypotheses in structural models where nuisance parameters may not be identified. Such testing problems are often assessed using the plug-in principle (Stock and Wright, 2000) or the projection... more
We develop inference and testing procedures for conditional dispersion and skewness in a nonparametric regression setup based on statistical depth functions. The methods developed can be applied in situations where the response is... more
Quantitative hydrologic forecasting usually requires knowledge of the spatial and temporal distribution of precipitation. First, it is important to accurately measure the precipitation falling over a particular watershed of interest.... more