Academia.edu

LU factorization

313 papers
2 followers
About this topic
LU factorization is a mathematical method used to decompose a matrix into the product of a lower triangular matrix (L) and an upper triangular matrix (U). This technique is commonly employed in numerical analysis to simplify the solution of linear systems, matrix inversion, and determinant calculation.
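For concreteness, here is a minimal sketch of this decomposition using NumPy and SciPy (the library choice is an assumption for illustration, not tied to any particular paper below). Note that scipy.linalg.lu returns a permutation matrix alongside L and U, because partial pivoting is used in practice.

```python
import numpy as np
from scipy.linalg import lu, lu_factor, lu_solve

A = np.array([[4.0, 3.0, 2.0],
              [6.0, 3.0, 1.0],
              [2.0, 1.0, 5.0]])

P, L, U = lu(A)                      # A = P @ L @ U, with partial pivoting
assert np.allclose(A, P @ L @ U)

# Reuse the factorization to solve A x = b and to read off the determinant.
lu_piv = lu_factor(A)                # compact (LU, piv) form
b = np.array([1.0, 2.0, 3.0])
x = lu_solve(lu_piv, b)              # one forward + one backward substitution
assert np.allclose(A @ x, b)

# |det(A)| equals the product of |diag(U)|, since diag(L) is all ones;
# the permutation only fixes the sign.
assert np.isclose(np.prod(np.abs(np.diag(U))), abs(np.linalg.det(A)))
```

Once the factors are available, each additional right-hand side costs only the two triangular solves, which is why the factorization is reused so heavily in the applications listed below.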

Key research themes

1. How can communication costs be minimized in parallel LU factorization on large-scale high-performance computing systems?

This research area focuses on deriving theoretical lower bounds for data movement (communication volume) in parallel LU factorization algorithms and designing practical algorithms that approach these bounds. Minimizing communication costs is critical because data movement dominates runtime and energy consumption on distributed-memory and exascale computing systems, where LU factorization is widely used for solving linear systems in scientific computing.

Key finding: The paper derives a novel parallel I/O lower bound for LU factorization: communicated elements per processor scale as N^3 / (P √M), where N is matrix size, P number of processors, and M local memory size. Building on this... Read more
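As a quick illustration of how this bound behaves, the sketch below evaluates N^3 / (P √M) for a few local memory sizes. The function name lu_comm_lower_bound and all parameter values are hypothetical, chosen only to show the scaling, not taken from the paper.

```python
# Illustrative evaluation of the stated parallel I/O lower bound:
# elements communicated per processor ~ N**3 / (P * sqrt(M)).
from math import sqrt

def lu_comm_lower_bound(N, P, M):
    """Asymptotic lower bound (up to constant factors) on elements
    communicated per processor during parallel LU factorization."""
    return N**3 / (P * sqrt(M))

N = 100_000       # matrix dimension (hypothetical)
P = 4_096         # number of processors (hypothetical)
for M in (2**20, 2**24, 2**28):   # local memory sizes in elements (hypothetical)
    print(f"M = {M:>10d}: ~{lu_comm_lower_bound(N, P, M):.3e} elements/processor")
# Increasing local memory from M to M' reduces the bound by a factor of sqrt(M'/M).
```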

2. What algorithmic and data-structural techniques enable efficient, scalable parallel LU factorization of hierarchical (H-) matrices with dynamic block structures?

Hierarchical matrices (H-matrices) arising from boundary element and partial differential equations provide data-sparse representations enabling efficient approximate factorizations. Parallelizing LU factorization on modern multicore and manycore architectures requires exploiting nested task parallelism with dynamic, non-uniform data structures due to low-rank blocks whose sizes evolve during computation. Addressing this challenge involves advanced task programming models that can manage dependencies despite changing memory layouts while maximizing concurrency and maintaining fine-grained parallel efficiency.

Key finding: The authors develop a task-parallel implementation of LU factorization for H-matrices using the OmpSs programming model supporting weak dependencies and early release, which enables nested fine-grained concurrency. They... Read more
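The sketch below is not the authors' OmpSs/H-matrix code; it is a plain dense, unpivoted, sequential rendering of the tiled LU task structure (factor a diagonal block, triangular-solve the panels, update the trailing submatrix) that such task-parallel runtimes schedule as a dependency graph. The block size nb and the diagonally dominant test matrix are assumptions for the example; H-matrix low-rank blocks and dynamic layouts are not modeled.

```python
import numpy as np
from scipy.linalg import solve_triangular

def lu_nopiv(A):
    """Unblocked LU without pivoting; overwrites A with L (strict lower, unit diag) and U."""
    for j in range(A.shape[0] - 1):
        A[j+1:, j] /= A[j, j]
        A[j+1:, j+1:] -= np.outer(A[j+1:, j], A[j, j+1:])

def tiled_lu(A, nb):
    """Blocked right-looking LU without pivoting, executed as a sequence of 'tasks'."""
    n = A.shape[0]
    for k in range(0, n, nb):
        kb = slice(k, min(k + nb, n))
        rb = slice(min(k + nb, n), n)
        lu_nopiv(A[kb, kb])                                   # "getrf" task on the diagonal block
        if rb.start < n:
            Lkk = np.tril(A[kb, kb], -1) + np.eye(kb.stop - kb.start)
            Ukk = np.triu(A[kb, kb])
            A[kb, rb] = solve_triangular(Lkk, A[kb, rb],      # "trsm" tasks on the row panel
                                         lower=True, unit_diagonal=True)
            A[rb, kb] = solve_triangular(Ukk.T, A[rb, kb].T,  # "trsm" tasks on the column panel
                                         lower=True).T
            A[rb, rb] -= A[rb, kb] @ A[kb, rb]                # "gemm" tasks: trailing update
    return A

# Check on a diagonally dominant matrix (so no pivoting is required).
n, nb = 8, 3
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n)) + n * np.eye(n)
A = M.copy()
tiled_lu(A, nb)
L, U = np.tril(A, -1) + np.eye(n), np.triu(A)
assert np.allclose(L @ U, M)
```

A task runtime would turn each block operation above into a task and derive the dependencies (a gemm on block (i, j) waits for the trsm tasks producing its panels), which is exactly what becomes difficult when block sizes change dynamically, as in the H-matrix case.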

3. How can the relationship between time and energy consumption in multithreaded LU factorization implementations be quantitatively characterized and optimized on multicore processors?

This research theme examines the scalability of LU factorization algorithms in terms of both execution time and energy consumption using multithreading and dynamic voltage and frequency scaling (DVFS) techniques. Understanding these correlations and tradeoffs is essential for optimizing algorithm implementations for energy-efficient high-performance computing, especially as energy constraints become paramount. The goal is to balance performance and power to minimize overall energy use without sacrificing scalability.

Key finding: The study experimentally demonstrates strong correlations between execution time and energy consumption in multithreaded LU factorization (with and without pivoting) and Cholesky factorizations on an Intel Xeon Gold multicore... Read more
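A hedged sketch of the kind of analysis described: energy is modeled as average power times runtime, and a Pearson coefficient quantifies how strongly time and energy track each other across thread counts. All numbers below are invented placeholders for the sake of the example, not measurements from the study.

```python
import numpy as np

threads     = np.array([1, 2, 4, 8, 16, 32])
time_s      = np.array([120.0, 63.0, 34.0, 19.0, 12.0, 9.0])    # hypothetical runtimes
avg_power_w = np.array([55.0, 70.0, 95.0, 140.0, 210.0, 300.0])  # hypothetical average power draw
energy_j    = avg_power_w * time_s                               # E = P_avg * t

r = np.corrcoef(time_s, energy_j)[0, 1]   # Pearson correlation between time and energy
print(f"Pearson r(time, energy) = {r:.3f}")
# A high |r| would indicate that, for a given machine and code, minimizing time also
# tends to minimize energy; DVFS shifts the power column and can change that tradeoff.
```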

All papers in LU factorization

The sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers. Matrices with irregular sparsity patterns... more
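A minimal sketch of the SpMxV kernel in compressed sparse row (CSR) form, cross-checked against SciPy; the gather x[indices[k]] in the inner loop is where irregular sparsity patterns hurt memory locality.

```python
import numpy as np
from scipy.sparse import csr_matrix

def spmv_csr(indptr, indices, data, x):
    """y = A @ x for A stored in CSR form (indptr, indices, data)."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):                       # one output row at a time
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]       # irregular gather from x
    return y

A = csr_matrix(np.array([[4.0, 0.0, 1.0],
                         [0.0, 3.0, 0.0],
                         [2.0, 0.0, 5.0]]))
x = np.array([1.0, 2.0, 3.0])
assert np.allclose(spmv_csr(A.indptr, A.indices, A.data, x), A @ x)
```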
The capability to rapidly execute power flow (PF) calculations allows engineers to remain more confident in the reliability, security, and economical operation of their system in the case of planned or unplanned... more
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or... more
Matrix multiplication is a cornerstone of computational mathematics. Standard algorithms for 2x2 matrices require 8 scalar multiplications, while Strassen's algorithm reduces this to 7. This paper introduces and details Surya Matrix... more
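For reference, here is the standard Strassen recombination for a 2x2 product, shown only to illustrate the 7-versus-8 multiplication count mentioned above; this is not the Surya method the paper introduces.

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply 2x2 matrices with 7 scalar multiplications instead of 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
assert np.allclose(strassen_2x2(A, B), A @ B)
```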
For many finite element problems, when represented as sparse matrices, iterative solvers are found to be unreliable because they can impose computational bottlenecks. Early pioneering work by Duff et al. explored an alternative strategy... more
In this paper, we derive the general expression of the r-th power (r ∈ N) for one type of tridiagonal matrix.
In this study, an algorithm for computing the inverse of periodic k-banded matrices, which are needed for solving differential equations by finite differences, the solution of partial differential equations and the solution... more
A new parallel algorithm for the LU factorization of a given dense matrix A is described. The case of banded matrices is also considered. This algorithm can be combined with Sameh and Brent's [SIAM J. Numer. Anal. 14, 1101–...]... more
In this paper, we compare inverse iteration algorithms on the PowerXCell 8i processor, which is known as a heterogeneous environment. When some of the eigenvalues are close together or there are clusters of eigenvalues,... more
Figure caption excerpt: an example with a memory of size 2. The graph of input data dependencies is shown on the left; the figure on the right corresponds to the partition and schedule produced by the scheduler. Deque Model Data Aware Ready (DMDAR): Deque... more
In the current article, the authors present a new recursive symbolic computational algorithm, which never breaks down, for inverting general periodic pentadiagonal and anti-pentadiagonal matrices. It is a natural generalization of the... more
The Forrest-Tomlin update has stood the test of time within many generations of commercial mathematical programming systems. Its ease of implementation leads to high efficiency and evidently acceptable reliability. We review its relation... more
Two unresolved issues regarding dynamic programming over an infinite time horizon are addressed within this dissertation. Previous research uses policy improvement to find a strong-present-value optimal policy in such systems, but the... more
We describe a set of procedures for computing and updating an LU factorization of a sparse matrix A, where A may be square (possibly singular) or rectangular. The procedures include a Markowitz factorization and a Bartels-Golub update,... more
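As a stand-in illustration (SciPy's SuperLU wrapper, not the package described in the paper), the sketch below factors a sparse matrix once and reuses the factors for several right-hand sides; Markowitz-style ordering and Bartels-Golub updates are not exposed by this interface and are not shown.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [1.0, 3.0, 0.0],
                         [0.0, 2.0, 5.0]]))
lu = splu(A)                           # sparse LU with a fill-reducing column ordering
for b in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 2.0])):
    x = lu.solve(b)                    # each solve reuses the stored factors
    assert np.allclose(A @ x, b)
```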
The calculation of overlaps between many-electron wave functions at different nuclear geometries during nonadiabatic dynamics simulations requires the evaluation of a large number of determinants of matrices that differ only in a few... more
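A generic way to exploit that structure (not necessarily the authors' algorithm) is the matrix determinant lemma, det(A + u v^T) = det(A) (1 + v^T A^{-1} u), which lets a single factorization serve many single-row modifications. The sketch below is a hypothetical NumPy/SciPy illustration with made-up sizes.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n, i = 6, 2                               # hypothetical matrix size and row index
A = rng.standard_normal((n, n))
new_row = rng.standard_normal(n)          # replacement for row i of A

lu_piv = lu_factor(A)                     # factor the base matrix once
det_A = np.linalg.det(A)                  # could equally be read off the LU factors

u = np.zeros(n); u[i] = 1.0               # A' = A + e_i (new_row - A[i])^T replaces row i
v = new_row - A[i]
det_A_new = det_A * (1.0 + v @ lu_solve(lu_piv, u))   # matrix determinant lemma

A_new = A.copy(); A_new[i] = new_row
assert np.allclose(det_A_new, np.linalg.det(A_new))
```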
In this document, we are concerned with the effects of data layouts for nonsquare processor meshes on the implementation of common dense linear algebra kernels such as matrix-matrix multiplication, LU factorizations, or eigenvalue... more
This work presents an innovative approach adopted for the development of a new numerical software framework for accelerating dense linear algebra calculations and its application within an engineering context. In particular, response... more
Most recent HPC platforms have heterogeneous nodes composed of a combination of multi-core CPUs and accelerators, like GPUs. Scheduling on such architectures relies on a static partitioning and cost model. In this paper, we present a... more
In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of XKaapi runtime: work stealing, data-aware work... more
The high computational cost involved in modeling of the progressive fracture simulations using large discrete lattice networks stems from the requirement to solve a new large set of linear equations every time a new lattice bond is... more
This special issue of Linear Algebra and its Applications honours Pauline van den Driessche, who celebrates her sixty-fifth birthday in 2006. Pauline has made significant contributions to mathematics, especially in linear algebra and... more
A class of matrices that simultaneously generalizes the M-matrices and the inverse M-matrices is brought forward and its properties are reviewed. It is interesting to see how this class bridges the properties of the matrices it... more
Calculating the log-determinant of a matrix is useful for statistical computations used in machine learning, such as generative learning which uses the log-determinant of the covariance matrix to calculate the log-likelihood of model... more
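A small sketch of that log-determinant computation: factor once, then sum the logs of |u_ii|, which avoids the overflow and underflow of forming the determinant directly. The synthetic covariance-like matrix is an assumption for the example.

```python
import numpy as np
from scipy.linalg import lu_factor

rng = np.random.default_rng(0)
S = rng.standard_normal((200, 50))
C = S.T @ S + 1e-3 * np.eye(50)                # synthetic SPD, covariance-like matrix

LU, piv = lu_factor(C)
logdet = np.sum(np.log(np.abs(np.diag(LU))))   # log|det(C)| = sum of log|u_ii|
sign, logdet_ref = np.linalg.slogdet(C)
assert np.isclose(logdet, logdet_ref)
```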
This paper extends the ideas behind Bareiss's fraction-free Gauss elimination algorithm in a number of directions. First, in the realm of linear algebra, algorithms are presented for fraction-free LU "factorization" of a... more
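To illustrate the fraction-free idea that Bareiss's elimination is built on (the sketch below is the classical integer-matrix determinant version, not the paper's extensions): every division is exact, so no fractions appear and intermediate entry growth stays controlled.

```python
def bareiss_det(M):
    """Determinant of an integer matrix via Bareiss fraction-free elimination."""
    A = [list(map(int, row)) for row in M]
    n, sign, prev = len(A), 1, 1
    for k in range(n - 1):
        if A[k][k] == 0:                         # simple row swap if the pivot vanishes
            swap = next((r for r in range(k + 1, n) if A[r][k] != 0), None)
            if swap is None:
                return 0
            A[k], A[swap] = A[swap], A[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # this division is always exact (the fraction-free property)
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
            A[i][k] = 0
        prev = A[k][k]
    return sign * A[-1][-1]

assert bareiss_det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) == -3
```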
Steganography is one of the most important tools in the data security field as there is a huge amount of data transferred each moment over the internet. Hiding secret messages in an image has been widely used because the images are mostly... more
In this work an architecture of an automatically tuned linear algebra library proposed in previous works is extended in order to adapt it to platforms where both the CPU load and the network traffic vary. During the installation process... more
We study the impact of non-uniform memory accesses (NUMA) on the solution of dense general linear systems using an LU factorization algorithm. In particular we illustrate how an appropriate placement of the threads and memory on a NUMA... more
In this paper we study properties of definite quadratic eigenproblems. Free vibrations of fluid-solid structures are governed by a nonsymmetric eigenvalue problem. This problem can be transformed into a definite quadratic eigenvalue... more
In this paper, we describe a reliable symbolic computational algorithm for inverting general cyclic heptadiagonal matrices by using parallel computing along with recursion. The algorithm is implementable to the Computer Algebra... more
The tondo Die Anbetung der hl. drei Könige (The Adoration of the Magi), today in the Gemäldegalerie in Berlin, is attributed to Domenico Veneziano. In the foreground, a group of noble figures in magnificent garments, among them the Three Kings, attests to... more
Maurice Potron (1872-1942) was a French Jesuit and mathematician whose main source of inspiration in economics was the encyclical Rerum Novarum. With virtually no knowledge of economic theory, he wrote down a linear model of production in... more
One wishes to remove k − 1 edges of a vertex-weighted tree T such that the weights of the k induced connected components are approximately the same. How well can one do it? In this paper, we investigate such k-separators for quasi-binary... more
We consider the finite element environment Getfem++, which is a C++ library of generic finite element functionalities and allows for parallel distributed data manipulation and assembly. For the solution of the large sparse linear... more
Solving linear systems is a fundamental task in several areas of mathematics and engineering, playing a crucial role in solving real-world problems. MATLAB, a powerful numerical computing platform, offers a wide range of tools and... more
This work deals with the numerical solution of systems of oscillatory second-order differential equations which often arise from the semi-discretization in space of partial differential equations. Since these differential equations... more
Subspace channel estimation methods have been studied widely, where the subspace of the covariance matrix is decomposed to separate the signal subspace from the noise subspace. The decomposition is normally done... more
Linear feature space transformations are often used for speaker or environment adaptation. Usually, numerical methods are sought to obtain solutions. In this paper, we derive a closed-form solution to ML estimation of full feature... more
The full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or charge, for personal research or study, educational, or not-for-profit purposes provided that: • a full... more
Parallel loops are an important part of OpenMP programs. Efficient scheduling of parallel loops can improve performance of the programs. The current OpenMP specification only offers three options for loop scheduling, which are... more
A nonsingular real matrix A is said to be inverse-positive if all the elements of its inverse are nonnegative. This class of matrices contains the M-matrices, from which inherit some of their properties and applications, especially in... more
In this work, we consider the problem of calculating the generalized Moore–Penrose inverse, which is essential in many applications of graph theory. We propose an algorithm for massively parallel systems based on the recursive... more
We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which discovers the... more
We present a prototype task-parallel algorithm for the solution of hierarchical symmetric positive definite linear systems via the ℋ-Cholesky factorization that builds upon the parallel programming standards and associated runtimes for... more
We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target the scenario... more