Papers by Gordon Erlebacher
Advances in game-based learning, 2019
Modeling of seismic wave propagation using high-order finite elements with MPI on a cluster of 192 GPUs
HAL (Le Centre pour la Communication Scientifique Directe), May 16, 2010

Design of Gameplay for Learning
Advances in game-based learning, Dec 8, 2018
A common skepticism about educational games is that learning and play are frequently not well integrated: the skill or content to be used and learned lacks a semantic or meaningful relation with the fantasy and challenge elements and can be easily swapped without influencing gameplay. In this chapter, we describe and analyze design challenges associated with the core components of gameplay (game mechanics and the narrative scheme as it relates to learning) and review the gameplay design propositions and infield test findings of E-Rebuild. Via a retrospective investigation of design features and strategies in terms of learnability and playability (i.e., capability of activating knowledge-based cognitive performance without interrupting gameplay), this chapter aims to report and discuss how domain-specific learning is integrated in, and activated by, core game actions, rules, game objects, and the game world design.
Mathematics, May 24, 2023

Accelerating a 3D finite-difference wave propagation code by a factor of 50 and a spectral-element code by a factor of 25 using a cluster of GPU graphics cards
EGU General Assembly Conference Abstracts, May 1, 2010
We first accelerate a three-dimensional finite-difference in the time domain (FDTD) wave propagation code by a factor of about 50 using Graphics Processing Unit (GPU) computing on a cheap NVIDIA graphics card with the CUDA programming language. We implement the code in CUDA in the case of the fully heterogeneous elastic wave equation. We also implement Convolution Perfectly Matched Layers (CPMLs) on the graphics card to efficiently absorb outgoing waves on the fictitious edges of the grid. We show that the code that runs on the graphics card gives the expected results by comparing our results to those obtained by running the same simulation on a classical processor core. The methodology that we present can be used for Maxwell's equations as well because their form is similar to that of the seismic wave equation written in velocity vector and stress tensor. We then implement a high-order finite-element (spectral-element) application, which performs the numerical simulation of seismic wave propagation resulting for instance from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a cluster of NVIDIA Tesla graphics cards using the CUDA programming language and non-blocking message passing based on MPI. We compare it to the implementation in C and MPI on a classical cluster of CPU nodes. We use mesh coloring to efficiently handle summation operations over degrees of freedom on an unstructured mesh, and we exchange information between nodes using non-blocking MPI messages. Using non-blocking communications allows us to overlap the communications across the network and the data transfer between the GPU card and the CPU node on which it is installed with calculations on that GPU card.
We perform a number of numerical tests to validate the single-precision CUDA and MPI implementation and assess its accuracy. We then analyze performance measurements and on average we obtain a speedup of 20x to 25x.
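The mesh-coloring idea mentioned in this abstract can be sketched in a few lines. The snippet below is only an illustration in plain Python, not the authors' CUDA code: the function names (`color_elements`, `assemble_by_color`), the greedy coloring strategy, and the tiny 2x2 quadrilateral mesh are all assumptions made for the example. The point it shows is that once elements sharing a node have different colors, all elements of one color can add into the global array without write conflicts.

```python
def color_elements(elements):
    """Greedy coloring: two elements that share a node never get the same color."""
    node_to_elems = {}
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems.setdefault(n, []).append(e)

    colors = [-1] * len(elements)
    for e, nodes in enumerate(elements):
        # Colors already taken by neighbouring elements (those sharing a node).
        used = {colors[other]
                for n in nodes
                for other in node_to_elems[n]
                if colors[other] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[e] = c
    return colors


def assemble_by_color(elements, local_values, num_nodes):
    """Sum element contributions into a global array, one color at a time."""
    colors = color_elements(elements)
    global_sum = [0.0] * num_nodes
    for c in range(max(colors, default=-1) + 1):
        # Within one color no two elements touch the same node, so on a GPU
        # this inner loop could run in parallel without atomic operations.
        for e, nodes in enumerate(elements):
            if colors[e] != c:
                continue
            for local_idx, n in enumerate(nodes):
                global_sum[n] += local_values[e][local_idx]
    return global_sum, colors


# Hypothetical 2x2 mesh of quadrilaterals on a 3x3 grid of nodes (0..8).
elements = [(0, 1, 4, 3), (1, 2, 5, 4), (3, 4, 7, 6), (4, 5, 8, 7)]
local_values = [[1.0] * 4 for _ in elements]
global_sum, colors = assemble_by_color(elements, local_values, 9)
```

In this toy mesh every element touches the central node, so each element ends up with its own color; on a realistic unstructured mesh a handful of colors typically covers many elements each.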
This dissertation is concerned with efficient compilation of our Java-based, high-performance, library-oriented, SPMD-style, data parallel programming language: HPJava.

Accelerating a 3D finite-difference wave propagation code and a spectral-element code using a cluster of GPU graphics cards
HAL (Le Centre pour la Communication Scientifique Directe), May 2, 2010
We first accelerate a three-dimensional finite-difference in the time domain (FDTD) wave propagation code by a factor of about 50 using Graphics Processing Unit (GPU) computing on a cheap NVIDIA graphics card with the CUDA programming language. We implement the code in CUDA in the case of the fully heterogeneous elastic wave equation. We also implement Convolution Perfectly Matched Layers (CPMLs) on the graphics card to efficiently absorb outgoing waves on the fictitious edges of the grid. We show that the code that runs on the graphics card gives the expected results by comparing our results to those obtained by running the same simulation on a classical processor core. The methodology that we present can be used for Maxwell's equations as well because their form is similar to that of the seismic wave equation written in velocity vector and stress tensor. We then implement a high-order finite-element (spectral-element) application, which performs the numerical simulation of seismic wave propagation resulting for instance from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a cluster of NVIDIA Tesla graphics cards using the CUDA programming language and non-blocking message passing based on MPI. We compare it to the implementation in C and MPI on a classical cluster of CPU nodes. We use mesh coloring to efficiently handle summation operations over degrees of freedom on an unstructured mesh, and we exchange information between nodes using non-blocking MPI messages. Using non-blocking communications allows us to overlap the communications across the network and the data transfer between the GPU card and the CPU node on which it is installed with calculations on that GPU card.
We perform a number of numerical tests to validate the single-precision CUDA and MPI implementation and assess its accuracy. We then analyze performance measurements and on average we obtain a speedup of 20x to 25x.
Chronicle of Designing a Game-Based Learning Platform
Advances in game-based learning, Dec 8, 2018
Our phenomenological examination of learning game design is situated in a four-year, longitudinal design-based research project that encompasses iterative design processes to develop, refine, and study a game-based learning platform called E-Rebuild. This chapter presents an introductory overview of the four facets of the interdisciplinary educational game design: interdisciplinary collaboration, learning-play integration, integrative task and assessment design, and game-based learning support. We then provide a design chronicle of E-Rebuild as the key setting of the phenomenon examined, by explaining its iterative design, testing, and refining processes. The authors’ researcher positionality and reflective summaries of design experiences are presented as well.

Interweaving Task Design and In-Game Measurement
Advances in game-based learning, Dec 8, 2018
There are two important design issues related to game-based learning (GBL) in school settings: (a) the intrinsic integration of content-related tasks in gameplay and (b) the real-time capture and analysis of in-game performance data. In this chapter, we describe an integrative design approach that aims to interweave game-based task design with in-game assessment of learning. Extending other GBL projects in which the mechanism of data mining for assessment was created after game development, in E-Rebuild we have designed the evidence-centered, data-driven assessment during the course of game design. Design-based research findings on emergent core design processes and functional conjectures on the approaches of task generation and evidence accumulation are discussed, with support of infield observations on the implementation feasibility and outcomes of various design assumptions.

Interdisciplinary Design Activities and Patterns
Advances in game-based learning, Dec 8, 2018
In this chapter, we provide a reflective and analytical description of the interdisciplinary design activities of E-Rebuild, identify driving design questions and salient design patterns that capture and frame the essence of E-Rebuild development, and discuss distilled meta-generalizations that help to decompose the interdisciplinary learning game design processes to inform future work related to design, research, and deployment. The description of salient patterns of interdisciplinary game design in this chapter provides a contextualized design narrative/account of core design processes along with an analytical synthesis of core design pattern elements: a design problem statement with its context and specifics, the solution or technique for solving the stated problem, and the pattern of transferring or scaling this design solution or design move.
Direct numerical simulation of compressible turbulence in a homogeneous shear flow
Springer eBooks, Apr 7, 2008

Designing Dynamic Support for Game-Based Learning
Advances in game-based learning, Dec 8, 2018
The role of support for game-based learning cannot be overemphasized. It remains inconclusive as to what, when, and how support for learning should be designed and implemented to foster learners’ extended engagement, in-game performance, and game-based disciplinary knowledge learning and transfer. In this chapter, we review prevalent support features in digital games, prior theoretical and empirical research on scaffolding and support in game-based learning, and the support design conjectures deemed effective. We then share our observations of the obstacles that learners experienced in game-based learning processes when using E-Rebuild, describe the corresponding learning support strategies and features, and report the findings from the iterative testing and refinement of these support features. Propositions for future research and the design of support for game-based learning are discussed in relation to the current project findings and prior research.

Springer eBooks, 1992
In this study, the linear stability of high-speed, rotating Couette flow to two- and three-dimensional disturbances in finite-gap spacings, including the full effects of compressibility and viscosity, is considered. Particularly, the combined effects of Mach number, Reynolds number, radial heating, and gap spacing are investigated. For a stationary outer cylinder, the primary instability is an axisymmetric mode independent of the Mach number. Increasing Mach numbers have a destabilizing effect for wide gaps, and a stabilizing effect for narrow gaps. For a sufficiently fast, counter-rotating outer cylinder, the primary instability becomes a three-dimensional traveling wave. Compressibility has a stabilizing effect on these modes regardless of the gap width; also, heating at the outer cylinder stabilizes the flow. Bicritical points for the primary instability corresponding to the crossover of the azimuthal wave numbers are determined for cylinders counter-rotating with equal angular speed.
arXiv (Cornell University), May 4, 2018
Artificial neural networks (ANNs) may not be worth their computational/memory costs when used in mobile phones or embedded devices. Parameter-pruning algorithms combat these costs, with some algorithms capable of removing over 90% of an ANN's weights without harming the ANN's performance. Removing weights from an ANN is a form of regularization, but existing pruning algorithms do not significantly improve generalization error. We show that pruning ANNs can improve generalization if pruning targets large weights instead of small weights. Applying our pruning algorithm to an ANN leads to a higher image classification accuracy on CIFAR-10 data than applying the popular regularizer dropout. The pruning couples this higher accuracy with an 85% reduction of the ANN's parameter count.
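The contrast between pruning small weights and pruning large weights can be made concrete with a toy mask builder. This is a minimal sketch under stated assumptions, not the algorithm from the paper: the abstract does not give the selection rule, pruning schedule, or retraining procedure, so the function `prune_mask`, its `fraction` and `target` parameters, and the sample weights are all hypothetical.

```python
def prune_mask(weights, fraction, target="large"):
    """Return a 0/1 mask that removes `fraction` of the weights by magnitude.

    target="large" removes the largest-magnitude weights,
    target="small" removes the smallest-magnitude ones (classic magnitude pruning).
    """
    k = int(round(fraction * len(weights)))
    order = sorted(range(len(weights)),
                   key=lambda i: abs(weights[i]),
                   reverse=(target == "large"))
    removed = set(order[:k])
    return [0 if i in removed else 1 for i in range(len(weights))]


# Hypothetical weights of a single layer, flattened to a list.
weights = [0.05, -1.2, 0.4, -0.01, 0.9, -0.3]

mask_large = prune_mask(weights, 0.5, target="large")  # drops -1.2, 0.9, 0.4
mask_small = prune_mask(weights, 0.5, target="small")  # drops -0.01, 0.05, -0.3

# Applying a mask zeroes out the removed weights.
pruned = [w * m for w, m in zip(weights, mask_large)]
```

In a real network the mask would be applied per layer or globally over tensors, and training would continue with the mask in place; those choices are outside what the abstract specifies.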

Multi-GPU solutions of geophysical PDEs with radial basis function-generated finite differences
Many numerical methods based on Radial Basis Functions (RBFs) are gaining popularity in the geosciences due to their competitive accuracy, functionality on unstructured meshes, and natural extension into higher dimensions. One method in particular, the Radial Basis Function-generated Finite Differences (RBF-FD), is drawing attention due to its comparatively low computational complexity versus other RBF methods, high-order accuracy (6th to 10th order is common), and parallel nature. Similar to classical Finite Differences (FD), RBF-FD computes weighted differences of stencil node values to approximate derivatives at stencil centers. The method differs from classical FD in that the basis functions used to calculate the differentiation weights are n-dimensional radially symmetric functions rather than one-dimensional polynomials. This allows for generalization to n-dimensional space on completely scattered node layouts. Although RBF-FD was first proposed nearly a decade ago, it is only now gaining a critical mass to compete against well-known competitors in modeling like FD, Finite Volume and Finite Element. To truly contend, RBF-FD must transition from single-threaded MATLAB environments to large-scale parallel architectures. Many HPC systems around the world have made the transition to Graphics Processing Unit (GPU) accelerators as a solution for added parallelism and higher throughput. Some systems offer significantly more GPUs than CPUs. As the problem size, N, grows larger, it behooves us to work on parallel architectures, be it CPUs or GPUs. In addition to demonstrating the ability to scale to hundreds or thousands of computer nodes, this work introduces parallelization strategies that span RBF-FD across multi-GPU clusters. The stability and accuracy of the parallel implementation are verified through the explicit solution of two PDEs.
Additionally, a parallel implementation for implicit solutions is introduced as part of continued research efforts. This work establishes RBF-FD as a contender in the arena of distributed HPC numerical methods.
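The statement that RBF-FD "computes weighted differences of stencil node values" can be illustrated with a small one-dimensional example. The sketch below is an assumption-laden toy, not the multi-GPU implementation described above: it uses a Gaussian RBF, a five-node stencil, a shape parameter `eps = 1.0`, and a hand-written dense solver, all chosen only for illustration. The weights are obtained by requiring the stencil formula to differentiate each basis function exactly at the stencil center.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def rbf_fd_weights_dx(nodes, center, eps):
    """Weights w with sum_j w[j] * f(nodes[j]) ~ f'(center), Gaussian RBF in 1D."""
    phi = lambda r: math.exp(-(eps * r) ** 2)
    # Row j enforces exactness for the basis function centered at nodes[j].
    A = [[phi(xi - xj) for xi in nodes] for xj in nodes]
    # d/dx of phi(x - xj), evaluated at x = center.
    rhs = [-2.0 * eps ** 2 * (center - xj) * phi(center - xj) for xj in nodes]
    return solve(A, rhs)


# Hypothetical stencil: five equispaced nodes around the center x = 0.
nodes = [-0.2, -0.1, 0.0, 0.1, 0.2]
w = rbf_fd_weights_dx(nodes, center=0.0, eps=1.0)

# Apply the weights to f(x) = sin(x); the exact derivative at 0 is cos(0) = 1.
approx = sum(wj * math.sin(xj) for wj, xj in zip(w, nodes))
```

The same construction carries over to scattered nodes in higher dimensions by replacing the scalar distance with a Euclidean norm, which is what makes the method attractive on unstructured node layouts.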
standards-based approach to building, composing, registering and discovering services [Hoscheck+02]. The impact and scope of this thesis are at three distinct levels. First, it suggests changes in the way services are designed. Second, it outlines a modular approach to this problem which can be expanded incrementally to deal with future changes in the nature of these devices. Finally, although this thesis has been organized in the context of device capabilities, some of the ideas of this thesis could be extended to deal with changing protocol, transport and communication standards with its software architectural idea.
Algorithms, Apr 3, 2023

Concurrency and Computation: Practice and Experience, 2007
We present the initial architecture and implementation of VLab, a Grid and Web-Service-based system for enabling distributed and collaborative computational chemistry and material science applications for the study of planetary materials. The requirements of VLab include job preparation and submission, job monitoring, data storage and analysis, and distributed collaboration. These components are divided into client entry (input file creation, visualization of data, task requests) and back-end services (storage, analysis, computation). Clients and services communicate through NaradaBrokering, a publish/subscribe Grid middleware system that identifies specific hardware information with topics rather than IP addresses. We describe three aspects of VLab in this paper: (1) managing user interfaces and input data with JavaBeans and Java Server Faces; (2) integrating Java Server Faces with the Java CoG Kit; and (3) designing a middleware framework that supports collaboration. To prototype our collaboration and visualization infrastructure, we have developed a service that transforms a scalar data set into its wavelet representation. General adaptors are placed between the endpoints and NaradaBrokering, which serve to isolate the clients/services from the middleware. This permits client and service development independently of potential changes to the middleware.