Structured and Unstructured Entropy-Stable High-Order Methods for Simulating High-Speed Compressible Turbulent Flows
Abstract not provided.
This memo summarizes the aerodynamic drag scoping work done for Goodyear in early FY18. The work evaluates the feasibility of using Sierra/Low-Mach (Fuego) for drag predictions of rolling tires, focusing on the effects of tire features such as lettering, sidewall geometry, rim geometry, and interaction with the vehicle body. The work is broken into two parts. Part 1 consisted of investigation of a canonical validation problem (turbulent flow over a cylinder) using existing tools with different meshes and turbulence models. Part 2 involved calculating drag differences over plate geometries with simple features (ridges and grooves), defined by Goodyear, of approximately the size of interest for a tire. The results of Part 1 show the level of noise to be expected in a drag calculation and highlight the sensitivity of absolute predictions to model parameters such as mesh size and turbulence model. There is 20-30% noise in the experimental measurements on the canonical cylinder problem, and a similar level of variation between different meshes and turbulence models. Part 2 shows that there is a notable difference in the predicted drag on the sample plate geometries; however, the computational cost of extending the LES model to a full tire would be significant. This cost could be reduced by implementing more sophisticated wall and turbulence models (e.g. detached eddy simulations - DES) and by focusing the mesh refinement on feature subsets, with the goal of comparing configurations rather than absolute predictivity for the whole tire.
Wind applications require the ability to simulate rotating blades. To support this use-case, a novel design-order sliding mesh algorithm has been developed and deployed. The hybrid method combines the control volume finite element methodology (CVFEM) with concepts found within a discontinuous Galerkin (DG) finite element method (FEM) to manage a sliding mesh. The method has been demonstrated to be design-order for the tested polynomial basis (P=1 and P=2) and has been deployed to provide production simulation capability for a Vestas V27 (225 kW) wind turbine. Other stationary and canonical rotating flow simulations are also presented. As the majority of wind-energy applications are driving extensive usage of hybrid meshes, a foundational study that outlines near-wall numerical behavior for a variety of element topologies is presented. Results indicate that the proposed nonlinear stabilization operator (NSO) is an effective stabilization methodology to control Gibbs phenomena at large cell Peclet numbers. The study also provides practical mesh resolution guidelines for future analysis efforts. Application-driven performance and algorithmic improvements have been carried out to increase robustness of the scheme on hybrid production wind energy meshes. Specifically, the Kokkos-based Nalu Kernel construct outlined in the FY17/Q4 ExaWind milestone has been transitioned to the hybrid mesh regime. This code base is exercised within a full V27 production run. Simulation timings for parallel search and custom ghosting are presented. As the low-Mach application space requires implicit matrix solves, the cost of matrix reinitialization has been evaluated on a variety of production meshes. Results indicate that at low element counts, i.e., fewer than 100 million elements, matrix graph initialization and preconditioner setup times are small. 
However, as mesh sizes increase, e.g., 500 million elements, simulation time associated with "set-up" costs can increase to nearly 50% of overall simulation time when using the full Tpetra solver stack and nearly 35% when using a mixed Tpetra/Hypre-based solver stack. The report also highlights the project achievement of surpassing the 1 billion element mesh scale for a production V27 hybrid mesh. A detailed timing breakdown is presented that again suggests work to be done in the setup events associated with the linear system. In order to mitigate these initialization costs, several application paths have been explored, all of which are designed to reduce the frequency of matrix reinitialization. Methods such as removing Jacobian entries on the dynamic matrix columns (in concert with increased inner equation iterations) and lagging of Jacobian entries have reduced setup times at the cost of numerical stability. Artificially increasing, or bloating, the matrix stencil to ensure that full Jacobians are included has also been developed, with results suggesting that this methodology is useful in decreasing reinitialization events without loss of matrix contributions. With the above foundational advances in computational capability, the project is well positioned to begin scientific inquiry on a variety of wind-farm physics such as turbine/turbine wake interactions.
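The design-order claim for P=1 and P=2 in the abstract above is the kind of result usually checked with a mesh-refinement study. The following is a minimal sketch of that check, not the report's own procedure: it assumes that "design order" means an error decay of order P+1, and the function names and synthetic error values are hypothetical.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Observed convergence rate from errors on two mesh spacings."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


def synthetic_error(h, p, c=1.0):
    """Hypothetical error model err = c * h**(p + 1) (an assumption, not report data)."""
    return c * h ** (p + 1)


if __name__ == "__main__":
    for p in (1, 2):
        rate = observed_order(0.1, synthetic_error(0.1, p),
                              0.05, synthetic_error(0.05, p))
        print(f"P={p}: observed order {rate:.3f}, assumed design order {p + 1}")
```

In practice the errors would come from a manufactured or analytical solution evaluated on successively refined meshes, and the observed rate would be compared against the expected one.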
2018 Spring Technical Meeting of the Western States Section of the Combustion Institute, WSSCI 2018
This study addresses predicting the internal thermochemical state in buoyant fire plumes using large-eddy simulations (LES) with a tabular flamelet library for the underlying flame chemistry. Buoyant fire plumes are characterized by moderate turbulent mixing, soot growth and oxidation, and radiation transport. Soot moments, mixture fraction, and enthalpy evolve in the LES with soot source terms given by the non-adiabatic flamelet library. Participating media radiation transport is predicted using the discrete ordinates method with source terms also from the flamelet library, and the LES subgrid-scale modeling is based on a one-equation kinetic-energy sub-filter model. This library is generated with flamelet states that include unsteady heat loss through extinction, nominally representing radiative quenching. We describe the performance of this model both in the context of a laminar coflow configuration where extensive measurements are available and in buoyant turbulent fire plumes where measurements are more global.
Abstract not provided.
The former Nalu interior heterogeneous algorithm design, which was originally designed to manage matrix assembly operations over all elemental topology types, has been modified to operate over homogeneous collections of mesh entities. This newly templated kernel design allows for removal of workset variable resize operations that were formerly required at each loop over a Sierra ToolKit (STK) bucket (nominally, 512 entities in size). Extensive usage of the Standard Template Library (STL) std::vector has been removed in favor of intrinsic Kokkos memory views. In this milestone effort, the transition to Kokkos as the underlying infrastructure to support performance and portability on many-core architectures has been deployed for key matrix algorithmic kernels. A unit-test driven design effort has developed a homogeneous entity algorithm that employs a team-based thread parallelism construct. The STK Single Instruction Multiple Data (SIMD) infrastructure is used to interleave data for improved vectorization. The collective algorithm design, which allows for concurrent threading and SIMD management, has been deployed for the core low-Mach element-based algorithm. Several tests to ascertain SIMD performance on Intel KNL and Haswell architectures have been carried out. The performance test matrix includes evaluation of both low- and higher-order methods. The higher-order low-Mach methodology builds on polynomial promotion of the core low-order control volume finite element method (CVFEM). Performance testing of the Kokkos-view/SIMD design indicates low-order matrix assembly kernel speed-up ranging between two and four times depending on mesh loading and node count. Better speedups are observed for higher-order meshes (currently only P=2 has been tested), especially on KNL. The increased workload per element on higher-order meshes benefits from the wide SIMD width on KNL machines. 
Combining multiple threads with SIMD on KNL achieves a 4.6x speedup over the baseline, with assembly timings faster than those observed on the Haswell architecture. The computational workload of higher-order meshes, therefore, seems ideally suited for the many-core architecture and justifies further exploration of higher-order methods on NGP platforms. A Trilinos/Tpetra-based multi-threaded GMRES preconditioned by symmetric Gauss-Seidel (SGS) represents the core solver infrastructure for the low-Mach advection/diffusion implicit solves. The threaded solver stack has been tested on small problems on NREL's Peregrine system using the newly developed and deployed Kokkos-view/SIMD kernels. Efforts are underway to deploy the Tpetra-based solver stack on the NERSC Cori system to benchmark its performance at scale on KNL machines.
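The idea of interleaving data for vectorization, mentioned in the abstract above, can be illustrated with a small layout transformation. This is only a conceptual sketch in plain Python and does not use or reproduce the STK SIMD or Kokkos APIs; the function name, the padding value, and the assumption that one SIMD lane maps to one element are all illustrative.

```python
def interleave_for_simd(element_data, simd_width, pad=0.0):
    """Regroup per-element values so that one value from `simd_width` elements is contiguous.

    element_data: list of equal-length lists, one list of values per element.
    Returns a nested list indexed as [group][value_index][lane]; the last group
    is padded with `pad` when the element count is not a multiple of simd_width.
    """
    n_values = len(element_data[0])
    groups = []
    for start in range(0, len(element_data), simd_width):
        chunk = element_data[start:start + simd_width]
        # Pad the final, partially filled group so every group has simd_width lanes.
        chunk = chunk + [[pad] * n_values] * (simd_width - len(chunk))
        groups.append([[elem[k] for elem in chunk] for k in range(n_values)])
    return groups


if __name__ == "__main__":
    # Six hypothetical elements with two values each, packed four lanes wide.
    data = [[2 * i, 2 * i + 1] for i in range(6)]
    for group in interleave_for_simd(data, simd_width=4):
        print(group)
```

With this layout, a kernel that applies the same arithmetic to every element can process one lane group per vector instruction, which is why the wider vector units on KNL would favor it.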
Abstract not provided.
This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energy Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured-grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application - namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales. 
The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.
2017 Fall Technical Meeting of the Western States Section of the Combustion Institute, WSSCI 2017
A 1-m diameter methane fire plume has been studied using a large eddy simulation (LES) methodology. Eddy dissipation concept (EDC) and steady flamelet combustion models were used to describe interactions between buoyancy-induced turbulence and gas-phase combustion. Detailed comparisons with experimental data showed that the simulation is sensitive to the combustion model and mesh resolution. In particular, any excessive mixing results in a wider and more diffusive plume. As mesh resolution increases, the current simulations demonstrate a tendency toward excessive mixing.
Proceedings of the Combustion Institute
Turbulent fluctuations of the scalar dissipation rate have a major impact on extinction in non-premixed combustion. Recently, an unsteady extinction criterion has been developed (Hewson, 2013) that predicts extinction dependent on the duration and the magnitude of dissipation rate fluctuations exceeding a critical quenching value; this quantity is referred to as the dissipation impulse. The magnitude of the dissipation impulse corresponding to unsteady extinction is related to the difficulty with which a flamelet is extinguished, based on the steady-state S-curve. In this paper we evaluate this new extinction criterion for more realistic dissipation rates by evolving a stochastic Ornstein-Uhlenbeck process for the dissipation rate. A comparison between unsteady flamelet evolution using this dissipation rate and the extinction criterion exhibits good agreement. The rate of predicted extinction is examined over a range of Damköhler and Reynolds numbers and over a range of the extinction difficulty. The results suggest that the rate of extinction is proportional to the average dissipation rate and the area under the dissipation rate probability density function exceeding the steady-state quenching value. It is also inversely related to the actual probability that this steady-state quenching dissipation rate is observed and the difficulty of extinction associated with the distance between the upper and middle branches of the S-curve.
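The combination of a stochastic dissipation rate and an impulse-based criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: it assumes the Ornstein-Uhlenbeck process is applied to the logarithm of the dissipation rate, and that the dissipation impulse is the time integral of the excess above the quenching value over one excursion. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_log_ou(n_steps, dt, tau, sigma, mean_log_chi, rng):
    """Euler-Maruyama path of chi(t), where x = ln(chi) follows an OU process.

    dx = -(x - mean_log_chi) / tau * dt + sigma * sqrt(2 / tau) * dW,
    so the stationary standard deviation of x is sigma.
    """
    x = mean_log_chi
    chi = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x += -(x - mean_log_chi) / tau * dt + sigma * math.sqrt(2.0 * dt / tau) * noise
        chi.append(math.exp(x))
    return chi


def excursion_impulses(chi, chi_q, dt):
    """Time-integrated excess (chi - chi_q) for each contiguous excursion above chi_q."""
    impulses, current, inside = [], 0.0, False
    for value in chi:
        if value > chi_q:
            current += (value - chi_q) * dt
            inside = True
        elif inside:
            impulses.append(current)
            current, inside = 0.0, False
    if inside:  # the series ended while still above the quenching value
        impulses.append(current)
    return impulses


if __name__ == "__main__":
    rng = random.Random(0)
    chi = simulate_log_ou(n_steps=20000, dt=1e-3, tau=0.1, sigma=1.0,
                          mean_log_chi=0.0, rng=rng)
    impulses = excursion_impulses(chi, chi_q=3.0, dt=1e-3)
    critical_impulse = 0.05  # hypothetical threshold standing in for extinction difficulty
    n_extinct = sum(1 for value in impulses if value > critical_impulse)
    print(f"{len(impulses)} excursions, {n_extinct} exceed the assumed critical impulse")
```

Counting the excursions whose impulse exceeds the threshold, divided by the simulated time, gives a rough extinction rate that could be compared across different mean dissipation rates or thresholds.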
Abstract not provided.