Publications

Accelerating phase-field predictions via recurrent neural networks learning the microstructure evolution in latent space

Computer Methods in Applied Mechanics and Engineering

Hu, C.; Martin, Shawn; Dingreville, R.

The phase-field method is a popular modeling technique used to describe the dynamics of microstructures and their physical properties at the mesoscale. However, because in these simulations the microstructure is described by a system of continuous variables evolving both in space and time, phase-field models are computationally expensive. They require refined spatio-temporal discretization and a parallel computing approach to achieve a useful degree of accuracy. As an alternative, we present and discuss an accelerated phase-field approach which uses a recurrent neural network (RNN) to learn the microstructure evolution in latent space. We perform a comprehensive analysis of different dimensionality-reduction methods and types of recurrent units in RNNs. Specifically, we compare statistical functions combined with linear and nonlinear embedding techniques to represent the microstructure evolution in latent space. We also evaluate several RNN models that implement a gating mechanism, including the long short-term memory (LSTM) unit and the gated recurrent unit (GRU) as the microstructure-learning engine. We analyze the different combinations of these methods on the spinodal decomposition of a two-phase system. Our comparison reveals that describing the microstructure evolution in latent space using an autocorrelation-based principal component analysis (PCA) method is the most efficient. We find that the LSTM and GRU RNN implementations provide comparable accuracy with respect to the high-fidelity phase-field predictions, but with a considerable computational speedup relative to the full simulation. This study not only enhances our understanding of the performance of dimensionality reduction on the microstructure evolution, but it also provides insights on strategies for accelerating phase-field modeling via machine learning techniques.
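
A minimal sketch of the general approach described above, assuming synthetic data and illustrative hyperparameters: microstructure snapshots are reduced to a low-dimensional latent space with PCA (the paper's preferred embedding is an autocorrelation-based PCA), and an LSTM is trained to advance the latent coordinates in time. The array shapes, network sizes, and training loop are placeholders, not the authors' implementation.

    # Hypothetical sketch: PCA reduction of phase-field snapshots followed by an
    # LSTM that steps the latent coordinates forward in time. Data are random
    # placeholders; shapes and hyperparameters are illustrative only.
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.decomposition import PCA

    n_runs, n_steps, nx, ny = 10, 100, 64, 64
    fields = np.random.rand(n_runs, n_steps, nx, ny)          # phase-field snapshots

    # 1) Dimensionality reduction: flatten each snapshot and project onto 10 PCs
    pca = PCA(n_components=10)
    latent = pca.fit_transform(fields.reshape(-1, nx * ny))
    latent = latent.reshape(n_runs, n_steps, -1).astype(np.float32)

    # 2) Recurrent model: predict the next latent state from a short history
    class LatentLSTM(nn.Module):
        def __init__(self, dim=10, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, dim)

        def forward(self, x):                    # x: (batch, window, dim)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])         # next latent state

    window = 5
    X, Y = [], []
    for r in range(n_runs):
        for t in range(n_steps - window):
            X.append(latent[r, t:t + window])
            Y.append(latent[r, t + window])
    X, Y = torch.tensor(np.stack(X)), torch.tensor(np.stack(Y))

    model = LatentLSTM()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(50):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), Y)
        loss.backward()
        opt.step()

    # Rolling the trained LSTM forward in latent space and applying
    # pca.inverse_transform recovers approximate microstructure fields.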

Rapid Response Data Science for COVID-19

Bandlow, Alisa B.; Bauer, Travis L.; Crossno, Patricia J.; Garcia, Rudy J.; Astuto Gribble, Lisa A.; Hernandez, Patricia M.; Martin, Shawn; McClain, Jonathan T.; Patrizi, Laura P.

This report describes the results of a seven-day effort to help subject matter experts address a problem related to COVID-19. In the course of this effort, we analyzed the 29K documents provided as part of the White House's call to action. This involved applying a variety of natural language processing techniques and compression-based analytics, in combination with visualization techniques and assessment by subject matter experts, to pursue answers to a specific question. In this paper, we describe the algorithms, the software, the study performed, and the availability of the software developed during the effort.
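
As an illustration of the compression-based analytics family mentioned above, a common member is the normalized compression distance (NCD); the specific analytic used in the report may differ. A minimal sketch with invented example strings:

    # Normalized compression distance: documents that compress well together are
    # considered similar. Example strings are invented for illustration.
    import zlib

    def c_len(data: bytes) -> int:
        return len(zlib.compress(data, 9))

    def ncd(x: str, y: str) -> float:
        """Distance in [0, ~1]; smaller means the documents share more structure."""
        cx, cy = c_len(x.encode()), c_len(y.encode())
        cxy = c_len((x + " " + y).encode())
        return (cxy - min(cx, cy)) / max(cx, cy)

    docs = ["aerosol transmission of the virus",
            "transmission of the virus via aerosols",
            "battery electrolyte viscosity measurements"]
    print([[round(ncd(a, b), 3) for b in docs] for a in docs])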

VideoSwarm: Analyzing video ensembles

IS and T International Symposium on Electronic Imaging Science and Technology

Martin, Shawn; Sielicki, Milosz A.; Gittinger, Jaxon M.; Letter, Matthew L.; Hunt, Warren L.; Crossno, Patricia J.

We present VideoSwarm, a system for visualizing video ensembles generated by numerical simulations. VideoSwarm is a web application, where linked views of the ensemble each represent the data using a different level of abstraction. VideoSwarm uses multidimensional scaling to reveal relationships between a set of simulations relative to a single moment in time, and to show the evolution of video similarities over a span of time. VideoSwarm is a plug-in for Slycat, a web-based visualization framework which provides a web server, database, and Python infrastructure. The Slycat framework provides support for managing multiple users, maintains access control, and requires only a Slycat-supported commodity browser (such as Firefox, Chrome, or Safari).
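
A minimal sketch of the multidimensional scaling step, using random stand-in videos: at a chosen moment in time each simulation contributes one frame, and pairwise frame distances are embedded in two dimensions; repeating the embedding across time steps traces how the similarities evolve. The data, distance choice, and library calls here are illustrative, not VideoSwarm's implementation.

    # Hypothetical data: one video (sequence of frames) per simulation.
    import numpy as np
    from sklearn.manifold import MDS

    n_sims, n_frames, h, w = 8, 50, 32, 32
    videos = np.random.rand(n_sims, n_frames, h, w)

    t = 25                                       # a single moment in time
    frames = videos[:, t].reshape(n_sims, -1)
    dist = np.linalg.norm(frames[:, None] - frames[None, :], axis=-1)   # pairwise

    coords = MDS(n_components=2, dissimilarity="precomputed").fit_transform(dist)
    print(coords)                                # 2-D layout of the simulations at time t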

Slycat™ User Manual

Crossno, Patricia J.; Gittinger, Jaxon M.; Hunt, Warren L.; Letter, Matthew L.; Martin, Shawn; Sielicki, Milosz A.

Slycat™ is a web-based system for performing data analysis and visualization of potentially large quantities of remote, high-dimensional data. Slycat™ specializes in working with ensemble data. An ensemble is a group of related data sets, which typically consists of a set of simulation runs exploring the same problem space. An ensemble can be thought of as a set of samples within a multivariate domain, where each sample is a vector whose value defines a point in high-dimensional space. To understand and describe the underlying problem being modeled in the simulations, ensemble analysis looks for shared behaviors and common features across the group of runs. Additionally, ensemble analysis tries to quantify differences found in any members that deviate from the rest of the group.

The Slycat™ system integrates data management, scalable analysis, and visualization. Results are viewed remotely on a user's desktop via commodity web clients using a multi-tiered hierarchy of computation and data storage, as shown in Figure 1. Our goal is to operate on data as close to the source as possible, thereby reducing the time and storage costs associated with data movement. Consequently, we are working to develop parallel analysis capabilities that operate on High Performance Computing (HPC) platforms, to explore approaches for reducing data size, and to implement strategies for staging computation across the Slycat™ hierarchy.

Within Slycat™, data and visual analysis are organized around projects, which are shared by a project team. Project members are explicitly added, each with a designated set of permissions. Although users sign in to access Slycat™, individual accounts are not maintained; instead, authentication is used to determine project access. Within projects, Slycat™ models capture analysis results and enable data exploration through various visual representations. Although for scientists each simulation run is a model of real-world phenomena given certain conditions, we use the term model to refer to our modeling of the ensemble data, not the physics. Different model types often provide complementary perspectives on data features when analyzing the same data set. Each model visualizes data at several levels of abstraction, allowing the user to range from viewing the ensemble holistically to accessing numeric parameter values for a single run. Bookmarks provide a mechanism for sharing results, enabling interesting model states to be labeled and saved.

Screening for High Conductivity/Low Viscosity Ionic Liquids Using Product Descriptors

Molecular Informatics

Martin, Shawn; Pratt, Harry P.; Anderson, Travis M.

We seek to optimize ionic liquids (ILs) for application to redox flow batteries. As part of this effort, we have developed a computational method for suggesting ILs with high conductivity and low viscosity. Since ILs consist of cation-anion pairs, we consider a method for treating ILs as pairs using product descriptors for quantitative structure-property relationships (QSPRs), a concept borrowed from the prediction of protein-protein interactions in bioinformatics. We demonstrate the method by predicting electrical conductivity, viscosity, and melting point on a dataset taken from the ILThermo database on June 18th, 2014. The dataset consists of 4,329 measurements taken from 165 ILs made up of 72 cations and 34 anions. We benchmark our QSPRs on the known values in the dataset and then extend our predictions to screen all 2,448 possible cation-anion pairs in the dataset.
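
A minimal sketch of the product-descriptor idea under stated assumptions: each cation-anion pair is featurized by the products of the two ions' descriptor values (taken here as an outer product), and a regression model is fit to the measured property and then used to screen all combinations. The descriptors, the ridge regressor, and all numbers below are synthetic placeholders, not the paper's data or model.

    # Synthetic cation/anion descriptor tables (the paper uses real descriptors).
    import numpy as np
    from itertools import product
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    cation_desc = rng.random((72, 8))            # 72 cations x 8 descriptors
    anion_desc = rng.random((34, 8))             # 34 anions x 8 descriptors

    def pair_features(ci, ai):
        """Product descriptor: pairwise products of cation and anion descriptors."""
        return np.outer(cation_desc[ci], anion_desc[ai]).ravel()

    # Placeholder training set standing in for the 165 measured ILs
    train_pairs = [(i % 72, (3 * i) % 34) for i in range(165)]
    X = np.array([pair_features(c, a) for c, a in train_pairs])
    y = rng.random(len(train_pairs))             # e.g. log conductivity (synthetic)
    model = Ridge(alpha=1.0).fit(X, y)

    # Screen every possible cation-anion combination (72 x 34 = 2448 pairs)
    all_pairs = list(product(range(72), range(34)))
    scores = model.predict(np.array([pair_features(c, a) for c, a in all_pairs]))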

Towards a more robust understanding of the uncertainty of wind farm reliability

Probabilistic Prognostics and Health Management of Energy Systems

Westergaard, Carsten H.; Martin, Shawn; White, Jonathan; Carter, Charles M.; Karlson, Benjamin K.

Understanding wind farm reliability from various data sources is highly complex because the boundary conditions for the data are often undocumented and significantly impact the outcome of aggregation. Sandia National Laboratories has been investigating the reliability of wind farms through the Continuous Reliability Enhancement Wind (CREW) project since 2007, using Supervisory Control and Data Acquisition (SCADA) data from multiple wind farms across the US fleet. However, data streaming from sample wind farms does not by itself lead to better understanding, as it merely reflects the generic status of those samples. Economic-type benchmark studies are used in the industry, but these do not yield much technical understanding and give only a managerial cost perspective. Further, it is evident that there are many situations in which average benchmark data cannot be presented in a meaningful way due to discrete events, especially when the data is based on samples that are small relative to the probability of the events. Discrete events and insufficient descriptive tagging contribute significantly to the uncertainty of a fleet average and may even impair the way we communicate reliability. These aspects will be discussed. It is speculated that some aspects of reliability can be understood better through SCADA data-mining techniques that consider the real operating environment, as it will be shown that there is no particular reason that two identical wind turbines in the same wind farm should have identical reliability performance. The operation and the actual environmental impact at the turbine level are major parameters in determining the remaining useful life. Methods to normalize historical data for future predictions need to be developed, both for discrete events and for general operational conditions.

Visualizing Wind Farm Wakes Using SCADA Data

Martin, Shawn; Westergaard, Carsten W.; White, Jonathan; Karlson, Benjamin K.

As wind farms scale to include more and more turbines, questions about turbine wake interactions become increasingly important. Turbine wakes reduce wind speed, and downwind turbines suffer decreased performance. The cumulative effect of the wakes throughout a wind farm therefore decreases the performance of the entire farm. These interactions are dynamic and complicated, and it is difficult to quantify the overall effect of the wakes. This problem has attracted some attention in terms of computational modelling for siting turbines on new farms, but less attention in terms of empirical studies and performance validation of existing farms. In this report, Supervisory Control and Data Acquisition (SCADA) data from an existing wind farm is analyzed in order to explore methods for documenting wake interactions. Visualization techniques are proposed and used to analyze wakes in a 67-turbine farm. The visualizations are based on directional analysis using power measurements, and can be considered normalized capacity factors below rated power. Wind speed measurements are not used in the analysis except for data pre-processing. Four wake effects are observed: wake deficit, channel speed-up, and two potentially new effects, single and multiple shear-point speed-up. In addition, an attempt is made to quantify wake losses using the same SCADA data. Power losses for the specific wind farm investigated are relatively low, estimated to be in the range of 3-5%. Finally, a simple model based on the wind farm's geometrical layout is proposed. Key parameters for the model have been estimated by comparing wake profiles at different ranges and making some ad hoc assumptions. A preliminary comparison of six selected profiles shows excellent agreement with the model. Where discrepancies are observed, reasonable explanations can be found in multi-turbine speed-up effects and landscape features, which are yet to be modelled.
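
A minimal sketch of the directional analysis under stated assumptions: each turbine's power output is normalized by rated power and binned by wind direction, giving a normalized capacity factor per direction sector; dips in one turbine's curve relative to its neighbours suggest wake deficits. The SCADA columns, rated power, and data below are synthetic stand-ins, not the farm analyzed in the report.

    # Synthetic SCADA records for a hypothetical 67-turbine farm.
    import numpy as np
    import pandas as pd

    rated_kw = 1500.0
    n_samples = 10_000
    scada = pd.DataFrame({
        "turbine": np.random.randint(0, 67, n_samples),
        "direction": np.random.uniform(0, 360, n_samples),    # wind direction (deg)
        "power_kw": np.random.uniform(0, rated_kw, n_samples),
    })

    # Keep below-rated operation and normalize power to a capacity factor
    scada = scada[scada.power_kw < 0.95 * rated_kw].copy()
    scada["cf"] = scada.power_kw / rated_kw
    scada["sector"] = (scada.direction // 10).astype(int)     # 10-degree sectors

    # Mean normalized capacity factor per turbine and direction sector
    wake_profile = scada.pivot_table(index="sector", columns="turbine",
                                     values="cf", aggfunc="mean")
    print(wake_profile.iloc[:5, :3])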

Interactive visualization of multivariate time series data

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Martin, Shawn; Quach, Tu T.

Organizing multivariate time series data for presentation to an analyst is a challenging task. Typically, a dataset contains hundreds or thousands of datapoints, and each datapoint consists of dozens of time series measurements. Analysts are interested in how the datapoints are related, which measurements drive trends and/or produce clusters, and how the clusters are related to available metadata. In addition, interest in particular time series measurements will change depending on what the analyst is trying to understand about the dataset. Rather than providing a monolithic, single-use machine learning solution, we have developed a system that encourages analyst interaction. This system, Dial-A-Cluster (DAC), uses multidimensional scaling to provide a visualization of the datapoints based on distance measures provided for each time series. The analyst can interactively adjust (dial) the relative influence of each time series to change the visualization (and the resulting clusters). Additional computations are provided which optimize the visualization according to metadata of interest and rank time series measurements according to their influence on analyst-selected clusters. The DAC system is a plug-in for Slycat (slycat.readthedocs.org), a framework which provides a web server, database, and Python infrastructure. The DAC web application allows an analyst to keep track of multiple datasets and interact with each as described above. It requires no installation, runs on any platform, and enables analyst collaboration. We anticipate an open-source release in the near future.
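
A minimal sketch of the dialing computation under stated assumptions: one pairwise distance matrix is computed per time-series measurement, the matrices are combined using the analyst's weights (assumed here to be a weighted sum), and the result is re-embedded with multidimensional scaling whenever a dial moves. The data and weights are synthetic placeholders, not DAC's implementation.

    # Synthetic dataset: each datapoint carries several time-series measurements.
    import numpy as np
    from sklearn.manifold import MDS

    n_points, n_series, n_time = 50, 4, 200
    data = np.random.rand(n_points, n_series, n_time)

    # One pairwise distance matrix per time-series measurement
    D = np.array([
        np.linalg.norm(data[:, k, None, :] - data[None, :, k, :], axis=-1)
        for k in range(n_series)
    ])                                                    # (n_series, n_points, n_points)

    def layout(weights):
        """Re-embed the datapoints for a given setting of the dials."""
        combined = np.tensordot(weights, D, axes=1)       # weighted sum of distances
        return MDS(n_components=2, dissimilarity="precomputed").fit_transform(combined)

    coords = layout(np.array([0.5, 0.2, 0.2, 0.1]))       # analyst-chosen weights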
