Sandia Information Sciences Initiative
Sandia has identified autonomy as a strategic initiative and an important area for providing national leadership. A key question is, “How might autonomy change how we think about the national security challenges we address and the kinds of solutions we deliver?” Three workshops at Sandia early in 2017 brought together internal stakeholders and potential academic partners in autonomy to address this question. The first focused on programmatic applications and needs. The second explored existing internal capabilities and research and development needs. This report summarizes the outcome of the third workshop, held March 3, 2017, in Albuquerque, NM, which engaged Academic Alliance partners in Sandia’s autonomy efforts by discussing research needs and synergistic areas of interest within the complex systems and system modeling domains, and by identifying opportunities for partnering on laboratory-directed and other joint research.
This report contains the written footprint of a Sandia-hosted workshop held in Albuquerque, New Mexico, June 22-23, 2016, on “Complex Systems Models and Their Applications: Towards a New Science of Verification, Validation and Uncertainty Quantification,” as well as of the pre-work that fed into the workshop. The workshop’s intent was to explore and begin articulating research opportunities at the intersection of two important Sandia communities: the complex systems (CS) modeling community and the verification, validation and uncertainty quantification (VVUQ) community. The overarching research opportunity (and challenge) that we ultimately hope to address is: how can we quantify the credibility of knowledge gained from complex systems models, knowledge that is often incomplete and interim but will nonetheless be used, sometimes in real time, by decision makers?
We report on the use of supercomputer simulation to study performance sensitivity to systematic changes in the job parameters of run time, number of CPUs, and interarrival time. We also examine the effect of changes in share allocation and service ratio for job prioritization under a Fair Share queuing algorithm on the facility's figures of merit. We used log data from the ASCI supercomputer Blue Mountain and the ASCI simulator BIRMinator to perform this study. The key finding is that the performance of the supercomputer is quite sensitive to all of the job parameters, with job interarrival rate being the most sensitive parameter (particularly at the highest arrival rates) and increased run time the least sensitive with respect to utilization and rapid turnaround. We also find that this facility is running near its maximum practical utilization. Finally, we show the importance of simulation in understanding the performance sensitivity of a supercomputer.
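As a rough illustration of the kind of sensitivity sweep described above, the sketch below simulates a space-shared machine under a simple first-come-first-served policy and varies only the mean job interarrival time, reporting utilization and mean wait. It is a minimal sketch under assumed parameters: the FCFS policy, the exponential workload model, and all numbers are illustrative stand-ins, not the Fair Share algorithm, the BIRMinator simulator, or the Blue Mountain workload.

# Toy sensitivity sweep over job interarrival time for a space-shared machine.
# Illustrative only: the FCFS policy and all parameters are assumptions, not the
# Fair Share algorithm or BIRMinator model used in the study.
import random

def simulate(total_cpus, n_jobs, mean_interarrival, mean_runtime, mean_cpus, seed=0):
    rng = random.Random(seed)
    # Synthetic workload with exponential interarrival times, run times, and sizes.
    jobs, t = [], 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(1.0 / mean_interarrival)
        runtime = rng.expovariate(1.0 / mean_runtime)
        cpus = min(total_cpus, max(1, int(rng.expovariate(1.0 / mean_cpus))))
        jobs.append((t, runtime, cpus))
    # Simple FCFS space-sharing: a job starts once enough CPUs are free.
    running, free = [], total_cpus          # running holds (end_time, cpus)
    clock, busy_cpu_time, waits = 0.0, 0.0, []
    for arrival, runtime, cpus in jobs:
        clock = max(clock, arrival)
        while free < cpus:                  # release finished jobs until this one fits
            running.sort()
            end, c = running.pop(0)
            clock = max(clock, end)
            free += c
        waits.append(clock - arrival)
        running.append((clock + runtime, cpus))
        free -= cpus
        busy_cpu_time += runtime * cpus
    makespan = max(end for end, _ in running)
    return busy_cpu_time / (makespan * total_cpus), sum(waits) / len(waits)

for interarrival in (2.0, 4.0, 8.0, 16.0):
    util, wait = simulate(total_cpus=128, n_jobs=2000, mean_interarrival=interarrival,
                          mean_runtime=60.0, mean_cpus=16)
    print(f"interarrival={interarrival:5.1f}  utilization={util:.2f}  mean wait={wait:.1f}")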
Proceedings - IEEE International Conference on Cluster Computing, ICCC
This paper presents an analysis of utilizing unused cycles on supercomputers through the use of many small jobs. What we call "interstitial computing" is important to supercomputer centers for both productivity and political reasons. Interstitial computing exploits the fact that small jobs are more or less fungible consumers of compute cycles and are more efficient for bin packing than the typical jobs on a supercomputer. An important feature of interstitial computing is that it does not have a significant impact on the makespan of native jobs on the machine. Also, a facility can obtain higher utilizations that might otherwise be possible only with more complicated schemes or with very long wait times. The key contribution of this paper is that it provides theoretical and empirical guidelines for users and administrators on how currently unused supercomputer cycles may be exploited. We find that interstitial computing is a more effective means of increasing machine utilization than increasing native job run times or sizes.
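To make the bin-packing intuition concrete, here is a minimal sketch of back-filling idle CPUs with small, fixed-size jobs without displacing the native workload. The function name, the free-CPU profile, and the small-job dimensions are hypothetical illustrations, not the scheduling model or data used in the paper.

# Minimal sketch of the "interstitial computing" idea: greedily back-fill idle
# CPUs with small, short jobs so native jobs are never delayed. The free-CPU
# profile and job sizes below are made-up illustrations, not data from the paper.

def pack_interstitial(free_cpus_per_slot, small_job_cpus, small_job_slots):
    """Count how many small jobs fit into the idle capacity.

    free_cpus_per_slot : idle CPUs available in each scheduling time slot
    small_job_cpus     : CPUs one small job needs
    small_job_slots    : consecutive time slots one small job needs
    """
    free = list(free_cpus_per_slot)
    placed = 0
    for start in range(len(free) - small_job_slots + 1):
        # Keep placing jobs at this start time while every slot in the window
        # still has enough idle CPUs; native allocations are untouched.
        while all(free[t] >= small_job_cpus
                  for t in range(start, start + small_job_slots)):
            for t in range(start, start + small_job_slots):
                free[t] -= small_job_cpus
            placed += 1
    return placed

# Idle CPUs left over after the native schedule in ten one-hour slots (illustrative).
idle = [12, 40, 8, 0, 25, 60, 33, 5, 18, 50]
print(pack_interstitial(idle, small_job_cpus=4, small_job_slots=2))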
Proceedings - CCGrid 2003: 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid
Proceedings of the IEEE International Symposium on High Performance Distributed Computing
This paper characterizes "queue storms" in supercomputer systems and discusses methods for quelling them. Queue storms are anomalously large queue lengths that depend upon the job size mix, the queuing system, the machine size, and correlations and dependencies between job submissions. We use synthetic data generated from actual job log data from the ASCI Blue Mountain supercomputer, combined with different long-range dependencies. We show the distribution of times until the first storm occurs, which is in a sense the time when the machine becomes obsolete, because it is the first time the machine fails to provide satisfactory turnaround. To overcome queue storms, more resources are needed, even if they appear superfluous most of the time. We present two methods, including a grid-based solution, for reducing these correlations and their resulting effect on the size and frequency of queue storms.
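The sketch below illustrates, in a much simplified setting, why correlated (bursty) submissions inflate peak backlog relative to independent submissions at the same average rate. The single-server Lindley recursion and the two arrival models are assumptions made for illustration; they are not the workload models or machine model from the paper.

# Compare peak waiting time for independent vs. bursty job submissions at the
# same average rate. Illustrative single-server model; all constants are assumed.
import random

def peak_wait(interarrivals, mean_service, rng):
    """Lindley recursion for a FIFO single-server queue; returns the peak wait."""
    w, peak, prev_service = 0.0, 0.0, 0.0
    for gap in interarrivals:
        # Wait of this job = previous wait + previous service - interarrival gap, floored at 0.
        w = max(0.0, w + prev_service - gap)
        peak = max(peak, w)
        prev_service = rng.expovariate(1.0 / mean_service)
    return peak

rng = random.Random(1)
n, mean_gap = 20000, 10.0

# Independent (Poisson-like) submissions with mean interarrival time mean_gap.
poisson_gaps = [rng.expovariate(1.0 / mean_gap) for _ in range(n)]

# Bursty submissions: long quiet stretches punctuated by rapid-fire bursts,
# chosen so the overall mean interarrival time is still mean_gap.
bursty_gaps = []
while len(bursty_gaps) < n:
    bursty_gaps.append(rng.expovariate(1.0 / 82.0))              # long quiet gap
    bursty_gaps.extend(rng.expovariate(1.0) for _ in range(8))   # burst of 8 quick jobs

print("peak wait, independent arrivals:", round(peak_wait(poisson_gaps, 8.0, rng), 1))
print("peak wait, bursty arrivals:     ", round(peak_wait(bursty_gaps[:n], 8.0, rng), 1))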
Proceedings of the Hawaii International Conference on System Sciences
In manufacturing, the conceptual design and detailed design stages are typically regarded as sequential and distinct. Decisions in conceptual design are often made with little information about how they will affect detailed design or manufacturing process specification. Many possibilities and unknowns exist in conceptual design, where ideas about product shape and functionality change rapidly. Few, if any, tools exist to aid in this difficult, amorphous stage, in contrast to the many CAD and analysis tools for detailed design, where much more is known about the final product. The Materials Process Design Environment (MPDE) is a collaborative problem solving environment (CPSE) developed so that geographically dispersed designers in both the conceptual and detailed stages can work together and understand the impacts of their design decisions on functionality, cost, and manufacturability.