If we are to build a supercomputer with a speed of 10{sup 15} floating operations per second (1 PetaFLOPS), interconnect technology will need to be improved considerably over what it is today. In this report, we explore one possible interconnect design for such a network. The guiding principle in this design is the optimization of all components for the finiteness of the speed of light. Achieving a linear speedup in time over well-tested supercomputers of today's designs will require scaling up processor power and bandwidth and scaling down latency. Latency scaling is the most challenging: it requires a 100 ns user-to-user latency for messages traveling the full diameter of the machine. Meeting this constraint requires simultaneously minimizing wire length through 3D packaging, new low-latency electrical signaling mechanisms, extremely fast routers, and new network interfaces. In this report, we outline approaches and implementations that will meet the requirements when implemented as a system. No technology breakthroughs are required.
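To make the speed-of-light constraint concrete, the following minimal sketch computes the longest signal path that fits inside a latency budget. The function name, the velocity factor, and the overhead figure are illustrative assumptions and are not taken from the report; only the 100 ns budget comes from the text.

```python
# Back-of-the-envelope light-travel budget for an end-to-end latency target.
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact SI value)


def max_path_length_m(latency_s, velocity_factor=1.0, overhead_s=0.0):
    """Longest signal path (in meters) that fits in the latency budget.

    velocity_factor: assumed fraction of c at which the signal propagates
                     (hypothetical parameter; 1.0 means vacuum speed).
    overhead_s:      assumed time spent in routers and network interfaces
                     (hypothetical parameter).
    """
    flight_time_s = latency_s - overhead_s
    if flight_time_s <= 0.0:
        return 0.0  # the overhead alone already exceeds the budget
    return C_VACUUM_M_PER_S * velocity_factor * flight_time_s


# Upper bound: the whole 100 ns budget spent in flight at vacuum speed.
ideal_m = max_path_length_m(100e-9)

# Illustrative case: half the budget lost to overhead, signal at 0.7 c.
realistic_m = max_path_length_m(100e-9, velocity_factor=0.7, overhead_s=50e-9)
```

Even in the ideal case the path is limited to roughly 30 m, and any assumed router or interface overhead shortens it further, which is why wire length, signaling, routers, and interfaces have to be minimized together.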
The maximum contact map overlap (MAX-CMO) between a pair of protein structures can be used as a measure of protein similarity. It is a purely topological measure and does not depend on the sequence of the pairs involved in the comparison. More importantly, MAX-CMO presents a very favorable mathematical structure which allows the formulation of integer, linear and Lagrangian models that can be used to obtain guarantees of optimality. It is not the intention of this paper to discuss the mathematical properties of MAX-CMO in detail, as this has been dealt with elsewhere. In this paper we compare three algorithms that can be used to obtain maximum contact map overlaps between protein structures. We will point to the weaknesses and strengths of each one. It is our hope that this paper will encourage researchers to develop new and improved methods for protein comparison based on MAX-CMO.
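As a rough illustration of the quantity being maximized, the sketch below counts the contacts shared by two contact maps under a given alignment and then finds the maximum by exhaustive search. It assumes the usual formulation in which an alignment is an order-preserving one-to-one mapping between residue indices; the function names are hypothetical, and the brute-force search is only feasible for toy inputs. It is not one of the three algorithms compared in the paper.

```python
from itertools import combinations


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that map onto contacts of B.

    contacts_a, contacts_b: sets of residue index pairs (i, j).
    alignment: dict mapping residue indices of A to residue indices of B.
    """
    normalized_b = {tuple(sorted(pair)) for pair in contacts_b}
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            image = tuple(sorted((alignment[i], alignment[j])))
            if image in normalized_b:
                shared += 1
    return shared


def max_cmo_brute_force(n_a, contacts_a, n_b, contacts_b):
    """Exhaustive MAX-CMO over all order-preserving alignments (toy sizes only)."""
    best = 0
    for k in range(1, min(n_a, n_b) + 1):
        for residues_a in combinations(range(n_a), k):
            for residues_b in combinations(range(n_b), k):
                # Pairing two sorted index tuples keeps the mapping order-preserving.
                alignment = dict(zip(residues_a, residues_b))
                best = max(best, overlap(contacts_a, contacts_b, alignment))
    return best


# Toy example: a 4-residue map and a 5-residue map with two contacts each.
map_a = {(0, 2), (1, 3)}
map_b = {(0, 3), (2, 4)}
best_overlap = max_cmo_brute_force(4, map_a, 5, map_b)
```

The exhaustive search grows combinatorially with structure size, which is the practical reason exact integer and Lagrangian formulations are of interest.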
We consider the convergence properties of a non-elitist self-adaptive evolutionary strategy (ES) on multi-dimensional problems. In particular, we apply our recent convergence theory for a discretized (1,{lambda})-ES to design a related (1,{lambda})-ES that converges on a class of separable, unimodal multi-dimensional problems. The distinguishing feature of self-adaptive evolutionary algorithms (EAs) is that the control parameters (like mutation step lengths) are evolved by the evolutionary algorithm. Thus the control parameters are adapted in an implicit manner that relies on the evolutionary dynamics to ensure that more effective control parameters are propagated during the search. Self-adaptation is a central feature of EAs like evolutionary strategies (ES) and evolutionary programming (EP), which are applied to continuous design spaces. Rudolph summarizes theoretical results concerning self-adaptive EAs and notes that the theoretical underpinnings for these methods are essentially unexplored. In particular, convergence theories that ensure convergence to a limit point on continuous spaces have only been developed by Rudolph, Hart, DeLaurentis and Ferguson, and Auger et al. In this paper, we illustrate how our analysis of a (1,{lambda})-ES for one-dimensional unimodal functions can be used to ensure convergence of a related ES on multidimensional functions. This (1,{lambda})-ES randomly selects a search dimension in each iteration, along which points are generated. For a general class of separable functions, our analysis shows that the ES searches along each dimension independently, and thus this ES converges to the (global) minimum.
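The coordinate-wise search described above can be sketched as follows. This is only an illustration of the general idea: it uses the common log-normal step-length mutation rather than the discretized scheme analyzed in the paper, and the function name, parameter names, and default values are all assumptions.

```python
import math
import random


def coordinate_es(f, x0, sigma0=1.0, lam=10, iterations=600, tau=0.5, seed=0):
    """Non-elitist (1,lambda)-ES that mutates one randomly chosen coordinate
    per iteration, with a self-adapted step length for each coordinate."""
    rng = random.Random(seed)
    x = list(x0)
    sigma = [sigma0] * len(x)
    for _ in range(iterations):
        d = rng.randrange(len(x))  # search dimension for this iteration
        offspring = []
        for _ in range(lam):
            # Self-adaptation: mutate the step length first, then the point.
            step = sigma[d] * math.exp(tau * rng.gauss(0.0, 1.0))
            child = list(x)
            child[d] = x[d] + step * rng.gauss(0.0, 1.0)
            offspring.append((f(child), step, child))
        # Comma selection: the parent is discarded and the best offspring
        # survives together with the step length that generated it.
        _, sigma[d], x = min(offspring, key=lambda item: item[0])
    return x


def sphere(x):
    """Separable, unimodal test function with its minimum at the origin."""
    return sum(v * v for v in x)


start = [5.0, 5.0, 5.0]
result = coordinate_es(sphere, start)
```

Because the objective is separable, each one-dimensional step can only change the term belonging to the selected coordinate, which is what allows a one-dimensional analysis to carry over to the multi-dimensional case.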
Broadcasting messages through the earth is a daunting task. Indeed, broadcasting a normal telephone conversation through the earth by wireless means is impossible with today's technology. Most of us don't care, but some do. Industries that drill into the earth need wireless communication to broadcast navigation parameters. This allows them to steer their drill bits. They also need information about the natural formation that they are drilling. Measurements of parameters such as pressure, temperature, and gamma radiation levels can tell them if they have found a valuable resource such as a geothermal reservoir or a stratum bearing natural gas. Wireless communication methods are available to the drilling industry. Information is broadcast via either pressure waves in the drilling fluid or electromagnetic waves in the earth and well tubing. Data transmission can only travel one way at rates around a few baud. Given that normal Internet telephone modems operate near 20,000 baud, these data rates are truly very slow. Moreover, communication is often interrupted or permanently blocked by drilling conditions or natural formation properties. Here we describe a tool that communicates with stress waves traveling through the steel drill pipe and production tubing in the well. It's based on an old idea called Acoustic Telemetry. But what we present here is more than an idea. This tool exists, it's drilled several wells, and it works. Currently, it's the first and only acoustic telemetry tool that can withstand the drilling environment. It broadcasts one way over a limited range at much faster rates than existing methods, but we also know how to build a system that can communicate both up and down wells of indefinite length.
This report presents a perspective on the role of code comparison activities in verification and validation. We formally define the act of code comparison as the Code Comparison Principle (CCP) and investigate its application in both verification and validation. One of our primary conclusions is that the use of code comparisons for validation is improper and dangerous. We also conclude that while code comparisons may be argued to provide a beneficial component in code verification activities, there are higher quality code verification tasks that should take precedence. Finally, we provide a process for application of the CCP that we believe is minimal for achieving benefit in verification processes.
We seek to understand which supercomputer architecture will be best for supercomputers at the Petaflops scale and beyond. The process we use is to predict the cost and performance of several leading architectures at various years in the future. The basis for predicting the future is an expanded version of Moore's Law called the International Technology Roadmap for Semiconductors (ITRS). We abstract leading supercomputer architectures into chips connected by wires, where the chips and wires have electrical parameters predicted by the ITRS. We then compute the cost of a supercomputer system and the run time on a key problem of interest to the DOE (radiation transport). These calculations are parameterized by the time into the future and the technology expected to be available at that point. We find the new advanced architectures have substantial performance advantages but conventional designs are likely to be less expensive (due to economies of scale). We do not find a universal "winner", but instead the right architectural choice is likely to involve non-technical factors such as the availability of capital and how long people are willing to wait for results.
The Trilinos Project is an effort to facilitate the design, development, integration and ongoing support of mathematical software libraries. In particular, our goal is to develop parallel solver algorithms and libraries within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific applications. Our emphasis is on developing robust, scalable algorithms in a software framework, using abstract interfaces for flexible interoperability of components while providing a full-featured set of concrete classes that implement all abstract interfaces. Trilinos uses a two-level software structure designed around collections of packages. A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners, nonlinear solvers, etc. Packages exist underneath the Trilinos top level, which provides a common look-and-feel, including configuration, documentation, licensing, and bug-tracking. Trilinos packages are primarily written in C++, but provide some C and Fortran user interface support. We provide an open architecture that allows easy integration with other solver packages and we deliver our software to the outside community via the GNU Lesser General Public License (LGPL). This report provides an overview of Trilinos, discussing the objectives, history, current development and future plans of the project.
The Trilinos Project is an effort to facilitate the design, development, integration and ongoing support of mathematical software libraries. A new software capability is introduced into Trilinos as a package. A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners, nonlinear solvers, etc. The Trilinos Users Guide is a resource for new and existing Trilinos users. Topics covered include how to configure and build Trilinos, what is required to integrate an existing package into Trilinos and examples of how those requirements can be met, as well as what tools and services are available to Trilinos packages. Also discussed are some common practices that are followed by many Trilinos package developers. Finally, a snapshot of current Trilinos packages and their interoperability status is provided, along with a list of supported computer platforms.
Protein microtubules (MTs) 25 nm in diameter and tens of micrometers long have been used as templates for the biomimetic mineralization of FeOOH. Exposure of MTs to anaerobic aqueous solutions of Fe{sup 2+} buffered to neutral pH followed by aerial oxidation leads to the formation of iron oxide coated MTs. The iron oxide layer was found to grow via a two-step process: initially formed 10-30 nm thick coatings were found to be amorphous in structure and comprised of several iron-containing species. Further growth resulted in MTs coated with highly crystalline layers of lepidocrocite with a controllable thickness of up to 125 nm. On the micrometer size scale, these coated MTs were observed to form large, irregular bundles containing hundreds of individually coated MTs. Iron oxide grew selectively on the MT surface, a result of the highly charged MT surface that provided an interface favorable for iron oxide nucleation. This result illustrates that MTs can be used as scaffolds for the in-situ production of high-aspect-ratio inorganic nanowires.
Currently, the Egyptian Atomic Energy Authority is designing a shallow-land disposal facility for low-level radioactive waste. To ensure containment and prevent migration of radionuclides from the site, the use of a reactive backfill material is being considered. One material under consideration is hydroxyapatite, Ca{sub 10}(PO{sub 4}){sub 6}(OH){sub 2}, which has a high affinity for the sorption of many radionuclides. Hydroxyapatite has many properties that make it an ideal material for use as a backfill including low water solubility (K{sub sp} > 10{sup -40}), high stability under reducing and oxidizing conditions over a wide temperature range, availability, and low cost. However, there is often considerable variation in the properties of apatites depending on source and method of preparation. In this work, we characterized and compared a synthetic hydroxyapatite with hydroxyapatites prepared from cattle bone calcined at 500 C, 700 C, 900 C and 1100 C. The analysis indicated the synthetic hydroxyapatite was similar in morphology to 500 C prepared cattle hydroxyapatite. With increasing calcination temperature the crystallinity and crystal size of the hydroxyapatites increased and the BET surface area and carbonate concentration decreased. Batch sorption experiments were performed to determine the effectiveness of each material to sorb uranium. Sorption of U was strong regardless of apatite type, indicating that all of the apatite materials evaluated sorb uranium effectively. Sixty day desorption experiments indicated desorption of uranium for each hydroxyapatite was negligible.
The purpose of this study was to investigate the impact of instructions on aircraft visual inspection performance and strategy. Forty-two inspectors from industry were asked to perform inspections of six areas of a Boeing 737. Six different instruction versions were developed for each inspection task, varying in the number and type of directed inspections. The amount of time spent inspecting, the number of calls made, and the number of the feedback calls detected all varied widely across the inspectors. However, inspectors who used instructions with a higher number of directed inspections referred to the instructions more often during and after the task, and found a higher percentage of a selected set of feedback cracks than inspectors using other instruction versions. This suggests that specific instructions can help overall inspection performance, not just performance on the defects specified. Further, instructions were shown to change the way an inspector approaches a task.
Memory may be the only system component that is more commoditized than a microprocessor. To simultaneously exploit this and address the impending memory wall, processing in memory (PIM) research efforts are considering ways to move processing into memory without significantly increasing the cost of the memory. As such, PIM devices may become the basis for future commodity clusters. Although these PIM devices may leverage new computational paradigms such as hardware support for multi-threading and traveling threads, they must provide support for legacy programming models if they are to supplant commodity clusters. This paper presents a prototype implementation of MPI over a traveling thread mechanism called parcels. A performance analysis indicates that the direct hardware support of a traveling thread model can lead to an efficient, lightweight MPI implementation.
Mechanisms for enhanced low-dose-rate sensitivity are described. In these mechanisms, bimolecular reactions dominate the kinetics at high dose rates thereby causing a sub-linear dependence on total dose, and this leads to a dose-rate dependence. These bimolecular mechanisms include electron-hole recombination, hydrogen recapture at hydrogen source sites, and hydrogen dimerization to form hydrogen molecules. The essence of each of these mechanisms is the dominance of the bimolecular reactions over the radiolysis reaction at high dose rates. However, at low dose rates, the radiolysis reaction dominates leading to a maximum effect of the radiation.
Properties of relevance for the equation of state for a high-density glass are discussed. We review the effects of failure waves, comminuted phase, and compaction on the validity of the Mie-Grueneisen EOS. The specific heat and the Grueneisen parameter at standard conditions for a {rho}{sub 0} = 5.085 g/cm{sup 3} glass ('Glass A') are then estimated to be 522 mJ/g/K and 0.1-0.3, respectively. The latter value is substantially smaller than the value of 2.1751 given in the SESAME tables for a high-density glass with {rho}{sub 0} = 5.46 g/cm{sup 3}. The present unusual value of the Grueneisen parameter is confirmed from the volume dependence determined from fitting the Mie-Grueneisen EOS to shock data in Ref. [2].
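To show why the size of the Grueneisen parameter matters, the sketch below evaluates the standard Mie-Grueneisen relation P = P_ref + Gamma * rho * (E - E_ref) for heating at constant volume. The 100 K temperature rise, the constant specific heat over that interval, and the use of the SESAME value at the Glass A density are illustrative assumptions, not results from the report; the function name is hypothetical.

```python
def mie_gruneisen_pressure(p_ref, gamma, rho, e, e_ref):
    """Mie-Grueneisen EOS in SI units.

    p_ref: reference pressure at this density (Pa)
    gamma: Grueneisen parameter (dimensionless)
    rho:   density (kg/m^3)
    e, e_ref: specific internal energy and its reference value (J/kg)
    """
    return p_ref + gamma * rho * (e - e_ref)


RHO_0 = 5085.0         # kg/m^3, i.e. 5.085 g/cm^3 from the text
SPECIFIC_HEAT = 522.0  # J/(kg K), i.e. 522 mJ/g/K from the text
DELTA_T = 100.0        # K, assumed temperature rise for illustration

# Energy added at constant volume, assuming a constant specific heat.
delta_e = SPECIFIC_HEAT * DELTA_T

# Thermal pressure for a mid-range estimate (0.2) and for the SESAME value.
p_low_gamma = mie_gruneisen_pressure(0.0, 0.2, RHO_0, delta_e, 0.0)
p_sesame_gamma = mie_gruneisen_pressure(0.0, 2.1751, RHO_0, delta_e, 0.0)
```

Under these assumptions the thermal pressure is about 53 MPa for a Grueneisen parameter of 0.2 and about 577 MPa for 2.1751, so the choice of parameter changes the predicted thermal response by roughly an order of magnitude.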
Submovements are hypothesized building blocks of human movement, discrete ballistic movements of which more complex movements are composed. Using a novel algorithm, submovements were extracted from the point-to-point movements of 41 persons recovering from stroke. Analysis of the extracted submovements showed that, over the course of therapy, patients' submovements tended to increase in peak speed and duration. The number of submovements employed to produce a given movement decreased. The time between the peaks of adjacent submovements decreased for inpatients (those less than 1 month post-stroke), but not for outpatients (those greater than 12 months post-stroke) as a group. Submovements became more overlapped for all patients, but more markedly for inpatients. The strength and consistency with which it quantified patients' recovery indicates that analysis of submovement overlap might be a useful tool for measuring learning or other changes in motor behavior in future human movement studies.
Proposed for publication in Coordinated & Multiple Views in Exploratory Visualization, Special Issue of Information Visualization Journal, Vol 2 No. 4, Palgrave/Macmillan.
We present a series of electronic structure calculations that demonstrate a mechanism for spontaneous ionization of hydrogen at the Si-SiO{sub 2} interface. Specifically, we show that an isolated neutral hydrogen atom will spontaneously give up its charge and bond to a threefold coordinated oxygen atom. We refer to this entity as a proton. We have calculated the potential surface and found it to be entirely attractive. In contrast, hydrogen molecules will not undergo an analogous reaction. We relate these calculations both to proton generation experiments and to hydrogen plasma experiments.
Density functional theory is used to predict workfunctions, {psi}. For relaxed clean W(1 0 0), the local density approximation (LDA) agrees with experiment better than the newer generalized gradient approximation, probably due to the surface electron self-energy. The large Ba metallic radius indicates it covers W(1 0 0) at about 0.5 monolayer (ML). However, Ba{sup 2+}, O{sup 2-}, and metallic W all have similar radii. Thus 1 ML of BaO (one BaO unit for each two W atoms) produces minimum strain, indicating commensurate interfaces. BaO (1 ML) and Ba (1/2 ML) have the same {psi} to within 0.02 V, so at these coverages reduction or oxidation is not important. Due to greater chemical activity of ScO vs. highly ionic BaO, when mixing the latter with this suboxide of scandia, the overlayer always has BaO as the top layer and ScO as the second layer. The BaO/ScO bilayer has a rocksalt structure, suggesting high stability. In the series BaO/ScO/, BaO/YO/, and BaO/LaO/W(1 0 0), the latter has a remarkably low {psi} of 1.3 V (LDA), but 2 ML of rocksalt BaO also has {psi} at 1.3 V. We suggest BaO (1 ML) does not exist and that it is worthwhile to attempt the direct synthesis and study of BaO (2 ML) and BaO/LaO.