Sandia News

Radiation-aware Xyce simulations of memory circuits


Integrated circuit simulation for design

The development of reliable, radiation-hardened microelectronics is an important aspect of Sandia’s core mission. For many decades, Sandia has had a prominent role in advancing the state of the art in microsystems R&D, radiation effects, reliability physics and failure analysis. The process of radiation-aware integrated circuit design involves many steps, including architectural design, logic design, physical design, physical verification and final sign-off. Each of these steps involves different types of computational tools, which together comprise a “tool flow.” Such tool flows are standard practice in the microelectronics community and form the basis for the electronic design automation industry.

Figure 1. Runtime speedup of the block subdomain preconditioned method (overlap = 1, Intel MKL Pardiso, 4 threads) compared to the direct method (Intel MKL Pardiso, 16 threads) in simulating the SRAM circuit on CTS-2. The number of MPI processors is varied over 16, 32, 64 and 128, while the number of CTS-2 nodes is varied over 1, 2, 4, 8 and 16.

One type of computational analysis is circuit simulation, which uses a detailed, transistor-level description of the circuit to generate a system of network-coupled differential algebraic equations. Transistor-level simulation was originally made popular by the Berkeley SPICE program, but it becomes impractical for large-scale circuits because of its reliance on sparse direct linear solvers. As a result, while SPICE-style simulation is standard practice for analog circuit design, it is less commonly used in the design of larger digital circuits. Digital designs are subject to constraints not present in analog designs, and these constraints can often be exploited to expedite computational analysis. Hence, the traditional tool flow for digital integrated circuit (IC) design applies transistor-level simulation codes only in a very limited manner: to characterize standard cells. Analysis of entire ICs, which may comprise billions of transistors, is subsequently performed using specialized, lower-fidelity tools. These specialized tools include applications such as timing simulators and use standard cell models as input.
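
As a minimal sketch of what such a system of equations looks like, the Python example below builds the differential algebraic equations for a hypothetical RC circuit (a voltage source, a resistor and a capacitor) and integrates them in time with backward Euler. The circuit, the component values, the function names and the small dense solver are all illustrative assumptions; they are not taken from SPICE or Xyce, which rely on sparse solvers and far richer device models.

```python
import math


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def simulate_rc(vs=1.0, r=1e3, c=1e-6, t_end=5e-3, steps=500):
    """Integrate E x' + G x = b with backward Euler.

    Unknowns: x = [v1, v2, i_src], where v1 is the source node, v2 is the
    capacitor node and i_src is the current leaving node 1 into the source.
    """
    h = t_end / steps
    g = 1.0 / r
    G = [[g, -g, 1.0],     # KCL at node 1
         [-g, g, 0.0],     # KCL at node 2 (resistive part)
         [1.0, 0.0, 0.0]]  # source constraint: v1 = vs (purely algebraic)
    E = [[0.0, 0.0, 0.0],
         [0.0, c, 0.0],    # only the capacitor contributes a time derivative
         [0.0, 0.0, 0.0]]
    b = [0.0, 0.0, vs]
    # Backward Euler: (E/h + G) x_new = b + (E/h) x_old
    A = [[E[i][j] / h + G[i][j] for j in range(3)] for i in range(3)]
    x = [0.0, 0.0, 0.0]  # capacitor initially discharged
    for _ in range(steps):
        rhs = [b[i] + sum(E[i][j] * x[j] for j in range(3)) / h
               for i in range(3)]
        x = solve_dense(A, rhs)
    return x


if __name__ == "__main__":
    v1, v2, i_src = simulate_rc()
    exact = 1.0 - math.exp(-5.0)  # analytic capacitor voltage after 5 RC
    print(f"v2 = {v2:.5f} (analytic {exact:.5f})")
```

Two of the three equations contain no time derivative, which is what makes the system differential algebraic rather than a plain set of ordinary differential equations. Every time step requires one linear solve, and that cost grows quickly with circuit size.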

In the context of radiation-hardened design, it can still be beneficial to perform transistor-level circuit simulation on an entire digital IC to obtain high-fidelity information about the circuit response. This is because lower-fidelity digital design tools are not intended to account for certain types of radiation effects, and the presence of these effects can violate some of the assumed constraints, potentially affecting simulation accuracy. Hence, novel computational tools like Xyce are necessary to efficiently perform SPICE-style simulations on large ICs.

Figure 2. Sandia Integrated Circuit.

Xyce is an open-source, SPICE-compatible, high-performance transistor-level circuit simulator developed at Sandia. It is capable of solving a broad range of circuit problems, from small-scale circuits on desktop computers to extremely large circuits on large-scale, high-performance computing platforms. Actively developed since 1999, Xyce provides the capability to investigate general network systems and has been integral to simulating radiation effects on Sandia-designed circuits as well as biological/neural networks and power grids. Diverse requirements in capacity, application and analysis have necessitated R&D of unique algorithms and techniques that facilitate simulation in both the time and frequency domains. Network simulation tools like Xyce are generally part of a larger analysis tool flow, which has motivated recent work on improving capabilities for workflow integration.

Xyce is used to analyze ionizing radiation effects in mission-relevant, application-specific integrated circuits (ASICs). This work is supported by the Advanced Simulation and Computing program’s Accelerated Digital Engineering initiative and facilitates the use of radiation-aware digital engineering techniques within the standard ASIC design flow.

A recent performance study illustrates the scalability that Xyce can achieve on the new CTS-2 HPC Production Cluster, called Amber. Each node of this computing resource has two sockets of 2.0 GHz Intel Sapphire Rapids processors with 56 cores each, for a total of 112 cores, and 256 GB of RAM. There are 1496 nodes available, but the results from this study are limited to 16 nodes.

The circuit of interest for this study is a large static random-access memory (SRAM) circuit designed for a 180-nm complementary metal-oxide-semiconductor (CMOS) technology. The SRAM circuit comprises 1.6 million transistors, leading to a coupled system of equations with 8.8 million total unknowns. The computational time required to perform transistor-level simulation on a circuit of this magnitude is dominated by repeatedly solving a linear system of equations. The assembly of the linear system, which evaluates device models to obtain contributions to the matrix and right-hand side vector, is scalable. However, the matrices generated in circuit simulation are typically sparse, have heterogeneous non-symmetric structure and are often ill-conditioned. Traditional circuit simulators rely on direct methods to solve this challenging linear system because they are dependable and easy to use, but such methods scale poorly to problems of this size and dictate the total simulation time. Iterative methods have greater potential for scalability, but their performance is predicated on finding an adequate preconditioner.
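
To illustrate why the linear system is solved repeatedly, the following Python sketch runs a Newton iteration on a hypothetical two-node circuit (a 5 V source behind a resistor, a second resistor and a diode to ground). Each iteration evaluates simple device models that "stamp" their contributions into a matrix and a residual vector, then solves the resulting linear system. The circuit, the device parameters and every function name here are assumptions made for illustration; none of this reflects Xyce's internal data structures or device models.

```python
import math


def stamp_thevenin_source(J, F, v, a, vs, r):
    """Voltage source vs in series with resistor r, connected to node a."""
    g = 1.0 / r
    F[a] += g * (v[a] - vs)  # current leaving node a
    J[a][a] += g


def stamp_resistor(J, F, v, a, b, r):
    """Resistor r between two non-ground nodes a and b."""
    g = 1.0 / r
    i = g * (v[a] - v[b])    # current leaving node a, entering node b
    F[a] += i
    F[b] -= i
    J[a][a] += g
    J[a][b] -= g
    J[b][a] -= g
    J[b][b] += g


def stamp_diode(J, F, v, a, i_sat=1e-14, v_t=0.02585):
    """Ideal exponential diode from node a to ground."""
    e = math.exp(v[a] / v_t)
    F[a] += i_sat * (e - 1.0)    # diode current leaving node a
    J[a][a] += i_sat * e / v_t   # derivative of that current w.r.t. v[a]


def newton_dc(vs=5.0, r1=1e3, r2=1e3, tol=1e-9, max_iter=100):
    """Find the DC operating point; returns (node voltages, iterations)."""
    v = [0.6, 0.6]  # initial guess near a typical diode drop
    for iteration in range(1, max_iter + 1):
        # Assembly: every device adds to the Jacobian J and residual F.
        J = [[0.0, 0.0], [0.0, 0.0]]
        F = [0.0, 0.0]
        stamp_thevenin_source(J, F, v, 0, vs, r1)
        stamp_resistor(J, F, v, 0, 1, r2)
        stamp_diode(J, F, v, 1)
        # Linear solve J dv = -F (Cramer's rule is enough for 2 unknowns).
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        dv0 = (-F[0] * J[1][1] + F[1] * J[0][1]) / det
        dv1 = (-F[1] * J[0][0] + F[0] * J[1][0]) / det
        v = [v[0] + dv0, v[1] + dv1]
        if max(abs(dv0), abs(dv1)) < tol:
            return v, iteration
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    voltages, iterations = newton_dc()
    print(f"diode voltage {voltages[1]:.4f} V after {iterations} iterations")
```

The assembly loop touches each device independently, which is why it parallelizes well, whereas the linear solve couples all unknowns. In a transient simulation this Newton loop runs at every time step, so the solve is repeated many times.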

Figure 3. Sandia Integrated Circuit.

The performance study compared the observed scaling on CTS-2 when using a direct method versus a block subdomain preconditioned iterative method to perform transistor-level simulation. Both approaches used the same solver, Intel MKL Pardiso: the direct method used it with 16 threads for the global solve, while the block subdomain preconditioner used it with 4 threads for the local solves. This strong scaling study illustrated that the linear solver speedup can be substantial when iterative methods are used: up to 5.7x faster on 16 CTS-2 nodes with 128 MPI processors. The optimal number of CTS-2 nodes for accelerating the simulation depends on the number of MPI processors.
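
The idea behind a block subdomain preconditioner can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in, not the method used in the study: it uses zero overlap rather than overlap = 1, a plain stationary (Richardson) iteration rather than a Krylov method, a small dense Gaussian elimination in place of Intel MKL Pardiso, and a hypothetical 12-unknown non-symmetric test matrix. The unknowns are split into subdomains, each diagonal block is solved directly, and the local corrections are combined into a global update.

```python
def solve_dense(a, b):
    """Direct local solve: Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def build_system(n=12):
    """Hypothetical non-symmetric, diagonally dominant tridiagonal system."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 4.0
        if i > 0:
            a[i][i - 1] = -1.0
        if i < n - 1:
            a[i][i + 1] = -2.0
    x_true = [float(i + 1) for i in range(n)]
    return a, matvec(a, x_true), x_true


def block_jacobi_solve(a, b, block_size, tol=1e-10, max_iter=200):
    """Iterate x <- x + M^{-1} (b - A x), with M = block diagonal of A."""
    n = len(b)
    blocks = [(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    # Each subdomain owns one diagonal block (no overlap in this sketch).
    local = [[row[s:e] for row in a[s:e]] for s, e in blocks]
    x = [0.0] * n
    for iteration in range(max_iter + 1):
        r = [bi - yi for bi, yi in zip(b, matvec(a, x))]
        if max(abs(ri) for ri in r) < tol:
            return x, iteration
        # Local solves are independent, so they could run in parallel.
        # A real solver would factor each block once and reuse the factors.
        for (s, e), a_loc in zip(blocks, local):
            dz = solve_dense(a_loc, r[s:e])
            for k, d in zip(range(s, e), dz):
                x[k] += d
    raise RuntimeError("iteration did not converge")


if __name__ == "__main__":
    a, b, x_true = build_system()
    x, iterations = block_jacobi_solve(a, b, block_size=4)
    error = max(abs(xi - ti) for xi, ti in zip(x, x_true))
    print(f"converged in {iterations} iterations, max error {error:.2e}")
```

Because each local solve involves only a small block, the direct solver's cost per subdomain stays modest while the number of subdomains grows with the processor count. The price is that several global iterations are needed, and how many depends on how well the blocks capture the coupling in the matrix.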

Overall, the study showed that Xyce can produce detailed, radiation-aware circuit performance predictions in a reasonable amount of time, which in turn can improve understanding of circuit margins and allow more time for informed decisions.
