{"id":2993,"date":"2011-06-24T20:13:49","date_gmt":"2011-06-24T20:13:49","guid":{"rendered":"https:\/\/mtlsites.mit.edu\/annual_reports\/2011\/?p=2993"},"modified":"2011-07-19T19:19:45","modified_gmt":"2011-07-19T19:19:45","slug":"power-and-performance-optimized-sram-caches-for-exascale-processors-2","status":"publish","type":"post","link":"https:\/\/mtlsites.mit.edu\/annual_reports\/2011\/power-and-performance-optimized-sram-caches-for-exascale-processors-2\/","title":{"rendered":"Power and Performance Optimized SRAM Caches for Exascale Processors"},"content":{"rendered":"

On-chip memories are responsible for a large portion (40% by many estimates) of the total energy consumption and area of modern processor designs. Optimizing memory for density, power, and frequency trade-offs is therefore crucial to meeting the aggressive power goals of exascale processors. Today’s cache bit-cells in 65-nm CMOS consume 1 pJ per access at 1.0 V. Our goal is to reduce this amount by a factor of 20 to reach the Angstrom Project\u2019s target at 11-nm technology: 50 fJ energy per operation (E\/Op) per bit-cell.<\/p>\n

We are designing the first version of the Angstrom microprocessor\u2019s L1 cache in 65-nm CMOS. To decrease E\/Op from ~1 pJ to ~200 fJ per bit-cell, we designed our L1-cache bit-cells to work down to 0.5 V. SRAM bit-cells, however, suffer from decreased stability at low voltages. In Figure 1, the read and write margins of a bit-cell are simulated by 1000-point Monte Carlo analyses at 0.5 V; negative values indicate failures. To combat margin problems, the work in [1] uses an 8-transistor (8T) bit-cell. As shown in Figure 1, this bit-cell’s read port is decoupled from the storage nodes, so it is immune to read upsets. The remaining 6 transistors can be sized to favor writes.<\/p>\n
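The energy figures above can be sanity-checked with a rough first-order model. The sketch below assumes that dynamic switching energy scales with the square of the supply voltage (E proportional to C V^2) and ignores leakage, timing, and technology-scaling effects, so it is only an order-of-magnitude check and not the analysis used in this work. The function and variable names are illustrative.

```python
# First-order estimate of bit-cell access energy under voltage scaling.
# Assumption (not from the source): energy scales as V^2 with fixed capacitance.
# The energy and voltage values are the ones quoted in the text.

E_NOMINAL_FJ = 1000.0  # ~1 pJ per access at 1.0 V in 65-nm CMOS
V_NOMINAL = 1.0        # nominal supply voltage in volts
V_LOW = 0.5            # low-voltage operating point in volts
E_TARGET_FJ = 50.0     # 11-nm target energy per operation per bit-cell


def scaled_energy_fj(e_ref_fj, v_ref, v_new):
    '''Scale a reference energy quadratically with supply voltage.'''
    return e_ref_fj * (v_new / v_ref) ** 2


e_low = scaled_energy_fj(E_NOMINAL_FJ, V_NOMINAL, V_LOW)
reduction_needed = E_NOMINAL_FJ / E_TARGET_FJ
remaining_gap = e_low / E_TARGET_FJ

print(f'Estimated E/Op at {V_LOW} V: {e_low:.0f} fJ')
print(f'Total reduction needed to reach target: {reduction_needed:.0f}x')
print(f'Gap left after voltage scaling alone: {remaining_gap:.0f}x')
```

Under this simple model, halving the supply voltage gives roughly a fourfold reduction, which is in the same range as the ~200 fJ quoted above. The rest of the gap to 50 fJ would have to come from other sources, such as technology scaling.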

For the 8T bit-cell, a single-ended sense amplifier (SA) is necessary, since the read bit-line (RBL) is the only port used for the read operation. Different SA techniques have been analyzed, such as non-strobed regenerative sensing [2] and strobed strong-arm-based sensing; the latter is chosen for its robust design and low-voltage compatibility. The offset of the SA is reduced using offset-compensation techniques. This concept is illustrated by 1000-point Monte Carlo analyses of the input offset of our strong-arm-type SA with and without compensation. All analyses shown are done on a 22-nm predictive technology [3].<\/p>\n\n\t\t
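As a toy illustration of the kind of Monte Carlo offset comparison described above, the sketch below draws 1000 random input offsets and counts how many exceed an available bit-line swing, with and without compensation. It is not the circuit-level simulation used in this work. The Gaussian offset model, the offset sigma, the residual fraction left after compensation, and the swing are all hypothetical values chosen only for illustration.

```python
import random

# Toy Monte Carlo model of sense-amplifier input offset.
# Every numeric parameter except the point count is a hypothetical placeholder.

N_POINTS = 1000          # matches the 1000-point analyses mentioned in the text
SIGMA_MV = 30.0          # hypothetical offset standard deviation in mV
RESIDUAL_FRACTION = 0.2  # hypothetical fraction of offset left after compensation
SWING_MV = 50.0          # hypothetical available read bit-line swing in mV


def monte_carlo_offsets(n_points, sigma_mv, seed=0):
    '''Draw n_points zero-mean Gaussian offsets (mV) with a fixed seed.'''
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_mv) for _ in range(n_points)]


def failure_count(offsets_mv, swing_mv):
    '''Count samples whose offset magnitude meets or exceeds the swing.'''
    return sum(1 for v in offsets_mv if abs(v) >= swing_mv)


raw = monte_carlo_offsets(N_POINTS, SIGMA_MV)
# Model compensation as leaving a fixed fraction of each sampled offset.
compensated = [v * RESIDUAL_FRACTION for v in raw]

fails_raw = failure_count(raw, SWING_MV)
fails_comp = failure_count(compensated, SWING_MV)

print(f'Failures without compensation: {fails_raw} of {N_POINTS}')
print(f'Failures with compensation:    {fails_comp} of {N_POINTS}')
```

Because the compensated offsets here are just scaled-down copies of the same samples, the compensated failure count can never exceed the uncompensated one. A real analysis would instead rerun the circuit simulation with device mismatch applied to the compensation circuitry as well.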