A Highly Scalable Parallel Boundary Element Method for Capacitance Extraction

Standard boundary element methods (BEMs) involve both an embarrassingly parallelizable system setup step and a linear system solving step of time complexity O(N^3) that cannot be parallelized efficiently. When piecewise constant (PWC) basis functions are adopted to represent solutions, the system solving step dominates the overall computation time (usually more than 90%) and limits how well standard BEMs scale with the number of parallel computing nodes. For capacitance extraction problems, traditional acceleration techniques, such as the multipole expansion [1] and the pre-corrected FFT methods [2], can reduce the solving time complexity to O(N log N). However, available parallel implementations of these two techniques show that their parallel speedup saturates quickly as nodes are added: their parallel efficiency drops to between 40% and 60% at just 8 nodes [3] [4].

The aforementioned methods suffer from poor parallel scalability because their underlying solution representation, PWC basis functions, is inefficient for representing charge distributions and therefore produces a large linear system. Solving such a large system dominates the overall computation and drastically degrades the parallel efficiency. To circumvent the bottleneck of solving a large system in parallel, we employ our recently developed instantiable basis functions, which are 30 times more compact than PWC basis functions for the same capacitance accuracy [5]. Accordingly, the share of total time spent solving the system falls from the original 90% to less than 5%, while the embarrassingly parallelizable part becomes dominant (growing from 10% of the total time to more than 95%). In addition, we develop four integration techniques to further accelerate the system matrix filling process by 86%. In our examples, the new algorithm is 6 times faster than FastCap [1] in a single-core environment and achieves 90% parallel efficiency on a 2-CPU-10-core distributed memory system, implemented in C++ with MPI parallelization [6].
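The link between the serial share of the runtime and parallel efficiency can be illustrated with Amdahl's law. The sketch below is an idealized model and not the method of [6]: it assumes the system solving step is entirely serial, the setup step is perfectly parallel, and communication costs nothing. The function names are hypothetical, chosen only for this illustration.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's law (idealized, no communication cost)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def parallel_efficiency(serial_fraction: float, workers: int) -> float:
    """Speedup divided by the number of workers."""
    return amdahl_speedup(serial_fraction, workers) / workers


def max_serial_fraction(target_efficiency: float, workers: int) -> float:
    """Largest serial fraction that still reaches the target efficiency.

    Derived by solving E = 1 / (p*s + 1 - s) for s.
    """
    return (1.0 / target_efficiency - 1.0) / (workers - 1)


if __name__ == "__main__":
    workers = 10  # assumption: one worker per core of a 10-core system
    for label, s in [("solve step is 90% of runtime", 0.90),
                     ("solve step is 5% of runtime", 0.05)]:
        eff = parallel_efficiency(s, workers)
        print(f"{label}: efficiency bound = {eff:.1%}")
    s_max = max_serial_fraction(0.90, workers)
    print(f"serial share needed for 90% efficiency: {s_max:.2%}")
```

Under these assumptions, a 90% serial share caps efficiency at roughly 11% on 10 workers, while a 5% serial share raises the bound to roughly 69%. Reaching 90% efficiency in this model requires a serial share near 1.2%, which is compatible with the "less than 5%" figure above but is an inference from the model rather than a number reported in the source.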

  1. K. Nabors and J. White, “FastCap: A multipole accelerated 3-D capacitance extraction program,” IEEE Transactions on Computer-Aided Design, vol. 10, no. 10, pp. 1447–1459, Nov. 1991.
  2. J. R. Phillips and J. K. White, “A precorrected-FFT method for electrostatic analysis of complicated 3-D structures,” IEEE Transactions on Computer-Aided Design, vol. 16, no. 10, pp. 1059–1072, Oct. 1997.
  3. Y. Yuan and P. Banerjee, “A parallel implementation of a fast multipole-based 3-d capacitance extraction program on distributed memory multicomputers,” Journal of Parallel and Distributed Computing, vol. 61, no. 12, pp. 1751–1774, Dec. 2001.
  4. N. R. Aluru, V. B. Nadkarni, and J. White, “A parallel precorrected FFT based capacitance extraction program for signal integrity analysis,” Proc. 33rd Annual Design Automation Conference, 1996, pp. 363–366.
  5. Y.-C. Hsiao, T. El-Moselhy, and L. Daniel, “Efficient capacitance solver for 3d interconnect based on template-instantiated basis functions,” IEEE 18th Conference on Electrical Performance of Electronic Packaging and Systems, 2009, pp. 179–182.
  6. Y.-C. Hsiao and L. Daniel, “A highly scalable parallel boundary element method for capacitance extraction,” Proc. 48th Annual Design Automation Conference, 2011, pp. 552–557.