Also, the relationship between the optimal block size for matrix transposition and the size and associativity of the processor's cache is formalized. For parallel optimization of SAR imaging processing, several parallel programming models available on NUMA systems are compared: lightweight processes (sproc), POSIX threads, OpenMP, and MPI. They are discussed in terms of speedup, coding complexity, and other aspects, taking into account the characteristics of the NUMA architecture and of SAR imaging processing.
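As a rough illustration of what "block size" refers to in a matrix transpose, the sketch below walks a row-major matrix tile by tile, so that each tile's reads and writes stay within a small region of memory. The function name `transpose_blocked` and the default `block=32` are assumptions made for this example. The value is a placeholder and is not derived from the block-size relation mentioned above, which is not reproduced here. The cache benefit of tiling shows up mainly in compiled code; this version only demonstrates the traversal order.

```python
def transpose_blocked(a, n_rows, n_cols, block=32):
    """Transpose a row-major n_rows x n_cols matrix stored as a flat list.

    `block` is the tile edge length (a hypothetical default; in practice it
    would be chosen from the cache size and associativity).
    """
    out = [0] * (n_rows * n_cols)
    # Outer loops step over the top-left corner of each tile.
    for i0 in range(0, n_rows, block):
        i_end = min(i0 + block, n_rows)      # clip partial tiles at the edge
        for j0 in range(0, n_cols, block):
            j_end = min(j0 + block, n_cols)
            # Inner loops copy one tile; element (i, j) goes to (j, i).
            for i in range(i0, i_end):
                for j in range(j0, j_end):
                    out[j * n_rows + i] = a[i * n_cols + j]
    return out
```

For example, `transpose_blocked(list(range(6)), 2, 3, block=2)` treats the input as a 2 x 3 matrix and returns its 3 x 2 transpose in row-major order. Any positive `block` gives the same result; only the memory access order changes.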