endarterectomy posted on 2025-3-23 11:20:29

http://reply.papertrans.cn/43/4264/426340/426340_11.png

falsehood posted on 2025-3-23 16:04:20

Programming the LU Factorization for a Multicore System with Accelerators
...le presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny-Cours CPUs and four NVIDIA Fermi GPUs.

荧光 posted on 2025-3-23 18:39:24

Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU
...onditioner in combination with deflation. This combination exhibits fine-grain parallelism, and hence we gain considerably in execution time compared with a similar implementation on the CPU. Its numerical performance is comparable to the Block Incomplete Cholesky approach. Our method provides a...
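To make the structure of a preconditioned conjugate gradient iteration concrete, here is a minimal sketch in plain Python. It is not the method from the abstract above: deflation is left out entirely, and a simple Jacobi (diagonal) preconditioner stands in for the truncated "...onditioner". The names `pcg`, `matvec`, and `apply_prec` are hypothetical. What it does show is why the iteration parallelizes at a fine grain: every step is a matrix-vector product, a dot product, or an element-wise vector update.

```python
def pcg(matvec, b, apply_prec, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    matvec(v) returns A @ v; apply_prec(r) returns M^{-1} r.
    Both are caller-supplied callables (hypothetical names).
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x, with x = 0
    z = apply_prec(r)                # preconditioned residual
    p = list(z)                      # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        # Stop once the residual 2-norm is below the tolerance.
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_prec(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small illustrative system (made-up data, not from the paper).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def matvec(v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in A]


def jacobi(r):
    # Jacobi preconditioner: divide by the diagonal of A.
    return [ri / A[i][i] for i, ri in enumerate(r)]


x = pcg(matvec, b, jacobi)
```

On a GPU each of the list comprehensions above would map to one data-parallel kernel; the dot products are the only steps that need a reduction.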

accessory posted on 2025-3-23 22:51:06

http://reply.papertrans.cn/43/4264/426340/426340_14.png

stress-test posted on 2025-3-24 05:43:19

A High Performance SYMV Kernel on a Fermi-core GPU
...bandwidth and reduced memory usage. On a Tesla C2050, sustained double-precision and single-precision performance of approximately 43 GFLOPS and 78 GFLOPS, respectively, was achieved. The proposed SYMV kernel also runs on a GeForce GTX580 at 72 GFLOPS and 128 GFLOPS in the double-precision a...
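For readers unfamiliar with SYMV (the symmetric matrix-vector product y = A x), the sketch below shows the basic idea that makes it memory-friendly: only one triangle of A is stored, and each off-diagonal entry is read once but used for two updates. This is a plain sequential illustration under that assumption, not the GPU kernel from the abstract; the name `symv_lower` and the row-list storage layout are hypothetical.

```python
def symv_lower(n, a_lower, x):
    """Compute y = A @ x for a symmetric n-by-n matrix A.

    a_lower[i] holds row i of the lower triangle, i.e. A[i][0..i],
    so the strict upper triangle is never stored or read.
    """
    y = [0.0] * n
    for i in range(n):
        row = a_lower[i]
        for j in range(i):
            aij = row[j]
            y[i] += aij * x[j]    # contribution of A[i][j]
            y[j] += aij * x[i]    # contribution of A[j][i], equal by symmetry
        y[i] += row[i] * x[i]     # diagonal entry
    return y


# Made-up 3x3 example: A = [[2, 1, 0], [1, 3, 4], [0, 4, 5]].
a_lower = [[2.0], [1.0, 3.0], [0.0, 4.0, 5.0]]
y = symv_lower(3, a_lower, [1.0, 2.0, 3.0])
```

The two updates per loaded entry are also what makes a parallel version harder than a general matrix-vector product: different rows write to the same `y[j]`, so a GPU kernel has to coordinate those writes.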

知识 posted on 2025-3-24 07:04:09

Optimizing Memory-Bound SYMV Kernel on GPU Hardware Accelerators
...nt execution contexts. High-level programming language extensions (e.g., CUDA) and profiling tools (e.g., PAPI-CUDA, CUDA Profiler) are paramount to improving productivity while effectively exploiting the underlying hardware. We present an optimized numerical kernel for computing the symmetric matrix-ve...

Jargon posted on 2025-3-24 13:07:24

Numerical Simulation of Long-Term Fate of CO2 Stored in Deep Reservoir Rocks on Massively Parallel V...
... worldwide. CO2 is captured from large emission sources and injected and stored in deep reservoir rocks, including saline aquifers and depleted oil and gas fields. Under typical pressure and temperature conditions in deep reservoirs (depths > 800 m), CO2 will be stored in a supercritical state, subsequentl...

sperse posted on 2025-3-24 16:46:51

http://reply.papertrans.cn/43/4264/426340/426340_18.png

修正案 posted on 2025-3-24 22:30:56

Parallel Scalability Enhancements of Seismic Response and Evacuation Simulations of Integrated Earth...
...Simulator (IES), with the aim of simulating earthquake disasters in large urban areas. For the SRA module, near-ideal scalability is attained by introducing a static load balancer based on previous run-time data. The use of System V IPC as a means of reusing legacy seismic response analys...
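The abstract does not say how the static load balancer uses the earlier timings, so the following is only one plausible sketch: a greedy longest-processing-time-first assignment that hands each task, in order of decreasing measured cost, to the currently least-loaded process. The function name `static_balance` and the task labels are hypothetical, and the timings are made up for illustration.

```python
import heapq


def static_balance(prev_times, n_procs):
    """Assign tasks to processes before the run starts.

    prev_times maps a task id to its run time measured in a previous run.
    Returns (assignment, loads): task lists and predicted load per process.
    """
    # Min-heap of (predicted load, process id); all processes start empty.
    heap = [(0.0, p) for p in range(n_procs)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    # Place the most expensive tasks first.
    for task, t in sorted(prev_times.items(), key=lambda kv: kv[1], reverse=True):
        load, p = heapq.heappop(heap)      # least-loaded process so far
        assignment[p].append(task)
        heapq.heappush(heap, (load + t, p))
    loads = {p: sum(prev_times[t] for t in tasks) for p, tasks in assignment.items()}
    return assignment, loads


# Made-up timings (seconds) for five tasks, balanced over two processes.
prev_times = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 3.0, "e": 1.0}
assignment, loads = static_balance(prev_times, 2)
```

Because the assignment is computed once from old measurements, it costs nothing at run time, but it is only as good as the assumption that task costs are similar from run to run.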

cruise posted on 2025-3-25 01:03:35

http://reply.papertrans.cn/43/4264/426340/426340_20.png
View full version: Titlebook: High Performance Computing for Computational Science - VECPAR 2012; 10th International C Michel Daydé, Osni Marques, Kengo Nakajima Conferenc