Embolic-Stroke posted on 2025-3-25 03:51:10
Sparse Matrix Multiplication on Dataflow Engines
…ents. The proposed architecture allows replication of its blocks in order to parallelize the computation. The architecture is implemented on a Maxeler dataflow engine based on a Virtex 5 FPGA. The implementation results are given.
Brochure posted on 2025-3-25 07:42:13
http://reply.papertrans.cn/75/7411/741018/741018_22.png
离开就切除 posted on 2025-3-25 13:18:31
Parallel Algorithm for Quasi-Band Matrix-Matrix Multiplication
…t an efficient algorithm for multiplying two such matrices on a many-core architecture such as a GPU. Our implementation outperforms the corresponding library implementation by a factor of 2x on average over a wide variety of quasi-band matrices from standard datasets. We analyze our performance over synthetic quasi-band matrices.
Water-Brash posted on 2025-3-25 16:47:42
Massively Parallel Approach to Sensitivity Analysis on HPC Architectures by Using Scalarm Platform
…ate Sensitivity Analysis calculations by using modern e-infrastructures for distribution and parallelization purposes. The paper contains both a description of the proposed solution and results obtained for a selected industrial case study.
Glower posted on 2025-3-25 20:12:50
0302-9743
…facilitate efficient and convenient utilization of modern parallel and distributed computing architectures, as well as on large-scale applications, including big data problems.
ISBN 978-3-319-32148-6, 978-3-319-32149-3. Series ISSN 0302-9743. Series E-ISSN 1611-3349.
Kidney-Failure posted on 2025-3-26 02:56:29
http://reply.papertrans.cn/75/7411/741018/741018_26.png
令人心醉 posted on 2025-3-26 05:38:25
Exploring Memory Error Vulnerability for Parallel Programming Models
…ect of memory errors through program flow. Our results show the need for soft error resiliency methods based on the memory behavior of programs, and for evaluating the tradeoffs between performance and reliability.
吵闹 posted on 2025-3-26 11:07:28
Performance Analysis of the Chebyshev Basis Conjugate Gradient Method on the K Computer
…dicates that the CBCG method is faster than the CG method if the number of cores is sufficiently large. We then measure the execution time of both methods on the K computer, and the obtained results agree with our estimation.
Amenable posted on 2025-3-26 13:16:28
http://reply.papertrans.cn/75/7411/741018/741018_29.png
Ingredient posted on 2025-3-26 18:33:50
http://reply.papertrans.cn/75/7411/741018/741018_30.png