Shuttle posted on 2025-3-23 13:23:28

Tuning Hadoop Map Slot Value Using CPU Metric, a large number of companies that have adopted Hadoop for their business purposes. One of the configuration parameters that influences resource allocation, and thus the performance of a Hadoop application, is the map slot value (MSV). MSV determines the number of map tasks that run concurrently on a nod

Assault posted on 2025-3-23 17:35:37

A Study of SQL-on-Hadoop Systems, providing SQL analysis functionality for the big data residing in HDFS becomes more and more important. Hive is a pioneer system that supports SQL-like analysis of the data in HDFS. However, the performance of Hive is not satisfactory for many applications. This leads to the quick emergence of dozens

Entreaty posted on 2025-3-23 20:24:00

http://reply.papertrans.cn/19/1857/185635/185635_13.png

Emg827 posted on 2025-3-24 00:03:15

http://reply.papertrans.cn/19/1857/185635/185635_14.png

Condescending posted on 2025-3-24 04:16:00

Efficient HTTP Based I/O on Very Large Datasets for High Performance Computing with the Libdavix Lib, protocols are highly optimized for high throughput on very large datasets, multi-streams, high availability, low latency and efficient parallel I/O. The purpose of this paper is to describe how we have adapted a generic protocol, the Hypertext Transfer Protocol (HTTP), to make it a competitive alte

posted on 2025-3-24 09:56:50

DSIMBench: A Benchmark for Microarray Data Using R, the tool kits are suited for a specific environment. In this paper we propose DSIMBench, a benchmark containing two classic microarray analysis functions with eight different parallel R workflows, and evaluate the benchmark in the IC Cloud testbed platform.

反感 posted on 2025-3-24 13:25:49

http://reply.papertrans.cn/19/1857/185635/185635_17.png

任意 posted on 2025-3-24 16:03:22

http://reply.papertrans.cn/19/1857/185635/185635_18.png

大方一点 posted on 2025-3-24 20:50:05

https://doi.org/10.1007/978-94-017-6798-9 Sort, Kmeans and PageRank. We conduct a detailed analysis of their I/O characteristics, including disk read/write bandwidth, I/O device utilization, average waiting time of I/O requests, and average size of I/O requests, which acts as a guide for designing high-performance, low-power and cost-aware big data storage systems.

不足的东西 posted on 2025-3-25 02:16:22

Generalizations of Dirichlet Convolution, 7% and 50% speedups compared with those of Hadoop and Spark, respectively. Most of the benefits come from the high-efficiency communication mechanisms in DataMPI. We also notice that the resource (CPU, memory, disk and network I/O) utilization of DataMPI is more efficient than that of the other two frameworks.
View full version: Titlebook: Big Data Benchmarks, Performance Optimization, and Emerging Hardware; 4th and 5th Workshop Jianfeng Zhan, Rui Han, Chuliang Weng Conference p