TY - Generic
T1 - Scalable Euclidean Embedding for Big Data
T2 - 2015 IEEE 8th International Conference on Cloud Computing
Y1 - 2015
A1 - Zohreh Alavi
A1 - Sagar Sharma
A1 - Lu Zhou
A1 - Keke Chen
KW - Algorithm design and analysis
KW - Approximation algorithms
KW - arbitrary metric space
KW - Big Data
KW - Big data scale
KW - Complexity theory
KW - data reduction
KW - data visualisation
KW - data visualization
KW - Euclidean embedding algorithms
KW - Euclidean space
KW - FastMap-MR algorithm
KW - LMDS-MR algorithm
KW - massive data parallel infrastructure
KW - Measurement
KW - parallel algorithms
KW - parallel processing
KW - Scalability
KW - scalable Euclidean embedding algorithm
KW - visualization technique
AB - Euclidean embedding algorithms transform data defined in an arbitrary metric space into Euclidean space, which is critical to many visualization techniques. At big-data scale, these algorithms need to scale to massive data-parallel infrastructures. Designing such scalable algorithms and understanding the factors that affect them are important research problems for visually analyzing big data. We propose a framework that extends existing Euclidean embedding algorithms to scalable ones. Specifically, it decomposes an existing algorithm into naturally parallel components and non-parallelizable components. Data-parallel implementations such as MapReduce are then applied to the parallel components, and data reduction techniques to the non-parallelizable ones. We show that this decomposition is possible for a collection of embedding algorithms. Extensive experiments are conducted to understand the important factors in these scalable algorithms: scalability, time cost, and the effect of data reduction on result quality. The results on two sample algorithms, FastMap-MR and LMDS-MR, show that with the proposed approach the derived algorithms preserve result quality well while achieving desirable scalability.
JA - 2015 IEEE 8th International Conference on Cloud Computing
PB - IEEE
CY - New York City, NY
ER -