02451nas a2200409 4500008004100000022002200041245004600063210004600109260003700155300001400192520123100206653003401437653002901471653002701500653001301527653001901540653002201559653001901581653002301600653002301623653003501646653002001681653002501701653002201726653004101748653001601789653002401805653002401829653001601853653004301869653002801912100001801940700001801958700001301976700001501989856003702004 2015 eng d a978-1-4673-7286-200aScalable Euclidean Embedding for Big Data0 aScalable Euclidean Embedding for Big Data aNew York City, NYbIEEEc07/2015 a773 - 7803 aEuclidean embedding algorithms transform data defined in an arbitrary metric space to the Euclidean space, which is critical to many visualization techniques. At big-data scale, these algorithms need to be scalable to massive data-parallel infrastructures. Designing such scalable algorithms and understanding the factors affecting the algorithms are important research problems for visually analyzing big data. We propose a framework that extends the existing Euclidean embedding algorithms to scalable ones. Specifically, it decomposes an existing algorithm into naturally parallel components and non-parallelizable components. Then, data parallel implementations such as MapReduce and data reduction techniques are applied to the two categories of components, respectively. We show that this can be possibly done for a collection of embedding algorithms. Extensive experiments are conducted to understand the important factors in these scalable algorithms: scalability, time cost, and the effect of data reduction to result quality. The result on sample algorithms: Fast Map-MR and LMDS-MR shows that with the proposed approach the derived algorithms can preserve result quality well, while achieving desirable scalability.10aAlgorithm design and analysis10aApproximation algorithms10aarbitrary metric space10aBig Data10aBig data scale10aComplexity theory10adata reduction10adata visualisation10adata visualization10aEuclidean embedding algorithms10aEuclidean space10aFastMap-MR algorithm10aLMDS-MR algorithm10amassive data parallel infrastructure10aMeasurement10aparallel algorithms10aparallel processing10aScalability10ascalable Euclidean embedding algorithm10avisualization technique1 aAlavi, Zohreh1 aSharma, Sagar1 aZhou, Lu1 aChen, Keke uhttp://www.knoesis.org/node/2748