Description
|
|
---|---|
Large-scale scientific experiments increasingly rely on geo-distributed clouds to serve relevant data to scientists worldwide with minimal latency. State-of-the-art caching systems often require the client to access the data through a caching proxy, or to contact a metadata server to locate the closest available copy of the desired data. Also, such caching systems are inconsistent with the design of distributed hash-table databases such as Dynamo, which focus on allowing clients to locate data independently. We argue there is a gap between existing state-of-the-art solutions and the needs of geographically distributed applications, which require fast access to popular objects while not degrading access latency for the rest of the data. In this paper, we introduce a probabilistic algorithm allowing the user to locate the closest copy of the data efficiently and independently with minimal overhead, allowing low-latency access to non-cached data. Also, we propose a network-efficient technique to identify the most popular data objects in the cluster and trigger their replication close to the clients. Experiments with a real-world data set show that these principles allow clients to locate the closest available copy of data with small memory footprint and low error-rate, thus improving read-latency for non-cached data and allowing hot data to be read locally. | |
International
|
Yes |
Conference name
|
7th Workshop on Scientific Cloud Computing (ScienceCloud) 2016, ACM HPDC |
Participation type
|
960 |
Conference location
|
Kyoto, Japan |
Peer reviewed
|
Yes |
ISBN or ISSN
|
978-1-4503-4353-4 |
DOI
|
http://dx.doi.org/10.1145/2913712.2913715 |
Conference start date
|
31/05/2016 |
Conference end date
|
04/06/2016 |
First page
|
3 |
Last page
|
9 |
Proceedings title
|
Science Cloud 2016 |