Observatorio de I+D+i UPM


Research reports

Journal articles:

Reliability of a System of k Nodes for High Performance Computing Applications

Year: 2010

Research areas

Computer science and information technology

Data

Description
Reliability estimation of High Performance Computing (HPC) systems enables resource allocation and fault tolerance frameworks to minimize the performance loss due to unexpected failures. Recent studies have shown that compute nodes in HPC systems follow a time-varying failure rate distribution, such as the Weibull, rather than the exponential distribution. In this paper, we propose a model for the Time to Failure (TTF) distribution of a system of k s-independent nodes when individual nodes exhibit time-varying failure rates. We also present the system reliability, failure rates, Mean Time to Failure (MTTF), and derivations of the proposed system TTF model. The model is validated using observed time-to-failure data.
International	Yes
ISI JCR	Yes
Journal title	IEEE TRANSACTIONS ON RELIABILITY
ISSN	0018-9529
JCR impact factor	1.331
Impact information
Volume
DOI
Issue number
From page	162
To page	169
Month	JANUARY
Ranking
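
The description above names system reliability, failure rate, and MTTF for k s-independent nodes with Weibull lifetimes. As a minimal sketch only, and not the model derived in the paper, the Python below assumes a series system (the system fails as soon as any one node fails), so the system reliability is the product of the node reliabilities and the system failure rate is the sum of the node failure rates. The function names and the (shape, scale) parameterization are hypothetical choices made for illustration.

```python
import math


def weibull_reliability(t, shape, scale):
    """Reliability of one node: R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def system_reliability(t, nodes):
    """Reliability of k s-independent nodes in series.

    Assumption (not taken from the paper): the system fails at the first
    node failure, so the node reliabilities multiply.
    `nodes` is a list of (shape, scale) pairs.
    """
    r = 1.0
    for shape, scale in nodes:
        r *= weibull_reliability(t, shape, scale)
    return r


def system_failure_rate(t, nodes):
    """System hazard for t > 0: the sum of the node Weibull hazards."""
    return sum((b / s) * (t / s) ** (b - 1) for b, s in nodes)


def system_mttf(nodes, horizon, steps=100_000):
    """MTTF as the integral of R(t) over [0, horizon], trapezoid rule.

    `horizon` must be large enough that R(horizon) is negligible.
    """
    dt = horizon / steps
    total = 0.5 * (system_reliability(0.0, nodes)
                   + system_reliability(horizon, nodes))
    for i in range(1, steps):
        total += system_reliability(i * dt, nodes)
    return total * dt
```

Under these assumptions, k identical nodes with shape b and scale s give a system TTF that is again Weibull, with the same shape and scale s * k ** (-1 / b). With shape 1 this reduces to the familiar exponential case, where the system MTTF is the node MTTF divided by k.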

This activity belongs to research reports

Participants

Author: Mihaela Marinela Paun (UPM)

Related research groups, departments, centers and R&D&I institutes

Creator: Research group: Grupo de Inteligencia Artificial (LIA)