Source Information
Abstract
English
Job checkpointing is one of the most commonly used techniques for providing fault tolerance in computational grids. The efficiency of checkpointing depends on the choice of the checkpoint interval; an inappropriate checkpoint interval can delay job execution. In this paper, a fault-tolerant job scheduling system based on the checkpointing technique is presented and evaluated. When scheduling a job, the system combines the average failure time and failure rate of grid resources with resource response time to make scheduling decisions. The system uses the failure rate of the assigned resources to calculate the checkpoint interval for each job. Extensive simulation experiments are conducted to quantify the performance of the proposed system. The experiments show that the proposed system can considerably improve throughput, turnaround time, and failure tendency.
Table of Contents
1. Introduction
2. Related Work
3. Problem Definition and Scope
4. FTCS Scheduling
4.1. Components of the FTCS
5. Simulation Environment
6. Results and Discussions
6.1. Throughput
6.2. Average Turnaround Time
6.3. Failure Tendency
7. Conclusions
References