Defining efficient job scheduling policies in very large, dynamically evolving distributed systems, such as the off-line data processing for the LHC experiments, is a challenging task. It requires analysing a large number of parameters describing the jobs and the time-dependent state of the system. The problem becomes even more difficult when not all of these parameters are correctly identified, or when the knowledge about the state of the large distributed system is incomplete and/or available only with a certain delay.
The aim of this paper is to describe a possible approach to the scheduling task: a system able to dynamically learn and cluster information in a high-dimensional parameter space. This dynamic scheduling system should be seen as adaptive middle-layer software, aware of the currently available resources and using "past experience" to optimise job performance and resource utilisation; a minimal sketch of such a layer is given below. Such a self-organising scheduling system may offer a solution for the effective use of resources by the off-line data processing jobs of future HEP experiments.
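The following sketch illustrates, in Python, what an adaptive middle layer of this kind might look like: it observes the (possibly delayed) state of the processing sites and records its decisions as "past experience". The class and attribute names, the site parameters, and the baseline scoring rule are illustrative assumptions, not the implementation described in this paper.

```python
from dataclasses import dataclass, field

@dataclass
class SiteState:
    # Hypothetical per-site state snapshot, possibly incomplete or delayed.
    name: str
    free_cpus: int
    queue_length: int
    network_load: float   # 0.0 .. 1.0

@dataclass
class AdaptiveScheduler:
    # Accumulated (job, decision) pairs standing in for "past experience".
    history: list = field(default_factory=list)

    def submit(self, job_cpu, sites):
        # Naive baseline policy: prefer sites with free CPUs, short queues and
        # low network load. A learning system would replace this fixed score
        # with one derived from the clustered history of past job performance.
        best = max(sites, key=lambda s: s.free_cpus - s.queue_length
                                        - 10 * s.network_load)
        self.history.append((job_cpu, best.name))
        return best.name

# Usage with made-up site states:
sites = [SiteState("SiteA", 120, 30, 0.4), SiteState("SiteB", 80, 5, 0.2)]
print(AdaptiveScheduler().submit(job_cpu=4, sites=sites))
```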
In the area of competitive learning there is a quite large number of models which may have similar goals but differ considerably in the way they work or in how they are implemented to solve certain problems. In our case we use a feature-mapping architecture able to map a high-dimensional input space into a much lower-dimensional structure, in such a way that most of the similarly correlated patterns in the original data remain close in the mapped space. Such a clustering scheme seems suitable for this decision-making task, as we expect a strong correlation between the parameters involved. Compared with an "intuitive" model, we expect that such an approach will offer a better way to analyse possible options and can evolve and improve itself dynamically.
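As an illustration of this kind of feature mapping, the sketch below implements a small Kohonen-style self-organising map that projects high-dimensional job/state descriptors onto a 2-D grid, so that similarly correlated patterns land in nearby cells. The grid size, parameter set and training schedule are illustrative assumptions only.

```python
import numpy as np

class FeatureMap:
    """Maps high-dimensional job/system-state vectors onto a small 2-D grid,
    so that similarly correlated patterns end up in nearby map cells."""

    def __init__(self, grid=(8, 8), dim=6, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = grid
        self.weights = rng.random((grid[0], grid[1], dim))
        # Coordinates of every cell, used for neighbourhood distances.
        self.coords = np.stack(np.meshgrid(np.arange(grid[0]),
                                           np.arange(grid[1]),
                                           indexing="ij"), axis=-1)

    def best_matching_unit(self, x):
        """Cell whose weight vector is closest to the input pattern x."""
        d = np.linalg.norm(self.weights - x, axis=-1)
        return np.unravel_index(np.argmin(d), self.grid)

    def train(self, data, epochs=20, lr0=0.5, sigma0=3.0):
        """Competitive learning: pull the winning cell and its neighbours
        towards each pattern, shrinking learning rate and neighbourhood."""
        for epoch in range(epochs):
            lr = lr0 * np.exp(-epoch / epochs)
            sigma = sigma0 * np.exp(-epoch / epochs)
            for x in data:
                bmu = np.array(self.best_matching_unit(x))
                dist2 = np.sum((self.coords - bmu) ** 2, axis=-1)
                h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
                self.weights += lr * h * (x - self.weights)

# Usage: each row is a hypothetical normalised job/state descriptor, e.g.
# [cpu_needed, input_size, queue_length, free_cpus, network_load, past_runtime].
data = np.random.default_rng(1).random((500, 6))
som = FeatureMap()
som.train(data)
print(som.best_matching_unit(data[0]))   # map cell assigned to the first pattern
```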
The performance of such a scheduling algorithm is evaluated using a complex simulation framework for modelling distributed processing systems.

