12:00-1:30 Registration - Wilson Hall Atrium South End
1:30 Opening Session - Chaired by Dane Skow (FNAL)
Opening - Goals of the Workshop - Alan Silverman (CERN) and Dane Skow (FNAL)
1:45 Fermilab Overview and Welcome - Matthias Kasemann (FNAL)
2:15 LHC Computing Needs - Wolfgang von Rueden (CERN)
3:00 The IEEE Task Force on Cluster Computing and Directions in Scalable Clusters - Bill Gropp (ANL)
4:00 Panel on Usage Cases - FNAL, BNL, BaBar, Non-HEP
Panel Chair: Dane Skow
- Tom Yanuklis Talk (RHIC)
- Charles Young Talk (BaBar)
- Steve Wolbers Talk (FNAL)
- James Cuff Talk (The Sanger Centre)
- Ralf Gerhards Talk (H1 at DESY)
- Atsushi Manabe Talk (KEK)
- Jim Simone Talk (TH QCD at Fermilab)
For Each Site:
- A Brief Description of Each Cluster
- Its Size
- Its Architecture
- What it is used for
- Any Special Features
- What decisions/accommodations (architecture, hardware, etc.) did you make because of any special nature of the applications to be run?
- What Optimizations Did You Make?
5:30 End of Day 1
9:00 - 10:30 Clusters at Large Sites
Chairman - Steve Wolbers
10:30 Break
10:50 Panel on Hardware Issues
Selection Criteria, Life Cycles, Cluster Heterogeneity, etc.
Chairwoman: Lisa Giachetti
- What criteria are used to select hardware - price, price/performance, compatibility with another site, in-house expertise, future evolution of the architecture, network interconnect, etc.? Obviously all of these may play a role; which are the 3 most important, in order of significance?
- Do you perform your own benchmarking of equipment?
- How do you handle life-cycles of the hardware? For example, the evolution of Pentium processors where later configurations and generations may need a new system image.
- Have you experience, positive or negative, with heterogeneous clusters?
12:00 Lunch
1:00 Parallel Sessions I
Parallel Session A1 - Cluster Design, Configuration Management
Chairman - Thomas Davis (LBNL)
- Do you use modeling tools to design the cluster?
- Do you use a formal database for configuration management?
Parallel Session B1 - Data Access, Data Movement
Chairman: Don Petravick (FNAL)
- James Cuff (Sanger Inst)
- Doug Thain (Wisconsin) (Condor I/O)
- Kors Bos Talk (NIKHEF)
- Chris Dwan Talk (Minnesota)
- Size of the data store and tools in use
3:00 Break
3:30 Parallel Sessions II
Parallel Session A2 - Installation, Upgrading, Testing
Chairman: Steven Timm (FNAL)
- Atsushi Manabe Talk (KEK) (Dolly++)
- Philip Papadopoulos (San Diego) (Rocks)
- Jarek Polok (CERN) (Work in progress for DataGRID)
- Do you buy in installation services? From the supplier or a third-party vendor?
- Do you buy pre-configured systems or build your own configuration?
- Do you upgrade the full cluster at one time or in rolling mode?
- Do you perform formal acceptance or burn-in tests?
Parallel Session B2 - CPU and Resource Allocation
Chairman - Jim Amundson
- David Bigagli
- Charles Young (BaBar)
- Batch queueing system in use?
- Turnaround guarantees?
- Pre-allocation of resources?
5:30 End of Day 2 Working Sessions
7:00 Social Event
9:00 - 10:30 Perspectives from Smaller Sites
Chairman - Wolfgang von Rueden (CERN)
10:50 Panel on Software Issues
Tool Selection Criteria, Tool Evaluation, etc.
Chairman: Ian Bird
- Derek Wright (Wisconsin) (Installing, Configuring and Monitoring a Condor Pool)
- Ruth Pordes (FNAL)
- How do you select software tools? By reputation, from conference reports, after in-house evaluation, by personal experience, etc. Obviously all of these may play a role; which are the 3 most important, in order of significance?
- Do you trade off personnel costs against the cost of acquiring commercial tools?
12:00 Lunch
1:00 - 3:00 Parallel Sessions III
Parallel Session A3 - Monitoring
Chairman: Olof Barring
- Do you monitor services or servers? In other words, do you monitor that a service is being delivered or that a particular piece of hardware or software is faulty?
Parallel Session B3 - User Issues, Security
Chairwoman: Ruth Pordes
- Mark Kaletka Talk (FNAL)
- Chris Dwan (Minnesota)
- Do you have written policies for users - non-abuse of the system, the right to check e-mail, the right to enforce password rules?
- Do you have a dedicated security team?
- Do you permit access from off-site? Do you enforce rules for this?
3:00 Break
3:30 - 5:30 Parallel Sessions IV
Parallel Session A4 - GRID computing
Chairman - Chuck Boeheim (SLAC)
Parallel Session B4 - Application Environment, Load Balancing, Job and Queue Mgmt
Chairman: Tim Smith (CERN)
- David Bigagli (Platform)
- Jeff Tseng (MIT) (Run 2 CDF Level 3 Trigger Online Cluster)
- Tim Smith (CERN) Talk
- What kind of applications run on the cluster?
- Does the cluster support both interactive and batch jobs?
- Is load balancing automatic or manual?
5:30 End of Day 3 Working Sessions
9:00 - 10:20 Summaries of 8 panels (A1-4, B1-4), 10 mins each
Chairman: Alan Silverman
Session A1 Summary (ppt) (HTML)
Session B1 Summary
Session A2 Summary
Session B2 Summary
Session A3 Summary
Session B3 Summary
Session A4 Summary
Session B4 Summary
10:20 Break
10:40 Summary from the 5th Conference on Distributed SuperComputing - Greg Lindahl and Neil Pundit
Chairman - Dane Skow (FNAL)
12:00 Official Closing of Workshop
Last Updated by AGS, 13 Sep, 2001