Grids - DMJCCLT - dmj.one

Grids

1. Understanding Grids

Grids provide a distributed system for sharing computational and data resources across multiple administrative domains. They enable high-performance computing (HPC) applications to leverage geographically distributed resources efficiently.

1.1 Grid Applications

Grid computing is well suited to computation-intensive applications such as weather modeling, biological simulations, and physics experiments. One example is the Regional Atmospheric Modeling System (RAMS), which uses grids to predict weather phenomena by running its models across many processors.

1.2 Grid Infrastructure

Grids consist of resources contributed by various sites, such as university clusters or supercomputing centers. A resource can be a workstation that is used only while it is idle or a dedicated server that is available continuously.

1.2.1 Directed Acyclic Graph (DAG) for Jobs

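Grid applications are often structured as DAGs of jobs: each node is a job and each edge is a dependency, so a job starts only after the jobs it depends on have finished.

Each job is typically embarrassingly parallel, meaning it can be split into tasks that execute independently on different machines. For example: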

Job 0 → Job 1 and Job 2 → Job 3

Here, Job 1 and Job 2 can execute simultaneously once Job 0 finishes, and Job 3 starts only after both have completed.
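The ordering in this example can be derived mechanically from the dependencies. The following is a minimal Python sketch, not taken from any grid scheduler: the function name `execution_waves` and the `deps` dictionary are hypothetical names chosen for illustration. It groups jobs into "waves", where every job in a wave has all of its dependencies satisfied and can run in parallel with the others in that wave.

```python
def execution_waves(deps):
    """Group jobs into waves that can run in parallel.

    deps maps each job name to the set of jobs it depends on.
    Every job must appear as a key, even if it has no dependencies.
    """
    # Copy the dependency sets so the caller's data is left untouched.
    remaining = {job: set(d) for job, d in deps.items()}
    waves = []
    while remaining:
        # A job is ready when none of its dependencies are still pending.
        ready = sorted(job for job, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cycle detected: the job graph is not a DAG")
        waves.append(ready)
        for job in ready:
            del remaining[job]
        # Mark the finished jobs as satisfied for everything still waiting.
        for d in remaining.values():
            d.difference_update(ready)
    return waves


# The four-job example: Job 0 -> (Job 1, Job 2) -> Job 3
deps = {
    "Job 0": set(),
    "Job 1": {"Job 0"},
    "Job 2": {"Job 0"},
    "Job 3": {"Job 1", "Job 2"},
}

for number, wave in enumerate(execution_waves(deps)):
    print(f"wave {number}: {', '.join(wave)}")
```

For the example graph this yields three waves: Job 0 alone, then Job 1 and Job 2 together, then Job 3.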

2. Scheduling in Grids

Efficient scheduling is crucial to optimize resource utilization and reduce execution time.

2.1 Intrasite Scheduling

Intrasite scheduling handles job scheduling within a single site. Workload managers like HTCondor manage task distribution, resource allocation, and fault recovery.
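To make task distribution and fault recovery concrete, here is a toy Python simulation of a site-level scheduler. It is only a sketch under stated assumptions and does not use or imitate HTCondor's actual interfaces: the function `run_site`, the round-based model, and the `failures` parameter are all hypothetical. Each round, one queued task is handed to each machine; a task whose run is lost (for instance because a workstation's owner returned) goes back on the queue and is retried later.

```python
from collections import deque


def run_site(tasks, machines, failures=()):
    """Simulate scheduling the tasks of one job on the machines of one site.

    tasks:    list of task names.
    machines: list of machine names, all assumed idle at the start.
    failures: (task, machine) pairs whose first attempt is lost
              (an assumption used here to simulate a fault).
    Returns a list of (round, task, machine, outcome) records.
    """
    queue = deque(tasks)
    pending_failures = set(failures)
    log = []
    round_no = 0
    while queue:
        round_no += 1
        # Hand one queued task to each machine for this round.
        assigned = []
        for machine in machines:
            if not queue:
                break
            assigned.append((queue.popleft(), machine))
        for task, machine in assigned:
            if (task, machine) in pending_failures:
                # Fault recovery: the lost task is put back on the queue.
                pending_failures.discard((task, machine))
                queue.append(task)
                log.append((round_no, task, machine, "requeued"))
            else:
                log.append((round_no, task, machine, "done"))
    return log


log = run_site(["t0", "t1", "t2"], ["m0", "m1"], failures={("t1", "m1")})
for record in log:
    print(record)
```

In this run, t1 is lost on m1 in the first round, is requeued behind t2, and completes in the second round.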

Key Features:

2.2 Intersite Scheduling

Intersite scheduling manages resource allocation across multiple sites. Toolkits like Globus handle job distribution and file transfers between sites.
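One way to see why job distribution and file transfers are handled together is a placement decision that accounts for where the input data already lives. The Python sketch below is purely illustrative and is not how Globus works internally: `pick_site`, the site dictionary layout, and the "fewest bytes to move" rule are all assumptions made for this example.

```python
def pick_site(job_inputs, sites):
    """Choose a site for a job, preferring the one that needs the least data moved.

    job_inputs: dict mapping input file name -> size in bytes.
    sites:      dict mapping site name -> {"files": set of file names already
                stored there, "free_slots": number of idle execution slots}.
    Returns (site_name, bytes_to_transfer), or None if no site has a free slot.
    """
    candidates = []
    for name, info in sites.items():
        if info["free_slots"] <= 0:
            continue  # the site is fully busy, so skip it
        # Bytes that would have to be copied to this site before the job runs.
        missing = sum(size for f, size in job_inputs.items()
                      if f not in info["files"])
        # Sort key: fewest bytes first, then most free slots, then name.
        candidates.append((missing, -info["free_slots"], name))
    if not candidates:
        return None
    missing, _, name = min(candidates)
    return name, missing


job_inputs = {"a.dat": 100, "b.dat": 50}
sites = {
    "site_a": {"files": {"a.dat"}, "free_slots": 2},
    "site_b": {"files": {"b.dat"}, "free_slots": 5},
    "site_c": {"files": {"a.dat", "b.dat"}, "free_slots": 0},
}
choice = pick_site(job_inputs, sites)
print(choice)
```

Here site_c holds all the data but has no free slot, so the job goes to site_a, which only needs b.dat (50 bytes) copied over.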

Key Responsibilities:

2.2.1 Globus Toolkit

The Globus Toolkit is an open-source toolkit that provides tools for intersite grid management.

Components:

3. Security in Grid Computing

Security is critical because grids federate resources across multiple administrative domains. Mechanisms like the Grid Security Infrastructure (GSI) provide user authentication and protect data integrity.

Key Features:

4. Comparing Grid and Cloud Computing

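Both provide distributed computing resources, but they differ in who controls the resources and in their typical use cases: a grid federates resources owned by multiple administrative domains, whereas a cloud is typically operated by a single provider.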