Grids

1. Understanding Grids

Grids provide a distributed system for sharing computational and data resources across multiple administrative domains. They enable high-performance computing (HPC) applications to leverage geographically distributed resources efficiently.

1.1 Grid Applications

Grid computing is well suited to computation-intensive applications such as weather modeling, biological simulations, and physics experiments. An example is the Regional Atmospheric Modeling System (RAMS), which can use a grid to predict weather phenomena by running its models on many processors.

Characteristics:
- High computation-to-data ratio
- Distributed resource utilization

1.2 Grid Infrastructure

Grids consist of resources contributed by various sites, such as university clusters or supercomputing centers. A resource may be a workstation that is used only while idle or a dedicated server that is available continuously.

1.2.1 Directed Acyclic Graph (DAG) for Jobs

Grid applications are structured as DAGs, where:

- Each node represents a job
- Edges indicate dependencies between jobs

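Each job is often embarrassingly parallel, meaning it can be split into tasks that execute independently on different machines. The DAG constrains only the order in which jobs run. For example:

Job 0 → Job 1 and Job 2 → Job 3

Here, Job 1 and Job 2 both depend on Job 0 and can execute simultaneously once it completes. Job 3 starts only after both Job 1 and Job 2 finish.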
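The ordering in this example can be sketched in Python. This is a minimal illustration, not the algorithm of any particular grid scheduler: the function `schedule_waves` and the job names are hypothetical, and dependencies are assumed to be given as a mapping from each job to the set of jobs it waits on.

```python
def schedule_waves(deps):
    """Group jobs into waves; every job in a wave has all its dependencies done.

    Assumes every job, including those without dependencies, appears as a key.
    """
    remaining = {job: set(parents) for job, parents in deps.items()}
    waves = []
    while remaining:
        # Jobs with no unfinished dependencies can all run at the same time.
        ready = sorted(job for job, parents in remaining.items() if not parents)
        if not ready:
            raise ValueError("dependency cycle detected; the graph is not a DAG")
        waves.append(ready)
        for job in ready:
            del remaining[job]
        # Mark the jobs that just ran as finished for everything still waiting.
        for parents in remaining.values():
            parents.difference_update(ready)
    return waves


# The four-job example: Job 1 and Job 2 depend on Job 0, Job 3 on both.
deps = {
    "job0": set(),
    "job1": {"job0"},
    "job2": {"job0"},
    "job3": {"job1", "job2"},
}
print(schedule_waves(deps))  # [['job0'], ['job1', 'job2'], ['job3']]
```

Each inner list is a set of jobs that could be dispatched to different machines at once. The second wave shows Job 1 and Job 2 running side by side.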

2. Scheduling in Grids

Efficient scheduling is crucial to optimize resource utilization and reduce execution time.

2.1 Intrasite Scheduling

Intrasite scheduling handles job scheduling within a single site. Schedulers such as HTCondor manage task distribution, resource allocation, and fault recovery.

Key Features:

- Monitoring and restarting failed tasks
- Data staging within the site
- Task execution on idle workstations or dedicated servers
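The first feature, monitoring and restarting failed tasks, can be sketched as a simple retry loop. This is an assumption-laden toy rather than HTCondor's actual interface: `run_with_restarts`, `make_flaky`, and the `max_attempts` limit are hypothetical names, and failures are simulated with exceptions.

```python
def make_flaky(failures_before_success, value):
    """Hypothetical task that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def task():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated machine failure")
        return value

    return task


def run_with_restarts(tasks, max_attempts=3):
    """Run each task, restarting it after a failure up to max_attempts times."""
    results = {}
    for name, task in tasks.items():
        last_error = None
        for attempt in range(1, max_attempts + 1):
            try:
                results[name] = ("ok", task(), attempt)
                break
            except Exception as exc:  # a real scheduler would inspect the cause
                last_error = exc
        else:
            # The loop finished without a successful run.
            results[name] = ("failed", str(last_error), max_attempts)
    return results


tasks = {
    "stable": make_flaky(0, 10),   # succeeds on the first attempt
    "flaky": make_flaky(2, 20),    # succeeds on the third attempt
    "broken": make_flaky(5, 30),   # never succeeds within three attempts
}
for name, outcome in run_with_restarts(tasks).items():
    print(name, outcome)
```

A production scheduler would also move a restarted task to a different machine and stage its data again. The sketch reruns the task in place to keep the control flow visible.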

2.2 Intersite Scheduling

Intersite scheduling manages resource allocation across multiple sites. Toolkits such as Globus handle job distribution and file transfers between sites.

Key Responsibilities:

- Staging in and out of files
- Interoperability with local schedulers
- Resource delegation across federated sites
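The stage-in, run, stage-out pattern can be sketched with local directories standing in for sites. This is only an illustration under stated assumptions: `run_at_site` and `count_lines` are hypothetical, and plain file copies replace the wide-area transfers that a real grid would perform with a tool such as GridFTP.

```python
import shutil
import tempfile
from pathlib import Path


def count_lines(input_path):
    """Hypothetical job: write the input's line count next to the input file."""
    output_path = input_path.with_suffix(".count")
    line_count = len(input_path.read_text().splitlines())
    output_path.write_text(str(line_count))
    return output_path


def run_at_site(input_path, site_dir, home_dir, job):
    """Stage a file in to the site, run the job there, and stage the result out."""
    staged_input = Path(shutil.copy(input_path, site_dir))  # stage in
    site_output = job(staged_input)                         # run at the site
    return Path(shutil.copy(site_output, home_dir))         # stage out


# Two temporary directories play the roles of the home site and a remote site.
with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as site:
    data = Path(home) / "observations.txt"
    data.write_text("a\nb\nc\n")
    result = run_at_site(data, site, home, count_lines)
    print(result.name, result.read_text())  # observations.count 3
```

The job itself only ever touches files inside the site directory, which is the point of staging: the computation does not need direct access to the submitting site's storage.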

2.2.1 Globus Toolkit

The Globus Toolkit is an open-source toolkit that provides tools for intersite grid management.

Components:

- GridFTP: Transfers large datasets between sites.
- GRAM: Submits, monitors, and manages jobs on remote grid resources.
- Replica Location Service (RLS): Maps logical file names to the physical locations of their replicas.
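The mapping that RLS maintains can be pictured as a catalog from one logical name to many physical copies. The sketch below is a toy in-memory model, not the RLS client API: the `ReplicaCatalog` class, the logical file name, and the `example.org` hosts are all hypothetical.

```python
class ReplicaCatalog:
    """Toy catalog mapping a logical file name to its physical replica URLs."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, physical_url):
        """Record one more physical copy of a logical file."""
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def lookup(self, logical_name):
        """Return all known physical locations, or an empty list if none."""
        return list(self._replicas.get(logical_name, []))


catalog = ReplicaCatalog()
# Hypothetical replicas of the same dataset held at two different sites.
catalog.register("climate/run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
catalog.register("climate/run42.dat", "gsiftp://site-b.example.org/store/run42.dat")

print(catalog.lookup("climate/run42.dat"))
print(catalog.lookup("climate/missing.dat"))  # []
```

With such a mapping, a scheduler can pick the replica closest to the site where a job will run and pass that URL to the transfer service.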

3. Security in Grid Computing

Security is critical because grids federate resources across independent administrative domains. The Grid Security Infrastructure (GSI) provides user authentication and protects the integrity of data exchanged between sites.

Key Features:

- Single sign-on for seamless user access
- Support for local security protocols (e.g., Kerberos, UNIX authentication)
- Delegation and third-party authentication

4. Comparing Grid and Cloud Computing

Grids and clouds both provide distributed computing resources, but they differ in who controls the resources and in their typical use cases.

- Grid: Federated resources across organizations. Emphasizes computation-intensive tasks.
- Cloud: Centrally managed resources. Focuses on scalability and on-demand resource allocation.