30 Jan, 2026

What Happens When One Node Goes Down?

In an Oracle RAC (Real Application Clusters) environment, node failure is not an exception; it is an expected scenario. RAC is designed for high availability, meaning the database should continue running even if one node goes down.

This post explains, step by step, what actually happens when one node goes down, from the moment of failure to full stabilization, in simple and practical DBA language.

What Does “One Node Goes Down” Mean?

A node can go down due to several reasons:

Server crash or power failure
OS hang or kernel panic
Network failure
Manual shutdown or reboot
Hardware issues (CPU, RAM, disk)

In RAC terms, this means:

One instance and its local services are no longer available

But the database itself is NOT down.

Immediate Detection by Oracle Clusterware

Oracle Clusterware continuously monitors all nodes using:

Voting disks
Private interconnect

What happens first?

Surviving nodes stop receiving heartbeat from the failed node
Clusterware confirms the node is unreachable
Node is declared evicted / down

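With default settings this takes about 30 seconds, which is the CSS misscount timeout for the network heartbeat. The current value can be checked with crsctl get css misscount.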
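
The detection logic above can be pictured as a simple timeout check. The following is a minimal sketch, not Clusterware code: the HeartbeatMonitor class, the node names, and the one-heartbeat-per-second rhythm are all hypothetical, and the 30-second timeout is an assumption based on the usual default for CSS misscount.

```python
from dataclasses import dataclass, field

# Assumed timeout in seconds (the usual default for CSS misscount).
MISSCOUNT_SECONDS = 30


@dataclass
class HeartbeatMonitor:
    """Toy model of timeout-based failure detection (hypothetical, not Clusterware)."""
    timeout: float = MISSCOUNT_SECONDS
    last_seen: dict = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Remember the last time a heartbeat arrived from this node.
        self.last_seen[node] = now

    def unreachable_nodes(self, now: float) -> list:
        # A node is unreachable once its silence exceeds the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


monitor = HeartbeatMonitor()
for node in ("node1", "node2"):
    monitor.record_heartbeat(node, now=0.0)

# node2 goes silent after t=0; node1 keeps sending one heartbeat per second.
evicted_at = None
for t in range(1, 61):
    monitor.record_heartbeat("node1", now=float(t))
    if monitor.unreachable_nodes(now=float(t)):
        evicted_at = t
        break

print(evicted_at, monitor.unreachable_nodes(now=float(evicted_at)))
```

In this toy run node2 is flagged at the first check after its silence exceeds the timeout. Real Clusterware also uses the voting disk heartbeat before deciding which node to evict, which this sketch leaves out.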

Instance on Failed Node Terminates

Once the node is marked down:

The Oracle instance running on that node is terminated
All background processes (PMON, SMON, DBWR, LGWR) stop
Memory (SGA) on that node is lost

Any active sessions on that node are disconnected immediately.

What Happens to User Sessions?

Sessions on Failed Node

All sessions connected to that instance are terminated
Uncommitted transactions are rolled back

Sessions on Other Nodes

Sessions on remaining nodes continue normally
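A short pause is possible while the cluster reconfigures and instance recovery runs

If Application Continuity (AC) or TAF is configured:

Sessions can reconnect automatically to a surviving instance
With TAF, interrupted SELECT statements can resume, but uncommitted DML is not replayed
With AC, in-flight transactions can be replayed, provided the requests are replayable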
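
Without AC or TAF, the application has to handle the lost session itself. Below is a minimal sketch of a client-side retry loop. Everything in it is hypothetical: InstanceDown stands in for whatever lost-connection error a real driver raises, and fake_connect and fake_work simulate a first attempt that lands on the failed instance.

```python
import time


class InstanceDown(Exception):
    """Hypothetical stand-in for a driver's lost-connection error."""


def run_with_reconnect(connect, work, retries=3, delay=0.0):
    """Open a session and run one unit of work, retrying on a lost instance."""
    last_error = None
    for _ in range(retries):
        try:
            session = connect()
            return work(session)
        except InstanceDown as exc:
            # The database has already rolled back the uncommitted work,
            # so the whole unit of work is attempted again from the start.
            last_error = exc
            time.sleep(delay)
    raise last_error


attempts = []


def fake_connect():
    # First attempt lands on the failed instance, later ones on a survivor.
    node = "node2" if not attempts else "node1"
    attempts.append(node)
    return node


def fake_work(session):
    if session == "node2":
        raise InstanceDown("instance on node2 terminated")
    return f"committed on {session}"


result = run_with_reconnect(fake_connect, fake_work)
print(result)  # committed on node1
```

A blind retry like this is only safe when the unit of work is idempotent or when the application knows the original commit did not succeed. That uncertainty is the problem Application Continuity is meant to take off the application's hands.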

Global Cache Service (GCS) & Global Enqueue Service (GES)

These two services play a critical role after node failure.

GCS Actions

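Remasters the cache resources that the failed instance was mastering
Identifies blocks the failed instance had changed but not yet written to disk, so they can be recovered from redo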
Ensures data consistency

GES Actions

Releases locks held by failed instance
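Keeps surviving sessions from waiting on locks that no longer have an owner

This cleanup of global resources happens during cluster reconfiguration, and it leads directly into:

Instance Recovery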

Instance Recovery on Surviving Node(s)

One of the surviving instances automatically performs instance recovery.

During Instance Recovery:

The redo thread of the failed instance is read
Changes recorded in redo but not yet written to the datafiles are applied (roll forward)
Uncommitted transactions are then rolled back using undo

Data consistency is fully restored

Duration depends on:

Number of active transactions
Redo generated
System load
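
The two phases of instance recovery, roll forward and then roll back, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the recover function, the dictionary "datafile", and the tuple-based redo records are hypothetical. A real database takes the old values from undo segments instead of collecting them while applying redo.

```python
def recover(datafile, redo, committed):
    """Toy two-phase recovery: roll forward all redo, then undo uncommitted work."""
    blocks = dict(datafile)
    undo = []  # (txn, key, old_value), a stand-in for undo segment contents

    # Phase 1, roll forward: apply every redo record in order,
    # whether or not its transaction committed.
    for txn, key, new_value in redo:
        undo.append((txn, key, blocks.get(key)))
        blocks[key] = new_value

    # Phase 2, roll back: reverse the changes of uncommitted transactions,
    # newest change first.
    for txn, key, old_value in reversed(undo):
        if txn not in committed:
            if old_value is None:
                blocks.pop(key, None)
            else:
                blocks[key] = old_value
    return blocks


datafile = {"A": 100, "B": 200}
redo = [("T1", "A", 150), ("T2", "B", 50), ("T1", "A", 175)]
recovered = recover(datafile, redo, committed={"T1"})
print(recovered)  # {'A': 175, 'B': 200}
```

T1 committed before the crash, so its final value for A survives. T2 never committed, so B returns to its old value.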

Services Failover

Oracle RAC services are configured with preferred and available instances.

When a node goes down:

Services running on failed node are relocated
They start on surviving node(s)

New connections that request the service through the SCAN listeners are routed to the surviving instances.
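
The preferred/available idea can be sketched as a small placement function. This is a simplified, hypothetical model: place_service and the instance names orcl1 to orcl3 are made up for illustration, and real placement is decided by Clusterware according to the service configuration.

```python
def place_service(preferred, available, up):
    """Return the instances a service would run on in this simplified model."""
    # The service stays on every preferred instance that is still up.
    running = [i for i in preferred if i in up]
    # Each failed preferred instance is replaced by one spare available instance.
    spares = [i for i in available if i in up and i not in running]
    missing = len(preferred) - len(running)
    return running + spares[:missing]


preferred = ["orcl1", "orcl2"]
available = ["orcl3"]

before = place_service(preferred, available, up={"orcl1", "orcl2", "orcl3"})
after = place_service(preferred, available, up={"orcl1", "orcl3"})
print(before)  # ['orcl1', 'orcl2']
print(after)   # ['orcl1', 'orcl3']
```

When orcl2 fails, the service keeps running on orcl1 and starts on the available instance orcl3.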

What Happens to SCAN and Listeners?

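SCAN listeners remain available; one that was running on the failed node is relocated, together with its SCAN VIP, to a surviving node
Local listener on failed node stops
SCAN redirects new connections to healthy nodes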

End users usually don’t notice anything except a brief reconnect.

ASM Behavior During Node Failure

If ASM is used:

ASM instance on failed node goes down
ASM on surviving node continues
Disks remain accessible

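Because the disks sit on shared storage, a node failure by itself does not take any disk offline:

No data loss, whatever the disk group redundancy
ASM rebalance is not triggered by the node failure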

Alerts and Logs Generated

As a DBA, you’ll see alerts in:

Clusterware alert log
Database alert log
CRS logs
OEM alerts (if configured)

Typical messages:

Node eviction detected
Instance terminated
Instance recovery completed

What the DBA Should Check After Node Failure

Immediate Checks

crsctl stat res -t
olsnodes -n -s
srvctl status database -d <db_unique_name> (database instance status)
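
These checks are easy to script. The sketch below parses the output of olsnodes -n -s and lists nodes that are not active. The sample text is illustrative, not captured from a real cluster; it assumes each line holds a node name, node number, and status separated by whitespace. The helper names parse_olsnodes and inactive_nodes are hypothetical.

```python
# Illustrative sample only; the layout is an assumption about olsnodes -n -s output.
SAMPLE_OUTPUT = """\
node1   1   Active
node2   2   Inactive
"""


def parse_olsnodes(output):
    """Turn 'name number status' lines into a dict keyed by node name."""
    nodes = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip anything that does not look like a node line.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        name, number, status = parts
        nodes[name] = {"number": int(number), "status": status}
    return nodes


def inactive_nodes(nodes):
    """Names of nodes whose status is anything other than Active."""
    return sorted(n for n, info in nodes.items() if info["status"].lower() != "active")


nodes = parse_olsnodes(SAMPLE_OUTPUT)
print(inactive_nodes(nodes))  # ['node2']
```

On a real cluster the text could come from subprocess.run(["olsnodes", "-n", "-s"], capture_output=True, text=True).stdout, run as a user with the Grid Infrastructure environment set.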

Logs to Review

CRS alert log
Database alert log
OS logs

Recovery Actions

Fix OS / network / hardware issue
Restart node
Verify services placement

Key Points to Remember (Interview Gold)

RAC is designed to survive node failures
Database remains available
Only sessions on failed node are affected
Instance recovery is automatic
No manual DBA intervention required (usually)

Final Summary

When one node goes down in Oracle RAC:

Clusterware detects failure
Instance terminates
Sessions on that node are lost
Other nodes continue working
Services failover automatically
Data consistency is maintained

This is true high availability in action.

Learning Oracle RAC doesn’t have to be complicated. At Learnomate Technologies, we focus on clear explanations, hands-on learning, and real DBA scenarios that actually happen in production.
