NOTE: This is from the book Pro Oracle Database 11g RAC on Linux.
Oracle RAC—though branded as an entirely new product when released with Oracle 9i Release 1—
has a long track record. Initially known as Oracle Parallel Server (OPS), it was introduced with Oracle
6.0.35, which was eventually renamed Oracle 6.2. OPS was based on the VAX/VMS distributed lock
manager (DLM) because VAX/VMS machines were essentially the only clustered computers at the time;
however, that DLM proved too slow for OPS due to internal design limitations, so Oracle
development wrote its own distributed lock manager, which saw the light of day with Oracle 6.2.
The OPS code matured well over time in the Oracle 7, 8, and 8i releases. You can read a remarkable
story about the implementation of OPS in Oracle Insights: Tales of the Oak Table (Apress, 2004).
Finally, with the advent of Oracle 9.0.1, OPS was relaunched as Real Application Clusters, and it
has not been renamed since. Oracle was available on the Linux platform prior to 9i Release 1, but at that
time no standard enterprise Linux distributions as we know them today were available. Linux, even
though technically mature by then, was still perceived to be lacking in commercial support, so vendors
such as Red Hat and SuSE released road maps and supported enterprise distributions alongside their
community versions. By 2001, these platforms had emerged as stable and mature, justifying the
investment by Oracle and other big
software players, who recognized the potential behind the open source operating system. Because it
runs on almost all hardware, but most importantly on industry-standard components, Linux offers a
great platform and cost model for running OPS and RAC.
When the name was changed from OPS to RAC, marketing material suggested that RAC was an
entirely new product. In reality, it was not; portions of its code were carried over from previous
Oracle releases.
That said, there was a significant change between RAC and OPS in the area of cache coherency. The
basic dilemma any shared-everything architecture has to solve is how to coordinate concurrent access
to the same block: no two processes can be allowed to modify a block at the same time; otherwise, a
split-brain situation would arise. One approach to solving this problem is simply to serialize access
to each block.
However, that would lead to massive contention, and it wouldn’t scale at all. So Oracle’s engineers
decided to coordinate multiple versions of a block in memory across different instances. At the time,
parallel cache management was used in conjunction with a number of background processes (most
notably the distributed lock manager, DLM). Oracle ensured that a particular block could only be
modified by one instance at a time, using an elaborate system of locks. For example, if instance B needed
a copy of a block that instance A had modified, then the dirty block had to be written to disk by instance A before
instance B could read it. This was called block pinging, which tended to be slow because it involved disk
activity. Therefore, avoiding or reducing block pinging was one of Oracle’s design goals when tuning and
developing OPS applications; a lot of effort was spent on ensuring that applications connecting to OPS
changed only their own data.
The introduction of Cache Fusion phase I in Oracle 8i proved a significant improvement. Block
pings were no longer necessary for consistent read blocks and read-only traffic. However, they were still
needed for current reads. The Cache Fusion architecture reduced the need to partition the workload
across instances. The Oracle 8.1.5 "New Features" guide describes the changes to interinstance traffic as follows:
“… a new diskless ping architecture, called cache fusion, that provides copies of blocks
directly from the holding instance’s memory cache to the requesting instance’s
memory cache. This functionality greatly improves interinstance communication.
Cache fusion is particularly useful for databases where updates and queries on the
same data tend to occur simultaneously and where, for whatever reason, the data and
users have not been isolated to specific nodes so that all activity can take place on a
single instance. With cache fusion, there is less need to concentrate on data or user
partitioning by instance.”
In Oracle 9i Release 1, Oracle finally implemented Cache Fusion phase II, which uses a high-speed
interconnect to provide cache-to-cache block transfers between instances, eliminating the disk I/O of
block pinging and optimizing read/write concurrency. Blocks could now be shipped across the
interconnect for both current and consistent reads.
Oracle addressed two general weaknesses of its Linux port with RAC 9.0.1: previous versions lacked
a cluster manager and a cluster file system. With Oracle 9i, Oracle shipped its own cluster manager,
called OraCM, for Linux and Windows NT (all other platforms used a third-party cluster manager). OraCM
provided a global view of the cluster and all nodes in it. It also controlled cluster membership, and it
needed to be installed and configured before the actual binaries for RAC could be deployed.
Cluster configuration was stored in a server-management file on shared storage, and cluster
membership was determined by using a quorum file or partition (also on shared storage).
Oracle also initiated the Oracle Cluster File System (OCFS) project for Linux 2.4 kernels
(OCFS2 was subsequently developed for 2.6 kernels; see below); this file system is released under
the GNU General Public License (GPL). The first version of OCFS was not POSIX compliant; nevertheless,
it allowed users to store Oracle database files such as control files, online redo logs, and data files.
However, it was not possible to store any Oracle binaries in OCFS for shared Oracle homes. OCFS
partitions are configured just like normal file systems in the /etc/fstab configuration file. Equally,
they are reported like an ordinary mount point in the output of the mount command. The main drawback
was the inherent fragmentation, which could not be undone except by reformatting the file system.
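For illustration, an OCFS volume was mounted through /etc/fstab much like any local file system. This is a minimal sketch; the device name, mount point, and options here are hypothetical and varied between OCFS releases:

    # hypothetical /etc/fstab entry for an OCFS volume holding database files
    /dev/sdb1   /u02/oradata   ocfs   _netdev   0 0

The _netdev option simply defers mounting until networking is available at boot.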
With the release of Oracle 10.1, Oracle delivered significant improvements in cluster manageability,
many of which have already been discussed. Two of the main new features were Automatic Storage
Management and Cluster Ready Services (which was renamed to Clusterware with 10.2 and 11.1, and is
now called Grid Infrastructure). The OraCM cluster manager, which was available for Linux and
Windows NT only, was replaced by Cluster Ready Services, which offers the same
"feel" for RAC on every platform. The server-management file was replaced by the Oracle Cluster
Registry, whereas the quorum disk is now known as the voting disk. With 10g Release 2, voting disks
could be stored at multiple locations to provide further redundancy in case of logical file corruption. In
10.1, the files could only reside on raw devices; since 10.2, they can be moved to block devices, as well.
The Oracle 11.1 installer finally allows the placement of the Oracle Cluster Registry and voting disks on
block devices without also having to use raw devices. Raw devices have been deprecated in the Linux
kernel in favor of the O_DIRECT flag. With Grid Infrastructure 11.2, the voting disk and cluster registry
should be stored in ASM, and they are only allowed on block/raw devices during the migration phase.
ASM is a clustered logical volume manager that’s available on all platforms and is Oracle’s preferred
storage option—in fact, you have to use ASM with RAC Standard Edition.
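On a running cluster, you can check where the voting disks and the OCR actually live; a quick sketch, assuming an 11.2 Grid Infrastructure installation (run as the Grid Infrastructure owner, or as root for a full OCR check):

    # list the configured voting disk locations
    crsctl query css votedisk

    # report the OCR location and verify its integrity
    ocrcheck

On an installation following the 11.2 recommendation, both utilities report locations inside an ASM disk group.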
In 2005, Oracle released OCFS2, which was finally POSIX compliant and much more feature-rich.
It is possible to install Oracle binaries on OCFS2, but the binaries have to reside on a different
partition from the data files because different mount options are required. It is no longer possible to
install Grid Infrastructure, the successor to Clusterware, as a shared Oracle home on OCFS2; however, it
is possible to install the RDBMS binaries on OCFS2 as a shared Oracle home.
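The different mount options are visible in /etc/fstab: OCFS2 volumes holding database files were historically mounted with the datavolume and nointr options, while a volume for a shared Oracle home used the defaults. A sketch with hypothetical device names and mount points (the exact options depend on the OCFS2 release):

    # volume for database files: datavolume enforces direct I/O semantics
    /dev/sdc1   /u02/oradata      ocfs2   _netdev,datavolume,nointr   0 0

    # volume for a shared RDBMS home: default options suffice
    /dev/sdd1   /u01/app/oracle   ocfs2   _netdev                     0 0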
Since the introduction of RAC, we've seen a gradual shift from SMP servers to hardware based
on the industry-standard x86 and x86-64 architectures. Linux has seen great acceptance in the industry,
and it keeps growing, taking market share mainly from the established UNIX systems, such as IBM’s AIX,
HP-UX, and Sun Solaris. With the reduced costs of both the hardware and the operating system,
RAC is an increasingly viable option for businesses.