Archive

Archive for the ‘Oracle DB’ Category

resolved – ORA-12578: TNS:wallet open failed

September 1st, 2015 Comments off

If you met error like "ORA-12578: TNS:wallet open failed", then one possibility is that the Oracle RAC Database is using a local wallet(created with parameter -auto_login_local, which is from 11.2 release, usually local wallet is used in a highly confidential system) but the wallet is migrated from another server.

The migrated local wallet can be opened and read without problems on the new host, but the information inside does not match the hostname and this leads to the error ORA-12578: TNS:wallet open failed. Be noted that even on the original host, the wallet cannot be used by another OS user.

Master encryption key is stored in wallet in TDE(transparent data encryption), it's the key that wraps(encrypts) the Oracle TDE columns and tablespace encryption keys. The wallet must be open before you can create the encrypted tablespace and before you can store or retrieve encrypted data. Also when recovering a database with encrypted tablespaces (for example after a SHUTDOWN ABORT or a catastrophic error that brings down the database instance), you must open the Oracle wallet after database mount and before database open, so the recovery process can decrypt data blocks and redo. When you open the wallet, it is available to all session, and it remains open until you explicitly close it or until the database is shut down.

Tablespace encryption encrypts at the physical block level, can perform better than encrypting many columns. When using column encryption for tables,  there is only one table key regardless of the number of encrypted columns in a table, and the table key is stored in data dictionary. And when using tablespace encryption, the tablespace key is stored in the header of each datafile of the encrypted tablespace.

Below is from here:

TDE uses a two tier key mechanism. When TDE column encryption is applied to an existing application table column, a new table key is created and stored in the Oracle data dictionary. When TDE tablespace encryption is used, the individual tablespace keys are stored in the header of the underlying OS file(s). The table and tablespace keys are encrypted using the TDE master encryption key. The master encryption key is generated when TDE is initialized and stored outside the database in the Oracle Wallet. Both the master key and table keys can be independently changed (rotated, re-keyed) based on company security policies. Tablespace keys cannot be re-keyed (rotated); work around is to move the data into a new encrypted tablespace. Oracle recommends backing up the wallet before and after each master key change.

Categories: Databases, IT Architecture, Oracle DB Tags:

resolved – ORA-27303: additional information: Invalid protocol requested (2) or protocol not loaded

August 6th, 2015 Comments off

Today when I tried to start up crs after patching(crsctl start crs), the following error occurred in /u01/app/11.2.0.4/grid/log/test/alerttest.log:

2015-07-31 11:57:54.702:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:57:59.894:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:58:05.121:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:58:10.322:
[ohasd(17944)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2015-07-31 11:58:10.326:
[ohasd(17944)]CRS-2807:Resource 'ora.crsd' failed to start automatically.

And running crsctl check crs, I saw that CRS/EM were not up:

[root@test ~]# /u01/app/11.2.0.4/grid/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager

Later I tried manually start up ora.crsd, but still failed:

[root@test ~]# /u01/app/11.2.0.4/grid/bin/crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.asm' on 'test'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-27504: IPC error creating OSD context
ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], [], [], [], [], []
ORA-27302: failure occurred at: sskgxplp
ORA-27303: additional information: Invalid protocol requested (2) or protocol not loaded.
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/test/agent/ohasd/oraagent_oracle//oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'test' failed
CRS-2679: Attempting to clean 'ora.asm' on 'test'
CRS-2681: Clean of 'ora.asm' on 'test' succeeded
CRS-4000: Command Start failed, or completed with errors.

From the output, it's complaining about "Invalid protocol requested". As this RAC is non-exadata, so we should use ipc_g protocol rather than ipc_rds which is for Exadata when relinking oracle binaries. So I made the following change in the script:

#su $OWNER -c "cd $1/rdbms/lib && export ORACLE_HOME=$1 && /usr/bin/make -f ins_rdbms.mk ipc_rds lbac_off dv_off ioracle > /dev/null
su $OWNER -c "cd $1/rdbms/lib && export ORACLE_HOME=$1 && /usr/bin/make -f ins_rdbms.mk ipc_g ioracle > /dev/null"

After that, I rollbacked all patches, and re-run patching, and later all were ok.

PS:

You can run $GRID_HOME/bin/skgxpinfo & $ORACLE_HOME/bin/skgxpinfo to check whether RAC Interconnect is uing UDP or RDS.

Categories: Databases, IT Architecture, Oracle DB Tags:

resolved – ORA-01102: cannot mount database in EXCLUSIVE mode

June 16th, 2015 1 comment

Today when I tried to startup one RAC DB, it failed with ORA-01102:

[oracle@testvm ~]$ srvctl start database -d testdb -o "open"
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-01102: cannot mount database in EXCLUSIVE mode
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/testvm/agent/crsd/oraagent_oracle//oraagent_oracle.log".

CRS-2674: Start of 'ora.testdb.db' on 'testvm' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy

SQL> startup
ORA-01102: cannot mount database in EXCLUSIVE mode

Later, I realized that the DB was still out of cluster mode:

SQL> show parameter cluster;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cluster_database boolean FALSE
cluster_database_instances integer 1
cluster_interconnects string

So I toke the following steps to take it into cluster mode:

SQL> alter system set cluster_database=true scope=spfile;

System altered.

SQL> alter system set cluster_database_instances=2 scope=spfile;

System altered.

After this, the DB started up normally.

Categories: Databases, IT Architecture, Oracle DB Tags:

generate a load on oracle database

June 5th, 2015 Comments off

Sometimes you're doing a test of Oracle database, but the DB load is not high but you want the load go high to facilitate your testing. Here's the way:

DECLARE
 N NUMBER;
BEGIN
FOR I IN 1..100000 LOOP
 SELECT /*+ ORDERED USE_NL(C) FULL(C) FULL(S)*/ COUNT(*) INTO N
 FROM SH.SALES S, SH.CUSTOMERS C
 WHERE C.CUST_ID = S.CUST_ID AND CUST_FIRST_NAME='Sarah'
 ORDER BY TIME_ID;
 DBMS_LOCK.SLEEP(1);
END LOOP;
END;
/

You can press Ctrl+C to cancel it.

Categories: Databases, IT Architecture, Oracle DB Tags:

create a big table with one million lines on oracle database for testing

May 10th, 2015 Comments off

#First, create tablespace along with datafile

SQL> create tablespace test datafile '/u01/app/oracle/product/11.2.0.4/dbhome_1/test/datafile1.dbf' size 512m;

#Then create table. We disable logging to avoid unnecessary redo data

SQL> create table bigtab tablespace test as select rownum id, a.* from all_objects a where 1=0;
SQL> alter table bigtab nologging;

#now populate the table named bigtab

DECLARE
  L_CNT NUMBER;
  L_ROWS NUMBER := 1000000;
BEGIN
  INSERT /*+ APPEND */ INTO BIGTAB SELECT ROWNUM, A.* FROM ALL_OBJECTS A;
  L_CNT := SQL%ROWCOUNT;
  COMMIT;
  WHILE (L_CNT < L_ROWS)
  LOOP
    INSERT /*+ APPEND */ INTO BIGTAB
    SELECT ROWNUM+L_CNT,
      OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, CREATED,
      LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY, NAMESPACE, EDITION_NAME
    FROM BIGTAB
      WHERE ROWNUM <= L_ROWS-L_CNT;
    L_CNT := L_CNT + SQL%ROWCOUNT;
    COMMIT;
   END LOOP;
END;
/

#check result

SQL> select count(*) from bigtab;

COUNT(*)
----------
1000000

#get the tracefile of the session

SQL> SELECT TRACEFILE FROM V$SESSION S, V$PROCESS P WHERE S.PADDR=P.ADDR AND S.SID=SYS_CONTEXT('USERENV','SID');

SQL> alter session set events '10046 trace name context forever, level 12';

#to get the maximum number of blocks that can be read

[oracle@testvm ~]$ grep scattered <trace file from above>

WAIT #139828744903832: nam='db file scattered read' ela= 1715 file#=10 block#=14192 blocks=8 obj#=56403 tim=1431233617613776
WAIT #139828744903832: nam='db file scattered read' ela= 6836 file#=10 block#=14268 blocks=8 obj#=56403 tim=1431233617620994

PS: 

More info is here.

Categories: Databases, Oracle DB Tags:

install oracle instant client to use sqlplus on linux

April 30th, 2015 Comments off

To use sqlplus to connect remotely to oracle database, you should have oracle database client installed on your box. To do this, you can follow below steps:

1. Download oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64.rpm and oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64.rpm from here.

2. Install oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64.rpm and oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64.rpm using rpm -i <package name>.

3. Set linux environment variables as below in ~/.bashrc:

export LANG=C
export HISTSIZE=100000
export HISTTIMEFORMAT="%h/%d - %H:%M:%S "
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export PATH=$PATH:/doxer/tools/bin:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/oracle/11.2/client64/lib
export PS1='[\u@\h-doxer \W]\$ '

4. Connect using sqlplus, e.g. sqlplus sys/pass@scan-example.test.com:1521/service1 as sysdba

Categories: Databases, IT Architecture, Oracle DB Tags:

printtbl8.sql – oracle sqlplus print output vertically

April 17th, 2015 Comments off

Put the following as printtbl8.sql in $ORACLE_HOME/rdbms/admin/printtbl8.sql:

set serveroutput on
set linesize 200
declare
    l_theCursor    integer default dbms_sql.open_cursor;
    l_columnValue    varchar2(4000);
    l_status        integer;
    l_descTbl    dbms_sql.desc_tab;
    l_colCnt        number;
    procedure execute_immediate( p_sql in varchar2 )
    is
    BEGIN
        dbms_sql.parse(l_theCursor,p_sql,dbms_sql.native);
        l_status := dbms_sql.execute(l_theCursor);
    END;
begin
    execute_immediate( 'alter session set nls_date_format=
                        ''dd-mon-yyyy hh24:mi:ss'' ');
    dbms_sql.parse(    l_theCursor,
                    replace( '&1', '"', ''''),
                    dbms_sql.native );
    dbms_sql.describe_columns( l_theCursor,
                            l_colCnt, l_descTbl );
    for i in 1 .. l_colCnt loop
        dbms_sql.define_column( l_theCursor, i,
                                l_columnValue, 4000 );
    end loop;
    l_status := dbms_sql.execute(l_theCursor);
    while ( dbms_sql.fetch_rows(l_theCursor) > 0 ) loop
        for i in 1 .. l_colCnt loop
            dbms_sql.column_value( l_theCursor, i,
                                l_columnValue );
            dbms_output.put_line
                ( rpad( l_descTbl(i).col_name,
         35 ) || ': ' || l_columnValue );
        end loop;
        dbms_output.put_line( '-----------------' );
    end loop;
    execute_immediate( 'alter session set nls_date_format=
                        ''dd-MON-yy'' ');
exception
    when others then
        execute_immediate( 'alter session set
                        nls_date_format=''dd-MON-yy'' ');
        raise;
end;
/

Now you can have a test in oracle sqlplus:

SQL> @?/rdbms/admin/printtbl8.sql 'select name,LOG_MODE,OPEN_MODE from v$database'
old 17: replace( '&1', '"', ''''),
new 17: replace( 'select name,LOG_MODE,OPEN_MODE from v$database', '"', ''''),

NAME : TEST
LOG_MODE : ARCHIVELOG
OPEN_MODE : READ WRITE
-----------------

PL/SQL procedure successfully completed.

Cool, right?

Categories: Databases, IT Architecture, Oracle DB Tags:

resolved – ORA-01013: user requested cancel of current operation

March 24th, 2015 Comments off

ORA-01013: user requested cancel of current operation may occur in the following occasions:

  • All events blocking the shutdown do not occur within one hour, the shutdown operation aborts with the following message: ORA-01013: user requested cancel of current operation.
  • This message is also displayed if you interrupt the shutdown process, for example by pressing CTRL-C.
Categories: Databases, IT Architecture, Oracle DB Tags:

resolved – ORA-00020: maximum number of processes (1000) exceeded

March 18th, 2015 Comments off

Today I encountered ORA-12516 error when trying to access oracle database:

[root@client-doxer ~]# sqlplus tauser/password1@rac0102-r.example.com:1521/qainfac1

SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 17 07:31:04 2015

Copyright (c) 1982, 2011, Oracle. All rights reserved.

ERROR:
ORA-12516: TNS:listener could not find available handler with matching protocol
stack

Enter user-name:

Then I had a try of connecting using VIP instead of scan name, but it failed too:

[root@client-doxer ~]# sqlplus tauser/password1@rac0102-v.example.com:1521/qainfac1

SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 17 07:37:22 2015

Copyright (c) 1982, 2011, Oracle. All rights reserved.

ERROR:
ORA-12516: TNS:listener could not find available handler with matching protocol
stack

Enter user-name:

Then on the database server, I had a check of service qainfac1:

[root@rac01 crsd]# /u01/app/11.2.0.4/grid/bin/crsctl status res -t|grep -A5 ora.qainf1.db
ora.qainf1.db
1 ONLINE ONLINE rac01 Open
2 OFFLINE OFFLINE Instance Shutdown
ora.qainf1.qainfac1.svc
1 ONLINE ONLINE rac01
2 OFFLINE OFFLINE

So one instance is running fine. I tried sqlplus connection from local server:

[oracle@rac01 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 17 07:45:39 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

ERROR:
ORA-00020: maximum number of processes (1000) exceeded

Enter user-name: ^C

That's it, "ORA-00020: maximum number of processes (1000) exceeded". Then it's going to be a question of adjusting parameter PROCESSES. As parameter PROCESSES cannot be changed with ALTER SYSTEM unless a server parameter file was used to start the instance and the change takes effect in subsequent instances, so a bounce of instance is needed to activiate the new setting:

SQL> set lines 200
SQL> col NAME for a30
SQL> col VALUE for a40
SQL> select NAME,VALUE,ISSES_MODIFIABLE,ISSYS_MODIFIABLE,ISINSTANCE_MODIFIABLE from v$parameter where name='processes';
NAME VALUE ISSES ISSYS_MOD ISINS
------------------------------ ---------------------------------------- ----- --------- -----
processes 1500 FALSE FALSE FALSE

SQL> show parameter processes;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 1
db_writer_processes integer 3
gcs_server_processes integer 2
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 1000

In trace file /u01/app/oracle/diag/rdbms/qainf1/qainf12/trace/alert_qainf12.log, I can see below errors:

Unable to allocate flashback log of 51094 blocks from
current recovery area of size 214748364800 bytes.
Recovery Writer (RVWR) is stuck until more space
is available in the recovery area.
Unable to write Flashback database log data because the
recovery area is full, presence of a guaranteed
restore point and no reusable flashback logs.

Here's Fast Recovery Area info:

SQL> show parameter db_recovery_file_dest;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string +DATA
db_recovery_file_dest_size big integer 200G

And here's ASM diskgroup info:

[oracle@rac01 ~]$ export ORACLE_SID=+ASM2
[oracle@rac01 ~]$ export ORACLE_HOME=/u01/app/11.2.0.4/grid
[oracle@rac01 ~]$ export PATH=$ORACLE_HOME/bin:$PATH
[oracle@rac01 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Mar 18 02:27:13 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> set lines 200;
SQL> select name, total_mb, free_mb, total_mb-free_mb used_mb from v$asm_diskgroup;

NAME TOTAL_MB FREE_MB USED_MB
------------------------------ ---------- ---------- ----------
DATA 4681689 533392 4148297

I had a check of restore points:

SQL> col NAME for a20
SQL> col time for a40
SQL> col SCN for 999999999999999
SQL> col STORAGE_SIZE for 999999999999999
SQL> SELECT NAME, SCN, TIME, DATABASE_INCARNATION#, GUARANTEE_FLASHBACK_DATABASE,STORAGE_SIZE FROM V$RESTORE_POINT;

NAME SCN TIME DATABASE_INCARNATION# GUA STORAGE_SIZE
-------------------- -------------- ---------------------------------------- --------------------- --- ------------
GRPT_BF_UPGR 14035000000000 03-MAR-15 04.16.12.000000000 PM 2 YES 214310000000

And I dropped the restore point to free space it's no longer needed to keep the restore point:

SQL> drop restore point GRPT_BF_UPGR;

After this, 214G space released from FRA and I can startup DB then set processes parameter to 1500(kill some processes with root by "ps -ef|grep <sid>" if sqlplus won't work even on local server):

SQL> alter system set processes=1500 scope=spfile;
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database flashback on;
SQL> alter database open;
SQL> select LOG_MODE,flashback_on from v$database;
LOG_MODE FLASHBACK_ON
------------ ------------------
ARCHIVELOG NO

Categories: Databases, IT Architecture, Oracle DB Tags:

resolved – TNS:listener does not currently know of service requested in connect descriptor

February 3rd, 2015 Comments off

Today we found errors in weblogic log about datasource connection:

TNS:listener does not currently know of service requested in connect descriptor

And in our configuration, data source was using below info:

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=testrac-r.example.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=testservice)))

This was weird as it worked before. After some debugging, we found that the 3 IPs of scan name testrac-r.example.com behaved abnormally on RAC:

[root@rac1 ~]# /sbin/ifconfig|egrep -B1 '192.168.20.5|192.168.20.6|192.168.20.7'
v115_FE:3 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.5 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:4 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.7 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:5 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.6 Bcast:10.245.87.255 Mask:255.255.248.0

[root@rac2 ~]# /sbin/ifconfig|egrep -B1 '192.168.20.5|192.168.20.6|192.168.20.7'
v115_FE:6 Link encap:Ethernet HWaddr 00:21:28:E8:3C:16
inet addr:192.168.20.7 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:7 Link encap:Ethernet HWaddr 00:21:28:E8:3C:16
inet addr:192.168.20.6 Bcast:10.245.87.255 Mask:255.255.248.0

As showed above, 192.168.20.6 and 192.168.20.7 were up on both of the nodes. This behavior indicated scan name was somehow wrong. So we did a bounce of scan name service. And after that, the issue was gone.

Categories: Databases, IT Architecture, Oracle DB Tags:

ORA-12154 – TNS:could not resolve the connect identifier specified

December 2nd, 2014 Comments off

Today I try to connect to one Db service named pditui using the following easy connect method:

export ORACLE_HOME=/u01/app/oracle/product/11.2.0/client_1
export PATH=$ORACLE_HOME/bin:$PATH

sqlplus "sys/password@(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = scanname.test.example.com)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = pditui)))" as sysdba

However, the following error messages prompted:

SQL*Plus: Release 11.2.0.3.0 Production on Tue Dec 2 14:07:35 2014

Copyright (c) 1982, 2011, Oracle. All rights reserved.

ERROR:
ORA-12154: TNS:could not resolve the connect identifier specified

Enter user-name:
ERROR:
ORA-12162: TNS:net service name is incorrectly specified

The username/password and service name were all correct, but the error was there. After some checking, I found that it was caused by wrong configuration of NAMES.DIRECTORY_PATH in file $ORACLE_HOME/network/admin/sqlnet.ora:

[root@centos-doxer ~]# cat /u01/app/oracle/product/11.2.0/client_1/network/admin/sqlnet.ora
# sqlnet.ora Network Configuration File: /u01/app/oracle/product/11.2.0/client_1/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.

#NAMES.DIRECTORY_PATH= (TNSNAMES)
NAMES.DIRECTORY_PATH= (TNSNAMES,ezconnect) -- add ezconnect methond here

ADR_BASE = /u01/app/oracle

After this, the connection was ok.

PS: 

You can read more about NAMES.DIRECTORY_PATH in file $ORACLE_HOME/network/admin/sqlnet.ora here.

Categories: Databases, Oracle DB Tags:

resolved – ORA-27300 OS system dependent operation:fork failed with status: 11

November 18th, 2014 Comments off

Today we observed all our DB were in down status, and in the trace file:

Errors in file /u01/database/diag/rdbms/oimdb/OIMDB/trace/OIMDB_psp0_3173.trc:

ORA-27300: OS system dependent operation:fork failed with status: 11

ORA-27301: OS failure message: Resource temporarily unavailable

ORA-27302: failure occurred at: skgpspawn5

After some searching for ORA-27300, I found this article, which suggested that it's the issue of user processes used up and system could not spawn new one at the time. As the problem happened at Mon Nov 17 02:08:51 2014, so I did some check using sysstat sar:

[root@test sa]# sar -f /var/log/sa/sa17 -s 00:00:00 -e 03:20:00
Linux 2.6.32-300.27.1.el5uek (slcn11vmf0029) 11/17/14

00:00:01 CPU %user %nice %system %iowait %steal %idle
00:10:01 all 1.16 0.12 0.48 0.71 0.18 97.35
00:20:02 all 1.30 0.00 0.47 0.95 0.19 97.10
00:30:01 all 1.88 0.00 0.63 1.98 0.19 95.32
00:40:01 all 1.00 0.00 0.35 2.15 0.18 96.32
00:50:01 all 1.09 0.00 0.40 0.47 0.18 97.87
01:00:01 all 1.03 0.00 0.34 0.25 0.16 98.23
01:10:01 all 3.98 0.02 1.72 4.26 0.22 89.80
01:20:01 all 9.98 0.13 5.99 47.40 0.31 36.19
01:30:01 all 1.86 0.00 1.24 48.72 0.16 48.01
01:40:01 all 1.08 0.00 0.82 48.77 0.18 49.15
01:50:01 all 1.54 0.00 0.97 49.32 0.18 47.98
02:00:01 all 1.05 0.00 0.85 48.74 0.18 49.19 --- problem occurred at Mon Nov 17 02:08:51 2014
02:10:01 all 10.14 0.14 8.95 44.75 0.34 35.68
02:20:01 all 0.06 0.00 0.21 1.87 0.07 97.78
02:30:01 all 0.08 0.00 0.29 2.81 0.08 96.74
02:40:01 all 0.09 0.00 0.31 3.08 0.08 96.44
02:50:01 all 0.05 0.00 0.13 0.96 0.06 98.81
03:00:01 all 0.07 0.00 0.26 2.38 0.07 97.22
03:10:01 all 0.06 0.12 0.21 1.52 0.07 98.02
Average: all 1.85 0.03 1.20 15.89 0.16 80.88

[root@test sa]# sar -f /var/log/sa/sa17 -s 01:10:00 -e 02:11:00 -A
......
......
01:10:01 kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
01:20:01 259940 15482728 98.35 2004 11703696 0 2104504 100.00 194056 -- even all SWAP spaces were used up
01:30:01 398584 15344084 97.47 904 11703152 0 2104504 100.00 191728
01:40:01 409104 15333564 97.40 984 11716924 0 2104504 100.00 191404
01:50:01 452844 15289824 97.12 1004 11711548 0 2104504 100.00 189076
02:00:01 440780 15301888 97.20 1424 11757600 0 2104504 100.00 189364
02:10:01 14602712 1139956 7.24 19548 382588 1978020 126484 6.01 3096
Average: 2760661 12982007 82.46 4311 9829251 329670 1774834 84.34 159787

So this proved that system was very busy during that time. I then increased oracle user's user process number to 131072 in /etc/security/limits.conf with the following:

* soft nproc 131072
* hard nproc 131072

And also set kernel.pid_max to 139264(which is 131072 plus 8192 which is recommended for OS stability) in /etc/sysctl.conf.

[root@test ~]# sysctl -a|grep pid_max
kernel.pid_max = 139264

Then increased memory from 16G to 32G of the box, and reboot.

change oracle to use ASM for archive log destination

October 15th, 2013 Comments off

Now our DB is using filesystem for storing archive log:

/u01/app/oracle/product/11.2.0.3/dbhome_1/dbs/arch1_117_826841548.dbf
......

First, let's check available DGs:

[oracle@test03 ~]$ export ORACLE_HOME=/u01/app/11.2.0.3/grid
[oracle@test03 ~]$ export ORACLE_SID=+ASM1
[oracle@test03 ~]$ $ORACLE_HOME/bin/sqlplus / as sysasm
SQL> select name,state from v$asm_diskgroup ;

NAME STATE
------------------------------ -----------
DA_SLCM07 MOUNTED #for archive log
DBFS_DG MOUNTED #for mounting RAM disk from DB
RE_SLCM07 MOUNTED #for backup

Now we can change archive log destination:

[oracle@test03 ~]$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0.3/dbhome_1
[oracle@test03 ~]$ export ORACLE_SID=oimdb31
[oracle@test03 ~]$ $ORACLE_HOME/bin/sqlplus / as sysdba
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/11.2.0.3/dbhome_1/dbs/arch
Oldest online log sequence 6
Next log sequence to archive 8
Current log sequence 8

SQL> alter system set log_archive_dest_1='location=+DA_SLCM07' scope=both; #On rac, you need only change this once

After this, you can remove old archive log on the filesystem.

oracle rac on linux installation

August 6th, 2013 Comments off

1,2 -> database; 3-> grid; 4 -> client; 5 -> gateways; 6 -> examples; 7 -> deinstall

---private IP, eth1
192.168.13.86/24
192.168.13.87/24

---public ip, eth0
10.240.149.109/21
10.240.149.110/21

---vip #vip alias will auto install, should in same subnet with public ip/scan ip
10.240.149.107/21
10.240.149.108/21

---scan IP #scan ip should be defined in dns. multiple IPs can map to one scan-name(round-robin)
10.240.149.106

---/etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6

10.240.149.109 node1.example.com node1

10.240.149.110 node2.example.com node2
192.168.13.86 node1-priv.localdomain node1-priv
192.168.13.87 node2-priv.localdomain node2-priv

PS:
hostname should return node1.us.test.com

---DNS configuration

yum install bind bind-chroot caching-nameserver
cd /var/named/chroot/etc
cp -p named.caching-nameserver.conf named.conf
options {
listen-on port 53 { any; };
listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";

allow-query { any; };
allow-query-cache { any; };
};
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
view localhost_resolver {
match-clients { any; };
match-destinations { any; };
recursion no;
include "/etc/named.rfc1912.zones";
};
vi named.rfc1912.zones #in the end

zone "us.test.com" IN {
type master;
file "us.test.com.zone" #dns -> ip
allow-update { none; };
};

zone "149.240.10.in-addr.arpa." IN {
type master;
file "149.240.10.in-addr.arpa"; #ip -> dns, reverse dns
allow-update { none; };
};
cd /var/named/chroot/var/named/
vi us.test.com.zone #in the end

scan-cluster IN A 10.240.149.106
node1 IN A 10.240.149.109
node1-vip IN A 10.240.149.107
node2 IN A 10.240.149.110
node2-vip IN A 10.240.149.108

 

cp -p named.local 149.240.10.in-addr.arpa
vi 149.240.10.in-addr.arpa #in the end

106 IN PTR scan-cluster.us.test.com. #if using scan, then multiple IPs should added here
107 IN PTR node1-vip.us.test.com.
108 IN PTR node2-vip.us.test.com.
109 IN PTR node1.us.test.com.
110 IN PTR node2.us.test.com.
/etc/init.d/named start
chkconfig named on

vi /etc/resolv.conf #on node1, node2
search us.test.com #on first line
nameserver 10.245.28.40

To get the scan VIP name and IPs, run /u01/app/grid/11.2.0.3/bin/srvctl config scan. To get the VIP name of each node, run /u01/app/grid/11.2.0.3/bin/srvctl config vip -n <node name>.

PS:

If you're using Oracle Linux, get yum repo from http://public-yum.oracle.com/ and do the following to automatically install any additional packages needed for installing Oracle Grid Infrastructure and Oracle Database, and configure your server operating system automatically, including setting kernel parameters and other basic operating system requirements for installation.

#Oracle Linux 6
#log /var/log/oracle-rdbms-server-11gR2-preinstall/results/orakernel.log

yum install oracle-rdbms-server-11gR2-preinstall -y

yum install oracle-rdbms-server-12cR1-preinstall -y

#Oracle Linux 5
#log /var/log/oracle-validated/results/orakernel.log

yum install oracle-validated -y

#When install OS with Oracle Linux

Customize now - Servers - System administration tools - select Oracle Preinstallation or Oracle Validated

---#1_preuser.sh,

---2_predir.sh

---3_prelimits.sh

---4_prelogin.sh

---5_preprofile.sh

---6_presysctl.sh

---stop ntpd
service ntpd stop
chkconfig ntpd off
57 18 * * * /usr/sbin/ntpdate 10.245.72.1

---ssh config for oracle/grid(passwordless even forthe host  itself)

ssh-keygen -t rsa
cat .ssh/id_rsa.pub | tr -d '\r\n'
vi .ssh/authorized_keys #for both peer host and itself

chmod 755 ~oracle

---config disks
allocate iscsi LUN
fdisk LUN(on one disk, no need to create filesystem)

for i in /etc/yum.repos.d/*.repo;do sed -i 's/gpgcheck=1/gpgcheck=0/g' $i;done

for i in /etc/yum.repos.d/*.repo;do sed -i 's/enabled=0/enabled=1/g' $i;done
yum install oracleasm-support oracleasm oracleasmlib kmod-oracleasm #on two nodes

#Configuring Storage Device Path Persistence Using Oracle ASMLIB, more info here

/usr/sbin/oracleasm status
/usr/sbin/oracleasm configure -i #grid/asmadmin/y/y
/usr/sbin/oracleasm init #more info is here about oracleasm commands with steps
/usr/sbin/oracleasm status
/usr/sbin/oracleasm configure
/usr/sbin/oracleasm listdisks #on one node
/usr/sbin/oracleasm createdisk VOL1 /dev/sda1
/usr/sbin/oracleasm createdisk VOL2 /dev/sdb1
/usr/sbin/oracleasm createdisk VOL3 /dev/sdc1
/usr/sbin/oracleasm createdisk VOL4 /dev/sdd1
/usr/sbin/oracleasm listdisks
/usr/sbin/oracleasm scandisks #on the other node
/usr/sbin/oracleasm listdisks
/usr/sbin/oracleasm querydisk /dev/sd*

#Configuring Oracle ASMLIB for Multipath Disks is here

#Configuring Disk Devices Manually for Oracle ASM(udev) is here

On both testvm773/774.example.com

    iscsiadm -m session -P 3 > iscsiadm.before
    ls -l /dev/mapper/ > multipath.before

On ZFS u1dis01nas35.example.com
    
    first on shares/LUNs, have a check of next available LUN name (e.g. if Data4 is used, then we should use Data5)
    on configuration/SAN/iscsi/initiators, add a LUN

On both testvm773/774.example.com
    
    iscsiadm -m session --rescan
    iscsiadm -m session -P 3 > iscsiadm.after
    ls -l /dev/mapper/ > multipath.after

Send multipath device to DBA for futhur configuration
    
    diff multipath.before multipath.after

If we are going to do ASM config (assume /dev/mapper/3600144f0d330489f00005955be860014 -> ../dm-14):

    On both testvm773/774.example.com

        #diff iscsiadm.before iscsiadm.after
        #/sbin/scsi_id -g -u -d /dev/sd #will be 3600144f0d330489f00005955be860014
        fdisk /dev/dm-14 #/dev/mapper/3600144f0d330489f00005955be860014p1 will be there
        dmsetup ls|egrep 3600144f0d330489f00005955be860014
        cd /etc/udev/rules.d
        cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak`date +%F`
        vi 99-oracle-asmdevices.rules #change 'NAME' according to /dev/mapper/asm-disk*

            KERNEL=="dm-*", ENV{DM_NAME}=="3600144f0d330489f00005955be860014", RUN+="/sbin/kpartx -a /dev/%k"
            KERNEL=="dm-*", ENV{DM_NAME}=="3600144f0d330489f00005955be860014", RUN+="/sbin/kpartx -a /dev/%k"
            KERNEL=="dm-*", ENV{DM_NAME}=="3600144f0d330489f00005955be860014*1", NAME="asm-disk4", OWNER="oracle", GROUP="dba", MODE="0660"
            KERNEL=="dm-*", ENV{DM_NAME}=="3600144f0d330489f00005955be860014*1", NAME="asm-disk4", OWNER="oracle", GROUP="dba", MODE="0660"        

        /sbin/partprobe /dev/dm-14
        /sbin/udevtest
        /sbin/udevcontrol reload_rules #or udevcontrol reload_rules. When do add, run "udevadm trigger --action=add"
        /sbin/start_udev

---runcluvfy.sh
su - grid
cd grid
/u03/upgrade/11.2.0.4/software/grid/runcluvfy.sh stage -pre crsinst -n node1,node2 -verbose #run -fixup with root

cvu_stages

---install cvuqdisk
rpm -Uvh /u03/upgrade/11.2.0.4/software/grid/rpm/cvuqdisk-1.0.9-1.rpm

---install grid
vncviewer #su - grid
su - grid
/u03/upgrade/11.2.0.4/software/grid/runInstaller #Install and Configure Oracle Grid Infrastructure for a Cluster/Advanced Installation/scan-cluster(.localdomain), uncheck configure GNS<grid naming service>

Change Discovery Path -> /dev/oracleasm/disks

#GRIDDG -> VOL1_VOL2/first node1,then node2

run as root first in local node, then on other node(s):

/u01/app/oraInventory/orainstRoot.sh

/u01/app/12.1.0.2/grid/root.sh #Configure Oracle Grid Infrastructure for a Cluster

#For user - grep GROUP $OLD_GRID_HOME/crs/install/crsconfig_params

SILENT=false
ORACLE_OWNER=oracle
ORA_DBA_GROUP=oinstall
ORA_ASM_GROUP=oinstall
LANGUAGE_ID=AMERICAN_AMERICA.AL32UTF8
TZ=UTC
ISROLLING=true
REUSEDG=false
ASM_AU_SIZE=1
USER_IGNORED_PREREQ=true
INSTALL_NODE=node1.example.com

#LISTENER_USERNAME=
LISTENER_USERNAME=oracle
MGMT_DB=true
BIG_CLUSTER=false
HUB_SIZE=-1
HUB_NODE_LIST=
RIM_NODE_LIST=
HUB_NODE_VIPS=
PING_TARGETS=

ORACLE_HOME=/u01/app/12.1.0.2/grid
ORACLE_BASE=/u01/app/oracle
OLD_CRS_HOME=

JREDIR=/u01/app/12.1.0.2/grid/jdk/jre/
JLIBDIR=/u01/app/12.1.0.2/grid/jlib

VNDR_CLUSTER=false
OCR_LOCATIONS=NO_VAL
CLUSTER_NAME=ecradb
HOST_NAME_LIST=node1,node2
NODE_NAME_LIST=node1,node2
PRIVATE_NAME_LIST=
VOTING_DISKS=NO_VAL
#VF_DISCOVERY_STRING=%s_vfdiscoverystring%

# ASM consts
ASM_UPGRADE=false
ASM_SPFILE=
ASM_DISK_GROUP=DATA
ASM_DISCOVERY_STRING=/dev/oracleasm/disks
ASM_DISKS=/dev/oracleasm/disks/DATA01
ASM_REDUNDANCY=EXTERNAL
ASM_CONFIG=local
ASM_CREDENTIALS=

CRS_STORAGE_OPTION=1
CSS_LEASEDURATION=400
CRS_NODEVIPS='node1-vip.us.oracle.com/255.255.248.0/eth0,node2-vip.us.oracle.com/255.255.248.0/eth0'
NODELIST=node1,node2
NETWORKS="eth0"/10.240.216.0:public,"eth2"/192.168.1.0:cluster_interconnect
SCAN_NAME=node12-r.example.com
SCAN_PORT=1521
GPNP_PA=

# GNS consts
GNS_CONF=false
GNS_TYPE=
GNS_ADDR_LIST=
GNS_DOMAIN_LIST=
GNS_ALLOW_NET_LIST=
GNS_DENY_NET_LIST=
GNS_DENY_ITF_LIST=
GNS_CREDENTIALS=

#### Required by OUI add node
NEW_HOST_NAME_LIST=
NEW_NODE_NAME_LIST=
NEW_PRIVATE_NAME_LIST=
NEW_NODEVIPS='node1-vip.us.oracle.com/255.255.248.0/eth0,node2-vip.us.oracle.com/255.255.248.0/eth0'

############### OCR constants
# GPNPCONFIGDIR is handled differently in dev (T_HAS_WORK for all)
# GPNPGCONFIGDIR in dev expands to T_HAS_WORK_GLOBAL
GPNPCONFIGDIR=$ORACLE_HOME
GPNPGCONFIGDIR=$ORACLE_HOME
OCRLOC=
OLRLOC=
OCRID=
CLUSTER_GUID=

CLSCFG_MISSCOUNT=

#### IPD/OS
CRFHOME="/u01/app/12.1.0.2/grid"

 

---install RAC(aka RDBMS software)
vncviewer #su - oracle
/u03/upgrade/11.2.0.4/software/database/runInstaller #install database software only

run as root first in local node, then on other node(s):

$ORACLE_HOME/root.sh

---create DATA/FLASH disk groups
vncviewer to grid user
su - grid
asmca #create DATA/VOL3, FLASH/VOL4

---create db
vncviewer to oracle user
su - oracle
dbca #Admin-Managed/devdb/select node1,node2/+DATA/+FLASH/AL32UTF8_AL16UTF16, database template in $ORACLE_HOME/assistants/dbca/templates/. DBCA scripts in $ORACLE_BASE/admin/devdb/scripts/(will be removed when removing DB using dbca).

Scripts

1_preuser.sh

#!/bin/bash
#Purpose:Create 6 groups named 'oinstall','dba','asmadmin','asmdba','asmoper','oper', plus 2 users named 'oracle','grid'.
#Also setting the Environment
#variable for oracle user.
#variable for grid user.
#Usage:Log on as the superuser('root'),and then execute the command:#./1preusers.sh
#Author:Asher Huang
echo "Now create 6 groups named 'oinstall','dba','asmadmin','asmdba','asmoper','oper'"
echo "Plus 2 users named 'oracle','grid',Also setting the Environment"
groupadd -g 54321 oinstall
groupadd -g 54324 asmadmin
groupadd -g 54325 asmdba
groupadd -g 54326 asmoper
groupadd -g 54327 dba
groupadd -g 54328 oper
useradd -u 54329 -g oinstall -G asmadmin,asmdba,asmoper -d /home/grid -s /bin/bash -c "grid Infrastructure Owner" grid
echo "grid" | passwd --stdin grid
echo 'export PS1="`/bin/hostname`-> "'>> /home/grid/.bash_profile
echo "export TMP=/tmp">> /home/grid/.bash_profile
echo 'export TMPDIR=$TMP'>>/home/grid/.bash_profile
echo "export ORACLE_BASE=/u01/app/grid">> /home/grid/.bash_profile
echo "export ORACLE_HOME=/u01/app/11.2.0/grid">> /home/grid/.bash_profile
echo "export ORACLE_TERM=xterm">> /home/grid/.bash_profile
echo "export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'" >> /home/grid/.bash_profile
echo 'export TNS_ADMIN=$ORACLE_HOME/network/admin' >> /home/grid/.bash_profile
echo 'export PATH=/usr/sbin:$PATH'>> /home/grid/.bash_profile
echo 'export PATH=$ORACLE_HOME/bin:$PATH'>> /home/grid/.bash_profile
echo 'export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib'>> /home/grid/.bash_profile
echo 'export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib'>> /home/grid/.bash_profile
echo "export EDITOR=vi" >> /home/grid/.bash_profile
echo "export LANG=en_US" >> /home/grid/.bash_profile
echo "export NLS_LANG=american_america.AL32UTF8" >> /home/grid/.bash_profile
echo "umask 022">> /home/grid/.bash_profile

useradd -u 54330 -g oinstall -G dba,oper,asmdba -d /home/oracle -s /bin/bash -c "Oracle Software Owner" oracle
echo "oracle" | passwd --stdin oracle
echo 'export PS1="`/bin/hostname`-> "'>> /home/oracle/.bash_profile
echo "export TMP=/tmp">> /home/oracle/.bash_profile
echo 'export TMPDIR=$TMP'>>/home/oracle/.bash_profile
echo "export ORACLE_BASE=/u01/app/oracle">> /home/oracle/.bash_profile
echo 'export ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1'>> /home/oracle/.bash_profile
echo "export ORACLE_UNQNAME=devdb">> /home/oracle/.bash_profile
echo 'export TNS_ADMIN=$ORACLE_HOME/network/admin' >> /home/oracle/.bash_profile
echo "export ORACLE_TERM=xterm">> /home/oracle/.bash_profile
echo 'export PATH=/usr/sbin:$PATH'>> /home/oracle/.bash_profile
echo 'export PATH=$ORACLE_HOME/bin:$PATH'>> /home/oracle/.bash_profile
echo 'export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib'>> /home/oracle/.bash_profile
echo 'export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib'>> /home/oracle/.bash_profile
echo "export EDITOR=vi" >> /home/oracle/.bash_profile
echo "export LANG=en_US" >> /home/oracle/.bash_profile
echo "export NLS_LANG=american_america.AL32UTF8" >> /home/oracle/.bash_profile
echo "export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'" >> /home/oracle/.bash_profile
echo "umask 022">> /home/oracle/.bash_profile
#should be changed on your env
#echo "export ORACLE_SID=+ASM1">> /home/grid/.bash_profile
#echo "export ORACLE_SID=+ASM2">> /home/grid/.bash_profile
#echo "export ORACLE_HOSTNAME=node1.us.test.com">> /home/oracle/.bash_profile
#echo "export ORACLE_HOSTNAME=node2.us.test.com">> /home/oracle/.bash_profile
#echo "export ORACLE_SID=devdb1">> /home/oracle/.bash_profile
#echo "export ORACLE_SID=devdb2">> /home/oracle/.bash_profile
chown -R oracle:oinstall /home/oracle/
chmod -R 755 /home/oracle/
chown -R grid:oinstall /home/grid
chmod -R 755 /home/grid
echo "The Groups and users has been created"
echo "The Environment for grid,oracle also has been set successfully"

2_predir.sh

#!/bin/bash
#Purpose:Create the necessary directory for oracle,grid users and change the authention to oracle,grid users.
#Usage:Log on as the superuser('root'),and then execute the command:#./2predir.sh
#Author:Asher Huang
echo "Now create the necessary directory for oracle,grid users and change the authention to oracle,grid users..."
mkdir -p /u01/app/grid
mkdir -p /u01/app/11.2.0/grid
mkdir -p /u01/app/oracle
chown -R oracle:oinstall /u01
chown -R grid:oinstall /u01/app/grid
chown -R grid:oinstall /u01/app/11.2.0
chmod -R 775 /u01
echo "The necessary directory for oracle,grid users and change the authention to oracle,grid users has been finished"

3_prelimits.sh

#!/bin/bash
#Purpose:Change the /etc/security/limits.conf.
#Usage:Log on as the superuser('root'),and then execute the command:#./3prelimits.sh
#Author:Asher Huang
echo "Now modify the /etc/security/limits.conf,but backup it named /etc/security/limits.conf.bak before"
cp /etc/security/limits.conf /etc/security/limits.conf.bak
echo "oracle soft nproc 2047" >>/etc/security/limits.conf
echo "oracle hard nproc 16384" >>/etc/security/limits.conf
echo "oracle soft nofile 1024" >>/etc/security/limits.conf
echo "oracle hard nofile 65536" >>/etc/security/limits.conf
echo "grid soft nproc 2047" >>/etc/security/limits.conf
echo "grid hard nproc 16384" >>/etc/security/limits.conf
echo "grid soft nofile 1024" >>/etc/security/limits.conf
echo "grid hard nofile 65536" >>/etc/security/limits.conf
echo "Modifing the /etc/security/limits.conf has been succeed."

4_prelogin.sh

#!/bin/bash
#Purpose:Modify the /etc/pam.d/login.
#Usage:Log on as the superuser('root'),and then execute the command:#./4prelimits.sh
#Author:Asher Huang
echo "Now modify the /etc/pam.d/login,but with a backup named /etc/pam.d/login.bak"
cp /etc/pam.d/login /etc/pam.d/login.bak
echo "session required /lib64/security/pam_limits.so" >>/etc/pam.d/login
echo "session required pam_limits.so" >>/etc/pam.d/login
echo "Modifing the /etc/pam.d/login has been succeed."

5_preprofile.sh

#!/bin/bash
#Purpose:Modify the /etc/profile.
#Usage:Log on as the superuser('root'),and then execute the command:#./5preprofile.sh
#Author:Asher Huang
echo "Now modify the /etc/profile,but with a backup named /etc/profile.bak"
cp /etc/profile /etc/profile.bak
echo 'if [ $USER = "oracle" ]||[ $USER = "grid" ]; then' >> /etc/profile
echo 'if [ $SHELL = "/bin/ksh" ]; then' >> /etc/profile
echo 'ulimit -p 16384' >> /etc/profile
echo 'ulimit -n 65536' >> /etc/profile
echo 'else' >> /etc/profile
echo 'ulimit -u 16384 -n 65536' >> /etc/profile
echo 'fi' >> /etc/profile
echo "Modifing the /etc/profile has been succeed."

6_presysctl.sh

#!/bin/bash
#Purpose:Modify the /etc/sysctl.conf.
#Usage:Log on as the superuser('root'),and then execute the command:#./6presysctl.sh
#Author:Asher Huang
echo "Now modify the /etc/sysctl.conf,but with a backup named /etc/sysctl.bak"
cp /etc/sysctl.conf /etc/sysctl.conf.bak
echo "fs.aio-max-nr = 1048576" >> /etc/sysctl.conf
echo "fs.file-max = 6815744" >> /etc/sysctl.conf
echo "kernel.shmall = 2097152" >> /etc/sysctl.conf
echo "kernel.shmmax = 2100549632" >> /etc/sysctl.conf
echo "kernel.shmmni = 4096" >> /etc/sysctl.conf
echo "kernel.sem = 500 144000 2048 8192" >> /etc/sysctl.conf
echo "net.ipv4.ip_local_port_range = 9000 65500" >> /etc/sysctl.conf
echo "net.core.rmem_default = 262144" >> /etc/sysctl.conf
echo "net.core.rmem_max = 4194304" >> /etc/sysctl.conf
echo "net.core.wmem_default = 262144" >> /etc/sysctl.conf
echo "net.core.wmem_max = 1048586" >> /etc/sysctl.conf
echo "net.ipv4.tcp_wmem = 262144 262144 262144" >> /etc/sysctl.conf
echo "net.ipv4.tcp_rmem = 4194304 4194304 4194304" >> /etc/sysctl.conf
echo "Modifing the /etc/sysctl.conf has been succeed."
echo "Now make the changes take effect....."
sysctl -p

Categories: Databases, IT Architecture, Oracle DB Tags:

oracle database installation

June 30th, 2013 Comments off

yum -y install libaio-devel sysstat pdksh.x86_64 compat-libstdc++* unixODBC unixODBC-devel
./jdk-6u38-linux-x64-rpm.bin
cat /proc/sys/net/ipv4/ip_local_port_range #Tcp/ip ephemeral port range(starting point bigger than 9000)
cd database/rpm/
rpm -e cvuqdisk
groupadd oinstall;groupadd dba;groupadd oper #for oracle grid: groupadd asmdba;groupadd asmadmin;groupadd asmoper
usermod -g oinstall oracle #oinstall, Oracle Inventory group; grid, Oracle Restart owner user;oracle, database software owner;Groups => OSDBA(dba), OSASM(asmadmin), OSOPER(oper); Privileges => SYSDBA, SYSASM, SYSOPER
useradd oracle -g oinstall
usermod -a -G dba,oper oracle
useradd grid -g oinstall
usermod -a -G asmadmin,asmdba,dba,oper grid
[root@test-centos database]# id oracle
uid=501(oracle) gid=503(oinstall) groups=503(oinstall),502(dba),504(oper)

[root@test-centos database]# id grid
uid=502(grid) gid=503(oinstall) groups=503(oinstall),502(dba),505(asmdba),506(asmadmin)

CVUQDISK_GRP=oinstall; export CVUQDISK_GRP
rpm -ivh cvuqdisk-1.0.7-1.rpm #Cluster Verification Utility, shared disk etc

#check resource limits(for each software installation users: oracle, grid)
ulimit -Sn #nofile, file descriptor setting(soft) >1024
ulimit -Sn #nofile, file descriptor setting(hard) >65536
ulimit -Su #nproc, processes available(soft) >2047
ulimit -Hu #nproc, processes available(hard) >16384
ulimit -Ss #stack, stack setting(soft) >10240KB
ulimit -Hs #stack, stack setting(hard) >32768KB

#/etc/security/limits.conf, then logout/login again
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
oracle soft stack 10240

grid soft nproc 2047
grid hard nproc 16384
grid soft nofile 1024
grid hard nofile 65536
grid soft stack 10240

#kernel parameters
http://docs.oracle.com/cd/E11882_01/install.112/e24321/pre_install.htm#autoId50
/sbin/sysctl -a | grep sem #semmsl, semmns, semopm, and semmni, /proc/sys/kernel/sem: the value of the semaphore parameters
/sbin/sysctl -a | grep shm #shmall, shmmax, and shmmni, /proc/sys/kernel/shm{all, max, mni}: shared memory segment sizes
/sbin/sysctl -a | grep file-max #file-max, /proc/sys/fs/file-max: maximum number of file handles
/sbin/sysctl -a | grep ip_local_port_range, /proc/sys/net/ipv4/ip_local_port_range
rmem_{default, max}, wmem_{default, max} #/sbin/sysctl -a | grep rmem_, /proc/sys/net/core/rmem_default
/sbin/sysctl -a | grep aio-max-nr #/proc/sys/fs/aio-max-nr

#/etc/sysctl.conf, sysctl -p
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
#mkdir -p /u01/app/grid
mkdir -p /u01/app/oracle
mkdir -p /u01/app/oraInventory
chown -R oracle:oinstall /u01/app/oracle
chown -R grid:oinstall /u01/app/grid
chown -R grid:oinstall /u01/app/oraInventory/
#chmod -R 775 /u01/app/grid
chmod -R 775 /u01/app/oracle
chmod -R 777 /u01/app/oraInventory/
export ORACLE_BASE=/u01/app/oracle/

#edit /etc/udev/rules.d/99-oracle.rules if using a new partition for oracle.After this,/sbin/partprobe devicename to loadupdated block device partition tables(restart udev). grid asm need at least two disks to form a new disk group
KERNEL=="sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sdc1", OWNER="grid", GROUP="asmadmin", MODE="0660"

KERNEL=="sdb2", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sdc2", OWNER="oracle", GROUP="dba", MODE="0660"

PS: More about writing udev rules here http://www.reactivated.net/writing_udev_rules.html

/sbin/partprobe /dev/sdb
/sbin/partprobe /dev/sdc

#install vncviewer
yum grouplist
yum groupinstall "X Window System" -y
yum groupinstall "GNOME Desktop Environment" -y
yum groupinstall "Graphical Internet" -y
yum groupinstall "Graphics" -y
yum install vnc-server
echo "DESKTOP="GNOME"" > /etc/sysconfig/desktop
sed -i.bak '/VNCSERVERS=/d' /etc/sysconfig/vncservers
echo "VNCSERVERS=\"1:root 2:oracle 3:sysdba\"" >> /etc/sysconfig/vncservers

mkdir -p /root/.vnc
wget --no-proxy http://test/iso/HostConfiguration/vnc/passwd -P /root/.vnc/
wget --no-proxy http://test/iso/HostConfiguration/vnc/xstartup -P /root/.vnc/
chmod 755 ~/.vnc ; chmod 600 ~/.vnc/passwd ; chmod 755 ~/.vnc/xstartup
[root@test-centos .vnc]# cat xstartup
#!/bin/sh

# Uncomment the following two lines for normal desktop:
# unset SESSION_MANAGER
# exec /etc/X11/xinit/xinitrc

[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &
#xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
#twm &
gnome-terminal &
gnome-session &

PS: you can download passwd and xstartup here vnc-config-passwd (wel*1). Or you can set password through use vncpasswd ~/.vnc/passwd

mkdir -p ~oracle/.vnc
wget --no-proxy http://test/iso/HostConfiguration/vnc/passwd -P ~oracle/.vnc/
wget --no-proxy http://test/iso/HostConfiguration/vnc/xstartup -P ~oracle/.vnc/
chown oracle:oinstall -R ~oracle/.vnc ; chmod 755 ~oracle/.vnc ; chmod 600 ~oracle/.vnc/passwd ; chmod 755 ~oracle/.vnc/xstartup
chkconfig --level 345 vncserver on
chkconfig --list | grep vncserver
service vncserver start

PS: You can change vncserver's resolution through editing /usr/bin/vncserver, change the default "$geometry = "1024x768";" to any one you like, for example "$geometry = "1600x900";". You can also control each user's vnc resolution setting through adding line like "VNCSERVERARGS[1]="-geometry 1600x900"" in /etc/sysconfig/vncservers

#oracle env variables http://docs.oracle.com/cd/E11882_01/install.112/e24321/pre_install.htm#autoId68
[root@test-centos ~]# su - oracle

-bash-3.2$ cat .bash_profile
export ORACLE_HOME="/u01/app/oracle/product/11.2.0/dbhome_6"
export ORACLE_SID="orcl6"

##begin installation
##first install oracle grid infrastructure & ASM http://docs.oracle.com/cd/E11882_01/install.112/e24321/oraclerestart.htm#BABCHCBG
##log file under /u01/app/oraInventory/logs/installActions*
chmod -R 777 database*
su - grid
cd /backup/downloads/database-grid/grid
./runInstaller
Specify ASM Password #Use same passwords
Privileged OS Groups #OSDBA -> dba; OSOPER -> oper; OSASM -> asmadmin
oracle base #/u01/app/grid
Software Location #/u01/app/grid/product/11.2.0/grid
Inventory Directory #/u01/app/oraInventory
Save response file #/home/grid/grid.rsp
#Oracle Grid Infrastructure home => /u01/app/grid/product/11.2.0/grid/

/u01/app/grid/product/11.2.0/grid/bin/crsctl check has #verify that the Oracle High Availability Service is installed properly
ps aux|grep -i Ohasd #Ohasd is a daemon installed with Oracle Grid Infrastructure that starts software services, such as Oracle ASM
#test the Oracle ASM installation
#ASM instance sid<system identifier> => cat /etc/oratab(also used to control whether to boot with OS)
export ORACLE_SID=+ASM
export ORACLE_HOME=/u01/app/grid/product/11.2.0/grid
$ORACLE_HOME/bin/asmcmd lsdg
#If the Oracle ASM instance is not running, you can start the instance with the following
#$ORACLE_HOME/bin/asmcmd startup
#or through sqlplus to startup asm instance
$ORACLE_HOME/bin/sqlplus /nolog
SQL> CONNECT SYS as SYSASM
Enter password: SYS_password
SQL> SELECT NAME,TYPE,TOTAL_MB,FREE_MB FROM V$ASM_DISKGROUP;
SQL> STARTUP

##log under /u01/app/oraInventory/logs/installActions*
##/u01/app/oracle/cfgtoollogs/dbca/orcl6/
export ORACLE_SID=+ASM
export ORACLE_HOME=/u01/app/grid/product/11.2.0/grid
$ORACLE_HOME/bin/sqlplus /nolog
SQL> CONNECT SYS as SYSASM
SQL> SELECT NAME,TYPE,TOTAL_MB,FREE_MB FROM V$ASM_DISKGROUP;
#Enhanced security controls such as database auditing options, and password policy and expiration settings
#Database Security Options
#To enable the security configuration, after db installation:
dbca -silent -configureDatabase -sourceDB SID -disableSecurityConfiguration NONE -enableSecurityConfiguration true
#To disable the security configuration, after db installation:
dbca -silent -configureDatabase -sourceDB SID -disableSecurityConfiguration [ALL|PASSWORD_PROFILE] -enableSecurityConfiguration false
su - oracle
cd /backup/downloads/database
./runInstaller
oracle base => /u01/app/oracle
oracle home directory => /u01/app/oracle/product/11.2.0/dbhome_6
global database name => orcl6.andy
Oracle service identifier => orcl6
Character sets => AL32UTF8
Security => choose "Assert all new security settings"
choose "create database with sample schemas"
Do not enable automated backups

Server Parameter File name +DATA/orcl6/spfileorcl6.ora
Enterprise Manager Database Control URL: https://10.172.12.177:1158/em

#recompiles all PL/SQL modules that might be in an invalid state(including packages, procedures, and types)
su - oracle
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL> @?/rdbms/admin/utlrp.sql

#Generating the Client Static Library
su - oracle
$ORACLE_HOME/bin/genclntst

#create a fast recovery area disk group(large enough to hold all of your data files and control files, the online redo logs, and the archived redo log files: Multiple databases can use the same fast recovery area)
vncviewer
GRID_HOME/bin/asmca

#Configuring Oracle Net Services(listener.ora/tnsnames.ora)
su - oracle
/u01/app/grid/product/11.2.0/grid/bin/tnsping orcl6

#install windows oracle 11g client

PS:

1.More about oracle rac installation on linux here http://www.oracleonlinux.cn/?s=%E4%B8%80%E6%AD%A5%E4%B8%80%E6%AD%A5%E5%9C%A8Linux%E4%B8%8A%E5%AE%89%E8%A3%85 (it's 32bits, change for example /lib/security/pam_limits.so to /lib64/security/pam_limits.so)

2.PDF files of above link oracle_on_linux.zip

How HA is achived in Oracle Exadata

November 27th, 2012 Comments off
  1. Each Exadata Database Machine has completely redundant hardware including redundant InfiniBand networking, redundant Power Distribution Units (PDU), redundant power upplies, and redundant database and storage servers.
  2. Oracle RAC protects against database server failure.
  3. ASM provides data mirroring to protect against disk or storage server failures.
  4. Oracle RMAN provides extremely fast and efficient backups to disk or tape.
  5. Oracle’s Flashback technology allows backing out user errors at the database, table or even row level.
  6. Using Oracle Data Guard, a second Exadata Database Machine can be configured to maintain a real-time copy of the database at a remote site to provide full protection against site failures and disasters.

resolved – semget failed with status 28 failed oracle database starting up

August 2nd, 2012 Comments off

Today we met a problem with semaphore and unable to start oracle instances. Here's the error message:

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates

So it turns out, the max number of arrays have been reached:
#check limits of all IPC
root@doxer# ipcs -al

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 1024000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

#check summary of semaphores
root@doxer# ipcs -su

------ Semaphore Status --------
used arrays = 127
allocated semaphores = 16890

To resolve this, we need increase value of max number of semaphore arrays:

root@doxer# cat /proc/sys/kernel/sem
250 1024000 100 128
                 ^---needs to be increased

PS:

Here's an example with toilets that describes differences between mutex and semaphore LOL http://koti.mbnet.fi/niclasw/MutexSemaphore.html

oracle RMAN backups and hot backup mode

July 28th, 2012 Comments off

In one sentence, to backup Oracle with OS(BCV for example), database should be put into hot backup mode. But RMAN backups can be performed while the database is online.

Also, oracle GoldenGate is used to replicate DB between heterogeneous systems, for example, oracle replicated to mysql/sql server etc.

oracle golden gate documentation

July 28th, 2012 Comments off

Here's some excerpts from oracle document about oracle golden gate:

Robust Modular Architecture

The Oracle GoldenGate software architecture is comprised of three primary components:
Capture, Trail Files, and Delivery. This modular approach allows each component to perform
its tasks independently of the others, accelerating data replication and ensuring data integrity.
Figure 1: Oracle GoldenGate leverages a component-based architecture to optimize real-time
information access and availability.

  • Capture

Oracle GoldenGate’s Capture module resides on the source database and looks for new
transactional activity. The Capture module reads the result of insert, update, and delete
operations by directly accessing the database transaction (redo) logs, and then immediately
captures new and changed data for distribution.
The Capture module only moves committed transactions—filtering out intermediate activities
and rolled-back operations—which not only reduces infrastructure load but also eliminates
potential data inconsistencies. Further optimization is achieved through transaction grouping
and optional compression features.
Oracle GoldenGate 11g can also capture messages from JMS messaging systems to deliver to
heterogeneous databases in real time for scalable and reliable data distribution.

  • Trail Files

Oracle GoldenGate’s Trail Files contain the database operations for the changed data in a
transportable, platform-independent data format. Trail Files are a critical component within
Oracle GoldenGate’s optimized queuing mechanism. They reside on the source and/or target
server but exist outside of the database to ensure heterogeneity, improved reliability, and
minimal data loss. This architecture minimizes impact to the source system because no
additional tables or queries to the database are required to support the data capture process.
The Capture module reads once, and then immediately moves the captured data to the external
Trail File for delivery to the target(s).
In the event of an outage at the source and/or target, the Trail Files contain the most-recent
data up to the point of the outage, and the data is applied once the systems are online again.

  • Delivery

Oracle GoldenGate’s Delivery module takes the changed data from the latest Trail File and
applies it to the target database using native SQL for the appropriate relational database
management system. Delivery can be made to any open database connectivity–compliant
database. The Delivery module applies each transaction in the same order as it was committed
and within the same transactional context as at the source, enabling consistency and referential
integrity at the target. To enhance IT flexibility, captured data can also be delivered to a Java
Message Service destination or as a flat file using Oracle GoldenGate Application Adapters.

For full documentation, you can refer to the following pdf file: http://www.oracle.com/us/products/middleware/data-integration/goldengate11g-ds-168062.pdf?ssSourceSiteId=otnen

ORA-00600 internal error caused by /tmp swap full

June 22nd, 2012 Comments off

Today we encountered a problem when oracle failed to functioning. After some checking, this error was caused by /tmp running out of space. This also confirmed by OS logs:

Jun 20 17:43:59 tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

Oracle uses /tmp to compile PL/SQL code, so if there no space it unable to compile/execute. Which causing functions/procedures/packeges and trigers to timeout. The same also described in oracle note: ID 1389623.1

So in order to prevent further occurrences of this error, we should increase /tmp on the system to at least 4Gb.

There is an Oracle parameter to change the default location of these temporary files(_ncomp_shared_objects_dir), but it's not a dynamic parameter. And also, while there is a way to resize a tmpfs filesystem online but it's somehow risky. So the best idea is that, we firstly bring down Oracle DB on this host, then modify /etc/vfstab, and then reboot the whole system. This way will protect our data against the risk of corruption or lost etc, also it'll have some outage time.
So finally, here's the steps:
Amend the line in /etc/vfstab from:

swap - /tmp tmpfs - yes size=512m

To:

swap - /tmp tmpfs - yes size=4096m

Reboot machine and bring up oracle DB

using oracle materialized view with one hour refresh interval to reduce high concurrency

June 8th, 2012 Comments off

If your oracle DB is at a very high concurrency and you find that the top sqls are some views, then there's a quick way to resolve this: using oracle materialized view. You may consider setting the refresh interval to one hour which means the view will refresh every hour. After the setting go live, you'll find the normal performance will appear.

For more information about oracle materialized view, you can visit http://en.wikipedia.org/wiki/Materialized_view

Here's a image with high oracle concurrency:

oracle high concurrency