resolved – passwd: User not known to the underlying authentication module

August 12th, 2015

Today we hit the following error when trying to change a user's password:

[root@test ~]# echo 2cool|passwd --stdin test
Changing password for user test.
passwd: User not known to the underlying authentication module

After some searching, we found it was caused by a missing /etc/shadow file:

[root@test ~]# ls -l /etc/shadow
ls: /etc/shadow: No such file or directory

To generate the /etc/shadow file from /etc/passwd, use the pwconv command:

[root@test ~]# pwconv

[root@test ~]# ls -l /etc/shadow
-r-------- 1 root root 1254 Aug 11 12:13 /etc/shadow

After this, we could reset the password without issue:

[root@test ~]# echo mypass|passwd --stdin test
Changing password for user test.
passwd: all authentication tokens updated successfully.
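
The check above can be wrapped in a small script. This is only a sketch: the function name shadow_missing is made up for illustration, and the script prints the suggested action instead of running pwconv itself, so it is safe to try as a normal user.

```shell
# Succeed when the shadow file is absent; the path defaults to /etc/shadow.
shadow_missing() {
    [ ! -e "${1:-/etc/shadow}" ]
}

# Only report what to do; pwconv itself must be run as root.
if shadow_missing; then
    echo "/etc/shadow is missing: run pwconv as root"
else
    echo "/etc/shadow exists"
fi
```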

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – ORA-27303: additional information: Invalid protocol requested (2) or protocol not loaded

August 6th, 2015

Today when I tried to start up CRS after patching (crsctl start crs), the following errors showed up in /u01/app/11.2.0.4/grid/log/test/alerttest.log:

2015-07-31 11:57:54.702:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:57:59.894:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:58:05.121:
[/u01/app/11.2.0.4/grid/bin/oraagent.bin(18654)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/scratch/app/11.2.0.4/grid/log/slcai081/agent/ohasd/oraagent_oracle/oraagent_oracle.log"
2015-07-31 11:58:10.322:
[ohasd(17944)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2015-07-31 11:58:10.326:
[ohasd(17944)]CRS-2807:Resource 'ora.crsd' failed to start automatically.

Running crsctl check crs showed that CRS and EM were not up:

[root@test ~]# /u01/app/11.2.0.4/grid/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager

Later I tried to start ora.crsd manually, but it still failed:

[root@test ~]# /u01/app/11.2.0.4/grid/bin/crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.asm' on 'test'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-27504: IPC error creating OSD context
ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], [], [], [], [], []
ORA-27302: failure occurred at: sskgxplp
ORA-27303: additional information: Invalid protocol requested (2) or protocol not loaded.
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/test/agent/ohasd/oraagent_oracle//oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'test' failed
CRS-2679: Attempting to clean 'ora.asm' on 'test'
CRS-2681: Clean of 'ora.asm' on 'test' succeeded
CRS-4000: Command Start failed, or completed with errors.

The output complains about "Invalid protocol requested". As this RAC is not on Exadata, the Oracle binaries should be relinked with the ipc_g target (UDP) rather than ipc_rds, which is for Exadata (RDS). So I made the following change in the script:

#su $OWNER -c "cd $1/rdbms/lib && export ORACLE_HOME=$1 && /usr/bin/make -f ins_rdbms.mk ipc_rds lbac_off dv_off ioracle > /dev/null"
su $OWNER -c "cd $1/rdbms/lib && export ORACLE_HOME=$1 && /usr/bin/make -f ins_rdbms.mk ipc_g ioracle > /dev/null"

After that, I rolled back all the patches and re-ran the patching, and everything came up fine.

PS:

You can run $GRID_HOME/bin/skgxpinfo and $ORACLE_HOME/bin/skgxpinfo to check whether the RAC interconnect is using UDP or RDS.
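
The choice of relink target can be written down as a small helper. This is a sketch only: relink_target is a made-up name, and it assumes the protocol is given as the single word udp or rds, which is how I have seen skgxpinfo report it.

```shell
# Map the wanted interconnect protocol to the ins_rdbms.mk relink target.
relink_target() {
    case "$1" in
        udp) echo ipc_g ;;      # generic UDP, non-Exadata RAC
        rds) echo ipc_rds ;;    # RDS, Exadata
        *)   echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}

relink_target udp
```

After relinking, comparing relink_target's input with the output of $ORACLE_HOME/bin/skgxpinfo is a quick way to confirm the binaries ended up with the intended protocol.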

Categories: Databases, IT Architecture, Oracle DB Tags:

remove usb disk from LVM

July 29th, 2015

On some servers, a USB stick may end up as part of an LVM volume group. USB sticks are more prone to failure, which causes big problems when they do fail. They also run at a different speed from the regular drives, which hurts performance.

For example, on one server, you can see below:

[root@test ~]# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  DomUVol   2   1   0 wz--n- 3.77T    0

[root@test~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T    0 #all PE allocated
  /dev/sdc1  DomUVol lvm2 a--  3.59G    0 #this is usb device

[root@test~]# lvs
  LV      VG      Attr   LSize Origin Snap%  Move Log Copy%  Convert
  scratch DomUVol -wi-ao 3.77T

[root@test ~]# df -h /scratch
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/DomUVol-scratch
                      3.7T  257G  3.3T   8% /scratch

[root@test~]# pvdisplay /dev/sdc1
  --- Physical volume ---
  PV Name               /dev/sdc1
  VG Name               DomUVol
  PV Size               3.61 GB / not usable 14.61 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              115
  Free PE               0
  Allocated PE          115 #so physical extents are allocated on this usb device
  PV UUID               a8a0P5-AlCz-Cu5e-acC2-ldEQ-NPCn-kc4Du0

As shown above, the USB device has PEs allocated, so before removing it from the VG (vgreduce), we first need to move the PEs off it to another PV. However, the other PV also has all of its space allocated (vgs above confirms this with VFree 0):

[root@test~]# pvdisplay /dev/sdb1
  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               DomUVol
  PV Size               3.76 TB / not usable 30.22 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              123360
  Free PE               0
  Allocated PE          123360
  PV UUID               5IyCgh-JsiV-EnpO-XKj4-yxNq-pRjI-d7LKGy
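
To see how much space has to be freed before the USB device can be emptied, multiply its PE count by the PE size from pvdisplay. A quick sketch (pe_to_mib is a made-up helper name):

```shell
# Convert a PE count to MiB; $1 = number of extents, $2 = PE size in KiB.
pe_to_mib() {
    echo $(( $1 * $2 / 1024 ))
}

# Figures for /dev/sdc1 from pvdisplay: 115 extents of 32768 KiB each.
echo "to free: $(pe_to_mib 115 32768) MiB"
```

That works out to 3680 MiB, about 3.59G, so shrinking the LV by 5G frees more than enough extents.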

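Since there are no free extents to move to, first shrink the filesystem and the LV by 5G (more than the 3.59G on the USB device). Here are the steps for taking the USB device out of the VG:

umount /scratch
e2fsck -f /dev/mapper/DomUVol-scratch
#--resizefs shrinks the filesystem before the LV; running a plain lvreduce first and resize2fs afterwards would cut off the end of the filesystem
lvreduce --resizefs --size -5G /dev/mapper/DomUVol-scratch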

[root@test ~]# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  DomUVol   2   1   0 wz--n- 3.77T 5.00G

[root@test ~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T 1.41G
  /dev/sdc1  DomUVol lvm2 a--  3.59G 3.59G #PEs on the usb device are all freed; if not, run pvmove /dev/sdc1 to move them to the other PV

[root@test ~]# pvdisplay /dev/sdc1
  --- Physical volume ---
  PV Name               /dev/sdc1
  VG Name               DomUVol
  PV Size               3.61 GB / not usable 14.61 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              115
  Free PE               115
  Allocated PE          0
  PV UUID               a8a0P5-AlCz-Cu5e-acC2-ldEQ-NPCn-kc4Du0

[root@test ~]# vgreduce DomUVol /dev/sdc1
  Removed "/dev/sdc1" from volume group "DomUVol"

[root@test ~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T 1.41G
  /dev/sdc1          lvm2 a--  3.61G 3.61G #VG column is empty for the usb device, this confirms the usb device is taken out of VG. You can run pvremove /dev/sdc1 to remove the pv.
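
The last part of the procedure depends on whether the PV still holds extents. A dry-run sketch that only prints the commands (removal_plan is a made-up name; the allocated PE count is passed in by hand, for example the "Allocated PE" figure from pvdisplay):

```shell
# Print (not run) the commands for taking a PV out of its VG.
# $1 = VG name, $2 = PV device, $3 = allocated PE count on that PV.
removal_plan() {
    if [ "$3" -gt 0 ]; then
        echo "pvmove $2"        # move allocated extents to other PVs first
    fi
    echo "vgreduce $1 $2"
    echo "pvremove $2"
}

removal_plan DomUVol /dev/sdc1 0
```

Note that pvmove only works if the other PVs in the VG have enough free extents, which is why the LV was shrunk first.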

PS:

  1. If you want to shrink an LVM volume (lvreduce) that holds /, you'll need to boot into Linux rescue mode. Select "Skip" when prompted to mount / on /mnt/sysimage. Run "lvm vgchange -a y" first; the remaining steps are more or less the same as above, except that every LVM command must be prefixed with "lvm", such as "lvm lvs", "lvm pvs", "lvm lvreduce", etc.
  2. If something goes wrong, vgcfgrestore can restore the VG config from the backups in /etc/lvm/archive/.

resolved – yum install Error: Protected multilib versions

June 30th, 2015

Today when I tried to install firefox.i686 on Linux using yum, the following error occurred:

Protected multilib versions: librsvg2-2.26.0-14.el6.i686 != librsvg2-2.26.0-5.el6_1.1.x86_64
Error: Protected multilib versions: devhelp-2.28.1-6.el6.i686 != devhelp-2.28.1-3.el6.x86_64
Error: Protected multilib versions: ImageMagick-6.5.4.7-7.el6_5.i686 != ImageMagick-6.5.4.7-6.el6_2.x86_64
Error: Protected multilib versions: vte-0.25.1-9.el6.i686 != vte-0.25.1-8.el6_4.x86_64
Error: Protected multilib versions: polkit-gnome-0.96-4.el6.i686 != polkit-gnome-0.96-3.el6.x86_64
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

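yum requires both architectures of a multilib package to be at the same version, and here the i686 packages being pulled in are newer than the installed x86_64 ones. To resolve this, run yum update <package names> for the packages listed in the errors first; the install then goes through.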
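
Pulling the package names out of the error lines by hand gets tedious when there are many of them. A sketch that does it with sed (multilib_packages is a made-up name; it assumes the usual name-version-release.arch package naming):

```shell
# Read yum output on stdin and print the names of the packages that
# failed the multilib check, one per line, without duplicates.
multilib_packages() {
    sed -n 's/.*Protected multilib versions: \(.*\)-[^-]*-[^-]* != .*/\1/p' | sort -u
}

printf '%s\n' \
  'Error: Protected multilib versions: vte-0.25.1-9.el6.i686 != vte-0.25.1-8.el6_4.x86_64' \
  'Error: Protected multilib versions: polkit-gnome-0.96-4.el6.i686 != polkit-gnome-0.96-3.el6.x86_64' |
  multilib_packages
```

The resulting list can then be handed to yum update.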

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – ORA-01102: cannot mount database in EXCLUSIVE mode

June 16th, 2015

Today when I tried to start up a RAC DB, it failed with ORA-01102:

[oracle@testvm ~]$ srvctl start database -d testdb -o "open"
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-01102: cannot mount database in EXCLUSIVE mode
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/testvm/agent/crsd/oraagent_oracle//oraagent_oracle.log".

CRS-2674: Start of 'ora.testdb.db' on 'testvm' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy

SQL> startup
ORA-01102: cannot mount database in EXCLUSIVE mode

Later, I realized that the DB was still out of cluster mode:

SQL> show parameter cluster;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cluster_database boolean FALSE
cluster_database_instances integer 1
cluster_interconnects string

So I took the following steps to put it back into cluster mode:

SQL> alter system set cluster_database=true scope=spfile;

System altered.

SQL> alter system set cluster_database_instances=2 scope=spfile;

System altered.

Because of scope=spfile, the new values only take effect at the next instance startup. After this, the DB started up normally.
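
If you want to check the setting from a script before starting the DB, the value can be picked out of the "show parameter cluster" output. A sketch (cluster_database_value is a made-up name; it assumes the three-column NAME TYPE VALUE layout shown above):

```shell
# Read "show parameter cluster" output on stdin and print the
# value of cluster_database (TRUE or FALSE).
cluster_database_value() {
    awk '$1 == "cluster_database" { print $3 }'
}

printf '%s\n' \
  'cluster_database boolean FALSE' \
  'cluster_database_instances integer 1' |
  cluster_database_value
```

On a real system the input would come from something like echo "show parameter cluster" | sqlplus -s "/ as sysdba".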

Categories: Databases, IT Architecture, Oracle DB Tags:

generate a load on oracle database

June 5th, 2015

Sometimes when you are testing an Oracle database, the DB load is too low and you want to push it higher to facilitate the test. Here's one way, using the SH sample schema:

DECLARE
 N NUMBER;
BEGIN
FOR I IN 1..100000 LOOP
 SELECT /*+ ORDERED USE_NL(C) FULL(C) FULL(S)*/ COUNT(*) INTO N
 FROM SH.SALES S, SH.CUSTOMERS C
 WHERE C.CUST_ID = S.CUST_ID AND CUST_FIRST_NAME='Sarah'
 ORDER BY TIME_ID;
 DBMS_LOCK.SLEEP(1);
END LOOP;
END;
/

You can press Ctrl+C to cancel it.

Categories: Databases, IT Architecture, Oracle DB Tags:

create a big table with one million lines on oracle database for testing

May 10th, 2015

#First, create tablespace along with datafile

SQL> create tablespace test datafile '/u01/app/oracle/product/11.2.0.4/dbhome_1/test/datafile1.dbf' size 512m;

#Then create table. We disable logging to avoid unnecessary redo data

SQL> create table bigtab tablespace test as select rownum id, a.* from all_objects a where 1=0;
SQL> alter table bigtab nologging;

#now populate the table named bigtab

DECLARE
  L_CNT NUMBER;
  L_ROWS NUMBER := 1000000;
BEGIN
  INSERT /*+ APPEND */ INTO BIGTAB SELECT ROWNUM, A.* FROM ALL_OBJECTS A;
  L_CNT := SQL%ROWCOUNT;
  COMMIT;
  WHILE (L_CNT < L_ROWS)
  LOOP
    INSERT /*+ APPEND */ INTO BIGTAB
    SELECT ROWNUM+L_CNT,
      OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, CREATED,
      LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY, NAMESPACE, EDITION_NAME
    FROM BIGTAB
      WHERE ROWNUM <= L_ROWS-L_CNT;
    L_CNT := L_CNT + SQL%ROWCOUNT;
    COMMIT;
   END LOOP;
END;
/

#check result

SQL> select count(*) from bigtab;

COUNT(*)
----------
1000000

#get the tracefile of the session

SQL> SELECT TRACEFILE FROM V$SESSION S, V$PROCESS P WHERE S.PADDR=P.ADDR AND S.SID=SYS_CONTEXT('USERENV','SID');

SQL> alter session set events '10046 trace name context forever, level 12';

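#run a full table scan on bigtab in the traced session (e.g. select count(*) from bigtab), then grep the trace file; blocks= shows how many blocks each multiblock read fetched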

[oracle@testvm ~]$ grep scattered <trace file from above>

WAIT #139828744903832: nam='db file scattered read' ela= 1715 file#=10 block#=14192 blocks=8 obj#=56403 tim=1431233617613776
WAIT #139828744903832: nam='db file scattered read' ela= 6836 file#=10 block#=14268 blocks=8 obj#=56403 tim=1431233617620994
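
To get the maximum directly rather than eyeballing the grep output, the blocks= values can be sorted. A sketch (max_scattered_blocks is a made-up name; it takes the trace file path as its argument):

```shell
# Print the largest blocks= value among the 'db file scattered read'
# waits in the given trace file.
max_scattered_blocks() {
    grep "db file scattered read" "$1" |
        sed -n 's/.* blocks=\([0-9][0-9]*\).*/\1/p' |
        sort -n | tail -n 1
}
```

Call it as max_scattered_blocks <trace file from above>.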

Categories: Databases, Oracle DB Tags:

resolved – Checking for glibc-devel-2.12-1.7-i686; Not found. Failed

May 5th, 2015

Today when I tried to install Oracle EM Cloud Control 12c, the error below came up during the prerequisite check:

pre-check failed

The check was complaining about the missing package "glibc-devel-2.12-1.7-i686" (Checking for glibc-devel-2.12-1.7-i686; Not found. Failed). There were glibc-related packages on the system:

[root@testvm ~]# rpm -qa|grep glibc
glibc-common-2.12-1.149.el6_6.7.x86_64
glibc-devel-2.12-1.149.el6_6.7.x86_64
glibc-headers-2.12-1.149.el6_6.7.x86_64
glibc-2.12-1.149.el6_6.7.x86_64

But they were all x86_64 versions, not the missing i686 one. So I decided to install the i686 ones:

[root@testvm]# yum install -y glibc.i686 glibc-devel.i686 glibc-static.i686

After this, I pressed "Rerun" and the check succeeded.
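
To spot this kind of gap up front, you can list the packages that are installed as x86_64 but have no i686 counterpart. A sketch (missing_i686 is a made-up name; it expects one NAME.ARCH entry per line on stdin):

```shell
# Read NAME.ARCH lines on stdin and print the names that are
# installed as x86_64 but not as i686.
missing_i686() {
    awk -F. '
        { arch = $NF; name = substr($0, 1, length($0) - length(arch) - 1) }
        arch == "x86_64" { want[name] = 1 }
        arch == "i686"   { have[name] = 1 }
        END { for (n in want) if (!(n in have)) print n }
    ' | sort
}

printf '%s\n' glibc.x86_64 glibc-devel.x86_64 glibc.i686 | missing_i686
```

On a real box the input would come from rpm -qa --qf '%{NAME}.%{ARCH}\n' 'glibc*'. Not every x86_64 package needs an i686 twin, so only install the ones the installer actually asks for.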

Categories: IT Architecture, Linux, Systems, Unix Tags:

install oracle instant client to use sqlplus on linux

April 30th, 2015

To connect remotely to an Oracle database with sqlplus, you need an Oracle client installed on your box. Oracle Instant Client is enough; follow the steps below:

1. Download oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64.rpm and oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64.rpm from the Oracle Instant Client download page.

2. Install oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64.rpm and oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64.rpm using rpm -i <package name>.

3. Set linux environment variables as below in ~/.bashrc:

export LANG=C
export HISTSIZE=100000
export HISTTIMEFORMAT="%h/%d - %H:%M:%S "
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export PATH=$PATH:/doxer/tools/bin:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/oracle/11.2/client64/lib
export PS1='[\u@\h-doxer \W]\$ '

4. Connect using sqlplus, e.g. sqlplus sys/pass@scan-example.test.com:1521/service1 as sysdba
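
If sqlplus fails to start or connect, the usual suspects are the library path from step 3 and the connect string from step 4. A sketch of two helpers for checking them (path_contains and ezconnect are made-up names; 1521 is assumed as the default listener port):

```shell
# Succeed when the colon-separated list $1 contains the directory $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Build a host:port/service connect string; $1 = host, $2 = service, $3 = port.
ezconnect() {
    echo "$1:${3:-1521}/$2"
}

ezconnect scan-example.test.com service1
```

For example, path_contains "$LD_LIBRARY_PATH" /usr/lib/oracle/11.2/client64/lib tells you whether the client libraries are on the library path.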

Categories: Databases, IT Architecture, Oracle DB Tags:

raid10 and raid01

April 21st, 2015

RAID 0 over RAID 1 (RAID 1+0, RAID 10, stripe of mirrors, better)

(RAID 1) A = Drive A1 + Drive A2 (Mirrored)
(RAID 1) B = Drive B1 + Drive B2 (Mirrored)
RAID 0 = (RAID 1) A + (RAID 1) B (Striped)


RAID 1 over RAID 0 (RAID 0+1, RAID 01, mirror of stripes)

(RAID 0) A = Drive A1 + Drive A2 (Striped)
(RAID 0) B = Drive B1 + Drive B2 (Striped)
RAID 1 = (RAID 0) A + (RAID 0) B (Mirrored)
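
Why the stripe of mirrors is the better layout can be seen by counting which two-drive failures each four-drive array survives. A sketch (group and survivors are made-up names; drives 0 and 1 stand for A1 and A2, drives 2 and 3 for B1 and B2):

```shell
# Drives 0,1 belong to group A (group 0); drives 2,3 to group B (group 1).
group() { echo $(( $1 / 2 )); }

# Count the two-drive failures that the given layout survives.
# raid10: groups are mirrors, so it dies only if both failures hit one mirror.
# raid01: groups are stripes, so it dies if each stripe loses a drive.
survivors() {
    layout=$1; count=0
    for i in 0 1 2 3; do
        for j in 0 1 2 3; do
            [ "$i" -lt "$j" ] || continue
            same=0
            [ "$(group "$i")" -eq "$(group "$j")" ] && same=1
            case $layout in
                raid10) [ "$same" -eq 0 ] && count=$((count + 1)) ;;
                raid01) [ "$same" -eq 1 ] && count=$((count + 1)) ;;
            esac
        done
    done
    echo "$count"
}

echo "raid10 survives $(survivors raid10) of 6, raid01 survives $(survivors raid01) of 6"
```

With four drives there are six possible two-drive failures: the stripe of mirrors survives four of them, the mirror of stripes only two.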

PS:

For write performance: raid0 > raid10 > raid5

For read performance: raid1

For data protection: raid1

Raid2 - bit-level striping, with error-correcting (Hamming) code stored on multiple dedicated disks.

Raid3 - byte-level striping, with parity on a single dedicated disk. Good for sequential data, but for random data the parity disk becomes a bottleneck.

Raid4 - block-level striping, with parity on a single dedicated disk, which again becomes the bottleneck for writes.

Raid5 - block-level striping, with parity distributed across all disks. Good for small/random access. Has a write penalty (one write generates two reads for the old data/parity and two writes for the new data/parity).

Raid6 - adds a second parity block. Can tolerate two disk failures. Worse write penalty.
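
The write penalty can be turned into a rough estimate of usable write IOPS. A sketch using the commonly quoted back-end I/O counts per small random write (write_ios and effective_write_iops are made-up names; real arrays with caching will behave differently):

```shell
# Back-end I/Os generated by one small random write.
write_ios() {
    case "$1" in
        raid0)        echo 1 ;;
        raid1|raid10) echo 2 ;;   # one write per mirror side
        raid5)        echo 4 ;;   # read data + parity, write data + parity
        raid6)        echo 6 ;;   # same, with two parity blocks
        *)            return 1 ;;
    esac
}

# $1 = total raw IOPS of all disks, $2 = RAID level.
effective_write_iops() {
    echo $(( $1 / $(write_ios "$2") ))
}

for level in raid0 raid10 raid5 raid6; do
    echo "$level: $(effective_write_iops 1200 "$level") write IOPS"
done
```

For the same set of disks this gives the ordering raid0 > raid10 > raid5 > raid6.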

Categories: Hardware, IT Architecture, Storage, Systems Tags: