resolved – ORA-00020: maximum number of processes (1000) exceeded

March 18th, 2015

Today I encountered an ORA-12516 error when trying to access an Oracle database:

[root@client-doxer ~]# sqlplus tauser/password1@rac0102-r.example.com:1521/qainfac1

SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 17 07:31:04 2015

Copyright (c) 1982, 2011, Oracle. All rights reserved.

ERROR:
ORA-12516: TNS:listener could not find available handler with matching protocol
stack

Enter user-name:

Then I tried connecting using the VIP instead of the SCAN name, but that failed too:

[root@client-doxer ~]# sqlplus tauser/password1@rac0102-v.example.com:1521/qainfac1

SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 17 07:37:22 2015

Copyright (c) 1982, 2011, Oracle. All rights reserved.

ERROR:
ORA-12516: TNS:listener could not find available handler with matching protocol
stack

Enter user-name:

Then, on the database server, I checked the status of the service qainfac1:

[root@rac01 crsd]# /u01/app/11.2.0.4/grid/bin/crsctl status res -t|grep -A5 ora.qainf1.db
ora.qainf1.db
      1        ONLINE  ONLINE       rac01          Open
      2        OFFLINE OFFLINE                     Instance Shutdown
ora.qainf1.qainfac1.svc
      1        ONLINE  ONLINE       rac01
      2        OFFLINE OFFLINE

So one instance was up. I then tried a sqlplus connection from the local server:

[oracle@rac01 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 17 07:45:39 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

ERROR:
ORA-00020: maximum number of processes (1000) exceeded

Enter user-name: ^C

That's it, "ORA-00020: maximum number of processes (1000) exceeded". Then it's going to be a question of adjusting parameter PROCESSES. As parameter PROCESSES cannot be changed with ALTER SYSTEM unless a server parameter file was used to start the instance and the change takes effect in subsequent instances, so a bounce of instance is needed to activiate the new setting:

SQL> set lines 200
SQL> col NAME for a30
SQL> col VALUE for a40
SQL> select NAME,VALUE,ISSES_MODIFIABLE,ISSYS_MODIFIABLE,ISINSTANCE_MODIFIABLE from v$parameter where name='processes';
NAME                           VALUE                                    ISSES ISSYS_MOD ISINS
------------------------------ ---------------------------------------- ----- --------- -----
processes                      1500                                     FALSE FALSE     FALSE

SQL> show parameter processes;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes                      integer     1
db_writer_processes                  integer     3
gcs_server_processes                 integer     2
global_txn_processes                 integer     1
job_queue_processes                  integer     1000
log_archive_max_processes            integer     4
processes                            integer     1000
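
When ORA-00020 makes even local sqlplus connections fail, a quick OS-level count of the instance's processes confirms that the limit is really being hit. A rough sketch (replace <sid> with the instance SID, as in the note further below):

ps -ef | grep "<sid>" | grep -v grep | wc -l

If the count is at or near the processes setting, the instance will refuse to spawn new server processes.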

In the alert log /u01/app/oracle/diag/rdbms/qainf1/qainf12/trace/alert_qainf12.log, I could see the errors below:

Unable to allocate flashback log of 51094 blocks from
current recovery area of size 214748364800 bytes.
Recovery Writer (RVWR) is stuck until more space
is available in the recovery area.
Unable to write Flashback database log data because the
recovery area is full, presence of a guaranteed
restore point and no reusable flashback logs.

Here's Fast Recovery Area info:

SQL> show parameter db_recovery_file_dest;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      +DATA
db_recovery_file_dest_size           big integer 200G
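
To see what is actually consuming the recovery area (flashback logs, archived logs, backups and so on), the standard v$ views give a breakdown per file type. A minimal sketch, run locally as sysdba (output will differ per environment, and if ORA-00020 still blocks local connections this has to wait until after the cleanup):

sqlplus -s / as sysdba <<'EOF'
set lines 200
select * from v$recovery_file_dest;
select file_type, percent_space_used, percent_space_reclaimable, number_of_files from v$recovery_area_usage;
EOF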

And here's ASM diskgroup info:

[oracle@rac01 ~]$ export ORACLE_SID=+ASM2
[oracle@rac01 ~]$ export ORACLE_HOME=/u01/app/11.2.0.4/grid
[oracle@rac01 ~]$ export PATH=$ORACLE_HOME/bin:$PATH
[oracle@rac01 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Mar 18 02:27:13 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> set lines 200;
SQL> select name, total_mb, free_mb, total_mb-free_mb used_mb from v$asm_diskgroup;

NAME                             TOTAL_MB    FREE_MB    USED_MB
------------------------------ ---------- ---------- ----------
DATA                              4681689     533392    4148297

I then checked the restore points:

SQL> col NAME for a20
SQL> col time for a40
SQL> col SCN for 999999999999999
SQL> col STORAGE_SIZE for 999999999999999
SQL> SELECT NAME, SCN, TIME, DATABASE_INCARNATION#, GUARANTEE_FLASHBACK_DATABASE,STORAGE_SIZE FROM V$RESTORE_POINT;

NAME                            SCN TIME                                     DATABASE_INCARNATION# GUA STORAGE_SIZE
-------------------- -------------- ---------------------------------------- --------------------- --- ------------
GRPT_BF_UPGR         14035000000000 03-MAR-15 04.16.12.000000000 PM                              2 YES 214310000000

Since the restore point no longer needed to be kept, I dropped it to free up the space:

SQL> drop restore point GRPT_BF_UPGR;

After this, about 214 GB of space was released from the FRA, and I could start up the DB and then set the processes parameter to 1500 (if sqlplus does not work even on the local server, you may first need to kill some of the instance's processes as root, found via "ps -ef|grep <sid>"):

SQL> alter system set processes=1500 scope=spfile;
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database flashback on;
SQL> alter database open;
SQL> select LOG_MODE,flashback_on from v$database;
LOG_MODE FLASHBACK_ON
------------ ------------------
ARCHIVELOG NO
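
After the bounce it is worth confirming that the new limit is in effect and checking how much headroom is left. A quick sketch, run locally as sysdba:

sqlplus -s / as sysdba <<'EOF'
show parameter processes
select count(*) as current_processes from v$process;
EOF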

Categories: Databases, IT Architecture, Oracle DB Tags:

sendmail DSN: Data format error

March 5th, 2015

If you encounter an error when sending mail using sendmail (or the Linux mail/mailx command), you should check /var/log/maillog for details. For example:

Mar 5 02:39:10 testhost1 sendmail[15281]: t252dAZr015281: from=root, size=78, class=0, nrcpts=1, msgid=<201503050239.t252dAZr015281@testhost1.us.example.com>, relay=root@localhost
Mar 5 02:39:10 testhost1 sendmail[15282]: t252dA8Z015282: from=<root@testhost1.us.example.com>, size=393, class=0, nrcpts=1, msgid=<201503050239.t252dAZr015281@testhost1.us.example.com>, proto=ESMTP, daemon=MTA, relay=localhost.localdomain [127.0.0.1]
Mar 5 02:39:10 testhost1 sendmail[15281]: t252dAZr015281: to=user1@example.com, ctladdr=root (0/0), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30078, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (t252dA8Z015282 Message accepted for delivery)
Mar 5 02:39:10 testhost1 sendmail[15284]: t252dA8Z015282: to=<user1@example.com>, ctladdr=<root@testhost1.us.example.com> (0/0), delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120393, relay=smtpserver1.example.com. [192.151.231.4], dsn=5.6.0, stat=Data format error
Mar 5 02:39:10 testhost1 sendmail[15284]: t252dA8Z015282: t252dA8Z015284: DSN: Data format error
Mar 5 02:39:10 testhost1 sendmail[15284]: t252dA8Z015284: to=<root@testhost1.us.example.com>, delay=00:00:00, xdelay=00:00:00, mailer=local, pri=31660, dsn=2.0.0, stat=Sent

From here you can see that, after relaying, the mail finally failed with DSN code 5.6.0 (DSN is the Delivery Status Notification extension of SMTP). So you should look up the code details:

5.x.x - Permanent or fatal error. This can be caused by a non-existent email address, a DNS problem, or the email being blocked by the receiving server.

X.6.0 - Other or undefined media error. Something about the content of the message caused it to be considered undeliverable, and the problem cannot be well expressed with any of the other provided detail codes.
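
To reproduce and trace such a failure, one simple approach is to send a test message and then follow it in /var/log/maillog by its queue ID and DSN (a sketch; the address is the one from the example above):

echo "test body" | mail -s "test subject" user1@example.com
grep 'dsn=' /var/log/maillog | tail -5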

PS:

For more info about DSN codes, you can check http://www.inmotionhosting.com/support/email/email-troubleshooting/smtp-and-esmtp-error-code-list or http://tools.ietf.org/rfc/rfc3463.txt or http://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xml for details.

Categories: IT Architecture, Linux, Systems Tags:

TCP Window Scaling – values about TCP buffer size

February 4th, 2015

TCP Window Scaling (TCP socket buffer size, TCP window size)

/proc/sys/net/ipv4/tcp_window_scaling # set to 1 to enable window scaling
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP receive buffers: minimum, initial and maximum buffer size
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP send buffers: minimum, initial and maximum buffer size
/proc/sys/net/core/rmem_max - maximum receive buffer size a socket may request
/proc/sys/net/core/wmem_max - maximum send buffer size a socket may request

The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller:

echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf #autotuning enabled. The receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
echo 108544 > /proc/sys/net/core/wmem_max
echo 108544 > /proc/sys/net/core/rmem_max
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem

Advanced TCP features

cat /proc/sys/net/ipv4/tcp_timestamps # TCP timestamps allow more accurate RTT measurements for deriving the retransmission timeout estimator, protect against old segments from previous incarnations of the TCP connection, and allow detection of unnecessary retransmissions; note that enabling them also lets a remote party guess the uptime of the target system.
cat /proc/sys/net/ipv4/tcp_sack # selective acknowledgements, so only the missing segments need to be retransmitted

Here is some background knowledge:

  • The throughput of a communication is limited by two windows: the congestion window and the receive window(TCP congestion window is maintained by the sender, and TCP window size is maintained by the receiver). The former tries not to exceed the capacity of the network (congestion control) and the latter tries not to exceed the capacity of the receiver to process data (flow control). The receiver may be overwhelmed by data if for example it is very busy (such as a Web server). Each TCP segment contains the current value of the receive window. If for example a sender receives an ack which acknowledges byte 4000 and specifies a receive window of 10000 (bytes), the sender will not send packets after byte 14000, even if the congestion window allows it.
  • TCP uses what is called the "congestion window", or CWND, to determine how many packets can be sent at one time. The larger the congestion window size, the higher the throughput. The TCP "slow start" and "congestion avoidance" algorithms determine the size of the congestion window. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which can be changed by the program using a system library call just before opening the socket. There is also a kernel enforced maximum buffer size. The buffer size can be adjusted for both the send and receive ends of the socket.
  • To get maximal throughput it is critical to use optimal TCP send and receive socket buffer sizes for the link you are using. If the buffers are too small, the TCP congestion window will never fully open up. If the receiver buffers are too large, TCP flow control breaks and the sender can overrun the receiver, which will cause the TCP window to shut down. This is likely to happen if the sending host is faster than the receiving host. Overly large windows on the sending side are not usually a problem as long as you have excess memory; note that every TCP socket has the potential to request this amount of memory even for short connections, making it easy to exhaust system resources.
  • More about TCP Buffer Sizing is here.
  • More about /proc/sys/net/ipv4/* Variables is here.

resolved – TNS:listener does not currently know of service requested in connect descriptor

February 3rd, 2015

Today we found errors in the WebLogic log about a datasource connection:

TNS:listener does not currently know of service requested in connect descriptor

In our configuration, the data source was using the following connect descriptor:

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=testrac-r.example.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=testservice)))

This was weird, as it had worked before. After some debugging, we found that the three IPs of the SCAN name testrac-r.example.com behaved abnormally on the RAC nodes:

[root@rac1 ~]# /sbin/ifconfig|egrep -B1 '192.168.20.5|192.168.20.6|192.168.20.7'
v115_FE:3 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.5 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:4 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.7 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:5 Link encap:Ethernet HWaddr 00:21:28:F0:30:4C
inet addr:192.168.20.6 Bcast:10.245.87.255 Mask:255.255.248.0

[root@rac2 ~]# /sbin/ifconfig|egrep -B1 '192.168.20.5|192.168.20.6|192.168.20.7'
v115_FE:6 Link encap:Ethernet HWaddr 00:21:28:E8:3C:16
inet addr:192.168.20.7 Bcast:10.245.87.255 Mask:255.255.248.0
--
v115_FE:7 Link encap:Ethernet HWaddr 00:21:28:E8:3C:16
inet addr:192.168.20.6 Bcast:10.245.87.255 Mask:255.255.248.0

As shown above, 192.168.20.6 and 192.168.20.7 were up on both nodes, which indicated that something was wrong with the SCAN. So we bounced the SCAN resources, and after that the issue was gone.
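
For reference, one way to check and bounce the SCAN VIPs and SCAN listeners on 11.2 is with srvctl, run as the grid owner from the grid home (a sketch; exact paths and privileges depend on your install):

srvctl status scan
srvctl status scan_listener
srvctl stop scan_listener
srvctl stop scan
srvctl start scan
srvctl start scan_listener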

Categories: Databases, IT Architecture, Oracle DB Tags:

Close Putty sessions without exit confirmation dialog

January 14th, 2015

You can set this in Putty "Change Settings" -> "Window" -> "Behaviour", and uncheck "Warn before closing window". Save the config in "Session", and now all windows can be closed without any exit confirmation dialog.

[screenshot: putty_session]

Categories: Misc Tags:

resolved – su: cannot set user id: Resource temporarily unavailable

January 12th, 2015

When I tried to log on as user "test", this error occurred:

su: cannot set user id: Resource temporarily unavailable

I first checked limits.conf:

[root@testvm ~]# cat /etc/security/limits.conf|egrep -v '^$|^#'
oracle   soft   nofile    131072
oracle   hard   nofile    131072
oracle   soft   nproc    131072
oracle   hard   nproc    131072
oracle   soft   core    unlimited
oracle   hard   core    unlimited
oracle   soft   memlock    50000000
oracle   hard   memlock    50000000
@svrtech    soft    memlock         500000
@svrtech    hard    memlock         500000
*   soft   nofile    131072
*   hard   nofile    131072
*   soft   nproc    131072
*   hard   nproc    131072
*   soft   core    unlimited
*   hard   core    unlimited
*   soft   memlock    50000000
*   hard   memlock    50000000

Then I compared the number of the user's processes/threads against the maximum number of processes to see whether it was over the limit:

[root@c9qa131-slcn03vmf0293 ~]# ps -eLF | grep test | wc -l
1026

So that was not exceeded. Then I checked open files:

[root@testvm ~]# lsof | grep aime | wc -l
6059

It was not exceeding 131072 either, so why was the error "su: cannot set user id: Resource temporarily unavailable" there? The culprit turned out to be the file /etc/security/limits.d/90-nproc.conf:

[root@testvm ~]# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

* soft nproc 1024
root soft nproc unlimited

After I changed 1024 to 131072, the issue went away immediately.
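
To confirm that the new limit is actually picked up for the user, a quick sketch (a fresh login is needed, since the limits are applied by pam_limits at login time):

su - test -c 'ulimit -u'      # should now report 131072
ps -eLF | grep test | wc -l   # rough count of the user's threads, as above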

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – Error: Unable to connect to xend: Connection reset by peer. Is xend running?

January 7th, 2015

Today I hit an issue when trying to run xm commands on a Xen server:

[root@xenhost1 ~]# xm list
Error: Unable to connect to xend: Connection reset by peer. Is xend running?

I had a check and found that xend was actually running:

[root@xenhost1 ~]# /etc/init.d/xend status
xend daemon running (pid 8329)

After some debugging, I found it was caused by libvirtd & xend being in a bad state. So I bounced both of them:

[root@xenhost1 ~]# /etc/init.d/libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

[root@xenhost1 ~]# /etc/init.d/xend restart #this may not be needed 
restarting xend...
xend daemon running (pid 19684)

After that, the xm commands worked fine.
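
If a bounce of libvirtd/xend does not help, the xend logs usually carry more detail (standard locations on a Xen dom0; a sketch):

tail -n 50 /var/log/xen/xend.log
tail -n 50 /var/log/xen/xend-debug.log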

PS:

  • If you meet an issue like "resolved - xend error: (98, 'Address already in use')" when restarting xend, or "can't connect: (111, 'Connection refused')" when doing xm live migrate, you can refer to this article.
  • For more information about libvirt, you can check here.

 

Categories: Clouding, IT Architecture, Oracle Cloud Tags:

remove entries in perl array with specified value

December 30th, 2014

Assume that in array @array_filtered:

my @array_filtered = ("your", "array", "here", 1, 3, 8, "here", 2, 5, 9, "sit", "here",3, 4, 7,"yes","now",8,1,7,6); #or my @array_filtered=qw(your array here 1 3 8 here 2 5 9 sit here 3 4 7 yes now 8 1 7 6) which uses Alternative Quotes(q, qq, qw, qx)

If you want to remove each entry whose value is "here" or "now", together with the 3 entries that follow it, you can use splice:

#!/usr/bin/perl
use strict;
use warnings;

my @array_filtered = ("your", "array", "here", 1, 3, 8, "here", 2, 5, 9, "sit", "here", 3, 4, 7, "yes", "now", 8, 1, 7, 6);
my @search_for = ("here", "now");

# Build one regular expression that matches any of the values to search for.
# Use =~/!~ for regular expressions, eq/ne for strings, ==/!= for numbers (or use unless()/if(not())),
# and m{} instead of // if the expression contains so many / characters that escaping them with \/ gets tiresome.
my $search_for_s = join('|', @search_for);

# Repeatedly find the index of the first entry whose value matches, then splice out
# that entry plus the 3 entries following it (4 elements in total). The index is
# recomputed on every pass because splice shifts the remaining elements.
while (my ($index) = grep { $array_filtered[$_] =~ /$search_for_s/ } 0 .. $#array_filtered) {
    splice(@array_filtered, $index, 4);
}

print "@array_filtered" . "\n";

The output is "your array sit yes 6".

PS:

  • For more info about perl regular expression(such as operators<m, s, tr> and their modifiers, complex regular expression cheat sheet<.\s\S\d\D\w\W[aeiou][^aeiou](foo|bar), \G, $, $&, $`, $'> and more), you can refer to this article.
  • The following is about perl alternative quotes:

q// is generally the same thing as using single quotes - meaning it doesn't interpolate values inside the delimiters.
qq// is the same as double quoting a string. It interpolates.
qw// return a list of white space delimited words. @q = qw/this is a test/ is functionally the same as @q = ('this', 'is', 'a', 'test')
qx// is the same thing as using the backtick operators.

Categories: IT Architecture, Perl, Programming Tags:

resolved – cssh installation on linux server

December 29th, 2014

ClusterSSH can be used if you need to control a number of xterm windows via a single graphical console window, so that you can run commands interactively on multiple servers over ssh connections. This guide shows how to install clusterssh on a Linux box from the tarball.

First, download the cssh tarball App-ClusterSSH-4.03_04.tar.gz from SourceForge. You may need to export proxy settings if your environment requires them:

export https_proxy=http://my-proxy.example.com:80/
export http_proxy=http://my-proxy.example.com:80/
export ftp_proxy=http://my-proxy.example.com:80/

After the proxy setting, you can now get the package:

wget 'http://sourceforge.net/projects/clusterssh/files/latest/download'
tar zxvf App-ClusterSSH-4.03_04.tar.gz
cd App-ClusterSSH-4.03_04
cat README

Before installing, let's install some prerequisite packages:

yum install gcc libX11-devel gnome* -y
yum groupinstall "X Window System" -y
yum groupinstall "GNOME Desktop Environment" -y
yum groupinstall "Graphical Internet" -y
yum groupinstall "Graphics" -y

Now run "perl Build.PL" as indicated by README:

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL
Can't locate Module/Build.pm in @INC (@INC contains: /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at Build.PL line 5.
BEGIN failed--compilation aborted at Build.PL line 5.

As the error indicates, you need to install Module::Build first. Let's use cpan to install that module.

Run "cpan" and enter "follow" when below info occurred:

Policy on building prerequisites (follow, ask or ignore)? [ask] follow

If you have already run cpan before, you can configure the policy as below:

cpan> o conf prerequisites_policy follow
cpan> o conf commit

Now let's install Module::Build:

cpan> install Module::Build

After the installation, let's run "perl Build.PL" again:

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL
Checking prerequisites...
  requires:
    !  Exception::Class is not installed
    !  Tk is not installed
    !  Try::Tiny is not installed
    !  X11::Protocol is not installed
  build_requires:
    !  CPAN::Changes is not installed
    !  File::Slurp is not installed
    !  File::Which is not installed
    !  Readonly is not installed
    !  Test::Differences is not installed
    !  Test::DistManifest is not installed
    !  Test::PerlTidy is not installed
    !  Test::Pod is not installed
    !  Test::Pod::Coverage is not installed
    !  Test::Trap is not installed

ERRORS/WARNINGS FOUND IN PREREQUISITES.  You may wish to install the versions
of the modules indicated above before proceeding with this installation

Run 'Build installdeps' to install missing prerequisites.

Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'App-ClusterSSH' version '4.03_04'

As the output says, run "./Build installdeps" to install the missing prerequisites. Make sure you are in a GUI environment (through vncserver, for example), as the build includes steps that test the GUI.

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build installdeps

......

Running Mkbootstrap for Tk::Xlib ()
chmod 644 "Xlib.bs"
"/usr/bin/perl" "/usr/lib/perl5/5.8.8/ExtUtils/xsubpp" -typemap "/usr/lib/perl5/5.8.8/ExtUtils/typemap" -typemap "/root/.cpan/build/Tk-804.032/Tk/typemap" Xlib.xs > Xlib.xsc && mv Xlib.xsc Xlib.c
make[1]: *** No rule to make target `pTk/tkInt.h', needed by `Xlib.o'. Stop.
make[1]: Leaving directory `/root/.cpan/build/Tk-804.032/Xlib'
make: *** [subdirs] Error 2
/usr/bin/make -- NOT OK
Running make test
Can't test without successful make
Running make install
make had returned bad status, install seems impossible

Errors again; we can see it is complaining about something Tk-related. To resolve this, I manually installed the latest perl-tk module as below:

wget --no-check-certificate 'https://github.com/eserte/perl-tk/archive/master.zip'
unzip master
cd perl-tk-master
perl Makefile.PL
make
make install

After this, let's run "./Build installdeps" and "perl Build.PL" again, and this time both went through fine:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build installdeps

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL

And let's run ./Build now:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build
Building App-ClusterSSH
Generating: ccon
Generating: crsh
Generating: cssh
Generating: ctel

And now "./Build install" which is the last step:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build install

After the installation, let's test it:

[root@centos-32bits App-ClusterSSH-4.03_04]# echo 'svr testserver1 testserver2' > /etc/clusters

Now run 'cssh svr', and it works like a charm!

[screenshot: clusterssh]
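
The /etc/clusters file can hold more than one tag, one cluster per line in the form "tag host1 host2 ...", so you can define several groups and open whichever you need (a sketch with hypothetical hostnames):

cat > /etc/clusters <<'EOF'
web web1.example.com web2.example.com
db db1.example.com db2.example.com
EOF
cssh web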

PS: 

If you meet an error like the one below:

Can't connect to display `unix:1': No such file or directory at /usr/local/share/perl5/X11/Protocol.pm line 2264.

And you are connecting to a vnc session like the one below:

root 3291 1 0 07:36 ? 00:00:02 /usr/bin/Xvnc :1 -desktop Yue-test:1 (root) -auth /root/.Xauthority -geometry 1600x900 -rfbwait 30000 -rfbauth /root/.vnc/passwd -rfbport 5901 -fp catalogue:/etc/X11/fontpath.d -pn

Then make sure to set DISPLAY accordingly:

export DISPLAY=localhost:1.0

Categories: Clouding, IT Architecture, Linux, Systems, Unix Tags:

resolved – error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm

December 17th, 2014

Today, when I tried using curl to fetch a URL, an error occurred like the one below:

[root@centos-doxer ~]# curl -i --user username:password -H "Content-Type: application/json" -X POST --data @/u01/shared/addcredential.json https://testserver.example.com/actions -v

* About to connect() to testserver.example.com port 443

*   Trying 10.242.11.201... connected

* Connected to testserver.example.com (10.242.11.201) port 443

* successfully set certificate verify locations:

*   CAfile: /etc/pki/tls/certs/ca-bundle.crt

  CApath: none

* SSLv2, Client hello (1):

SSLv3, TLS handshake, Server hello (2):

SSLv3, TLS handshake, CERT (11):

SSLv3, TLS alert, Server hello (2):

error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm

* Closing connection #0

After some searching, I found that it was caused by the installed version of openssl (openssl-0.9.8e) not supporting the SHA256 signature algorithm. To resolve this, there are a few options:

1. Add the -k parameter to curl to ignore the SSL certificate verification error.

2. Update openssl to "OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008" by running "yum update openssl":

# openssl version
OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008

3. Upgrade openssl from source to at least openssl-0.9.8o (this may not be needed; try method 2 above first). Here is the way to upgrade:

wget --no-check-certificate 'https://www.openssl.org/source/old/0.9.x/openssl-0.9.8o.tar.gz'
tar zxvf openssl-0.9.8o.tar.gz
cd openssl-0.9.8o
./config --prefix=/usr --openssldir=/usr/openssl
make
make test
make install

After this, run openssl version to confirm:

[root@centos-doxer openssl-0.9.8o]# /usr/bin/openssl version
OpenSSL 0.9.8o 01 Jun 2010
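
To confirm that the server certificate is indeed signed with SHA-256 (which is what the old openssl could not verify), you can pull the certificate and inspect its signature algorithm (a sketch; the hostname is the one from this post):

echo | openssl s_client -connect testserver.example.com:443 2>/dev/null | openssl x509 -noout -text | grep 'Signature Algorithm'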

PS:

If openssl was originally installed from an rpm package, you'll find that rpm still reports the old version even after you install the new build from source, although the binary itself is the new one. This is expected, so don't rely too much on rpm here:

[root@centos-doxer openssl-0.9.8o]# /usr/bin/openssl version
OpenSSL 0.9.8o 01 Jun 2010

Even after rebuilding the rpm DB (rpm --rebuilddb), rpm still reports the old version:

[root@centos-doxer openssl-0.9.8o]# rpm -qf /usr/bin/openssl
openssl-0.9.8e-26.el5_9.1
openssl-0.9.8e-26.el5_9.1

[root@centos-doxer openssl-0.9.8o]# rpm -qa|grep openssl
openssl-0.9.8e-26.el5_9.1
openssl-devel-0.9.8e-26.el5_9.1
openssl-0.9.8e-26.el5_9.1
openssl-devel-0.9.8e-26.el5_9.1