Monday, March 09, 2009

SQL*Plus 10.2.0.1 Hangs, When System Uptime Is Long Period of Time


Today a colleague asked me why he couldn't use "sqlplus" (Oracle client) on his application server to connect to my database, even though "tnsping" worked and he had been able to connect before.


I logged in to the server remotely and found that "sqlplus" hung and was using almost all of the CPU:

$ ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
oracle   12722 96.4  0.2 19560 4600 pts/5    R    15:36   0:06 sqlplus

So I searched Metalink and found note 338461.1:
SQL*Plus 10.2.0.1 Hangs, When System Uptime Is Long Period of Time (Linux x86)

I used "strace" to see what sqlplus was doing:

$ strace sqlplus -V 2>&1 |less

execve("/oracle/10.2.0/client/bin/sqlplus", ["sqlplus", "-V"], [/* 31 vars */]) = 0
uname({sys="Linux", node="host01", ...})  = 0
brk(0)                                  = 0x804a000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
.
.
.
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
times(NULL)                             = -2138395754
sqlplus is looping on the times() function, which keeps returning the same negative value.
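
To confirm the loop without paging through the output, the trace can be saved to a file and the repeated calls counted. This is only a minimal sketch: the function name count_times_calls and the trace file path are made up for illustration, and it assumes the trace was written with strace's -o option.

```shell
# Capture a trace first, and interrupt with Ctrl-C once the output
# starts repeating (the file name here is hypothetical):
#   strace -o /tmp/sqlplus.trace sqlplus -V

# Print how many lines of a saved strace log are times() calls.
count_times_calls() {
    grep -c '^times(' "$1"
}
```

Running count_times_calls /tmp/sqlplus.trace on a trace like the one above should print a large number that keeps growing the longer the trace runs.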


There have been cases where the problem occurs when uptime reaches 60 days, and others where it takes as long as 248 days.
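
The 248-day figure is consistent with a signed 32-bit tick counter running at 100 ticks per second, which would also fit the negative times() value in the trace. That is my own inference, not something quoted from the Metalink note, and it does not explain the 60-day cases. A small sketch of the arithmetic, plus a check of the current uptime; the function names are made up, and the 100 Hz tick rate is an assumption:

```shell
# Whole days until a signed 32-bit tick counter (2^31 ticks) wraps,
# for a tick rate given in Hz. 100 Hz is an assumption, not a
# value taken from the Metalink note.
wrap_days() {
    echo $(( 2147483648 / $1 / 86400 ))
}

# Whole days of uptime, read from /proc/uptime (Linux), or from
# another file in the same format if a path is given.
uptime_days() {
    awk '{ printf "%d\n", $1 / 86400 }' "${1:-/proc/uptime}"
}
```

For example, wrap_days 100 prints 248. Comparing that with the output of uptime_days gives a rough idea of whether a server is near the uptime mentioned in the patch description.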

In addition to sqlplus, it has been reported that the netca and dbca tools also hang.


Solution from Metalink:
Choose one of the following two solutions:

1) Apply the one-off patch available for 10.2.0.1.
   a. Download the one-off patch from Metalink:
       Patch:       4612267
       Description: OCI CLIENT IS IN AN INFINITE LOOP WHEN MACHINE UPTIME HITS 248 DAYS
       Product:     CORE
       Release:     Oracle 10.2.0.1

   b. To apply the patch on an Instant Client install, follow the instructions documented in the OCI manual,
        under "Patching Instant Client Shared Libraries on Linux or UNIX".

2) Apply patchset 10.2.0.2 or higher.
      According to Bug 4612267, this bug is fixed in version 11 and backported to the 10.2.0.2 patchset.
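
Since "sqlplus -V" itself hangs on an affected machine, the client version has to come from somewhere else (install records, for example). Once you have it, a quick comparison against 10.2.0.2 can be sketched like this. The function name version_ge is made up, and the sketch assumes GNU sort, which provides the -V (version sort) option:

```shell
# Succeeds when dotted version $1 is greater than or equal to $2.
# Relies on GNU sort's -V option (an assumption about the system).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With this, "version_ge 10.2.0.4 10.2.0.2" succeeds, while "version_ge 10.2.0.1 10.2.0.2" fails, which matches the affected release named in the note.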

2 comments:

leorick said...

Thanks a lot, luckily I found your post. At first I suspected the database was having a problem, but I had no problem using sqlplus from other servers, including my local PC.
All our servers have been patched to 10.2.0.4 finally.
Thanks again.

Surachart Opun said...

This is an Oracle bug. I hope you'll patch your database to avoid the problem.

Good Luck