Resolved: [Load Manager Shared Memory]. Error is [28]: [No space left on device] (for apache, pmserver etc. running on Linux, Solaris, Unix)
This error can occur in pmserver, apache, oracle, rsync, up2date and many other services running on Linux, Solaris and other Unix systems, so it is a common question, as a Google search for "[Load Manager Shared Memory]. Error is [28]: [No space left on device]" will show.
Let's take pmserver running on Solaris 10 as an example and work through the problem step by step.
From "[No space left on device]" and "Load Manager Shared Memory", our first guess was a shortage of memory, but checking showed that there was plenty of memory to allocate:
1. Check the total memory size:
# /usr/sbin/prtconf |grep -i mem
Memory size: 32640 Megabytes
memory (driver not attached)
virtual-memory (driver not attached)
2. Check the application's project memory size:
# su - sbintprd    # pmserver runs as user sbintprd on this box
$ id -p
uid=71269(sbintprd) gid=70772(sbintprd) projid=3(default)
This means that pmserver runs inside the 'default' project. Let's check the settings of the 'default' project:
# projects -l default
default
projid : 3
comment: ""
users : (none)
groups : (none)
attribs: project.max-msg-ids=(privileged,256,deny)
project.max-shm-memory=(privileged,17179869184,deny)
# prctl -n project.max-shm-memory -i project default
project: 3: default
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
project.max-shm-memory
privileged 16.0GB - deny -
system 16.0EB max deny -
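As a quick sanity check that the two commands agree, we can convert the byte value from "projects -l" into gigabytes. This is only a minimal sketch: the helper name bytes_to_gib is ours, and it assumes a POSIX shell with 64-bit integer arithmetic (bash, for example), which the old Solaris /bin/sh does not provide.

```shell
# Convert a byte count to whole gigabytes (1 GB = 1024^3 bytes here)
# using shell integer arithmetic.
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Value of project.max-shm-memory reported by "projects -l default"
bytes_to_gib 17179869184
```

The result matches the 16.0GB that prctl reports for the privileged limit.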
16 GB is available to the 'default' project, so why would memory be short?
Let's raise max-shm-memory by 2 GB, to 18 GB, and see what happens:
# prctl -n project.max-shm-memory -r -v 18gb -i project default
After this we restarted pmserver, but the problem was still there:
# tail -f pmserver.log
INFO : LM_36070 [Fri Apr 22 22:19:42 2011] : (25218|1) The server is running on a host with 32 logical processors.
INFO : LM_36039 [Fri Apr 22 22:19:42 2011] : (25218|1) The maximum number of sessions that can run simultaneously is [10].
FATAL ERROR : CMN_1011 [Fri Apr 22 22:19:42 2011] : (25218|1) Error allocating system shared memory of [2000000] bytes for [Load Manager Shared Memory]. Error is [28]: [No space left on device]
FATAL ERROR : SF_34004 [Fri Apr 22 22:19:42 2011] : (25218|1) Server initialization failed.
INFO : SF_34014 [Fri Apr 22 22:19:42 2011] : (25218|1) Server shut down.
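To read the failure out of the log more directly, here is a small sketch that pulls the requested size and the error number from CMN_1011 lines. The function name shm_alloc_failures is ours, and the pattern is based only on the one log line shown above, so other pmserver versions may word the message differently.

```shell
# Print the requested size and the error number for every CMN_1011
# allocation failure found on stdin.
shm_alloc_failures() {
  sed -n 's/.*CMN_1011.*shared memory of \[\([0-9]*\)\] bytes.*Error is \[\([0-9]*\)\].*/bytes=\1 errno=\2/p'
}

# Demo on the log line quoted above; on the host the input would be
# the log file itself: shm_alloc_failures < pmserver.log
shm_alloc_failures <<'EOF'
FATAL ERROR : CMN_1011 [Fri Apr 22 22:19:42 2011] : (25218|1) Error allocating system shared memory of [2000000] bytes for [Load Manager Shared Memory]. Error is [28]: [No space left on device]
EOF
```

The request is only 2000000 bytes, about 2 MB, which is tiny next to a 16 GB limit, so the size of the allocation is unlikely to be the real problem.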
OK, so we need to look elsewhere.
As we know, Unix-like systems such as Solaris and Linux let processes share memory through System V shared memory segments. We can use ipcs to list the active segments:
# ipcs -m | grep sbintprd
m 671088691 0 --rw------- sbintprd sbintprd
NOTE: pmserver runs as user sbintprd on this box.
Then count them:
# ipcs -mA | grep sbintprd | wc -l
92
Each of them is 2000000 bytes in size:
IPC status from <running system> as of Sat Apr 23 03:29:51 BST 2011
T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME ISMATTCH PROJECT
Shared Memory:
m 671088691 0 --rw------- sbintprd sbintprd sbintprd sbintprd 1 2000000 7781 16109 3:28:35 3:28:50 2:01:43 0 default
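Counting and adding up the segments by hand gets tedious, so here is a rough sketch that does both for one owner. The function name shm_summary is ours, and the field positions (OWNER in field 5, SEGSZ in field 10) are taken from the "ipcs -mA" header shown above, so they are an assumption for any other ipcs variant. On Solaris 10 the default awk may not accept -v, in which case nawk or /usr/xpg4/bin/awk would be needed instead.

```shell
# Summarize the shared memory segments owned by one user.
# Reads "ipcs -mA"-style output on stdin. Field positions follow the
# header shown above: OWNER is field 5, SEGSZ is field 10.
shm_summary() {
  awk -v owner="$1" '
    $1 == "m" && $5 == owner { n++; total += $10 }
    END { printf "%d segments, %d bytes\n", n, total }
  '
}

# Demo on the one line quoted above; on the host itself the input
# would come from: ipcs -mA | shm_summary sbintprd
shm_summary sbintprd <<'EOF'
m 671088691 0 --rw------- sbintprd sbintprd sbintprd sbintprd 1 2000000 7781 16109 3:28:35 3:28:50 2:01:43 0 default
EOF
```

If all 92 segments are 2000000 bytes each, the total is 184000000 bytes, far below the 16 GB limit, which suggests that the number of segments matters here rather than their combined size.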
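Now we can conclude that the sbintprd user has allocated too many shared memory segments and is not releasing them. That would also fit the misleading error text: for a shared memory allocation, error 28 (ENOSPC) is documented as the limit on the number of shared memory identifiers being reached, not as memory running out.
Let's clear the shared memory segments owned by that user:
# for i in `ipcs -m | grep sbintprd | awk '{print $2}'`; do ipcrm -m $i; done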
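Removing every matching segment in one go is a blunt tool, so a more cautious variant is sketched here: it only prints the ipcrm commands, and only for segments that no process is attached to. The function name stale_shm_cmds and the NATTCH filter are our own assumptions, not part of the procedure we actually ran, and the field positions (OWNER in field 5, NATTCH in field 9) again come from the "ipcs -mA" header shown earlier.

```shell
# Print (but do not run) an ipcrm command for each segment that belongs
# to the given owner and has no attached processes (NATTCH = 0).
# Reads "ipcs -mA"-style output on stdin.
stale_shm_cmds() {
  awk -v owner="$1" '
    $1 == "m" && $5 == owner && $9 == 0 { print "ipcrm -m " $2 }
  '
}

# Demo with two made-up segment IDs (1001 is attached, 1002 is not);
# on the host the input would come from: ipcs -mA | stale_shm_cmds sbintprd
stale_shm_cmds sbintprd <<'EOF'
m 1001 0 --rw------- sbintprd sbintprd sbintprd sbintprd 1 2000000 7781 16109 3:28:35 3:28:50 2:01:43 0 default
m 1002 0 --rw------- sbintprd sbintprd sbintprd sbintprd 0 2000000 7781 16109 3:28:35 3:28:50 2:01:43 0 default
EOF
```

After reviewing the printed list, it can be piped to sh to actually remove the segments. Whether an unattached segment is really safe to remove depends on the application, so this is a starting point rather than a rule.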
After this step, the pmserver started successfully. From the log we can see:
INFO : LM_36070 [Sat Apr 23 01:51:17 2011] : (5979|1) The server is running on a host with 32 logical processors.
INFO : LM_36039 [Sat Apr 23 01:51:18 2011] : (5979|1) The maximum number of sessions that can run simultaneously is [10].
INFO : CMN_1010 [Sat Apr 23 01:51:18 2011] : (5979|1) Allocated system shared memory [id = 469762275] of [2000000] bytes for [Load Manager Shared Memory].
INFO : LM_36095 [Sat Apr 23 01:51:50 2011] : (5979|1) Persistent session cache file cleanup is scheduled to run on [Sun Apr 24 01:51:50 2011].
INFO : SF_34003 [Sat Apr 23 01:51:50 2011] : (5979|1) Server initialization completed.
Problem resolved!
