Linux tips – Performance and Troubleshooting

April 10th, 2014

System CPU

top
procinfo #yum install procinfo
gnome-system-monitor #can also see network flow rate
mpstat
sar

System Memory

top
free
slabtop
sar
/proc/meminfo #provides the most complete view of system memory usage
procinfo
gnome-system-monitor #can also see network flow rate

Process-specific CPU

time
strace #traces the system calls that a program makes while executing
ltrace #traces the calls (functions) that an application makes to libraries rather than to the kernel. Use ldd to display which libraries are used, and objdump to search each of those libraries for a given function.
ps
ld.so #ld

Process-specific Memory

ps
/proc/<pid> #you can refer to http://www.doxer.org/proc-filesystem-day-1/ for more info.

/proc/<PID>/status #provides information about the status of a given process PID
/proc/<PID>/maps #how the process’s virtual address space is used

ipcs #more info on http://www.doxer.org/resolved-semget-failed-with-status-28-failed-oracle-database-starting-up/ and http://www.doxer.org/resolvedload-manager-shared-memory-error-is-28-no-space-left-on-devicefor-apache-pmserver-etc-running-on-linux-solaris-unix/

Disk I/O

vmstat #the disk modes (-d, -p) provide totals since boot rather than the rate of change during the sample
sar
lsof
time sh -c "dd if=/dev/zero of=System2.img bs=1M count=10240 && sync" #10G
time dd if=ddfile of=/dev/null bs=8k
dd if=/dev/zero of=vm1disk bs=1M seek=10240 count=0 #10G sparse file (count=0 writes no data)

Network

ethtool
ifconfig
ip
iptraf
gkrellm
netstat
gnome-system-monitor #can also see network flow rate
sar #network statistics
/etc/cron.d/sysstat #/var/log/sa/

General Ideas & options & outputs

Run Queue Statistics
In Linux, a process can be either runnable or blocked waiting for an event to complete.

A blocked process may be waiting for data from an I/O device or the results of a system call.

When these processes are runnable, but waiting to use the processor, they form a line called the run queue.
The load on a system is the total number of running and runnable processes.

Context Switches
To create the illusion that a given single processor runs multiple tasks simultaneously, the Linux kernel constantly switches between different processes.
The switch between different processes is called a context switch.
To guarantee that each process receives a fair share of processor time, the kernel periodically interrupts the running process and, if appropriate, the kernel scheduler decides to start another process rather than let the current process continue executing. It is possible that your system will context switch every time this periodic interrupt or timer occurs. (To see how often the timer fires, run cat /proc/interrupts | grep timer twice, e.g. 10 seconds apart, and compare the counts.)

Interrupts
In addition, the processor periodically receives interrupts from hardware devices.
/proc/interrupts can be examined to show which interrupts are firing on which CPUs
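
The "sample twice and compare" idea can be sketched in a few lines of Python. This is only an illustration: the two samples below are made-up text in the /proc/interrupts layout, and the helper names parse_interrupts and interrupt_rates are hypothetical, not part of any tool.

```python
from itertools import takewhile

def parse_interrupts(text):
    """Sum each IRQ's per-CPU counters from /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:ncpus]
        # Counters come first; stop at the first non-numeric field.
        counts = [int(f) for f in takewhile(str.isdigit, fields)]
        totals[label.strip()] = sum(counts)
    return totals

def interrupt_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both samples."""
    a, b = parse_interrupts(before), parse_interrupts(after)
    return {irq: (b[irq] - a[irq]) / seconds for irq in b if irq in a}

# Hypothetical samples taken 10 seconds apart (not real measurements).
SAMPLE_A = """
           CPU0       CPU1
  0:      10000          0   IO-APIC-edge      timer
  1:         50         10   IO-APIC-edge      i8042
"""
SAMPLE_B = """
           CPU0       CPU1
  0:      20000          0   IO-APIC-edge      timer
  1:         55         15   IO-APIC-edge      i8042
"""

rates = interrupt_rates(SAMPLE_A, SAMPLE_B, 10)
for irq, rate in rates.items():
    print(f"IRQ {irq}: {rate:.1f} interrupts/s")
```

On a real system the two strings would come from reading /proc/interrupts at two points in time.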

CPU Utilization
At any given time, the CPU can be doing one of seven things:
Idle
Running user code #user time
System time #executing code in the Linux kernel on behalf of the application code
Executing user code that has been “nice”ed or set to run at a lower priority than normal processes
iowait #waiting for I/O (such as disk or network) to complete
irq #means it is in high-priority kernel code handling a hardware interrupt
softirq #executing kernel code that was also triggered by an interrupt, but it is running at a lower priority
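
As a rough sketch of how tools turn these seven states into percentages: the kernel exposes tick counters per state (the first "cpu" line of /proc/stat), and the share of each state is its tick delta divided by the total delta between two samples. The sample lines below are invented for illustration, and the function names are hypothetical.

```python
# Order of the first seven counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")

def parse_cpu_line(line):
    """Map the first seven tick counters of a 'cpu ...' line to names."""
    parts = line.split()
    return dict(zip(FIELDS, (int(v) for v in parts[1:8])))

def utilization(before, after):
    """Percentage of time spent in each state between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}

# Hypothetical samples (made-up numbers, not from a real machine).
first = parse_cpu_line("cpu 100 0 50 800 50 0 0")
second = parse_cpu_line("cpu 200 0 100 1600 100 0 0")

pct = utilization(first, second)
for state in FIELDS:
    print(f"{state:8s} {pct[state]:5.1f}%")
```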


Buffers and cache
If your system has much more physical memory than your applications require, Linux caches recently used files in physical memory so that subsequent accesses to those files do not require an access to the hard drive. This can greatly speed up applications that access the hard drive frequently, and is especially useful for frequently launched applications. The first time the application is launched, it must be read from the disk; if it remains in the cache, later launches read it from the much quicker physical memory instead. This disk cache is different from the processor cache. Other than oprofile, valgrind, and kcachegrind, most tools that report statistics about “cache” are actually referring to disk cache.

In addition to cache, Linux also uses extra memory as buffers. To further optimize applications, Linux sets aside memory to use for data that needs to be written to disk. These set-asides are called buffers. If an application has to write something to the disk, which would usually take a long time, Linux lets the application continue immediately but saves the file data into a memory buffer. At some point in the future, the buffer is flushed to disk, but the application can continue immediately.
Active Versus Inactive Memory
Active memory is currently being used by a process. Inactive memory is memory that is allocated but has not been used for a while. Nothing is essentially different between the two types of memory. When required, the Linux kernel takes a process’s least recently used memory pages and moves them from the active to the inactive list. When choosing which memory will be swapped to disk, the kernel chooses from the inactive memory list.
Kernel Usage of Memory (Slabs)
In addition to the memory that applications allocate, the Linux kernel consumes a certain amount for bookkeeping purposes. This bookkeeping includes, for example, keeping track of data arriving from network and disk I/O devices, as well as keeping track of which processes are running and which are sleeping. To manage this bookkeeping, the kernel has a series of caches that contains one or more slabs of memory. Each slab consists of a set of one or more objects. The amount of slab memory consumed by the kernel depends on which parts of the Linux kernel are being used, and can change as the type of load on the machine changes.

slabtop

slabtop shows in real-time how the kernel is allocating its various caches and how full they are. Internally, the kernel has a series of caches that are made up of one or more slabs. Each slab consists of a set of one or more objects. These objects can be active (or used) or inactive (unused). slabtop shows you the status of the different slabs. It shows you how full they are and how much memory they are using.


time

time measures three types of time. First, it measures the real or elapsed time, which is the amount of time between when the program started and finished execution. Next, it measures the user time, which is the amount of time that the CPU spent executing application code on behalf of the program. Finally, time measures system time, which is the amount of time the CPU spent executing system or kernel code on behalf of the application.
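
The same three quantities can be observed from inside a program. The sketch below uses Python's os.times() and time.perf_counter(); the measure and busy helpers are hypothetical names for illustration. Unlike the time command, it only counts the current process, not any children.

```python
import os
import time

def measure(func, *args):
    """Return real, user, and system time (seconds) spent running func."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    func(*args)
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "real": wall_end - wall_start,             # elapsed wall-clock time
        "user": cpu_end.user - cpu_start.user,     # CPU time in application code
        "sys": cpu_end.system - cpu_start.system,  # CPU time in kernel code
    }

def busy(n):
    """CPU-bound loop, so nearly all of the time should be user time."""
    total = 0
    for i in range(n):
        total += i * i
    return total

result = measure(busy, 200_000)
print("real %(real).3fs  user %(user).3fs  sys %(sys).3fs" % result)
```

If real is much larger than user plus sys, the program spent most of its time waiting (for I/O, sleep, or other processes) rather than executing.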


Disk I/O

When an application does a read or write, the Linux kernel may have a copy of the file stored into its cache or buffers and returns the requested information without ever accessing the disk. If the Linux kernel does not have a copy of the data stored in memory, however, it adds a request to the disk’s I/O queue. If the Linux kernel notices that multiple requests are asking for contiguous locations on the disk, it merges them into a single big request. This merging increases overall disk performance by eliminating the seek time for the second request. When the request has been placed in the disk queue, if the disk is not currently busy, it starts to service the I/O request. If the disk is busy, the request waits in the queue until the drive is available, and then it is serviced.
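
The merging step can be illustrated with a deliberately simplified sketch. This is not the kernel's actual I/O scheduler, which also deals with ordering, request size limits, and fairness; it only shows how requests for adjacent sector ranges collapse into one. The (start_sector, sector_count) tuples and the function name are assumptions made for the example.

```python
def merge_requests(requests):
    """Merge requests whose sector ranges are exactly contiguous.

    requests: list of (start_sector, sector_count) tuples.
    """
    merged = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins where the previous one ends: extend it.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged

# Four hypothetical requests; three of them touch adjacent sectors.
queue = [(100, 8), (300, 8), (108, 8), (116, 16)]
print(merge_requests(queue))
```

Here four queued requests become two, so the disk performs one seek for sectors 100 through 131 instead of three.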

iostat

iostat provides a per-device and per-partition breakdown of how many blocks are read from and written to a particular disk. (Blocks in iostat are usually sized at 512 bytes.)

lsof
lsof can prove helpful when narrowing down which applications are generating I/O


 top output

S (or STAT) – This is the current status of a process, where the process is either sleeping (S), running (R), a zombie (Z: terminated but not yet reaped by its parent), in an uninterruptible sleep (D), or stopped or being traced (T).

TIME – The total amount of CPU time (user and system) that this process has used since it started executing.

top options

-b Run in batch mode. Typically, top shows only a single screenful of information, and processes that don’t fit on the screen never display. This option shows all the processes and can be very useful if you are saving top’s output to a file or piping the output to another command for processing.

I This toggles whether top will divide the CPU usage by the number of CPUs on the system. For example, if a process was consuming all of both CPUs on a two-CPU system, this toggles whether top displays a CPU usage of 100% or 200%.

1 (numeral 1) This toggles whether the CPU usage will be broken down to the individual usage or shown as a total.

mpstat options

-P { cpu | ALL } This option tells mpstat which CPUs to monitor. cpu is the number between 0 and the total CPUs minus 1.

The biggest benefit of mpstat is that it shows the time next to the statistics, so you can look for a correlation between CPU usage and time of day.

mpstat can be used to determine whether the CPUs are fully utilized and relatively balanced. By observing the number of interrupts each CPU is handling, it is possible to find an imbalance.

 sar options

-I {irq | SUM | ALL | XALL} This reports the rates that interrupts have been occurring in the system.
-P {cpu | ALL} This option specifies which CPU the statistics should be gathered from. If this isn’t specified, the system totals are reported.
-q This reports information about the run queues and load averages of the machine.
-u This reports information about CPU utilization of the system. (This is the default output.)
-w This reports the number of context switches that occurred in the system.
-o filename This specifies the name of the binary output file that will store the performance statistics.
-f filename This specifies the name of a previously saved statistics file (created with -o) to read from.

-B – This reports paging statistics: the number of blocks that the kernel paged in from and out to disk. In addition, for kernel versions after v2.5, it reports information about the number of page faults.
-W – This reports the number of pages of swap that are brought in and out of the system.
-r – This reports information about the memory being used in the system. It includes information about the total free memory, swap, cache, and buffers being used.
-R Report memory statistics

-d –  reports disk activities

-n DEV – Shows statistics about the number of packets and bytes sent and received by each device.
-n EDEV – Shows information about the transmit and receive errors for each device.
-n SOCK – Shows information about the total number of sockets (TCP, UDP, and RAW) in use.
-n ALL – Shows all the network statistics.

sar output

runq-sz This is the size of the run queue when the sample was taken.
plist-sz This is the number of processes present (running, sleeping, or waiting for I/O) when the sample was taken.
proc/s This is the number of new processes created per second. (This is the same as the forks statistic from vmstat.)

tps – Transfers per second. This is the number of reads and writes to the drive/partition per second.
rd_sec/s – Number of disk sectors read per second.
wr_sec/s – Number of disk sectors written per second.


vmstat options

-n print header info only once

-a This changes the default output of memory statistics to indicate the active/inactive amount of memory rather than information about buffer and cache usage.
-s (procps 3.2 or greater) This prints out the vm table. This is a grab bag of different statistics about the system since it has booted. It cannot be run in sample mode. It contains both memory and CPU statistics.

-d – This option displays individual disk statistics at a rate of one sample per interval. The statistics are the totals since system boot, rather than just those that occurred between this sample and the previous sample.
-p partition – This displays performance statistics about the given partition at a rate of one sample per interval. The statistics are the totals since system boot, rather than just those that occurred between this sample and the previous sample.

vmstat output
si – The rate of memory (in KB/s) that has been swapped in from disk during the last sample.
so – The rate of memory (in KB/s) that has been swapped out to disk during the last sample.
pages paged in – The amount of memory (in pages) read from the disk(s) into the system buffers. (On most IA32 systems, a page is 4KB.)
pages paged out – The amount of memory (in pages) written to the disk(s) from the system cache. (On most IA32 systems, a page is 4KB.)
pages swapped in – The amount of memory (in pages) read from swap into system memory.
pages swapped out – The amount of memory (in pages) written from system memory to the swap.

bo – This indicates the number of total blocks written to disk in the previous interval. (In vmstat, block size for a disk is typically 1,024 bytes.)
bi – This shows the number of blocks read from the disk in the previous interval. (In vmstat, block size for a disk is typically 1,024 bytes.)
wa – This indicates the percentage of CPU time spent waiting for I/O to complete.
reads: ms – The amount of time (in ms) spent reading from the disk.
writes: ms – The amount of time (in ms) spent writing to the disk.
IO: cur – The total number of I/O requests that are currently in progress. Note that some versions of vmstat have a bug in which this is incorrectly divided by 1,000, which almost always yields a 0.
IO: s – This is the number of seconds spent waiting for I/O to complete.

iostat options
-d – This displays only information about disk I/O rather than the default display, which includes information about CPU usage as well.
-k – This shows statistics in kilobytes rather than blocks.
-x – This shows extended-performance I/O statistics.
device – If a device is specified, iostat shows only information about that device.

iostat output
tps – Transfers per second. This is the number of reads and writes to the drive/partition per second.
Blk_read/s – The rate of disk blocks read per second.
Blk_wrtn/s – The rate of disk blocks written per second.
Blk_read – The total number of blocks read during the interval.
Blk_wrtn – The total number of blocks written during the interval.
rrqm/s – The number of reads merged before they were issued to the disk.
wrqm/s – The number of writes merged before they were issued to the disk.
r/s – The number of reads issued to the disk per second.
w/s – The number of writes issued to the disk per second.
rsec/s – Disk sectors read per second.
wsec/s – Disk sectors written per second.
avgrq-sz – The average size (in sectors) of disk requests.
avgqu-sz – The average size of the disk request queue.
await – The average time (in ms) for a request to be completely serviced. This average includes the time that the request was waiting in the disk’s queue plus the amount of time it was serviced by the disk.
svctm – The average service time (in ms) for requests submitted to the disk. This indicates how long on average the disk took to complete a request. Unlike await, it does not include the amount of time spent waiting in the queue.

lsof options
+D directory – This causes lsof to recursively search all the files in the given directory and report on which processes are using them.
+d directory – This causes lsof to report on which processes are using the files in the given directory.

lsof output
FD – The file descriptor of the file, or txt for an executable, mem for a memory-mapped file.
TYPE – The type of file. REG for a regular file.
DEVICE – Device number in major, minor number.
SIZE – The size of the file.
NODE – The inode of the file.


free options

-s delay – This option causes free to print out new memory statistics every delay seconds.


 strace options

strace -s 200 <program> #trace a new program. Use strace -p <pid> -s 200 to attach to a running process instead. -s 200 raises the maximum string size to print (the default is 32) to 200. Note that filenames are not considered strings and are always printed in full.

-c – This causes strace to print out a summary of statistics rather than an individual list of all the system calls that are made.

ltrace options
-c – This option causes ltrace to print a summary of all the calls after the command has completed.
-S – ltrace traces system calls in addition to library calls, which is identical to the functionality strace provides.
-p pid – This traces the process with the given PID.


ps options
vsz The virtual set size is the amount of virtual memory that the application is using. Because Linux only allocates physical memory when an application tries to use it, this value may be much greater than the amount of physical memory the application is using.
rss The resident set size is the amount of physical memory the application is currently using.
pmem The percentage of the system memory that the process is consuming.
command This is the command name.

/proc/<PID>/status output
VmSize This is the process’s virtual set size, which is the amount of virtual memory that the application is using. Because Linux only allocates physical memory when an application tries to use it, this value may be much greater than the amount of physical memory the application is actually using. This is the same as the vsz parameter provided by ps.
VmLck This is the amount of memory that has been locked by this process. Locked memory cannot be swapped to disk.
VmRSS This is the resident set size or amount of physical memory the application is currently using. This is the same as the rss statistic provided by ps.

ipcs
Because shared memory is used by multiple processes, it cannot be attributed to any particular process. ipcs provides enough information about the state of the system-wide shared memory to determine which processes allocated the shared memory, which processes are using it, and how often they are using it. This information proves useful when trying to reduce shared memory usage.

ipcs options

lsof -u oracle | grep <shmid> #shmid is from the output of ipcs -m. Lists the processes under the oracle user attached to the shared memory segment

-t – This shows the time when the shared memory was created, when a process last attached to it, and when a process last detached from it.
-u – This provides a summary about how much shared memory is being used and whether it has been swapped or is in memory.
-l – This shows the system-wide limits for shared memory usage.
-p – This shows the PIDs of the processes that created and last used the shared memory segments.
-c – This shows the creator and owner of each shared memory segment.


ifconfig output #more on http://www.thegeekscope.com/linux-ifconfig-command-output-explained/

Errors – Frames with errors (possibly because of a bad network cable or duplex mismatch).
Dropped – Frames that were discarded (most likely because of low amounts of memory or buffers).
Overruns – Frames that may have been discarded by the network card because the kernel or network card was overwhelmed with frames. This should not normally happen.
Frame – These frames were dropped as a result of problems on the physical level. This could be the result of cyclic redundancy check (CRC) errors or other low-level problems.
Compressed – Some lower-level interfaces, such as Point-to-Point Protocol (PPP) or Serial Line Internet Protocol (SLIP) devices compress frames before they are sent over the network. This value indicates the number of these compressed frames. (Compressed packets are usually present during SLIP or PPP connections)

carrier – The number of packets discarded because of link media failure (such as a faulty cable)

ip options
-s [-s] link – If the extra -s is provided to ip, it provides a more detailed list of low-level Ethernet statistics.

iptraf options
-d interface – Detailed statistics for an interface including receive, transmit, and error rates
-s interface – Statistics about which IP ports are being used on an interface and how many bytes are flowing through them
-t <minutes> – Number of minutes that iptraf runs before exiting
-z interface – shows packet counts by size on the specified interface

netstat options
-p – Displays the PID/program name responsible for opening each of the displayed sockets
-c – Continually updates the display of information every second
--interfaces=<name> – Displays network statistics for the given interface
--statistics|-s – IP/UDP/ICMP/TCP statistics
--tcp|-t – Shows only information about TCP sockets
--udp|-u – Shows only information about UDP sockets.
--raw|-w – Shows only information about RAW sockets (IP and ICMP)
--listening|-l – Show only listening sockets. (These are omitted by default.)
--all|-a – Show both listening and non-listening (for TCP this means established connections) sockets. With the --interfaces option, show interfaces that are not up.
--numeric|-n – Show numerical addresses instead of trying to determine symbolic host, port or user names.
--extend|-e – Display additional information. Use this option twice for maximum detail.

netstat output

Active Internet connections (w/o servers)
Proto - The protocol (tcp, udp, raw) used by the socket.
Recv-Q - The count of bytes not copied by the user program connected to this socket.
Send-Q - The count of bytes not acknowledged by the remote host.
Local Address - Address and port number of the local end of the socket. Unless the --numeric (-n) option is specified, the socket address is resolved to its canonical host name (FQDN), and the port number is translated into the corresponding service name.
Foreign Address - Address and port number of the remote end of the socket. Analogous to "Local Address."
State - The state of the socket. Since there are no states in raw mode and usually no states used in UDP, this column may be left blank. Normally this can be one of several values: #more on http://www.doxer.org/tcp-flags-explanation-in-details-syn-ack-fin-rst-urg-psh-and-iptables-for-sync-flood/
    ESTABLISHED
        The socket has an established connection.
    SYN_SENT
        The socket is actively attempting to establish a connection.
    SYN_RECV
        A connection request has been received from the network.
    FIN_WAIT1
        The socket is closed, and the connection is shutting down.
    FIN_WAIT2
        Connection is closed, and the socket is waiting for a shutdown from the remote end.
    TIME_WAIT
        The socket is waiting after close to handle packets still in the network.
    CLOSED
        The socket is not being used.
    CLOSE_WAIT
        The remote end has shut down, waiting for the socket to close.
    LAST_ACK
        The remote end has shut down, and the socket is closed. Waiting for acknowledgement.
    LISTEN
        The socket is listening for incoming connections. Such sockets are not included in the output unless you specify the --listening (-l) or --all (-a) option.
    CLOSING
        Both sockets are shut down but we still don't have all our data sent.
    UNKNOWN
        The state of the socket is unknown.
User - The username or the user id (UID) of the owner of the socket.
PID/Program name - Slash-separated pair of the process id (PID) and process name of the process that owns the socket. --program causes this column to be included. You will also need superuser privileges to see this information on sockets you don't own. This identification information is not yet available for IPX sockets.
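
A common first step with this output is to count how many sockets sit in each state, for example to spot a pile-up of TIME_WAIT or CLOSE_WAIT. The sketch below does that for text in the column layout described above; the sample text is made up, and count_tcp_states is a hypothetical helper name.

```python
from collections import Counter

def count_tcp_states(netstat_output):
    """Count TCP sockets per state from netstat-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states

# Hypothetical output in the style of `netstat -tuan` (not a real capture).
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:22             10.0.0.9:51234          ESTABLISHED
tcp        0      0 10.0.0.5:80             10.0.0.7:40112          TIME_WAIT
tcp        0      0 10.0.0.5:80             10.0.0.8:40113          TIME_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""

states = count_tcp_states(SAMPLE)
for state, n in states.most_common():
    print(f"{state:12s} {n}")
```

UDP rows are skipped because, as noted above, their State column is usually blank.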

Example

[ezolt@scrffy ~/edid]$ vmstat 1 | tee /tmp/output
procs -----------memory---------- ---swap-- -----io----  --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo    in    cs  us sy id wa
0  1 201060  35832  26532 324112    0    0     3     2     6     2  5  1  94  0
0  0 201060  35888  26532 324112    0    0    16     0  1138   358  0  0  99  0
0  0 201060  35888  26540 324104    0    0     0    88  1163   371  0  0 100  0

The number of context switches looks good compared to the number of interrupts. The scheduler is switching processes less than the number of timer interrupts that are firing. This is most likely because the system is nearly idle, and most of the time when the timer interrupt fires, the scheduler does not have any work to do, so it does not switch from the idle process.

[ezolt@scrffy manuscript]$ sar -w -c -q 1 2
Linux 2.6.8-1.521smp (scrffy)   10/20/2004

08:23:29 PM    proc/s
08:23:30 PM      0.00

08:23:29 PM   cswch/s
08:23:30 PM    594.00

08:23:29 PM   runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
08:23:30 PM         0       163      1.12       1.17      1.17

08:23:30 PM    proc/s
08:23:31 PM      0.00

08:23:30 PM   cswch/s
08:23:31 PM    812.87

08:23:30 PM   runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
08:23:31 PM         0       163      1.12       1.17      1.17

Average:       proc/s
Average:         0.00

Average:      cswch/s
Average:       703.98

Average:      runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
Average:            0       163      1.12       1.17      1.17

In this case, we ask sar to show us the total number of context switches and process creations that occur every second. We also ask sar for information about the load average. We can see in this example that this machine has 163 processes that are in memory but not running. For the past minute, on average 1.12 processes have been ready to run.

bash-2.05b$ vmstat -a
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in    cs us sy id wa
 2  1 514004   5640 79816 1341208   33   31   204   247 1111  1548  8  5 73 14

The amount of inactive pages indicates how much of the memory could be swapped to disk and how much is currently being used. In this case, we can see that 1310MB of memory is active, and only 78MB is considered inactive. This machine has a large amount of memory, and much of it is being actively used.


bash-2.05b$ vmstat -s

      1552528  total memory
      1546692  used memory
      1410448  active memory
        11100  inactive memory
         5836  free memory
         2676  buffer memory
       645864  swap cache
      2097096  total swap
       526280  used swap
      1570816  free swap
     20293225 non-nice user cpu ticks
     18284715 nice user cpu ticks
     17687435 system cpu ticks
    357314699 idle cpu ticks
     67673539 IO-wait cpu ticks
       352225 IRQ cpu ticks
      4872449 softirq cpu ticks
    495248623 pages paged in
    600129070 pages paged out
     19877382 pages swapped in
     18874460 pages swapped out
   2702803833 interrupts
   3763550322 CPU context switches
   1094067854 boot time
     20158151 forks

It can be helpful to know the system totals when trying to figure out what percentage of the swap and memory is currently being used. Another interesting statistic is the pages paged in, which indicates the total number of pages that were read from the disk. This statistic includes the pages that are read starting an application and those that the application itself may be using.
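
Working out those percentages from the totals is simple arithmetic. The sketch below parses four lines copied from the vmstat -s output above and assumes that exact "value label" layout (some vmstat versions print a unit such as K before the label, which this parser does not handle). The helper names are hypothetical.

```python
def parse_vmstat_s(text):
    """Turn 'value label' lines from vmstat -s into a dict."""
    stats = {}
    for line in text.strip().splitlines():
        value, _, label = line.strip().partition(" ")
        stats[label.strip()] = int(value)
    return stats

def percent_used(stats):
    """Return (memory % used, swap % used)."""
    mem = 100.0 * stats["used memory"] / stats["total memory"]
    swap = 100.0 * stats["used swap"] / stats["total swap"]
    return mem, swap

# Lines taken from the example output above.
SAMPLE = """
      1552528  total memory
      1546692  used memory
      2097096  total swap
       526280  used swap
"""

mem_pct, swap_pct = percent_used(parse_vmstat_s(SAMPLE))
print(f"memory used: {mem_pct:.1f}%  swap used: {swap_pct:.1f}%")
```

For this machine, nearly all physical memory is in use and about a quarter of the swap is occupied.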


[ezolt@wintermute tmp]$ ps -o etime,time,pcpu,cmd 10882
    ELAPSED     TIME %CPU CMD
      00:06 00:00:05 88.0 ./burn

This example shows a test application that is consuming 88 percent of the CPU and has been running for 6 seconds, but has only consumed 5 seconds of CPU time.


[ezolt@wintermute tmp]$ ps -o vsz,rss,tsiz,dsiz,majflt,minflt,cmd 10882
  VSZ   RSS TSIZ  DSIZ MAJFLT MINFLT CMD
11124 10004    1 11122     66   2465 ./burn

The burn application has a very small text size (1KB), but a very large data size (11,122KB). Of the total virtual size (11,124KB), the process has a slightly smaller resident set size (10,004KB), which represents the total amount of physical memory that the process is actually using. In addition, most of the faults generated by burn were minor faults, so most of the memory faults were due to memory allocation rather than loading in a large amount of text or data from the program image on the disk.


[ezolt@wintermute tmp]$ cat /proc/4540/status
Name: burn
State: T (stopped)
Tgid: 4540
Pid: 4540
PPid: 1514
TracerPid: 0
Uid: 501 501 501 501
Gid: 501 501 501 501
FDSize: 256
Groups: 501 9 502
VmSize: 11124 kB
VmLck: 0 kB
VmRSS: 10004 kB
VmData: 9776 kB
VmStk: 8 kB
VmExe: 4 kB
VmLib: 1312 kB
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

The VmLck size of 0KB means that the process has not locked any pages into memory (locking would make them unswappable). The VmRSS size of 10,004KB means that the application is currently using 10,004KB of physical memory, although it has either allocated or mapped the full VmSize of 11,124KB. If the application begins to use the memory that it has allocated but is not currently using, the VmRSS size increases but leaves the VmSize unchanged.
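
A small sketch of pulling these fields out of the status text and computing how much of the virtual size is not resident. The sample lines are copied from the output above; vm_fields is a hypothetical helper name.

```python
def vm_fields(status_text):
    """Extract the Vm* lines of /proc/<PID>/status as {name: kB}."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("Vm"):
            sizes[key] = int(value.split()[0])   # value looks like "11124 kB"
    return sizes

# Lines copied from the example output above.
SAMPLE = """\
Name: burn
VmSize: 11124 kB
VmLck: 0 kB
VmRSS: 10004 kB
"""

vm = vm_fields(SAMPLE)
not_resident = vm["VmSize"] - vm["VmRSS"]
print(f"allocated or mapped but not resident: {not_resident} kB")
```

For burn this difference is 1,120KB.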

[ezolt@wintermute test_app]$ cat /proc/4540/maps
08048000-08049000 r-xp 00000000 21:03 393730 /tmp/burn
08049000-0804a000 rw-p 00000000 21:03 393730 /tmp/burn
0804a000-089d3000 rwxp 00000000 00:00 0
40000000-40015000 r-xp 00000000 21:03 1147263 /lib/ld-2.3.2.so
40015000-40016000 rw-p 00015000 21:03 1147263 /lib/ld-2.3.2.so
4002e000-4002f000 rw-p 00000000 00:00 0
4002f000-40162000 r-xp 00000000 21:03 2031811 /lib/tls/libc-2.3.2.so
40162000-40166000 rw-p 00132000 21:03 2031811 /lib/tls/libc-2.3.2.so
40166000-40168000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0

The burn application is using two libraries: ld and libc. The text section (denoted by the permission r-xp) of libc has a range of 0x4002f000 through 0x40162000, or a size of 0x133000 or 1,257,472 bytes.
The data section (denoted by permission rw-p) of libc has a range of 0x40162000 through 0x40166000, or a size of 0x4000 or 16,384 bytes. The text size of libc is bigger than ld’s text size of 0x15000 or 86,016 bytes. The data size of libc is also bigger than ld’s data size of 0x1000 or 4,096 bytes. libc is the big library that burn is linking in.
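
The size arithmetic can be reproduced mechanically by subtracting the start address from the end address of each mapping. The sketch below does this for four lines copied from the maps output above; mapping_sizes is a hypothetical helper name.

```python
def mapping_sizes(maps_text):
    """Return (path, permissions, size in bytes) for each maps line."""
    rows = []
    for line in maps_text.strip().splitlines():
        fields = line.split()
        start, end = (int(addr, 16) for addr in fields[0].split("-"))
        perms = fields[1]
        path = fields[5] if len(fields) > 5 else "[anonymous]"
        rows.append((path, perms, end - start))
    return rows

# Lines copied from the example output above.
SAMPLE = """
40000000-40015000 r-xp 00000000 21:03 1147263 /lib/ld-2.3.2.so
40015000-40016000 rw-p 00015000 21:03 1147263 /lib/ld-2.3.2.so
4002f000-40162000 r-xp 00000000 21:03 2031811 /lib/tls/libc-2.3.2.so
40162000-40166000 rw-p 00132000 21:03 2031811 /lib/tls/libc-2.3.2.so
"""

sizes = mapping_sizes(SAMPLE)
for path, perms, size in sizes:
    print(f"{size:>9,} bytes  {perms}  {path}")
```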


[ezolt@wintermute tmp]$ ipcs -u

------ Shared Memory Status --------
segments allocated 21
pages allocated 1585
pages resident 720
pages swapped 412
Swap performance: 0 attempts 0 successes

------ Semaphore Status --------
used arrays = 0
allocated semaphores = 0

------ Messages: Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

In this case, we can see that 21 different segments or pieces of shared memory have been allocated. All these segments consume a total of 1,585 pages of memory; 720 of these exist in physical memory and 412 have been swapped to disk.

[ezolt@wintermute tmp]$ ipcs

------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 root 777 49152 1
0x00000000 32769 root 777 16384 1
0x00000000 65538 ezolt 600 393216 2 dest

Here we ask ipcs for a general overview of all the shared memory segments in the system. This indicates who is using each memory segment. In this case, we see a list of all the shared segments. For one in particular, the one with a shared memory ID of 65538, the user (ezolt) is the owner. It has a permission of 600 (a typical UNIX permission), which in this case means that only ezolt can read and write to it. It has 393,216 bytes, and 2 processes are attached to it.

[ezolt@wintermute tmp]$ ipcs -p

------ Shared Memory Creator/Last-op --------
shmid owner cpid lpid
0 root 1224 11954
32769 root 1224 11954
65538 ezolt 1229 11954

Finally, we can figure out exactly which processes created the shared memory segments and which other processes are using them. For the segment with shmid 65538, we can see that the PID 1229 created it and 11954 was the last to use it.


[ezolt@wintermute procps-3.2.0]$ ./vmstat 1 3

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 1 0 197020 81804 29920 0 0 236 25 1017 67 1 1 93 4
1 1 0 172252 106252 29952 0 0 24448 0 1200 395 1 36 0 63
0 0 0 231068 50004 27924 0 0 19712 80 1179 345 1 34 15 49

During one of the samples, the system read 24,448 disk blocks. As mentioned previously, the block size for a disk is 1,024 bytes (or 4,096 bytes); taking 1,024 bytes, this means that the system is reading in data at about 23MB per second. We can also see that during this sample, the CPU was spending a significant portion of time waiting for I/O to complete. The CPU waits on I/O 63 percent of the time during the sample in which the disk was reading at ~23MB per second, and it waits on I/O 49 percent for the next sample, in which the disk was reading at ~19MB per second.
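The throughput figures can be reproduced from the bi column. This is a small sketch that assumes 1,024-byte blocks and the 1-second sampling interval used in the vmstat command above; with 4,096-byte blocks the result would be four times larger:

```python
def blocks_to_mb_per_sec(blocks, block_size=1024, interval_sec=1):
    """Convert a vmstat 'bi' block count for one sample into MB per second."""
    return blocks * block_size / interval_sec / (1024 * 1024)

# 'bi' values from the second and third vmstat samples.
for bi in (24448, 19712):
    print(bi, "blocks ->", round(blocks_to_mb_per_sec(bi), 2), "MB/s")
```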

[ezolt@wintermute procps-3.2.0]$ ./vmstat -D
3 disks
5 partitions
53256 total reads
641233 merged reads
4787741 read sectors
343552 milli reading
14479 writes
17556 merged writes
257208 written sectors
7237771 milli writing
0 inprogress IO
342 milli spent IO

In this example, a large number of the reads issued to the system were merged before they were issued to the device. Although there were ~640,000 merged reads, only ~53,000 read commands were actually issued to the drives. The output also tells us that a total of 4,787,741 sectors have been read from the disk, and that since system boot, 343,552ms (or 344 seconds) were spent reading from the disk. The same statistics are available for write performance.
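A few derived ratios make these totals easier to read. The sketch below only divides the counters shown in the vmstat -D output; the variable names are my own, and the interpretation of each ratio (for example, merged requests per issued read) is an assumption about how the counters relate, not something vmstat reports itself:

```python
# Counters copied from the `vmstat -D` output above.
stats = {
    "total_reads": 53256,
    "merged_reads": 641233,
    "read_sectors": 4787741,
    "milli_reading": 343552,
}

def read_ratios(s):
    """Return merged reads, sectors and milliseconds per issued read."""
    reads = s["total_reads"]
    return {
        "merged_per_read": s["merged_reads"] / reads,
        "sectors_per_read": s["read_sectors"] / reads,
        "ms_per_read": s["milli_reading"] / reads,
    }

for name, value in read_ratios(stats).items():
    print(f"{name}: {value:.2f}")
```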

[ezolt@wintermute procps-3.2.0]$ ./vmstat -p hde3 1 3
hde3 reads read sectors writes requested writes
18999 191986 24701 197608
19059 192466 24795 198360
19161 193282 24795 198360

This shows that 60 reads (19,059 – 18,999) and 94 writes (24,795 – 24,701) were issued to partition hde3 between the first and second samples. This view can prove particularly useful if you are trying to determine which partition of a disk is seeing the most usage.
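Because vmstat -p prints running totals, the per-interval activity is the difference between consecutive rows. A short sketch of that subtraction, using the three rows from the output above (the tuple layout is my own choice):

```python
# Rows from `vmstat -p hde3 1 3`: (reads, read sectors, writes, requested writes)
samples = [
    (18999, 191986, 24701, 197608),
    (19059, 192466, 24795, 198360),
    (19161, 193282, 24795, 198360),
]

def deltas(rows):
    """Return the per-interval differences between consecutive rows."""
    return [tuple(cur - prev for prev, cur in zip(a, b))
            for a, b in zip(rows, rows[1:])]

for reads, rsect, writes, wreq in deltas(samples):
    print(f"reads={reads} read_sectors={rsect} writes={writes} requested_writes={wreq}")
```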


 

[ezolt@localhost sysstat-5.0.2]$ ./iostat -x -dk 1 5 /dev/hda2
Linux 2.4.22-1.2188.nptl (localhost.localdomain) 05/01/2004
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 11.22 44.40 3.15 4.20 115.00 388.97 57.50 194.49
68.52 1.75 237.17 11.47 8.43

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1548.00 0.00 100.00 0.00 13240.00 0.00 6620.00
132.40 55.13 538.60 10.00 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1365.00 0.00 131.00 0.00 11672.00 0.00 5836.00
89.10 53.86 422.44 7.63 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1483.00 0.00 84.00 0.00 12688.00 0.00 6344.00
151.0 39.69 399.52 11.90 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 2067.00 0.00 123.00 0.00 17664.00 0.00 8832.00
143.61 58.59 508.54 8.13 100.00

You can see that the average queue size is pretty high (~40 to 59 requests in the saturated samples) and, as a result, the amount of time that a request must wait (~399.52ms to 538.60ms) is much greater than the amount of time it takes to service the request (7.63ms to 11.90ms). These long wait times, along with the fact that the utilization is 100 percent, show that the disk is completely saturated.
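The gap between wait time and service time can be quantified from the await and svctm columns. This sketch assumes, as iostat documents, that await includes both the time spent queued and the time spent being serviced, so the difference approximates pure queueing time:

```python
# From the four saturated samples: (w/s, avgqu-sz, await in ms, svctm in ms)
samples = [
    (100.00, 55.13, 538.60, 10.00),
    (131.00, 53.86, 422.44, 7.63),
    (84.00, 39.69, 399.52, 11.90),
    (123.00, 58.59, 508.54, 8.13),
]

def queue_breakdown(await_ms, svctm_ms):
    """Return (ms spent queued, share of total wait spent queued)."""
    queued_ms = await_ms - svctm_ms
    return queued_ms, queued_ms / await_ms

for w_s, avgqu, await_ms, svctm_ms in samples:
    queued_ms, share = queue_breakdown(await_ms, svctm_ms)
    print(f"avgqu-sz={avgqu} queued={queued_ms:.2f}ms ({share:.0%} of await)")
```

In every saturated sample, nearly all of a request's latency is spent waiting in the queue rather than on the device.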


[ezolt@wintermute sysstat-5.0.2]$ sar -n SOCK 1 2

Linux 2.4.22-1.2174.nptlsmp (wintermute.phil.org) 06/07/04
21:32:26 totsck tcpsck udpsck rawsck ip-frag
21:32:27 373 118 8 0 0
21:32:28 373 118 8 0 0
Average: 373 118 8 0 0

We can see the total number of open sockets and the TCP, RAW, and UDP sockets. sar also displays the number of fragmented IP packets.

resolved – /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

April 1st, 2014 No comments

When I ran a perl command today, I hit the problem below:

[root@test01 bin]# /usr/local/bin/perl5.8
-bash: /usr/local/bin/perl5.8: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

Now let’s check which package /lib/ld-linux.so.2 belongs to on a good linux box:

[root@test02 ~]# rpm -qf /lib/ld-linux.so.2
glibc-2.5-118.el5_10.2

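The file belongs to glibc, and /lib/ld-linux.so.2 is the 32-bit dynamic loader, so the resolution is to install the glibc packages, including the 32-bit (i686) ones: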

[root@test01 bin]# yum install -y glibc.x86_64 glibc.i686 glibc-devel.i686 glibc-devel.x86_64 glibc-headers.x86_64
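This error typically shows up when a 32-bit binary is run on a 64-bit system that lacks the 32-bit loader. As a rough way to confirm that a binary is 32-bit, the sketch below reads the ELF identification bytes: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The helper names are my own:

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}  # values of the EI_CLASS byte

def elf_class(header):
    """Classify an ELF file from its first bytes."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return ELF_CLASSES.get(header[4], "unknown")

def elf_class_of_file(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

For example, elf_class_of_file("/usr/local/bin/perl5.8") returning "32-bit" on an x86_64 host would explain why the 32-bit glibc package is needed. The file command reports the same information.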

Categories: Kernel, Linux, Systems Tags:

resolved – ssh Read from socket failed: Connection reset by peer and Write failed: Broken pipe

March 13th, 2014 No comments

If you see the following errors when you ssh to a Linux box:

Read from socket failed: Connection reset by peer

Write failed: Broken pipe

Then one possibility is that the Linux box’s filesystem is corrupted. In my case, there was the following output:

EXT3-fs error ext3_lookup: deleted inode referenced

To resolve this, you need to boot Linux into single user mode and run fsck -y <filesystem>. You can get the names of the corrupted filesystems from the boot messages:

[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
/usr contains a file system with errors, check forced.
/usr: Directory inode 378101, block 0, offset 0: directory corrupted

/usr: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)

[/sbin/fsck.ext3 (1) -- /oem] fsck.ext3 -a /dev/xvda5
/oem: recovering journal
/oem: clean, 8253/1048576 files, 202701/1048233 blocks
[/sbin/fsck.ext3 (1) -- /u01] fsck.ext3 -a /dev/xvdb
u01: clean, 36575/14548992 files, 2122736/29081600 blocks
[FAILED]

So in this case, I ran fsck -y /dev/xvda2 && fsck -y /dev/xvda5. I then rebooted the host, and everything went well.
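If the boot messages are long, the filesystems that need a manual fsck can be picked out mechanically. The following is only a sketch that assumes the message layout shown above (a bracketed header naming the mount point and device, followed by an UNEXPECTED INCONSISTENCY line); other distributions may format these messages differently:

```python
import re

# Header such as: [/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
HEADER = re.compile(r"^\[\S+ \(\d+\) -- (\S+)\] fsck\.\w+ .*?(/dev/\S+)\s*$")
# Failure such as: /usr: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
FAILED = re.compile(r"^(\S+): UNEXPECTED INCONSISTENCY")

def filesystems_needing_fsck(boot_log):
    """Map each mount point flagged for manual fsck to its device."""
    devices, failed = {}, []
    for line in boot_log.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            devices[header.group(1)] = header.group(2)
            continue
        failure = FAILED.match(line)
        if failure:
            failed.append(failure.group(1))
    return {mount: devices.get(mount) for mount in failed}

boot_log = """\
[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
/usr contains a file system with errors, check forced.
/usr: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
[/sbin/fsck.ext3 (1) -- /oem] fsck.ext3 -a /dev/xvda5
/oem: recovering journal
"""
print(filesystems_needing_fsck(boot_log))
```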

PS:

If two copies of a VM are booted on two hypervisors and they share the same filesystem (like NFS), then after you fsck -y the filesystem and boot the VM, the filesystem will soon become corrupted again, because the other copy of the VM is still using it. So first make sure that only one copy of the VM is running on the hypervisors of the same server pool.

Categories: Kernel, Linux Tags:

debugging nfs problem with snoop in solaris

December 3rd, 2013 No comments

Network analyzers are ultimately the most useful tools available when it comes to debugging NFS problems. The snoop network analyzer bundled with Solaris was introduced in Section 13.5. This section presents an example of how to use snoop to resolve NFS-related problems.

Consider the case where the NFS client rome attempts to access the contents of the filesystems exported by the server zeus through the /net automounter path:

rome% ls -la /net/zeus/export
total 5
dr-xr-xr-x   3 root     root           3 Jul 31 22:51 .
dr-xr-xr-x   2 root     root           2 Jul 31 22:40 ..
drwxr-xr-x   3 root     other        512 Jul 28 16:48 eng
dr-xr-xr-x   1 root     root           1 Jul 31 22:51 home
rome% ls /net/zeus/export/home
/net/zeus/export/home: Permission denied

 

The client is not able to open the contents of the directory /net/zeus/export/home, although the directory gives read and execute permissions to all users:

rome% df -k /net/zeus/export/home
filesystem            kbytes    used   avail capacity  Mounted on
-hosts                     0       0       0     0%    /net/zeus/export/home

 

The df command shows the -hosts automap mounted on the path of interest. This means that the NFS filesystem zeus:/export/home has not yet been mounted. To investigate the problem further, snoop is invoked while the problematic ls command is rerun:

 rome# snoop -i /tmp/snoop.cap rome zeus
  1   0.00000      rome -> zeus      PORTMAP C GETPORT prog=100003 (NFS) vers=3 
proto=UDP
  2   0.00314      zeus -> rome      PORTMAP R GETPORT port=2049
  3   0.00019      rome -> zeus      NFS C NULL3
  4   0.00110      zeus -> rome      NFS R NULL3 
  5   0.00124      rome -> zeus      PORTMAP C GETPORT prog=100005 (MOUNT) vers=1 
proto=TCP
  6   0.00283      zeus -> rome      PORTMAP R GETPORT port=33168
  7   0.00094      rome -> zeus      TCP D=33168 S=49659 Syn Seq=1331963017 Len=0 
Win=24820 Options=<nop,nop,sackOK,mss 1460>
  8   0.00142      zeus -> rome      TCP D=49659 S=33168 Syn Ack=1331963018 
Seq=4025012052 Len=0 Win=24820 Options=<nop,nop,sackOK,mss 1460>
  9   0.00003      rome -> zeus      TCP D=33168 S=49659     Ack=4025012053 
Seq=1331963018 Len=0 Win=24820
 10   0.00024      rome -> zeus      MOUNT1 C Get export list
 11   0.00073      zeus -> rome      TCP D=49659 S=33168     Ack=1331963062 
Seq=4025012053 Len=0 Win=24776
 12   0.00602      zeus -> rome      MOUNT1 R Get export list 2 entries
 13   0.00003      rome -> zeus      TCP D=33168 S=49659     Ack=4025012173 
Seq=1331963062 Len=0 Win=24820
 14   0.00026      rome -> zeus      TCP D=33168 S=49659 Fin Ack=4025012173 
Seq=1331963062 Len=0 Win=24820
 15   0.00065      zeus -> rome      TCP D=49659 S=33168     Ack=1331963063 
Seq=4025012173 Len=0 Win=24820
 16   0.00079      zeus -> rome      TCP D=49659 S=33168 Fin Ack=1331963063 
Seq=4025012173 Len=0 Win=24820
 17   0.00004      rome -> zeus      TCP D=33168 S=49659     Ack=4025012174 
Seq=1331963063 Len=0 Win=24820
 18   0.00058      rome -> zeus      PORTMAP C GETPORT prog=100005 (MOUNT) vers=3 
proto=UDP
 19   0.00412      zeus -> rome      PORTMAP R GETPORT port=34582
 20   0.00018      rome -> zeus      MOUNT3 C Null
 21   0.00134      zeus -> rome      MOUNT3 R Null 
 22   0.00056      rome -> zeus      MOUNT3 C Mount /export/home
 23   0.23112      zeus -> rome      MOUNT3 R Mount Permission denied

 

Packet 1 shows the client rome requesting the port number of the NFS service (RPC program number 100003, Version 3, over the UDP protocol) from the server’s rpcbind (portmapper). Packet 2 shows the server’s reply indicating nfsd is running on port 2049. Packet 3 shows the automounter’s call to the server’s nfsd daemon to verify that it is indeed running. The server’s successful reply is shown in packet 4.

Packet 5 shows the client’s request for the port number for RPC program number 100005, Version 1, over TCP (the RPC MOUNT program). The server replies with packet 6 with port=33168. Packets 7 through 9 are TCP handshaking between our NFS client and the server’s mountd. Packet 10 shows the client’s call to the server’s mountd daemon (which implements the MOUNT program) currently running on port 33168. The client is requesting the list of exported entries. The server replies with packet 12 including the names of the two entries exported.

Packets 18 and 19 are similar to packets 5 and 6, except that this time the client is asking for the port number of the MOUNT program version 3 running over UDP. Packets 20 and 21 show the client verifying that version 3 of the MOUNT service is up and running on the server. Finally, the client issues the Mount /export/home request to the server in packet 22, requesting the filehandle of the /export/home path. The server’s mountd daemon checks its export list, determines that the host rome is not present in it, and replies to the client with a “Permission Denied” error in packet 23.
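On a longer capture, the failing reply can be found by scanning the summary lines instead of reading them one by one. This is a rough sketch that assumes the summary layout shown in the trace above (packet number, time, source, destination, description); the function names are my own, and wrapped continuation lines such as proto=UDP are simply skipped:

```python
import re

SUMMARY = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+->\s+(\S+)\s+(.*\S)\s*$")

def parse_summary(line):
    """Parse one snoop summary line, or return None for continuation lines."""
    m = SUMMARY.match(line)
    if not m:
        return None
    number, time, src, dst, info = m.groups()
    return {"packet": int(number), "time": float(time),
            "src": src, "dst": dst, "info": info}

def denied_replies(lines):
    """Return the parsed packets whose description mentions 'denied'."""
    packets = (parse_summary(line) for line in lines)
    return [p for p in packets if p and "denied" in p["info"].lower()]

# Packets 20 to 23 from the trace above.
trace = [
    " 20   0.00018      rome -> zeus      MOUNT3 C Null",
    " 21   0.00134      zeus -> rome      MOUNT3 R Null ",
    " 22   0.00056      rome -> zeus      MOUNT3 C Mount /export/home",
    " 23   0.23112      zeus -> rome      MOUNT3 R Mount Permission denied",
]
hits = denied_replies(trace)
print(hits)
```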

The analysis indicates that the “Permission Denied” error returned to the ls command came from the MOUNT request made to the server, not from problems with directory mode bits on the client. Having gathered this information, we study the exported list on the server and quickly notice that the filesystem /export/home is exported only to the host verona:

rome$ showmount -e zeus
export list for zeus:
/export/eng  (everyone)
/export/home verona

 

We could have obtained the same information by inspecting the contents of packet 12, which contains the export list requested during the transaction:

rome# snoop -i /tmp/cap -v -p 10,12
...
      Packet 10 arrived at 3:32:47.73
RPC:  ----- SUN RPC Header -----
RPC:  
RPC:  Record Mark: last fragment, length = 40
RPC:  Transaction id = 965581102
RPC:  Type = 0 (Call)
RPC:  RPC version = 2
RPC:  Program = 100005 (MOUNT), version = 1, procedure = 5
RPC:  Credentials: Flavor = 0 (None), len = 0 bytes
RPC:  Verifier   : Flavor = 0 (None), len = 0 bytes
RPC:  
MOUNT:----- NFS MOUNT -----
MOUNT:
MOUNT:Proc = 5 (Return export list)
MOUNT:
...
       Packet 12 arrived at 3:32:47.74
RPC:  ----- SUN RPC Header -----
RPC:  
RPC:  Record Mark: last fragment, length = 92
RPC:  Transaction id = 965581102
RPC:  Type = 1 (Reply)
RPC:  This is a reply to frame 10
RPC:  Status = 0 (Accepted)
RPC:  Verifier   : Flavor = 0 (None), len = 0 bytes
RPC:  Accept status = 0 (Success)
RPC:  
MOUNT:----- NFS MOUNT -----
MOUNT:
MOUNT:Proc = 5 (Return export list)
MOUNT:Directory = /export/eng
MOUNT:Directory = /export/home
MOUNT: Group = verona
MOUNT:

 

For simplicity, only the RPC and NFS Mount portions of the packets are shown. Packet 10 is the request for the export list, and packet 12 is the reply. Notice that every RPC packet contains the transaction ID (XID), the message type (call or reply), the status of the call, and the credentials. Notice that the RPC header includes the string “This is a reply to frame 10”. This is not part of the network packet. Snoop keeps track of the XIDs it has processed and attempts to match calls with replies and retransmissions. This feature comes in very handy during debugging. The Mount portion of packet 12 shows the list of directories exported and the group of hosts to which they are exported. In this case, we can see that /export/home was only exported with access rights to the host verona. The problem can be fixed by adding the host rome to the export list on the server.
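The idea of pairing a reply with its call through the XID can be sketched in a few lines. This is not snoop's actual implementation, only an illustration of the technique; a real analyzer would also key on the host pair, since different clients may pick the same XID:

```python
def match_replies(frames):
    """Map each reply frame number to the frame number of its call.

    `frames` is an iterable of (frame_no, xid, msg_type) tuples,
    where msg_type is "call" or "reply".
    """
    pending = {}   # xid -> frame number of the first call seen
    matches = {}
    for frame_no, xid, msg_type in frames:
        if msg_type == "call":
            # A retransmitted call reuses the XID; keep the first frame.
            pending.setdefault(xid, frame_no)
        elif xid in pending:
            matches[frame_no] = pending[xid]
    return matches

# Frames 10 and 12 from the verbose output above share one transaction id.
frames = [(10, 965581102, "call"), (12, 965581102, "reply")]
print(match_replies(frames))
```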

explain solaris snoop network analyzer with example

December 2nd, 2013 No comments

Here’s the output:

# snoop -i /tmp/capture -v -p 3
ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 3 arrived at 15:08:43.35
ETHER:  Packet size = 82 bytes
ETHER:  Destination = 0:0:c:7:ac:56, Cisco
ETHER:  Source      = 8:0:20:b9:2b:f6, Sun
ETHER:  Ethertype = 0800 (IP)
ETHER:  
IP:   ----- IP Header -----
IP:   
IP:   Version = 4
IP:   Header length = 20 bytes
IP:   Type of service = 0x00
IP:         xxx. .... = 0 (precedence)
IP:         ...0 .... = normal delay
IP:         .... 0... = normal throughput
IP:         .... .0.. = normal reliability
IP:   Total length = 68 bytes
IP:   Identification = 35462
IP:   Flags = 0x4
IP:         .1.. .... = do not fragment
IP:         ..0. .... = last fragment
IP:   Fragment offset = 0 bytes
IP:   Time to live = 255 seconds/hops
IP:   Protocol = 17 (UDP)
IP:   Header checksum = 4503
IP:   Source address = 131.40.52.223, caramba
IP:   Destination address = 131.40.52.27, mickey
IP:   No options
IP:   
UDP:  ----- UDP Header -----
UDP:  
UDP:  Source port = 55559
UDP:  Destination port = 2049 (Sun RPC)
UDP:  Length = 48 
UDP:  Checksum = 3685 
UDP:  
RPC:  ----- SUN RPC Header -----
RPC:  
RPC:  Transaction id = 969440111
RPC:  Type = 0 (Call)
RPC:  RPC version = 2
RPC:  Program = 100003 (NFS), version = 3, procedure = 0
RPC:  Credentials: Flavor = 0 (None), len = 0 bytes
RPC:  Verifier   : Flavor = 0 (None), len = 0 bytes
RPC:  
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 0 (Null procedure)
NFS:

And let’s analyze this:

The Ethernet header displays the source and destination addresses as well as the type of information embedded in the packet. The IP layer displays the IP version number, flags, options, and address of the sender and recipient of the packet. The UDP header displays the source and destination ports, along with the length and checksum of the UDP portion of the packet. Embedded in the UDP frame is the RPC data. Every RPC packet has a transaction ID used by the sender to identify replies to its requests, and by the server to identify duplicate calls. The previous example shows a request from the host caramba to the server mickey. The RPC version = 2 refers to the version of the RPC protocol itself; the program number 100003 and Version 3 apply to the NFS service. NFS procedure 0 is always the NULL procedure, and is most commonly invoked with no authentication information. The NFS NULL procedure does not take any arguments; therefore none are listed in the NFS portion of the packet.
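The lengths in the trace can be checked by building the same RPC call by hand. The sketch below packs an ONC RPC call header as ten big-endian 32-bit words (XID, message type, RPC version, program, version, procedure, then an empty credential and an empty verifier). The function name is my own, and the 8-byte UDP header and 14-byte Ethernet header are the standard sizes, not values printed by snoop:

```python
import struct

def rpc_null_call(xid, prog, vers):
    """Build an RPC call to procedure 0 with AUTH_NONE credentials."""
    CALL, RPC_VERSION, NULL_PROC, AUTH_NONE = 0, 2, 0, 0
    return struct.pack(
        ">10I",
        xid, CALL, RPC_VERSION, prog, vers, NULL_PROC,
        AUTH_NONE, 0,   # credentials: flavor, length
        AUTH_NONE, 0,   # verifier: flavor, length
    )

msg = rpc_null_call(969440111, 100003, 3)   # values from the trace above
udp_length = 8 + len(msg)       # UDP header is 8 bytes
ip_total = 20 + udp_length      # IP header length reported as 20 bytes
ether_size = 14 + ip_total      # Ethernet header is 14 bytes
print(len(msg), udp_length, ip_total, ether_size)
```

The results line up with the UDP Length, IP Total length and Packet size fields in the output above.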

PS:

  1. Here’s more on using snoop on Solaris:

The amount of traffic on a busy network can be overwhelming, containing many packets irrelevant to the problem at hand. The use of filters reduces the amount of noise captured and displayed, allowing you to focus on relevant data. A filter can be applied at the time the data is captured, or at the time the data is displayed. Applying the filter at capture time reduces the amount of data that needs to be stored and processed during display. Applying the filter at display time allows you to further refine the previously captured information. You will find yourself applying different display filters to the same data set as you narrow the problem down, and isolate the network packets of interest.

Snoop uses the same syntax for capture and display filters. For example, the host filter instructs snoop to only capture packets with source or destination address matching the specified host:

# snoop host caramba
Using device /dev/hme (promiscuous mode)
     caramba -> schooner     NFS C GETATTR3 FH=B083
    schooner -> caramba      NFS R GETATTR3 OK
     caramba -> schooner     TCP D=2049 S=1023     Ack=3647506101 Seq=2611574902 Len=0 Win=24820

 

In this example the host filter instructs snoop to capture packets originating at or addressed to the host caramba. You can specify the IP address or the hostname, and snoop will use the name service switch to do the conversion. Snoop assumes that the hostname specified is an IPv4 address. You can specify an IPv6 address by using the inet6 qualifier in front of the host filter:

# snoop inet6 host caramba
Using device /dev/hme (promiscuous mode)
     caramba -> 2100::56:a00:20ff:fea0:3390    ICMPv6 Neighbor advertisement
2100::56:a00:20ff:fea0:3390 -> caramba         ICMPv6 Echo request (ID: 1294 Sequence number: 0)
     caramba -> 2100::56:a00:20ff:fea0:3390    ICMPv6 Echo reply (ID: 1294 Sequence number: 0)

 

You can restrict capture to traffic addressed to the specified host by using the to or dst qualifier in front of the host filter:

# snoop to host caramba
Using device /dev/hme (promiscuous mode)
    schooner -> caramba      RPC R XID=1493500696 Success
    schooner -> caramba      RPC R XID=1493500697 Success
    schooner -> caramba      RPC R XID=1493500698 Success

 

Similarly you can restrict captured traffic to only packets originating from the specified host by using the from or src qualifier:

# snoop from host caramba
Using device /dev/hme (promiscuous mode)
     caramba -> schooner     NFS C GETATTR3 FH=B083
     caramba -> schooner     TCP D=2049 S=1023     Ack=3647527137 Seq=2611841034 Len=0 Win=24820

 

Note that the host keyword is not required when the specified hostname does not conflict with the name of another snoop primitive. The previous snoop from host caramba command could have been invoked without the host keyword and it would have generated the same output:

# snoop from caramba
Using device /dev/hme (promiscuous mode)
     caramba -> schooner     NFS C GETATTR3 FH=B083
     caramba -> schooner     TCP D=2049 S=1023     Ack=3647527137 Seq=2611841034 Len=0 Win=24820

 

For clarity, we use the host keyword throughout this book. Two or more filters can be combined by using the logical operators and and or:

# snoop -o /tmp/capture -c 20 from host caramba and rpc nfs 3
Using device /dev/hme (promiscuous mode)
20 20 packets captured

 

Snoop captures all NFS Version 3 packets originating at the host caramba. Here, snoop is invoked with the -c and -o options to save 20 filtered packets into the /tmp/capture file. We can later apply other filters during display time to further analyze the captured information. For example, you may want to narrow the previous search even further by only listing TCP traffic by using the proto filter:

# snoop -i /tmp/capture proto tcp
Using device /dev/hme (promiscuous mode)
  1   0.00000     caramba -> schooner    NFS C GETATTR3 FH=B083
  2   2.91969     caramba -> schooner    NFS C GETATTR3 FH=0CAE
  9   0.37944     caramba -> rea         NFS C FSINFO3 FH=0156
 10   0.00430     caramba -> rea         NFS C GETATTR3 FH=0156
 11   0.00365     caramba -> rea         NFS C ACCESS3 FH=0156 (lookup)
 14   0.00256     caramba -> rea         NFS C LOOKUP3 FH=F244 libc.so.1
 15   0.00411     caramba -> rea         NFS C ACCESS3 FH=772D (lookup)

 

Snoop reads the previously filtered data from /tmp/capture, and applies the new filter to only display TCP traffic. The resulting output is NFS traffic originating at the host caramba over the TCP protocol. We can apply a UDP filter to the same NFS traffic in the /tmp/capture file and obtain the NFS Version 3 traffic over UDP from host caramba without affecting the information in the /tmp/capture file:

# snoop -i /tmp/capture proto udp
Using device /dev/hme (promiscuous mode)
  1   0.00000      caramba -> rea          NFS C NULL3

 

So far, we’ve presented filters that let you specify the information you are interested in. Use the not operator to specify the criteria of packets that you wish to have excluded during capture. For example, you can use the not operator to capture all network traffic, except that generated by remote login sessions (the login port):

# snoop not port login
Using device /dev/hme (promiscuous mode)
      rt-086 -> BROADCAST        RIP R (25 destinations)
      rt-086 -> BROADCAST        RIP R (10 destinations)
     caramba -> schooner         NFS C GETATTR3 FH=B083
    schooner -> caramba          NFS R GETATTR3 OK
     caramba -> donald           NFS C GETATTR3 FH=00BD
    jamboree -> donald           NFS R GETATTR3 OK
     caramba -> donald           TCP D=2049 S=657     Ack=3855205229 Seq=2331839250 Len=0 Win=24820
     caramba -> schooner         TCP D=2049 S=1023    Ack=3647569565 Seq=2612134974 Len=0 Win=24820
     narwhal -> 224.2.127.254    UDP D=9875 S=32825 LEN=368

 

On multihomed hosts (systems with more than one network interface device), use the -d option to specify the particular network interface to snoop on:

snoop -d hme2

 

You can snoop on multiple network interfaces concurrently by invoking separate instances of snoop on each device. This is particularly useful when you don’t know what interface the host will use to generate or receive the requests. The -d option can be used in conjunction with any of the other options and filters previously described:

# snoop -o /tmp/capture-hme0 -d hme0 not port login &
# snoop -o /tmp/capture-hme1 -d hme1 not port login &

  2. This article is from the book <Managing NFS and NIS, Second Edition>

rpc remote procedure call mechanism

December 2nd, 2013 No comments

The rpcbind daemon (also known as the portmapper)[8] exists to register RPC services and to provide their IP port numbers when given an RPC program number. rpcbind itself is an RPC service, but it resides at a well-known IP port (port 111) so that it may be contacted directly by remote hosts. For example, if host fred needs to mount a filesystem from host barney, it must send an RPC request to the mountd daemon on barney. The mechanics of making the RPC request are as follows:

[8] The rpcbind daemon and the old portmapper provide the same RPC service. The portmapper implements Version 2 of the portmap protocol (RPC program number 100000), where the rpcbind daemon implements Versions 3 and 4 of the protocol, in addition to Version 2. This means that the rpcbind daemon already implements the functionality provided by the old portmapper. Due to this overlap in functionality and to add to the confusion, many people refer to the rpcbind daemon as the portmapper.

  • fred gets the IP address for barney, using the ipnodes NIS map. fred also looks up the RPC program number for mountd in the rpc NIS map. The RPC program number for mountd is 100005.
  • Knowing that the portmapper lives at port 111, fred sends an RPC request to the portmapper on barney, asking for the IP port (on barney) of RPC program 100005. fred also specifies the particular protocol and version number for the RPC service. barney’s portmapper responds to the request with port 704, the IP port at which mountd is listening for incoming mount RPC requests over the specified protocol. Note that it is possible for the portmapper to return an error, if the specified program does not exist or if it hasn’t been registered on the remote host. barney, for example, might not be an NFS server and would therefore have no reason to run the mountd daemon.
  • fred sends a mount RPC request to barney, using the IP port number returned by the portmapper. This RPC request contains an RPC procedure number, which tells the mountd daemon what to do with the request. The RPC request also contains the parameters for the procedure, in this case, the name of the filesystem fred needs to mount.
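The three steps above can be sketched as a toy model. Nothing here talks to a real rpcbind; the Portmapper class, its method names, and the version 3 over UDP registration are hypothetical choices made for illustration, while program number 100005 and port 704 come from the text:

```python
class ProgramNotRegistered(Exception):
    """Raised when no service is registered for the requested program."""

class Portmapper:
    """Toy stand-in for the portmapper running on one host."""
    def __init__(self):
        self._ports = {}

    def register(self, prog, vers, proto, port):
        self._ports[(prog, vers, proto)] = port

    def getport(self, prog, vers, proto):
        try:
            return self._ports[(prog, vers, proto)]
        except KeyError:
            raise ProgramNotRegistered(f"program {prog} v{vers}/{proto}") from None

# Step 1: fred looks up the program number for mountd (the rpc map).
rpc_map = {"mountd": 100005, "nfs": 100003}

# barney's mountd has registered itself; version and protocol are assumed.
barney = Portmapper()
barney.register(100005, 3, "udp", 704)

# Step 2: fred asks barney's portmapper for mountd's port.
port = barney.getport(rpc_map["mountd"], 3, "udp")

# Step 3: fred would now send the mount request to that port.
print("send mount RPC to barney port", port)
```

Asking for a program that was never registered raises ProgramNotRegistered, which mirrors the error case described in the second step.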

Note: this is from the book <Managing NFS and NIS, Second Edition>

Categories: Kernel, Linux, Network Tags:

resolved – mount clntudp_create: RPC: Program not registered

December 2nd, 2013 No comments

When I did a showmount -e localhost, an error occurred:

[root@centos-doxer ~]# showmount -e localhost
mount clntudp_create: RPC: Program not registered

So I checked which RPC program showmount was using:

[root@centos-doxer ~]# grep showmount /etc/rpc
mountd 100005 mount showmount

As this indicates, we need to start the mountd daemon to make showmount -e localhost work. mountd is part of nfs, so I started nfs:

[root@centos-doxer ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]

Now that mountd is running, showmount -e localhost should work.
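The /etc/rpc lookup done with grep earlier can also be scripted. This is a small sketch that assumes the usual /etc/rpc layout of name, program number, then optional aliases, with # starting a comment; the function names are my own:

```python
def parse_rpc_line(line):
    """Parse one /etc/rpc line into a dict, or return None if it has no entry."""
    fields = line.split("#", 1)[0].split()
    if len(fields) < 2:
        return None
    return {"name": fields[0], "number": int(fields[1]), "aliases": fields[2:]}

def find_program(rpc_text, name):
    """Find the entry whose name or alias matches `name`."""
    for line in rpc_text.splitlines():
        entry = parse_rpc_line(line)
        if entry and (name == entry["name"] or name in entry["aliases"]):
            return entry
    return None

# The line returned by the grep above.
entry = find_program("mountd 100005 mount showmount", "showmount")
print(entry["name"], entry["number"])
```

On a real system, passing the contents of /etc/rpc as rpc_text gives the same answer: showmount is an alias of the mountd program.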

 

Categories: Kernel, Linux, Network Tags:

resolved – kernel panic not syncing: Fatal exception Pid: comm: not Tainted

November 13th, 2013 No comments

We were installing IDM OAM today, and the Linux server panicked every time we ran the startup script. The server panic info was like this:

Pid: 4286, comm: emdctl Not tainted 2.6.32-300.29.1.el5uek #1
Process emdctl (pid: 4286, threadinfo ffff88075bf20000, task ffff88073d0ac480)
Stack:
ffff88075bf21958 ffffffffa02b1769 ffff88075bf21948 ffff8807cdcce500
<0> ffff88075bf95cc8 ffff88075bf95ee0 ffff88075bf21998 ffffffffa01fd5c6
<0> ffffffffa02b1732 ffff8807bc2543f0 ffff88075bf95cc8 ffff8807bc2543f0
Call Trace:
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17
Code: 55 48 89 e5 0f 1f 44 00 00 48 8b 06 0f c8 89 07 48 8b 46 08 0f c8 89 47 04 c9 48 8d 47 08 c3 55 48 89 e5 0f 1f 44 00 00 48 0f ce <48> 89 37 c9 48 8d 47 08 c3 55 48 89 e5 53 0f 1f 44 00 00 f6 06
RIP [<ffffffffa02b03c3>] xdr_encode_hyper+0xc/0x15 [nfs]
RSP <ffff88075bf21928>
---[ end trace 04ad5382f19cf8ad ]---
Kernel panic - not syncing: Fatal exception
Pid: 4286, comm: emdctl Tainted: G D 2.6.32-300.29.1.el5uek #1
Call Trace:
[<ffffffff810579a2>] panic+0xa5/0x162
[<ffffffff81450075>] ? threshold_create_device+0x242/0x2cf
[<ffffffff8100ed2f>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff814574b0>] ? _spin_unlock_irqrestore+0x16/0x18
[<ffffffff810580f5>] ? release_console_sem+0x194/0x19d
[<ffffffff810583be>] ? console_unblank+0x6a/0x6f
[<ffffffff8105766f>] ? print_oops_end_marker+0x23/0x25
[<ffffffff814583a6>] oops_end+0xb7/0xc7
[<ffffffff8101565d>] die+0x5a/0x63
[<ffffffff81457c7c>] do_trap+0x115/0x124
[<ffffffff81013731>] do_alignment_check+0x99/0xa2
[<ffffffff81012cb5>] alignment_check+0x25/0x30
[<ffffffffa02b03c3>] ? xdr_encode_hyper+0xc/0x15 [nfs]
[<ffffffffa02b06be>] ? xdr_encode_fhandle+0x15/0x17 [nfs]
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17

We tried a lot (application coredump, kdump, etc.) but still had no solution until we noticed that there were a lot of NFS-related entries in the kernel panic info (the frames tagged [nfs] and [sunrpc] above).

Our Linux server was not using NFS or autofs, so we upgraded the NFS client (nfs-utils) and disabled autofs:

yum update nfs-utils

chkconfig autofs off

After this, the startup for IDM succeeded, and the server no longer panicked!
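Spotting which kernel modules dominate a call trace can be automated. The sketch below counts frames per module tag; it assumes the usual oops frame layout with offsets written as plain 0x hex and the module name in trailing brackets, and frames without a tag are counted as "kernel". The sample lines are four frames taken from the trace above:

```python
import re
from collections import Counter

# Matches e.g. "[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

def modules_in_trace(trace):
    """Count call-trace frames per kernel module."""
    counts = Counter()
    for line in trace.splitlines():
        m = FRAME.search(line)
        if m:
            counts[m.group(2) or "kernel"] += 1
    return counts

trace = """\
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
"""
print(modules_in_trace(trace).most_common())
```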

Categories: Kernel, Linux Tags: ,

linux kernel module installation

June 30th, 2013 No comments
yum install kernel-devel
#only install module
#compile kernel
cp /boot/config-xxx .config #then run make oldconfig (it will ask only about new features)
AND
cp /usr/src/kernels/linux-2.6.23/.config /boot/config-2.6.23
mkdir -p /usr/share/doc/linux-2.6.23
cp -r Documentation/* /usr/share/doc/linux-2.6.23/
grep -r  read_bytes /usr/share/doc/linux-2.6.23/
Categories: Kernel Tags:

/proc filesystem – day 3

June 30th, 2013 No comments
/proc/sys - use sysctl to control
              This directory (present since 1.3.57) contains a number of files
              and subdirectories corresponding  to  kernel  variables.   These
              variables  can  be  read  and sometimes modified using the /proc
              file  system,  and  the  (deprecated)  sysctl(2)  system   call.
              Presently, there are subdirectories abi, debug, dev, fs, kernel,
              net, proc, rxrpc, sunrpc and vm that each contain more files and
              subdirectories. More info on https://www.kernel.org/doc/Documentation/sysctl/

[root@centos-doxer sys]# cat /proc/sys/dev/cdrom/info
CD-ROM information, Id: cdrom.c 3.20 2003/12/17

drive name: hdc
drive speed: 32
drive # of slots: 1

……

/proc/sys/fs/file-max
              This file defines a system-wide limit  on  the  number  of  open
              files  for  all processes.  (See also setrlimit(2), which can be
              used by a process to set the per-process  limit,  RLIMIT_NOFILE,
              on  the  number of files it may open.)  If you get lots of error
              messages about running out of file handles, try increasing  this
              value:

              echo 100000 > /proc/sys/fs/file-max

              The  kernel constant NR_OPEN imposes an upper limit on the value
              that may be placed in file-max.

              If you  increase  /proc/sys/fs/file-max,  be  sure  to  increase
              /proc/sys/fs/inode-max   to   3-4   times   the   new  value  of
              /proc/sys/fs/file-max, or you will run out of inodes.
/proc/sys/fs/file-nr
              This (read-only)  file  gives  the  number  of  files  presently
              opened.  It contains three numbers: the number of allocated file
              handles; the number of free file handles; and the maximum number
              of file handles.  The kernel allocates file handles dynamically,
              but it doesn't free them again.   If  the  number  of  allocated
              files  is  close  to the maximum, you should consider increasing
              the maximum.  When the number of free  file  handles  is  large,
              you've  encountered a peak in your usage of file handles and you
              probably don't need to increase the maximum.
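To judge whether file-max needs raising, the three numbers in file-nr can be turned into a usage figure. This is only a sketch; the function name file_handle_usage is an invention for this example, and it assumes the three-field layout described above:

```shell
# file_handle_usage [FILE]
# Read "allocated free maximum" from a file-nr style file and print the
# number of handles in use plus that number as a percentage of the maximum.
# FILE defaults to /proc/sys/fs/file-nr; "-" reads standard input.
file_handle_usage() {
    awk '{
        in_use = $1 - $2
        printf "allocated=%s free=%s max=%s in_use=%d (%.1f%% of max)\n",
               $1, $2, $3, in_use, 100 * in_use / $3
    }' "${1:-/proc/sys/fs/file-nr}"
}

# Example with made-up numbers.
printf '1024 0 100000\n' | file_handle_usage -
```

A percentage that stays close to 100 is the case where increasing file-max is worth considering.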
/proc/sys/kernel/ctrl-alt-del
              This file controls the handling of Ctrl-Alt-Del  from  the  key-
              board.   When  the  value  in  this  file  is 0, Ctrl-Alt-Del is
              trapped and sent to the init(8) program  to  handle  a  graceful
              restart.   When the value is greater than zero, Linux's reaction
              to a Vulcan Nerve Pinch (tm) will be an immediate reboot,  with-
              out  even syncing its dirty buffers.  Note: when a program (like
              dosemu) has the keyboard in  "raw"  mode,  the  ctrl-alt-del  is
              intercepted by the program before it ever reaches the kernel tty
              layer, and it's up to the program to decide what to do with it.
/proc/sys/kernel/sysrq
              This file controls the functions allowed to be  invoked  by  the
              SysRq  key.   By default, the file contains 1 meaning that every
              possible SysRq request is allowed  (in  older  kernel  versions,
              SysRq was disabled by default, and you were required to specifi-
              cally enable it at run-time, but this is not the case any more).
              Possible values in this file are:

                 0 - disable sysrq completely
                 1 - enable all functions of sysrq
                >1 - bitmask of allowed sysrq functions, as follows:
                        2 - enable control of console logging level
                        4 - enable control of keyboard (SAK, unraw)
                        8 - enable debugging dumps of processes etc.
                       16 - enable sync command
                       32 - enable remount read-only
                       64 - enable signalling of processes (term, kill, oom-kill)
                      128 - allow reboot/poweroff
                      256 - allow nicing of all real-time tasks

              This file is only present if the CONFIG_MAGIC_SYSRQ kernel
              configuration option is enabled.  For further details see the
              kernel source file Documentation/sysrq.txt.
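The bitmask values listed above can be decoded mechanically. A small sketch, assuming only the bits named in the list; decode_sysrq is a hypothetical helper, not an existing tool:

```shell
# decode_sysrq VALUE
# Print which SysRq function groups a /proc/sys/kernel/sysrq value enables.
decode_sysrq() {
    v=$1
    if [ "$v" -eq 0 ]; then echo "sysrq disabled"; return 0; fi
    if [ "$v" -eq 1 ]; then echo "all sysrq functions enabled"; return 0; fi
    for entry in \
        "2:control of console logging level" \
        "4:control of keyboard (SAK, unraw)" \
        "8:debugging dumps of processes" \
        "16:sync command" \
        "32:remount read-only" \
        "64:signalling of processes" \
        "128:reboot/poweroff" \
        "256:nicing of all real-time tasks"
    do
        bit=${entry%%:*}      # number before the colon
        name=${entry#*:}      # description after the colon
        if [ $(( v & bit )) -ne 0 ]; then echo "$bit - $name"; fi
    done
}

# Example: 176 = 16 + 32 + 128
decode_sysrq 176
```

On a live system the current value could be passed in with decode_sysrq "$(cat /proc/sys/kernel/sysrq)".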
/proc/sys/vm
              This directory contains files for memory management tuning, buf-
              fer and cache management. More on https://www.kernel.org/doc/Documentation/sysctl/vm.txt
/proc/sys/vm/swappiness
              The value in this file controls how aggressively the kernel will
              swap  memory pages.  Higher values increase aggressiveness, lower
              values decrease aggressiveness.  The default value is 60.
/proc/sys/vm/min_free_kbytes

This is used to force the Linux VM to keep a minimum number
of kilobytes free.  The VM uses this number to compute a
watermark[WMARK_MIN] value for each lowmem zone in the system.
Each lowmem zone gets a number of reserved free pages based
proportionally on its size.

Some minimal amount of memory is needed to satisfy PF_MEMALLOC
allocations; if you set this to lower than 1024KB, your system will
become subtly broken, and prone to deadlock under high loads.

Setting this too high will OOM your machine instantly.

/proc/sys/vm/panic_on_oom

This enables or disables panic on out-of-memory feature.

If this is set to 0, the kernel will kill some rogue process,
called oom_killer.  Usually, oom_killer can kill rogue processes and
system will survive.

If this is set to 1, the kernel panics when out-of-memory happens.
However, if a process limits using nodes by mempolicy/cpusets,
and those nodes become memory exhaustion status, one process
may be killed by oom-killer. No panic occurs in this case.
Because other nodes' memory may be free. This means system total status
may be not fatal yet.

If this is set to 2, the kernel panics compulsorily even on the
above-mentioned. Even oom happens under memory cgroup, the whole
system panics.

The default value is 0.
1 and 2 are for failover of clustering. Please select either
according to your policy of failover.
panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.

/proc/sysrq-trigger (since Linux 2.4.21)
              Writing a character to this file triggers the same  SysRq  function
              as  typing  ALT-SysRq-<character>  (see the description of
              /proc/sys/kernel/sysrq).  This file is normally only writable by root.
/proc/vmstat (since Linux 2.6)
              This file displays various virtual memory statistics.

Categories: Kernel, Linux Tags:

application core dump and kernel core dump(kdump) in linux

May 30th, 2013 No comments

—App core dump
vi /etc/profile

ulimit -c unlimited >/dev/null 2>&1

vi /etc/sysctl.conf

kernel.core_uses_pid = 1
kernel.core_pattern = /var/tmp/app-core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2

echo "DAEMON_COREFILE_LIMIT='unlimited'" >> /etc/sysconfig/init #enable core dumps for all daemons. For a specific daemon, set it in its init script instead, e.g. /etc/init.d/httpd (DAEMON_COREFILE_LIMIT='unlimited')
sysctl -p

After this, you can run kill -s SIGSEGV <pid> to generate a Linux application core dump (this sends a segmentation fault signal to the process).
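To see what file name the kernel.core_pattern above produces, the specifiers can be expanded by hand: %e is the executable name, %s the signal number, %u and %g the real UID and GID, %p the PID and %t the dump time in seconds since the epoch. The sketch below only imitates that expansion for illustration; expand_core_pattern is a made-up helper, it ignores other specifiers such as %%, and the sample values are invented:

```shell
# expand_core_pattern PATTERN EXE SIGNAL UID GID PID TIME
# Substitute the six specifiers used in this post with the given values.
expand_core_pattern() {
    printf '%s\n' "$1" | sed \
        -e "s|%e|$2|g" -e "s|%s|$3|g" -e "s|%u|$4|g" \
        -e "s|%g|$5|g" -e "s|%p|$6|g" -e "s|%t|$7|g"
}

# Example: httpd (uid/gid 48, pid 2137) killed by signal 11 (SIGSEGV).
expand_core_pattern '/var/tmp/app-core-%e-%s-%u-%g-%p-%t' httpd 11 48 48 2137 1369872000
```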

—Kernel core dump
yum install system-config-kdump kexec-tools crash kernel-debuginfo kernel-debuginfo-common
vi /boot/grub/grub.conf

crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M nmi_watchdog=1 #Or crashkernel=256M@16M, which tells the kernel to reserve 256 MB of memory starting at physical address 0x01000000 (16MB) for the dump-capture kernel; or simply crashkernel=256M

vi /etc/kdump.conf #more on http://linux.die.net/man/5/kdump.conf

ext3 /dev/mapper/VolGroup00-LogVol00
core_collector makedumpfile -c -d 31
path /var/log/dumps

vi /etc/sysctl.conf

kernel.unknown_nmi_panic=1 #should be there by default
fs.suid_dumpable = 2
kernel.core_uses_pid = 1 #should be there by default
kernel.panic_on_unrecovered_nmi=1
vm.panic_on_oom=1

vi /etc/security/limits.conf #ulimit -c in /etc/profile cannot exceed the hard limit defined here

* soft core unlimited
* hard core unlimited

vi /etc/profile

ulimit -c unlimited >/dev/null 2>&1

sysctl -p
chkconfig kdump on
reboot

dmesg|egrep -i 'crash|dump' #sometimes memory reservation may fail with the message "crashkernel reservation failed - memory is in use"
cat /proc/iomem | grep -i crash

cat /proc/meminfo|grep Slab #The total amount of memory, in kilobytes, used by the kernel to cache data structures for its own use
cat /proc/cmdline #check this after configuring crash kernel ram

/etc/init.d/kdump status
/etc/init.d/kdump restart
sync;sync;sync
echo 1 > /proc/sys/kernel/sysrq
echo "c" > /proc/sysrq-trigger #crashes the kernel on purpose to test kdump; the system will reboot
ls -lrth /var/log/dumps/
—Analyse coredump file
crash /usr/lib/debug/lib/modules/2.6.18-194.17.4.el5/vmlinux /var/crash/127.0.0.1-2011-03-16-12\:23\:06/vmcore
objdump -a ./ls #binutils, file format elf64-x86-64
gdb /path/to/application /path/to/corefile
strace
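The range form of crashkernel= used in grub.conf above picks a reservation size from the amount of RAM: each start-end:size entry applies when RAM is at least start and below end (start inclusive, end exclusive, as the kernel's kdump documentation describes it). A rough sketch of that selection, with the helper names to_mb and crashkernel_for invented for illustration and only the M and G suffixes handled:

```shell
# to_mb SIZE
# Convert "128M" or "2G" to megabytes; prints nothing for an empty size.
to_mb() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
    esac
}

# crashkernel_for SPEC RAM_MB
# Print the size of the first range in SPEC that contains RAM_MB.
crashkernel_for() {
    ram=$2
    old_ifs=$IFS
    IFS=,                                  # split SPEC on commas
    for item in $1; do
        range=${item%%:*}                  # e.g. "2G-6G" or "8G-"
        size=${item#*:}                    # e.g. "256M"
        start=$(to_mb "${range%%-*}")
        end=$(to_mb "${range#*-}")         # empty means "no upper bound"
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            IFS=$old_ifs
            echo "$size"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: a machine with 4096 MB of RAM.
crashkernel_for '0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M' 4096
```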

Categories: Kernel, Linux, Systems Tags: ,

/proc filesystem – day 2

May 29th, 2013 No comments

/proc/bus/pci

Contains  various bus subdirectories and pseudo-files containing
              information about PCI  busses,  installed  devices,  and  device
              drivers.  Some of these files are not ASCII.
lspci #SCSI or SATA controller
lspci -tv #all pci devices
lspci|grep -i controller
lsusb -tv #all usb devices

[root@centos-doxer 00]# tree /proc/bus/pci
/proc/bus/pci
|-- 00
|   |-- 00.0
|   |-- 01.0
|   |-- 01.1
|   |-- 02.0
|   |-- 03.0
|   |-- 04.0
|   |-- 05.0
|   |-- 07.0
|   `-- 0d.0
`-- devices

1 directory, 10 files
[root@centos-doxer 00]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:02.0 VGA compatible controller: InnoTek Systemberatung GmbH VirtualBox Graphics Adapter
00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
00:04.0 System peripheral: InnoTek Systemberatung GmbH VirtualBox Guest Service
00:05.0 Multimedia audio controller: Intel Corporation 82801AA AC'97 Audio Controller (rev 01)
00:07.0 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:0d.0 SATA controller: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) SATA Controller [AHCI mode] (rev 02)

/proc/cmdline
              Arguments passed to the Linux kernel at boot time.   Often  done
              via a boot manager such as lilo(8) or grub(8).

[root@centos-doxer 00]# cat /proc/cmdline
ro root=/dev/VolGroup00/LogVol00 rhgb quiet
[root@centos-doxer 00]# cat /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
# initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-308.el5)
root (hd0,0)
kernel /vmlinuz-2.6.18-308.el5 ro root=/dev/VolGroup00/LogVol00 rhgb quiet
initrd /initrd-2.6.18-308.el5.img

/proc/diskstats (since Linux 2.5.69)
              This file contains disk I/O statistics  for  each  disk  device.
              See the kernel source file Documentation/iostats.txt for further
              information.

[root@centos-doxer proc]# cat /proc/diskstats | grep sda1
8 1 sda1 68 1002 2156 4754 7 2 18 1 0 4023 4755

[root@centos-doxer proc]# cat /sys/block/sda/sda1/stat
68 1002 2156 4754 7 2 18 1 0 4023 4755

For the meaning of each number, see http://blog.csdn.net/tenfyguo/article/details/7477526
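The eleven counters after the device name follow the order given in Documentation/iostats.txt. A small sketch that labels them; label_diskstats and the short field names are this example's own, not kernel identifiers:

```shell
# label_diskstats DEVICE [FILE]
# Print each of the 11 I/O counters of DEVICE next to a descriptive label.
# FILE defaults to /proc/diskstats; "-" reads standard input.
label_diskstats() {
    awk -v dev="$1" '
        BEGIN {
            n = split("reads_completed reads_merged sectors_read ms_reading writes_completed writes_merged sectors_written ms_writing ios_in_progress ms_doing_io weighted_ms_doing_io", name, " ")
        }
        # Columns 1-3 are major, minor and device name; the counters start at column 4.
        $3 == dev && NF >= 14 {
            for (i = 1; i <= n; i++) printf "%-22s %s\n", name[i], $(i + 3)
        }' "${2:-/proc/diskstats}"
}

# Example using the sda1 line shown above.
printf '8 1 sda1 68 1002 2156 4754 7 2 18 1 0 4023 4755\n' | label_diskstats sda1 -
```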

/proc/driver/rtc #real time clock, hardware clock

[root@centos-doxer proc]# cat /proc/driver/rtc
rtc_time : 00:34:10
rtc_date : 2013-05-29
rtc_epoch : 1900
alarm : 00:00:00
DST_enable : no
BCD : yes
24hr : yes
square_wave : no
alarm_IRQ : no
update_IRQ : no
periodic_IRQ : no
periodic_freq : 1024
batt_status : okay
[root@centos-doxer proc]#
[root@centos-doxer proc]# ls -l /dev/rtc
crw-r--r-- 1 root root 10, 135 May 29 08:05 /dev/rtc

/proc/fb

Frame buffer information when CONFIG_FB is defined during kernel
              compilation.

In xen DomU configuration:

vfb = ['type=vnc,vncunused=1,vnclisten=0.0.0.0,vncpasswd=test']

/proc/filesystems
              A text listing of the file systems which are  supported  by  the
              kernel,  namely file systems which were compiled into the kernel
              or  whose  kernel  modules  are  currently  loaded.   (See  also
              filesystems(5).)   If a file system is marked with "nodev", this
              means that it does not require a  block  device  to  be  mounted
              (e.g., virtual file system, network file system).

              Incidentally,  this  file  may  be used by mount(8) when no file
              system is specified and it didn't manage to determine  the  file
              system type.  Then file systems contained in this file are tried
              (except those that are marked with "nodev").

[root@centos-doxer proc]# cat /proc/filesystems
nodev sysfs
nodev rootfs
nodev bdev
nodev proc
nodev cpuset
nodev binfmt_misc
nodev debugfs
nodev securityfs
nodev sockfs
nodev usbfs
nodev pipefs
nodev anon_inodefs
nodev futexfs
nodev tmpfs
nodev inotifyfs
nodev eventpollfs
nodev devpts
ext2
nodev ramfs
nodev hugetlbfs
iso9660
nodev mqueue
ext3
nodev rpc_pipefs
nodev autofs
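The file systems that mount(8) would try when no type is given are the entries without "nodev". A one-function sketch to list them (the name block_filesystems is made up here):

```shell
# block_filesystems [FILE]
# Print the file system types that are not marked "nodev", i.e. the ones
# that need a block device. FILE defaults to /proc/filesystems; "-" reads
# standard input.
block_filesystems() {
    awk '$1 != "nodev" { print $1 }' "${1:-/proc/filesystems}"
}

# Example with a few lines in the real tab-separated layout.
printf 'nodev\tsysfs\n\text2\nnodev\tramfs\n\tiso9660\n\text3\n' | block_filesystems -
```

For the listing above this would leave ext2, iso9660 and ext3.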

/proc/ide
              This directory exists on systems with the IDE  bus.   There  are
              directories  for  each  IDE  channel and attached device.  Files
              include:

                  cache              buffer size in KB
                  capacity           number of sectors
                  driver             driver version
                  geometry           physical and logical geometry
                  identify           in hexadecimal
                  media              media type
                  model              manufacturer's model number
                  settings           drive settings
                  smart_thresholds   in hexadecimal

[root@centos-doxer ide]# lspci |grep -i ide
00:01.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)

[root@centos-doxer ide]# pwd
/proc/ide

[root@centos-doxer ide]# ls -l
total 0
-r--r--r-- 1 root root 0 May 29 08:43 drivers
lrwxrwxrwx 1 root root 8 May 29 08:43 hdc -> ide1/hdc
dr-xr-xr-x 3 root root 0 May 29 08:43 ide1
[root@centos-doxer ide]# cat drivers
ide-disk version 1.18
ide-floppy version 0.99.newide
ide-cdrom version 4.61

[root@centos-doxer ide]# cat hdc/driver
ide-cdrom version 4.61

[root@centos-doxer ide]# cat hdc/media
cdrom
[root@centos-doxer ide]# cat hdc/model
VBOX CD-ROM
[root@centos-doxer ide]# cat hdc/settings
name            value        min   max   mode
----            -----        ---   ---   ----
current_speed   66           0     70    rw
dsc_overlap     0            0     1     rw
init_speed      66           0     70    rw
io_32bit        0            0     3     rw
keepsettings    0            0     1     rw
nice1           1            0     1     rw
number          2            0     3     rw
pio_mode        write-only   0     255   w
unmaskirq       0            0     1     rw
using_dma       1            0     1     rw

/proc/iomem
              I/O memory map in Linux 2.4.

[root@centos-doxer ide]# cat /proc/iomem
00010000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c8fff : Video ROM
000e2000-000e6fff : Adapter ROM
000f0000-000fffff : System ROM
00100000-3ffeffff : System RAM
00200000-0048e1bb : Kernel code
0048e1bc-0063016f : Kernel data
3fff0000-3fffffff : ACPI Tables
e0000000-e0ffffff : 0000:00:02.0
f0000000-f001ffff : 0000:00:03.0
f0000000-f001ffff : e1000
f0400000-f07fffff : 0000:00:04.0
f0800000-f0803fff : 0000:00:04.0
f0804000-f0805fff : 0000:00:0d.0
f0804000-f0805fff : ahci
fffc0000-ffffffff : reserved

/proc/ioports
              This is a list of currently registered Input-Output port regions
              that are in use.

[root@centos-doxer ide]# cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-0060 : keyboard
0064-0064 : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
0376-0376 : ide1
03c0-03df : vga+
0cf8-0cff : PCI conf1
4000-4003 : ACPI PM1a_EVT_BLK
4004-4005 : ACPI PM1a_CNT_BLK
4008-400b : ACPI PM_TMR
4020-4021 : ACPI GPE0_BLK
d000-d00f : 0000:00:01.1
d000-d007 : ide0
d008-d00f : ide1
d010-d017 : 0000:00:03.0
d010-d017 : e1000
d020-d03f : 0000:00:04.0
d100-d1ff : 0000:00:05.0
d100-d1ff : Intel 82801AA-ICH
d200-d23f : 0000:00:05.0
d200-d23f : Intel 82801AA-ICH
d240-d247 : 0000:00:0d.0
d250-d257 : 0000:00:0d.0
d260-d26f : 0000:00:0d.0

/proc/loadavg
              The  first  three  fields  in this file are load average figures
              giving the number of jobs in the run queue (state R) or  waiting
              for disk I/O (state D) averaged over 1, 5, and 15 minutes.  They
              are the same as the load average numbers given by uptime(1)  and
              other  programs.  The fourth field consists of two numbers sepa-
              rated by a slash (/).  The first of these is the number of  cur-
              rently   executing   kernel   scheduling   entities  (processes,
              threads); this will be less than or equal to the number of CPUs.
              The  value  after  the  slash is the number of kernel scheduling
              entities that currently exist on the system.  The fifth field is
              the  PID  of  the  process that was most recently created on the
              system.

[root@centos-doxer ide]# cat /proc/loadavg
0.00 0.00 0.00 1/219 3524
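The five fields described above can be split into named values like this. It is only a sketch; parse_loadavg and the key names are chosen for this example:

```shell
# parse_loadavg [FILE]
# Print the fields of a loadavg line as key=value pairs.
# FILE defaults to /proc/loadavg; "-" reads standard input.
parse_loadavg() {
    awk '{
        split($4, ent, "/")    # "1/219" -> running and total scheduling entities
        printf "load1=%s load5=%s load15=%s running=%s total=%s last_pid=%s\n",
               $1, $2, $3, ent[1], ent[2], $5
    }' "${1:-/proc/loadavg}"
}

# Example using the line shown above.
printf '0.00 0.00 0.00 1/219 3524\n' | parse_loadavg -
```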

/proc/modules
              A  text list of the modules that have been loaded by the system.
              See also lsmod(8).

[root@centos-doxer ide]# lsmod
Module Size Used by
autofs4 62281 3
hidp 83521 2
rfcomm 104681 0
l2cap 89537 10 hidp,rfcomm
bluetooth 118725 5 hidp,rfcomm,l2cap
lockd 101425 0
sunrpc 203145 2 lockd
be2iscsi 94685 0
ib_iser 68161 0
rdma_cm 68689 1 ib_iser
ib_cm 72809 1 rdma_cm
iw_cm 43465 1 rdma_cm
ib_sa 74953 2 rdma_cm,ib_cm
ib_mad 70757 2 ib_cm,ib_sa
ib_core 104901 6 ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
ib_addr 41673 1 rdma_cm
iscsi_tcp 50893 0
bnx2i 81249 0
cnic 85481 1 bnx2i
ipv6 437857 1 cnic
xfrm_nalgo 43333 1 ipv6
crypto_api 42945 1 xfrm_nalgo
uio 45777 1 cnic
cxgb3i 64849 0
libcxgbi 91597 1 cxgb3i
cxgb3 216241 1 cxgb3i
8021q 58449 1 cxgb3
libiscsi_tcp 53573 3 iscsi_tcp,cxgb3i,libcxgbi
libiscsi2 77765 7 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libcxgbi,libiscsi_tcp
scsi_transport_iscsi2 73945 8 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2
scsi_transport_iscsi 35017 1 scsi_transport_iscsi2
dm_multipath 58969 0
scsi_dh 42561 1 dm_multipath
video 53197 0
backlight 39872 1 video
sbs 49921 0
power_meter 47053 0
hwmon 36553 1 power_meter
i2c_ec 38593 1 sbs
dell_wmi 37601 0
wmi 41985 1 dell_wmi
button 40545 0
battery 43849 0
asus_acpi 50917 0
acpi_memhotplug 40517 0
ac 38729 0
lp 47121 0
snd_intel8x0 69481 1
snd_ac97_codec 143257 1 snd_intel8x0
ac97_bus 35649 1 snd_ac97_codec
snd_seq_dummy 37061 0
snd_seq_oss 65473 0
snd_seq_midi_event 41025 1 snd_seq_oss
snd_seq 87777 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device 41557 3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss 77377 0
snd_mixer_oss 49985 1 snd_pcm_oss
snd_pcm 116681 3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss
snd_timer 57161 2 snd_seq,snd_pcm
parport_pc 62953 0
sg 70649 0
i2c_piix4 43725 0
snd 100201 11 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
parport 73165 2 lp,parport_pc
tpm_tis 48077 0
ide_cd 73825 0
e1000 162665 0
soundcore 41825 1 snd
tpm 50273 1 tpm_tis
i2c_core 57537 2 i2c_ec,i2c_piix4
pcspkr 36289 0
tpm_bios 40897 1 tpm
serio_raw 40517 0
cdrom 68713 1 ide_cd
snd_page_alloc 44113 2 snd_intel8x0,snd_pcm
dm_raid45 99785 0
dm_message 36289 1 dm_raid45
dm_region_hash 46145 1 dm_raid45
dm_mem_cache 38977 1 dm_raid45
dm_snapshot 52233 0
dm_zero 35265 0
dm_mirror 54737 0
dm_log 44993 3 dm_raid45,dm_region_hash,dm_mirror
dm_mod 102417 11 dm_multipath,dm_raid45,dm_snapshot,dm_zero,dm_mirror,dm_log
ata_piix 57925 0
ahci 73805 2
libata 208977 2 ata_piix,ahci
sd_mod 56513 3
scsi_mod 199641 11 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2,scsi_transport_iscsi2,scsi_dh,sg,libata,sd_mod
ext3 169297 2
jbd 94897 1 ext3
uhci_hcd 57433 0
ohci_hcd 56181 0
ehci_hcd 66765 0
[root@centos-doxer ide]# cat /proc/modules
autofs4 62281 3 - Live 0xffffffff88723000
hidp 83521 2 - Live 0xffffffff8870d000
rfcomm 104681 0 - Live 0xffffffff886f2000
l2cap 89537 10 hidp,rfcomm, Live 0xffffffff886db000
bluetooth 118725 5 hidp,rfcomm,l2cap, Live 0xffffffff886bd000
lockd 101425 0 - Live 0xffffffff886a3000
sunrpc 203145 2 lockd, Live 0xffffffff88670000
be2iscsi 94685 0 - Live 0xffffffff88657000
ib_iser 68161 0 - Live 0xffffffff88645000
rdma_cm 68689 1 ib_iser, Live 0xffffffff88633000
ib_cm 72809 1 rdma_cm, Live 0xffffffff88620000
iw_cm 43465 1 rdma_cm, Live 0xffffffff88614000
ib_sa 74953 2 rdma_cm,ib_cm, Live 0xffffffff88600000
ib_mad 70757 2 ib_cm,ib_sa, Live 0xffffffff885ed000
ib_core 104901 6 ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad, Live 0xffffffff885d2000
ib_addr 41673 1 rdma_cm, Live 0xffffffff885c6000
iscsi_tcp 50893 0 - Live 0xffffffff885b8000
bnx2i 81249 0 - Live 0xffffffff885a3000
cnic 85481 1 bnx2i, Live 0xffffffff8858d000
ipv6 437857 1 cnic, Live 0xffffffff88521000
xfrm_nalgo 43333 1 ipv6, Live 0xffffffff88515000
crypto_api 42945 1 xfrm_nalgo, Live 0xffffffff88509000
uio 45777 1 cnic, Live 0xffffffff884fc000
cxgb3i 64849 0 - Live 0xffffffff884eb000
libcxgbi 91597 1 cxgb3i, Live 0xffffffff884d3000
cxgb3 216241 1 cxgb3i, Live 0xffffffff8849d000
8021q 58449 1 cxgb3, Live 0xffffffff8848d000
libiscsi_tcp 53573 3 iscsi_tcp,cxgb3i,libcxgbi, Live 0xffffffff8847e000
libiscsi2 77765 7 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libcxgbi,libiscsi_tcp, Live 0xffffffff8846a000
scsi_transport_iscsi2 73945 8 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2, Live 0xffffffff88456000
scsi_transport_iscsi 35017 1 scsi_transport_iscsi2, Live 0xffffffff8844c000
dm_multipath 58969 0 - Live 0xffffffff8843c000
scsi_dh 42561 1 dm_multipath, Live 0xffffffff88430000
video 53197 0 - Live 0xffffffff88422000
backlight 39872 1 video, Live 0xffffffff88417000
sbs 49921 0 - Live 0xffffffff88409000
power_meter 47053 0 - Live 0xffffffff883fc000
hwmon 36553 1 power_meter, Live 0xffffffff883f2000
i2c_ec 38593 1 sbs, Live 0xffffffff883e7000
dell_wmi 37601 0 - Live 0xffffffff883dc000
wmi 41985 1 dell_wmi, Live 0xffffffff883d0000
button 40545 0 - Live 0xffffffff883c5000
battery 43849 0 - Live 0xffffffff883b9000
asus_acpi 50917 0 - Live 0xffffffff883ab000
acpi_memhotplug 40517 0 - Live 0xffffffff883a0000
ac 38729 0 - Live 0xffffffff88395000
lp 47121 0 - Live 0xffffffff88388000
snd_intel8x0 69481 1 - Live 0xffffffff88376000
snd_ac97_codec 143257 1 snd_intel8x0, Live 0xffffffff88352000
ac97_bus 35649 1 snd_ac97_codec, Live 0xffffffff88348000
snd_seq_dummy 37061 0 - Live 0xffffffff8833d000
snd_seq_oss 65473 0 - Live 0xffffffff8832c000
snd_seq_midi_event 41025 1 snd_seq_oss, Live 0xffffffff88320000
snd_seq 87777 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event, Live 0xffffffff88309000
snd_seq_device 41557 3 snd_seq_dummy,snd_seq_oss,snd_seq, Live 0xffffffff882fd000
snd_pcm_oss 77377 0 - Live 0xffffffff882e9000
snd_mixer_oss 49985 1 snd_pcm_oss, Live 0xffffffff882db000
snd_pcm 116681 3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss, Live 0xffffffff882bd000
snd_timer 57161 2 snd_seq,snd_pcm, Live 0xffffffff882ae000
parport_pc 62953 0 - Live 0xffffffff8829d000
sg 70649 0 - Live 0xffffffff8828a000
i2c_piix4 43725 0 - Live 0xffffffff8827e000
snd 100201 11 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer, Live 0xffffffff88264000
parport 73165 2 lp,parport_pc, Live 0xffffffff88251000
tpm_tis 48077 0 - Live 0xffffffff88242000
ide_cd 73825 0 - Live 0xffffffff8822c000
e1000 162665 0 - Live 0xffffffff88201000
soundcore 41825 1 snd, Live 0xffffffff881f5000
tpm 50273 1 tpm_tis, Live 0xffffffff881e7000
i2c_core 57537 2 i2c_ec,i2c_piix4, Live 0xffffffff881d7000
pcspkr 36289 0 - Live 0xffffffff881cd000
tpm_bios 40897 1 tpm, Live 0xffffffff881c2000
serio_raw 40517 0 - Live 0xffffffff881b7000
cdrom 68713 1 ide_cd, Live 0xffffffff881a5000
snd_page_alloc 44113 2 snd_intel8x0,snd_pcm, Live 0xffffffff88199000
dm_raid45 99785 0 - Live 0xffffffff8817f000
dm_message 36289 1 dm_raid45, Live 0xffffffff88175000
dm_region_hash 46145 1 dm_raid45, Live 0xffffffff88168000
dm_mem_cache 38977 1 dm_raid45, Live 0xffffffff8815d000
dm_snapshot 52233 0 - Live 0xffffffff8814f000
dm_zero 35265 0 - Live 0xffffffff88145000
dm_mirror 54737 0 - Live 0xffffffff88136000
dm_log 44993 3 dm_raid45,dm_region_hash,dm_mirror, Live 0xffffffff8812a000
dm_mod 102417 11 dm_multipath,dm_raid45,dm_snapshot,dm_zero,dm_mirror,dm_log, Live 0xffffffff8810f000
ata_piix 57925 0 - Live 0xffffffff880ff000
ahci 73805 2 - Live 0xffffffff880eb000
libata 208977 2 ata_piix,ahci, Live 0xffffffff880b6000
sd_mod 56513 3 - Live 0xffffffff880a7000
scsi_mod 199641 11 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2,scsi_transport_iscsi2,scsi_dh,sg,libata,sd_mod, Live 0xffffffff88075000
ext3 169297 2 - Live 0xffffffff8804a000
jbd 94897 1 ext3, Live 0xffffffff88031000
uhci_hcd 57433 0 - Live 0xffffffff88021000
ohci_hcd 56181 0 - Live 0xffffffff88012000
ehci_hcd 66765 0 - Live 0xffffffff88000000
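Each /proc/modules line carries the module name, its size, its use count, the comma-terminated list of modules that depend on it (or "-"), its state and its load address. A short sketch that pulls out two common answers; the function names unused_modules and dependents_of are made up for this example:

```shell
# unused_modules [FILE]
# Print the modules whose use count (third column) is 0.
# FILE defaults to /proc/modules; "-" reads standard input.
unused_modules() {
    awk '$3 == 0 { print $1 }' "${1:-/proc/modules}"
}

# dependents_of MODULE [FILE]
# Print the modules that depend on MODULE, separated by spaces.
dependents_of() {
    awk -v m="$1" '$1 == m && $4 != "-" {
        sub(/,$/, "", $4)      # drop the trailing comma
        gsub(/,/, " ", $4)     # turn the remaining commas into spaces
        print $4
    }' "${2:-/proc/modules}"
}

# Example with two lines in the real /proc/modules layout.
printf '%s\n' 'rfcomm 104681 0 - Live 0xffffffff886f2000' \
              'snd_timer 57161 2 snd_seq,snd_pcm, Live 0xffffffff882ae000' | dependents_of snd_timer -
```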

/proc/mounts
              Before kernel 2.4.19, this file was a list of all the file  sys-
              tems  currently mounted on the system.  With the introduction of
              per-process mount namespaces in Linux 2.4.19, this file became a
              link  to  /proc/self/mounts, which lists the mount points of the
              process's own mount namespace.  The format of this file is docu-
              mented in fstab(5).
For more about Linux mount namespaces, see http://www.cnblogs.com/lisperl/archive/2012/05/03/2480316.html

[root@centos-doxer ide]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw,data=ordered 0 0
/dev /dev tmpfs rw 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/sda1 /boot ext3 rw,data=ordered 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
/etc/auto.misc /misc autofs rw,fd=7,pgrp=2137,timeout=300,minproto=5,maxproto=5,indirect 0 0
-hosts /net autofs rw,fd=13,pgrp=2137,timeout=300,minproto=5,maxproto=5,indirect 0 0
[root@centos-doxer ide]#
[root@centos-doxer ide]# ls -l /proc/mounts
lrwxrwxrwx 1 root root 11 May 29 08:54 /proc/mounts -> self/mounts

/proc/net/arp
              This holds an ASCII readable dump of the kernel ARP  table  used
              for  address resolutions.  It will show both dynamically learned
              and pre-programmed ARP entries.  The format is:

        IP address     HW type   Flags     HW address          Mask   Device
        192.168.0.50   0x1       0x2       00:50:BF:25:68:F3   *      eth0
        192.168.0.250  0x1       0xc       00:00:00:00:00:00   *      eth0

              Here "IP address" is the IPv4 address of the machine and the "HW
              type"  is  the  hardware  type of the address from RFC 826.  The
              flags are the internal flags of the ARP structure (as defined in
              /usr/include/linux/if_arp.h)  and  the  "HW address" is the data
              link layer mapping for that IP address if it is known.

[root@centos-doxer net]# netstat -rnv
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
10.182.120.0 0.0.0.0 255.255.254.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
0.0.0.0 10.182.120.1 0.0.0.0 UG 0 0 0 eth0
[root@centos-doxer net]# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
10.182.120.179   0x1         0x2         00:21:CC:C9:CD:BB     *        eth0
10.182.120.1     0x1         0x2         00:00:0C:07:AC:2A     *        eth0

[root@centos-doxer net]# arp -an
? (10.182.120.179) at 00:21:CC:C9:CD:BB [ether] on eth0
? (10.182.120.1) at 00:00:0C:07:AC:2A [ether] on eth0
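The Flags column is a bitmask of the ATF_* constants from /usr/include/linux/if_arp.h: 0x02 is ATF_COM (completed entry), 0x04 is ATF_PERM (permanent entry) and 0x08 is ATF_PUBL (published entry). A minimal sketch that decodes those three bits; decode_arp_flags is a hypothetical helper and it ignores the less common flags:

```shell
# decode_arp_flags HEX
# Print the names of the ATF_COM, ATF_PERM and ATF_PUBL bits set in HEX.
decode_arp_flags() {
    f=$(( $1 ))
    out=""
    if [ $(( f & 0x02 )) -ne 0 ]; then out="$out ATF_COM"; fi
    if [ $(( f & 0x04 )) -ne 0 ]; then out="$out ATF_PERM"; fi
    if [ $(( f & 0x08 )) -ne 0 ]; then out="$out ATF_PUBL"; fi
    echo "${out# }"            # strip the leading space
}

# Examples: the dynamic entries above (0x2) and the man page's 0xc entry.
decode_arp_flags 0x2
decode_arp_flags 0xc
```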

/proc/net/dev
              The dev pseudo-file contains network device status  information.
              This  gives  the number of received and sent packets, the number
              of errors and collisions and other basic statistics.  These  are
              used  by  the  ifconfig(8) program to report device status.

[root@centos-doxer net]# cat /proc/net/dev
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo: 6097320 16211 0 0 0 0 0 0 6097320 16211 0 0 0 0 0 0
eth0: 761200 6083 0 0 0 0 0 423 2952411 5163 0 0 0 0 0 0
[root@centos-doxer net]#
[root@centos-doxer net]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 08:00:27:3F:C5:08
inet addr:10.182.120.178 Bcast:10.182.121.255 Mask:255.255.254.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6109 errors:0 dropped:0 overruns:0 frame:0
TX packets:5183 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:763900 (745.9 KiB) TX bytes:2955075 (2.8 MiB)
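The byte and packet counters that ifconfig reports can be read straight out of /proc/net/dev. The sketch below is one way to do it; iface_bytes is a made-up name, and it assumes the column layout shown above (eight receive columns followed by the transmit columns):

```shell
# iface_bytes [FILE]
# Print receive/transmit byte and packet counters for every interface.
# FILE defaults to /proc/net/dev; "-" reads standard input.
iface_bytes() {
    awk 'NR > 2 {                  # skip the two header lines
        sub(/:/, " ")              # "eth0:761200" and "eth0: 761200" both split cleanly
        printf "%s rx_bytes=%s rx_packets=%s tx_bytes=%s tx_packets=%s\n",
               $1, $2, $3, $10, $11
    }' "${1:-/proc/net/dev}"
}

# Example using the eth0 line shown above.
printf '%s\n' 'Inter-| Receive | Transmit' \
              'face |bytes packets ...' \
              'eth0: 761200 6083 0 0 0 0 0 423 2952411 5163 0 0 0 0 0 0' | iface_bytes -
```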

/proc/net/dev_mcast
              Lists the link-layer multicast addresses each network device is
              listening on.  The columns are the interface index, the interface
              name, the dmi_u and dmi_g counters, and the multicast hardware
              address (dmi_address).

[root@centos-doxer net]# cat /proc/net/dev_mcast
2 eth0 1 0 01005e0000fb
2 eth0 1 0 01005e000001

/proc/net/snmp #more info here http://blog.csdn.net/tenfyguo/article/details/7478584
              This file holds the ASCII data needed for the IP, ICMP, TCP, and
              UDP management information bases for an SNMP agent.

[root@centos-doxer net]# cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 2 64 21835 0 0 0 0 0 21816 20194 0 0 0 0 0 0 0 0 0
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 33 0 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
IcmpMsg: InType3
IcmpMsg: 33
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 44 45 1 2 49 19927 20039 2 0 6
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 70 0 0 125
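Because every protocol appears as a header line followed by a value line, the two can be zipped into name=value pairs. A sketch of that pairing, with snmp_counters as an invented helper name:

```shell
# snmp_counters PROTO [FILE]
# Print name=value pairs for one protocol block (Ip, Icmp, Tcp, Udp, ...).
# FILE defaults to /proc/net/snmp; "-" reads standard input.
snmp_counters() {
    awk -v proto="$1:" '
        $1 == proto {
            if (!have_header) {            # first matching line: column names
                for (i = 2; i <= NF; i++) name[i] = $i
                have_header = 1
            } else {                       # second matching line: the values
                for (i = 2; i <= NF; i++) print name[i] "=" $i
                have_header = 0
            }
        }' "${2:-/proc/net/snmp}"
}

# Example using the Udp lines shown above.
printf '%s\n' 'Udp: InDatagrams NoPorts InErrors OutDatagrams' 'Udp: 70 0 0 125' | snmp_counters Udp -
```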

/proc/scsi/scsi
              This is a listing of all SCSI devices known to the kernel.   The
              listing  is  similar  to  the one seen during bootup.  scsi cur-
              rently supports only the add-single-device command which  allows
              root to add a hotplugged device to the list of known devices.

              The command

                  echo 'scsi add-single-device 1 0 5 0' > /proc/scsi/scsi

              will  cause host scsi1 to scan on SCSI channel 0 for a device on
              ID 5 LUN 0.  If there is already a device known on this  address
              or the address is invalid, an error will be returned.

[root@centos-doxer scsi]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: ATA Model: VBOX HARDDISK Rev: 1.0
Type: Direct-Access ANSI SCSI revision: 05

PS: issue_lip in linux

root@testserver01# pwd
/sys/class/scsi_host/host0
root@testserver01# echo '- - -' > scan

http://www.doxer.org/learn-linux/extend-filesystem-in-vxvm-which-connects-to-san-fibre-channel-storage/#more-777

/proc/swaps
              Swap areas in use.  See also swapon(8).

[root@centos-doxer scsi]# cat /proc/swaps
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 2064376 0 -1
[root@centos-doxer scsi]# swapon -s
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 2064376 0 -1

[root@centos-doxer scsi]# fdisk -l /dev/mapper/VolGroup00-LogVol01

Disk /dev/mapper/VolGroup00-LogVol01: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mapper/VolGroup00-LogVol01 doesn’t contain a valid partition table

Categories: Kernel, Linux, Systems Tags:

hwclock and 11 minute mode in linux

May 28th, 2013

From manpage of hwclock:

You should be aware of another way that the Hardware Clock is kept synchronized in some systems. The Linux kernel has a mode wherein it copies the System Time to the Hardware Clock every 11 minutes. This is a good mode to use when you are using something sophisticated like ntp to keep your System Time synchronized. (ntp is a way to keep your System Time synchronized either to a time server somewhere on the network or to a radio clock hooked up to your system. See RFC 1305).

This mode (we’ll call it “11 minute mode”) is off until something turns it on. The ntp daemon xntpd is one thing that turns it on. You can turn it off by running anything, including hwclock --hctosys, that sets the System Time the old fashioned way.

To see if it is on or off, use the command adjtimex --print and look at the value of “status”. If the “64” bit of this number (expressed in binary) is equal to 0, 11 minute mode is on. Otherwise, it is off.

If your system runs with 11 minute mode on, don’t use hwclock --adjust or hwclock --hctosys. You’ll just make a mess. It is acceptable to use a hwclock --hctosys at startup time to get a reasonable System Time until your system is able to set the System Time from the external source and start 11 minute mode.

To test whether 11 minute mode is on, you can use the following script (run yum install adjtimex first):

[root@centos-doxer ~]# cat 11minmode
#!/bin/bash

# and() is a gawk extension; status bit 64 clear means 11 minute mode is on
adjtimex --print | awk '/status/ {
if ( and($2, 64) == 0 ) {
print "11 minute mode is enabled"
} else {
print "11 minute mode is disabled"
}
}'

In practice, it is better to let ntp keep the system time in sync, and then copy the system time to the hardware clock with hwclock --systohc from a cron job.
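The bit test used in the script can also be done with plain shell arithmetic, which avoids the gawk-only and() function. This is a minimal sketch, assuming the status value has already been extracted from adjtimex output; the function name eleven_minute_mode and the sample values are illustrative and not part of hwclock or adjtimex.

```shell
#!/bin/sh
# Report 11 minute mode from a kernel clock status word.
# Per the hwclock man page: "64" bit clear -> mode on, bit set -> mode off.
eleven_minute_mode() {
    status=$1
    if [ $(( status & 64 )) -eq 0 ]; then
        echo "11 minute mode is enabled"
    else
        echo "11 minute mode is disabled"
    fi
}

eleven_minute_mode 1     # sample status with the 64 bit clear
eleven_minute_mode 64    # sample status with the 64 bit set
```

On a live system the argument could come from something like adjtimex --print | awk '/status/ {print $2}'.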

Categories: Kernel, Linux, Systems Tags:

/proc filesystem – day 1

May 24th, 2013

Assumption:

[root@centos-doxer proc]# env|grep SSH_TTY
SSH_TTY=/dev/pts/3

[root@centos-doxer proc]# ps -ef|grep 'root@pts/3'
root 3157 2185 0 08:11 ? 00:00:00 sshd: root@pts/3

[root@centos-doxer 3157]# cd /proc/3157

Now we’re going to see /proc/[pid]/:

/proc/[pid]/cmdline
This holds the complete command line for the process, unless the
process is a zombie. In the latter case, there is nothing in
this file: that is, a read on this file will return 0 characters.
The command-line arguments appear in this file as a set
of null-separated strings, with a further null byte ('\0') after
the last string.
[root@centos-doxer 3157]# cat cmdline
sshd: root@pts/3
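Because the arguments in cmdline are separated by null bytes, cat runs them together on one line. A minimal sketch of splitting them for display follows; the function name show_args is an illustrative helper, and the sample argument list is made up.

```shell
#!/bin/sh
# Print a NUL-separated argument list (as found in /proc/[pid]/cmdline)
# with one argument per line, by translating each NUL into a newline.
show_args() {
    tr '\0' '\n'
}

# Sample input built with printf, standing in for a real cmdline file.
printf 'ls\0-l\0/tmp\0' | show_args

# On a live system: show_args < /proc/$$/cmdline
```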
/proc/[pid]/cwd
This is a symbolic link to the current working directory of the
process. To find out the current working directory of process
20, for instance, you can do this:

$ cd /proc/20/cwd; /bin/pwd
[root@centos-doxer cwd]# cd /proc/3157/cwd
[root@centos-doxer cwd]# pwd -P
/
[root@centos-doxer cwd]# pwdx 3157
3157: /
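/proc/[pid]/exe
Under Linux 2.2 and later, this file is a symbolic link containing
the actual pathname of the executed command. (Under Linux 2.0 and
earlier it was a pointer to the binary which was executed, and
appeared as a symbolic link.)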
[root@centos-doxer 3157]# ls -l exe
lrwxrwxrwx 1 root root 0 May 24 08:11 exe -> /usr/sbin/sshd
[root@centos-doxer 3157]# readlink exe
/usr/sbin/sshd
/proc/[pid]/fd
This is a subdirectory containing one entry for each file which
the process has open, named by its file descriptor, and which is
a symbolic link to the actual file. Thus, 0 is standard input,
1 standard output, 2 standard error, etc.
/proc/self/fd/N is approximately the same as /dev/fd/N in some
Unix and Unix-like systems. Most Linux MAKEDEV scripts symbolically
link /dev/fd to /proc/self/fd, in fact.

Most systems provide symbolic links /dev/stdin, /dev/stdout, and
/dev/stderr, which respectively link to the files 0, 1, and 2 in
/proc/self/fd. Thus a command that takes its input and output
files as arguments (here foobar, with -i and -o flags) could be
written as:

$ foobar -i /dev/stdin -o /dev/stdout …
[root@centos-doxer 3157]# ls -l fd/
total 0
lrwx------ 1 root root 64 May 24 08:39 0 -> /dev/null
lrwx------ 1 root root 64 May 24 08:39 1 -> /dev/null
lrwx------ 1 root root 64 May 24 08:39 2 -> /dev/null
lrwx------ 1 root root 64 May 24 08:39 3 -> socket:[10835]
lrwx------ 1 root root 64 May 24 08:39 4 -> socket:[10866]
lr-x------ 1 root root 64 May 24 08:39 5 -> pipe:[10870]
l-wx------ 1 root root 64 May 24 08:39 6 -> pipe:[10870]
lrwx------ 1 root root 64 May 24 08:39 7 -> /dev/ptmx
lrwx------ 1 root root 64 May 24 08:11 8 -> /dev/ptmx
lrwx------ 1 root root 64 May 24 08:39 9 -> /dev/ptmx

[root@centos-doxer 3157]# ls -l /dev/stdin
lrwxrwxrwx 1 root root 15 May 24 08:07 /dev/stdin -> /proc/self/fd/0
[root@centos-doxer 3157]# ls -l /proc/self/fd/0
lrwx------ 1 root root 64 May 24 08:44 /proc/self/fd/0 -> /dev/pts/3
[root@centos-doxer 3157]#
[root@centos-doxer 3157]# ls -l /dev/stdout
lrwxrwxrwx 1 root root 15 May 24 08:07 /dev/stdout -> /proc/self/fd/1
[root@centos-doxer 3157]# ls -l /proc/self/fd/1
lrwx------ 1 root root 64 May 24 08:44 /proc/self/fd/1 -> /dev/pts/3
[root@centos-doxer 3157]#
[root@centos-doxer 3157]# ls -l /dev/stderr
lrwxrwxrwx 1 root root 15 May 24 08:07 /dev/stderr -> /proc/self/fd/2
[root@centos-doxer 3157]# ls -l /proc/self/fd/2
lrwx------ 1 root root 64 May 24 08:44 /proc/self/fd/2 -> /dev/pts/3

/proc/[pid]/fdinfo/ (since kernel 2.6.22)
This is a subdirectory containing one entry for each file which
the process has open, named by its file descriptor. The con-
tents of each file can be read to obtain information about the
corresponding file descriptor, for example:

$ cat /proc/12015/fdinfo/4
[root@centos-doxer 3157]# ls -l fdinfo/
total 0
-r-------- 1 root root 0 May 24 08:45 0
-r-------- 1 root root 0 May 24 08:45 1
-r-------- 1 root root 0 May 24 08:45 2
-r-------- 1 root root 0 May 24 08:45 3
-r-------- 1 root root 0 May 24 08:45 4
-r-------- 1 root root 0 May 24 08:45 5
-r-------- 1 root root 0 May 24 08:45 6
-r-------- 1 root root 0 May 24 08:45 7
-r-------- 1 root root 0 May 24 08:45 8
-r-------- 1 root root 0 May 24 08:45 9
/proc/[pid]/maps
A file containing the currently mapped memory regions and their
access permissions.

The format is:

address perms offset dev inode pathname
08048000-08056000 r-xp 00000000 03:0c 64593 /usr/sbin/gpm
08056000-08058000 rw-p 0000d000 03:0c 64593 /usr/sbin/gpm
08058000-0805b000 rwxp 00000000 00:00 0
40000000-40013000 r-xp 00000000 03:0c 4165 /lib/ld-2.2.4.so
40013000-40015000 rw-p 00012000 03:0c 4165 /lib/ld-2.2.4.so
4001f000-40135000 r-xp 00000000 03:0c 45494 /lib/libc-2.2.4.so
40135000-4013e000 rw-p 00115000 03:0c 45494 /lib/libc-2.2.4.so
4013e000-40142000 rw-p 00000000 00:00 0
bffff000-c0000000 rwxp 00000000 00:00 0

where “address” is the address space in the process that it
occupies, “perms” is a set of permissions:

r = read
w = write
x = execute
s = shared
p = private (copy on write)

“offset” is the offset into the file/whatever, “dev” is the
device (major:minor), and “inode” is the inode on that device.
0 indicates that no inode is associated with the memory region,
as the case would be with BSS (uninitialized data).
[root@centos-doxer 3157]# cat maps
2b98ef380000-2b98ef3e1000 r-xp 00000000 fd:00 5348968 /usr/sbin/sshd
2b98ef5e1000-2b98ef5e4000 rw-p 00061000 fd:00 5348968 /usr/sbin/sshd
2b98ef5e4000-2b98ef5ed000 rw-p 2b98ef5e4000 00:00 0
……

[root@centos-doxer 3157]# pmap -d 3157
3157: sshd: root@pts/3
Address Kbytes Mode Offset Device Mapping
00002b98ef380000 388 r-x-- 0000000000000000 0fd:00000 sshd
00002b98ef5e1000 12 rw--- 0000000000061000 0fd:00000 sshd
00002b98ef5e4000 36 rw--- 00002b98ef5e4000 000:00000 [ anon ]
00002b98ef5ed000 112 r-x-- 0000000000000000 0fd:00000 ld-2.5.so
……
mapped: 98380K writeable/private: 1180K shared: 2560K
[root@centos-doxer 3157]# ps aux|egrep 'USER|3157'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3157 0.0 0.3 90192 3460 ? Ss 08:11 0:01 sshd: root@pts/3
root 3521 0.0 0.0 6056 552 pts/3 R+ 09:08 0:00 egrep USER|3157
[root@centos-doxer 3157]# cat stat | awk '{print $23 / 1024}'
90192

[root@centos-doxer 3157]# top -p 3157 -n 1
top - 08:58:15 up 51 min, 7 users, load average: 0.00, 0.00, 0.00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.4%us, 0.4%sy, 0.1%ni, 92.2%id, 6.3%wa, 0.1%hi, 0.6%si, 0.0%st
Mem: 1026080k total, 728268k used, 297812k free, 22092k buffers
Swap: 2064376k total, 0k used, 2064376k free, 332616k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3157 root 15 0 90192 3460 2684 S 0.0 0.3 0:01.41 sshd

VIRT = Virtual Image (kb)
RES = Resident size (kb)
%MEM = Memory usage (RES)
SHR = Shared Mem size (kb)
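The Kbytes column printed by pmap follows directly from the address ranges in maps: each size is the end address minus the start address. A minimal sketch of that arithmetic, using two ranges from the sshd output above; the function name map_kb is an illustrative helper.

```shell
#!/bin/sh
# Size in kB of one /proc/[pid]/maps address range given as "start-end"
# (hex addresses without a 0x prefix, as they appear in the maps file).
map_kb() {
    range=$1
    start=${range%-*}    # text before the dash
    end=${range#*-}      # text after the dash
    echo $(( (0x$end - 0x$start) / 1024 ))
}

map_kb 2b98ef380000-2b98ef3e1000   # prints 388, the first sshd mapping
map_kb 2b98ef5e1000-2b98ef5e4000   # prints 12, the second sshd mapping
```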

/proc/[pid]/status
Provides much of the information in /proc/[pid]/stat and
/proc/[pid]/statm in a format that’s easier for humans to parse.
Here’s an example:

$ cat /proc/$$/status
Name: bash
State: S (sleeping)
Tgid: 3515
Pid: 3515
PPid: 3452
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 16 33 100
VmPeak: 9136 kB
VmSize: 7896 kB
VmLck: 0 kB
VmHWM: 7572 kB
VmRSS: 6316 kB
VmData: 5224 kB
VmStk: 88 kB
VmExe: 572 kB
VmLib: 1708 kB
VmPTE: 20 kB
Threads: 1
SigQ: 0/3067
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000010000
SigIgn: 0000000000384004
SigCgt: 000000004b813efb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed: 00000001

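The fields are as follows:

* Name: Command run by this process.

* State: Current state of the process. One of “R (running)”,
“S (sleeping)”, “D (disk sleep)”, “T (stopped)”, “T (tracing
stop)”, “Z (zombie)”, or “X (dead)”.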

* Tgid: Thread group ID (i.e., Process ID).

* Pid: Thread ID (see gettid(2)).

* TracerPid: PID of process tracing this process (0 if not being
traced).

* Uid, Gid: Real, effective, saved set, and file system UIDs
(GIDs).

* FDSize: Number of file descriptor slots currently allocated.

* Groups: Supplementary group list.

* VmPeak: Peak virtual memory size.

* VmSize: Virtual memory size.

* VmLck: Locked memory size.

* VmHWM: Peak resident set size (“high water mark”).

* VmRSS: Resident set size.

* VmData, VmStk, VmExe: Size of data, stack, and text segments.

* VmLib: Shared library code size.

* VmPTE: Page table entries size (since Linux 2.6.10).

* Threads: Number of threads in process containing this thread.

* SigPnd, ShdPnd: Number of signals pending for thread and for
process as a whole (see pthreads(7) and signal(7)).

* SigBlk, SigIgn, SigCgt: Masks indicating signals being
blocked, ignored, and caught (see signal(7)).

* CapInh, CapPrm, CapEff: Masks of capabilities enabled in
inheritable, permitted, and effective sets (see capabili-
ties(7)).

* CapBnd: Capability Bounding set (since kernel 2.6.26, see
capabilities(7)).

* Cpus_allowed: Mask of CPUs on which this process may run
(since Linux 2.6.24, see cpuset(7)).

* Cpus_allowed_list: Same as previous, but in “list format”
(since Linux 2.6.26, see cpuset(7)).

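* Mems_allowed: Mask of memory nodes allowed to this process
(since Linux 2.6.24, see cpuset(7)).

/proc/[pid]/task (since Linux 2.6.0-test6)
This is a directory that contains one subdirectory for each
thread in the process. The name of each subdirectory is the
numerical thread ID ([tid]) of the thread (see gettid(2)).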
Within each of these subdirectories, there is a set of files
with the same names and contents as under the /proc/[pid] direc-
tories. For attributes that are shared by all threads, the con-
tents for each of the files under the task/[tid] subdirectories
will be the same as in the corresponding file in the parent
/proc/[pid] directory (e.g., in a multithreaded process, all of
the task/[tid]/cwd files will have the same value as the
/proc/[pid]/cwd file in the parent directory, since all of the
threads in a process share a working directory). For attributes
that are distinct for each thread, the corresponding files under
task/[tid] may have different values (e.g., various fields in
each of the task/[tid]/status files may be different for each
thread).

In a multithreaded process, the contents of the /proc/[pid]/task
directory are not available if the main thread has already ter-
minated (typically by calling pthread_exit(3)).

[root@centos-doxer 3157]# cat status
Name: sshd
State: S (sleeping)
SleepAVG: 98%
Tgid: 3157
Pid: 3157
PPid: 2185
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups:
VmPeak: 90224 kB
VmSize: 90192 kB
VmLck: 0 kB
VmHWM: 3460 kB
VmRSS: 3460 kB
VmData: 728 kB
VmStk: 88 kB
VmExe: 388 kB
VmLib: 6228 kB
VmPTE: 188 kB
StaBrk: 2b98f6299000 kB
Brk: 2b98f62d4000 kB
StaStk: 7fff6bde15f0 kB
Threads: 1
SigQ: 1/8191
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180014006
CapInh: 0000000000000000
CapPrm: 00000000fffffeff
CapEff: 00000000fffffeff
Cpus_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed: 00000000,00000001
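The SigBlk, SigIgn and SigCgt values are hexadecimal bit masks in which bit N-1 stands for signal number N. A minimal sketch of decoding such a mask into signal numbers, applied to the sshd values above; the function name mask_to_signals is an illustrative helper, and masks with the top (64th) bit set may not be handled by every shell's arithmetic.

```shell
#!/bin/sh
# List the signal numbers whose bits are set in a hex mask taken from
# /proc/[pid]/status (bit 0 = signal 1, bit 1 = signal 2, and so on).
mask_to_signals() {
    mask=$(( 0x$1 ))
    sig=1
    out=""
    while [ "$sig" -le 64 ]; do
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$sig"
        fi
        sig=$(( sig + 1 ))
    done
    echo "$out"
}

mask_to_signals 0000000000001000   # SigIgn: prints 13
mask_to_signals 0000000180014006   # SigCgt: prints 2 3 15 17 32 33
```

The numbers can be mapped to names with kill -l; on Linux, 13 is SIGPIPE and 15 is SIGTERM.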
/proc/[pid]/smaps (since Linux 2.6.14)
This file shows memory consumption for each of the process’s
mappings. For each mapping there is a series of lines such
as the following:

08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
Size: 464 kB

This file is only present if the CONFIG_MMU kernel configuration
option is enabled.
[root@centos-doxer 3157]# cat smaps
2b98ef380000-2b98ef3e1000 r-xp 00000000 fd:00 5348968 /usr/sbin/sshd
Size: 388 kB
Rss: 332 kB
Shared_Clean: 332 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Swap: 0 kB
Pss: 78 kB
2b98ef5e1000-2b98ef5e4000 rw-p 00061000 fd:00 5348968 /usr/sbin/sshd
Size: 12 kB
Rss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Swap: 0 kB
Pss: 12 kB
……
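Since smaps repeats the same field names for every mapping, a per-process total is just the sum of one field over all mappings. A minimal sketch with awk, fed with the values of the two sshd mappings shown above; the function name sum_smaps_field is an illustrative helper.

```shell
#!/bin/sh
# Sum one smaps field (for example Pss or Rss) over all mappings read
# from standard input; the result is in kB, the unit used by smaps.
sum_smaps_field() {
    awk -v key="$1:" '$1 == key { total += $2 } END { print total + 0 }'
}

# Sample lines copied from the two sshd mappings above.
sample='Size: 388 kB
Rss: 332 kB
Pss: 78 kB
Size: 12 kB
Rss: 12 kB
Pss: 12 kB'

printf '%s\n' "$sample" | sum_smaps_field Pss   # prints 90
printf '%s\n' "$sample" | sum_smaps_field Rss   # prints 344

# On a live system: sum_smaps_field Pss < /proc/3157/smaps
```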
/proc/[pid]/oom_adj (since Linux 2.6.11)
This file can be used to adjust the score used to select which
process should be killed in an out-of-memory (OOM) situation.
The kernel uses this value for a bit-shift operation of the
process’s oom_score value: valid values are in the range -16 to
+15, plus the special value -17, which disables OOM-killing
altogether for this process. A positive score increases the
likelihood of this process being killed by the OOM-killer; a
negative score decreases the likelihood. The default value for
this file is 0; a new process inherits its parent’s oom_adj set-
ting. A process must be privileged (CAP_SYS_RESOURCE) to update
this file.

/proc/[pid]/oom_score (since Linux 2.6.11)
This file displays the current score that the kernel gives to
this process for the purpose of selecting a process for the OOM-
killer. A higher score means that the process is more likely to
be selected by the OOM-killer. The basis for this score is the
amount of memory used by the process, with increases (+) or
decreases (-) for factors including:

* whether the process creates a lot of children using fork(2)
(+);

* whether the process has been running a long time, or has used
a lot of CPU time (-);

* whether the process has a low nice value (i.e., > 0) (+);

* whether the process is privileged (-); and

* whether the process is making direct hardware access (-).

The oom_score also reflects the bit-shift adjustment specified
by the oom_adj setting for the process.
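The bit-shift adjustment described above can be illustrated with shell arithmetic. This is only a sketch of the idea in the man page text, not the kernel's code: the function name adjust_oom_points is hypothetical, the base score of 1000 is a made-up sample, and printing 0 for the special value -17 is an assumption standing in for "never selected".

```shell
#!/bin/sh
# Illustrate how an oom_adj value could scale a base score by bit shifts:
# positive values shift left (more likely to be killed), negative values
# shift right (less likely), and -17 disables OOM-killing for the process.
adjust_oom_points() {
    points=$1
    adj=$2
    if [ "$adj" -eq -17 ]; then
        echo 0                                  # assumed stand-in for "disabled"
    elif [ "$adj" -gt 0 ]; then
        echo $(( points << adj ))
    elif [ "$adj" -lt 0 ]; then
        echo $(( points >> (0 - adj) ))
    else
        echo "$points"
    fi
}

adjust_oom_points 1000 3      # prints 8000
adjust_oom_points 1000 -2     # prints 250
adjust_oom_points 1000 -17    # prints 0
```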
/proc/[pid]/stat
Status information about the process. This is used by ps(1).
It is defined in /usr/src/linux/fs/proc/array.c.

The fields, in order, with their proper scanf(3) format speci-
fiers, are:

pid %d The process ID.

comm %s The filename of the executable, in parentheses.
This is visible whether or not the executable is
swapped out.

state %c One character from the string “RSDZTW” where R is
running, S is sleeping in an interruptible wait, D
is waiting in uninterruptible disk sleep, Z is zom-
bie, T is traced or stopped (on a signal), and W is
paging.

ppid %d The PID of the parent.

pgrp %d The process group ID of the process.

session %d The session ID of the process.

tty_nr %d The controlling terminal of the process. (The minor
device number is contained in the combination of
bits 31 to 20 and 7 to 0; the major device number is
in bits 15 to 8.)

tpgid %d The ID of the foreground process group of the con-
trolling terminal of the process.

flags %u (%lu before Linux 2.6.22)
The kernel flags word of the process. For bit mean-
ings, see the PF_* defines in <linux/sched.h>.
Details depend on the kernel version.

minflt %lu The number of minor faults the process has made
which have not required loading a memory page from
disk.

cminflt %lu The number of minor faults that the process’s
waited-for children have made.

majflt %lu The number of major faults the process has made
which have required loading a memory page from disk.

cmajflt %lu The number of major faults that the process’s
waited-for children have made.

cutime %ld Amount of time that this process’s waited-for children
have been scheduled in user mode, measured in
clock ticks (divide by sysconf(_SC_CLK_TCK)). (See
also times(2).) This includes guest time,
cguest_time (time spent running a virtual CPU, see
below).

cstime %ld Amount of time that this process’s waited-for children
have been scheduled in kernel mode, measured in
clock ticks (divide by sysconf(_SC_CLK_TCK)).

priority %ld
(Explanation for Linux 2.6) For processes running a
real-time scheduling policy (policy below; see
sched_setscheduler(2)), this is the negated schedul-
ing priority, minus one; that is, a number in the
range -2 to -100, corresponding to real-time priori-
ties 1 to 99. For processes running under a non-
real-time scheduling policy, this is the raw nice
value (setpriority(2)) as represented in the kernel.
The kernel stores nice values as numbers in the
range 0 (high) to 39 (low), corresponding to the
user-visible nice range of -20 to 19.

Before Linux 2.6, this was a scaled value based on
the scheduler weighting given to this process.

nice %ld The nice value (see setpriority(2)), a value in the
range 19 (low priority) to -20 (high priority).

num_threads %ld
Number of threads in this process (since Linux 2.6).
Before kernel 2.6, this field was hard coded to 0 as
a placeholder for an earlier removed field.

itrealvalue %ld
The time in jiffies before the next SIGALRM is sent
to the process due to an interval timer. Since ker-
nel 2.6.17, this field is no longer maintained, and
is hard coded as 0.

starttime %llu (was %lu before Linux 2.6)
The time in jiffies the process started after system
boot.

vsize %lu Virtual memory size in bytes.

rss %ld Resident Set Size: number of pages the process has
in real memory. This is just the pages which count
towards text, data, or stack space. This does not
include pages which have not been demand-loaded in,
or which are swapped out.
kstkesp %lu The current value of ESP (stack pointer), as found
in the kernel stack page for the process.

kstkeip %lu The current EIP (instruction pointer).

signal %lu The bitmap of pending signals, displayed as a deci-
mal number. Obsolete, because it does not provide
information on real-time signals; use
/proc/[pid]/status instead.

blocked %lu The bitmap of blocked signals, displayed as a deci-
mal number. Obsolete, because it does not provide
information on real-time signals; use
/proc/[pid]/status instead.

sigignore %lu
The bitmap of ignored signals, displayed as a deci-
mal number. Obsolete, because it does not provide
information on real-time signals; use
/proc/[pid]/status instead.

sigcatch %lu
The bitmap of caught signals, displayed as a decimal
number. Obsolete, because it does not provide
information on real-time signals; use
/proc/[pid]/status instead.

wchan %lu This is the “channel” in which the process is wait-
ing. It is the address of a system call, and can be
looked up in a namelist if you need a textual name.
(If you have an up-to-date /etc/psdatabase, then try
ps -l to see the WCHAN field in action.)

nswap %lu Number of pages swapped (not maintained).

cnswap %lu Cumulative nswap for child processes (not main-
tained).

exit_signal %d (since Linux 2.1.22)
Signal to be sent to parent when we die.

processor %d (since Linux 2.2.8)
CPU number last executed on.

rt_priority %u (since Linux 2.5.19; was %lu before Linux 2.6.22)
Real-time scheduling priority, a number in the range
1 to 99 for processes scheduled under a real-time
policy, or 0, for non-real-time processes (see
sched_setscheduler(2)).

policy %u (since Linux 2.5.19; was %lu before Linux 2.6.22)
Scheduling policy (see sched_setscheduler(2)).
Decode using the SCHED_* constants in linux/sched.h.

cguest_time %ld (since Linux 2.6.24)
Guest time of the process’s children, measured in
clock ticks (divide by sysconf(_SC_CLK_TCK)).
[root@centos-doxer 3157]# cat stat
3157 (sshd) S 2185 3157 3157 0 -1 4202752 1159 409 0 0 41 124 0 0 15 0 1 0 29275 92356608 870 18446744073709551615 47935848448000 47935848844780 140735003104752 18446744073709551615 47935894114547 0 0 4096 81926 0 0 0 17 0 0 0 22
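Picking fields out of stat by position, as in the awk '{print $23 / 1024}' command earlier, breaks if the comm field contains spaces. A minimal sketch that counts fields from the closing parenthesis instead; the function name stat_field is an illustrative helper and only handles field numbers from 3 (state) upward.

```shell
#!/bin/sh
# Print field N (N >= 3) of a /proc/[pid]/stat line. Everything up to the
# last ") " is dropped first, so a comm value with spaces cannot shift
# the field positions.
stat_field() {
    line=$1
    n=$2
    rest=${line##*) }      # starts at field 3 (state)
    set -- $rest           # split the remainder on whitespace
    shift $(( n - 3 ))
    echo "$1"
}

# The beginning of the sshd stat line shown above.
sample_stat='3157 (sshd) S 2185 3157 3157 0 -1 4202752 1159 409 0 0 41 124 0 0 15 0 1 0 29275 92356608 870'

stat_field "$sample_stat" 3     # state: prints S
stat_field "$sample_stat" 23    # vsize in bytes: prints 92356608
stat_field "$sample_stat" 24    # rss in pages: prints 870
```

Dividing the vsize value by 1024 gives 90192, the VSZ that ps and top report for this process.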

/proc/[pid]/statm
Provides information about memory usage, measured in pages. The
columns are:

size total program size
(same as VmSize in /proc/[pid]/status)
resident resident set size
(same as VmRSS in /proc/[pid]/status)
share shared pages (from shared mappings)
text text (code)
lib library (unused in Linux 2.6)
data data + stack
dt dirty pages (unused in Linux 2.6)
[root@centos-doxer 3157]# cat statm
22548 870 674 97 0 204 0
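The statm values are counted in pages, so they need to be multiplied by the page size to compare them with the kB figures in status. A minimal sketch, assuming a 4096-byte page for the sample values; the function name statm_kb is an illustrative helper.

```shell
#!/bin/sh
# Convert a page count from /proc/[pid]/statm to kB. The page size in
# bytes is taken from the second argument, or from getconf if omitted.
statm_kb() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( pages * page_size / 1024 ))
}

statm_kb 22548 4096   # size: prints 90192, matching VmSize above
statm_kb 870 4096     # resident: prints 3480
```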

Categories: Kernel, Linux, Systems Tags:

cpu hyperthreading vs dual core

May 14th, 2013

Note: This is from http://www.richweb.com/cpu_info

A hyperthreaded processor has the same number of function units as an older, non-hyperthreaded processor. It just has two execution contexts, so it can maybe achieve better function unit utilization by letting more than one program execute concurrently. On the other hand, if you’re running two programs which compete for the same function units, there is no advantage at all to having both running “concurrently.” When one is running, the other is necessarily waiting on the same function units.

A dual core processor literally has two times as many function units as a single-core processor, and can really run two programs concurrently, with no competition for function units.

A dual core processor is built so that both cores share the same level 2 cache. A dual processor (separate physical cpus) system differs in that each cpu will have its own level 2 cache. This may sound like an advantage, and in some situations it can be, but in many cases research and testing show that the shared cache can be faster when the cpus are working on the same or very similar tasks.

In general, Hyperthreading was considered older technology when this note was written, although later Intel processors reintroduced it. Hyperthreading can provide a marginal (10%) gain for some server workloads like mysql, but dual core technology had essentially replaced hyperthreading in the systems shown here.

A dual core cpu running at 3.0GHz should be faster than a dual cpu (separate core) system running at 3.0GHz due to the ability to share the cache at higher bus speeds.

The examples below detail how we determine what kind of cpu(s) are present.

The kernel data Linux exposes in /proc/cpuinfo will show each logical cpu with a unique processor number. A logical cpu can be a hyperthreading sibling, a shared core in a dual or quad core, or a separate physical cpu. We must look at the siblings, cpu cores and core id to tell the difference.

If the number of cores = the number of siblings for a given physical processor, then hyperthreading is OFF.

/bin/cat /proc/cpuinfo | /bin/egrep 'processor|model name|cache size|core|sibling|physical'
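The siblings versus cores rule can be checked mechanically. The following is a minimal sketch, assuming cpuinfo text on standard input and looking only at the last siblings and cpu cores values it sees; the function name ht_state is an illustrative helper, and the sample inputs mirror Examples 2 and 3 below.

```shell
#!/bin/sh
# Compare "siblings" with "cpu cores" from /proc/cpuinfo-style input.
# Equal values mean no hyperthreading siblings; more siblings than cores
# means hyperthreading is on. Missing fields are reported as unknown.
ht_state() {
    awk -F: '
        /^siblings/  { siblings = $2 + 0 }
        /^cpu cores/ { cores = $2 + 0 }
        END {
            if (siblings == 0 || cores == 0) {
                print "unknown"
            } else if (siblings > cores) {
                print "hyperthreading ON"
            } else {
                print "hyperthreading OFF"
            }
        }'
}

printf 'siblings\t: 2\ncpu cores\t: 1\n' | ht_state   # prints hyperthreading ON
printf 'siblings\t: 4\ncpu cores\t: 4\n' | ht_state   # prints hyperthreading OFF

# On a live system: ht_state < /proc/cpuinfo
```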

 

Example 1: Single processor, 1 core, no Hyperthreading

processor	: 0
model name	: AMD Duron(tm) processor
cache size	: 64 KB

 

Example 2: Single processor, 1 core, Hyperthreading is enabled.

Notice how we have 2 siblings, but only 1 core. The physical cpu id is the same for both: 0.

processor	: 0
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1

 

Example 3. Single socket Quad Core

Notice how each processor has its own core id. The number of siblings matches the number of cores so there are no Hyperthreading siblings. Also notice the huge l2 cache – 6 MB. That makes sense though, when considering 4 cores share that l2 cache.

processor	: 0
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
processor	: 1
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
processor	: 2
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
processor	: 3
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4

 

Example 3a. Single socket Dual Core

Again, each processor has its own core so this is a dual core system.

 

processor	: 0
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2

 

Example 4. Dual Single core CPU, Hyperthreading ENABLED

This example shows that processors 0 and 2 share one physical cpu and processors 1 and 3 share the other. The number of siblings is twice the number of cores, which is another clue that this is a system with hyperthreading enabled.

 

processor	: 0
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 2
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 3
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1

 

Example 5. Dual CPU Dual Core No hyperthreading

Of the 5 examples this should be the most capable system processor-wise. There are a total of 4 cores: 2 cores in each of 2 separate socketed physical cpus. Each core shares the 4MB cache with its sibling core. The higher clock rate (3.0GHz vs 2.33GHz) should offer slightly better performance than example 3.

 

processor	: 0
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
processor	: 2
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 3
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2

PS:
For explanation about flags in linux /proc/cpuinfo, you can refer to following:
http://blog.incase.de/index.php/cpu-feature-flags-and-their-meanings/
Categories: Kernel, Linux Tags:

resolved – differences between zfs ARC L2ARC ZIL

January 31st, 2013
  • ARC

The ZFS ARC (adaptive replacement cache) is a very fast cache located in the server’s memory.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.

  • L2ARC

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory is just cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second level adaptive replacement cache. The L2ARC is often called “cache drives” in the ZFS systems.

L2ARC is a new layer between Disk and the cache (ARC) in main memory for ZFS. It uses dedicated storage devices to hold cached data. The main role of this cache is to boost the performance of random read workloads. The intended L2ARC devices include 10K/15K RPM disks like short-stroked disks, solid state disks (SSD), and other media with substantially faster read latency than disk.

  • ZIL

ZIL (ZFS Intent Log) exists to improve the performance of synchronous writes. Synchronous writes are much slower than asynchronous writes, but they are safer. Essentially, the intent log of a file system is nothing more than an insurance against power failures, a to-do list if you will, that keeps track of the stuff that needs to be updated on disk, even if the power fails (or something else happens that prevents the system from updating its disks).

To get better performance, use separate disks (SSDs) for the ZIL, for example zpool add pool log c2d0.

Here is a real example of ZIL/L2ARC/ARC on a Sun ZFS 7320 head:

test-zfs# zpool iostat -v exalogic
capacity operations bandwidth
pool alloc free read write read write
------------------------- ----- ----- ----- ----- ----- -----
exalogic 6.78T 17.7T 53 1.56K 991K 25.1M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A5FDCACd0 - - 3 36 57.6K 2.07M #these are the physical disks
c0t5000CCA01A6F5CF4d0 - - 2 35 57.7K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A6F5D00d0 - - 2 36 56.2K 2.07M
c0t5000CCA01A6F64F4d0 - - 2 35 57.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A76A7B8d0 - - 2 36 56.3K 2.07M
c0t5000CCA01A746CCCd0 - - 2 36 56.8K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A749A88d0 - - 2 35 56.7K 2.07M
c0t5000CCA01A759E90d0 - - 2 35 56.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A767FDCd0 - - 2 35 56.1K 2.07M
c0t5000CCA01A782A40d0 - - 2 35 57.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A782D10d0 - - 2 35 57.2K 2.07M
c0t5000CCA01A7465F8d0 - - 2 35 56.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7597FCd0 - - 2 35 57.6K 2.07M
c0t5000CCA01A7828F4d0 - - 2 35 56.2K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7829ACd0 - - 2 35 57.1K 2.07M
c0t5000CCA01A78278Cd0 - - 2 35 57.4K 2.07M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A736000d0 - - 3 35 57.3K 2.07M
c0t5000CCA01A738000d0 - - 2 35 57.3K 2.07M
c0t5000A72030061B82d0 224M 67.8G 0 98 1 1.62M #ZIL(SSD write cache, ZFS Intent Log)
c0t5000A72030061C70d0 224M 67.8G 0 98 1 1.62M
c0t5000A72030062135d0 223M 67.8G 0 98 1 1.62M
c0t5000A72030062146d0 224M 67.8G 0 98 1 1.62M
cache - - - - - -
c2t2d0 334G 143G 15 6 217K 652K #L2ARC(SSD cache drives)
c2t3d0 332G 145G 15 6 215K 649K
c2t4d0 333G 144G 11 6 169K 651K
c2t5d0 333G 144G 13 6 192K 650K
c2t2d0 - - 0 0 0 0
c2t3d0 - - 0 0 0 0
c2t4d0 - - 0 0 0 0
c2t5d0 - - 0 0 0 0

And as for ARC:

test-zfs:> status memory show
Memory:
Cache 63.4G bytes #ARC
Unused 17.3G bytes
Mgmt 561M bytes
Other 491M bytes
Kernel 14.3G bytes

Categories: Kernel, NAS, SAN, Storage Tags:

resolved – bnx2i dev eth0 does not support iscsi

September 19th, 2012

A weird incident occurred on a Linux box. The box stopped responding to ping and ssh, although ifconfig and the /proc/net/bonding/bond0 file both reported that the network was running fine. After some googling, I found that the issue might be related to the NIC driver. I tried bringing the NICs down and up one by one, but got these errors:

Bringing up loopback interface bond0: bnx2i: dev eth0 does not support iscsi

bnx2i: iSCSI not supported, dev=eth0

bonding: no command found in slaves file for bond bond0. Use +ifname or -ifname

Finally, I restarted the whole network with /etc/init.d/network restart. That did the trick: networking came back and the box could be reached by ping and ssh without problems.

resolved – semget failed with status 28 failed oracle database starting up

August 2nd, 2012

Today we hit a semaphore problem that prevented Oracle instances from starting. Here’s the error message:

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates

It turned out that the maximum number of semaphore arrays had almost been reached (127 used out of a limit of 128):
#check limits of all IPC
root@doxer# ipcs -al

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 1024000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

#check summary of semaphores
root@doxer# ipcs -su

------ Semaphore Status --------
used arrays = 127
allocated semaphores = 16890

To resolve this, we need to increase the maximum number of semaphore arrays (SEMMNI, the fourth field of kernel.sem):

root@doxer# cat /proc/sys/kernel/sem
250 1024000 100 128
                 ^---needs to be increased
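As a rough sketch of how the new value could be built and applied: only the fourth field has to change, the other three can stay as they are. The helper name new_sem_value and the target of 256 are assumptions made for illustration, so pick a limit that fits your own workload.

```shell
#!/bin/sh
# new_sem_value is a hypothetical helper, not part of any tool.
# It keeps SEMMSL, SEMMNS and SEMOPM and replaces SEMMNI (the 4th field).
new_sem_value() {
    # $1 = current contents of /proc/sys/kernel/sem, $2 = new SEMMNI
    printf '%s\n' "$1" | awk -v n="$2" '{ print $1, $2, $3, n }'
}

# Example with the values shown above; 256 is an assumed target.
new_sem_value "250 1024000 100 128" 256    # prints: 250 1024000 100 256

# To apply on a live system (as root), something like:
#   sysctl -w kernel.sem="$(new_sem_value "$(cat /proc/sys/kernel/sem)" 256)"
# To survive a reboot, put the same "kernel.sem = ..." line in /etc/sysctl.conf.
```

After the change, ipcs -su and ipcs -l can be used again to confirm that the number of used arrays is comfortably below the new limit.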

PS:

Here’s an example with toilets that describes differences between mutex and semaphore LOL http://koti.mbnet.fi/niclasw/MutexSemaphore.html

Categories: Kernel, Oracle DB Tags:

Resolved – bash /usr/bin/find Arg list too long

July 3rd, 2012 No comments

Have you ever hit an error like the following?

root@doxer# find /PRD/*/connectors/A01/QP*/*/logFiles/* -prune -name "*.log" -mtime +7 -type f |wc -l

bash: /usr/bin/find: Arg list too long

0

The cause is a kernel limit on the total size of the arguments that can be passed to a new process such as find (or ls and other utilities): the shell expands the wildcards into one huge argument list before find even starts. ARG_MAX defines this maximum length. You can get its value with:

root@doxer# getconf ARG_MAX
1048320

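As a quick fix, you can cd into the directory first (replace * with subdir_NAME) and run find from there. Note that with "find ." the -prune has to be guarded by "! -name .", otherwise the starting directory itself is pruned and nothing is printed:

cd /PRD/subdir_NAME/connectors/A01/QP*/*/logFiles/;find . ! -name . -prune -name "*.log" -mtime +7 -type f |wc -l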

11382
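Another way around the limit, sketched here under some assumptions, is to let the shell walk the directories in a loop and call find once per directory. A shell function call does not go through exec, so the expanded list never hits ARG_MAX. The function name count_old_logs is made up for this example, and unlike the -prune variant it also counts files in subdirectories of each logFiles directory.

```shell
#!/bin/sh
# count_old_logs is a hypothetical helper for illustration.
# Usage: count_old_logs DAYS DIR...
# The glob is expanded inside the shell and each find gets one directory,
# so no single exec() receives the whole argument list.
count_old_logs() {
    days=$1
    shift
    total=0
    for dir in "$@"; do
        [ -d "$dir" ] || continue      # skip unmatched globs and non-directories
        n=$(find "$dir" -type f -name '*.log' -mtime +"$days" | wc -l)
        total=$((total + n))
    done
    echo "$total"
}

# Example call, using the layout from this post:
#   count_old_logs 7 /PRD/*/connectors/A01/QP*/*/logFiles
```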

PS:

  1. you can get all configuration values with getconf -a.
  2. For more solutions about the error “bash: /usr/bin/find: Arg list too long”, you can refer to http://www.in-ulm.de/~mascheck/various/argmax/
Categories: Kernel, Linux Tags:

ORA-00600 internal error caused by /tmp swap full

June 22nd, 2012 No comments

Today we encountered a problem where Oracle stopped functioning. After some checking, we found that the error was caused by /tmp running out of space. This was also confirmed by the OS logs:

Jun 20 17:43:59 tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

Oracle uses /tmp to compile PL/SQL code, so if there is no space left it is unable to compile or execute it. This causes functions, procedures, packages and triggers to time out. The same is described in Oracle note ID 1389623.1.

So in order to prevent further occurrences of this error, we should increase /tmp on the system to at least 4 GB.

There is an Oracle parameter that changes the default location of these temporary files (_ncomp_shared_objects_dir), but it’s not a dynamic parameter. There is also a way to resize a tmpfs filesystem online, but it’s somewhat risky. So the safest approach is to bring down the Oracle DB on this host first, then modify /etc/vfstab, and then reboot the whole system. This protects our data against corruption or loss, at the cost of some downtime.
So finally, here are the steps:
Amend the line in /etc/vfstab from:

swap - /tmp tmpfs - yes size=512m

To:

swap - /tmp tmpfs - yes size=4096m

Reboot machine and bring up oracle DB
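To catch this earlier next time, a small check along the following lines could be run from cron. This is only a sketch: the function name check_tmp and the 80% threshold are assumptions, and it expects a df that supports the POSIX -P output format (on Solaris that may mean /usr/xpg4/bin/df rather than /usr/bin/df).

```shell
#!/bin/sh
# check_tmp is a hypothetical helper for illustration.
# It reads `df -P` output on stdin and prints a warning when the
# capacity column (5th field) is at or above the given threshold.
check_tmp() {
    # $1 = threshold in percent
    awk -v limit="$1" 'NR == 2 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0)
            print "WARNING: " $6 " is " pct "% full"
    }'
}

# Usage (80 is an assumed threshold):
#   df -P /tmp | check_tmp 80
```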

solaris kernel bug – ack replied before sync/ack valid outbound packets dropped

April 9th, 2012 No comments

If you intermittently get the error “ldapserver.test.com:389; socket closed.”, a tcpdump of the network traffic may show the following incorrect packet exchange:

testhost1 --> testhost2 (SYN)
testhost1 <-- testhost2 (ACK) - a SYN/ACK packet should have been sent at this point
testhost1 --> testhost2 (RST) - since the client did not receive a SYN/ACK, it resets the TCP connection

This is actually a Solaris kernel bug.
The workaround is to run:
ndd -set /dev/ip ip_ire_arp_interval 999999999
After this, the packet drops went down to about 1 per week per host.

More info about this kernel bug can be found here http://wesunsolve.net/bugid/id/6942436

Categories: Kernel, Unix Tags:

hostname is different between linux and solaris

February 21st, 2012 No comments

1. On Linux, -a is an option of the hostname command:
-a, --alias
Display the alias name of the host (if used).
For example:
[root@linux ~]# hostname -a
linux localhost.localdomain localhost
[root@linux ~]# grep linux /etc/hosts
127.0.0.1 linux.doxer.org linux localhost.localdomain localhost

2. On Solaris:

On Solaris there’s no -a option, which means that if you run hostname -a on a Solaris box, you’re actually setting the hostname to “-a”, which in turn causes many problems, especially with LDAP.
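For scripts that have to run on both systems, one cautious option is to read the name with uname -n, which is specified by POSIX and only ever prints the node name. The wrapper name current_hostname below is just an illustration.

```shell
#!/bin/sh
# current_hostname is a hypothetical wrapper for illustration.
# uname -n only reads the node name, so it cannot rename the host
# by accident, whatever OS the script happens to run on.
current_hostname() {
    uname -n
}

current_hostname
```

This avoids passing any option to hostname at all, so there is nothing for Solaris to misread as a new name.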

Categories: Kernel, Linux, Unix Tags:

Too many cron jobs and crond processes running

February 17th, 2012 No comments

I ran into a problem where a ton of crond processes (spawned for cron jobs) were running on the OS:

root@localhost# ps auxww|grep cron
vare 543 0.0 0.0 141148 5904 ? S 01:43 0:00 crond
root 4085 0.0 0.0 72944 976 ? Ss 2010 1:13 crond
vare 4522 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 5446 0.0 0.0 141148 5904 ? S 02:43 0:00 crond
vare 9202 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 10245 0.0 0.0 141148 5908 ? S 03:43 0:00 crond
vare 13989 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 15487 0.0 0.0 141148 5908 ? S 04:43 0:00 crond
vare 18796 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 20448 0.0 0.0 141148 5908 ? S 05:43 0:00 crond
root 23168 0.0 0.0 6024 596 pts/0 S+ 06:15 0:00 grep cron
vare 23474 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 27183 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 28358 0.0 0.0 141148 5904 ? S 00:43 0:00 crond
vare 32032 0.0 0.0 141148 5904 ? S Feb16 0:00 crond

…..(and more)

Now let’s see what cronjobs are running by user vare:
root@localhost# crontab -u vare -l
# run the VERA Deploy routine
43 * * * * cd /share/scripts > /dev/null 2>&1 ; sleep 5 ; /share/scripts/Application/VARE/Deploy > /dev/null 2>&1

After checking the script /share/scripts/Application/VARE/Deploy, I could see that the job first changes directory to an NFS mount point (i.e. cd /share/scripts) and then runs some checks (i.e. /share/scripts/Application/VARE/Deploy). There was a problem with the NFS mount, so each run hung while changing into the mount point and never exited. As a result, the number of crond processes kept growing.

The fix for this specific problem (specific, because you have to check your own script) is to kill the hung crond processes first, then bounce autofs, and then restart crond.
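To stop hung runs from piling up in the first place, the job could be wrapped in a simple lock so that a new run is skipped while the previous one is still active. This is a minimal sketch: run_once and the lock path are names invented for this example, and the lock directory should live on a local filesystem, not on the NFS mount.

```shell
#!/bin/sh
# run_once is a hypothetical wrapper; the lock path is an assumption.
# mkdir is atomic, so only one instance can hold the lock at a time.
run_once() {
    lock=$1
    shift
    if mkdir "$lock" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lock"
        return "$status"
    fi
    echo "previous run still active, skipping" >&2
    return 75
}

# A wrapper script called from the crontab could then do:
#   run_once /var/tmp/deploy.lock /share/scripts/Application/VARE/Deploy
```

This does not cure the hang itself, and if a run is killed the stale lock directory has to be removed by hand, but it keeps the process count from growing every hour.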

 

Categories: Kernel, Linux, Unix Tags:

using timex to check whether performance degradation caused by OS or VxVM

February 1st, 2012 No comments

By comparing the time the operating system takes to access the disks with the time Volume Manager takes, we can tell whether a performance degradation is caused by the OS or by VxVM. The two should be about the same, since both commands force a read of the disk header information. If one of them is markedly greater, it indicates a problem in that area.

#echo | timex /usr/sbin/format #to avoid prompt for user input. Use time instead of timex for linux
real          13.03

user           0.10

sys            1.49
#timex vxdisk -o alldgs list
real           2.65

user           0.00

sys            0.00

Categories: Kernel, Linux, Unix Tags:

Linux hostname domainname dnsdomainname nisdomainname ypdomainname

December 20th, 2011 No comments

Here’s just an excerpt from online man page of “domainname”:

NAME
hostname – show or set the system’s host name
domainname – show or set the system’s NIS/YP domain name
dnsdomainname – show the system’s DNS domain name
nisdomainname – show or set system’s NIS/YP domain name
ypdomainname – show or set the system’s NIS/YP domain name
hostname will print the name of the system as returned by the gethostname(2) function.

domainname, nisdomainname, ypdomainname will print the name of the system as returned by the getdomainname(2) function. This is also known as the YP/NIS domain name of the system.

dnsdomainname will print the domain part of the FQDN (Fully Qualified Domain Name). The complete FQDN of the system is returned with hostname --fqdn.

Sometimes you may find that you can log on to a client using LDAP authentication, but you cannot sudo to root. In that case, run domainname to check whether it’s set to (none). If it is, set the domain name with the domainname command.
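The check can be scripted roughly like this. The helper name nis_domain_unset and the placeholder domain example.com are assumptions for illustration only.

```shell
#!/bin/sh
# nis_domain_unset is a hypothetical helper: it succeeds when the given
# domainname output means "no NIS/YP domain configured".
nis_domain_unset() {
    case "$1" in
        ''|'(none)') return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage on a real host ("example.com" is a placeholder, run as root):
#   if nis_domain_unset "$(domainname)"; then
#       domainname example.com
#   fi
```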

Categories: Kernel, Linux, Unix Tags:

Extending tmpfs’ed /tmp on Solaris 10(and linux) without reboot

November 3rd, 2011 No comments

Thanks to Eugene.

If you need to extend /tmp that is using tmpfs on Solaris 10 global zone (works with zones too but needs adjustments) and don’t want to undertake a reboot, here’s a tried working solution.

PLEASE BE CAREFUL, ONE ERROR HERE WILL KILL THE LIVE KERNEL!

echo "$(echo $(echo ::fsinfo | mdb -k | grep /tmp | head -1 | awk '{print $1}')::print vfs_t vfs_data \| ::print -ta struct tmount tm_anonmax | mdb -k | awk '{print $1}')/Z 0x20000" | mdb -kw

Note the 0x20000. This number means the new size will be 1GB. It is calculated like this: as an example, 0x10000 in hex is 65536, or 64k. The size is set in pages and each page is 8k, so the resulting allocation size is 64k * 8k = 512m. 0x20000 is 1GB, 0x40000 is 2GB etc.
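The same arithmetic can be done in the shell before touching the kernel. This is just a sketch: tmpfs_pages_hex is a made-up helper name, and the default page size of 8k is the value assumed in this post, so check the real one on your machine first (for example with the pagesize command).

```shell
#!/bin/sh
# tmpfs_pages_hex is a hypothetical helper. It converts a size in MB into
# the page count (in hex) to write, for a given page size in KB.
tmpfs_pages_hex() {
    # $1 = desired size in MB, $2 = page size in KB (8 assumed if omitted)
    printf '0x%x\n' $(( $1 * 1024 / ${2:-8} ))
}

tmpfs_pages_hex 512     # prints 0x10000
tmpfs_pages_hex 1024    # prints 0x20000
tmpfs_pages_hex 2048    # prints 0x40000
```

Computing the value this way makes it harder to mistype the hex number, which matters given how unforgiving mdb -kw is.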

If the server has zones, you will see more than one entry in ::fsinfo, and you need to feed the exact struct address to mdb. This way you can change the /tmp size for individual zones, but this can only be done from the global zone.

Same approach can probably be applied to older Solaris releases but will definitely need adjustments. Oh, and in case you care, on Linux it’s as simple as “mount -o remount,size=1G /tmp” :)

 

Categories: Kernel, Unix Tags:

Error extras/ata_id/ata_id.c:42:23: fatal error: linux/bsg.h: No such file or directory when compile LFS udev-166

September 21st, 2011 1 comment

For “Linux From Scratch – Version 6.8”, in section “6.60. Udev-166”, after configuring udev we need to compile the package with make. At that point I hit this error message:

“extras/ata_id/ata_id.c:42:23: fatal error: linux/bsg.h: No such file or directory”

After checking line 42 of ata_id.c under extras/ata_id in the udev-166 source tree, I could see:

“#include <linux/bsg.h>”

As there was no bsg.h under $LFS/usr/include/linux, I was sure that this error was caused by a missing C header file. Checking with:
root:/sources/udev-166# /lib/libc.so.6
I could see that Glibc was 2.13 and GCC was 4.5.2:


GNU C Library stable release version 2.13, by Roland McGrath et al.
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.5.2.
Compiled on a Linux 2.6.22 system on 2011-09-04.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.

After some searching on Google, I concluded that this was caused by the GCC version. I was not going to rebuild GCC for this, so I fetched the header file and put it at $LFS/usr/include/linux/bsg.h. You can go to http://lxr.free-electrons.com/source/include/linux/bsg.h?v=2.6.25 to download the header file.

After copying, pasting and chmod, I ran make again, and it succeeded.

Categories: Kernel, Linux Tags:

correctable ecc event detected by cpu0/1/3/4

July 20th, 2011 No comments

If you receive these kinds of alerts on Solaris, it means your server has a memory DIMM issue. Please check with hrdconf/madmin (HP-UX) or prtconf (Sun Solaris) to see the error message.

For more information about ECC memory, you can refer to the following article: http://en.wikipedia.org/wiki/ECC_memory

Categories: Kernel, Unix Tags:

Resolved – ld.so.1: httpd: fatal: libaprutil-0.so.0: open failed: No such file or directory

July 9th, 2011 No comments

When I tried to start up IBM HTTP Server on my Solaris box, I got this error message:
ld.so.1: httpd: fatal: libaprutil-0.so.0: open failed: No such file or directory
Killed
This error occurs because the library libaprutil-0.so.0 cannot be found in the current library search path. We should add /apps/IBMHTTPD/Installs/IHS61-01/lib to LD_LIBRARY_PATH so that the runtime linker (ld.so.1) can find libaprutil-0.so.0.
Here’s the resolution:
#export LD_LIBRARY_PATH=/apps/IBMHTTPD/Installs/IHS61-01/lib:$LD_LIBRARY_PATH #set environment variable
#/apps/IBMHTTPD/Installs/IHS61-01/bin/httpd -d /apps/IBMHTTPD/Installs/IHS61-01 -d /apps/IBMHTTPD/Installs/IHS61-01 -f /apps/IBMHTTPD/Installs/IHS61-01/conf/ihs.conf -k restart #restart IHS
#/usr/ucb/ps auxww|grep -i ibmhttpd #check the result

PS:

Here’s more about ld.so: (from book <Optimizing Linux® Performance: A Hands-On Guide to Linux® Performance Tools>)

When a dynamically linked application is executed, the Linux loader, ld.so, runs first. ld.so loads all the application’s libraries and connects symbols that the application uses with the functions the libraries provide. Because different libraries were originally linked at different and possibly overlapping places in memory, the linker needs to sort through all the symbols and make sure that each lives at a different place in memory. When a symbol is moved from one virtual address to another, this is called a relocation. It takes time for the loader to do this, and it is much better if it does not need to be done at all. The prelink application aims to do that by rearranging the system libraries of the entire systems so that they do not overlap. An application with a high number of relocations may not have been prelinked.

The Linux loader usually runs without any intervention from the user, and by just executing a dynamic program, it is run automatically. Although the execution of the loader is hidden from the user, it still takes time to run and can potentially slow down an application’s startup time. When you ask for loader statistics, the loader shows the amount of work it is doing and enables you to figure out whether it is a bottleneck.

The ld command is invisibly run for every Linux application that uses shared libraries. By setting the appropriate environment variables, we can ask it to dump information about its execution. The following invocation influences ld execution:
env LD_DEBUG=statistics,help LD_DEBUG_OUTPUT=filename <command>

panic cpu thread page_unlock is not locked issue when using centos xen to create solaris 10

May 25th, 2011 No comments

Don’t panic.

You can allocate more memory to the Solaris virtual machine (e.g. 1024 MB) and try again.

According to a Sun Forums thread, 609 MB is the lowest you can go. Give it a little more than that if you can.

Categories: Kernel, Linux Tags: