Archive

Author Archive

general networking tips

April 18th, 2014 No comments

 Q: What’s the difference between bandwidth and speed?

A: Bandwidth is a capacity; speed is a rate. Bandwidth tells you the maximum amount of data that your network can transmit. Speed tells you the rate at which the data actually travels. The bandwidth rating for a CAT-5 cable is 10/100 Base-T (up to 100 Mbps); the actual speed over a CAT-5 cable changes depending on conditions.

Q: What is Base-T?
A: Base-T refers to the different standards for Ethernet transmission rates. The 10 Base-T standard transfers data at 10 megabits per second (Mbps). The 100 Base-T standard transfers data at 100 Mbps. The 1000 Base-T standard transfers data at a massive 1000 Mbps.

Q: What is a crossover cable used for?
A: Suppose you want to connect a laptop to a desktop computer. One way of doing this is to use a switch or a hub to connect the two devices; another way is to use a crossover cable. A crossover cable crosses the transmit and receive wire pairs, so two similar devices (such as two computers) can send and receive to each other directly. A straight-through cable keeps the pairs in the same positions on both ends and is used to connect a device to a switch or hub rather than directly to another device of the same kind.

Q: What’s the difference between megabits per second (Mbps) and megabytes per second (MBps)?
A: Megabits per second (Mbps) is a bandwidth rate used in the telecommunications and computer networking fields. One megabit equals one million bits (binary digits). Megabytes per second (MBps) is a data transfer rate used in computing. One megabyte equals 1,048,576 bytes, and one byte equals 8 bits.
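As a quick, hedged sanity check (assuming bc is installed, and ignoring protocol overhead and the 1,000,000-vs-1,048,576 distinction), dividing a rate in Mbps by 8 approximates the rate in MBps:

echo "scale=2; 100/8" | bc #a 100 Mbps link carries at most roughly 12.50 megabytes per second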
The order of the wires in an RJ-45 connector conforms to one of two standards. These standards are 568A and 568B.

 

[Wiring diagrams: 568B and 568A RJ-45 pin-outs]

 

 

 

PS:

Some content of this article is from the book <Head First Networking>.

SELinux security context and its elements – user, role, type identifiers

April 17th, 2014 No comments

All operating system access control is based on some type of access control attribute associated with objects and subjects. In SELinux, the access control attribute is called a security context. All objects (files, interprocess communication channels, sockets, network hosts, and so on) and subjects (processes) have a single security context associated with them. A security context has three elements: user, role, and type identifiers. The usual format for specifying or displaying a security context is as follows:

user:role:type

A valid security context must have one valid user, role, and type identifier. The identifiers are defined by the policy writer, and the string identifiers for each element are defined in the SELinux policy language.
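As a quick illustration (a minimal sketch; the exact identifiers you see depend on your policy and distribution), you can display the security contexts of your login session, running processes, and files like this:

id -Z #security context of the current user/login
ps -eZ | grep httpd #contexts of the running httpd processes
ls -Z /var/www/html #contexts of the files served by httpd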

Here’s the relationship between unix/linux users -> SELinux identifiers -> roles -> domain:

selinux-context

And here’s one example of SELinux transitions:

selinux-transitions

And here are the commands you can run for the Apache httpd server when SELinux runs in enforcing mode:

chcon -t httpd_sys_content_t /var/www/html #set the type identifier on the directory itself
chcon -t httpd_sys_content_t /var/www/html -R #-R applies the change recursively to its contents
ls -Z /var/www/html #verify the resulting security contexts

PS:
Some content of this article is from the book <SELinux by Example: Using Security Enhanced Linux>.

Categories: Linux, Security Tags:

resolved – show kitchen sink buttons when wordpress goes to fullscreen mode

April 11th, 2014 No comments

When you click the full-screen button in the WordPress TinyMCE editor, WordPress switches to "Distraction-Free Writing mode", which is as helpful as the name suggests. However, you'll also find that the TinyMCE toolbar shows only a limited number of buttons, and the second row of the toolbar (the kitchen sink) does not show at all (I tried installing plugins such as Ultimate TinyMCE and Advanced TinyMCE, but the issue remained):

full-screen

Previously, you could type ALT+SHIFT+G to go to another type of fullscreen mode, which has all buttons, including the kitchen sink ones. However, it seems the updated version of WordPress has disabled this feature.

To resolve this issue, we can insert the following code in functions.php of your theme:

function my_mce_fullscreen($buttons) {
    $buttons[] = 'fullscreen';
    return $buttons;
}
add_filter('mce_buttons', 'my_mce_fullscreen');

Afterwards, TinyMCE will have two full-screen buttons:

full-screen buttons

Make sure to click the SECOND full-screen button. When you do so, the editor will transform to the following appearance:

full-screen with kitchen sink

I assume this is what you're trying for, right?

 

 

Categories: Life Tags:

add horizontal line button in wordpress

April 11th, 2014 No comments

There are three ways to add a horizontal line button in WordPress:

First, switch to "Text" mode and enter <hr />.

Second, add the following to functions.php of your WordPress theme:

function enable_more_buttons($buttons) {
    $buttons[] = 'hr';
    return $buttons;
}
add_filter("mce_buttons", "enable_more_buttons");

horizontal line

Third, you can install the plugin "Ultimate TinyMCE"; in its settings you can enable the horizontal line button with one click! This is my recommendation.

ultimate tinymce

Categories: Life Tags: ,

linux tips

April 10th, 2014 No comments
Linux Performance & Troubleshooting
For Linux Performance & Troubleshooting, please refer to another post - Linux tips – Performance and Troubleshooting
Linux system tips
ls -lu (access time, e.g. after cat file), ls -lt (modification time, e.g. after vi; ls -l defaults to this), ls -lc (change time, e.g. after chmod); stat ./aa.txt shows all three timestamps (in UTC)
ctrl +z #bg and stopped
%1 & #bg and running
%1 #fg
pgrep -flu oracle  # processes owned by the user oracle
watch free -m #refresh every 2 seconds
pmap -x 30420 #memory mapping.
openssl s_client -connect localhost:636 -showcerts #verify ssl certificates, or 443
openssl x509 -in cacert.pem -noout -text
openssl x509 -in cacert.pem -noout -dates
openssl x509 -in cacert.pem -noout -purpose
openssl req -in robots.req.pem -text -verify -noout
blockdev --getbsz /dev/xvda1 #get blocksize of FS
dumpe2fs /dev/xvda1 |grep 'Block size'
Strings
ovm svr ls|sort -rn -k 4 #sort by column 4
cat a1|sort|uniq -c |sort #SUS
ovm svr ls|uniq -f3 #skip the first three columns, this will list only 1 server per pool
for i in <all OVMs>;do (test.sh $i &);done #instead of using nohup &
ovm vm ls|egrep "`echo testhost{0\|,1\|,2\|,3\|,4}|tr -d '[:space:]'`"
cat a|awk '{print $5}'|tr '\n' ' '
getopt #getopts is builtin, more on http://wuliangxx.iteye.com/blog/750940
date -d '1970-1-1 1276059000 sec utc'
date -d '2010-09-11 23:20' +%s
find . -name ‘*txt’|xargs tar cvvf a.tar
find . -maxdepth 1
for i in `find /usr/sbin/ -type f ! -perm -u+x`;do chmod +x $i;done #files that has no execute permisson for owner
find ./* -prune -print #-prune,do not cascade
find . -fprint file #put result to file
tar tvf a.tar --wildcards "*ipp*" #globbing patterns
tar xvf bfiles.tar --wildcards --no-anchored 'b*'
tar --show-defaults
tar cvf a.tar --totals *.txt #show speed
tar --append --file=collection.tar rock #add rock to collection.tar
tar --update -v -f collection.tar blues folk rock classical #only append new or updated ones, not replace
tar --delete --file=collection.tar blues #not on tapes
tar -c -f archive.tar --mode='a+rw'
tar -C sourcedir -cf - . | tar -C targetdir -xf - #copy directories
tar -c -f jams.tar grape prune -C food cherry #-C changes directory; file cherry is under the food directory
find . -size -400 -print > small-files
tar -c -v -z -T small-files -f little.tgz
tar -cf src.tar --exclude='*.o' src #multiple --exclude can be specified
expr 5 - 1
rpm2cpio ./ash-1.0.1-1.x86_64.rpm |cpio -ivd
eval $cmd
exec menu.viewcards #exec replaces the current shell with the command (unlike ., which runs the script in the current shell)
ls . | xargs -0 -i cp ./{} /etc #-i uses \n as the separator, just like find -exec; -0 handles spaces in filenames. find -print0 separates names with a null character instead of a newline. (-i or -I {} for referencing filenames in the middle of the command)
ls | xargs -t -i mv {} {}.old #mv source should exclude /,or unexpected errors may occur
mv --strip-trailing-slashes source destination
ls |xargs file /dev/fd/0 #replace -
ls -l -I "*out*" #not include out
find . -type d |xargs -i du -sh {} |awk '$1 ~ /G/'
find . -type f -name "*20120606" -exec rm {} \; #do not need rm -rf. find . -type f -exec bash -c "ls -l '{}'" \;
ps -ef|grep init|sed -n '1p'
cut -d ' ' -f1,3 /etc/mtab #first and third
seq 15 21 #print 15 to 21, or echo {15..21}
seq -s" " 15 21 #use space as separator
Categories: Linux, tips Tags:

Linux tips – Performance and Troubleshooting

April 10th, 2014 No comments

System CPU

top
procinfo #yum install procinfo
gnome-system-monitor #can also see network flow rate
mpstat
sar

System Memory

top
free
slabtop
sar
/proc/meminfo #provides the most complete view of system memory usage
procinfo
gnome-system-monitor #can also see network flow rate

Process-specific CPU

time
strace #traces the system calls that a program makes while executing
ltrace #traces the calls(functions) that an application makes to libraries rather than to the kernel. Then use ldd to display which libraries are used, and use objdump to search each of those libraries for the given function.
ps
ld.so #ld

Process-specific Memory

ps
/proc/<pid> #you can refer to http://www.doxer.org/proc-filesystem-day-1/ for more info.

/proc/<PID>/status #provides information about the status of a given process PID
/proc/<PID>/maps #how the process’s virtual address space is used

ipcs #more info on http://www.doxer.org/resolved-semget-failed-with-status-28-failed-oracle-database-starting-up/ and http://www.doxer.org/resolvedload-manager-shared-memory-error-is-28-no-space-left-on-devicefor-apache-pmserver-etc-running-on-linux-solaris-unix/

Disk I/O

vmstat #provides totals rather than the rate of change during the sample
sar
lsof
time sh -c “dd if=/dev/zero of=System2.img bs=1M count=10240 && sync” #10G
time dd if=ddfile of=/dev/null bs=8k
dd if=/dev/zero of=vm1disk bs=1M seek=10240 count=0 #10G

Network

ethtool
ifconfig
ip
iptraf
gkrellm
netstat
gnome-system-monitor #can also see network flow rate
sar #network statistics
/etc/cron.d/sysstat #/var/log/sa/

General Ideas & options & outputs

Run Queue Statistics
In Linux, a process can be either runnable or blocked waiting for an event to complete.

A blocked process may be waiting for data from an I/O device or the results of a system call.

When these processes are runnable, but waiting to use the processor, they form a line called the run queue.
The load on a system is the total number of running and runnable processes.

Context Switches
To create the illusion that a given single processor runs multiple tasks simultaneously, the Linux kernel constantly switches between different processes.
The switch between different processes is called a context switch.
To guarantee that each process receives a fair share of processor time, the kernel periodically interrupts the running process and, if appropriate, the kernel scheduler decides to start another process rather than let the current process continue executing. It is possible that your system will context switch every time this periodic interrupt or timer occurs. (cat /proc/interrupts | grep timer, and do this again after e.g. 10s interval)
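For example, a rough way to compare the timer interrupt rate with the context switch rate (a sketch assuming a 10-second interval; /proc/stat's ctxt line holds the total number of context switches since boot):

grep timer /proc/interrupts #first snapshot of the timer interrupt counters
grep ctxt /proc/stat #total context switches since boot
sleep 10
grep timer /proc/interrupts #second snapshot; subtract the earlier counts and divide by 10
grep ctxt /proc/stat #do the same here to get context switches per second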

Interrupts
In addition, the processor periodically receives an interrupt from hardware devices.
/proc/interrupts can be examined to show which interrupts are firing on which CPUs

CPU Utilization
At any given time, the CPU can be doing one of seven things:
Idle
user #running user (application) code
system #executing code in the Linux kernel on behalf of the application code
nice #executing user code that has been "nice"ed or set to run at a lower priority than normal processes
iowait #waiting for I/O (such as disk or network) to complete
irq #executing high-priority kernel code handling a hardware interrupt
softirq #executing kernel code that was also triggered by an interrupt, but running at a lower priority


Buffers and cache
Alternatively, if your system has much more physical memory than required by your applications, Linux will cache recently used files in physical memory so that subsequent accesses to that file do not require an access to the hard drive. This can greatly speed up applications that access the hard drive frequently, which, obviously, can prove especially useful for frequently launched applications. The first time the application is launched, it needs to be read from the disk; if the application remains in the cache, however, it can be read from the much quicker physical memory instead. This disk cache differs from the processor cache mentioned in the previous chapter. Other than oprofile, valgrind, and kcachegrind, most tools that report statistics about "cache" are actually referring to disk cache.

In addition to cache, Linux also uses extra memory as buffers. To further optimize applications, Linux sets aside memory to use for data that needs to be written to disk. These set-asides are called buffers. If an application has to write something to the disk, which would usually take a long time, Linux lets the application continue immediately but saves the file data into a memory buffer. At some point in the future, the buffer is flushed to disk, but the application can continue immediately.
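To see how much memory is currently devoted to disk cache and buffers (a quick look rather than a precise measurement), you can read the kernel's own counters:

grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo #values in kB
free -m #the same data summarized in MB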
Active Versus Inactive Memory
Active memory is currently being used by a process. Inactive memory is memory that is allocated but has not been used for a while. Nothing is essentially different between the two types of memory. When required, the Linux kernel takes a process’s least recently used memory pages and moves them from the active to the inactive list. When choosing which memory will be swapped to disk, the kernel chooses from the inactive memory list.
Kernel Usage of Memory (Slabs)
In addition to the memory that applications allocate, the Linux kernel consumes a certain amount for bookkeeping purposes. This bookkeeping includes, for example, keeping track of data arriving from network and disk I/O devices, as well as keeping track of which processes are running and which are sleeping. To manage this bookkeeping, the kernel has a series of caches that contains one or more slabs of memory. Each slab consists of a set of one or more objects. The amount of slab memory consumed by the kernel depends on which parts of the Linux kernel are being used, and can change as the type of load on the machine changes.

slabtop

slabtop shows in real-time how the kernel is allocating its various caches and how full they are. Internally, the kernel has a series of caches that are made up of one or more slabs. Each slab consists of a set of one or more objects. These objects can be active (or used) or inactive (unused). slabtop shows you the status of the different slabs. It shows you how full they are and how much memory they are using.


time

time measures three types of time. First, it measures the real or elapsed time, which is the amount of time between when the program started and finished execution. Next, it measures the user time, which is the amount of time that the CPU spent executing application code on behalf of the program. Finally, time measures system time, which is the amount of time the CPU spent executing system or kernel code on behalf of the application.


Disk I/O

When an application does a read or write, the Linux kernel may have a copy of the file stored into its cache or buffers and returns the requested information without ever accessing the disk. If the Linux kernel does not have a copy of the data stored in memory, however, it adds a request to the disk’s I/O queue. If the Linux kernel notices that multiple requests are asking for contiguous locations on the disk, it merges them into a single big request. This merging increases overall disk performance by eliminating the seek time for the second request. When the request has been placed in the disk queue, if the disk is not currently busy, it starts to service the I/O request. If the disk is busy, the request waits in the queue until the drive is available, and then it is serviced.
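To watch queuing and merging in action (an illustrative sketch; the file path is just an example, and you should watch the row for your own disk device), generate some sequential I/O in one terminal and observe the extended iostat statistics in another:

time sh -c "dd if=/dev/zero of=/tmp/testfile bs=1M count=1024 && sync" #terminal 1: write about 1GB
iostat -x -k 1 10 #terminal 2: watch wrqm/s (merges), avgqu-sz (queue size) and %util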

iostat

iostat provides a per-device and per-partition breakdown of how many blocks are written to and from a particular disk. (Blocks in iostat are usually sized at 512 bytes.)

lsof
lsof can prove helpful when narrowing down which applications are generating I/O


 top output

S (or STAT) – This is the current status of a process, where the process is either sleeping (S), running (R), zombied (terminated but not yet reaped by its parent) (Z), in an uninterruptible sleep (D), or being traced (T).

TIME – The total amount of CPU time (user and system) that this process has used since it started executing.

top options

-b Run in batch mode. Typically, top shows only a single screenful of information, and processes that don't fit on the screen are never displayed. This option shows all the processes and can be very useful if you are saving top's output to a file or piping the output to another command for processing (see the batch-mode sketch after this list).

I This toggles whether top will divide the CPU usage by the number of CPUs on the system. For example, if a process was consuming all of both CPUs on a two-CPU system, this toggles whether top displays a CPU usage of 100% or 200%.

1 (numeral 1) This toggles whether the CPU usage will be broken down to the individual usage or shown as a total.
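A small batch-mode sketch (the output path is just an illustration):

top -b -n 3 -d 5 > /tmp/top.log #capture 3 snapshots, 5 seconds apart, suitable for later analysis or piping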

mpstat options

-P { cpu | ALL } This option tells mpstat which CPUs to monitor. cpu is a number between 0 and the total number of CPUs minus 1.

The biggest benefit of mpstat is that it shows the time next to the statistics, so you can look for a correlation between CPU usage and time of day.

mpstat can be used to determine whether the CPUs are fully utilized and relatively balanced. By observing the number of interrupts each CPU is handling, it is possible to find an imbalance.

 sar options

-I {irq | SUM | ALL | XALL} This reports the rates that interrupts have been occurring in the system.
-P {cpu | ALL} This option specifies which CPU the statistics should be gathered from. If this isn’t specified, the system totals are reported.
-q This reports information about the run queues and load averages of the machine.
-u This reports information about CPU utilization of the system. (This is the default output.)
-w This reports the number of context switches that occurred in the system.
-o filename This specifies the name of the binary output file that will store the performance statistics (see the recording example after this list).
-f filename This reads the performance statistics back from the given binary file (one written earlier with -o).

-B – This reports information about the number of blocks that the kernel swapped to and from disk. In addition, for kernel versions after v2.5, it reports information about the number of page faults.
-W – This reports the number of pages of swap that are brought in and out of the system.
-r – This reports information about the memory being used in the system. It includes information about the total free memory, swap, cache, and buffers being used.
-R Report memory statistics

-d –  reports disk activities

-n DEV – Shows statistics about the number of packets and bytes sent and received by each device.
-n EDEV – Shows information about the transmit and receive errors for each device.
-n SOCK – Shows information about the total number of sockets (TCP, UDP, and RAW) in use.
-n ALL – Shows all the network statistics.
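For example (a sketch with an assumed file name), you can record statistics to a binary file and replay them later:

sar -o /tmp/sar.data 5 12 #sample every 5 seconds, 12 times, saving to a binary file
sar -u -f /tmp/sar.data #later: replay CPU utilization from that file
sar -n DEV -f /tmp/sar.data #or replay per-interface network statistics from the same file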

sar output

runq-sz This is the size of the run queue when the sample was taken.
plist-sz This is the number of processes present (running, sleeping, or waiting for I/O) when the sample was taken.
proc/s This is the number of new processes created per second. (This is the same as the forks statistic from vmstat.)

tps – Transfers per second. This is the number of reads and writes to the drive/partition per second.
rd_sec/s – Number of disk sectors read per second.
wr_sec/s – Number of disk sectors written per second.


vmstat options

-n print header info only once

-a This changes the default output of memory statistics to indicate the active/inactive amount of memory rather than information about buffer and cache usage.
-s (procps 3.2 or greater) This prints out the vm table. This is a grab bag of different statistics about the system since it has booted. It cannot be run in sample mode. It contains both memory and CPU statistics.

-d – This option displays individual disk statistics at a rate of one sample per interval. The statistics are the totals since system boot, rather than just those that occurred between this sample and the previous sample.
-p partition – This displays performance statistics about the given partition at a rate of one sample per interval. The statistics are the totals since system boot, rather than just those that occurred between this sample and the previous sample.

vmstat output
si – The rate of memory (in KB/s) that has been swapped in from disk during the last sample.
so – The rate of memory (in KB/s) that has been swapped out to disk during the last sample.
pages paged in – The amount of memory (in pages) read from the disk(s) into the system buffers. (On most IA32 systems, a page is 4KB.)
pages paged out – The amount of memory (in pages) written to the disk(s) from the system cache. (On most IA32 systems, a page is 4KB.)
pages swapped in – The amount of memory (in pages) read from swap into system memory.
pages swapped out – The amount of memory (in pages) written from system memory to the swap.

bo – This indicates the number of total blocks written to disk in the previous interval. (In vmstat, block size for a disk is typically 1,024 bytes.)
bi – This shows the number of blocks read from the disk in the previous interval. (In vmstat, block size for a disk is typically 1,024 bytes.)
wa – This indicates the amount of CPU time spent waiting for I/O to complete.
reads: ms – The amount of time (in ms) spent reading from the disk.
writes: ms – The amount of time (in ms) spent writing to the disk.
IO: cur – The total number of I/O that are currently in progress. Note that there is a bug in recent versions of vmstat in which this is incorrectly divided by 1,000, which almost always yields a 0.
IO: s – This is the number of seconds spent waiting for I/O to complete.

iostat options
-d – This displays only information about disk I/O rather than the default display, which includes information about CPU usage as well.
-k – This shows statistics in kilobytes rather than blocks.
-x – This shows extended-performance I/O statistics.
device – If a device is specified, iostat shows only information about that device.

iostat output
tps – Transfers per second. This is the number of reads and writes to the drive/partition per second.
Blk_read/s – The rate of disk blocks read per second.
Blk_wrtn/s – The rate of disk blocks written per second.
Blk_read – The total number of blocks read during the interval.
Blk_wrtn – The total number of blocks written during the interval.
rrqm/s – The number of read requests merged per second before they were issued to the disk.
wrqm/s – The number of write requests merged per second before they were issued to the disk.
r/s – The number of reads issued to the disk per second.
w/s – The number of writes issued to the disk per second.
rsec/s – Disk sectors read per second.
wsec/s – Disk sectors written per second.
avgrq-sz – The average size (in sectors) of disk requests.
avgqu-sz – The average size of the disk request queue.
await – The average time (in ms) for a request to be completely serviced. This average includes the time that the request was waiting in the disk’s queue plus the amount of time it was serviced by the disk.
svctm – The average service time (in ms) for requests submitted to the disk. This indicates how long on average the disk took to complete a request. Unlike await, it does not include the amount of time spent waiting in the queue.

lsof options
+D directory – This causes lsof to recursively search all the files in the given directory and report on which processes are using them.
+d directory – This causes lsof to report on which processes are using the files in the given directory.

lsof output
FD – The file descriptor of the file, or txt for program text (the executable), mem for a memory-mapped file.
TYPE – The type of file. REG for a regular file.
DEVICE – The device numbers (major, minor) of the device containing the file.
SIZE – The size of the file.
NODE – The inode of the file.


free options

-s delay – This option causes free to print out new memory statistics every delay seconds.


 strace options

strace [-p <pid>] -s 200 <program> #attach to a process; -s 200 raises the maximum string size to print from the default of 32 to 200. Note that filenames are not considered strings and are always printed in full.

-c – This causes strace to print out a summary of statistics rather than an individual list of all the system calls that are made.

ltrace options
-c – This option causes ltrace to print a summary of all the calls after the command has completed.
-S – ltrace traces system calls in addition to library calls, which is identical to the functionality strace provides.
-p pid – This traces the process with the given PID.


ps options
vsz The virtual set size is the amount of virtual memory that the application is using. Because Linux only allocates physical memory when an application tries to use it, this value may be much greater than the amount of physical memory the application is using.
rss The resident set size is the amount of physical memory the application is currently using.
pmem The percentage of the system memory that the process is consuming.
command This is the command name.

/proc/<PID>/status output
VmSize This is the process’s virtual set size, which is the amount of virtual memory that the application is using. Because Linux only allocates physical memory when an application tries to use it, this value may be much greater than the amount of physical memory the application is actually using. This is the same as the vsz parameter provided by ps.
VmLck This is the amount of memory that has been locked by this process. Locked memory cannot be swapped to disk.
VmRSS This is the resident set size or amount of physical memory the application is currently using. This is the same as the rss statistic provided by ps.

ipcs
Because shared memory is used by multiple processes, it cannot be attributed to any particular process. ipcs provides enough information about the state of the system-wide shared memory to determine which processes allocated the shared memory, which processes are using it, and how often they are using it. This information proves useful when trying to reduce shared memory usage.

ipcs options

lsof -u oracle | grep <shmid> #shmid is from output of ipcs -m. lists the processes under the oracle user attached to the shared memory segment

-t – This shows the time when the shared memory was created, when a process last attached to it, and when a process last detached from it.
-u – This provides a summary about how much shared memory is being used and whether it has been swapped or is in memory.
-l – This shows the system-wide limits for shared memory usage.
-p – This shows the PIDs of the processes that created and last used the shared memory segments.
-c – creator


ifconfig output #more on http://www.thegeekscope.com/linux-ifconfig-command-output-explained/

Errors – Frames with errors (possibly because of a bad network cable or duplex mismatch).
Dropped – Frames that were discarded (most likely because of low amounts of memory or buffers).
Overruns – Frames that may have been discarded by the network card because the kernel or network card was overwhelmed with frames. This should not normally happen.
Frame – These frames were dropped as a result of problems on the physical level. This could be the result of cyclic redundancy check (CRC) errors or other low-level problems.
Compressed – Some lower-level interfaces, such as Point-to-Point Protocol (PPP) or Serial Line Internet Protocol (SLIP) devices compress frames before they are sent over the network. This value indicates the number of these compressed frames. (Compressed packets are usually present during SLIP or PPP connections)

carrier – The number of packets discarded because of link media failure (such as a faulty cable)

ip options
-s [-s] link – If the extra -s is provided to ip, it provides a more detailed list of low-level Ethernet statistics.

iptraf options
-d interface – Detailed statistics for an interface including receive, transmit, and error rates
-s interface – Statistics about which IP ports are being used on an interface and how many bytes are flowing through them
-t <minutes> – Number of minutes that iptraf runs before exiting
-z interface – shows packet counts by size on the specified interface

netstat options
-p – Displays the PID/program name responsible for opening each of the displayed sockets
-c – Continually updates the display of information every second
--interfaces=<name> – Displays network statistics for the given interface
--statistics|-s – IP/UDP/ICMP/TCP statistics
--tcp|-t – Shows only information about TCP sockets
--udp|-u – Shows only information about UDP sockets.
--raw|-w – Shows only information about RAW sockets (IP and ICMP)
--listening|-l – Show only listening sockets. (These are omitted by default.)
--all|-a – Show both listening and non-listening (for TCP this means established connections) sockets. With the --interfaces option, show interfaces that are not up
--numeric|-n – Show numerical addresses instead of trying to determine symbolic host, port or user names.
--extend|-e – Display additional information. Use this option twice for maximum detail. (A combined usage example follows this list.)
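A common combination of these options (shown only as an illustration) lists all listening TCP and UDP sockets numerically, together with the owning process:

netstat -tulnp #-t TCP, -u UDP, -l listening only, -n numeric addresses/ports, -p owning PID/program name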

netstat output

Active Internet connections (w/o servers)
Proto - The protocol (tcp, udp, raw) used by the socket.
Recv-Q - The count of bytes not copied by the user program connected to this socket.
Send-Q - The count of bytes not acknowledged by the remote host.
Local Address - Address and port number of the local end of the socket. Unless the --numeric (-n) option is specified, the socket address is resolved to its canonical host name (FQDN), and the port number is translated into the corresponding service name.
Foreign Address - Address and port number of the remote end of the socket. Analogous to "Local Address."
State - The state of the socket. Since there are no states in raw mode and usually no states used in UDP, this column may be left blank. Normally this can be one of several values: #more on http://www.doxer.org/tcp-flags-explanation-in-details-syn-ack-fin-rst-urg-psh-and-iptables-for-sync-flood/
    ESTABLISHED
        The socket has an established connection.
    SYN_SENT
        The socket is actively attempting to establish a connection.
    SYN_RECV
        A connection request has been received from the network.
    FIN_WAIT1
        The socket is closed, and the connection is shutting down.
    FIN_WAIT2
        Connection is closed, and the socket is waiting for a shutdown from the remote end.
    TIME_WAIT
        The socket is waiting after close to handle packets still in the network.
    CLOSED
        The socket is not being used.
    CLOSE_WAIT
        The remote end has shut down, waiting for the socket to close.
    LAST_ACK
        The remote end has shut down, and the socket is closed. Waiting for acknowledgement.
    LISTEN
        The socket is listening for incoming connections. Such sockets are not included in the output unless you specify the --listening (-l) or --all (-a) option.
    CLOSING
        Both sockets are shut down but we still don't have all our data sent.
    UNKNOWN
        The state of the socket is unknown.
User - The username or the user id (UID) of the owner of the socket.
PID/Program name - Slash-separated pair of the process id (PID) and process name of the process that owns the socket. --program causes this column to be included. You will also need superuser privileges to see this information on sockets you don't own. This identification information is not yet available for IPX sockets.

Example

[ezolt@scrffy ~/edid]$ vmstat 1 | tee /tmp/output
procs -----------memory---------- ---swap-- -----io----  --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo    in    cs  us sy id wa
0  1 201060  35832  26532 324112    0    0     3     2     6     2  5  1  94  0
0  0 201060  35888  26532 324112    0    0    16     0  1138   358  0  0  99  0
0  0 201060  35888  26540 324104    0    0     0    88  1163   371  0  0 100  0

The number of context switches looks good compared to the number of interrupts. The scheduler is switching processes less than the number of timer interrupts that are firing. This is most likely because the system is nearly idle, and most of the time when the timer interrupt fires, the scheduler does not have any work to do, so it does not switch from the idle process.

[ezolt@scrffy manuscript]$ sar -w -c -q 1 2
Linux 2.6.8-1.521smp (scrffy)   10/20/2004

08:23:29 PM    proc/s
08:23:30 PM      0.00

08:23:29 PM   cswch/s
08:23:30 PM    594.00

08:23:29 PM   runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
08:23:30 PM         0       163      1.12       1.17      1.17

08:23:30 PM    proc/s
08:23:31 PM      0.00

08:23:30 PM   cswch/s
08:23:31 PM    812.87

08:23:30 PM   runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
08:23:31 PM         0       163      1.12       1.17      1.17

Average:       proc/s
Average:         0.00

Average:      cswch/s
Average:       703.98

Average:      runq-sz  plist-sz   ldavg-1    ldavg-5  ldavg-15
Average:            0       163      1.12       1.17      1.17

In this case, we ask sar to show us the total number of context switches and process creations that occur every second. We also ask sar for information about the load average. We can see in this example that this machine has 163 processes that are in memory but not running. For the past minute, on average 1.12 processes have been ready to run.

bash-2.05b$ vmstat -a
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in    cs us sy id wa
 2  1 514004   5640 79816 1341208   33   31   204   247 1111  1548  8  5 73 14

The amount of inactive pages indicates how much of the memory could be swapped to disk and how much is currently being used. In this case, we can see that 1310MB of memory is active, and only 78MB is considered inactive. This machine has a large amount of memory, and much of it is being actively used.


bash-2.05b$ vmstat -s

      1552528  total memory
      1546692  used memory
      1410448  active memory
        11100  inactive memory
         5836  free memory
         2676  buffer memory
       645864  swap cache
      2097096  total swap
       526280  used swap
      1570816  free swap
     20293225 non-nice user cpu ticks
     18284715 nice user cpu ticks
     17687435 system cpu ticks
    357314699 idle cpu ticks
     67673539 IO-wait cpu ticks
       352225 IRQ cpu ticks
      4872449 softirq cpu ticks
    495248623 pages paged in
    600129070 pages paged out
     19877382 pages swapped in
     18874460 pages swapped out
   2702803833 interrupts
   3763550322 CPU context switches
   1094067854 boot time
     20158151 forks

It can be helpful to know the system totals when trying to figure out what percentage of the swap and memory is currently being used. Another interesting statistic is the pages paged in, which indicates the total number of pages that were read from the disk. This statistic includes the pages that are read starting an application and those that the application itself may be using.


[ezolt@wintermute tmp]$ ps -o etime,time,pcpu,cmd 10882
    ELAPSED     TIME %CPU CMD
      00:06 00:00:05 88.0 ./burn

This example shows a test application that is consuming 88 percent of the CPU and has been running for 6 seconds, but has only consumed 5 seconds of CPU time.


[ezolt@wintermute tmp]$ ps -o vsz,rss,tsiz,dsiz,majflt,minflt,cmd 10882
VSZ RSS TSIZ DSIZ MAJFLT MINFLT CMD
11124 10004 1 11122 66 2465 ./burn

The burn application has a very small text size (1KB), but a very large data size (11,122KB). Of the total virtual size (11,124KB), the process has a slightly smaller resident set size (10,004KB), which represents the total amount of physical memory that the process is actually using. In addition, most of the faults generated by burn were minor faults, so most of the memory faults were due to memory allocation rather than loading in a large amount of text or data from the program image on the disk.


[ezolt@wintermute tmp]$ cat /proc/4540/status
Name: burn
State: T (stopped)
Tgid: 4540
Pid: 4540
PPid: 1514
TracerPid: 0
Uid: 501 501 501 501
Gid: 501 501 501 501
FDSize: 256
Groups: 501 9 502
VmSize: 11124 kB
VmLck: 0 kB
VmRSS: 10004 kB
VmData: 9776 kB
VmStk: 8 kB
VmExe: 4 kB
VmLib: 1312 kB
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

The VmLck size of 0KB means that the process has not locked any pages into memory, making them unswappable. The VmRSS size of 10,004KB means that the application is currently using 10,004KB of physical memory, although it has allocated or mapped the full VmSize of 11,124KB. If the application begins to use the memory that it has allocated but is not currently using, the VmRSS size increases but the VmSize remains unchanged.

[ezolt@wintermute test_app]$ cat /proc/4540/maps
08048000-08049000 r-xp 00000000 21:03 393730 /tmp/burn
08049000-0804a000 rw-p 00000000 21:03 393730 /tmp/burn
0804a000-089d3000 rwxp 00000000 00:00 0
40000000-40015000 r-xp 00000000 21:03 1147263 /lib/ld-2.3.2.so
40015000-40016000 rw-p 00015000 21:03 1147263 /lib/ld-2.3.2.so
4002e000-4002f000 rw-p 00000000 00:00 0
4002f000-40162000 r-xp 00000000 21:03 2031811 /lib/tls/libc-2.3.2.so
40162000-40166000 rw-p 00132000 21:03 2031811 /lib/tls/libc-2.3.2.so
40166000-40168000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0

The burn application is using two libraries: ld and libc. The text section (denoted by the permission r-xp) of libc has a range of 0x4002f000 through 0x40162000, or a size of 0x133000 (1,257,472 bytes).
The data section (denoted by permission rw-p) of libc has a range of 0x40162000 through 0x40166000, or a size of 0x4000 (16,384 bytes). The text size of libc is bigger than ld's text size of 0x15000 (86,016 bytes). The data size of libc is also bigger than ld's data size of 0x1000 (4,096 bytes). libc is the big library that burn is linking in.


[ezolt@wintermute tmp]$ ipcs -u

------ Shared Memory Status --------
segments allocated 21
pages allocated 1585
pages resident 720
pages swapped 412
Swap performance: 0 attempts 0 successes

------ Semaphore Status --------
used arrays = 0
allocated semaphores = 0

------ Messages: Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

In this case, we can see that 21 different segments or pieces of shared memory have been allocated. All these segments consume a total of 1,585 pages of memory; 720 of these exist in physical memory and 412 have been swapped to disk.

[ezolt@wintermute tmp]$ ipcs

------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 root 777 49152 1
0x00000000 32769 root 777 16384 1
0x00000000 65538 ezolt 600 393216 2 dest

we ask ipcs for a general overview of all the shared memory segments in the system. This indicates who is using each memory segment. In this case, we see a list of all the shared segments. For one in particular, the one with a shared memory ID of 65538, the user (ezolt) is the owner. It has a permission of 600 (a typical UNIX permission), which in this case means that only ezolt can read and write to it. It has 393,216 bytes, and 2 processes are attached to it.

[ezolt@wintermute tmp]$ ipcs -p

------ Shared Memory Creator/Last-op --------
shmid owner cpid lpid
0 root 1224 11954
32769 root 1224 11954
65538 ezolt 1229 11954

Finally, we can figure out exactly which processes created the shared memory segments and which other processes are using them. For the segment with shmid 65538, we can see that PID 1229 created it and 11954 was the last to use it.


[ezolt@wintermute procps-3.2.0]$ ./vmstat 1 3

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 1 0 197020 81804 29920 0 0 236 25 1017 67 1 1 93 4
1 1 0 172252 106252 29952 0 0 24448 0 1200 395 1 36 0 63
0 0 0 231068 50004 27924 0 0 19712 80 1179 345 1 34 15 49

During one of the samples, the system read 24,448 disk blocks. As mentioned previously, the block size for a disk is 1,024 bytes(or 4,096 bytes), so this means that the system is reading in data at about 23MB per second. We can also see that during this sample, the CPU was spending a significant portion of time waiting for I/O to complete. The CPU waits on I/O 63 percent of the time during the sample in which the disk was reading at ~23MB per second, and it waits on I/O 49 percent for the next sample, in which the disk was reading at ~19MB per second.

[ezolt@wintermute procps-3.2.0]$ ./vmstat -D
3 disks
5 partitions
53256 total reads
641233 merged reads
4787741 read sectors
343552 milli reading
14479 writes
17556 merged writes
257208 written sectors
7237771 milli writing
0 inprogress IO
342 milli spent IO

In this example, a large number of the reads issued to the system were merged before they were issued to the device. Although there were ~640,000 merged reads, only ~53,000 read commands were actually issued to the drives. The output also tells us that a total of 4,787,741 sectors have been read from the disk, and that since system boot, 343,552ms (or 344 seconds) were spent reading from the disk. The same statistics are available for write performance.

[ezolt@wintermute procps-3.2.0]$ ./vmstat -p hde3 1 3
hde3 reads read sectors writes requested writes
18999 191986 24701 197608
19059 192466 24795 198360
- 19161 193282 24795 198360

Shows that 60 (19,059 – 18,999) reads and 94 writes (24,795 – 24,701) have been issued to partition hde3. This view can prove particularly useful if you are trying to determine which partition of a disk is seeing the most usage.


 

[ezolt@localhost sysstat-5.0.2]$ ./iostat -x -dk 1 5 /dev/hda2
Linux 2.4.22-1.2188.nptl (localhost.localdomain) 05/01/2004
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 11.22 44.40 3.15 4.20 115.00 388.97 57.50 194.49
68.52 1.75 237.17 11.47 8.43

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1548.00 0.00 100.00 0.00 13240.00 0.00 6620.00
132.40 55.13 538.60 10.00 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1365.00 0.00 131.00 0.00 11672.00 0.00 5836.00
89.10 53.86 422.44 7.63 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 1483.00 0.00 84.00 0.00 12688.00 0.00 6344.00
151.0 39.69 399.52 11.90 100.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
avgrq-sz avgqu-sz await svctm %util
hda2 0.00 2067.00 0.00 123.00 0.00 17664.00 0.00 8832.00
143.61 58.59 508.54 8.13 100.00

you can see that once the writes start, the average queue size is pretty high (~40 to 59 requests) and, as a result, the amount of time that a request must wait (~400ms to 539ms) is much greater than the amount of time it takes to service the request (7.63ms to 11.90ms). These high wait times, along with the fact that the utilization is 100 percent, show that the disk is completely saturated.


[ezolt@wintermute sysstat-5.0.2]$ sar -n SOCK 1 2

Linux 2.4.22-1.2174.nptlsmp (wintermute.phil.org) 06/07/04
21:32:26 totsck tcpsck udpsck rawsck ip-frag
21:32:27 373 118 8 0 0
21:32:28 373 118 8 0 0
Average: 373 118 8 0 0

We can see the total number of open sockets and the TCP, RAW, and UDP sockets. sar also displays the number of fragmented IP packets.


perl tips

April 2nd, 2014 No comments
#!/usr/bin/perl
use strict;
use warnings;

##arrays
my @animals = ("dog", "pig", "cat");
print "The last element of array \@animals is : ".$animals[$#animals]."\n";
if (@animals > 2) {
    print "more than 2 animals found\n";
}
else {
    print "less than 2 animals found\n";
}
foreach (@animals) {
    print $_."\n";
}

##hashes
my %fruit_color = ("apple", "red", "banana", "yellow");
print "Color of banana is : ".$fruit_color{"banana"}."\n";

for my $char (keys %fruit_color)
{
    print("$char => $fruit_color{$char}\n");
}

##references
my $variables = {
    scalar  =>  {
        description => "single item",
        sigil => '$',
    },
    array   =>  {
        description => "ordered list of items",
        sigil => '@',
    },
    hash    =>  {
        description => "key/value pairs",
        sigil => '%',
    },
};
print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

##Files and I/O
##regular expressions
open (my $passwd, "<", "/etc/passwd2") or die ("can not open");
while (<$passwd>) {
    print $_ if $_ =~ "test";
}
close $passwd or die "$passwd: $!";
my $next = "doing a first";
$next =~ s/first/second/;
print $next."\n";

my $email = "testaccount\@doxer.org";
if ($email =~ /([^@]+)@(.+)/) {
    print "Username is : $1\n";
    print "Hostname is : $2\n";
}

##subroutines
sub multiply {
    my ($num1, $num2) = @_;
    my $result = $num1 * $num2;
    return $result;
}

my $result2 = multiply(3, 5);
print "3 * 5 = $result2\n";

! system('date') or die("failed it"); #system returns 0 when the command succeeds
Categories: Perl, Programming, tips Tags:

resolved – /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

April 1st, 2014 No comments

When I ran a perl command today, I met the problem below:

[root@test01 bin]# /usr/local/bin/perl5.8
-bash: /usr/local/bin/perl5.8: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

Now let’s check which package /lib/ld-linux.so.2 belongs to on a good linux box:

[root@test02 ~]# rpm -qf /lib/ld-linux.so.2
glibc-2.5-118.el5_10.2

So here’s the resolution to the issue:

[root@test01 bin]# yum install -y glibc.x86_64 glibc.i686 glibc-devel.i686 glibc-devel.x86_64 glibc-headers.x86_64

Categories: Kernel, Linux, Systems Tags:

resolved – sudo: sorry, you must have a tty to run sudo

April 1st, 2014 2 comments

The error message below will sometimes occur when you run sudo <command>:

sudo: sorry, you must have a tty to run sudo

To resolve this, you may comment out "Defaults requiretty" in /etc/sudoers (edit it by running visudo). Here is more info about this method: http://www.cyberciti.biz/faq/linux-unix-bsd-sudo-sorry-you-must-haveattytorun/

However, sometimes it’s not convenient or even not possible to modify /etc/sudoers, then you can consider the following:

echo -e "<password>\n"|sudo -S <sudo command>

For the -S parameter of sudo, you may refer to the sudo man page:

-S: The -S (stdin) option causes sudo to read the password from the standard input instead of the terminal device. The password must be followed by a newline character.

So here -S bypasses the tty (terminal device) and reads the password from the standard input. And by this, we can now pipe the password to sudo.

Categories: Linux, Programming, SHELL, Systems Tags: ,

Document databases and Graph databases

March 27th, 2014 No comments
  • Document databases

Document databases are not document management systems. More often than not, developers starting out with NoSQL confuse document databases with document and content management systems. The word document in document databases connotes loosely structured sets of key/value pairs in documents, typically JSON (JavaScript Object Notation), and not documents or spreadsheets (though these could be stored too).

Document databases treat a document as a whole and avoid splitting a document into its constituent name/value pairs. At a collection level, this allows for putting together a diverse set of documents into a single collection. Document databases allow indexing of documents on the basis of not only its primary identifier but also its properties. A few different open-source document databases are available today but the most prominent among the available options are MongoDB and CouchDB.

MongoDB

  • Official Online Resources — www.mongodb.org.
  • History — Created at 10gen.
  • Technologies and Language — Implemented in C++.
  • Access Methods — A JavaScript command-line interface. Drivers exist for a number of languages including C, C#, C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, and Scala.
  • Query Language — SQL-like query language.
  • Open-Source License — GNU Affero GPL (http://gnu.org/licenses/agpl-3.0.html).
  • Who Uses It — FourSquare, Shutterfly, Intuit, Github, and more.

CouchDB

  • Official Online Resources — http://couchdb.apache.org and www.couchbase. Most of the authors are part of Couchbase, Inc.
  • History — Work started in 2005 and it was incubated into Apache in 2008.
  • Technologies and Language — Implemented in Erlang with some C and a JavaScript execution environment.
  • Access Methods — Upholds REST above every other mechanism. Use standard web tools and clients to access the database, the same way as you access web resources.
  • Open-Source License — Apache License version 2.
  • Who Uses It — Apple, BBC, Canonical, Cern, and more at http://wiki.apache.org/couchdb/CouchDB_in_the_wild.

A lot of details on document databases are covered starting in the next chapter.

  • Graph Databases

So far I have listed most of the mainstream open-source NoSQL products. A few other products like Graph databases and XML data stores could also qualify as NoSQL databases. This book does not cover Graph and XML databases. However, I list two Graph databases that may be of interest and that you may want to explore beyond this book: Neo4j and FlockDB:

Neo4J is an ACID-compliant graph database. It facilitates rapid traversal of graphs.

Neo4j

  • Official Online Resources — http://neo4j.org.
  • History — Created at Neo Technologies in 2003. (Yes, this database has been around since before the term NoSQL was popularly known.)
  • Technologies and Language — Implemented in Java.
  • Access Methods — A command-line access to the store is provided. REST interface also available. Client libraries for Java, Python, Ruby, Clojure, Scala, and PHP exist.
  • Query Language — Supports SPARQL protocol and RDF Query Language.
  • Open-Source License — AGPL.
  • Who Uses It — Box.net.

FlockDB

  • Official Online Resources — https://github.com/twitter/flockdb
  • History — Created at Twitter and open sourced in 2010. Designed to store the adjacency lists for followers on Twitter.
  • Technologies and Language — Implemented in Scala.
  • Access Methods — A Thrift and Ruby client.
  • Open-Source License — Apache License version 2.
  • Who Uses It — Twitter.

A number of NoSQL products have been covered so far. Hopefully, it has warmed you up to learn more about these products and to get ready to understand how you can leverage and use them effectively in your stack.

PS:

This article is from the book <Professional NoSQL>.

 

Categories: Clouding, Databases, IT Architecture Tags:

sorted ordered column-oriented stores VS key/value stores in NoSQL

March 27th, 2014 No comments
  • sorted ordered column-oriented stores

Google's Bigtable espouses a model where data is stored in a column-oriented way. This contrasts with the row-oriented format in RDBMS. The column-oriented storage allows data to be stored effectively. It avoids consuming space when storing nulls by simply not storing a column when a value doesn't exist for that column.

Each unit of data can be thought of as a set of key/value pairs, where the unit itself is identified with the help of a primary identifier, often referred to as the primary key. Bigtable and its clones tend to call this primary key the row-key. Also, as the title of this subsection suggests, units are stored in an ordered-sorted manner. The units of data are sorted and ordered on the basis of the row-key. To explain sorted ordered column-oriented stores, an example serves better than a lot of text, so let me present an example to you. Consider a simple table of values that keeps information about a set of people. Such a table could have columns like first_name, last_name, occupation, zip_code, and gender. A person’s information in this table could be as follows:

first_name: John
last_name: Doe
zip_code: 10001
gender: male

Another set of data in the same table could be as follows:

first_name: Jane
zip_code: 94303

The row-key of the first data point could be 1 and the second could be 2. Then data would be stored in a sorted ordered column-oriented store in a way that the data point with row-key 1 will be stored before a data point with row-key 2 and also that the two data points will be adjacent to each other.

Next, only the valid key/value pairs would be stored for each data point. So, a possible column-family for the example could be name with columns first_name and last_name being its members. Another column-family could be location with zip_code as its member. A third column-family could be profile. The gender column could be a member of the profile column-family. In column-oriented stores similar to Bigtable, data is stored on a column-family basis. Column-families are typically defined at configuration or startup time. Columns themselves need no a-priori definition or declaration. Also, columns are capable of storing any data types as far as the data can be persisted to an array of bytes.

So the underlying logical storage for this simple example consists of three storage buckets: name, location, and profile. Within each bucket, only key/value pairs with valid values are stored. Therefore, the name column-family bucket stores the following values:

For row-key: 1

first_name: John
last_name: Doe

For row-key: 2

first_name: Jane

The location column-family stores the following:

For row-key: 1

zip_code: 10001

For row-key: 2

zip_code: 94303

The profile column-family has values only for the data point with row-key 1 so it stores only the following:

For row-key: 1

gender: male

In real storage terms, the column-families are not physically isolated for a given row. All data pertaining to a row-key is stored together. The column-family acts as a key for the columns it contains and the row-key acts as the key for the whole data set.

Data in Bigtable and its clones is stored in a contiguous sequenced manner. As data grows to fill up one node, it is split into multiple nodes. The data is sorted and ordered not only on each node but also across nodes, providing one large continuously sequenced set. The data is persisted in a fault-tolerant manner where three copies of each data set are maintained. Most Bigtable clones leverage a distributed filesystem to persist data to disk. Distributed filesystems allow data to be stored among a cluster of machines.

The sorted ordered structure makes data seek by row-key extremely efficient. Data access is less random and ad-hoc and lookup is as simple as finding the node in the sequence that holds the data. Data is inserted at the end of the list. Updates are in-place but often imply adding a newer version of data to the specific cell rather than in-place overwrites. This means a few versions of each cell are maintained at all times. The versioning property is usually configurable.

A bullet-point enumeration of some of the Bigtable open-source clones’ properties is listed next.

  • HBase

Official Online Resources — http://hbase.apache.org.
History — Created at Powerset (now part of Microsoft) in 2007. Donated to the Apache foundation before Powerset was acquired by Microsoft.
Technologies and Language — Implemented in Java.
Access Methods — A JRuby shell allows command-line access to the store. Thrift, Avro, REST, and protobuf clients exist. A few language bindings are also available. A Java API is available with the distribution.
Query Language — No native querying language. Hive (http://hive.apache.org) provides a SQL-like interface for HBase.
Open-Source License — Apache License version 2.
Who Uses It — Facebook, StumbleUpon, Hulu, Ning, Mahalo, Yahoo!, and others.

  • Hypertable

Official Online Resources — www.hypertable.org.
History — Created at Zvents in 2007. Now an independent open-source project.
Technologies and Language — Implemented in C++, uses Google RE2 regular expression library. RE2 provides a fast and efficient implementation. Hypertable promises performance boost over HBase, potentially serving to reduce time and cost when dealing with large amounts of data.
Access Methods — A command-line shell is available. In addition, a Thrift interface is supported. Language bindings have been created based on the Thrift interface. A creative developer has even created a JDBC-compliant interface for Hypertable.
Query Language — HQL (Hypertable Query Language) is a SQL-like abstraction for querying Hypertable data. Hypertable also has an adapter for Hive.
Open-Source License — GNU GPL version 2.
Who Uses It — Zvents, Baidu (China’s biggest search engine), Rediff (India’s biggest portal).

  • Cloudata

Official Online Resources — www.cloudata.org/.
History — Created by a Korean developer named YK Kwon (www.readwriteweb.com/hack/2011/02/open-source-bigtable-cloudata.php). Not much is publicly known about its origins.
Technologies and Language — Implemented in Java.
Access Methods — A command-line access is available. Thrift, REST, and Java API are available.
Query Language — CQL (Cloudata Query Language) defines a SQL-like query language.
Open-Source License — Apache License version 2.
Who Uses It — Not known.

Sorted ordered column-family stores form a very popular NoSQL option. However, NoSQL includes many more variants, such as key/value stores and document databases. Next, I introduce the key/value stores.

  • key/value stores

A HashMap or an associative array is the simplest data structure that can hold a set of key/value pairs. Such data structures are extremely popular because they provide a very efficient, average O(1) running time for accessing data. The key of a key/value pair is a unique value in the set and can be easily looked up to access the data.
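As a trivial illustration of this, a plain Perl hash already gives you the average O(1) get/set behavior described above (just an in-memory sketch with made-up keys):

#!/usr/bin/perl
use strict;
use warnings;

my %kv;                          # the simplest possible key/value store
$kv{'user:1001'} = 'John Doe';   # set
$kv{'user:1002'} = 'Jane';       # set

print $kv{'user:1001'}, "\n";    # get - average O(1) lookup by key
delete $kv{'user:1002'};         # delete
print exists $kv{'user:1002'} ? "found\n" : "not found\n";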

Key/value stores come in various types: some keep the data in memory and some provide the capability to persist the data to disk. Key/value stores can be distributed and held in a cluster of nodes.

A simple, yet powerful, key/value store is Oracle’s Berkeley DB. Berkeley DB is a pure storage engine where both key and value are an array of bytes. The core storage engine of Berkeley DB doesn’t attach meaning to the key or the value. It takes byte array pairs in and returns the same back to the calling client. Berkeley DB allows data to be cached in memory and flushed to disk as it grows. There is also a notion of indexing the keys for faster lookup and access. Berkeley DB has existed since the mid-1990s. It was created to replace AT&T’s NDBM as a part of migrating from BSD 4.3 to 4.4. In 1996, Sleepycat Software was formed to maintain and provide support for Berkeley DB.

Another type of key/value store in common use is a cache. A cache provides an in-memory snapshot of the most-used data in an application. The purpose of a cache is to reduce disk I/O. Cache systems could be rudimentary map structures or robust systems with a cache expiration policy. Caching is a popular strategy employed at all levels of a computer software stack to boost performance. Operating systems, databases, middleware components, and applications use caching.

Robust open-source distributed cache systems like EHCache (http://ehcache.org/) are widely used in Java applications. EHCache could be considered as a NoSQL solution. Another caching system popularly used in web applications is Memcached (http://memcached.org/), which is an open-source, high-performance object caching system. Brad Fitzpatrick created Memcached for LiveJournal in 2003. Apart from being a caching system, Memcached also helps effective memory management by creating a large virtual pool and distributing memory among nodes as required. This prevents fragmented zones where one node could have excess but unused memory and another node could be starved for memory.
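To give a flavor of how an application talks to Memcached, here is a minimal Perl sketch using the CPAN Cache::Memcached module (assuming the module is installed and a memcached instance is listening on 127.0.0.1:11211):

#!/usr/bin/perl
use strict;
use warnings;
use Cache::Memcached;

my $memd = Cache::Memcached->new({
    servers => ['127.0.0.1:11211'],   # assumed local memcached instance
});

$memd->set('session:abc', 'some cached value', 300);   # expire after 300 seconds
my $value = $memd->get('session:abc');                 # undef on a cache miss
print defined $value ? "hit: $value\n" : "miss\n";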

As the NoSQL movement has gathered momentum, a number of key/value pair data stores have emerged. Some of these newer stores build on the Memcached API, some use Berkeley DB as the underlying storage, and a few others provide alternative solutions built from scratch.

Many of these key/value stores have APIs that allow get-and-set mechanisms to get and set values. A few, like Redis (http://redis.io/), provide richer abstractions and more powerful APIs. Redis could be considered a data structure server because it provides data structures like strings (character sequences), lists, and sets, apart from maps. Also, Redis provides a very rich set of operations to access data from these different types of data structures.
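As a sketch of those richer abstractions, this is roughly how the CPAN Redis module can be used against a local Redis server (assuming Redis is running on the default port 6379 and the module is installed):

#!/usr/bin/perl
use strict;
use warnings;
use Redis;

my $r = Redis->new(server => '127.0.0.1:6379');   # assumed local Redis instance

$r->set('greeting', 'hello');            # plain string key/value
print $r->get('greeting'), "\n";

$r->lpush('recent_visitors', 'Tom');     # list data structure
$r->lpush('recent_visitors', 'Jane');
print join(',', $r->lrange('recent_visitors', 0, -1)), "\n";

$r->sadd('zip_codes', '94303');          # set data structure
$r->sadd('zip_codes', '94301');
print $r->scard('zip_codes'), "\n";      # number of members in the set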

This book covers a lot of details on key/value stores. For now, I list a few important ones and summarize the important attributes of these stores. Again, the presentation resorts to a bullet-point-style enumeration of a few important characteristics.

  • Membase (Proposed to be merged into Couchbase, gaining features from CouchDB after the creation of Couchbase, Inc.)

Official Online Resources — www.membase.org/.
History — Project started in 2009 by NorthScale, Inc. (later renamed Membase). Zynga and NHN have been contributors since the beginning. Membase builds on Memcached and supports Memcached's text and binary protocol. Membase adds a lot of additional features on top of Memcached. It adds disk persistence, data replication, live cluster reconfiguration, and data rebalancing. A number of core Membase creators are also Memcached contributors.
Technologies and Language — Implemented in Erlang, C, and C++.
Access Methods — Memcached-compliant API with some extensions. Can be a drop-in replacement for Memcached.
Open-Source License — Apache License version 2.
Who Uses It — Zynga, NHN, and others.

  • Kyoto Cabinet

Official Online Resources — http://fallabs.com/kyotocabinet/.
History — Kyoto Cabinet is a successor of Tokyo Cabinet (http://fallabs.com/tokyocabinet/). The database is a simple data file containing records; each is a pair of a key and a value. Every key and value are serial bytes with variable length.
Technologies and Language — Implemented in C++.
Access Methods — Provides APIs for C, C++, Java, C#, Python, Ruby, Perl, Erlang, OCaml, and Lua. The protocol simplicity means there are many, many clients.
Open-Source License — GNU GPL and GNU LGPL.
Who Uses It — Mixi, Inc. sponsored much of its original work before the author left Mixi to join Google. Blog posts and mailing lists suggest that there are many users but no public list is available.

  • Redis

Official Online Resources — http://redis.io/.
History — Project started in 2009 by Salvatore Sanfilippo. Salvatore created it for his startup LLOOGG (http://lloogg.com/). Though still an independent project, Redis's primary author is employed by VMware, which sponsors its development.
Technologies and Language — Implemented in C.
Access Methods — Rich set of methods and operations. Can access via Redis command-line interface and a set of well-maintained client libraries for languages like Java, Python, Ruby, C, C++, Lua, Haskell, AS3, and more.
Open-Source License — BSD.
Who Uses It — Craigslist.

The three key/value stores listed here are nimble, fast implementations that provide storage for real-time data, temporary frequently used data, or even full-scale persistence.

The key/value stores listed so far provide a strong consistency model for the data they store. However, a few other key/value stores emphasize availability over consistency in distributed deployments. Many of these are inspired by Amazon's Dynamo, which is also a key/value store. Amazon's Dynamo promises exceptional availability and scalability, and forms the backbone for Amazon's distributed, fault-tolerant, and highly available systems. Apache Cassandra, Basho Riak, and Voldemort are open-source implementations of the ideas proposed by Amazon Dynamo.

Amazon Dynamo brings a lot of key high-availability ideas to the forefront. The most important of the ideas is that of eventual consistency. Eventual consistency implies that there could be small intervals of inconsistency between replicated nodes as data gets updated among peer-to-peer nodes. Eventual consistency does not mean inconsistency. It just implies a weaker form of consistency than the typical ACID type consistency found in RDBMS.

For now I will list the Amazon Dynamo clones and introduce you to a few important characteristics of these data stores.

  • Cassandra

Official Online Resources — http://cassandra.apache.org/.
History — Developed at Facebook and open sourced in 2008, Apache Cassandra was donated to the Apache foundation.
Technologies and Language — Implemented in Java.
Access Methods — A command-line access to the store. A Thrift interface and an internal Java API exist. Clients for multiple languages including Java, Python, Grails, PHP, .NET, and Ruby are available. Hadoop integration is also supported.
Query Language — A query language specification is in the making.
Open-Source License — Apache License version 2.
Who Uses It — Facebook, Digg, Reddit, Twitter, and others.

  • Voldemort

Official Online Resources — http://project-voldemort.com/.
History — Created by the data and analytics team at LinkedIn in 2008.
Technologies and Language — Implemented in Java. Provides for pluggable storage using either Berkeley DB or MySQL.
Access Methods — Integrates with Thrift, Avro, and protobuf (http://code.google.com/p/protobuf/) interfaces. Can be used in conjunction with Hadoop.
Open-Source License — Apache License version 2.
Who Uses It — LinkedIn.

  • Riak

Official Online Resources — http://wiki.basho.com/.
History — Created at Basho, a company formed in 2008.
Technologies and Language — Implemented in Erlang. Also, uses a bit of C and JavaScript.
Access Methods — Interfaces for JSON (over HTTP) and protobuf clients exist. Libraries for Erlang, Java, Ruby, Python, PHP, and JavaScript exist.
Open-Source License — Apache License version 2.
Who Uses It — Comcast and Mochi Media.

All three — Cassandra, Riak and Voldemort — provide open-source Amazon Dynamo capabilities. Cassandra and Riak demonstrate a dual nature as far as their behavior and properties go. Cassandra has properties of both Google Bigtable and Amazon Dynamo. Riak acts both as a key/value store and as a document database.

PS:

This article is from book <Professional NoSQL>.

Categories: Clouding, Databases, IT Architecture Tags:

map/reduce framework definition and introduction

March 27th, 2014 No comments

MapReduce is a parallel programming model that allows distributed processing on large data sets on a cluster of computers. The MapReduce framework is patented (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/PTO/srchnum.htm&r=1&f=G&l=50&s1=7,650,331.PN.&OS=PN/7,650,331&RS=PN/7,650,331) by Google, but the ideas are freely shared and adopted in a number of open-source implementations.

MapReduce derives its ideas and inspiration from concepts in the world of functional programming. Map and reduce are commonly used functions in the world of functional programming. In functional programming, a map function applies an operation or a function to each element in a list. For example, a multiply-by-two function on a list [1, 2, 3, 4] would generate another list as follows: [2, 4, 6, 8]. When such functions are applied, the original list is not altered. Functional programming believes in keeping data immutable and avoids sharing data among multiple processes or threads. This means the map function that was just illustrated, trivial as it may be, could be run via two or more threads on the list, and these threads would not step on each other, because the list itself is not altered.

Like the map function, functional programming has the concept of a reduce function. Actually, a reduce function in functional programming is more commonly known as a fold function. A reduce or fold function is also sometimes called an accumulate, compress, or inject function. A reduce or fold function applies a function to all elements of a data structure, such as a list, and produces a single result or output. So applying a reduce function, like summation, to the list generated by the map function, that is, [2, 4, 6, 8], would generate an output equal to 20.

So map and reduce functions could be used in conjunction to process lists of data, where a function is first applied to each member of a list and then an aggregate function is applied to the transformed and generated list.
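In Perl the same map-then-fold flow looks like the following, with List::Util's reduce playing the role of the fold/reduce function:

#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(reduce);

my @list    = (1, 2, 3, 4);
my @doubled = map { $_ * 2 } @list;          # map: multiply-by-two -> (2, 4, 6, 8)
my $total   = reduce { $a + $b } @doubled;   # reduce/fold: summation -> 20

print "@doubled\n";   # 2 4 6 8
print "$total\n";     # 20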

This same simple idea of map and reduce has been extended to work on large data sets. The idea is slightly modified to work on collections of tuples or key/value pairs. The map function applies a function on every key/value pair in the collection and generates a new collection. Then the reduce function works on the new generated collection and applies an aggregate function to compute a final output. This is better understood through an example, so let me present a trivial one to explain the flow. Say you have a collection of key/value pairs as follows:

[{ "94303": "Tom"}, {"94303": "Jane"}, {"94301": "Arun"}, {"94302": "Chen"}]

This is a collection of key/value pairs where the key is the zip code and the value is the name of a person who resides within that zip code. A simple map function on this collection could get the names of all those who reside in a particular zip code. The output of such a map function is as follows:

[{"94303":["Tom", "Jane"]}, {“94301″:["Arun"]}, {“94302″:["Chen"]}]

Now a reduce function could work on this output to simply count the number of people who belong to each zip code. The final output then would be as follows:

[{"94303": 2}, {"94301": 1}, {"94302": 1}]

This example is extremely simple and a MapReduce mechanism seems too complex for such a manipulation, but I hope you get the core idea behind the concepts and the flow.
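Still, just to show the flow end to end, here is a small Perl sketch of the same zip code example — the "map" phase groups names by zip code and the "reduce" phase counts them (a single-process toy, not a distributed MapReduce implementation):

#!/usr/bin/perl
use strict;
use warnings;

my @records = (
    { '94303' => 'Tom'  },
    { '94303' => 'Jane' },
    { '94301' => 'Arun' },
    { '94302' => 'Chen' },
);

# "map" phase: emit names grouped by zip code
my %grouped;
for my $pair (@records) {
    my ($zip, $name) = %$pair;
    push @{ $grouped{$zip} }, $name;
}

# "reduce" phase: count the names for each zip code
my %counts = map { $_ => scalar @{ $grouped{$_} } } keys %grouped;

for my $zip (sort keys %counts) {
    print "$zip: $counts{$zip}\n";   # 94301: 1, 94302: 1, 94303: 2
}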

PS:

This article is from book <Professional NoSQL>.

AWS – relationship between Elastic Load Balancing, CloudWatch, and Auto Scale

March 20th, 2014 No comments

Introduction

The monitoring, auto scaling, and elastic load balancing features of the Amazon EC2 services give you easy on-demand access to capabilities that once required a complicated system architecture and a large hardware investment.
Any real-world web application must have the ability to scale. This can take the form of vertical scaling, where larger and higher capacity servers are rolled in to replace the existing ones, or horizontal scaling, where additional servers are placed side-by-side (architecturally speaking) with the existing resources. Vertical scaling is sometimes called a scale-up model, and horizontal scaling is sometimes called a scale-out model.

Vertical Scaling

At first, vertical scaling appears to be the easiest way to add capacity. You start out with a server of modest means and use it until it no longer meets your needs. You purchase a bigger one, move your code and data over to it, and abandon the old one. Performance is good until the newer, larger system reaches its capacity. You purchase again, repeating the process until your hardware supplier informs you that you’re running on the largest hardware that they have, and that you’ve no more room to grow. At this point you’ve effectively painted yourself into a corner.
Vertical scaling can be expensive. Each time you upgrade to a bigger system you also make a correspondingly larger investment. If you’re actually buying hardware, your first step-ups cost you thousands of dollars; your later ones cost you tens or even hundreds of thousands of dollars. At some point you may have to invest in a similarly expensive backup system, which will remain idle unless the unthinkable happens and you need to use it to continue operations.

Horizontal Scaling

Horizontal scaling is slightly more complex, but far more flexible and scalable in the long term. Instead of upgrading to a bigger server, you obtain another one (presumably of the same size, although there's no requirement for this to be the case) and arrange to share the storage and processing load across the two servers. When two servers no longer meet your needs, you add a third, a fourth, and so on. This scale-out model allows you to add resources incrementally and economically. As your fleet of servers grows, you can actually increase the reliability of your system by eliminating dependencies on any particular server.
Of course, sharing the storage and processing load across a fleet of servers is sometimes easier said than done. Loosely coupled systems tied together with SQS message queues like those we saw and built in the previous chapter can usually scale easily. Systems with a reliance on a traditional relational database or another centralized storage can be more difficult.

Monitoring, Scaling, and Load Balancing

We’ll need several services in order to build a horizontally scaled system that automatically scales to handle load.
First, we need to know how hard each server is working. We have to establish how much data is moving in and out across the network, how many disk reads and writes are taking place, and how much of the time the CPU (Central Processing Unit) is busy. This functionality is provided by Amazon CloudWatch. After CloudWatch has been enabled for an EC2 instance or an elastic load balancer, it captures and stores this information so that it can be used to control scaling decisions.
Second, we require a way to observe the system performance, using it to make decisions to add more EC2 instances (because the system is too busy) or to remove some running instances (because there’s too little work for them to do). This functionality is provided by the EC2 auto scaling feature. The auto scaling feature uses a rule-driven system to encode the logic needed to add and remove EC2 instances.
Third, we need a method for routing traffic to each of the running instances. This is handled by the EC2 elastic load balancing feature. Working in conjunction with auto scaling, elastic load balancing distributes traffic to EC2 instances located in one or more Availability Zones within an EC2 region. It also uses configurable health checks to detect failing instances and to route traffic away from them.
Figure 7-1 depicts how these features relate to each other.
An incoming HTTP load is balanced across a collection of EC2 instances. CloudWatch captures and stores system performance data from the instances. This data is used by auto scaling to regulate the number of EC2 instances in the collection.
As you’ll soon see, you can use each of these features on their own or you can use them together. This modular model gives you a lot of flexibility and also allows you to learn about the features in an incremental fashion.

ELB_CloudWatch_AutoScale

PS:

This article is from book <Host Your Web Site In The Cloud: Amazon Web Services Made Easy>.

 

Categories: Clouding Tags:

Resolved – print() on closed filehandle $fh at ./perl.pl line 6.

March 19th, 2014 No comments

You may find that print sometimes won’t work as expected in perl, for example:

[root@centos-doxer test]# cat perl.pl
#!/usr/bin/perl
use warnings;
open($fh,"test.txt");
select $fh;
close $fh;
print "test";

You may expect "test" to be printed, but actually you get this error message:

print() on closed filehandle $fh at ./perl.pl line 6.

So how did this happen? Please see my explanation:

[root@centos-doxer test]# cat perl.pl
#!/usr/bin/perl
use warnings;
open($fh,"test.txt");
select $fh;
close $fh; #here you closed the $fh filehandle, but you should now reset the default filehandle to STDOUT
print "test";

Now here’s the updated script:

#!/usr/bin/perl
use warnings;
open($fh,"test.txt");
select $fh;
close $fh;
select STDOUT;
print "test";

This way, you’ll get “test” as expected!
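Alternatively, you can avoid juggling the default filehandle with select altogether by printing to an explicit filehandle, so closing $fh cannot affect later prints to STDOUT (a small sketch, assuming test.txt is writable):

#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '>', 'test.txt') or die "cannot open test.txt: $!";
print $fh "this line goes into test.txt\n";   # print to the explicit filehandle
close $fh;

print "test\n";                               # still goes to STDOUT, no select needed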

 

Categories: Perl, Programming Tags:

set vnc not asking for OS account password

March 18th, 2014 No comments

As you may know, vncpasswd (which belongs to the vnc-server package) is used to set the password users enter when connecting to vnc with a vnc client (such as tightvnc). When you connect to vnc-server, it'll ask for the password:

vnc-0

After you connect to the host using VNC, you may also find that the remote server will ask again for the OS password (this is the one set by passwd):

vnc-01

In some cases, you may not want the second prompt. So here's the way to cancel this behavior:

vnc-1 vnc-2

 

 

Categories: Linux, Systems Tags: ,

stuck in PXE-E51: No DHCP or proxyDHCP offers were received, PXE-M0F: Exiting Intel Boot Agent, Network boot canceled by keystroke

March 17th, 2014 No comments

If you installed your OS and tried booting it up but got stuck with the following messages:

stuck_pxe

Then one possibility is that the configuration of your host's storage array is not right. For instance, it should be JBOD but you configured it as RAID6.

Please note that this is only one possibility for this error; you may search for the PXE error codes you encountered for more details.

PS:

  • Sometimes, DHCP snooping may prevent PXE functioning, you can read more http://en.wikipedia.org/wiki/DHCP_snooping.
  • STP (Spanning-Tree Protocol) makes each port wait up to 50 seconds before data is allowed to be sent on the port. This delay in turn can cause problems with some applications/protocols (PXE, Bootworks, etc.). To alleviate the problem, PortFast was implemented on Cisco devices; the terminology might differ between different vendors' devices. You can read more at http://www.symantec.com/business/support/index?page=content&id=HOWTO6019
  • ARP caching http://www.networkers-online.com/blog/2009/02/arp-caching-and-timeout/
Categories: Hardware, Storage, Systems Tags:

Oracle BI Publisher reports – send mail when filesystems getting full

March 17th, 2014 No comments

Let's assume you have an Oracle BI Publisher report for filesystem checking, and now you want to write a script that checks that report page and sends mail to system admins when filesystems are getting full. Since the default output of an Oracle BI Publisher report needs javascript to work, and wget/curl cannot handle javascript, the next step after logging on is to find the html version's url of that report to use in your script (the html page also has all records, while the javascript one shows only part of them):

BI_report_login

BI_export_html

 

Let's assume that the html's url is "http://www.example.com:9703/report.html", and it displays like the following:

bi report

Then here is the script that will check this page for hosts that have less than 10% available space and send mail to system admins:

#!/usr/bin/perl
use HTML::Strip;
system("rm -f spacereport.html");
system("wget -q --no-proxy --no-check-certificate --post-data 'id=admin&passwd=password' 'http://www.example.com:9703/report.html' -O spacereport.html");
open($fh,"spacereport.html");

#or just @spacereport=<$fh>;
foreach(<$fh>){
push(@spacereport,$_);
}

#change array to hash
$index=0;
map {$pos{$index++}=$_} @spacereport;

#get location of <table> and </table>
#sort numerically ascending
for $char (sort {$a<=>$b} (keys %pos))
{
if($pos{$char} =~ /<table class="c27">/)
{
$table_start=$char;
}

if($pos{$char} =~ /<\/table>/)
{
$table_end=$char;
}

}

#get contents between <table> and </table>
for($i=$table_start;$i<=$table_end;$i++){
push(@table_array,$spacereport[$i]);
}
$table_htmlstr=join("",@table_array);

#get clear text between <table> and </table>
my $hs=HTML::Strip->new();
my $clean_text = $hs->parse($table_htmlstr);
$hs->eof;

@array_filtered=split("\n",$clean_text);

#remove empty array element
@array_filtered=grep { !/^\s+$/ } @array_filtered;
system("rm -f space_mail_warning.txt");
open($fh_mail_warning,">","space_mail_warning.txt");
select $fh_mail_warning;
for($j=4;$j<=$#array_filtered;$j=$j+4){
#put lines that have free space lower than 10% into space_mail_warning.txt
if($array_filtered[$j+2] <= 10){
print "Host: ".$array_filtered[$j]."\n";
print "Part: ".$array_filtered[$j+1]."\n";
print "Free(%): ".$array_filtered[$j+2]."\n";
print "Free(GB): ".$array_filtered[$j+3]."\n";
print "============\n\n";
}
}
close $fh_mail_warning;

system("rm -f space_mail_info.txt");
open($fh_mail_info,">","space_mail_info.txt");
select $fh_mail_info;
for($j=4;$j<=$#array_filtered;$j=$j+4){
#put lines that have free space lower than 15% into space_mail_info.txt
if($array_filtered[$j+2] <= 15){
print "Host: ".$array_filtered[$j]."\n";
print "Part: ".$array_filtered[$j+1]."\n";
print "Free(%): ".$array_filtered[$j+2]."\n";
print "Free(GB): ".$array_filtered[$j+3]."\n";
print "============\n\n";
}
}
close $fh_mail_info;

#send mail
#select STDOUT;
if(-s "space_mail_warning.txt"){
system('cat space_mail_warning.txt | /bin/mailx -s "Space Warning - please work with component owners to free space" [email protected]');
} elsif(-s "space_mail_info.txt"){
system('cat space_mail_info.txt | /bin/mailx -s "Space Info - Space checking mail" [email protected]');
}
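If you want this check to run unattended, one option is a cron entry like the following (the path and schedule are just examples — adjust them to wherever you saved the script):

# run the space check hourly
0 * * * * /root/scripts/check_bi_space.pl >/dev/null 2>&1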

Categories: Perl, Programming Tags:

wget and curl tips

March 14th, 2014 No comments

Imagine you want to download all files under http://www.example.com/2013/downloads, but not other files under http://www.example.com/2013 outside the 'downloads' directory; then you can do this:

wget -r --level 100 -nd --no-proxy --no-parent --reject "index.htm*" --reject "*gif" 'http://www.example.com/2013/downloads/' #--level 100 is large enough, as I've seen no site has more than 100 levels of sub-directories so far.

wget -p -k --no-proxy --no-check-certificate --post-data 'id=username&passwd=password' <url> -O output.html

wget --no-proxy --no-check-certificate --save-cookies cookies.txt <url>

wget --no-proxy --no-check-certificate --load-cookies cookies.txt <url>

curl -k -u 'username:password' <url>

curl -k -L -d id=username -d passwd=password <url>

curl --data "loginform:id=username&loginform:passwd=password" -k -L <url>

 

Categories: Linux, Programming, SHELL Tags:

resolved – ssh Read from socket failed: Connection reset by peer and Write failed: Broken pipe

March 13th, 2014 No comments

If you meet the following errors when sshing to a linux box:

Read from socket failed: Connection reset by peer

Write failed: Broken pipe

Then one possibility is that the linux box's filesystem is corrupted. In my case there was also this output on stdout:

EXT3-fs error ext3_lookup: deleted inode referenced

To resolve this, you need to make linux go to single user mode and run fsck -y <filesystem>. You can get the names of the corrupted filesystems from the boot messages:

[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
/usr contains a file system with errors, check forced.
/usr: Directory inode 378101, block 0, offset 0: directory corrupted

/usr: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)

[/sbin/fsck.ext3 (1) -- /oem] fsck.ext3 -a /dev/xvda5
/oem: recovering journal
/oem: clean, 8253/1048576 files, 202701/1048233 blocks
[/sbin/fsck.ext3 (1) -- /u01] fsck.ext3 -a /dev/xvdb
u01: clean, 36575/14548992 files, 2122736/29081600 blocks
[FAILED]

So in this case, I did fsck -y /dev/xvda2 && fsck -y /dev/xvda5, later rebooted the host, and then everything went well.

PS:

If two VMs are booted up on two hypervisors and these VMs share the same filesystem (such as NFS), then after you fsck -y the FS and boot up one VM, the FS will soon be corrupted again because other copies of the VM are still using that FS. So you first need to make sure that only one copy of the VM is running on the hypervisors of the same server pool.

Categories: Kernel, Linux Tags:

tcpdump tips

March 13th, 2014 No comments

tcpdump [ -AdDefIKlLnNOpqRStuUvxX ] [ -B buffer_size ] [ -c count ]

[ -C file_size ] [ -G rotate_seconds ] [ -F file ]
[ -i interface ] [ -m module ] [ -M secret ]
[ -r file ] [ -s snaplen ] [ -T type ] [ -w file ]
[ -W filecount ]
[ -E spi@ipaddr algo:secret,... ]
[ -y datalinktype ] [ -z postrotate-command ] [ -Z user ] [ expression ]

#general format of a tcp protocol line

src > dst: flags data-seqno ack window urgent options
Src and dst are the source and destination IP addresses and ports.
Flags are some combination of S (SYN), F (FIN), P (PUSH), R (RST), W (ECN CWR) or E (ECN-Echo), or a single '.' (means no flags were set)
Data-seqno describes the portion of sequence space covered by the data in this packet.
Ack is sequence number of the next data expected the other direction on this connection.
Window is the number of bytes of receive buffer space available the other direction on this connection.
Urg indicates there is 'urgent' data in the packet.
Options are tcp options enclosed in angle brackets (e.g., <mss 1024>).

tcpdump -D #list of the network interfaces available
tcpdump -e #Print the link-level header on each dump line
tcpdump -S #Print absolute, rather than relative, TCP sequence numbers
tcpdump -s <snaplen> #Snarf snaplen bytes of data from each packet rather than the default of 65535 bytes
tcpdump -i eth0 -nn -XX vlan
tcpdump -i eth0 -nn -XX arp
tcpdump -i bond0 -nn -vvv udp dst port 53
tcpdump -i bond0 -nn -vvv host testhost
tcpdump -nn -vvv “dst host host1.example.com and (dst port 1521 or dst port 6200)”

Categories: Life Tags:

psftp through a proxy

March 5th, 2014 No comments

You may know that we can set a proxy in putty for ssh to a remote host, as shown below:

putty_proxy

And if you want to scp files from a remote site to your local box, you can use putty's psftp.exe. There are many options for psftp.exe:

C:\Users\test>d:\PuTTY\psftp.exe -h
PuTTY Secure File Transfer (SFTP) client
Release 0.62
Usage: psftp [options] [user@]host
Options:
-V print version information and exit
-pgpfp print PGP key fingerprints and exit
-b file use specified batchfile
-bc output batchfile commands
-be don’t stop batchfile processing if errors
-v show verbose messages
-load sessname Load settings from saved session
-l user connect with specified username
-P port connect to specified port
-pw passw login with specified password
-1 -2 force use of particular SSH protocol version
-4 -6 force use of IPv4 or IPv6
-C enable compression
-i key private key file for authentication
-noagent disable use of Pageant
-agent enable use of Pageant
-batch disable all interactive prompts

Although there's a proxy setting option for putty.exe, there's no proxy setting for psftp.exe! So what should you do if you want to copy files back to your local box, a firewall blocks you from doing this directly, and you must use a proxy?

As you may notice, there’s “-load sessname” option in psftp.exe:

-load sessname Load settings from saved session

This option means that if you have a session saved in putty.exe, then you can use psftp.exe -load <session name> to copy files from the remote site. For example, suppose you saved a session named mysession in putty.exe in which you set the proxy; then you can use "psftp.exe -load mysession" to copy files from the remote site (no need for username/password, as you must have entered those in the putty.exe session):

C:\Users\test>d:\PuTTY\psftp.exe -load mysession
Using username “root”.
Remote working directory is /root
psftp> ls
Listing directory /root
drwx------ 3 ec2-user ec2-user 4096 Mar 4 09:27 .
drwxr-xr-x 3 root root 4096 Dec 10 23:47 ..
-rw------- 1 ec2-user ec2-user 388 Mar 5 05:07 .bash_history
-rw-r--r-- 1 ec2-user ec2-user 18 Sep 4 18:23 .bash_logout
-rw-r--r-- 1 ec2-user ec2-user 176 Sep 4 18:23 .bash_profile
-rw-r--r-- 1 ec2-user ec2-user 124 Sep 4 18:23 .bashrc
drwx------ 2 ec2-user ec2-user 4096 Mar 4 09:21 .ssh
psftp> help
! run a local command
bye finish your SFTP session
cd change your remote working directory
chmod change file permissions and modes
close finish your SFTP session but do not quit PSFTP
del delete files on the remote server
dir list remote files
exit finish your SFTP session
get download a file from the server to your local machine
help give help
lcd change local working directory
lpwd print local working directory
ls list remote files
mget download multiple files at once
mkdir create directories on the remote server
mput upload multiple files at once
mv move or rename file(s) on the remote server
open connect to a host
put upload a file from your local machine to the server
pwd print your remote working directory
quit finish your SFTP session
reget continue downloading files
ren move or rename file(s) on the remote server
reput continue uploading files
rm delete files on the remote server
rmdir remove directories on the remote server
psftp>

Now you can get/put files as usual.

PS:

If you do not need a proxy to connect to the remote site, then you can use the psftp.exe CLI to get remote files directly. For example:

d:\PuTTY\psftp.exe [email protected] -i d:\PuTTY\aws.ppk -b d:\PuTTY\script.scr -bc -be -v

And d:\PuTTY\script.scr contains the script for putting/getting files:

cd /backup
lcd c:\
mget *.tar.gz
close

Categories: Linux, Systems Tags: ,

notes on Ten Steps to ITSM Success

February 25th, 2014 No comments
  • Step 1 – Setting the stage

1.   Draft a creditable Business Plan, complete with:

1.1   Clear executive sponsorship

1.2   Rudimentary financial analysis

1.3   Risk analysis

1.4   Organizational impact

1.5   Analysis of alternatives

1.6   Assumptions and constraints

1.7   Recommended implementation approach.

2.   Offer a proposed execution plan.

3.   Identify required resources.

4.   Execute a training and awareness campaign.

  • Step 2 - Inventory the current service offering

1. Gain agreement on current service offerings.
2. Develop cost types and categories.
3. Quantify the cost of each service.
4. Interview key stakeholders.
5. Validate findings with the business sponsor and stakeholders.

  • Step 3 - Validate the current service model

1. Identify and engage with key stakeholders across functional areas.
2. Develop a needs/services questionnaire jointly with customer representatives.
3. Decide on a “best means to an end” – i.e. conduct one-on-one interviews, or facilitate group workshops.
4. Agree and document business value-based, rank-ordered IT service requirements using a tool such as CTQ Tree.
5. Analyze results and develop Heat Maps and service maps.
6. Discuss results with the business, highlighting cost/trade-off areas.

  • Step 4 - Establish an itsm steering committee

1. Assemble an ITSM Steering Committee with cross-organizational representation.
2. Draft a charter outlining the Committee’s role, responsibilities, and scope of authority.
3. Educate the Committee members on their duties and areas of responsibility.
4. Formalize a standardized, repeatable communication strategy, and use it consistently.
5. Create a repository for housing and maintaining a historical record of Committee decisions and issues.
6. Ensure the ITSM Steering Committee is properly aligned with the organization’s enterprise governance model, including other groups with which it must interact.

  • Step 5 - Define the ideal target state

1. Articulate the company’s vision and mission statements. If they do not exist, engage your senior leadership and create them.
2. List the organization’s strategic goals.
3. If one does not already exist, create an organizational strategic plan that incorporates the ITSM Transformation effort.
4. Define specific, measurable, achievable, realistic and time-bound objectives that will achieve the articulated goals.
5. Plan the tasks necessary to achieve the objectives.
6. Create an IT Ecosystem detailing the interactions and relationships in the target state you wish to achieve.
7. Validate that your service management system:
— a. Supports delivery of target state services and agreed service levels
— b. Conforms to architectural policies, principles and guidelines

— c. Defines interfaces and integration points (people, processes, tools and information).

  • Step 6 - Create the IT strategic and tactical plans

1. Negotiate the order in which prioritized capabilities will be developed.
2. Achieve an optimal outcome for your ITSM Transformation by advising on trade-offs and emphasizing shared services, infrastructure and processes.
3. Issue a Notice of Decision when negotiations are complete.
4. Produce an IT Strategic Plan that addresses how IT will build, operate and sustain the capabilities required to deliver customer requirements.
5. Build a Goal Linking matrix.
6. Develop a Program Management Plan that defines the portfolio of projects required to execute the ITSM Steering Committee’s priorities.
7. Generate and publish the ITSM Transformation Roadmap to project staff and all relevant stakeholders, as well as to the Business Sponsor.
8. Construct, approve and publish tactical project plans.

  • Step 7 - Define organizational roles and responsibilities

1. Assess staff skills, functions, authority, accountability, roles, responsibilities and required level of supervision.
2. Schedule executive off-site session(s) with clearly established and enforced “rules of the game.”
3. Build out a top-level enterprise RACI that aligns to enterprise governance.
4. Validate and continue Organizational Change Management activities.

  • Step 8 - Standardized development approach

1. Decide upon and implement a standard process design framework.
2. Agree upon – and then publicize – enterprise standards.
3. Reach consensus on the broad activities each service and underlying process must execute.
4. Charter and staff integrated development teams.
5. Document tool automation requirements and procure licenses for the selected suite of tools.
6. Incorporate project management practices into your design plans.

  • Step 9 - Strategy and planning

1. Assess the planned capability development and prioritization.
2. Update previous assumptions and constraints.
3. Validate the scope of each planned capability.
4. Review and validate stipulated timelines.
5. Combine development activities, where applicable.
6. Create working project plan with milestones and agreed-upon deliverables.
7. Ensure proper allocation and utilization of planned resources.
8. Build a fully loaded operational project plan.

  • Step 10 - Logical and physical design

1. Construct a business-specific logical design for business users, processing systems and data.
2. Define an enterprise data classification scheme and governance model aligned to security policies.
3. Create policies controlling how and under what circumstances data may be accessed (by people, systems and other data elements).
4. Convene an Administrative Review Session with stakeholders to validate the logical model.
5. Initiate physical design activities.
6. Draft initial transition readiness plan.

  • Step 11 - Build and test

1. Prepare the test facility (development and test environments).
2. Configure, integrate and test the selected tool suite.
3. Create a representative bed of test data.
4. Build the unit and integration test plans.
5. Create an initial draft of user and operator training plans.
6. Design the deployment plan.
7. Conduct the formal Acceptance Testing criteria.

  • Step 12 - Conduct service and process health assessment

1. Pause development activities to take the pulse of your ITSM Improvement Initiative.
2. Approach the Business Sponsor and request an independent Third-Party Service and Process Health Assessment.
3. Validate/refine the list of pertinent questions and oversee data collection.
4. Analyze Assessment results and remediate any identified gaps.

  • Step 13 - Analysis and deployment

1. Remediate discrepancies discovered during Service and Process health assessment.
2. Conduct simulation exercises(if feasible).
3. Execute approved training plans.
4. Prepare the environment for the upcoming change.
5. Schedule the deployment date.
6. Review and validate the back-out plan.
7. Deploy the approved Release Package.

  • Step 14 - Operation and sustainment

1. Ensure operational staff can sustain the new or modified environment.
2. Monitor activities to validate services are performing as desired.
3. Address and correct unforeseen capacity and availability issues.
4. Add or update service and operational documentation to the enterprise Knowledge Repository.

  • Step 15 - Balanced scorecard and continual improvement

1. Verify the target operational steady state has been achieved.
2. Develop and implement an IT Balanced Scorecard (IT BSC):
— Create measures that show IT's tangible and intangible value to the business.
— Define a balanced set of metrics across the four key areas.
3. Position Continual Improvement as an overall Quality Management approach focused on the voice of the customer (VOC).
4. Create an enterprise Continual Improvement (CSI) Strategy:
— Define a standardized, data-driven, cross-organizational approach to continual improvement across the enterprise.
— Define key goals and objectives in alignment with enterprise strategy.
— Establish strong executive sponsorship and stakeholder buy-in.
— Anticipate argument against the CSI effort and develop countermeasures.
5. Create an enterprise Continual Improvement (CSI) Plan:
— Define the detailed Scope, Stakeholders, and Standards (three S’s).
— Develop an integrated CSI model combining “best of breed” frameworks.
— Design an organizational construct for a dedicated CSI Office/Team.
— Define and staff key CSI roles to execute the Plan.
— Define the Maturity Model, Process Model, Project Selection, Metrics, Auditing and Tools.
6. Use standardized templates to document, analyze, prioritize and manage enterprise improvement opportunities.

  • Step 16 - Putting it all together

PS:

This article contains excerpts from the book <Ten Steps to ITSM Success>.

Categories: IT Architecture Tags: ,

ITIL – Looking at an Example of a Service Design Project

February 18th, 2014 No comments

Dummy Co. is a commercial IT service provider. Dummy Co. has implemented ITIL and allocated many of the service management roles to members of the technical teams, as Figure 12-4 illustrates.

figure 12-4

An account manager has found a new customer. The account manager is Henry, and one of his roles is to act as the business relationship manager. The customer requires an invoicing service and doesn’t want to acquire and run the system itself. It wants to utilise it as part of a cloud (see the earlier section ‘Considering system architecture’ for an explanation of a cloud) from a software as a service (SaaS) provider. This means that the users will access the invoicing service from their Internet browsers over a private Internet connection, and use Dummy Co’s software application and hardware.

Henry has established the customer’s business requirements. These have been reviewed using the service portfolio management process (see Chapter 4). Dummy Co. has an existing application that can be used to deliver the service, and this will fulfil most of the utility requirements. However, the warranty requirements are quite demanding and exceed any service levels offered to existing customers. A business case has been established and approved via service portfolio management with a little help from the financial management and demand management processes (see Chapter 4). There is a considerable return on investment (ROI) to be gained from the venture. The design project is initiated. Charlie is the service delivery manager, and she has been allocated the role of design coordination manager, so she takes charge of the project.

Henry has invited Charlie, who also acts as the service level manager, to a meeting with the customer to identify the service level requirements. The service level requirements are as follows:

  • The service will be used at several locations worldwide, so it must be available 24/7, five days a week. Any single failure leading to loss of service at a site must be fixed within 30 minutes, and it will not break more than twice in any one week.
  • The maximum number of people using the service at any one time is likely to be around 3,000. Because the users are spread across various sites, demand will not fluctuate much throughout the day. The response time of the system must be less than two seconds.
  • As the service provider, you need to have appropriate disaster recovery facilities in place in the event that you have a disaster. As the customer, in the event that we have a disaster and want to relocate our staff, we require the ability for the service to be accessible at alternative locations in less than one hour.
  • The data protection act requires us to protect our client’s information. All invoice data must be protected and must be restorable in the event of a cyber-attack. Our data must be backed up and recoverable in the event that you suffer a security breach.

Charlie takes these requirements and documents them as the SLRs(Service Level Requirements). She follows a process flow like the one in Figure 12-2. Charlie reviews the new requirements and compares them with the targets in the existing OLA(Operational Level Agreement) and UC(Underpinning Contract) to see whether there are other services that are delivered to these service levels. If it turns out that these service levels have not been offered before, or there is doubt as to whether they can be achieved, then the technical designers must be consulted.

Well, Dummy Co.’s IT department has just one technical architect, and he is called Fred. Fred does what he always does when he has a new design project to get to grips with: he looks at the architecture of the infrastructure. He looks to see whether there is enough network equipment to cope with the additional load that the new service will need. He looks at the servers in the data centre to see whether there are enough of the right type. He looks at the data storage to see how much is in use. Pretty much what any technical architect would do.

However, Fred has many roles. For the projects that he undertakes, he performs the roles of availability manager, capacity manager, security manager and IT service continuity manager. So when he is reviewing the infrastructure, he takes the SLRs and looks at the requirements from the four warranty points of view. In fact, Fred takes the SLRs and converts them into detailed requirements for availability, capacity, continuity and security, and then considers the requirements in his musings. The work that Fred is doing is sometimes called a capability review. The results of this are that Fred produces an outline design and proposal of how the new requirements will be met. This of course comes at a cost, so Fred will obtain costs and quotes for any additional resources that are needed.

The main content of Fred’s proposal is as follows:

  • To fulfil the 24/7 availability requirement, additional server equipment will be required to provide the necessary resilience. The network currently operates at this level, and no additional resilience is required.
  • To meet the capacity and performance requirement, additional data storage will be added to the existing storage area network. This will also contribute towards the security requirement.
  • In addition, to meet the security requirement, additional data backups are required, and an additional off-site storage facility is needed.
  • The continuity requirement can be met with the existing recovery facility, but the recovery plans must be updated to ensure that the new service is recovered within the target time.

Fred sends the proposal back to Charlie. The next step is for Dummy Co. to decide whether it wants to spend the money required by Fred’s proposal in order to upgrade the infrastructure, agree to the customer’s requirements and win the business. This involves Charlie, Henry and Dummy Co. senior management. The decision is made to go ahead. Hurrah!

Henry and Charlie arrange to see the customer, and get into negotiation about the finer points of the SLA and, of course, the commercial stuff. Once the SLA is agreed, the detailed design work is started and Charlie liaises with Fred and others in the technical management team to help with any issues as they arise. When the design work is complete, Fred creates an SDP(Service Design Package). This marks the end of the design project. Of course the next step is to build, test and implement the solution – but that’s another story that comes under service transition, which I cover in Chapters 7 and 13.

Figure 12-5 provides an overview of the example in this section. It is not complete as I need a much bigger piece of paper than a page of this book to show it all, but hopefully it gives you an idea.

A swimlane flow diagram like the one in Figure 12-5 is a really good way to visualise how the processes will work in your organisation.

figure 12-5

PS: This article is from book <ITIL For Dummies, 2011 Edition>.

 

Categories: IT Architecture Tags: ,

checking MTU or Jumbo Frame settings with ping

February 14th, 2014 No comments

You may set your linux box's MTU to a jumbo frame size of 9000 bytes or larger, but if the switch your box is connected to does not have jumbo frames enabled, then your linux box may meet problems when sending & receiving packets.

So how can we get an idea of whether jumbo frames are enabled on the switch or the linux box?

Of course you can log on to the switch and check, but we can also verify this from the linux box that connects to the switch.

On linux box, you can see the MTU settings of each interface using ifconfig:

[root@centos-doxer ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 08:00:27:3F:C5:08
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:50502 errors:0 dropped:0 overruns:0 frame:0
TX packets:4579 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9835512 (9.3 MiB) TX bytes:1787223 (1.7 MiB)
Base address:0xd010 Memory:f0000000-f0020000

As stated above, 9000 here doesn't mean that jumbo frames are enabled from your box to the switch, as you can verify with the command below:

[root@testbox ~]# ping -c 2 -M do -s 1472 testbox2
PING testbox2.example.com (192.168.29.184) 1472(1500) bytes of data. #so here 1500 bytes go through the network
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=1 ttl=252 time=0.319 ms
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=2 ttl=252 time=0.372 ms

— testbox2.example.com ping statistics —
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.319/0.345/0.372/0.032 ms
[root@testbox ~]#
[root@testbox ~]#
[root@testbox ~]# ping -c 2 -M do -s 1473 testbox2
PING testbox2.example.com (192.168.29.184) 1473(1501) bytes of data. #so here 1501 bytes can not go through. From here we can see that MTU for this box is 1500, although ifconfig says it’s 9000
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)

— testbox2.example.com ping statistics —
0 packets transmitted, 0 received, +2 errors
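The payload sizes used above come from simple arithmetic: an ICMP echo request carries an 8-byte ICMP header plus a 20-byte IPv4 header, so a 1472-byte payload makes exactly a 1500-byte IP packet (which fits a 1500-byte MTU), while 1473 bytes makes 1501 bytes and would need fragmentation. By the same logic, to verify a true 9000-byte jumbo frame path you would ping with a payload of 9000 - 28 = 8972 bytes, e.g. ping -c 2 -M do -s 8972 testbox2.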

Also, if your switch is a Cisco one, you can verify whether the switch port connecting the server has jumbo frames enabled or not by sniffing CDP (Cisco Discovery Protocol) packets. Here's one example:

-bash-4.1# tcpdump -i eth0 -nn -v -c 1 ether[20:2] == 0x2000 #ether[20:2] == 0x2000 means capture only packets that have a 2 byte value of hex 2000 starting at byte 20
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
03:44:14.221022 CDPv2, ttl: 180s, checksum: 692 (unverified), length 287
Device-ID (0x01), length: 46 bytes: 'ucf-c1z3-swi-5k01b.ucf.oracle.com(SSI16010QJH)'
Address (0x02), length: 13 bytes: IPv4 (1) 192.168.0.242
Port-ID (0x03), length: 16 bytes: 'Ethernet111/1/12'
Capability (0x04), length: 4 bytes: (0x00000228): L2 Switch, IGMP snooping
Version String (0x05), length: 66 bytes:
Cisco Nexus Operating System (NX-OS) Software, Version 5.2(1)N1(4)
Platform (0x06), length: 11 bytes: 'N5K-C5548UP'
Native VLAN ID (0x0a), length: 2 bytes: 123
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
Duplex (0x0b), length: 1 byte: full
MTU (0x11), length: 4 bytes: 1500 bytes #so here MTU size was set to 1500 bytes
System Name (0x14), length: 18 bytes: 'ucf-c1z3-swi-5k01b'
System Object ID (not decoded) (0x15), length: 14 bytes:
0x0000: 060c 2b06 0104 0109 0c03 0103 883c
Management Addresses (0x16), length: 13 bytes: IPv4 (1) 10.131.144.17
Physical Location (0x17), length: 13 bytes: 0x00/snmplocation
1 packets captured
1 packets received by filter
0 packets dropped by kernel
110 packets dropped by interface

PS:

  1. As for the "-M do" parameter of ping, you may refer to man ping for more info. And as for DF (don't fragment) and Path MTU Discovery mentioned in the manpage, you may read more at http://en.wikipedia.org/wiki/Path_MTU_discovery and http://en.wikipedia.org/wiki/IP_fragmentation
  2. Here’s more on tcpdump tips http://dazdaztech.wordpress.com/2013/05/17/using-tcpdump-to-see-cdp-or-lldp-packets/ and http://the-welters.com/professional/tcpdump.html
  3. Maximum packet size is the MTU plus the data-link header length. Packets are not always transmitted at the maximum packet size, as we can see from the output of iptraf -z eth0.
  4. Here’s more about MTU:

The link layer, which is typically Ethernet, sends information into the network as a series of frames. Even though the layers above may have pieces of information much larger than the frame size, the link layer breaks everything up into frames to send them over the network. This maximum size of data in a frame is known as the maximum transfer unit (MTU). You can use network configuration tools such as ip or ifconfig to set the MTU.

The size of the MTU has a direct impact on the efficiency of the network. Each frame in the link layer has a small header, so using a large MTU increases the ratio of user data to overhead (header). When using a large MTU, however, each frame of data has a higher chance of being corrupted or dropped. For clean physical links, a high MTU usually leads to better performance because it requires less overhead; for noisy links, however, a smaller MTU may actually enhance performance because less data has to be re-sent when a single frame is corrupted.

 

ITIL – From incidents to problems to changes

February 12th, 2014 No comments

The following figure shows the path from incident to problem to change. Keep this figure in mind as you read an example of this journey.

from incidents to problems to changes

A user is innocently using her desktop system one day, and all of a sudden one of those dialog boxes pops up on the screen informing her she has a message. The message tells her that the spreadsheet software application has encountered error 3479 and can’t save the current spreadsheet to the user’s home folder on the network. It then gives a choice: ‘do you want to click on OK or Cancel?’ Now, this user lives in the ideal world of ITIL where everyone does things properly. So, what does the user do when faced with this error? She rings up the service desk of course.

The service desk analyst logs the call as an incident and goes through the incident management process. The analyst attempts initial diagnosis by searching the KEDB to look for a resolution, but doesn’t find one. Next the analyst escalates the incident to second-line support, not to investigate the problem but to search for a way to restore the service to the user. The incident is allocated to an engineer who manages to replicate the incident and advises the service desk analyst to tell the user to click cancel and save the file to a local folder instead of the network home folder. The service desk analyst contacts the user and the user saves the file locally. The user and service desk analyst agree that, because the service is now restored, the incident can be closed and no further action is required. The analyst tells the user to ring if it happens again. Are you happy with the story so far? It’s only happened once – IT systems are like that!

The next day the same error happens to other users and the service desk logs six incidents and links them together. In each case, the user has stored a file locally and gone away happy. However, would you like the IT department to do something more? It’s at this point that a problem record should be created. The service desk has the ability to do this, so a problem record is raised. The problem must now be allocated to an engineer or technical team to investigate and find the cause of the incidents. After some time, you hear a shout of ‘eureka!’ as the engineer identifies the root cause. The engineer creates a known error record documenting the cause and confirming that a suitable workaround is to save files locally. This information is passed to the service desk. This will hopefully make it easier for the service desk to deal with other incidents.

Now there is a decision to be made. The engineer who identified the root cause also identified a permanent solution. In this case, the cause was a change that had been made to the configuration of a number of servers to accommodate another system. This has had the effect of denying some users access to this server. The permanent solution is to make another change to the server. A decision must be made as to whether an RFC should be raised to implement the solution. Not all known errors will result in changes being made. Some will be too costly, some will be negated by new version updates. In this case, I’ll assume that the RFC is raised. It is now the role of the change management process (see Chapter 7) to manage the change through to conclusion.

When the change is implemented and reviewed, the change manager will close the change record. This will set off a game of knock down dominoes: the known error record and problem record will be closed, and if any incident records are open, these will be closed. The problem is resolved and no related incidents will happen again. And they all live happily ever after.

This is just one example, and I’m sure you can think of many more. It’s the principle that’s important: you must have a balanced approach to the sequence of activities.

PS:

This article is from book <ITIL For Dummies, 2011 Edition>.

Categories: IT Architecture Tags:

Oracle VM operations – poweron, poweroff, status, stat -r

January 27th, 2014 No comments

Here’s the script:

#!/usr/bin/perl
# 1. The OVM Manager must be running before any operation.
# 2. Run "ovm_vm_operation.pl <host> status" before running poweroff, poweron or stat-r,
#    since those operations read the cached status file that "status" produces.
use Net::SSH::Perl;

$host = $ARGV[0];
$operation = $ARGV[1];
$user = 'root';
$password = 'password';

if($host eq "help") {
    print "$0 OVM-name status|poweron|poweroff|stat-r\n";
    exit;
}

$ssh = Net::SSH::Perl->new($host);
$ssh->login($user, $password);

if($operation eq "status") {
    # Dump the VM list (minus the VM_test entries) into /var/tmp/<host>.status
    ($stdout, $stderr, $exit) = $ssh->cmd("ovm -uadmin -pwelcome1 vm ls|grep -v VM_test");
    open($host_fd, '>', "/var/tmp/${host}.status");
    select $host_fd;
    print $stdout;
    close $host_fd;
} elsif($operation eq "poweroff") {
    open($poweroff_fd, '<', "/var/tmp/${host}.status");
    foreach(<$poweroff_fd>){
        # Skip header lines and VMs that are already powered off
        if($_ =~ "Server_Pool|OVM|Powered") {
            next;
        }
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+([a-zA-Z]{1,})\s+(.*)/){
            $ssh->cmd("ovm -uadmin -pwelcome1 vm poweroff -n $1 -s $6");
            sleep 12;
        }
    }
} elsif($operation eq "poweron") {
    open($poweron_fd, '<', "/var/tmp/${host}.status");
    foreach(<$poweron_fd>){
        # Skip header lines and VMs that are already running
        if($_ =~ "Server_Pool|OVM|Running") {
            next;
        }
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+([a-zA-Z]{1,})\s+Off(.*)/){
            $ssh->cmd("ovm -uadmin -pwelcome1 vm poweron -n $1 -s $6");
            #print "ovm -uadmin -pwelcome1 vm poweron -n $1 -s $6";
            sleep 20;
        }
    }
} elsif($operation eq "stat-r") {
    open($poweroff_fd, '<', "/var/tmp/${host}.status");
    foreach(<$poweroff_fd>){
        # Refresh the status of VMs stuck in "Shutting Down" or "Initializing"
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+(Shutting\sDown|Initializing)\s+(.*)/){
            #print "ovm -uadmin -pwelcome1 vm stat -r -n $1 -s $6";
            $ssh->cmd("ovm -uadmin -pwelcome1 vm stat -r -n $1 -s $6");
            sleep 1;
        }
    }
}

You can use the following to make the script run in parallel:

for i in <all OVMs>;do (./ovm_vm_operation.pl $i status &);done

Categories: Clouding, IT Architecture, Oracle Cloud, Perl Tags:

avoid putty ssh connection sever or disconnect

January 17th, 2014 2 comments

After some time of inactivity, an ssh session may disconnect itself. If you want to avoid this, you can try running the following command:

while [ 1 ];do echo hi;sleep 60;done &

This will print the message “hi” to standard output every 60 seconds, which keeps the session active.

PS:

You can also set some parameters in /etc/ssh/sshd_config; refer to http://www.doxer.org/learn-linux/make-ssh-on-linux-not-to-disconnect-after-some-certain-time/ for details.
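For reference, here is a minimal sketch of the usual OpenSSH keepalive settings (the interval and count values are only examples and should be tuned to your environment):

# server side: /etc/ssh/sshd_config
ClientAliveInterval 60      # ask the client for a response after 60 seconds of inactivity
ClientAliveCountMax 5       # close the session after 5 unanswered requests

# client side: ~/.ssh/config for OpenSSH clients
ServerAliveInterval 60

# restart sshd after changing the server-side settings
service sshd restart

For PuTTY itself, the equivalent client-side option is “Seconds between keepalives” under Connection in the configuration dialog.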

Categories: Linux, SHELL, Unix Tags:

“Include snapshots” made NFS shares from ZFS appliance shrinking

January 17th, 2014 No comments

Today I met a weird issue when checking an NFS share mounted from a ZFS appliance. As the space on that filesystem was getting low, I removed some files, but the NFS filesystem mounted on the client kept shrinking. What confused me was that the filesystem’s total size was getting smaller! Shouldn’t the free space get larger while the size stays unchanged?

After some debugging, I found that this was caused by the ZFS appliance share option “Include snapshots”. With that option checked, space held by snapshots is charged against the share, so deleting files that are still referenced by snapshots reduces the reported used space without increasing the available space, and the total size seen by the NFS client (used plus available) appears to shrink. When I unchecked “Include snapshots”, the issue was gone!

zfs-appliance

Categories: Hardware, NAS, Storage Tags:

Service Lifecycle in ITIL and applying the service lifecycle to IT projects

January 15th, 2014 No comments
service-lifecycle

service-lifecycle-in-development-project

More:

Here’s a brief description of each activity of a typical project and its relation to the service lifecycle:

- Business case and project initiation: You use a business case to justify the cost and effort involved in providing the new service or changing an existing service. The business case triggers the project initiation. These activities happen at the service strategy stage.

- Requirements gathering and analysis: You identify and analyse the detailed requirements of the service or change. These activities happen in the service design stage.

- Design: You produce a design of the service that meets the requirements. This is usually a paper-based design at this point. These activities take place in the service design stage.

- Build: The physical bit where you acquire the solution, such as building the hardware, the servers and networks, or programming the software application. These activities happen in the service transition stage.

- Test: Testing the service is essential to ensure it meets the needs of the business, works in the way you expected, and can be supported. These activities also take place during the service transition stage.

- Implement or deploy: Launching the new or changed service into the live operational environment. This takes place during the service transition stage.

- Deliver and support: The service is now in the live or production environment and is being used by the users. The IT organisation must make sure the service is working and fix it quickly when it goes wrong. These activities take place during the service operation stage.

- Improve: After a service has been operated for some time, it’s often possible to optimise or improve the way it’s delivered. These activities are part of the CSI stage.

PS: This is from book <ITIL For Dummies, 2011 Edition>.

Categories: IT Architecture Tags:

roles in ITIL

January 14th, 2014 No comments

ITIL is best practice guidance and describes capabilities for managing IT services. The four main elements of these capabilities are: processes, functions, roles and the service lifecycle.

Implementing ITIL in your organisation means using the ITIL service management processes. Processes provide a documented way of doing things. When adopting the ITIL processes you must consider who is going to do what. This brings me to functions. Functions are organisational units, like a team or department, and are the source of the people who perform the process activities. ITIL offers some advice about the functions you may have in your organisation.

The combination of processes, functions and roles allows you to make best use of service management in your organisation.

Knowing who does what is essential to the success of ITIL. The ITIL books suggest lots of roles associated with each process, and I give a brief description of them in the relevant chapters. However, you benefit from knowing a few really important roles from the outset.

processes_functions_roles

The service owner

The service owner owns a service. The service owner is usually someone in the IT provider organisation, and the role provides a point of contact for a given service. The service owner doesn’t necessarily know everything about the service, but he does know a man (or woman) who does.

Here are some responsibilities of the service owner role:

- Participates in internal service review meetings

- Represents the service across the organisation

- Represents the service in change advisory board meetings

- Is responsible for continual improvement of the service and management of change in the service

- Understands the service and its components

The process owner

A process owner owns a process. This role is accountable for the process. For example, if the incident management process doesn’t achieve its aim of restoring the service to the user, the process owner gets shouted at (hopefully not literally). The process owner is accountable for the process and is responsible for identifying improvements to ensure that the process continues to be effective and efficient. Here are a few responsibilities of the role:

- Ensuring that the process is performed in accordance with the agreed and documented process

- Documenting and publicising the process

- Defining and reviewing the measurement of the process using metrics such as key performance indicators (KPIs)

Remember: You must ensure that every service management process you adopt has a defined process owner.

The process manager

A process owner (see the previous section) is accountable for the process, but may not get involved in the day-to-day management of the process. This is a separate role often allocated to a different person: the process manager.

ITIL definition: A process manager is responsible for operational management of a process. The process manager’s responsibilities include planning and coordination of all activities required to carry out, monitor and report on the process.

One process may have several process managers; for example, it may have regional change managers or IT service continuity managers for each data centre.

Remember: You must ensure that every service management process that you adopt has a defined process manager – though this may, of course, be the same person as the process owner.

The process practitioner

The process practitioner is the role that carries out one or many of the process activities. Basically, these people are the ones who do the work. However, it’s important that they have a clear list of responsibilities related to the process that they get involved in.

Note:

This is from book <ITIL For Dummies, 2011 Edition>.

Categories: IT Architecture Tags: ,