Archive for the ‘Network’ Category

resolved – net/core/dev.c:1894 skb_gso_segment+0x298/0x370()

April 19th, 2016 Comments off

Today on one of our servers, there were a lot of errors in /var/log/messages like below:

║Apr 14 21:50:25 test01 kernel: WARNING: at net/core/dev.c:1894
║skb_gso_segment+0x298/0x370()
║Apr 14 21:50:25 test01 kernel: Hardware name: SUN FIRE X4170 M3
║Apr 14 21:50:25 test01 kernel: : caps=(0x60014803, 0x0) len=255
║data_len=215 ip_summed=1
║Apr 14 21:50:25 test01 kernel: Modules linked in: dm_nfs nfs fscache
║auth_rpcgss nfs_acl xen_blkback xen_netback xen_gntdev xen_evtchn lockd
║sunrpc 8021q garp bridge stp llc bonding be2iscsi iscsi_boot_sysfs ib_iser
║rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio
║ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp dm_round_robin libiscsi
║dm_multipath scsi_transport_iscsi xenfs xen_privcmd dm_mirror video sbs sbshc
║acpi_memhotplug acpi_ipmi ipmi_msghandler parport_pc lp parport sr_mod cdrom
║ixgbe hwmon dca snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
║snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
║snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr ghes i2c_i801 hed i2c_core
║dm_region_hash dm_log dm_mod usb_storage ahci libahci sg shpchp megaraid_sas
║sd_mod crc_t10dif ext3 jbd mbcache
║Apr 14 21:50:25 test01 kernel: Pid: 0, comm: swapper Tainted: G W
║ 2.6.39-400.264.4.el5uek #1
║Apr 14 21:50:25 test01 kernel: Call Trace:
║Apr 14 21:50:25 test01 kernel: <IRQ> [<ffffffff8143dab8>] ?
║skb_gso_segment+0x298/0x370
║Apr 14 21:50:25 test01 kernel: [<ffffffff8106f300>]
║warn_slowpath_common+0x90/0xc0
║Apr 14 21:50:25 test01 kernel: [<ffffffff8106f42e>]
║warn_slowpath_fmt+0x6e/0x70
║Apr 14 21:50:25 test01 kernel: [<ffffffff810d73a7>] ?
║irq_to_desc+0x17/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff812faf0c>] ?
║notify_remote_via_irq+0x2c/0x40
║Apr 14 21:50:25 test01 kernel: [<ffffffff8100a820>] ?
║xen_clocksource_read+0x20/0x30
║Apr 14 21:50:25 test01 kernel: [<ffffffff812faf4c>] ?
║xen_send_IPI_one+0x2c/0x40
║Apr 14 21:50:25 test01 kernel: [<ffffffff81011f10>] ?
║xen_smp_send_reschedule+0x10/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff81056e0b>] ?
║ttwu_queue_remote+0x4b/0x60
║Apr 14 21:50:25 test01 kernel: [<ffffffff81509a7e>] ?
║_raw_spin_unlock_irqrestore+0x1e/0x30
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dab8>]
║skb_gso_segment+0x298/0x370
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dba6>]
║dev_gso_segment+0x16/0x50
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dfb5>]
║dev_hard_start_xmit+0x3d5/0x530
║Apr 14 21:50:25 test01 kernel: [<ffffffff8145a074>]
║sch_direct_xmit+0xc4/0x1d0
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143e811>]
║dev_queue_xmit+0x161/0x410
║Apr 14 21:50:25 test01 kernel: [<ffffffff815099de>] ?
║_raw_spin_lock+0xe/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045820c>]
║br_dev_queue_push_xmit+0x6c/0xa0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81076e77>] ?
║local_bh_enable+0x27/0xa0
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045e7ba>]
║br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045f668>]
║br_nf_post_routing+0x1f8/0x2e0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ?
║br_forward_finish+0x70/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ?
║br_forward_finish+0x70/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458186>]
║br_forward_finish+0x56/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045eba4>]
║br_nf_forward_finish+0xb4/0x180 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045f36f>]
║br_nf_forward_ip+0x26f/0x370 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] ?
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04582c8>]
║__br_forward+0x88/0xc0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458356>]
║br_forward+0x56/0x60 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04591fc>]
║br_handle_frame_finish+0x1ac/0x240 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045ee1b>]
║br_nf_pre_routing_finish+0x1ab/0x350 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff8115bfe9>] ?
║kmem_cache_alloc_trace+0xc9/0x1a0
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045fc55>]
║br_nf_pre_routing+0x305/0x370 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff8100122a>] ?
║xen_hypercall_xen_version+0xa/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130

To fix this, we should disable LRO(large receive offload) first:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i lro off;done

And if the NICs are Intel 10G cards, then we should disable GRO(generic receive offload) too:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;done

Here's the command to disable both LRO and GRO:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;/sbin/ethtool -K $i lro off;done
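To confirm that the change took effect, you can read the offload state back with ethtool -k (lowercase k). Below is a minimal sketch; offload_state is a hypothetical helper name of mine, and it assumes your ethtool prints lines such as "large-receive-offload: off", which may differ between versions:

```shell
# Print the state ("on" or "off") of one offload feature.
# Reads `ethtool -k <iface>` output on stdin; $1 is the feature name as
# ethtool prints it (assumed here: large-receive-offload, generic-receive-offload).
offload_state() {
    awk -v feat="$1" -F':' '
        { name = $1; gsub(/^[ \t]+/, "", name) }       # strip any indentation
        name == feat { split($2, w, " "); print w[1]; exit }  # drop a trailing "[fixed]"
    '
}
```

For example, "/sbin/ethtool -k eth0 | offload_state generic-receive-offload" should print off after the loop above has run.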

 

change NIC configuration to make new VLAN tag take effect

April 2nd, 2015 Comments off

Sometimes you need to add a VLAN tag to an existing NIC, and afterwards point the DNS names bound to the old tag at new IPs in the newly added VLAN. Once these two steps are done, you'll need to make changes on the hosts (Linux in this example) for them to take effect.

In this example, I'm going to move the old v118_FE to the new VLAN v117_FE.

ifconfig v118_FE down
ifconfig bond0.118 down
cd /etc/sysconfig/network-scripts
mv ifcfg-bond0.118 ifcfg-bond0.117
vi ifcfg-bond0.117
    DEVICE=bond0.117
    BOOTPROTO=none
    USERCTL=no
    ONBOOT=yes
    BRIDGE=v117_FE
    VLAN=yes
mv ifcfg-v118_FE ifcfg-v117_FE
vi ifcfg-v117_FE
    DEVICE=v117_FE
    BOOTPROTO=none
    USERCTL=no
    ONBOOT=yes
    STP=off
    TYPE=Bridge
    IPADDR=10.119.236.13
    NETMASK=255.255.248.0
    NETWORK=10.119.232.0
    BROADCAST=10.119.239.255
ifup v117_FE
ifup bond0.117
reboot
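A typo in NETWORK or BROADCAST is easy to make when editing ifcfg files by hand, so it can help to derive both values from IPADDR and NETMASK before running ifup. Here is a small sketch; the helper names (ip_to_int, int_to_ip, network_of, broadcast_of) are my own and not part of any network-scripts tooling:

```shell
# Convert dotted-quad A.B.C.D to a 32-bit integer (subshell keeps IFS local).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# network_of IP MASK: bitwise AND of address and netmask.
network_of() (
    ip=$(ip_to_int "$1"); mask=$(ip_to_int "$2")
    int_to_ip $(( ip & mask ))
)

# broadcast_of IP MASK: network address with all host bits set.
broadcast_of() (
    ip=$(ip_to_int "$1"); mask=$(ip_to_int "$2")
    int_to_ip $(( (ip & mask) | (~mask & 0xFFFFFFFF) ))
)
```

With the values from ifcfg-v117_FE, "network_of 10.119.236.13 255.255.248.0" gives 10.119.232.0 and "broadcast_of 10.119.236.13 255.255.248.0" gives 10.119.239.255, matching the file above.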

resolved – VPN Service not available, The VPN agent service is not responding. Please restart this application after a while.

March 30th, 2015 13 comments

Today when I tried to connect to VPN through Cisco AnyConnect Secure Mobility Client, the following error dialog prompted:

 

VPN Service not available

And after I clicked "OK" button, the following dialog prompted:

The VPN agent service is not responding

Both dialogs complained that the "VPN service" was not available or not responding. So I ran "services.msc" from the Windows Run dialog and found the following:

vpn service

When I checked, the service "Cisco AnyConnect Secure Mobility Agent" was stopped, and its "Startup type" was "Manual". So I changed "Startup type" to "Automatic", clicked "Start", then "OK" to save.

After this, Cisco AnyConnect Secure Mobility Client ran fine and I could connect to the VPN through it.

TCP Window Scaling – values about TCP buffer size

February 4th, 2015 Comments off

TCP Window Scaling(TCP socket buffer size, TCP window size)

/proc/sys/net/ipv4/tcp_window_scaling #1 is to enable window scaling
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP rcv buffers. minimum, initial and maximum buffer size
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP send buffers
/proc/sys/net/core/rmem_max - maximum receive window
/proc/sys/net/core/wmem_max - maximum send window

The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller:

echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf #autotuning enabled. The receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
echo 108544 > /proc/sys/net/core/wmem_max
echo 108544 > /proc/sys/net/core/rmem_max
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
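The BDP (bandwidth-delay product) mentioned above is simply the link bandwidth multiplied by the round-trip time, and it tells you how large the maximum buffer needs to be to keep the path full. A quick sketch of the arithmetic, where bdp_bytes is a hypothetical helper name and the inputs are example figures rather than measurements:

```shell
# bdp_bytes MBIT_PER_SEC RTT_MS
# Bandwidth-delay product in bytes: (bits per second / 8) * (RTT in seconds).
bdp_bytes() {
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}
```

For instance, "bdp_bytes 1000 32" (a 1 Gbit/s path with a 32 ms RTT) prints 4000000, i.e. roughly the 4MB limit that the third field of tcp_rmem/tcp_wmem above is sized for.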

Advanced TCP features

cat /proc/sys/net/ipv4/tcp_timestamps #more is here(allow more accurate RTT measurements for deriving the retransmission timeout estimator; protect against old segments from the previous incarnations of the TCP connection; allow detection of unnecessary retransmissions. But enabling it will also allow you to guess the uptime of a target system.)
cat /proc/sys/net/ipv4/tcp_sack

Here is some background knowledge:

  • The throughput of a communication is limited by two windows: the congestion window and the receive window(TCP congestion window is maintained by the sender, and TCP window size is maintained by the receiver). The former tries not to exceed the capacity of the network (congestion control) and the latter tries not to exceed the capacity of the receiver to process data (flow control). The receiver may be overwhelmed by data if for example it is very busy (such as a Web server). Each TCP segment contains the current value of the receive window. If for example a sender receives an ack which acknowledges byte 4000 and specifies a receive window of 10000 (bytes), the sender will not send packets after byte 14000, even if the congestion window allows it.
  • TCP uses what is called the "congestion window", or CWND, to determine how many packets can be sent at one time. The larger the congestion window size, the higher the throughput. The TCP "slow start" and "congestion avoidance" algorithms determine the size of the congestion window. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which can be changed by the program using a system library call just before opening the socket. There is also a kernel enforced maximum buffer size. The buffer size can be adjusted for both the send and receive ends of the socket.
  • To get maximal throughput it is critical to use optimal TCP send and receive socket buffer sizes for the link you are using. If the buffers are too small, the TCP congestion window will never fully open up. If the receiver buffers are too large, TCP flow control breaks and the sender can overrun the receiver, which will cause the TCP window to shut down. This is likely to happen if the sending host is faster than the receiving host. Overly large windows on the sending side are not usually a problem as long as you have excess memory; note, however, that every TCP socket has the potential to request this amount of memory even for short connections, making it easy to exhaust system resources.
  • More about TCP Buffer Sizing is here.
  • More about /proc/sys/net/ipv4/* Variables is here.
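The interplay of the two windows in the first bullet can be written down in a couple of lines: the sender may transmit up to the acknowledged byte plus the smaller of the receive window and the congestion window. This is only an illustration of the arithmetic (send_limit is a made-up name), not how the kernel implements it:

```shell
# send_limit ACKED_BYTE RWND CWND
# Highest byte offset the sender may transmit: ack + min(rwnd, cwnd).
send_limit() (
    win=$2
    if [ "$3" -lt "$win" ]; then win=$3; fi
    echo $(( $1 + win ))
)
```

Using the numbers from the example above, "send_limit 4000 10000 20000" prints 14000: the receive window caps the sender even though the congestion window would allow more.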

resolved – high value of RX overruns in ifconfig

November 13th, 2014 Comments off

Today we tried to ssh to a host, and the session hung soon after connecting. We observed that the RX overruns count in the ifconfig output was high:

[root@test /]# ifconfig bond0
bond0 Link encap:Ethernet HWaddr 00:10:E0:0D:AD:5E
inet6 addr: fe80::210:e0ff:fe0d:ad5e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:140234052 errors:0 dropped:0 overruns:12665 frame:0
TX packets:47259596 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:34204561358 (31.8 GiB) TX bytes:21380246716 (19.9 GiB)

Receiver overruns usually occur when packets come in faster than the kernel can service the last interrupt. In our case, however, we saw increasing inbound errors on interface Eth105/1/7 of device bcd-c1z1-swi-5k07a/b; we did a shut and no shut on the port, but nothing changed. After some more debugging, we found a bad SFP/cable (on the uplink port, e.g. WAN port; more info here). After replacing it, the server came back to normal.

example-swi-5k07a# sh int Eth105/1/7 | i err
4065 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 output error 0 collision 0 deferred 0 late collision

example-swi-5k07a# sh int Eth105/1/7 | i err
4099 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 output error 0 collision 0 deferred 0 late collision

example-swi-5k07a# sh int Eth105/1/7 counters errors

--------------------------------------------------------------------------------
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth105/1/7 3740 483 0 4223 0 0

--------------------------------------------------------------------------------
Port Single-Col Multi-Col Late-Col Exces-Col Carri-Sen Runts
--------------------------------------------------------------------------------
Eth105/1/7 0 0 0 0 0 3740

--------------------------------------------------------------------------------
Port Giants SQETest-Err Deferred-Tx IntMacTx-Er IntMacRx-Er Symbol-Err
--------------------------------------------------------------------------------
Eth105/1/7 0 -- 0 0 0 0

example-swi-5k07a# sh int Eth105/1/7 counters errors

--------------------------------------------------------------------------------
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth105/1/7 4386 551 0 4937 0 0

--------------------------------------------------------------------------------
Port Single-Col Multi-Col Late-Col Exces-Col Carri-Sen Runts
--------------------------------------------------------------------------------
Eth105/1/7 0 0 0 0 0 4386

--------------------------------------------------------------------------------
Port Giants SQETest-Err Deferred-Tx IntMacTx-Er IntMacRx-Er Symbol-Err
--------------------------------------------------------------------------------
Eth105/1/7 0 -- 0 0 0 0

PS:

During debugging, we also found that on the server side, interface eth0 was running at half duplex and 100Mb/s:

-bash-3.2# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 100Mb/s
Duplex: Half
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000003 (3)
Link detected: yes

However, it should be full duplex and 1000Mb/s. So we also set both speed and duplex to auto on the switch, and after that the OS side showed the expected values:

-bash-3.2# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000003 (3)
Link detected: yes
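If you have many interfaces to check, pulling just the negotiated speed and duplex out of the ethtool output saves some scrolling. A minimal sketch, assuming the "Speed:" and "Duplex:" line format shown above; link_mode is a hypothetical helper name:

```shell
# Print "<speed> <duplex>" from `ethtool <iface>` output read on stdin.
link_mode() {
    awk -F': *' '
        /^[ \t]*Speed:/  { speed = $2 }
        /^[ \t]*Duplex:/ { duplex = $2 }
        END { print speed, duplex }
    '
}
```

Something like "for i in eth0 eth1; do echo "$i $(ethtool $i | link_mode)"; done" then makes a stray "100Mb/s Half" easy to spot.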

arping in linux for getting MAC address and update ARP caches by broadcast

August 27th, 2014 Comments off

Suppose we want to know the MAC address of 10.182.120.210. We can log on to a Linux host in the same subnet as 10.182.120.210, e.g. 10.182.120.188:

[root@centos-doxer ~]#arping -U -c 3 -I bond0 -s 10.182.120.188 10.182.120.210
ARPING 10.182.120.210 from 10.182.120.188 bond0
Unicast reply from 10.182.120.210 [00:21:CC:B7:1F:EB] 1.397ms
Unicast reply from 10.182.120.210 [00:21:CC:B7:1F:EB] 1.378ms
Sent 3 probes (1 broadcast(s))
Received 2 response(s)

So 00:21:CC:B7:1F:EB is the MAC address of 10.182.120.210. From this we can also see that the IP address 10.182.120.210 is currently in use on the local network.
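When scripting this, the MAC can be extracted from the reply lines. The sketch below assumes the "reply from <ip> [<mac>]" format shown in the output above; arping_macs is a name I made up. If it prints more than one MAC for a single IP, two machines are answering for the same address:

```shell
# Print the distinct MAC addresses seen in arping reply lines on stdin.
arping_macs() {
    sed -n 's/.*reply from [^ ]* \[\([0-9A-Fa-f:]*\)].*/\1/p' | sort -u
}
```

Usage would be along the lines of "arping -c 3 -I bond0 10.182.120.210 | arping_macs".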

Another use of arping is to update ARP caches. One scenario: you assign a new machine an IP address that is already in use, and then you are no longer able to log on to the old machine with that IP address. Even after you shut down the new machine, you may still be unable to access the old machine. Here's the resolution:

Suppose we have configured the new machine NIC eth0 with IP address 192.168.0.2 which is already used by one old machine. Log on the new machine and run the following commands:

arping -A -I eth0 192.168.0.2 #-A takes no argument; this announces 192.168.0.2 with ARP REPLY packets
arping -U -s 192.168.0.2 -I eth0 192.168.0.1 #this is sending ARP broadcast, and 192.168.0.1 is the gateway address.
/sbin/arping -I eth0 -c 3 -s 192.168.0.2 192.168.0.3 #update neighbours' ARP caches

PS:

  1. You can run 'arp -nae'(linux) or 'arp -a'(windows) to get the ARP table.
  2. Here is more about ARP spoofing prevention (in Chinese; covers static binding, ARP firewall, small VLANs, PPPoE and immune networks).
  3. Here is more about Proxy ARP (joining a broadcast LAN with a serial link on a router).

resolved – check backend OHS httpd servers for BIG ip F5 LTM VIP

May 23rd, 2014 Comments off

Assume you want to check which OHS or httpd servers an LTM VIP, say example.vip.com, is routing traffic to. Here are the steps:

  1. Get the IP address of the VIP example.vip.com.
  2. Log on to the LTM's BUI, go to Local Traffic -> Virtual Servers -> Virtual Server List, and search for the IP.
  3. Click "Edit" under the "Resource" column.
  4. Note down the default pool.
  5. Search for the pool name in Local Traffic -> Virtual Servers -> Pools -> Pool List.
  6. Click the number under the "Members" column. You'll then find the OHS servers and ports the VIP routes traffic to.

PS:

  • To check connections including one specific IP, run below
    • show /sys connection |grep -w <IP>

test telnet from VLAN on cisco router device

May 22nd, 2014 Comments off

If you want to test a telnet/ping connection from one VLAN to a specific destination IP, here is how:

test-router# telnet 10.200.244.14 80 source vlan 125
Trying 10.200.244.14...
Connected to 10.200.244.14.
Escape character is '^]'.

ucf-c1z1-rtr-1# ping 10.180.220.71 source 10.180.200.2

PING 10.180.220.71 (10.180.220.71) from 10.240.200.2: 56 data bytes
Request 0 timed out
Request 1 timed out

Good luck.

tcpdump & wireshark tips

March 13th, 2014 Comments off

tcpdump [ -AdDefIKlLnNOpqRStuUvxX ] [ -B buffer_size ] [ -c count ]
        [ -C file_size ] [ -G rotate_seconds ] [ -F file ]
        [ -i interface ] [ -m module ] [ -M secret ]
        [ -r file ] [ -s snaplen ] [ -T type ] [ -w file ]
        [ -W filecount ]
        [ -E spi@ipaddr algo:secret,... ]
        [ -y datalinktype ] [ -z postrotate-command ] [ -Z user ] [ expression ]

#general format of a tcp protocol line

src > dst: flags data-seqno ack window urgent options
Src and dst are the source and destination IP addresses and ports.
Flags are some combination of S (SYN), F (FIN), P (PUSH), R (RST), W (ECN CWR) or E (ECN-Echo), or a single '.'(means no flags were set)
Data-seqno describes the portion of sequence space covered by the data in this packet.
Ack is sequence number of the next data expected the other direction on this connection.
Window is the number of bytes of receive buffer space available the other direction on this connection.
Urg indicates there is 'urgent' data in the packet.
Options are tcp options enclosed in angle brackets (e.g., <mss 1024>).

tcpdump -D #list of the network interfaces available
tcpdump -e #Print the link-level header on each dump line
tcpdump -S #Print absolute, rather than relative, TCP sequence numbers
tcpdump -s <snaplen> #Snarf snaplen bytes of data from each packet rather than the default of 65535 bytes
tcpdump -i eth0 -S -nn -XX vlan
tcpdump -i eth0 -S -nn -XX arp
tcpdump -i bond0 -S -nn -vvv udp dst port 53
tcpdump -i bond0 -S -nn -vvv host testhost
tcpdump -nn -S -vvv "dst host host1.example.com and (dst port 1521 or dst port 6200)"

tcpdump -vv -x -X -s 1500 -i eth0 'port 25' #traffic on SMTP. -xX to print data in addition to header in both hex/ASCII. use -s 192 to watch NFS traffic(NFS requests are very large and much of the detail won't be printed unless snaplen is increased).

tcpdump -nn -S udp dst port 111 #note that telnet is based on tcp protocol, NOT udp. So if you want to test UDP connection(udp is connection-less), then you must start up the app, then use tcpdump to test.

tcpdump -nn -S udp dst portrange 1-1023

Wireshark Capture Filters (in Capture -> Options)

Wireshark DisplayFilters (in toolbar)

Here is another example of the TCP 3-way handshake, the 4-way handshake, and SYN flood

EVENT DIAGRAM
Host A sends a TCP SYNchronize packet to Host B
Host B receives A's SYN
Host B sends a SYNchronize-ACKnowledgement
Host A receives B's SYN-ACK
Host A sends ACKnowledge
Host B receives ACK.
TCP socket connection is ESTABLISHED.

[Figure: TCP three-way handshake (SYN, SYN-ACK, ACK)]

[Figure: TCP connection termination states (TCP-CLOSE_WAIT)]

The upper part shows the states on the end-point initiating the termination.

The lower part shows the states on the other end-point.

So the initiating end-point (i.e. the client) sends a termination request to the server and waits for an acknowledgement in state FIN-WAIT-1. The server sends an acknowledgement and goes in state CLOSE_WAIT. The client goes into FIN-WAIT-2 when the acknowledgement is received and waits for an active close. When the server actively sends its own termination request, it goes into LAST-ACK and waits for an acknowledgement from the client. When the client receives the termination request from the server, it sends an acknowledgement and goes into TIME_WAIT and after some time into CLOSED. The server goes into CLOSED state once it receives the acknowledgement from the client.

A socket can stay in CLOSE_WAIT indefinitely until the application closes it. Typical faulty scenarios are a file descriptor leak, or a server that never calls close() on the socket, which leads to a pile-up of CLOSE_WAIT sockets. At the Java level, this manifests as a "Too many open files" error. There is no timeout value to tune for CLOSE_WAIT.

TIME_WAIT is just a time-based wait on the socket before the connection is closed down permanently. Under most circumstances, sockets in TIME_WAIT are nothing to worry about. The value can be changed (tcp_time_wait_interval, a Solaris parameter).

More info about time_wait & close_wait can be found here.

PS:

You can refer to this article for a detailed explanation of tcp three-way handshake establishing/terminating a connection. And for tcpdump one, you can check below:

[root@host2 ~]# telnet host1 14100
Trying 10.240.249.139...
Connected to host1.us.oracle.com (10.240.249.139).
Escape character is '^]'.
^]
telnet> quit
Connection closed.

[root@host1 ~]# tcpdump -vvv -S host host2
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
03:16:39.188951 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 60) host1.us.oracle.com.14100 > host2.us.oracle.com.18890: S, cksum 0xa806 (correct), 3445765853:3445765853(0) ack 3946095098 win 5792 <mss 1460,sackOK,timestamp 854077220 860674218,nop,wscale 7> #2. host1 acks host2's SYN by adding 1 to its sequence number (ack 3946095098), and sends its own SYN (3445765853).
03:16:41.233807 IP (tos 0x0, ttl 64, id 6650, offset 0, flags [DF], proto: TCP (6), length: 52) host1.us.oracle.com.14100 > host2.us.oracle.com.18890: F, cksum 0xdd48 (correct), 3445765854:3445765854(0) ack 3946095099 win 46 <nop,nop,timestamp 854079265 860676263> #5. host1 acks host2's FIN (ack 3946095099), and then sends its own FIN just as host2 did (seq 3445765854 unchanged).

[root@host2 ~]# tcpdump -vvv -S host host1
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
03:16:39.188628 IP (tos 0x10, ttl 64, id 31059, offset 0, flags [DF], proto: TCP (6), length: 60) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: S, cksum 0x265b (correct), 3946095097:3946095097(0) win 5792 <mss 1460,sackOK,timestamp 860674218 854045985,nop,wscale 7> #1. host2 sends a SYN packet to host1 (seq 3946095097)
03:16:39.188803 IP (tos 0x10, ttl 64, id 31060, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: ., cksum 0xed44 (correct), 3946095098:3946095098(0) ack 3445765854 win 46 <nop,nop,timestamp 860674218 854077220> #3. host2 acks the SYN sent by host1 by adding 1 to its sequence number (ack 3445765854). The TCP connection is now established (seq 3946095098 unchanged).
03:16:41.233397 IP (tos 0x10, ttl 64, id 31061, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: F, cksum 0xe546 (correct), 3946095098:3946095098(0) ack 3445765854 win 46 <nop,nop,timestamp 860676263 854077220> #4. host2 sends a FIN with an ACK. The FIN informs host1 that host2 has no more data to send (seq 3946095098 unchanged), and the ack is used to identify the connection previously established (3445765854 unchanged).
03:16:41.233633 IP (tos 0x10, ttl 64, id 31062, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: ., cksum 0xdd48 (correct), 3946095099:3946095099(0) ack 3445765855 win 46 <nop,nop,timestamp 860676263 854079265> #6. host2 acks host1's FIN (ack 3445765855); no other flag is set ('.'), and its own seq is now 3946095099.
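The "+1" in the annotations follows a simple rule: SYN and FIN each consume one sequence number, and payload bytes consume one each. The expected ack number can therefore be computed from what the peer sent. Here is a sketch of that rule; next_ack is a hypothetical name, and the flags argument uses the tcpdump letters:

```shell
# next_ack SEQ PAYLOAD_LEN FLAGS
# Ack number the peer should send back: seq + payload length,
# plus 1 if the segment carried SYN (S) or FIN (F), modulo 2^32.
next_ack() (
    n=$(( $1 + $2 ))
    case "$3" in
        *S*|*F*) n=$(( n + 1 )) ;;
    esac
    echo $(( n % 4294967296 ))
)
```

Checking against the capture: "next_ack 3946095097 0 S" prints 3946095098 (step 2), and "next_ack 3445765854 0 F" prints 3445765855 (step 6).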

checking MTU or Jumbo Frame settings with ping

February 14th, 2014 Comments off

You may set your Linux box's MTU to a jumbo frame size of 9000 bytes or larger (it can be between 1500 and 9000 for a 1GbE NIC, and between 1500 and 64000 for a 10GbE NIC), but if the switch your box is connected to does not have jumbo frames enabled, your Linux box may run into problems when sending and receiving packets.

So how can we tell whether jumbo frames are enabled on the switch or the Linux box?

Of course you can log on to the switch and check, but we can also verify this from the Linux box that connects to the switch.

On the Linux box, you can see the MTU setting of each interface using ifconfig:

[root@centos-doxer ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 08:00:27:3F:C5:08
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:50502 errors:0 dropped:0 overruns:0 frame:0
TX packets:4579 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9835512 (9.3 MiB) TX bytes:1787223 (1.7 MiB)
Base address:0xd010 Memory:f0000000-f0020000

As stated above, an MTU of 9000 here doesn't mean that jumbo frames are enabled all the way from your box to the switch. You can verify this with the commands below:

[root@testbox ~]# ping -c 2 -M do -s 1472 testbox2
PING testbox2.example.com (192.168.29.184) 1472(1500) bytes of data. #so here 1500 bytes go through the network
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=1 ttl=252 time=0.319 ms
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=2 ttl=252 time=0.372 ms

--- testbox2.example.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.319/0.345/0.372/0.032 ms

[root@testbox ~]# ping -c 2 -M do -s 1473 testbox2
PING testbox2.example.com (192.168.29.184) 1473(1501) bytes of data. #so here 1501 bytes can not go through. From here we can see that MTU for this box is 1500, although ifconfig says it's 9000
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- testbox2.example.com ping statistics ---
0 packets transmitted, 0 received, +2 errors
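The 1472 passed to -s is not arbitrary: it is the MTU minus the 20-byte IPv4 header (without options) and the 8-byte ICMP header, which is why ping prints 1472(1500). A tiny sketch of that arithmetic, with ping_payload being a name I picked for illustration:

```shell
# ping_payload MTU
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU - 20 bytes IPv4 header (no options) - 8 bytes ICMP header.
ping_payload() {
    echo $(( $1 - 20 - 8 ))
}
```

So to test a 9000-byte path you would run something like "ping -c 2 -M do -s $(ping_payload 9000) testbox2", which sends an 8972-byte payload.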

Also, if your switch is a Cisco one, you can verify whether the switch port connected to the server has jumbo frames enabled by sniffing a CDP (Cisco Discovery Protocol) packet. Here's one example:

-bash-4.1# tcpdump -i eth0 -nn -v -s 0 -c 1 'ether[20:2] == 0x2000' #ether[20:2] == 0x2000 means capture only packets that have a 2-byte value of hex 2000 starting at byte 20
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
03:44:14.221022 CDPv2, ttl: 180s, checksum: 692 (unverified), length 287
Device-ID (0x01), length: 46 bytes: 'ucf-test-swi-5k01b.example.com(SSI16010QJH)'
Address (0x02), length: 13 bytes: IPv4 (1) 192.168.0.242
Port-ID (0x03), length: 16 bytes: 'Ethernet111/1/12'
Capability (0x04), length: 4 bytes: (0x00000228): L2 Switch, IGMP snooping
Version String (0x05), length: 66 bytes:
Cisco Nexus Operating System (NX-OS) Software, Version 5.2(1)N1(4)
Platform (0x06), length: 11 bytes: 'N5K-C5548UP'
Native VLAN ID (0x0a), length: 2 bytes: 123
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
Duplex (0x0b), length: 1 byte: full
MTU (0x11), length: 4 bytes: 1500 bytes #so here MTU size was set to 1500 bytes
System Name (0x14), length: 18 bytes: 'ucf-c1z3-swi-5k01b'
System Object ID (not decoded) (0x15), length: 14 bytes:
0x0000: 060c 2b06 0104 0109 0c03 0103 883c
Management Addresses (0x16), length: 13 bytes: IPv4 (1) 10.131.144.17
Physical Location (0x17), length: 13 bytes: 0x00/snmplocation
1 packets captured
1 packets received by filter
0 packets dropped by kernel
110 packets dropped by interface

PS:

 1. As for "-M do" parameter for ping, you may refer to man ping for more info. And as for DF(don't fragment) and Path MTU Discovery mentioned in the manpage, you may read more on http://en.wikipedia.org/wiki/Path_MTU_discovery and http://en.wikipedia.org/wiki/IP_fragmentation

 2. Maximum packet size is the MTU plus the data-link header length. Packets are not always transmitted at the maximum packet size, as we can see from the output of iptraf -z eth0. To set the MTU, you can use ip link set <device> mtu <size>. Here's more about "ip":

ip [ OPTIONS ] OBJECT { COMMAND | help }

OBJECT := { link | addr | addrlabel | route | rule | neigh | tunnel | maddr | mroute | monitor }

OPTIONS := { -V[ersion] | -s[tatistics] | -r[esolve] | -f[amily] { inet | inet6 | ipx | dnet | link } | -o[neline] }

-s, -stats, -statistics

output more information. If the option appears twice or more, the amount of information increases. As a rule, the information is statistics or some time values.

 3. If you find no related data returned:

[root@test ~]# tcpdump -i eth0 -nn -vvv -c 1 'ether[20:2] == 0x2000'

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
06:17:41.347122 IP (tos 0x0, ttl 64, id 8455, offset 0, flags [+], proto: UDP (17), length: 1500) 10.240.19.231.513 > 10.240.23.255.513: UDP, length 7020
1 packets captured
1 packets received by filter
0 packets dropped by kernel

Then you can increase the amount of data captured from each packet (the capture size above is only 96 bytes). Setting snaplen to 0 sets it to the default of 65535, for backwards compatibility with recent older versions of tcpdump:

[root@test ~]# tcpdump -i eth0 -nn -vvv -s 0 -c 1 'ether[20:2] == 0x2000'

Or if CDP cannot help you get the info at all, you can try LLDP (Link Layer Discovery Protocol), the vendor-neutral counterpart of CDP (or you can run lldpctl after installing the lldpd package; install method here):

[root@test ~]# tcpdump -i eth3 -s 1500 -XX -c 1 'ether proto 0x88cc'

[root@test ~]# tcpdump -i eth3 -v -s 1500 -c 1 '(ether[12:2]=0x88cc or ether[20:2]=0x2000)'

# lldpctl
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: eth4, via: LLDP, RID: 2, Time: 0 day, 00:21:15
Chassis:
ChassisID: local test-cz01-o13-swi-2
Port:
PortID: ifalias Ex0/12
-------------------------------------------------------------------------------
Interface: eth5, via: LLDP, RID: 2, Time: 0 day, 07:17:52
Chassis:
ChassisID: local test-cz01-o13-swi-2
Port:
PortID: ifalias Ex0/11
-------------------------------------------------------------------------------
Interface: eth0, via: LLDP, RID: 1, Time: 0 day, 07:18:15
Chassis:
ChassisID: mac 70:ca:9b:d2:f2:3f
SysName: test-cz01-g13-swi-1.ucf.oracle.com
SysDescr: Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500-IPBASEK9-M), Version 15.0(2)SG6, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Wed 31-Oct-12 13:38 by prod_r
MgmtIP: 192.168.80.75
Capability: Bridge, on
Capability: Router, on
Port:
PortID: ifname Gi1/19
PortDescr: GigabitEthernet1/19
PMD autoneg: supported: yes, enabled: yes
Adv: 10Base-T, HD: yes, FD: yes
Adv: 100Base-TX, HD: yes, FD: yes
Adv: 1000Base-T, HD: yes, FD: yes
MAU oper type: 1000BaseTFD - Four-pair Category 5 UTP, full duplex mode
VLAN: 125, pvid: yes
-------------------------------------------------------------------------------

PS:

 1. Here's more on tcpdump tips http://dazdaztech.wordpress.com/2013/05/17/using-tcpdump-to-see-cdp-or-lldp-packets/ and http://the-welters.com/professional/tcpdump.html

 2. For EtherType, you can refer to wiki page here https://en.wikipedia.org/wiki/EtherType
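As a rough illustration of what the capture filter '(ether[12:2]=0x88cc or ether[20:2]=0x2000)' used above looks at, here's a minimal Python sketch. It assumes a raw Ethernet frame without an 802.1Q tag, and the function name classify_frame is made up for this example:

```python
import struct

LLDP_ETHERTYPE = 0x88CC   # EtherType assigned to LLDP
CDP_SNAP_PID = 0x2000     # SNAP protocol ID used by CDP

def classify_frame(frame: bytes) -> str:
    """Mimic the tcpdump filter: check bytes 12-13, then bytes 20-21."""
    if len(frame) >= 14:
        # bytes 12-13: EtherType (Ethernet II) or length (802.3)
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype == LLDP_ETHERTYPE:
            return "LLDP"
    if len(frame) >= 22:
        # bytes 14-16: LLC header, 17-19: SNAP OUI, 20-21: SNAP protocol ID
        (pid,) = struct.unpack_from("!H", frame, 20)
        if pid == CDP_SNAP_PID:
            return "CDP"
    return "other"
```

Like the tcpdump expression, this only compares two bytes at a fixed offset, so it is a quick filter rather than a full protocol check.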

 4. Here's more about MTU:

The link layer, which is typically Ethernet, sends information into the network as a series of frames. Even though the layers above may have pieces of information much larger than the frame size, everything is broken up to fit into frames (whose payload encloses an IP packet carrying TCP/UDP/ICMP and so on) before it is sent over the network. The maximum size of the data in a frame is known as the maximum transmission unit (MTU). You can use network configuration tools such as ip or ifconfig to set the MTU.

The size of the MTU has a direct impact on the efficiency of the network. Each frame in the link layer has a small header, so using a large MTU increases the ratio of user data to overhead (header). When using a large MTU, however, each frame of data has a higher chance of being corrupted or dropped. For clean physical links, a high MTU usually leads to better performance because it requires less overhead; for noisy links, however, a smaller MTU may actually enhance performance because less data has to be re-sent when a single frame is corrupted.
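To put rough numbers on this trade-off, here's a small Python sketch. It assumes 18 bytes of per-frame overhead (14-byte Ethernet header plus 4-byte FCS, no VLAN tag, preamble ignored) and independent bit errors; both are simplifications, and the function names are only for illustration:

```python
ETH_OVERHEAD = 18  # assumed: 14-byte Ethernet header + 4-byte FCS

def payload_ratio(mtu: int, overhead: int = ETH_OVERHEAD) -> float:
    """Share of each frame that is payload rather than header/trailer."""
    return mtu / (mtu + overhead)

def frame_loss_probability(mtu: int, bit_error_rate: float,
                           overhead: int = ETH_OVERHEAD) -> float:
    """Chance that at least one bit of a full-size frame is corrupted."""
    bits = (mtu + overhead) * 8
    return 1 - (1 - bit_error_rate) ** bits

for mtu in (1500, 9000):
    # a larger MTU improves the payload ratio but raises the per-frame loss risk
    print(mtu, round(payload_ratio(mtu), 4), frame_loss_probability(mtu, 1e-6))
```

On a clean link the second number is negligible and the larger MTU wins; on a noisy link it grows quickly with frame size, which is the effect described above.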

Here's one image of layers of network frames:

layers-of-network-frames

 

Add static routes in linux which will survive reboot and network bouncing

December 24th, 2013 Comments off

We can see that in linux, the file /etc/sysconfig/static-routes is read by /etc/init.d/network:

[root@test-linux ~]# grep static-routes /etc/init.d/network
# Add non interface-specific static-routes.
if [ -f /etc/sysconfig/static-routes ]; then
grep "^any" /etc/sysconfig/static-routes | while read ignore args ; do

So we can add rules in /etc/sysconfig/static-routes to let network routes survive reboot and network bouncing. The format of /etc/sysconfig/static-routes is like:

any net 10.247.17.0 netmask 255.255.255.192 gw 10.247.10.1
any net 10.247.11.128 netmask 255.255.255.192 gw 10.247.10.1

To make a route take effect immediately, you can use route add:

route add -net 192.168.62.0 netmask 255.255.255.0 gw 192.168.1.1

But remember that to change the default gateway, we need to modify /etc/sysconfig/network (modify GATEWAY=).

After the modification, bounce the network using service network restart to make the changes take effect.

PS: 

  • Make sure the address that follows -net is the network ID for the given netmask (all host bits zero), or you'll see the error "route: netmask doesn't match route address".
  • To reload all static routes in /etc/sysconfig/static-routes, you can do the following:
      # Add non interface-specific static-routes.
        if [ -f /etc/sysconfig/static-routes ]; then
           grep "^any" /etc/sysconfig/static-routes | while read ignore args ; do
              /sbin/route add -$args
           done
        fi
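To check entries before adding them to /etc/sysconfig/static-routes, something like the following Python sketch may help. It mirrors what the init script loop does with each "any ..." line and tests whether the address is really a network ID for its netmask. The helper names are made up for this example:

```python
import ipaddress

def route_command(line: str):
    """Build the command the init script would run for one 'any ...' line."""
    fields = line.split()
    if not fields or fields[0] != "any":
        return None  # the script only picks up lines starting with "any"
    return "/sbin/route add -" + " ".join(fields[1:])

def is_network_id(address: str, netmask: str) -> bool:
    """True when all host bits of address are zero under netmask."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    return addr & mask == addr

line = "any net 10.247.11.128 netmask 255.255.255.192 gw 10.247.10.1"
print(route_command(line))
print(is_network_id("10.247.11.128", "255.255.255.192"))
```

An address such as 10.247.11.130 with the same netmask fails the check, which is the case where route complains that the netmask doesn't match the route address.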

resolved – mount clntudp_create: RPC: Program not registered

December 2nd, 2013 Comments off

When I ran showmount -e localhost, an error occurred:

[root@centos-doxer ~]# showmount -e localhost
mount clntudp_create: RPC: Program not registered

So I checked which RPC program showmount was using:

[root@centos-doxer ~]# grep showmount /etc/rpc
mountd 100005 mount showmount

As this indicated, showmount talks to the mountd RPC program (number 100005), so the mountd daemon has to be running for showmount -e localhost to work. mountd is part of nfs, so I started nfs:

[root@centos-doxer ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]

Now that mountd was running, showmount -e localhost should work.
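For reference, the lookup done with grep above can be sketched in Python. This assumes the usual /etc/rpc layout of "name number alias...", and the function names are only illustrative:

```python
def parse_rpc_line(line: str):
    """Split one /etc/rpc line into (program name, number, aliases)."""
    line = line.split("#", 1)[0]          # drop trailing comments
    fields = line.split()
    if len(fields) < 2:
        return None
    return fields[0], int(fields[1]), fields[2:]

def find_program(lines, name: str):
    """Return the entry whose program name or alias matches name."""
    for line in lines:
        entry = parse_rpc_line(line)
        if entry and (name == entry[0] or name in entry[2]):
            return entry
    return None

sample = ["mountd 100005 mount showmount"]
print(find_program(sample, "showmount"))
```

Here showmount is only an alias; the program that actually has to be registered with the portmapper is mountd.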

 

VLAN in windows hyper-v

November 26th, 2013 Comments off

Briefly, a virtual LAN (VLAN) can be regarded as a broadcast domain. It operates on OSI layer 2 (the data link layer). The exact protocol definition is known as 802.1Q. Each network packet belonging to a VLAN has an identifier. This is just a number between 0 and 4095, with both 0 and 4095 reserved for other uses. Let's assume a VLAN with an identifier of 10. A NIC configured with the VLAN ID of 10 will pick up network packets with the same ID and will ignore all other IDs. The point of VLANs is that switches and routers enabled for 802.1Q can present VLANs to different switch ports in the network. In other words, where a normal IP subnet is limited to a set of ports on a physical switch, a subnet defined in a VLAN can be present on any switch port, if so configured, of course.
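To make the identifier concrete, here's a minimal Python sketch of how the VLAN ID sits in an 802.1Q-tagged Ethernet frame: a 0x8100 tag type at byte 12, followed by 16 bits of which the low 12 are the VLAN ID. The function names are made up for this example, and stacked (QinQ) tags are not handled:

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for 802.1Q

def vlan_id(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if it is untagged."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF  # low 12 bits; the top 4 bits carry priority and DEI

def nic_accepts(frame: bytes, configured_vid: int) -> bool:
    """A NIC set to one VLAN ID only picks up frames carrying that ID."""
    return vlan_id(frame) == configured_vid
```

Twelve bits is also where the 0 to 4095 range mentioned above comes from.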

Getting back to the VLAN functionality in Hyper-V: both virtual switches and virtual NICs
can detect and use VLAN IDs. Both can accept and reject network packets based on VLAN ID,
which means that the VM does not have to do it itself. The use of VLAN enables Hyper-V to
participate in more advanced network designs. One limitation in the current implementation is
that a virtual switch can have just one VLAN ID, although that should not matter too much in
practice. The default setting is to accept all VLAN IDs.

F5 big-ip LTM iRULE to redirect http requests to https

October 25th, 2013 Comments off

Here's the irule script:

when HTTP_REQUEST {
HTTP::redirect "https://[HTTP::host][HTTP::uri]"
}

PS:

1. You can read more about F5 LTM docs here http://support.f5.com/kb/en-us/products/big-ip_ltm.html <select a version of big ip software from the left side first>

2. Here's a diagram showing a logical configuration example of the F5 solution for Oracle Database, Applications, Middleware, Servers and Storage:

oracle_f5

PS:

Here is about "Persistence profile types".

tcp flags explanation in details – SYN ACK FIN RST URG PSH and iptables for SYN flood

October 11th, 2013 Comments off

This is from wikipedia:

To establish a connection, TCP uses a three-way handshake. Before a client attempts to connect with a server, the server must first bind to and listen at a port to open it up for connections: this is called a passive open. Once the passive open is established, a client may initiate an active open. To establish a connection, the three-way (or 3-step) handshake occurs:

  1. SYN: The active open is performed by the client sending a SYN to the server. The client sets the segment's sequence number to a random value A.
  2. SYN-ACK: In response, the server replies with a SYN-ACK. The acknowledgment number is set to one more than the received sequence number i.e. A+1, and the sequence number that the server chooses for the packet is another random number, B.
  3. ACK: Finally, the client sends an ACK back to the server. The sequence number is set to the received acknowledgement value i.e. A+1, and the acknowledgement number is set to one more than the received sequence number i.e. B+1.

At this point, both the client and server have received an acknowledgment of the connection. The steps 1, 2 establish the connection parameter (sequence number) for one direction and it is acknowledged. The steps 2, 3 establish the connection parameter (sequence number) for the other direction and it is acknowledged. With these, a full-duplex communication is established.
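The sequence/acknowledgment arithmetic of the three steps can be sketched in a few lines of Python. This is only a model of the numbers exchanged (no sockets involved), and the function name is made up for this example:

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments as dicts with flags, seq and ack."""
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {
        "flags": "SYN-ACK",
        "seq": server_isn,                      # B
        "ack": (syn["seq"] + 1) % SEQ_MOD,      # A+1
    }
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],                  # A+1
        "ack": (syn_ack["seq"] + 1) % SEQ_MOD,  # B+1
    }
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In a SYN flood the third segment never arrives, so the server is left holding half-open connections in SYN_RECV.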

You can read pdf document here http://www.nordu.net/development/2nd-cnnw/tcp-analysis-based-on-flags.pdf

H3C's implementation of a SYN flood solution: http://www.h3c.com/portal/Products___Solutions/Technology/Security_and_VPN/Technology_White_Paper/200812/624110_57_0.htm

Using iptables to mitigate SYN floods: http://pierre.linux.edu/2010/04/how-to-secure-your-webserver-against-syn-flooding-and-dos-attack/ and http://www.cyberciti.biz/tips/howto-limit-linux-syn-attacks.html

You may also consider using tcpkill to kill half-open sessions (use ss -s/netstat -s<SYN_RECV>/tcptrack to see a connection summary)

Output from netstat -atun:

The reason for waiting is that packets may arrive out of order or be retransmitted after the connection has been closed. CLOSE_WAIT indicates that the other side of the connection has closed the connection. TIME_WAIT indicates that this side has closed the connection. The connection is being kept around so that any delayed packets can be matched to the connection and handled appropriately.

more on http://kb.iu.edu/data/ajmi.html about FIN_wait (one error: 2MSL<Maximum Segment Lifetime>=120s, not 2ms)

All about tcp socket states: http://www.krenel.org/tcp-time_wait-and-ephemeral-ports-bad-friends/

And here's more about tcp connection(internet socket) states: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Protocol_operation

device bond0/bond1 does not seem to be present delaying

August 1st, 2013 Comments off

Today I encountered one problem when trying to start up network:

"device bond0 does not seem to be present delaying"

In my env, bond0 was bridged to v119_BR. And eth0 and eth1 that composed bond0 were not in any vlan in the switch side. Here's some more info:

-bash-3.2# ifconfig v119_BR

v119_BR Link encap:Ethernet HWaddr 00:21:28:EE:C5:37
inet addr:10.172.149.171 Bcast:10.172.151.255 Mask:255.255.255.0
inet6 addr: 2606:b400:2010:4051:221:28ff:feee:c537/64 Scope:Global
inet6 addr: fe80::221:28ff:feee:c537/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3830 errors:0 dropped:0 overruns:0 frame:0
TX packets:334 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:292812 (285.9 KiB) TX bytes:65740 (64.1 KiB)

-bash-3.2# cat ifcfg-v119_BR
DEVICE=v119_BR
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
STP=off
TYPE=Bridge
IPADDR=10.172.149.171
NETMASK=255.255.255.0
NETWORK=10.172.144.0
BROADCAST=10.172.151.255

And I've bond0(eth0 and eth1) bridged to v119_BR:

-bash-3.2# cat ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v119_BR
BONDING_OPTS="primary=eth0 miimon=100 mode=1"
VLAN=no

PS:

Since multiple VLANs are not defined on the switch, no alias interface is needed for the bridge v119_BR. If multiple VLANs are defined on eth0 & eth1, then we have to define bond0.117 (117 is the VLAN ID), which should look like the following:
[root@testhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0.117 #and then ifcfg-bond0.118 and more if these vlans are defined on switch
DEVICE=bond0.117
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v119_BR
VLAN=yes

[root@testhost ~]# cat ifcfg-bond0.122
DEVICE=bond0.122
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v122_MGT
VLAN=yes

[root@testhost network-scripts]# cat ifcfg-bond0 #there's no BRIDGE here, as multiple Bridges will be specified in alias config
DEVICE=bond0
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BONDING_OPTS="primary=eth0 miimon=100 mode=1"

So what about the problem:

"device bond0/bond1 does not seem to be present delaying"

After some searching, I found that it's caused by the bonding module not being loaded. To resolve this, I tried:

-bash-3.2# uname -r
2.6.18-128.2.1.4.37.el5xen

-bash-3.2# find . -name "*bond*"
./kernel/drivers/net/tulip/winbond-840.ko
./kernel/drivers/net/bonding
./kernel/drivers/net/bonding/bonding.ko

insmod /lib/modules/2.6.18-128.2.1.4.37.el5xen/kernel/drivers/net/bonding/bonding.ko

Add the following lines in /etc/modprobe.conf

alias bond1 bonding
alias bond0 bonding

Then reboot the host. After these steps, the issue was resolved.

iptables rules after creating NAT to eth0 in virt-manager

July 18th, 2013 Comments off

While creating a new virtual network in the virt-manager GUI, I specified the following parameters:

Network: 192.168.72.0/24

Start: 192.168.72.128

End: 192.168.72.254

Forwarding to physical network(Destination: Physical device eth0, Mode NAT)

After this, the iptables rules are created automatically:

[root@centos-doxer ~]# iptables-save #dump the contents of an IP Table. Use /etc/init.d/iptables save to save iptables rules to /etc/sysconfig/iptables
*filter
:INPUT ACCEPT [19172:2049589]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [16236:5446648]
-A INPUT -i virbr1 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr1 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr1 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr1 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.72.0/255.255.255.0 -i eth0 -o virbr1 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.72.0/255.255.255.0 -i virbr1 -o eth0 -j ACCEPT
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
COMMIT

*nat
:PREROUTING ACCEPT [16665:4067972]
:POSTROUTING ACCEPT [197:13739]
:OUTPUT ACCEPT [197:13739]
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -p tcp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -p udp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -j MASQUERADE
COMMIT
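The match condition of the three MASQUERADE rules above (source inside 192.168.72.0/24, destination outside it, leaving through eth0) can be expressed as a small Python check. This is only a model of the rule matching, not of NAT itself, and the function name is made up:

```python
import ipaddress

GUEST_NET = ipaddress.ip_network("192.168.72.0/24")

def is_masqueraded(src: str, dst: str, out_iface: str) -> bool:
    """True when a packet would match -s NET -d ! NET -o eth0."""
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    return (
        out_iface == "eth0"
        and source in GUEST_NET
        and destination not in GUEST_NET
    )

print(is_masqueraded("192.168.72.130", "8.8.8.8", "eth0"))
print(is_masqueraded("192.168.72.130", "192.168.72.1", "eth0"))
```

Traffic between guests on the same virtual network therefore keeps its original source address; only traffic leaving for other networks is rewritten.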

And here's the bridge info:

[root@centos-doxer ~]# brctl show
bridge name bridge id STP enabled interfaces
virbr0 8000.0800273fc508 no eth0
virbr1 8000.000000000000 yes #one vm will have one interface here, format will be like virbr1-nic

[root@centos-doxer ~]# ifconfig virbr1
virbr1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:192.168.72.1 Bcast:192.168.72.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:3656 (3.5 KiB)

NAT forwarding for ssh and vncviewer

July 11th, 2013 Comments off

*nat
:PREROUTING ACCEPT [1:88]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A PREROUTING -p tcp --dport 5911 -d 192.168.40.10 -j DNAT --to-destination 172.16.0.101:5904 #now a VNC connection to <ip of eth0, 192.168.40.10> port 5911 is going to visit 172.16.0.101:5904
-A PREROUTING -p tcp --dport 222 -d 192.168.40.10 -j DNAT --to-destination 172.16.0.101:22 #now ssh <ip of eth0, 192.168.40.10> -p 222 is going to visit 172.16.0.101:22
-A POSTROUTING -o eth0 -j MASQUERADE #if eth0 is private ip, you can also do a NAT with one public ip.
COMMIT

*filter
:INPUT ACCEPT [247:16364]
:FORWARD ACCEPT [163:13692]
:OUTPUT ACCEPT [228:18664]
COMMIT

PPTP vpn configuration on linux

July 10th, 2013 Comments off
  • Configure pptp on remote machine:

vi /etc/sysctl.conf
net.ipv4.ip_forward=1

sysctl -p
yum install pptpd

vi /etc/pptpd.conf
option /etc/ppp/options.pptpd
ppp /usr/sbin/pppd
localip 192.168.8.1 #configure it first as an alias eth0:1 for example
remoteip 192.168.8.11-14
netmask 255.255.248.0
lcp-echo-interval 180
lcp-echo-failure 10

vi /etc/ppp/options.pptpd
name pptpd
refuse-pap
refuse-chap
refuse-mschap
require-mschap-v2
require-mppe-128
proxyarp
lock
nobsdcomp
novj
novjccomp
nologfd
idle 2592000
ms-dns 8.8.8.8 #you can also use OpenDNS's DNS such as 208.67.222.222/208.67.220.220
ms-dns 8.8.4.4
mtu 1416
mru 1416
noaccomp

vi /etc/ppp/chap-secrets
"username" pptpd "password" *

vi /etc/sysconfig/iptables
*nat
:PREROUTING ACCEPT [1:88]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -o eth0 -j MASQUERADE #eth0 is where public ip resides
#-A POSTROUTING -s 192.168.1.0/24 -j SNAT --to-source 192.168.8.1
COMMIT

*filter
:INPUT ACCEPT [247:16364]
:FORWARD ACCEPT [163:13692]
:OUTPUT ACCEPT [228:18664]
-A INPUT -p tcp -m state --state NEW -m tcp --dport 1723 -j ACCEPT
-A INPUT -p gre -j ACCEPT #PPTP is a form of PPP between two hosts via GRE using encryption (MPPE) and compression (MPPC)
-A OUTPUT -p gre -j ACCEPT
COMMIT

/etc/init.d/pptpd restart

  • Now add a new VPN connection in Windows.

PS:

  • Here are several types of vpn tunneling protocols:
  1. L2TP(Data Link Layer, Layer 2 Tunneling Protocol) over IPSec(IPSec is used to secure L2TP packets by providing confidentiality, authentication and integrity): xl2tpd. IPsec in tunneling mode does not create virtual physical interfaces at the end of the tunnel, since the tunnel is handled directly by the TCP/IP stack. L2TP can be used to provide these interfaces, this technique is called L2TP/IPsec. However, IPSec has a weakness that prevents it from being used all the time—IPSec can’t travel through a Network Address Translation (NAT) server(here is for l2tp/ipsec going through NAT). If you need to go through a NAT server, you can use PPTP or SSTP. Here is more info about IPSec transport/tunnel mode, and here is more about NAT/PAT.
  2. IPSec(Network Layer): openswan/strongswan
  3. PPTP(Session Layer): PoPTop(alias for pptpd)
  4. SSL based vpn(Session Layer): openvpn/openconnect/SSTP(windows)
  5. IKEv2(Internet Key Exchange Version 2): support VPN reconnect
  • After creating the vpn connection(assume you give it a name doxer-vpn), you'll find the following when you run 'route print':

C:\Users\liandy>route print
===========================================================================
Interface List
70...........................doxer-vpn

......

192.168.8.0    255.255.255.0      192.168.8.1     192.168.8.11     21

  • You may want to enlarge netmask to another one, such as 255.255.248.0. Here's one bat script to automatically connect and change route table:

rasdial doxer-vpn username password

for /f "delims=." %%i in ('route print ^| find /i "doxer-vpn"') do set if=%%i

route add 192.168.8.0 mask 255.255.248.0 0.0.0.0 if %if%

pause

  • And you'll also find that you can not visit the sites blocked by the remote site gateway's firewall. To change this, right click the vpn connection and navigate through:

Networking -> IPV4 -> Properties -> Advanced, uncheck "use default gateway on remote network".

After this, you should now be able to connect to sites blocked by remote firewall.

  • For Tun(layer 3) and Tap(layer 2), you can refer to here
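To see what enlarging the route's netmask from 255.255.255.0 to 255.255.248.0 (as in the bat script above) actually covers, here's a short Python sketch using the standard ipaddress module. The function name is just for illustration:

```python
import ipaddress

def describe(network: str, netmask: str):
    """Return (prefix length, first address, last address, usable hosts)."""
    net = ipaddress.ip_network(f"{network}/{netmask}")
    return (
        net.prefixlen,
        str(net.network_address),
        str(net.broadcast_address),
        net.num_addresses - 2,  # minus network and broadcast addresses
    )

print(describe("192.168.8.0", "255.255.255.0"))
print(describe("192.168.8.0", "255.255.248.0"))
```

With the larger mask the route reaches 192.168.8.0 through 192.168.15.255 over the VPN interface instead of only 192.168.8.x.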

NAT binding one private ip and one public ip together using linux as router

July 9th, 2013 Comments off

If you're using Linux as router, and you want to bind one public ip and one private ip together, you can do the following:

  • On linux router, define the public ip and iptables rules:

[root@router ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0:1
DEVICE="eth0:1"
BOOTPROTO="static"
ONBOOT="yes"
TYPE="Ethernet"
IPADDR=10.172.100.171
NETMASK=255.255.248.0

[root@router ~]# cat /etc/sysconfig/iptables

*nat
:PREROUTING ACCEPT [1:88]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

-A POSTROUTING -o eth0 -j MASQUERADE #eth0 has the public ip and acts as a gateway(not router) to the internal network. For outbound packets from the internal network, their source address is MASQUERADEd on eth0
-A PREROUTING -d 10.172.100.171 -j DNAT --to-destination 192.168.6.1 #or you can do port forwarding like "-A PREROUTING -p tcp --dport 5911 -d 192.168.40.10 -j DNAT --to-destination 172.16.0.101:5901"

-A POSTROUTING -s 192.168.6.1 -j SNAT --to-source 10.172.100.171

COMMIT

*filter
:INPUT ACCEPT [247:16364]
:FORWARD ACCEPT [163:13692]
:OUTPUT ACCEPT [228:18664]
COMMIT

PS:

Here're some other types of redirect(REDIRECT is used to redirect the traffic between services on the local machine):

/sbin/iptables -t nat -I PREROUTING -p tcp --dport 2222 -j REDIRECT --to-ports 22

/sbin/iptables -t nat -I OUTPUT -p tcp -d 127.0.0.1 --dport 2222 -j REDIRECT --to-ports 22

/sbin/iptables -t nat -I OUTPUT -p tcp -d 10.247.17.2 --dport 2222 -j REDIRECT --to-ports 22

*nat
:PREROUTING ACCEPT [2:98]
:POSTROUTING ACCEPT [8:840]
:OUTPUT ACCEPT [8:840]
-A PREROUTING -p tcp -m tcp --dport 389 -j REDIRECT --to-ports 4060
-A OUTPUT -s 10.247.18.143 -p tcp -m tcp --dport 389 -j REDIRECT --to-ports 4060
-A OUTPUT -s 127.0.0.1 -p tcp -m tcp --dport 389 -j REDIRECT --to-ports 4060

configure linux as a router firewall through iptables NAT

June 25th, 2013 Comments off
  • On the linux box that will act as router:

1.Turn on ip_forward:

vi /etc/sysctl.conf

net.ipv4.ip_forward = 1

sysctl -p

2.Edit /etc/sysconfig/iptables:

*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -o eth1 -j MASQUERADE #eth1 is the NIC connecting to outside network

#-A POSTROUTING -s 192.168.8.0/255.255.248.0 -o eth0 -j MASQUERADE  #allow 192.168.8.0/21 to do NAT
COMMIT

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT

3.Reload iptables:

[root@Router ~]# service iptables restart
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: filter nat [ OK ]
Unloading iptables modules: [ OK ]
Applying iptables firewall rules: [ OK ]
Loading additional iptables modules: ip_conntrack_netbios_n[ OK ]

[root@Router ~]# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
[root@Router ~]# iptables -t filter -nL
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

  • On the linux box that will act as client:

1.Set default gateway to the ip address of linux router:

vi /etc/sysconfig/network

...

GATEWAY=192.168.6.1 #this is ip address of the linux router

...

2.restart network

Test

On router, the default gateway is:

0.0.0.0         10.124.184.1    0.0.0.0         UG        0 0          0 eth1

And on the linux client, we'll now be able to connect to outside network too:

[root@client ~]# ping 10.244.29.184
PING 10.244.29.184 (10.244.29.184) 56(84) bytes of data.
64 bytes from 10.244.29.184: icmp_seq=1 ttl=254 time=0.236 ms

PS:

  1. You can also make linux as firewall using NAT/iptables, more on this article: http://xinn.org/iptables-nat.html
  2. About the numbers in brackets, you can refer to the following: https://www.linuxquestions.org/questions/linux-networking-3/those-%5B-damn-brackets-%5D-in-iptables-must-be-there-for-a-reason-619556/
  3. You should turn on promiscuous mode before applying the configs in this article. If you're using vSphere ESXi, this is the step:

promiscuous

mtr – combine traceroute and ping

May 28th, 2013 Comments off

mtr combines the functionality of traceroute (which uses UDP packets by default) or tracert (the Windows tool, which uses ICMP echo like ping and sometimes works better than traceroute) and ping in a single network diagnostic tool.

As mtr starts, it investigates the network connection between the host mtr runs on and HOSTNAME by sending packets with purposely low TTLs. It continues to send packets with low TTL, noting the response time of the intervening routers. This allows mtr to print the response percentage and response times of the internet route to HOSTNAME. A sudden increase in packet loss or response time is often an indication of a bad (or simply overloaded) link.
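The per-hop figures mtr prints boil down to simple bookkeeping over the probes sent to each hop. A minimal Python sketch of that bookkeeping (not mtr's actual code; the function name and the use of None for a lost probe are choices made for this example):

```python
def hop_stats(rtts_ms):
    """Return (loss percentage, average RTT in ms) for one hop.

    rtts_ms holds one entry per probe: the round-trip time in
    milliseconds, or None when no reply came back.
    """
    sent = len(rtts_ms)
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    avg = sum(replies) / len(replies) if replies else None
    return loss_pct, avg

print(hop_stats([1.0, None, 3.0, None]))
```

When reading real output, loss that shows up at one hop but not at the hops after it usually means that router deprioritises replies to probes, not that the link is bad.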

 

mtr in linux


snmptrapd traphandle configuration example

January 16th, 2013 1 comment

This article is going to show the basic configuration of snmptrapd and its traphandle command.

Assumptions:
snmptrapd is running on a linux host named "test-centos";
The host sending snmptrap messages in this example is named "test-zfs-host"

Now first we're going to set snmptrapd up on the snmptrap server side:

###Server side
[root@test-centos snmp]# cat /etc/snmp/snmptrapd.conf
#traphandle default /bin/mail -s "snmpdtrapd messages" <put your mail address here>
traphandle default /root/lognotify
authCommunity log,execute,net public

[root@test-centos snmp]# service snmptrapd restart

[root@test-centos snmp]# cat /root/lognotify
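#!/bin/bash
# traphandle script: snmptrapd passes the trap details on stdin
read host   # 1st line: hostname of the sender
read ip     # 2nd line: address of the sender
vars=

# remaining lines: one "OID value" pair per line
while read oid val
do
    if [ -z "$vars" ]
    then
        vars="$oid = $val"
    else
        vars="$vars, $oid = $val"
    fi
done
# note: ">" overwrites the file on every trap; use ">>" to keep a history
echo "trap: $host $ip $vars" >/var/tmp/snmptrap.out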
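If you'd rather write the handler in Python, the same parsing can be sketched as below. It assumes the input layout the shell script relies on (hostname line, address line, then "OID value" lines); the function name format_trap is made up for this example:

```python
def format_trap(lines):
    """Turn the lines snmptrapd feeds to a traphandle into one summary line."""
    lines = [line.rstrip("\n") for line in lines]
    host = lines[0] if len(lines) > 0 else ""
    ip = lines[1] if len(lines) > 1 else ""
    pairs = []
    for line in lines[2:]:
        if not line.strip():
            continue  # skip empty lines
        # split on the first space only: the value may contain spaces
        oid, _, value = line.strip().partition(" ")
        pairs.append(f"{oid} = {value.strip()}")
    return f"trap: {host} {ip} " + ", ".join(pairs)
```

A handler script would call format_trap(sys.stdin.readlines()) and write the result to a file, as the bash version does.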

And to test whether snmptrapd is working as expected:

###On client side
snmptrap -v2c -c public test-centos:162 "" SNMPv2-MIB::sysDescr SNMPv2-MIB::sysDescr.0 s "test-zfs-host test-zfs-host.ip this is a test snmptrap string"

After running this command, you can check /var/tmp/snmptrap.out on the snmptrapd server side (test-centos):

[root@test-centos ~]# cat /var/tmp/snmptrap.out

PS:
If you're using a sun zfs head, you can set snmptrap destinations in the zfs BUI (configuration -> SNMP); here's the snapshot:

SUN zfs storage 7320 monitoring using net-snmp and mrtg

December 25th, 2012 Comments off

This article is about monitoring a sun zfs storage 7320 using net-snmp and mrtg. Although the monitored system is a sun zfs storage 7320, the main idea can be applied to monitoring many other systems, including but not limited to cpu usage, network bandwidth, disk and temperature of cisco switches, other linux systems and even windows systems.

The net-snmp agent extension functionality is not supported on the sun zfs storage 7320 (a solaris 11 express system), so I'm going to monitor it from a linux snmp client, writing the monitoring scripts on that linux client rather than on zfs itself.

Here are the steps:

Part 1 - set up snmp client and mrtg on a linux host

yum -y install gcc-* gd-* libpng-* zlib-* httpd
yum -y install net-snmp* net-snmp-libs lm_sensors lm_sensors-devel
yum -y install mrtg
cp /etc/snmp/snmpd.conf{,.bak}
echo "rocommunity public" > /etc/snmp/snmpd.conf
mkdir -p /etc/mrtg
chkconfig --level 2345 snmpd on
service snmpd start

Ensure snmpd is listening:

netstat -tunlp |grep snmp
tcp 0 0 127.0.0.1:199 0.0.0.0:* LISTEN 26427/snmpd
udp 0 0 0.0.0.0:161 0.0.0.0:* 26427/snmpd

And let's have a test to make sure snmp client is working as expected:

[root@test-centos mrtg]# snmpwalk -v2c -c public localhost interface
IF-MIB::ifNumber.0 = INTEGER: 4
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
......
......

Now it's time to configure mrtg on linux:

[root@test-centos mrtg]# cat /etc/httpd/conf.d/mrtg.conf
Alias /mrtg /var/www/mrtg
<Location /mrtg>
Order deny,allow
Allow from all
</Location>

Now, do a httpd restart:

[root@test-centos ~]# apachectl restart

Part 2 - configure snmp on SUN zfs storage 7320 web UI

Log on to the SUN zfs storage 7320 web UI, navigate to "Configuration" -> "SNMP", and configure it as follows:

After this, you'll need to enable/restart the SNMP service on zfs.
Now do a snmpwalk from the snmp linux client to the SUN zfs storage:

[root@test-centos ~]# snmpwalk -v2c -c public test-zfs-host interface
IF-MIB::ifNumber.0 = INTEGER: 6
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifIndex.5 = INTEGER: 5
IF-MIB::ifIndex.6 = INTEGER: 6
IF-MIB::ifIndex.7 = INTEGER: 7
IF-MIB::ifIndex.8 = INTEGER: 8
IF-MIB::ifDescr.1 = STRING: lo0
......
......

Part 3 - extending snmp mibs/oids on snmp client(the linux host)

As stated at the beginning of this article, net-snmp agent extension functionality is not supported on the sun zfs storage 7320, so the monitoring scripts run on the linux snmp client rather than on zfs itself.

To avoid ssh asking for a password when connecting from the linux host, you need to add the linux host's pubkey to the sun zfs storage:
On snmp Linux client:

[root@test-centos ~]# ssh-keygen -t rsa #if you already have pubkey, skip this step
[root@test-centos ~]# cat /root/.ssh/id_rsa.pub |awk '{print $2}'
AAAAB3NzaC1yc2EAAAABIwAAAQEAxYd97A/V5RwdkfzbkmYBqF189pTLOlbYt0dZzO395dfU0Sp/Ykrk+sOJO0bJZEtytuTcCz/bVutWB7vLzeQPxIToRUQnZX7ZoMsjyaFk3LhtAgFhYIycOw2FQL8Qvb5yMBASB2/KthsqaiNqOP/2Vy5e0aCFFIV5DlKQTp/3eceSMq8kTx+e801lZow++yT70rp3p+5WtriN/NKYI0B3cpSQY/36D/TcOF9v5IaqQokp/mLRoc1MLOhN0sy0ipCdT+0bbkZ4Lh8bEeQO48UGKEOnYrYto33tay4mZk8HPWFK4w/TQGxBLthiuPQ4oZzG3gVpQUS4GRwI9zZoGtgELQ==

On Sun zfs storage:
Click "Configuration", then add the snmp client's public key(ensure RSA is selected):

Do an ssh connection from the snmp linux client to the sun zfs storage host; it should no longer ask for a password. (Be sure to do this once, as the sun zfs host key may not yet be known on the linux client if you have never connected from that client to the sun zfs storage system.)

Now we come to the most important part. Assume we want to monitor space usage on the SUN zfs storage system. To do this, you'll need to do the following on the snmp linux client:

[root@test-centos mrtg]# cat /etc/snmp/snmpd.conf
rocommunity public
extend .1.3.6.1.4.1.2021.31 zfs-test-zfs-host-total /bin/bash /var/tmp/mrtg/zfs-test-zfs-host-total.sh
extend .1.3.6.1.4.1.2021.32 zfs-test-zfs-host-used /bin/bash /var/tmp/mrtg/zfs-test-zfs-host-used.sh

[root@test-centos ~]# cat /var/tmp/mrtg/zfs-test-zfs-host-used.sh
#!/bin/bash
# print used space; the 'T' unit is trimmed, you may need to modify this to meet your environment
_used=$(ssh test-zfs-host "status ls" | grep Used | awk '{print gensub(/T/,"","g",$2)}')
echo "$_used"

[root@test-centos ~]# cat /var/tmp/mrtg/zfs-test-zfs-host-total.sh
#!/bin/bash
# print total space (used + available); the 'T' unit is trimmed, you may need to modify this to meet your environment
_used=$(ssh test-zfs-host "status ls" | grep Used | awk '{print gensub(/T/,"","g",$2)}')
_avail=$(ssh test-zfs-host "status ls" | grep Avail | awk '{print gensub(/T/,"","g",$2)}')
_all=$(echo "$_used + $_avail" | bc)
echo "$_all"

[root@test-centos ~]# chmod +x /var/tmp/mrtg/zfs-test-zfs-host-used.sh
[root@test-centos ~]# chmod +x /var/tmp/mrtg/zfs-test-zfs-host-total.sh
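The parsing done by the grep/awk/bc pipeline can also be sketched in Python, which is easier to test offline. The sample text below is hypothetical (I'm only assuming the size is the second whitespace-separated field and ends in 'T', as the awk code does), and the function names are made up:

```python
from decimal import Decimal

def field_in_tb(status_text: str, label: str):
    """Return the 2nd field of the first line containing label, 'T' removed."""
    for line in status_text.splitlines():
        fields = line.split()
        if label in line and len(fields) >= 2:
            return Decimal(fields[1].replace("T", ""))
    return None

def total_space(status_text: str) -> Decimal:
    """Used + available space, like the bc step in the shell script."""
    return field_in_tb(status_text, "Used") + field_in_tb(status_text, "Avail")

# hypothetical excerpt; adjust to what "status ls" prints on your appliance
sample = "   Used    9.31T bytes\n   Avail   7.01T bytes\n"
print(field_in_tb(sample, "Used"), total_space(sample))
```

Decimal is used here so the sum behaves like bc and avoids binary floating-point rounding.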

Now, let's do a snmp restart on snmp linux client and then test the newly added OIDs:

[root@test-centos ~]# service snmpd restart

[root@test-centos ~]# snmpwalk -v2c -c public localhost .1.3.6.1.4.1.2021.32
UCD-SNMP-MIB::ucdavis.32.1.0 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "/bin/bash"
UCD-SNMP-MIB::ucdavis.32.2.1.3.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "/var/tmp/mrtg/zfs-test-zfs-host-used.sh"
UCD-SNMP-MIB::ucdavis.32.2.1.4.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = ""
UCD-SNMP-MIB::ucdavis.32.2.1.5.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 5
UCD-SNMP-MIB::ucdavis.32.2.1.6.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.7.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.20.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 4
UCD-SNMP-MIB::ucdavis.32.2.1.21.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "9.31"
UCD-SNMP-MIB::ucdavis.32.3.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "9.31"
UCD-SNMP-MIB::ucdavis.32.3.1.3.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.3.1.4.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 0
UCD-SNMP-MIB::ucdavis.32.4.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100.1 = STRING: "9.31"
[root@test-centos ~]# snmpwalk -v2c -c public localhost .1.3.6.1.4.1.2021.31
UCD-SNMP-MIB::ucdavis.31.1.0 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "/bin/bash"
UCD-SNMP-MIB::ucdavis.31.2.1.3.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "/var/tmp/mrtg/zfs-test-zfs-host-total.sh"
UCD-SNMP-MIB::ucdavis.31.2.1.4.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = ""
UCD-SNMP-MIB::ucdavis.31.2.1.5.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 5
UCD-SNMP-MIB::ucdavis.31.2.1.6.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.7.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.20.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 4
UCD-SNMP-MIB::ucdavis.31.2.1.21.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "16.32"
UCD-SNMP-MIB::ucdavis.31.3.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "16.32"
UCD-SNMP-MIB::ucdavis.31.3.1.3.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.3.1.4.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 0
UCD-SNMP-MIB::ucdavis.31.4.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108.1 = STRING: "16.32"

From the output, we can see that OIDs UCD-SNMP-MIB::ucdavis.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 (used space) and UCD-SNMP-MIB::ucdavis.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 (total space) are the two OIDs we want.
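The long numeric tail of those OIDs is not random: net-snmp indexes each extend entry by its name, written as the length of the name followed by the ASCII code of every character. Here is a minimal sketch of that encoding in plain shell. The function name oid_suffix and the sample name "zfs-used" are made up for illustration and are not part of net-snmp.

```shell
#!/bin/sh
# Encode an extend name the way it appears as an OID index:
# <length>.<ascii of char 1>.<ascii of char 2>...
# oid_suffix is a hypothetical helper name, not a net-snmp tool.
oid_suffix() {
    name=$1
    out=${#name}                          # first sub-identifier: the string length
    while [ -n "$name" ]; do
        rest=${name#?}                    # everything after the first character
        ch=${name%"$rest"}                # the first character itself
        out="$out.$(printf '%d' "'$ch")"  # a leading quote makes printf emit the character code
        name=$rest
    done
    printf '%s\n' "$out"
}

oid_suffix "zfs-used"
```

Appending such a suffix to .1.3.6.1.4.1.2021.32.3.1.1 gives the full OID of the first output line, which is why the 19-character name in the output above starts its index with 19.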

  • Part 4 - do the mrtg drawing

Now that we have the OIDs we want, the mrtg drawing is straightforward. On the Linux SNMP host, do the following steps:

[root@test-centos ~]# cat /etc/mrtg/test-zfs-host.cfg
#LoadMIBs: /usr/share/snmp/mibs/UCD-SNMP-MIB.txt,/usr/share/snmp/mibs/TCP-MIB.txt
workdir: /var/www/mrtg/
Title[zfs_space_test-zfs-host]: Percentage used space on zfs
PageTop[zfs_space_test-zfs-host]: <h1>Percentage used space on zfs</h1>
Target[zfs_space_test-zfs-host]: .1.3.6.1.4.1.2021.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100&.1.3.6.1.4.1.2021.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108:public@localhost
Options[zfs_space_test-zfs-host]: growright,gauge,transparent,nopercent
Unscaled[zfs_space_test-zfs-host]: ymwd
MaxBytes[zfs_space_test-zfs-host]: 100
YLegend[zfs_space_test-zfs-host]: UsedSpace %
ShortLegend[zfs_space_test-zfs-host]: T
LegendI[zfs_space_test-zfs-host]: Used
LegendO[zfs_space_test-zfs-host]: Total
Legend1[zfs_space_test-zfs-host]: Percentage used space on zfs
Legend2[zfs_space_test-zfs-host]: Percentage all space on zfs

PS:
You need to replace "UCD-SNMP-MIB::ucdavis" with .1.3.6.1.4.1.2021, or you'll get error messages like the following:

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
Argument "v4only" isn't numeric in int at /usr/bin/../lib64/mrtg2/SNMP_Session.pm line 183.
backoff (v4only) must be a number >= 1.0 at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 465

Let's continue:

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup could not read the primary log file for zfs_space_test-zfs-host
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup The backup log file for zfs_space_test-zfs-host was invalid as well
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup Can't remove zfs_space_test-zfs-host.old updating log file
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup Can't rename zfs_space_test-zfs-host.log to zfs_space_test-zfs-host.old updating log file

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
24-12-2012 01:46:46, Rateup WARNING: /usr/bin/rateup Can't remove zfs_space_test-zfs-host.old updating log file

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg

[root@test-centos ~]# indexmaker --output=/var/www/mrtg/index.html /etc/mrtg/test-zfs-host.cfg

Now add a cron job:

0-59/5 * * * * env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg

Visit http://<your linux box's ip address>/mrtg/ to see the graphs (you'll have to wait 5 minutes for the initial result, so be patient!)

  • Part 5 - troubleshooting

  • If you get error messages like the following:

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
Argument "v4only" isn't numeric in int at /usr/bin/../lib64/mrtg2/SNMP_Session.pm line 183.
backoff (v4only) must be a number >= 1.0 at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 465

That's because you're using the symbolic OID/MIB alias rather than the numeric OID. Change the alias to numbers, i.e. change UCD-SNMP-MIB::ucdavis to .1.3.6.1.4.1.2021, and then re-check.
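If you have many symbolic OIDs to convert, the substitution can be scripted. Below is a minimal sketch that only handles the UCD-SNMP-MIB::ucdavis prefix discussed here; the function name to_numeric_oid is an assumption for illustration.

```shell
#!/bin/sh
# Rewrite the symbolic prefix UCD-SNMP-MIB::ucdavis to its numeric form.
# to_numeric_oid is a hypothetical helper name; it reads lines on stdin.
to_numeric_oid() {
    sed 's/UCD-SNMP-MIB::ucdavis/.1.3.6.1.4.1.2021/g'
}

echo 'UCD-SNMP-MIB::ucdavis.32.3.1.1' | to_numeric_oid
```

Alternatively, running snmpwalk with -On prints numeric OIDs directly, so the output can be pasted into the mrtg config without any conversion.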

  • If you get error messages like the following:

[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
cannot encode Object ID 34.4.1.2.28.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.110.105.99.45.98.97.110.100.119.105.100.116.104.3: first subid too big in Object ID 34.4.1.2.28.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.110.105.99.45.98.97.110.100.119.105.100.116.104.3 at /usr/bin/mrtg line 2035
Tuesday, 25 December 2012 at 0:34: ERROR: Target[zfs-test-zfs-host-io-average-latency-igb2][_IN_] ' $target->[3]{$mode} ' did not eval into defined data
Tuesday, 25 December 2012 at 0:34: ERROR: Target[zfs-test-zfs-host-io-average-latency-igb2][_OUT_] ' $target->[3]{$mode} ' did not eval into defined data

Then you should carefully check the scripts you wrote for the OIDs, and run snmpwalk several times to make sure the values are correct (if your script's output changes format between runs, this may cause problems).
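One way to do that check outside of snmpwalk is to run the script a few times and confirm that every run prints a single plain number. The sketch below assumes that a plain number is what you want mrtg to receive; is_number and check_script are hypothetical helper names.

```shell
#!/bin/sh
# Succeed only if the argument is a plain non-negative number such as 16 or 9.31.
# is_number is a hypothetical helper name.
is_number() {
    case $1 in
        ''|*[!0-9.]*|*.*.*|.*|*.) return 1 ;;   # empty, stray characters, or misplaced dots
        *) return 0 ;;
    esac
}

# Run the given command three times and stop at the first non-numeric output.
# check_script is a hypothetical helper name; pass the command as arguments.
check_script() {
    i=1
    while [ "$i" -le 3 ]; do
        value=$("$@")
        if is_number "$value"; then
            echo "run $i: ok ($value)"
        else
            echo "run $i: NOT a plain number: '$value'"
            return 1
        fi
        i=$((i + 1))
    done
}

# Stand-in for a real script such as /var/tmp/mrtg/zfs-test-zfs-host-used.sh
check_script echo 9.31
```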

  • If you get error messages like the following:

[root@test-centos mrtg]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
SNMP Error:
Received SNMP response with error code
error status: noSuchName
index 2 (OID: 1.3.6.1.4.1.2021.44)
SNMPv1_Session (remote host: "localhost" [127.0.0.1].161)
community: "public"
request ID: 1786815468
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 490
SNMPGET Problem for .1.3.6.1.4.1.2021.44 .1.3.6.1.4.1.2021.44 sysUptime sysName on public@localhost::::::v4only
at /usr/bin/mrtg line 2035
SNMP Error:
Received SNMP response with error code
error status: noSuchName
index 2 (OID: 1.3.6.1.4.1.2021.45)
SNMPv1_Session (remote host: "localhost" [127.0.0.1].161)
community: "public"
request ID: 1786815469
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 490
SNMPGET Problem for .1.3.6.1.4.1.2021.45 .1.3.6.1.4.1.2021.45 sysUptime sysName on public@localhost::::::v4only
at /usr/bin/mrtg line 2035
Monday, 24 December 2012 at 14:41: ERROR: Target[zfs_space_test-zfs-host][_IN_] '( $target->[0]{$mode} ) * 10000 / ( $target->[1]{$mode} )' (warn): Use of uninitialized value in division (/) at (eval 16) line 1.
Monday, 24 December 2012 at 14:41: ERROR: Target[zfs_space_test-zfs-host][_OUT_] '( $target->[0]{$mode} ) * 10000 / ( $target->[1]{$mode} )' (warn): Use of uninitialized value in division (/) at (eval 17) line 1.

Then one possible culprit is that newer net-snmp versions have stopped supporting "exec". Use "extend" instead and retry.
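Both directives take their arguments in the same order (optional OID, name, program, arguments), so the switch is mostly a matter of changing the keyword in snmpd.conf. A minimal sketch follows; exec_to_extend is a hypothetical helper name, and the OID, name and script path in the sample line are placeholders.

```shell
#!/bin/sh
# Rewrite snmpd.conf "exec" directives as "extend" directives.
# exec_to_extend is a hypothetical helper name; it reads config lines on stdin
# and leaves every other line untouched.
exec_to_extend() {
    sed 's/^\([[:space:]]*\)exec\([[:space:]]\)/\1extend\2/'
}

# Placeholder config line for illustration only.
printf '%s\n' 'exec .1.3.6.1.4.1.2021.44 zfs-used /bin/bash /var/tmp/mrtg/used.sh' | exec_to_extend
```

The values may show up under different sub-OIDs after the switch, so restart snmpd, run snmpwalk on the OID again and adjust the mrtg Target to whatever it reports.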

Note that you can also use "snmpd -f -Le" to check error messages related to snmpd.

PS:
Here are more links where you can read about net-snmp/mrtg:

  1. net-snmp FAQ - http://www.net-snmp.org/FAQ.html
  2. mrtg configuration options - http://oss.oetiker.ch/mrtg/doc/mrtg-reference.en.html
  3. SUN zfs storage SNMP - http://docs.oracle.com/cd/E22471_01/html/820-4167/configuration__services__snmp.html
  4. net-snmp extending agent functionality - http://linux.die.net/man/5/snmpd.conf (search for 'extending agent functionality' on this page)
  5. To get numeric OIDs from snmpwalk, add -On. More info at http://linux.die.net/man/1/snmpcmd
  6. [image: the resulting mrtg graph]

403 CoachingSessionExceeded – McAfee the culprit

December 18th, 2012 Comments off

Today I tried to access a website from within my company's network that was very important to me. But the site loaded incompletely and so was not usable.

As always, the company network is somewhat restricted; for example, only outbound ports 80 and 443 are open in my company's network. As many companies do, mine uses McAfee software to restrict employees' internet surfing. So I first suspected McAfee was the culprit.

As I'm using the Chrome browser, I pressed F12 to open the developer tools. On the "Network" tab (you may need Ctrl+F5 to force a refresh of the website), I found the following error status:

Then I opened that URL in a new tab, and the following page appeared:

So everything was sorted! I clicked the button "click here if you have read ....", went back to the site, refreshed, and after that everything was OK!

basic knowledge for netmask hexadecimal decimal binary netmask cidr calculator

May 3rd, 2012 Comments off

First, let's get familiar with how a linux/windows netmask looks in hexadecimal, decimal and binary. FF.FF.FF.FE, 255.255.255.254 and 11111111.11111111.11111111.11111110 are all the same netmask.

F (hexadecimal) equals 15 (decimal) and 1111 (binary); E (hexadecimal) equals 14 (decimal) and 1110 (binary). Convert every F to 1111 and every E to 1110, and FF.FF.FF.FE turns out to be 11111111.11111111.11111111.11111110. As 11111111 (binary) equals 255 (decimal) and 11111110 (binary) equals 254 (decimal), 11111111.11111111.11111111.11111110 is 255.255.255.254. As there are 31 bits for the network and only 1 bit for the host, the CIDR notation for FF.FF.FF.FE is xxx.xxx.xxx.xxx/31.
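The same conversion can be sketched in a few lines of shell. The function names hex_to_dec_mask and mask_to_prefix are made up for illustration, and the bit counting assumes a valid netmask (all 1 bits first, then all 0 bits).

```shell
#!/bin/sh
# Convert a dotted-hex netmask such as FF.FF.FF.FE to dotted decimal.
# hex_to_dec_mask is a hypothetical helper name.
hex_to_dec_mask() {
    old_ifs=$IFS; IFS=.
    set -- $1                         # split into four hex octets
    IFS=$old_ifs
    printf '%d.%d.%d.%d\n' "0x$1" "0x$2" "0x$3" "0x$4"
}

# Count the 1 bits of a dotted-decimal netmask, which is its CIDR prefix length.
# mask_to_prefix is a hypothetical helper name.
mask_to_prefix() {
    old_ifs=$IFS; IFS=.
    set -- $1                         # split into four decimal octets
    IFS=$old_ifs
    bits=0
    for octet in "$@"; do
        while [ "$octet" -gt 0 ]; do
            bits=$(( $bits + ($octet & 1) ))
            octet=$(( $octet >> 1 ))
        done
    done
    echo "$bits"
}

dec=$(hex_to_dec_mask FF.FF.FF.FE)
echo "$dec /$(mask_to_prefix "$dec")"
```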

Then, let's talk about the relationship between ip address, netmask, network address, broadcast address and max hosts in one subnet.

Given an ip address and a netmask, we can calculate the network address, the broadcast address and the maximum number of hosts of the subnet that ip belongs to. For example, if the ip is 192.168.1.28 and the netmask is 255.255.255.240, then 256-240=16, which means the subnet holds 16 addresses (14 usable hosts once the network and broadcast addresses are excluded). Subnets of this size start at whole-number multiples of 16, so 192.168.1.28 belongs to the range 192.168.1.16 ~ 192.168.1.31: its network address is 192.168.1.16 (the first address in the range) and its broadcast address is 192.168.1.31 (the last one).
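The arithmetic behind this is a bitwise AND of the ip with the netmask for the network address, and a bitwise OR with the inverted netmask for the broadcast address. Here is a small sketch; the helper names are hypothetical, and it assumes the shell does 64-bit arithmetic, as is usual on Linux.

```shell
#!/bin/sh
# Hypothetical helpers: dotted quad <-> integer (assumes 64-bit shell arithmetic).
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                                 # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print network address, broadcast address and address count for an ip/netmask pair.
subnet_info() {
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    host_bits=$(( ~$mask & 0xFFFFFFFF ))      # the zero bits of the netmask
    network=$(( $ip & $mask ))
    broadcast=$(( $network | $host_bits ))
    echo "network=$(int_to_ip "$network") broadcast=$(int_to_ip "$broadcast") addresses=$(( $host_bits + 1 ))"
}

subnet_info 192.168.1.28 255.255.255.240
```

For the example above this prints network=192.168.1.16 broadcast=192.168.1.31 addresses=16.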
NB:
  • We can confirm the example above with the help of ipcalc (which is installed by default under RHEL / CentOS / Fedora Linux by the initscripts package):
doxer@doxer ~ $ ipcalc -c 192.168.1.28/255.255.255.240 #or you can use ipcalc -pnbm 192.168.1.28 255.255.255.240
Address: 192.168.1.28 11000000.10101000.00000001.0001 1100
Netmask: 255.255.255.240 = 28 11111111.11111111.11111111.1111 0000
Wildcard: 0.0.0.15 00000000.00000000.00000000.0000 1111
=>
Network: 192.168.1.16/28 11000000.10101000.00000001.0001 0000
HostMin: 192.168.1.17 11000000.10101000.00000001.0001 0001
HostMax: 192.168.1.30 11000000.10101000.00000001.0001 1110
Broadcast: 192.168.1.31 11000000.10101000.00000001.0001 1111
Hosts/Net: 14 Class C, Private Internet

  • Here are some common SubNet Mask <-> Hex SubNet Mask <-> CIDR <-> Bit Mask <-> Quantity in Range mappings (remember the 9 octet values 255/254/252/248/240/224/192/128/0; for instance, 128 in the last octet leaves 128 addresses):

 

Quantity in Range  SubNet Mask      Hex SubNet Mask  CIDR                Bit Mask                          Note
1                  255.255.255.255  FF.FF.FF.FF      xxx.xxx.xxx.xxx/32  11111111111111111111111111111111  Single host
2                  255.255.255.254  FF.FF.FF.FE      xxx.xxx.xxx.xxx/31  11111111111111111111111111111110
4                  255.255.255.252  FF.FF.FF.FC      xxx.xxx.xxx.xxx/30  11111111111111111111111111111100
8                  255.255.255.248  FF.FF.FF.F8      xxx.xxx.xxx.xxx/29  11111111111111111111111111111000
16                 255.255.255.240  FF.FF.FF.F0      xxx.xxx.xxx.xxx/28  11111111111111111111111111110000
32                 255.255.255.224  FF.FF.FF.E0      xxx.xxx.xxx.xxx/27  11111111111111111111111111100000
64                 255.255.255.192  FF.FF.FF.C0      xxx.xxx.xxx.xxx/26  11111111111111111111111111000000
128                255.255.255.128  FF.FF.FF.80      xxx.xxx.xxx.xxx/25  11111111111111111111111110000000
256                255.255.255.0    FF.FF.FF.00      xxx.xxx.xxx.xxx/24  11111111111111111111111100000000  Class C
512                255.255.254.0    FF.FF.FE.00      xxx.xxx.xxx.xxx/23  11111111111111111111111000000000
1,024              255.255.252.0    FF.FF.FC.00      xxx.xxx.xxx.xxx/22  11111111111111111111110000000000
2,048              255.255.248.0    FF.FF.F8.00      xxx.xxx.xxx.xxx/21  11111111111111111111100000000000
4,096              255.255.240.0    FF.FF.F0.00      xxx.xxx.xxx.xxx/20  11111111111111111111000000000000
8,192              255.255.224.0    FF.FF.E0.00      xxx.xxx.xxx.xxx/19  11111111111111111110000000000000
16,384             255.255.192.0    FF.FF.C0.00      xxx.xxx.xxx.xxx/18  11111111111111111100000000000000
32,768             255.255.128.0    FF.FF.80.00      xxx.xxx.xxx.xxx/17  11111111111111111000000000000000
65,536             255.255.0.0      FF.FF.00.00      xxx.xxx.xxx.xxx/16  11111111111111110000000000000000  Class B
131,072            255.254.0.0      FF.FE.00.00      xxx.xxx.xxx.xxx/15  11111111111111100000000000000000
262,144            255.252.0.0      FF.FC.00.00      xxx.xxx.xxx.xxx/14  11111111111111000000000000000000
524,288            255.248.0.0      FF.F8.00.00      xxx.xxx.xxx.xxx/13  11111111111110000000000000000000
1,048,576          255.240.0.0      FF.F0.00.00      xxx.xxx.xxx.xxx/12  11111111111100000000000000000000
2,097,152          255.224.0.0      FF.E0.00.00      xxx.xxx.xxx.xxx/11  11111111111000000000000000000000
4,194,304          255.192.0.0      FF.C0.00.00      xxx.xxx.xxx.xxx/10  11111111110000000000000000000000
8,388,608          255.128.0.0      FF.80.00.00      xxx.xxx.xxx.xxx/9   11111111100000000000000000000000
16,777,216         255.0.0.0        FF.00.00.00      xxx.xxx.xxx.xxx/8   11111111000000000000000000000000  Class A
33,554,432         254.0.0.0        FE.00.00.00      xxx.xxx.xxx.xxx/7   11111110000000000000000000000000
67,108,864         252.0.0.0        FC.00.00.00      xxx.xxx.xxx.xxx/6   11111100000000000000000000000000
134,217,728        248.0.0.0        F8.00.00.00      xxx.xxx.xxx.xxx/5   11111000000000000000000000000000
268,435,456        240.0.0.0        F0.00.00.00      xxx.xxx.xxx.xxx/4   11110000000000000000000000000000
536,870,912        224.0.0.0        E0.00.00.00      xxx.xxx.xxx.xxx/3   11100000000000000000000000000000
1,073,741,824      192.0.0.0        C0.00.00.00      xxx.xxx.xxx.xxx/2   11000000000000000000000000000000
2,147,483,648      128.0.0.0        80.00.00.00      xxx.xxx.xxx.xxx/1   10000000000000000000000000000000
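
Any row of this table can be derived from the prefix length alone: the netmask is 32 one-bits shifted left by the number of host bits, and the quantity is 2 raised to the number of host bits. Here is a rough sketch, assuming 64-bit shell arithmetic; prefix_row is a hypothetical helper name.

```shell
#!/bin/sh
# Print "quantity netmask /prefix" for a prefix length between 1 and 32.
# prefix_row is a hypothetical helper name; assumes 64-bit shell arithmetic.
prefix_row() {
    prefix=$1
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))   # keep only the low 32 bits
    count=$(( 1 << (32 - $prefix) ))                          # 2^(host bits)
    dotted="$(( ($mask >> 24) & 255 )).$(( ($mask >> 16) & 255 )).$(( ($mask >> 8) & 255 )).$(( $mask & 255 ))"
    echo "$count $dotted /$prefix"
}

for p in 32 28 24 16 8; do
    prefix_row "$p"
done
```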