tcpdump & wireshark tips

tcpdump [ -AdDefIKlLnNOpqRStuUvxX ] [ -B buffer_size ] [ -c count ]

[ -C file_size ] [ -G rotate_seconds ] [ -F file ]
[ -i interface ] [ -m module ] [ -M secret ]
[ -r file ] [ -s snaplen ] [ -T type ] [ -w file ]
[ -W filecount ]
[ -E spi@ipaddr algo:secret,... ]
[ -y datalinktype ] [ -z postrotate-command ] [ -Z user ] [ expression ]

#general format of a tcp protocol line

src > dst: flags data-seqno ack window urgent options
Src and dst are the source and destination IP addresses and ports.
Flags are some combination of S (SYN), F (FIN), P (PUSH), R (RST), W (ECN CWR) or E (ECN-Echo), or a single '.' (meaning no flags were set).
Data-seqno describes the portion of sequence space covered by the data in this packet.
Ack is sequence number of the next data expected the other direction on this connection.
Window is the number of bytes of receive buffer space available the other direction on this connection.
Urg indicates there is 'urgent' data in the packet.
Options are tcp options enclosed in angle brackets (e.g., <mss 1024>).
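
As a quick illustration of the flags field described above, here's a minimal Python sketch that turns the flag letters into names. The names TCP_FLAG_NAMES and decode_flags are made up for this example and are not part of tcpdump; the mapping itself is just the list of letters given above.

```python
# Flag letters as listed in the format description above.
TCP_FLAG_NAMES = {
    "S": "SYN",
    "F": "FIN",
    "P": "PUSH",
    "R": "RST",
    "W": "ECN CWR",
    "E": "ECN-Echo",
}

def decode_flags(field):
    """Translate a tcpdump flags field such as 'S' or 'FP' into flag names."""
    if field == ".":
        return []  # a lone '.' means none of the letters above is set
    return [TCP_FLAG_NAMES[letter] for letter in field]

print(decode_flags("S"))
print(decode_flags("FP"))
print(decode_flags("."))
```

Note that in verbose output the flags field may be followed by a comma (for example "S,"), which would need stripping before calling this helper.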

tcpdump -D #list of the network interfaces available
tcpdump -e #Print the link-level header on each dump line
tcpdump -S #Print absolute, rather than relative, TCP sequence numbers
tcpdump -s <snaplen> #Snarf snaplen bytes of data from each packet rather than the default of 65535 bytes
tcpdump -i eth0 -S -nn -XX vlan
tcpdump -i eth0 -S -nn -XX arp
tcpdump -i bond0 -S -nn -vvv udp dst port 53
tcpdump -i bond0 -S -nn -vvv host testhost
tcpdump -nn -S -vvv "dst host host1.example.com and (dst port 1521 or dst port 6200)"

tcpdump -vv -x -X -s 1500 -i eth0 'port 25' #SMTP traffic. -x and -X print each packet's data, in addition to its headers, in hex and ASCII. Use -s 192 to watch NFS traffic (NFS requests are very large, and much of the detail won't be printed unless snaplen is increased).

tcpdump -nn -S udp dst port 111 #note that telnet uses TCP, not UDP, so it cannot test a UDP port. UDP is connectionless, so to test it you must start up the application first and then use tcpdump to watch for its packets.

tcpdump -nn -S udp dst portrange 1-1023

Wireshark Capture Filters (in Capture -> Options)

Wireshark Display Filters (in toolbar)

Here is another example of the TCP 3-way handshake, the 4-way handshake (connection termination) and SYN flood

EVENT DIAGRAM
Host A sends a TCP SYNchronize packet to Host B
Host B receives A's SYN
Host B sends a SYNchronize-ACKnowledgement
Host A receives B's SYN-ACK
Host A sends ACKnowledge
Host B receives ACK.
TCP socket connection is ESTABLISHED.

[Figure: TCP three-way handshake (SYN, SYN-ACK, ACK)]

[Figure: TCP connection termination states, including CLOSE_WAIT]

The upper part shows the states on the end-point initiating the termination.

The lower part shows the states on the other end-point.

So the initiating end-point (i.e. the client) sends a termination request to the server and waits for an acknowledgement in state FIN-WAIT-1. The server sends an acknowledgement and goes in state CLOSE_WAIT. The client goes into FIN-WAIT-2 when the acknowledgement is received and waits for an active close. When the server actively sends its own termination request, it goes into LAST-ACK and waits for an acknowledgement from the client. When the client receives the termination request from the server, it sends an acknowledgement and goes into TIME_WAIT and after some time into CLOSED. The server goes into CLOSED state once it receives the acknowledgement from the client.
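
The termination sequence described above can be sketched as two small state tables, one per end-point. This is only an illustration of the paragraph, not a full TCP state machine; the table names, event strings and the walk() helper are invented for the example.

```python
# (current state, event) -> next state, for the end-point that closes first.
INITIATOR = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
}

# Same idea for the other end-point.
RESPONDER = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}

def walk(table, events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = table[(state, event)]
        path.append(state)
    return path

CLIENT_EVENTS = ["send FIN", "recv ACK", "recv FIN, send ACK", "timeout"]
SERVER_EVENTS = ["recv FIN, send ACK", "send FIN", "recv ACK"]

print(" -> ".join(walk(INITIATOR, CLIENT_EVENTS)))
print(" -> ".join(walk(RESPONDER, SERVER_EVENTS)))
```

The responder only leaves CLOSE_WAIT on the "send FIN" event, which happens when the application closes the socket.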

A socket can stay in CLOSE_WAIT indefinitely, until the application closes it. Typical faulty scenarios are a file descriptor leak or a server that never calls close() on the socket, either of which leads to a pile-up of CLOSE_WAIT sockets. At the Java level, this manifests as a "Too many open files" error. CLOSE_WAIT has no timeout value that can be tuned.

TIME_WAIT is just a time-based wait on the socket before the connection is closed down permanently. Under most circumstances, sockets in TIME_WAIT are nothing to worry about. The wait time can be changed (tcp_time_wait_interval).

More info about time_wait & close_wait can be found here.

PS:

You can refer to this article for a detailed explanation of tcp three-way handshake establishing/terminating a connection. And for tcpdump one, you can check below:

[root@host2 ~]# telnet host1 14100
Trying 10.240.249.139...
Connected to host1.us.oracle.com (10.240.249.139).
Escape character is '^]'.
^]
telnet> quit
Connection closed.

[root@host1 ~]# tcpdump -vvv -S host host2
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
03:16:39.188951 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 60) host1.us.oracle.com.14100 > host2.us.oracle.com.18890: S, cksum 0xa806 (correct), 3445765853:3445765853(0) ack 3946095098 win 5792 <mss 1460,sackOK,timestamp 854077220 860674218,nop,wscale 7> #2. host1 acknowledges host2's SYN with ack = host2's sequence number + 1 (3946095098), and sends its own SYN (seq 3445765853).
03:16:41.233807 IP (tos 0x0, ttl 64, id 6650, offset 0, flags [DF], proto: TCP (6), length: 52) host1.us.oracle.com.14100 > host2.us.oracle.com.18890: F, cksum 0xdd48 (correct), 3445765854:3445765854(0) ack 3946095099 win 46 <nop,nop,timestamp 854079265 860676263> #5. host1 acknowledges host2's FIN (ack 3946095099), and sends its own FIN just as host2 did (seq 3445765854, unchanged).

[root@host2 ~]# tcpdump -vvv -S host host1
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
03:16:39.188628 IP (tos 0x10, ttl 64, id 31059, offset 0, flags [DF], proto: TCP (6), length: 60) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: S, cksum 0x265b (correct), 3946095097:3946095097(0) win 5792 <mss 1460,sackOK,timestamp 860674218 854045985,nop,wscale 7> #1. host2 sends a SYN packet to host1 (seq 3946095097).
03:16:39.188803 IP (tos 0x10, ttl 64, id 31060, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: ., cksum 0xed44 (correct), 3946095098:3946095098(0) ack 3445765854 win 46 <nop,nop,timestamp 860674218 854077220> #3. host2 acknowledges host1's SYN with ack = host1's sequence number + 1 (3445765854); its own seq is now 3946095098. The TCP connection is now established.
03:16:41.233397 IP (tos 0x10, ttl 64, id 31061, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: F, cksum 0xe546 (correct), 3946095098:3946095098(0) ack 3445765854 win 46 <nop,nop,timestamp 860676263 854077220> #4. host2 sends a FIN together with an ACK. The FIN informs host1 that host2 has no more data to send (seq 3946095098, unchanged), and the ack still carries the next sequence number expected from host1 (3445765854, unchanged).
03:16:41.233633 IP (tos 0x10, ttl 64, id 31062, offset 0, flags [DF], proto: TCP (6), length: 52) host2.us.oracle.com.18890 > host1.us.oracle.com.14100: ., cksum 0xdd48 (correct), 3946095099:3946095099(0) ack 3445765855 win 46 <nop,nop,timestamp 860676263 854079265> #6. host2 acknowledges host1's FIN (ack 3445765855) with no other flag set; its seq is now 3946095099 because its own FIN consumed one sequence number.
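
To double check the numbering in the two captures above, here's a small Python sketch that copies the seq/ack values from the six annotated packets and verifies the "+1" relationships. The variable names are invented for the example; the numbers come straight from the captures.

```python
# seq/ack values copied from the annotated packets (#1 to #6) above.
syn      = {"seq": 3946095097}                      # 1. host2 -> host1, SYN
syn_ack  = {"seq": 3445765853, "ack": 3946095098}   # 2. host1 -> host2, SYN + ACK
ack      = {"seq": 3946095098, "ack": 3445765854}   # 3. host2 -> host1, ACK
fin_h2   = {"seq": 3946095098, "ack": 3445765854}   # 4. host2 -> host1, FIN
fin_h1   = {"seq": 3445765854, "ack": 3946095099}   # 5. host1 -> host2, FIN
last_ack = {"seq": 3946095099, "ack": 3445765855}   # 6. host2 -> host1, ACK

# SYN and FIN each consume one sequence number, so the peer acks seq + 1.
checks = {
    "SYN-ACK acknowledges host2's SYN": syn_ack["ack"] == syn["seq"] + 1,
    "ACK acknowledges host1's SYN": ack["ack"] == syn_ack["seq"] + 1,
    "host1 acknowledges host2's FIN": fin_h1["ack"] == fin_h2["seq"] + 1,
    "host2 acknowledges host1's FIN": last_ack["ack"] == fin_h1["seq"] + 1,
    "host2's seq advances past its FIN": last_ack["seq"] == fin_h2["seq"] + 1,
}

for description, ok in checks.items():
    print(("OK   " if ok else "FAIL ") + description)
```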

checking MTU or Jumbo Frame settings with ping

You may set your Linux box's MTU to a jumbo frame size of 9000 bytes or larger (it can be between 1500 and 9000 for a 1GbE NIC, and between 1500 and 64000 for a 10GbE NIC), but if the switch your box is connected to does not have jumbo frames enabled, your Linux box may run into problems when sending & receiving packets.

So how can we tell whether jumbo frames are enabled on the switch or on the Linux box?

Of course you can log on to the switch and check, but we can also verify this from the Linux box that connects to the switch.

On linux box, you can see the MTU settings of each interface using ifconfig:

[root@centos-doxer ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 08:00:27:3F:C5:08
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:50502 errors:0 dropped:0 overruns:0 frame:0
TX packets:4579 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9835512 (9.3 MiB) TX bytes:1787223 (1.7 MiB)
Base address:0xd010 Memory:f0000000-f0020000

As stated above, an MTU of 9000 here doesn't mean that jumbo frames are enabled all the way from your box to the switch. You can verify this with the commands below:

[root@testbox ~]# ping -c 2 -M do -s 1472 testbox2
PING testbox2.example.com (192.168.29.184) 1472(1500) bytes of data. #so here 1500 bytes go through the network
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=1 ttl=252 time=0.319 ms
1480 bytes from testbox2.example.com (192.168.29.184): icmp_seq=2 ttl=252 time=0.372 ms

--- testbox2.example.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.319/0.345/0.372/0.032 ms

[root@testbox ~]# ping -c 2 -M do -s 1473 testbox2
PING testbox2.example.com (192.168.29.184) 1473(1501) bytes of data. #so here 1501 bytes can not go through. From here we can see that MTU for this box is 1500, although ifconfig says it's 9000
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)
From testbox.example.com (192.168.28.40) icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- testbox2.example.com ping statistics ---
0 packets transmitted, 0 received, +2 errors
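
The sizes used with ping -s above follow from simple arithmetic: the payload plus an 8-byte ICMP header plus a 20-byte IPv4 header must fit within the MTU. Here's a minimal Python sketch of that calculation, assuming an IPv4 header without options; the function name is made up for this example.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header

def ping_payload_for_mtu(mtu):
    """Largest value for ping -s that still fits in one packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER

print(ping_payload_for_mtu(1500))  # 1472, the size that went through above
print(ping_payload_for_mtu(9000))  # 8972, the size to try on a jumbo frame path
```

So on a path that really carries 9000-byte packets, ping -M do -s 8972 should get replies, while -s 8973 should not.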

Also, if your switch is a Cisco one, you can verify whether the switch port connecting to the server has jumbo frames enabled by sniffing a CDP (Cisco Discovery Protocol) packet. Here's one example:

-bash-4.1# tcpdump -i eth0 -nn -v -s 0 -c 1 'ether[20:2] == 0x2000' #'ether[20:2] == 0x2000' means capture only packets that have a 2-byte value of hex 2000 starting at byte offset 20; the quotes keep the shell from interpreting the brackets
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
03:44:14.221022 CDPv2, ttl: 180s, checksum: 692 (unverified), length 287
Device-ID (0x01), length: 46 bytes: 'ucf-test-swi-5k01b.example.com(SSI16010QJH)'
Address (0x02), length: 13 bytes: IPv4 (1) 192.168.0.242
Port-ID (0x03), length: 16 bytes: 'Ethernet111/1/12'
Capability (0x04), length: 4 bytes: (0x00000228): L2 Switch, IGMP snooping
Version String (0x05), length: 66 bytes:
Cisco Nexus Operating System (NX-OS) Software, Version 5.2(1)N1(4)
Platform (0x06), length: 11 bytes: 'N5K-C5548UP'
Native VLAN ID (0x0a), length: 2 bytes: 123
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
Duplex (0x0b), length: 1 byte: full
MTU (0x11), length: 4 bytes: 1500 bytes #so here MTU size was set to 1500 bytes
System Name (0x14), length: 18 bytes: 'ucf-c1z3-swi-5k01b'
System Object ID (not decoded) (0x15), length: 14 bytes:
0x0000: 060c 2b06 0104 0109 0c03 0103 883c
Management Addresses (0x16), length: 13 bytes: IPv4 (1) 10.131.144.17
Physical Location (0x17), length: 13 bytes: 0x00/snmplocation
1 packets captured
1 packets received by filter
0 packets dropped by kernel
110 packets dropped by interface
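
To see what the ether[20:2] == 0x2000 test actually looks at, here's a rough Python sketch that applies the same comparison to raw frame bytes. The sample frame is hand-built for illustration, not captured from a real switch; it assumes the usual CDP framing (14 bytes of Ethernet header with a length field, 3 bytes of LLC, 3 bytes of SNAP OUI, then the 2-byte protocol ID). The function names are invented for the example.

```python
import struct

def u16_at(frame, offset):
    """Read 2 bytes at offset as a big-endian integer, like ether[offset:2]."""
    return struct.unpack_from("!H", frame, offset)[0]

def matches_cdp_filter(frame):
    """Mimic the filter 'ether[20:2] == 0x2000' on raw frame bytes."""
    return len(frame) >= 22 and u16_at(frame, 20) == 0x2000

# Hand-built sample header, for illustration only (not a real capture).
sample = (
    bytes.fromhex("01000ccccccc")  # bytes 0-5: destination MAC (CDP multicast)
    + bytes(6)                     # bytes 6-11: source MAC (zeroed placeholder)
    + bytes.fromhex("011f")        # bytes 12-13: 802.3 length field (placeholder)
    + bytes.fromhex("aaaa03")      # bytes 14-16: LLC header
    + bytes.fromhex("00000c")      # bytes 17-19: SNAP OUI (Cisco)
    + bytes.fromhex("2000")        # bytes 20-21: SNAP protocol ID (CDP)
)

print(hex(u16_at(sample, 20)))
print(matches_cdp_filter(sample))
```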

PS:

 1. As for "-M do" parameter for ping, you may refer to man ping for more info. And as for DF(don't fragment) and Path MTU Discovery mentioned in the manpage, you may read more on http://en.wikipedia.org/wiki/Path_MTU_discovery and http://en.wikipedia.org/wiki/IP_fragmentation

 2. Maximum packet size is the MTU plus the data-link header length. Packets are not always transmitted at the maximum packet size, as we can see from the output of iptraf -z eth0. To set the MTU, you can use ip link set <device> mtu <size>. Here's more about "ip":

ip [ OPTIONS ] OBJECT { COMMAND | help }

OBJECT := { link | addr | addrlabel | route | rule | neigh | tunnel | maddr | mroute | monitor }

OPTIONS := { -V[ersion] | -s[tatistics] | -r[esolve] | -f[amily] { inet | inet6 | ipx | dnet | link } | -o[neline] }

-s, -stats, -statistics

output more information. If the option appears twice or more, the amount of information increases. As a rule, the information is statistics or some time values.

 3. If the capture returns no CDP data:

[root@test ~]# tcpdump -i eth0 -nn -vvv -c 1 'ether[20:2] == 0x2000'

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
06:17:41.347122 IP (tos 0x0, ttl 64, id 8455, offset 0, flags [+], proto: UDP (17), length: 1500) 10.240.19.231.513 > 10.240.23.255.513: UDP, length 7020
1 packets captured
1 packets received by filter
0 packets dropped by kernel

Then you can increase the number of bytes captured from each packet (the snaplen, which was only 96 bytes above). Setting snaplen to 0 sets it to the default of 65535, for backwards compatibility with recent older versions of tcpdump:

[root@test ~]# tcpdump -i eth0 -nn -vvv -s 0 -c 1 'ether[20:2] == 0x2000'

Or if CDP cannot give you the info at all, you can try LLDP (Link Layer Discovery Protocol), the vendor-neutral counterpart of CDP (or you can run lldpctl after installing package lldpd, install method here):

[root@test ~]# tcpdump -i eth3 -s 1500 -XX -c 1 'ether proto 0x88cc'

[root@test ~]# tcpdump -i eth3 -v -s 1500 -c 1 '(ether[12:2]=0x88cc or ether[20:2]=0x2000)'

# lldpctl
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: eth4, via: LLDP, RID: 2, Time: 0 day, 00:21:15
Chassis:
ChassisID: local test-cz01-o13-swi-2
Port:
PortID: ifalias Ex0/12
-------------------------------------------------------------------------------
Interface: eth5, via: LLDP, RID: 2, Time: 0 day, 07:17:52
Chassis:
ChassisID: local test-cz01-o13-swi-2
Port:
PortID: ifalias Ex0/11
-------------------------------------------------------------------------------
Interface: eth0, via: LLDP, RID: 1, Time: 0 day, 07:18:15
Chassis:
ChassisID: mac 70:ca:9b:d2:f2:3f
SysName: test-cz01-g13-swi-1.ucf.oracle.com
SysDescr: Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500-IPBASEK9-M), Version 15.0(2)SG6, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Wed 31-Oct-12 13:38 by prod_r
MgmtIP: 192.168.80.75
Capability: Bridge, on
Capability: Router, on
Port:
PortID: ifname Gi1/19
PortDescr: GigabitEthernet1/19
PMD autoneg: supported: yes, enabled: yes
Adv: 10Base-T, HD: yes, FD: yes
Adv: 100Base-TX, HD: yes, FD: yes
Adv: 1000Base-T, HD: yes, FD: yes
MAU oper type: 1000BaseTFD - Four-pair Category 5 UTP, full duplex mode
VLAN: 125, pvid: yes
-------------------------------------------------------------------------------

PS:

 1. Here's more on tcpdump tips http://dazdaztech.wordpress.com/2013/05/17/using-tcpdump-to-see-cdp-or-lldp-packets/ and http://the-welters.com/professional/tcpdump.html

 2. For EtherType, you can refer to wiki page here https://en.wikipedia.org/wiki/EtherType

 4. Here's more about MTU:

The link layer, which is typically Ethernet, sends information into the network as a series of frames. Even though the layers above may have pieces of information much larger than the frame size, the link layer breaks everything up into frames (whose payload encloses an IP packet carrying TCP, UDP or ICMP) to send them over the network. The maximum size of the data in a frame is known as the maximum transmission unit (MTU). You can use network configuration tools such as ip or ifconfig to set the MTU.

The size of the MTU has a direct impact on the efficiency of the network. Each frame in the link layer has a small header, so using a large MTU increases the ratio of user data to overhead (header). When using a large MTU, however, each frame of data has a higher chance of being corrupted or dropped. For clean physical links, a high MTU usually leads to better performance because it requires less overhead; for noisy links, however, a smaller MTU may actually enhance performance because less data has to be re-sent when a single frame is corrupted.
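
To put rough numbers on the overhead argument above, here's a small Python sketch. It assumes 18 bytes of Ethernet overhead per frame (a 14-byte header plus a 4-byte frame check sequence, no VLAN tag) and ignores the preamble, inter-frame gap and the IP/TCP headers inside the payload, so treat the results as a simplified estimate. The function names are invented for the example.

```python
ETHERNET_OVERHEAD = 18  # bytes per frame: 14-byte header + 4-byte FCS (assumed)

def payload_ratio(mtu, overhead=ETHERNET_OVERHEAD):
    """Fraction of each full-sized frame that is payload rather than overhead."""
    return mtu / (mtu + overhead)

def frames_needed(total_bytes, mtu):
    """Number of frames needed to carry total_bytes of payload."""
    return -(-total_bytes // mtu)  # ceiling division

for mtu in (1500, 9000):
    print(mtu, round(payload_ratio(mtu), 4), frames_needed(9000, mtu))
```

The flip side is also visible here: with an MTU of 9000, a single corrupted frame means re-sending six times as much data as with an MTU of 1500.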

Here's one image of layers of network frames:

[Figure: layers-of-network-frames]


Add static routes in linux which will survive reboot and network bouncing

We can see that in Linux, the file /etc/sysconfig/static-routes is read by /etc/init.d/network:

[root@test-linux ~]# grep static-routes /etc/init.d/network
# Add non interface-specific static-routes.
if [ -f /etc/sysconfig/static-routes ]; then
grep "^any" /etc/sysconfig/static-routes | while read ignore args ; do

So we can add rules in /etc/sysconfig/static-routes to let network routes survive reboot and network bouncing. The format of /etc/sysconfig/static-routes is like:

any net 10.247.17.0 netmask 255.255.255.192 gw 10.247.10.1
any net 10.247.11.128 netmask 255.255.255.192 gw 10.247.10.1

To make a route take effect immediately, you can use route add:

route add -net 192.168.62.0 netmask 255.255.255.0 gw 192.168.1.1

But remember that to change the default gateway, we need to modify /etc/sysconfig/network (the GATEWAY= line).

After the modification, bounce the network using service network restart to make the changes take effect.

PS: 

  • Make sure that the address following -net is a network ID (no host bits set for the given netmask), or you'll see the error "route: netmask doesn't match route address".
  • To reload all static routes in /etc/sysconfig/static-routes, you can do the following:
      # Add non interface-specific static-routes.
        if [ -f /etc/sysconfig/static-routes ]; then
           grep "^any" /etc/sysconfig/static-routes | while read ignore args ; do
              /sbin/route add -$args
           done
        fi
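
Here's a minimal Python sketch of the two points above: how an "any ..." line turns into a route command, and how to check beforehand that an address really is a network ID for its netmask. The function names are made up for this example; the sketch only builds the command string and does not touch the routing table.

```python
import ipaddress

def route_command(line):
    """Build the command the init script would run for one 'any ...' line."""
    fields = line.split()
    if not fields or fields[0] != "any":
        return None  # the init script only picks up lines starting with "any"
    return "/sbin/route add -" + " ".join(fields[1:])

def is_network_id(address, netmask):
    """True when address has no host bits set for the given netmask."""
    try:
        ipaddress.ip_network(f"{address}/{netmask}", strict=True)
        return True
    except ValueError:
        return False

line = "any net 10.247.17.0 netmask 255.255.255.192 gw 10.247.10.1"
print(route_command(line))
print(is_network_id("10.247.17.0", "255.255.255.192"))
print(is_network_id("192.168.62.5", "255.255.255.0"))
```

An address that fails the second check is the kind that makes route complain that the netmask doesn't match the route address.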

resolved - mount clntudp_create: RPC: Program not registered

When I did a showmount -e localhost, an error occurred:

[root@centos-doxer ~]# showmount -e localhost
mount clntudp_create: RPC: Program not registered

So I checked which RPC program showmount was using:

[root@centos-doxer ~]# grep showmount /etc/rpc
mountd 100005 mount showmount

As this indicated, we should start up the mountd daemon to make showmount -e localhost work. And mountd is part of NFS, so I started up nfs:

[root@centos-doxer ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]

Now as mountd was running, showmount -e localhost should work.

 

VLAN in windows hyper-v

Briefly, a virtual LAN (VLAN) can be regarded as a broadcast domain. It operates at OSI
layer 2 (the data link layer). The exact protocol definition is known as 802.1Q. Each network
packet belonging to a VLAN has an identifier. This is just a number between 0 and 4095, with
both 0 and 4095 reserved for other uses. Let’s assume a VLAN with an identifier of 10. A NIC
configured with the VLAN ID of 10 will pick up network packets with the same ID and will ignore
all other IDs. The point of VLANs is that switches and routers enabled for 802.1Q can present
VLANs to different switch ports in the network. In other words, where a normal IP subnet is
limited to a set of ports on a physical switch, a subnet defined in a VLAN can be present on
any switch port, if so configured, of course.
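
To make the identifier range concrete, here's a small Python sketch that unpacks a 4-byte 802.1Q tag: a 16-bit tag protocol ID (0x8100) followed by 16 bits holding a 3-bit priority, a 1-bit drop-eligible flag and the 12-bit VLAN ID, which is why the ID runs from 0 to 4095. The function names and the sample tag are made up for this example.

```python
import struct

def parse_vlan_tag(tag):
    """Split a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,                # 0x8100 marks an 802.1Q tag
        "priority": tci >> 13,       # top 3 bits
        "dei": (tci >> 12) & 0x1,    # next bit, drop eligible indicator
        "vlan_id": tci & 0x0FFF,     # low 12 bits
    }

def is_usable_vlan_id(vlan_id):
    """0 and 4095 are reserved, so usable IDs are 1 through 4094."""
    return 1 <= vlan_id <= 4094

sample_tag = struct.pack("!HH", 0x8100, 10)  # VLAN 10, priority 0
print(parse_vlan_tag(sample_tag))
print(is_usable_vlan_id(10), is_usable_vlan_id(4095))
```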

Getting back to the VLAN functionality in Hyper-V: both virtual switches and virtual NICs
can detect and use VLAN IDs. Both can accept and reject network packets based on VLAN ID,
which means that the VM does not have to do it itself. The use of VLAN enables Hyper-V to
participate in more advanced network designs. One limitation in the current implementation is
that a virtual switch can have just one VLAN ID, although that should not matter too much in
practice. The default setting is to accept all VLAN IDs.

F5 big-ip LTM iRULE to redirect http requests to https

Here's the iRule script:

when HTTP_REQUEST {
HTTP::redirect "https://[HTTP::host][HTTP::uri]"
}

PS:

1.You can read more about F5 LTM docs here http://support.f5.com/kb/en-us/products/big-ip_ltm.html <select a version of big ip software from the left side first>

2.Here's a diagram showing a logical configuration example of the F5 solution for Oracle Database, Applications, Middleware, Servers and Storage:

[Figure: oracle_f5]

PS:

Here is about "Persistence profile types".

tcp flags explanation in details - SYN ACK FIN RST URG PSH and iptables for SYN flood

This is from wikipedia:

To establish a connection, TCP uses a three-way handshake. Before a client attempts to connect with a server, the server must first bind to and listen at a port to open it up for connections: this is called a passive open. Once the passive open is established, a client may initiate an active open. To establish a connection, the three-way (or 3-step) handshake occurs:

  1. SYN: The active open is performed by the client sending a SYN to the server. The client sets the segment's sequence number to a random value A.
  2. SYN-ACK: In response, the server replies with a SYN-ACK. The acknowledgment number is set to one more than the received sequence number i.e. A+1, and the sequence number that the server chooses for the packet is another random number, B.
  3. ACK: Finally, the client sends an ACK back to the server. The sequence number is set to the received acknowledgement value i.e. A+1, and the acknowledgement number is set to one more than the received sequence number i.e. B+1.

At this point, both the client and server have received an acknowledgment of the connection. The steps 1, 2 establish the connection parameter (sequence number) for one direction and it is acknowledged. The steps 2, 3 establish the connection parameter (sequence number) for the other direction and it is acknowledged. With these, a full-duplex communication is established.
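
The three steps above can be sketched in a few lines of Python. This is only a model of the sequence/acknowledgment arithmetic, not real networking code; the function name and the dictionary layout are invented for the example, and sequence numbers are treated as 32-bit values that wrap around.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide

def three_way_handshake(a=None, b=None):
    """Return the three segments of a handshake for initial numbers A and B."""
    a = random.getrandbits(32) if a is None else a  # client's random value A
    b = random.getrandbits(32) if b is None else b  # server's random value B
    syn = {"flags": "SYN", "seq": a}
    syn_ack = {"flags": "SYN-ACK", "seq": b, "ack": (a + 1) % SEQ_SPACE}
    ack = {"flags": "ACK", "seq": (a + 1) % SEQ_SPACE, "ack": (b + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(a=100, b=300):
    print(segment)
```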

You can read pdf document here http://www.nordu.net/development/2nd-cnnw/tcp-analysis-based-on-flags.pdf

H3C's implementation of a SYN flood solution http://www.h3c.com/portal/Products___Solutions/Technology/Security_and_VPN/Technology_White_Paper/200812/624110_57_0.htm

Using iptables to resolve SYN flood issues http://pierre.linux.edu/2010/04/how-to-secure-your-webserver-against-syn-flooding-and-dos-attack/ and http://www.cyberciti.biz/tips/howto-limit-linux-syn-attacks.html

You may also consider using tcpkill to kill half-open sessions (use ss -s, netstat -s or tcptrack to see a connection summary; half-open sessions show up in state SYN_RECV)

Output from netstat -atun:

The reason for waiting is that packets may arrive out of order or be retransmitted after the connection has been closed. CLOSE_WAIT indicates that the other side of the connection has closed the connection. TIME_WAIT indicates that this side has closed the connection. The connection is being kept around so that any delayed packets can be matched to the connection and handled appropriately.
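
When you want to see how many sockets sit in CLOSE_WAIT or TIME_WAIT, counting the State column of the netstat output is usually enough. Here's a minimal Python sketch of that idea. The SAMPLE text is invented for illustration (it is not output from a real host), and the parsing assumes the usual six-column layout for TCP rows.

```python
from collections import Counter

def count_tcp_states(netstat_output):
    """Count TCP sockets per state from 'netstat -atun' style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts

# Hypothetical sample, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 10.0.0.5:22 10.0.0.9:51514 ESTABLISHED
tcp 0 0 10.0.0.5:8080 10.0.0.9:40112 CLOSE_WAIT
tcp 0 0 10.0.0.5:8080 10.0.0.9:40113 CLOSE_WAIT
tcp 0 0 10.0.0.5:45122 10.0.0.7:1521 TIME_WAIT
udp 0 0 0.0.0.0:111 0.0.0.0:*
"""

for state, count in sorted(count_tcp_states(SAMPLE).items()):
    print(state, count)
```

A steadily growing CLOSE_WAIT count points at the local application not closing its sockets, while TIME_WAIT entries normally drain on their own.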

more on http://kb.iu.edu/data/ajmi.html about FIN_WAIT (it contains one error: 2MSL, twice the Maximum Segment Lifetime, is 120s, not 2ms)

All about tcp socket states: http://www.krenel.org/tcp-time_wait-and-ephemeral-ports-bad-friends/

And here's more about tcp connection(internet socket) states: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Protocol_operation

device bond0/bond1 does not seem to be present delaying

Today I encountered one problem when trying to start up network:

"device bond0 does not seem to be present delaying"

In my env, bond0 was bridged to v119_BR, and eth0 and eth1, which made up bond0, were not in any VLAN on the switch side. Here's some more info:

-bash-3.2# ifconfig v119_BR

v119_BR Link encap:Ethernet HWaddr 00:21:28:EE:C5:37
inet addr:10.172.149.171 Bcast:10.172.151.255 Mask:255.255.255.0
inet6 addr: 2606:b400:2010:4051:221:28ff:feee:c537/64 Scope:Global
inet6 addr: fe80::221:28ff:feee:c537/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3830 errors:0 dropped:0 overruns:0 frame:0
TX packets:334 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:292812 (285.9 KiB) TX bytes:65740 (64.1 KiB)

-bash-3.2# cat ifcfg-v119_BR
DEVICE=v119_BR
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
STP=off
TYPE=Bridge
IPADDR=10.172.149.171
NETMASK=255.255.255.0
NETWORK=10.172.144.0
BROADCAST=10.172.151.255

And I have bond0 (eth0 and eth1) bridged to v119_BR:

-bash-3.2# cat ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v119_BR
BONDING_OPTS="primary=eth0 miimon=100 mode=1"
VLAN=no

PS:

As multiple VLANs are not defined on the switch, no alias interface is needed for the bridge v119_BR. If multiple VLANs are defined on eth0 & eth1, then we have to define bond0.117 (117 is the VLAN ID), which should look like the following:
[root@testhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0.117 #and then ifcfg-bond0.118 and more if these vlans are defined on switch
DEVICE=bond0.117
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v119_BR
VLAN=yes

[root@testhost ~]# cat ifcfg-bond0.122
DEVICE=bond0.122
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BRIDGE=v122_MGT
VLAN=yes

[root@testhost network-scripts]# cat ifcfg-bond0 #there's no BRIDGE here, as multiple Bridges will be specified in alias config
DEVICE=bond0
BOOTPROTO=none
USERCTL=no
ONBOOT=yes
BONDING_OPTS="primary=eth0 miimon=100 mode=1"

So what about the problem:

"device bond0/bond1 does not seem to be present delaying"

After some searching, I found that it's caused by the bonding module not being loaded. To resolve this, I tried:

-bash-3.2# uname -r
2.6.18-128.2.1.4.37.el5xen

-bash-3.2# find . -name "*bond*"
./kernel/drivers/net/tulip/winbond-840.ko
./kernel/drivers/net/bonding
./kernel/drivers/net/bonding/bonding.ko

insmod /lib/modules/2.6.18-128.2.1.4.37.el5xen/kernel/drivers/net/bonding/bonding.ko

Add the following lines in /etc/modprobe.conf

alias bond1 bonding
alias bond0 bonding

Then reboot the host. After these steps, the issue was resolved.

iptables rules after creating NAT to eth0 in virt-manager

While creating a new virtual network in the virt-manager GUI, I specified the following parameters:

Network: 192.168.72.0/24

Start: 192.168.72.128

End: 192.168.72.254

Forwarding to physical network(Destination: Physical device eth0, Mode NAT)

After this, the iptables rules will be created automatically:

[root@centos-doxer ~]# iptables-save #dump the contents of an IP Table. Use /etc/init.d/iptables save to save iptables rules to /etc/sysconfig/iptables
*filter
:INPUT ACCEPT [19172:2049589]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [16236:5446648]
-A INPUT -i virbr1 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr1 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr1 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr1 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.72.0/255.255.255.0 -i eth0 -o virbr1 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.72.0/255.255.255.0 -i virbr1 -o eth0 -j ACCEPT
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
COMMIT

*nat
:PREROUTING ACCEPT [16665:4067972]
:POSTROUTING ACCEPT [197:13739]
:OUTPUT ACCEPT [197:13739]
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -p tcp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -p udp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.72.0/255.255.255.0 -d ! 192.168.72.0/255.255.255.0 -o eth0 -j MASQUERADE
COMMIT
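
To read the POSTROUTING rules above in plain terms: a packet leaving through eth0 is masqueraded when its source is inside 192.168.72.0/24 and its destination is outside it. Here's a small Python sketch of that match logic only (it does not model port rewriting or connection tracking); the function name and the sample addresses are made up for this example.

```python
import ipaddress

VIRT_NET = ipaddress.ip_network("192.168.72.0/24")  # the virtual network above

def is_masqueraded(src, dst, out_iface):
    """Mirror the match part of the three MASQUERADE rules."""
    return (
        out_iface == "eth0"
        and ipaddress.ip_address(src) in VIRT_NET
        and ipaddress.ip_address(dst) not in VIRT_NET
    )

print(is_masqueraded("192.168.72.130", "8.8.8.8", "eth0"))         # guest to outside
print(is_masqueraded("192.168.72.130", "192.168.72.131", "eth0"))  # guest to guest
print(is_masqueraded("10.0.0.1", "8.8.8.8", "eth0"))               # not from the guest net
```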

And here's the bridge info:

[root@centos-doxer ~]# brctl show
bridge name bridge id STP enabled interfaces
virbr0 8000.0800273fc508 no eth0
virbr1 8000.000000000000 yes #one vm will have one interface here, format will be like virbr1-nic

[root@centos-doxer ~]# ifconfig virbr1
virbr1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:192.168.72.1 Bcast:192.168.72.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:3656 (3.5 KiB)

NAT forwarding for ssh and vncviewer

*nat
:PREROUTING ACCEPT [1:88]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A PREROUTING -p tcp --dport 5911 -d 192.168.40.10 -j DNAT --to-destination 172.16.0.101:5904 #now a VNC connection to <ip of eth0, 192.168.40.10> port 5911 is forwarded to 172.16.0.101:5904
-A PREROUTING -p tcp --dport 222 -d 192.168.40.10 -j DNAT --to-destination 172.16.0.101:22 #now ssh <ip of eth0, 192.168.40.10> -p 222 is forwarded to 172.16.0.101:22
-A POSTROUTING -o eth0 -j MASQUERADE #if eth0 has a private IP, you can also NAT it behind one public IP.
COMMIT

*filter
:INPUT ACCEPT [247:16364]
:FORWARD ACCEPT [163:13692]
:OUTPUT ACCEPT [228:18664]
COMMIT
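
The two PREROUTING rules above amount to a lookup table from (destination IP, destination port) to the internal target. Here's a minimal Python sketch of that mapping, using the addresses from the rules; the names DNAT_RULES and translate are invented for the example, and anything not listed is returned unchanged.

```python
# (destination IP, destination port) -> (internal IP, internal port)
DNAT_RULES = {
    ("192.168.40.10", 5911): ("172.16.0.101", 5904),  # VNC
    ("192.168.40.10", 222): ("172.16.0.101", 22),     # ssh
}

def translate(dst_ip, dst_port):
    """Return where a TCP connection to dst_ip:dst_port ends up."""
    return DNAT_RULES.get((dst_ip, dst_port), (dst_ip, dst_port))

print(translate("192.168.40.10", 222))
print(translate("192.168.40.10", 5911))
print(translate("192.168.40.10", 80))
```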