momo zone

A kernel tuner's blog

Monthly Archives: July 2014

Linux: problems with multiple NICs and multiple IPs on the same LAN (VLAN)

Topology
============================+
+-------------------------+ |       +-------------------+
| 192.168.13.99  (wlan0)  | |-------|                   |
+-------------------------+ |       |                   |      +---------------------+
                            |       |     SW/router     |      |                     |
                            |       |   192.168.13.1    |------|   192.168.13.103    |
+-------------------------+ |       |                   |      |                     |
| 192.168.13.104 (eth2)   | |-------|                   |      +---------------------+
+-------------------------+ |       +-------------------+
============================+

default gw 192.168.13.1 via wlan0

/proc/sys/net/ipv4/conf/all/rp_filter = 0

/proc/sys/net/ipv4/conf/all/arp_accept = 0

/proc/sys/net/ipv4/conf/all/arp_announce = 0

/proc/sys/net/ipv4/conf/all/arp_filter = 0

/proc/sys/net/ipv4/conf/all/arp_ignore = 0

/proc/sys/net/ipv4/conf/all/arp_notify = 0

 

/proc/sys/net/ipv4/conf/default/rp_filter = 0

/proc/sys/net/ipv4/conf/default/arp_accept = 0

/proc/sys/net/ipv4/conf/default/arp_announce = 0

/proc/sys/net/ipv4/conf/default/arp_filter = 0

/proc/sys/net/ipv4/conf/default/arp_ignore = 0

/proc/sys/net/ipv4/conf/default/arp_notify = 0

 

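1. When the SW/router sends an ARP request for 192.168.13.104, both eth2 and wlan0 reply, each with its own MAC address. Both reply because on Linux an IP address belongs to the host, not to a particular NIC. The SW/router ends up using the MAC address from whichever reply arrives last.

2. Likewise, when the SW/router sends an ARP request for 192.168.13.99, both eth2 and wlan0 reply with different MAC addresses, and the SW/router again keeps the MAC address from the last reply.

Taken together, points 1 and 2 reveal a serious problem. It looks as if there is an ARP conflict on the LAN, but there is not: because IP addresses belong to the host, the operating system simply answers on every interface. The other side of the problem is that the IP/MAC pairs cached in the SW's ARP table may not match reality, which misdirects traffic. For example, packets destined for the address on eth2 may all arrive on wlan0, leaving the other link idle.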

The fix is to set arp_ignore on both interfaces:

echo 1 > /proc/sys/net/ipv4/conf/wlan0/arp_ignore

echo 1 > /proc/sys/net/ipv4/conf/eth2/arp_ignore

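0 - (default): reply to ARP requests for any local IP address, whichever interface it is configured on

1 - reply only if the target IP address is a local address configured on the incoming interface

2 - reply only if the target IP address is a local address configured on the incoming interface, and the sender's IP address is in the same subnet on that interface

3 - do not reply for local addresses configured with scope host; only requests for global and link scope addresses are answered

4-7 - reserved

8 - do not reply for any local address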
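The modes listed above can be illustrated with a small decision function. This is only a sketch of the documented behaviour, not the kernel's implementation: the type `iface_addr`, the macro `IP4` and the function `arp_would_reply` are hypothetical names, each interface is assumed to carry a single address, and modes 3 to 7 are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical type: the single address assumed to be configured on an interface. */
struct iface_addr {
	uint32_t addr;
	uint32_t mask;
};

/*
 * Simplified model of the arp_ignore decision for an ARP request whose
 * target IP (tip) is already known to be local to the host.
 * in_if is the interface the request arrived on, sip is the sender's IP.
 * Returns true if a reply would be sent.
 */
bool arp_would_reply(int mode, struct iface_addr in_if,
		     uint32_t sip, uint32_t tip)
{
	switch (mode) {
	case 1:	/* target must be configured on the incoming interface */
		return tip == in_if.addr;
	case 2:	/* as mode 1, and the sender must be in that interface's subnet */
		return tip == in_if.addr &&
		       (sip & in_if.mask) == (in_if.addr & in_if.mask);
	case 8:	/* never reply */
		return false;
	default: /* mode 0; modes 3-7 are not modelled and are treated like 0 */
		return true;
	}
}
```

With the addresses from the topology diagram, a request for 192.168.13.104 arriving on wlan0 is answered in mode 0 but not in mode 1, which is the effect the two echo commands are after.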

A further problem then shows up: when 192.168.13.103 pings the address on eth2, the request arrives on eth2 but the reply leaves through wlan0. This is because the default gateway is configured on wlan0.

There is also another parameter, arp_filter. How does it differ from arp_ignore?

arp_filter - BOOLEAN
    1 - Allows you to have multiple network interfaces on the same
    subnet, and have the ARPs for each interface be answered
    based on whether or not the kernel would route a packet from
    the ARP'd IP out that interface (therefore you must use source
    based routing for this to work). In other words it allows control
    of which cards (usually 1) will respond to an arp request.

    0 - (default) The kernel can respond to arp requests with addresses
    from other interfaces. This may seem wrong but it usually makes
    sense, because it increases the chance of successful communication.
    IP addresses are owned by the complete host on Linux, not by
    particular interfaces. Only for more complex setups like load-
    balancing, does this behaviour cause problems.

    arp_filter for the interface will be enabled if at least one of
    conf/{all,interface}/arp_filter is set to TRUE,
    it will be disabled otherwise

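With arp_filter set to 1, the kernel likewise stops every interface from answering ARP with its own MAC address. In this respect it resembles arp_ignore, but the difference is that the kernel picks the single best interface to answer the request, and the choice is based on routing. In the current setup the default route is on wlan0, so wlan0 answers the ARP request and the MAC address in the reply is wlan0's as well. This avoids the ARP conflict, but the eth2 link is still wasted, so tuning this parameter does not help with our goal.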

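One more problem: with rp_filter=1 and arp_ignore=1, one of wlan0 and eth2 stops answering ARP. The cause is indeed reverse-path validation, which means that ARP processing also goes through route validation. The local routing table contains: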

192.168.13.0 0.0.0.0 255.255.255.0 U 0 0 0 wlan0
192.168.13.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2

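In this case eth2 does not answer. When eth2 receives an ARP request from 192.168.13.103 for its address, the kernel validates the reverse path: it swaps source and destination and finds that the wlan0 route entry comes first, yet the request actually arrived on eth2, so the request is dropped. The relevant code is arp_process() in net/ipv4/arp.c: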

static int arp_process(struct sk_buff *skb)
{
	struct net_device *dev = skb->dev;
	struct in_device *in_dev = __in_dev_get_rcu(dev);
	struct arphdr *arp;
	unsigned char *arp_ptr;
	struct rtable *rt;
	unsigned char *sha;
	__be32 sip, tip;
	u16 dev_type = dev->type;
	int addr_type;
	struct neighbour *n;
	struct net *net = dev_net(dev);

	/* arp_rcv below verifies the ARP header and verifies the device
	 * is ARP'able.
	 */

	if (in_dev == NULL)
		goto out;

	arp = arp_hdr(skb);

	switch (dev_type) {
	default:
		if (arp->ar_pro != htons(ETH_P_IP) ||
		    htons(dev_type) != arp->ar_hrd)
			goto out;
		break;
	case ARPHRD_ETHER:
	case ARPHRD_FDDI:
	case ARPHRD_IEEE802:
		/*
		 * ETHERNET, and Fibre Channel (which are IEEE 802
		 * devices, according to RFC 2625) devices will accept ARP
		 * hardware types of either 1 (Ethernet) or 6 (IEEE 802.2).
		 * This is the case also of FDDI, where the RFC 1390 says that
		 * FDDI devices should accept ARP hardware of (1) Ethernet,
		 * however, to be more robust, we'll accept both 1 (Ethernet)
		 * or 6 (IEEE 802.2)
		 */
		if ((arp->ar_hrd != htons(ARPHRD_ETHER) &&
		     arp->ar_hrd != htons(ARPHRD_IEEE802)) ||
		    arp->ar_pro != htons(ETH_P_IP))                    // validate the packet
			goto out;
		break;
	case ARPHRD_AX25:
		if (arp->ar_pro != htons(AX25_P_IP) ||
		    arp->ar_hrd != htons(ARPHRD_AX25))
			goto out;
		break;
	case ARPHRD_NETROM:
		if (arp->ar_pro != htons(AX25_P_IP) ||
		    arp->ar_hrd != htons(ARPHRD_NETROM))
			goto out;
		break;
	}

	/* Understand only these message types */

	if (arp->ar_op != htons(ARPOP_REPLY) &&
	    arp->ar_op != htons(ARPOP_REQUEST))
		goto out;

/*
 *	Extract fields
 */
	arp_ptr = (unsigned char *)(arp + 1);
	sha	= arp_ptr;
	arp_ptr += dev->addr_len;
	memcpy(&sip, arp_ptr, 4);                // sender IP
	arp_ptr += 4;
	switch (dev_type) {
#if IS_ENABLED(CONFIG_FIREWIRE_NET)
	case ARPHRD_IEEE1394:
		break;
#endif
	default:
		arp_ptr += dev->addr_len;
	}
	memcpy(&tip, arp_ptr, 4);
/*
 *	Check for bad requests for 127.x.x.x and requests for multicast
 *	addresses.  If this is one such, delete it.
 */
	if (ipv4_is_multicast(tip) ||
	    (!IN_DEV_ROUTE_LOCALNET(in_dev) && ipv4_is_loopback(tip)))
		goto out;

/*
 *     Special case: We must set Frame Relay source Q.922 address
 */
	if (dev_type == ARPHRD_DLCI)
		sha = dev->broadcast;

/*
 *  Process entry.  The idea here is we want to send a reply if it is a
 *  request for us or if it is a request for someone else that we hold
 *  a proxy for.  We want to add an entry to our cache if it is a reply
 *  to us or if it is a request for our address.
 *  (The assumption for this last is that if someone is requesting our
 *  address, they are probably intending to talk to us, so it saves time
 *  if we cache their address.  Their address is also probably not in
 *  our cache, since ours is not in their cache.)
 *
 *  Putting this another way, we only care about replies if they are to
 *  us, in which case we add them to the cache.  For requests, we care
 *  about those for us and those for our proxies.  We reply to both,
 *  and in the case of requests for us we add the requester to the arp
 *  cache.
 */

	/* Special case: IPv4 duplicate address detection packet (RFC2131) */   // IP address conflict detection
	if (sip == 0) {
		if (arp->ar_op == htons(ARPOP_REQUEST) &&
		    inet_addr_type(net, tip) == RTN_LOCAL &&
		    !arp_ignore(in_dev, sip, tip))
			arp_send(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, sha,
				 dev->dev_addr, sha);
		goto out;
	}

        // ARP request handling path
	if (arp->ar_op == htons(ARPOP_REQUEST) &&
	    ip_route_input_noref(skb, tip, sip, 0, dev) == 0) {  // calls fib_validate_source() internally for reverse-path validation

		rt = skb_rtable(skb);
		addr_type = rt->rt_type;

                // if we are not forwarding, this is always the branch taken
		if (addr_type == RTN_LOCAL) {
			int dont_send;

			dont_send = arp_ignore(in_dev, sip, tip);
			if (!dont_send && IN_DEV_ARPFILTER(in_dev))
				dont_send = arp_filter(sip, tip, dev);
			if (!dont_send) {
				n = neigh_event_ns(&arp_tbl, sha, &sip, dev);
				if (n) {
					arp_send(ARPOP_REPLY, ETH_P_ARP, sip,
						 dev, tip, sha, dev->dev_addr,
						 sha);
					neigh_release(n);
				}
			}
			goto out;
		} else if (IN_DEV_FORWARD(in_dev)) {
			if (addr_type == RTN_UNICAST  &&
			    (arp_fwd_proxy(in_dev, dev, rt) ||
			     arp_fwd_pvlan(in_dev, dev, rt, sip, tip) ||
			     (rt->dst.dev != dev &&
			      pneigh_lookup(&arp_tbl, net, &tip, dev, 0)))) {
				n = neigh_event_ns(&arp_tbl, sha, &sip, dev);
				if (n)
					neigh_release(n);

				if (NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED ||
				    skb->pkt_type == PACKET_HOST ||
				    in_dev->arp_parms->proxy_delay == 0) {
					arp_send(ARPOP_REPLY, ETH_P_ARP, sip,
						 dev, tip, sha, dev->dev_addr,
						 sha);
				} else {
					pneigh_enqueue(&arp_tbl,
						       in_dev->arp_parms, skb);
					return 0;
				}
				goto out;
			}
		}
	}

	/* Update our ARP tables */

	n = __neigh_lookup(&arp_tbl, &sip, dev, 0);

	if (IN_DEV_ARP_ACCEPT(in_dev)) {
		/* Unsolicited ARP is not accepted by default.
		   It is possible, that this option should be enabled for some
		   devices (strip is candidate)
		 */
		if (n == NULL &&
		    (arp->ar_op == htons(ARPOP_REPLY) ||
		     (arp->ar_op == htons(ARPOP_REQUEST) && tip == sip)) &&
		    inet_addr_type(net, sip) == RTN_UNICAST)
			n = __neigh_lookup(&arp_tbl, &sip, dev, 1);
	}

	if (n) {
		int state = NUD_REACHABLE;
		int override;

		/* If several different ARP replies follows back-to-back,
		   use the FIRST one. It is possible, if several proxy
		   agents are active. Taking the first reply prevents
		   arp trashing and chooses the fastest router.
		 */
		override = time_after(jiffies, n->updated + n->parms->locktime);

		/* Broadcast replies and request packets
		   do not assert neighbour reachability.
		 */
		if (arp->ar_op != htons(ARPOP_REPLY) ||
		    skb->pkt_type != PACKET_HOST)
			state = NUD_STALE;
		neigh_update(n, sha, state,
			     override ? NEIGH_UPDATE_F_OVERRIDE : 0);
		neigh_release(n);
	}

out:
	consume_skb(skb);
	return 0;
}
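The reverse-path explanation given before the kernel listing can be reproduced with a toy model. This is a sketch under stated assumptions, not the logic of fib_validate_source(): the names `route`, `route_lookup` and `rp_filter_strict_accepts` are hypothetical, and ties between equal prefixes are simply resolved in favour of the earlier table entry, as the explanation assumes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build an IPv4 address in host byte order from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, simplified route entry: destination prefix and output device. */
struct route {
	uint32_t dest;
	uint32_t mask;
	const char *dev;
};

/*
 * Return the device of the route that would be used to reach addr.
 * The longest matching prefix wins; on a tie the earlier entry wins.
 * Returns NULL if no entry matches.
 */
const char *route_lookup(const struct route *tbl, size_t n, uint32_t addr)
{
	const struct route *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if ((addr & tbl[i].mask) != (tbl[i].dest & tbl[i].mask))
			continue;
		if (best == NULL || tbl[i].mask > best->mask)
			best = &tbl[i];
	}
	return best ? best->dev : NULL;
}

/*
 * Strict reverse-path check: accept a packet from src that arrived on
 * in_dev only if the route back to src points out of the same device.
 */
bool rp_filter_strict_accepts(const struct route *tbl, size_t n,
			      uint32_t src, const char *in_dev)
{
	const char *back = route_lookup(tbl, n, src);

	return back != NULL && strcmp(back, in_dev) == 0;
}
```

Fed with the two 192.168.13.0/24 entries shown earlier (wlan0 first, then eth2), the model accepts a request from 192.168.13.103 on wlan0 and rejects the same request on eth2, which matches the observed behaviour with rp_filter=1.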

2014 Qinghai trip

Shot with a Canon G12.

[Photos: IMG_1031, IMG_1033, IMG_1043, IMG_1087, IMG_1089, IMG_1133, IMG_1155, IMG_1162, IMG_1164, IMG_1194, IMG_1196, IMG_1199, IMG_1249, IMG_1082, IMG_1073, IMG_1070, IMG_1045, IMG_1044]

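Sending packets from a specific interface on a multihomed host

I currently know of two ways.
The first, which is well known, is to use bind() to bind the socket to a specific address (for a packet socket, as here, to a specific interface):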

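#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>		/* htons */
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>		/* struct ifreq */
#include <net/ethernet.h>	/* ETH_P_ALL */
#include <netpacket/packet.h>	/* struct sockaddr_ll */

int main(void)
{
	struct sockaddr_ll sll;
	struct ifreq ifr;
	const char *dev = "eth0";
	int fd;

	/* Packet sockets require CAP_NET_RAW, so this normally runs as root. */
	fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (fd < 0) {
		perror("socket");
		return 1;
	}

	/* Translate the device name into an interface index. */
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, dev, sizeof(ifr.ifr_name) - 1);
	if (ioctl(fd, SIOCGIFINDEX, &ifr) < 0) {
		perror("ioctl(SIOCGIFINDEX)");
		close(fd);
		return 1;
	}

	memset(&sll, 0, sizeof(sll));
	sll.sll_family = AF_PACKET;
	sll.sll_protocol = htons(ETH_P_ALL);
	sll.sll_ifindex = ifr.ifr_ifindex;

	/* Bind the packet socket to that interface. */
	if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
		perror("bind");
		close(fd);
		return 1;
	}

	close(fd);
	return 0;
}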
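For an ordinary UDP socket, the same bind() approach is usually expressed by binding to the local IP address of the desired interface instead of its index. A minimal sketch follows; the function name `udp_socket_bound_to` is hypothetical and the address passed in is whatever the caller has configured.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to the given local IPv4 address, with the
 * port chosen by the kernel. Returns the descriptor, or -1 on failure.
 */
int udp_socket_bound_to(const char *src_ip)
{
	struct sockaddr_in local;
	int fd;

	memset(&local, 0, sizeof(local));
	local.sin_family = AF_INET;
	local.sin_port = htons(0);	/* let the kernel pick a port */
	if (inet_pton(AF_INET, src_ip, &local.sin_addr) != 1)
		return -1;		/* not a valid dotted-quad address */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Binding this way fixes the source address of outgoing packets. Which interface they actually leave from is still decided by the routing table, so on a host with two interfaces in the same subnet additional routing configuration may be needed.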

The other way is to use the socket option SO_BINDTODEVICE:

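#include <stdio.h>
#include <string.h>
#include <net/if.h>
#include <sys/socket.h>

/* Bind an already-created socket to a device such as "eth0". */
int bind_to_device(int sock, const char *dev)
{
	struct ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
	if (setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, &ifr, sizeof(ifr)) < 0) {
		perror("SO_BINDTODEVICE failed");
		return -1;
	}
	return 0;
}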
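However, this method has two problems:
1. SO_BINDTODEVICE is not defined by POSIX, so it is not portable.
2. It does not replace bind(): it only selects the device used for sending, so returning packets can still be dropped by rp_filter.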

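In Perl the following works (25 is the numeric value of SO_BINDTODEVICE on x86 and most other Linux architectures):
setsockopt($sock, SOL_SOCKET, 25, pack("Z*", "eth0"))
If you use IO::Socket, do not specify the IP address when calling new(), and then call:
$sock->sockopt(25,pack("Z*", "eth0"));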