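tcpdump's support for direction.
The kernel code has one key piece of data: the pkt_type field of the skb.
On the receive and transmit paths this field is set to PACKET_OUTGOING or to one of the other values.
The value is passed up to user space, where libpcap uses it to decide whether the packet's direction is the one that was requested.
Possible values of pkt_type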
#define PACKET_HOST 0 /* To us */
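The direction check described above can be modelled in a few lines of user-space C. This is only a sketch, not libpcap's actual code: the `PKT_*` names are local stand-ins that mirror the `PACKET_*` values in `<linux/if_packet.h>`, and `enum direction` and `direction_wanted` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring the PACKET_* values in <linux/if_packet.h>. */
enum {
    PKT_HOST      = 0, /* to us */
    PKT_BROADCAST = 1, /* to all */
    PKT_MULTICAST = 2, /* to group */
    PKT_OTHERHOST = 3, /* to someone else */
    PKT_OUTGOING  = 4  /* sent by this host */
};

/* Hypothetical capture-direction setting (in, out, or both). */
enum direction { DIR_IN, DIR_OUT, DIR_INOUT };

/* Return true if a packet with this pkt_type matches the wanted direction. */
static bool direction_wanted(unsigned char pkt_type, enum direction want)
{
    bool outgoing = (pkt_type == PKT_OUTGOING);

    assert(want == DIR_IN || want == DIR_OUT || want == DIR_INOUT);
    switch (want) {
    case DIR_IN:
        return !outgoing;   /* everything that was not sent by us */
    case DIR_OUT:
        return outgoing;    /* only packets marked PACKET_OUTGOING */
    default:
        return true;        /* both directions */
    }
}
```

With this model, `direction_wanted(PKT_OUTGOING, DIR_IN)` is false, so a capture restricted to incoming traffic would drop packets that the host itself transmitted.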
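The purpose of vhost-net is to avoid one round of QEMU scheduling on the host kernel, which improves performance.
xmit: the VM's packets are sent out directly from the host kernel.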
rcv:
vhost_poll is the most important data structure in vhost.
/* Poll a file (eventfd or socket) */
==> ipvlan_start_xmit
All received packets are taken by the rx_handler, ipvlan_handle_frame.
It looks up the destination ipvlan port (net_device) by the destination IPv4/IPv6 address and delivers the packet to it.
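The lookup step can be pictured with a small user-space model. This is a sketch under simplifying assumptions, not the driver's code: `struct port` and `lookup_port` are hypothetical, each port owns a single IPv4 address, and the search is a plain linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ipvlan port: one IPv4 address per device. */
struct port {
    uint32_t    addr;   /* IPv4 address, host byte order */
    const char *name;   /* device name */
};

/* Return the port that owns dst, or NULL when no port has that address. */
static const struct port *lookup_port(const struct port *ports, size_t n,
                                      uint32_t dst)
{
    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ports[i].addr == dst)
            return &ports[i];
    }
    return NULL;
}
```

A NULL result corresponds to a packet whose destination address does not belong to any ipvlan port, so it cannot be delivered to one.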
==> ipvlan_handle_frame
ipvlan_start_xmit
static const struct net_device_ops ipvlan_netdev_ops = {
int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
For the ixgbe NIC, we want to assign one tx hardware queue to each CPU, and the tx softirq on a CPU should use the corresponding hardware queue.
Each packet selects a software queue in dev_queue_xmit, so we rewrote the ixgbe driver's ndo_select_queue (ixgbe_select_queue) to return the index of the current CPU (0-based) when a packet selects its queue.
This way each CPU uses its own tx queue.
However, we found that some packets carried a queue index that did not match the CPU that actually sent them.
For example, a packet whose queue index is 5 is sent by cpu3, so cpu3 operates on tx hardware queue 5, which should only be touched by cpu5.
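The mismatch can be stated precisely with a small user-space model. This is only a sketch: `struct tx_pkt`, `select_queue_for_cpu`, `queue_mismatch`, and the `NR_MODEL_CPUS` limit are hypothetical names for illustration, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 8  /* assumed CPU count for this model */

/* A packet remembers the queue chosen when it selected its queue. */
struct tx_pkt {
    int queue_index;
};

/* Model of the rewritten select-queue hook: use the current CPU's queue. */
static int select_queue_for_cpu(int cur_cpu)
{
    assert(cur_cpu >= 0 && cur_cpu < NR_MODEL_CPUS);
    return cur_cpu;
}

/* True when the transmitting CPU would touch a queue it does not own. */
static bool queue_mismatch(const struct tx_pkt *p, int xmit_cpu)
{
    return p->queue_index != xmit_cpu;
}
```

In the example from the text, the packet selected its queue on cpu5, so `queue_index` is 5, and `queue_mismatch(&p, 3)` is true when cpu3 performs the transmit.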
static const struct net_device_ops ixgbe_netdev_ops = {
The recorded value is queue_index + 1; 0 means that no queue has been recorded.
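The "+1" encoding lets a single unsigned field hold both the "nothing recorded" state and every valid queue index, including queue 0. A minimal sketch, with hypothetical names (`struct skb_model`, `record_queue`, `queue_recorded`, `get_queue`) that only imitate the idea:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the field that stores the queue; 0 means "not recorded". */
struct skb_model {
    uint16_t queue_mapping;
};

/* Store queue_index + 1 so that queue 0 is distinguishable from "unset". */
static void record_queue(struct skb_model *s, uint16_t queue_index)
{
    assert(queue_index < UINT16_MAX);  /* +1 must not wrap around to 0 */
    s->queue_mapping = (uint16_t)(queue_index + 1);
}

/* True once a queue index has been stored. */
static bool queue_recorded(const struct skb_model *s)
{
    return s->queue_mapping != 0;
}

/* Undo the +1 offset; only valid after a queue has been recorded. */
static uint16_t get_queue(const struct skb_model *s)
{
    assert(queue_recorded(s));
    return (uint16_t)(s->queue_mapping - 1);
}
```

A zero-initialized structure therefore reads as "not recorded" without any extra flag.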
#### Call trace
> ixgbe_poll
#### On Red Hat 5, why tcpdump does not work on bonding
OS: Red Hat 5.
There are two 82599 interfaces, eth0 and eth1.
Both are slaves of bond0, and eth1 is the backup of eth0.
We ping the default gateway from the test machine.
ping works, and tcpdump on bond0 shows both the ICMP request and the ICMP reply packets.
tcpdump on eth0, however, shows only the ICMP requests, and tcpdump on eth1 shows no packets at all.
Take handle_level_irq as an example.
===> handle_level_irq
static inline int __must_check
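register value --> irq (int) --> struct irq_desc
==> When an interrupt occurs, a register holds the vector value of the interrupt source.
Note: handle_irq here is not the actual interrupt handler. It is one of a few broad classes of interrupt-controller handling functions, for example for the 82599, MSI, and so on.
For a detailed analysis, see: irq study1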
==> handle_level_irq
The action->handler here is the interrupt handler that we registered with request_irq.
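The two levels, a per-descriptor flow handler and the driver handlers chained under it, can be modelled in user space. This is a simplified sketch, not kernel code: every name ending in `_model`, plus `level_flow`, `register_handler`, `dispatch_irq`, `count_handler`, and `NR_MODEL_IRQS`, is hypothetical, and the masking and acknowledging that a real level-triggered flow handler performs is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MODEL_IRQS 16  /* assumed table size for this model */

typedef int (*irq_handler_fn)(int irq, void *dev_id);

/* One registered driver handler; shared IRQs chain several of these. */
struct irq_action_model {
    irq_handler_fn handler;
    void *dev_id;
    struct irq_action_model *next;
};

/* Per-IRQ descriptor: a flow handler plus the chain of driver actions. */
struct irq_desc_model {
    void (*handle_irq)(int irq, struct irq_desc_model *desc);
    struct irq_action_model *action;
    int handled;  /* how many handler calls reported success */
};

static struct irq_desc_model irq_table[NR_MODEL_IRQS];

/* Flow handler: walk the action chain and call each driver handler. */
static void level_flow(int irq, struct irq_desc_model *desc)
{
    for (struct irq_action_model *a = desc->action; a != NULL; a = a->next)
        desc->handled += a->handler(irq, a->dev_id);
}

/* Registration: attach an action and install the flow handler. */
static void register_handler(int irq, struct irq_action_model *a)
{
    assert(irq >= 0 && irq < NR_MODEL_IRQS);
    a->next = irq_table[irq].action;
    irq_table[irq].action = a;
    irq_table[irq].handle_irq = level_flow;
}

/* Entry point: irq number -> descriptor -> flow handler -> driver handlers. */
static void dispatch_irq(int irq)
{
    assert(irq >= 0 && irq < NR_MODEL_IRQS);
    struct irq_desc_model *desc = &irq_table[irq];
    if (desc->handle_irq != NULL)
        desc->handle_irq(irq, desc);
}

/* Example driver handler: count invocations through dev_id, report handled. */
static int count_handler(int irq, void *dev_id)
{
    (void)irq;
    (*(int *)dev_id)++;
    return 1;
}
```

Dispatching an IRQ that has no registered action does nothing in this model, because its descriptor has no flow handler installed.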
For a detailed analysis, see: irq study2