pernet ops

    int register_pernet_subsys(struct pernet_operations *ops)

####For example
inet6_init registers its per-network-namespace operations, inet6_net_ops, through register_pernet_subsys.

    static struct pernet_operations inet6_net_ops = {

Promiscuous mode is tracked by one bit, IFF_PROMISC, in the flags field of struct net_device. The bit indicates whether the device is currently in promiscuous mode.

    /* Standard interface flags (netdevice->flags). */

There are two kinds of operations that can make a NIC enter or leave promiscuous mode.

ip command. Running `promisc on` several times still needs only one `promisc off` to recover.

    ip link set dev eth0 promisc on


tcpdump command. When tcpdump starts, it puts the device into promiscuous mode, and just before it exits, it takes the device out of promiscuous mode again.

All of this is done by calling the kernel API dev_set_promiscuity.
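
The behaviour described above (repeated `promisc on` needing a single `promisc off`, while tcpdump enters and leaves on its own) fits a reference count sitting behind the flag. The following is a minimal userspace sketch of that idea, not kernel code. The struct and all `model_*` names are hypothetical. It assumes that the ip path only acts when the user-requested flag actually changes, and that each tcpdump-style caller adds or removes exactly one reference.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_IFF_PROMISC 0x100u /* same bit value as IFF_PROMISC */

/* Hypothetical stand-in for the few net_device fields involved. */
struct promisc_model {
    unsigned int flags;  /* effective flags, as the driver would see them */
    unsigned int gflags; /* flags explicitly requested by the user */
    int promiscuity;     /* number of active promiscuity references */
};

/* Models dev_set_promiscuity(dev, inc): adjust the count, then derive the flag. */
static void model_set_promiscuity(struct promisc_model *dev, int inc)
{
    assert(inc >= 0 || dev->promiscuity > 0); /* the count never goes negative */
    dev->promiscuity += inc;
    if (dev->promiscuity > 0)
        dev->flags |= MODEL_IFF_PROMISC;
    else
        dev->flags &= ~MODEL_IFF_PROMISC;
}

/* Models "ip link set dev X promisc on|off": a no-op unless the request changes. */
static void model_ip_link_promisc(struct promisc_model *dev, bool on)
{
    bool requested = (dev->gflags & MODEL_IFF_PROMISC) != 0;

    if (on == requested)
        return;
    dev->gflags ^= MODEL_IFF_PROMISC;
    model_set_promiscuity(dev, on ? 1 : -1);
}

/* True while at least one reference keeps the device promiscuous. */
static bool model_is_promisc(const struct promisc_model *dev)
{
    return (dev->flags & MODEL_IFF_PROMISC) != 0;
}
```

In this model, three `promisc on` requests followed by one `promisc off` leave the device out of promiscuous mode, unless a tcpdump-style reference is still held.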
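When handling the first SYN packet of a TCP connection, tcp_conn_request performs three queue-length checks, each under a different condition.

- inet_csk_reqsk_queue_is_full: if the number of half-open connections exceeds sk_max_ack_backlog, the packet is dropped.
- sk_acceptq_is_full: if the accept queue length exceeds sk_max_ack_backlog, the packet is dropped.
- sysctl_max_syn_backlog versus inet_csk_reqsk_queue_len, only when sysctl_tcp_syncookies is disabled (value 0): if the queue length exceeds 3/4 of sysctl_max_syn_backlog, the packet is dropped.

Where:

- sysctl_max_syn_backlog: at least 128 at initialization. If ehash_entries/128 is larger than 128, the larger value is used. ehash_entries is the number of TCP hash buckets.
- sysctl_tcp_syncookies: the initial value is 1.

Although the half-open connection queue is no longer maintained, this counter is still incremented every time a req socket is created.
So if the number of half-open connections exceeds the maximum, sk_max_ack_backlog, cookies are used (sysctl_tcp_syncookies is 1 or 2); if cookies are not supported, the packet is dropped.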

    static inline int inet_csk_reqsk_queue_len(const struct sock *sk)

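Current kernels enable the syncookie mechanism by default (sysctl_tcp_syncookies is 1), so a queue overflow triggers syncookies.
Only when syncookies are turned off (0) are SYN packets dropped on queue overflow.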
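
The three checks and the cookie fallback can be summarised as a small decision function. This is a simplified userspace sketch, not the kernel's tcp_conn_request. The struct, enum and function names are hypothetical, and the sketch leaves out every other condition the kernel evaluates. It assumes the half-open check fires once the count reaches sk_max_ack_backlog, and that a sysctl_tcp_syncookies value of 2 means cookies are always sent.

```c
#include <assert.h>
#include <stdbool.h>

enum syn_verdict { SYN_ACCEPT, SYN_COOKIE, SYN_DROP };

/* Hypothetical snapshot of the values the checks look at. */
struct listener_model {
    int syncookies;      /* sysctl_tcp_syncookies: 0, 1 or 2 */
    int max_syn_backlog; /* sysctl_max_syn_backlog */
    int max_ack_backlog; /* sk_max_ack_backlog */
    int reqsk_queue_len; /* half-open count, inet_csk_reqsk_queue_len() */
    int ack_backlog;     /* current accept queue length */
};

/* Initial sysctl_max_syn_backlog: the larger of 128 and ehash_entries / 128. */
static int initial_max_syn_backlog(int ehash_entries)
{
    int scaled = ehash_entries / 128;

    return scaled > 128 ? scaled : 128;
}

static enum syn_verdict syn_verdict(const struct listener_model *l)
{
    bool want_cookie = false;

    assert(l->syncookies >= 0 && l->syncookies <= 2);

    /* Check 1: half-open queue full; fall back to cookies if enabled. */
    if (l->syncookies == 2 || l->reqsk_queue_len >= l->max_ack_backlog) {
        if (l->syncookies == 0)
            return SYN_DROP;
        want_cookie = true;
    }

    /* Check 2: accept queue full; cookies do not help here. */
    if (l->ack_backlog > l->max_ack_backlog)
        return SYN_DROP;

    /* Check 3: cookies disabled and less than 1/4 of the SYN backlog left. */
    if (!want_cookie && l->syncookies == 0 &&
        l->max_syn_backlog - l->reqsk_queue_len < (l->max_syn_backlog >> 2))
        return SYN_DROP;

    return want_cookie ? SYN_COOKIE : SYN_ACCEPT;
}
```

With the default sysctl_tcp_syncookies of 1, a full half-open queue yields SYN_COOKIE rather than SYN_DROP, which matches the conclusion above.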
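Sometimes we need to configure more than one IP address on a network interface. There are two ways to do this: with ifconfig or with ip.
Both end up in the same kernel implementation and the same storage location, and each tool can read and modify what the other one configured.
For example, to configure 9.9.9.199/24 on the lo interface and give it the alias lo:9:

    [root@VM-0-12-centos ~]# ifconfig lo:9 9.9.9.199/24

What ifconfig configured can also be viewed with the ip addr command:

    [root@VM-0-12-centos ~]# ip a show dev lo

The ip command treats the interface alias as a label and prints it after the scope field.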
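How the kernel maintains a NIC's RUNNING state

The main parts:

- netif_carrier_on and netif_carrier_off. These functions update the __LINK_STATE_NOCARRIER flag bit in netdev->state.
- The watchdog, which calls ndo_tx_timeout to take emergency remedial action, such as resetting the NIC queues. This watchdog is not the same as the watchdog inside the NIC driver; the exact difference still needs investigation.
- linkwatch_do_dev(struct net_device *dev), which does the processing. It updates netdev->operstate, and it also calls the generic dev_activate or dev_deactivate to handle the NIC queues. Here we focus on the part related to the NIC state bits and ignore the queue handling. rfc2863_policy and default_operstate are covered in detail later.

netif_carrier_on and netif_carrier_off: two generic kernel helper functions whose behaviour is essentially symmetric.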

Summary:

The __LINK_STATE_NOCARRIER bit in dev->state is the sole criterion for whether the carrier is OK.
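
The symmetry of the two helpers, and the fact that one bit decides whether the carrier is OK, can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation. The struct, the `model_*` names and the bit index are hypothetical, and the event counter only stands in for the link-watch notification that follows a real state change.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NOCARRIER_BIT 2 /* bit index chosen for this sketch */

struct carrier_model {
    unsigned long state;           /* stands in for dev->state */
    unsigned int linkwatch_events; /* link-change events fired so far */
};

/* Carrier is OK exactly when the NOCARRIER bit is clear. */
static bool model_carrier_ok(const struct carrier_model *dev)
{
    return !(dev->state & (1UL << MODEL_NOCARRIER_BIT));
}

/* Clear the bit; fire an event only if the bit was actually set before. */
static void model_carrier_on(struct carrier_model *dev)
{
    if (!model_carrier_ok(dev)) {
        dev->state &= ~(1UL << MODEL_NOCARRIER_BIT);
        dev->linkwatch_events++;
    }
    assert(model_carrier_ok(dev));
}

/* Mirror image: set the bit; fire an event only on a real transition. */
static void model_carrier_off(struct carrier_model *dev)
{
    if (model_carrier_ok(dev)) {
        dev->state |= 1UL << MODEL_NOCARRIER_BIT;
        dev->linkwatch_events++;
    }
    assert(!model_carrier_ok(dev));
}
```

Calling either helper twice in a row changes nothing the second time, so only genuine link transitions produce an event in this model.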
__struct_group
While browsing the IPv6 code, I came across a new trick, __struct_group:

    struct ipv6hdr {

###Usage
Now let's look at how it is used:

    /* copy IPv6 saddr & daddr to flow_keys, possibly using 64bit load/store
     * Equivalent to : flow->v6addrs.src = iph->saddr;
     *                 flow->v6addrs.dst = iph->daddr;
     */
    static inline void iph_to_flow_copy_v6addrs(struct flow_keys *flow,
                                                const struct ipv6hdr *iph)
    {
        ...
        memcpy(&flow->addrs.v6addrs, &iph->addrs, sizeof(flow->addrs.v6addrs));
        ...

###Definition
```c
/**
 * __struct_group() - Create a mirrored named and anonyomous struct
 *
 * @TAG: The tag name for the named sub-struct (usually empty)
 * @NAME: The identifier name of the mirrored sub-struct
 * @ATTRS: Any struct attributes (usually empty)
 * @MEMBERS: The member declarations for the mirrored structs
 *
 * Used to create an anonymous union of two structs with identical layout
 * and size: one anonymous and one named. The former's members can be used
 * normally without sub-struct naming, and the latter can be used to
 * reason about the start, end, and size of the group of struct members.
 * The named struct can also be explicitly tagged for layer reuse, as well
 * as both having struct attributes appended.
 */
#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \
	union { \
		struct { MEMBERS } ATTRS; \
		struct TAG { MEMBERS } ATTRS NAME; \
	} ATTRS
```
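
To see the effect outside the kernel, here is a small userspace sketch that copies the macro under a different name and applies it to a made-up header. `STRUCT_GROUP`, `addr16`, `hdr_model`, `flow_model` and `copy_addrs` are all hypothetical names for illustration, and the `MEMBERS...` form relies on the GNU C named variadic macro extension (GCC and Clang accept it).

```c
#include <assert.h>
#include <string.h>

/* Userspace copy of the macro, renamed to stay out of the reserved namespace. */
#define STRUCT_GROUP(TAG, NAME, ATTRS, MEMBERS...) \
    union { \
        struct { MEMBERS } ATTRS; \
        struct TAG { MEMBERS } ATTRS NAME; \
    } ATTRS

struct addr16 {
    unsigned char b[16];
};

/* Made-up header: saddr and daddr are also reachable as one unit, h->addrs. */
struct hdr_model {
    unsigned int flow;
    STRUCT_GROUP(/* no tag */, addrs, /* no attrs */,
        struct addr16 saddr;
        struct addr16 daddr;
    );
};

struct flow_model {
    struct addr16 src;
    struct addr16 dst;
};

/* Copy both addresses with one memcpy, sized by the named group. */
static void copy_addrs(struct flow_model *flow, const struct hdr_model *h)
{
    static_assert(sizeof(*flow) == sizeof(h->addrs),
                  "group and destination must have the same size");
    memcpy(flow, &h->addrs, sizeof(h->addrs));
}
```

The members stay accessible as `h->saddr` and `h->daddr`, while `h->addrs` gives the copy a single object whose size the compiler knows, so the memcpy no longer appears to run past the end of `saddr`.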