How to create an inet socket

Data Structure

static const struct net_proto_family inet_family_ops = {
    .family = PF_INET,
    .create = inet_create,
    .owner  = THIS_MODULE,
};

call trace

> inet_create
> > look up the matching inetsw[] entry by type/protocol
> > sock->ops = answer->ops;
> > sk = sk_alloc(net, PF_INET, GFP_KERNEL, answer_prot);
> > sock_init_data(sock, sk);
> > initialize the sk

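The first step of the trace, picking a table entry by type and protocol, can be sketched in user space. This is a simplified model, not the kernel code: struct proto_entry, proto_table and lookup_proto are hypothetical names, and the wildcard rule (protocol 0 matches on either side) is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one inetsw entry. */
struct proto_entry {
    int type;          /* socket type */
    int protocol;      /* IP protocol number; 0 acts as a wildcard */
    const char *name;  /* illustrative label only */
};

enum { T_STREAM = 1, T_DGRAM = 2, T_RAW = 3 };  /* same values as SOCK_* on Linux */
enum { P_ANY = 0, P_TCP = 6, P_UDP = 17 };      /* same values as IPPROTO_* */

static const struct proto_entry proto_table[] = {
    { T_STREAM, P_TCP, "tcp" },
    { T_DGRAM,  P_UDP, "udp" },
    { T_RAW,    P_ANY, "raw" },
};

/*
 * Return the first entry whose type matches and whose protocol either
 * matches exactly or is matched through a wildcard on one side.
 * Returns NULL when nothing fits.
 */
static const struct proto_entry *lookup_proto(int type, int protocol)
{
    size_t n = sizeof(proto_table) / sizeof(proto_table[0]);

    for (size_t i = 0; i < n; i++) {
        const struct proto_entry *e = &proto_table[i];

        if (e->type != type)
            continue;
        if (e->protocol == protocol ||  /* exact match */
            protocol == P_ANY ||        /* caller asked for the default */
            e->protocol == P_ANY)       /* entry accepts any protocol */
            return e;
    }
    return NULL;
}
```

For example, lookup_proto(T_DGRAM, P_ANY) returns the "udp" entry, while lookup_proto(T_STREAM, P_UDP) returns NULL because no stream entry speaks UDP.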

system call socket

Summary

The socket system call does two things:

  1. Create a struct socket *sock.
    This is mainly done by __sock_create, which allocates a struct socket *sock
    and then initializes it with the create method of net_families[family].
  2. Map the sock to a file descriptor with sock_map_fd.
    TODO…..
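
Seen from user space, the two steps above show up as a single call that returns a file descriptor. The following sketch uses only the standard socket(), getsockopt() and close() calls; the helper name inet_socket_roundtrip is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create an IPv4 stream socket and ask the kernel which type is
 * attached to the returned descriptor.
 * Returns 1 if the descriptor reports SOCK_STREAM, 0 if it does not,
 * and -1 if socket() itself failed.
 */
static int inet_socket_roundtrip(void)
{
    int type = 0;
    socklen_t len = sizeof(type);
    int ok;

    int fd = socket(AF_INET, SOCK_STREAM, 0);  /* both steps happen here */
    if (fd < 0)
        return -1;

    ok = getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == 0 &&
         type == SOCK_STREAM;
    close(fd);
    return ok ? 1 : 0;
}
```

The descriptor is an ordinary fd, which is why close() works on it like on any file.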

call trace:

> socket
> > sock_create
> > > __sock_create
> > > > sock = sock_alloc();
> > > > sock->type = type;
> > > > rcu_read_lock();
> > > > pf = rcu_dereference(net_families[family]);
> > > > try_module_get(pf->owner)
> > > > rcu_read_unlock();
> > > > pf->create(net, sock, protocol, kern);
> > > > module_put(pf->owner)
> > sock_map_fd

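The dispatch through net_families[family] in the trace can be modeled in user space as a table of function pointers. This is a minimal sketch under stated assumptions: every *_demo name is hypothetical, RCU locking is left out, and a plain counter stands in for the module reference taken by try_module_get and dropped by module_put.

```c
#include <stddef.h>

#define NPROTO_DEMO 4                        /* hypothetical table size */

struct sock_demo { int type; int family; };  /* stand-in for struct socket */
struct module_demo { int refcnt; };          /* stand-in for struct module */

struct family_demo {                         /* stand-in for net_proto_family */
    int family;
    int (*create)(struct sock_demo *sock, int protocol);
    struct module_demo *owner;
};

static const struct family_demo *families_demo[NPROTO_DEMO];

/* Follows the order of steps in the trace; returns 0 on success, -1 on error. */
static int sock_create_demo(int family, int type, int protocol,
                            struct sock_demo *sock)
{
    const struct family_demo *pf;
    int err;

    if (family < 0 || family >= NPROTO_DEMO)
        return -1;                     /* family out of range */

    sock->type = type;
    pf = families_demo[family];        /* kernel: rcu_dereference() */
    if (pf == NULL)
        return -1;                     /* no such family registered */

    pf->owner->refcnt++;               /* kernel: try_module_get() */
    err = pf->create(sock, protocol);  /* family-specific setup */
    pf->owner->refcnt--;               /* kernel: module_put() */
    return err;
}

/* Example family: its create hook only records the family number. */
static struct module_demo inet_module_demo;

static int inet_create_demo(struct sock_demo *sock, int protocol)
{
    (void)protocol;
    sock->family = 2;                  /* 2 is the value of PF_INET on Linux */
    return 0;
}

static const struct family_demo inet_family_demo = {
    .family = 2, .create = inet_create_demo, .owner = &inet_module_demo,
};
```

After families_demo[2] = &inet_family_demo, a call to sock_create_demo(2, 1, 0, &s) fills in s through the family's own create hook and leaves the reference count balanced.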

socket net_proto_family

summary

Each protocol family has a corresponding struct net_proto_family entry in the
net_families[] array; the socket system call invokes its create method.

Data Structure

struct net_proto_family {
    int family;
    int (*create)(struct net *net, struct socket *sock,
                  int protocol, int kern);
    struct module *owner;
};

The create member is the important one: it is the first and most basic
family-specific function called during the socket system call.

static DEFINE_SPINLOCK(net_family_lock);
static const struct net_proto_family __rcu *net_families[NPROTO] __read_mostly;
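
How a family might claim its slot in such a lock-protected array can be sketched in user space. The sketch is loosely modeled on the idea of a registration function and is not the kernel implementation: all *_reg names are hypothetical, a C11 atomic_flag spinlock stands in for net_family_lock, and the error codes are this sketch's own choice.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define NPROTO_REG 8                      /* hypothetical table size */

struct family_ops_reg { int family; };    /* stand-in for net_proto_family */

static atomic_flag family_lock_reg = ATOMIC_FLAG_INIT;
static const struct family_ops_reg *family_table_reg[NPROTO_REG];

static void lock_reg(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&family_lock_reg,
                                             memory_order_acquire))
        ;
}

static void unlock_reg(void)
{
    atomic_flag_clear_explicit(&family_lock_reg, memory_order_release);
}

/* Claim the slot for ops->family; returns 0 or a negative errno value. */
static int register_family_reg(const struct family_ops_reg *ops)
{
    int err = 0;

    if (ops->family < 0 || ops->family >= NPROTO_REG)
        return -ENOBUFS;                  /* family number out of range */

    lock_reg();
    if (family_table_reg[ops->family] != NULL)
        err = -EEXIST;                    /* slot already taken */
    else
        family_table_reg[ops->family] = ops;
    unlock_reg();
    return err;
}
```

Only writers take the lock in this sketch; the __rcu annotation on net_families suggests that readers in the kernel go through RCU instead, which is not modeled here.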


IPv4 route fib tree rebalance

call trace

Take the insertion of a new route as an example.

> fib_insert_node
> > trie_rebalance
> > > while loop
> > > > resize
> > > > tnode_put_child_reorg
> > > > tnode_free_flush

trie_rebalance

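for_each_node(from the current node tn up to the fib_trie root)
    call resize()
    tnode_put_child_reorg

Starting at the current node and walking up to the root, treat each node
as the root of a subtree, call resize on it, and use tnode_put_child_reorg
to update the statistics kept in that node's parent.

Note: resize may replace the root of the subtree!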
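
The walk described above, including the case where resize hands back a different subtree root, can be sketched as follows. This is an illustration under assumptions, not the kernel code: struct tnode_walk has only two child slots, resize is passed in as a callback, and the parent's statistics update is reduced to re-linking the child pointer.

```c
#include <stddef.h>

struct tnode_walk {
    struct tnode_walk *parent;
    struct tnode_walk *child[2];
    int resized;                       /* visit counter for the example */
};

/* Hypothetical resize callback: returns the (possibly new) subtree root. */
typedef struct tnode_walk *(*resize_fn)(struct tnode_walk *tn);

/* Walk from tn up to the root and return the final trie root. */
static struct tnode_walk *rebalance_walk(struct tnode_walk *tn, resize_fn resize)
{
    for (;;) {
        struct tnode_walk *parent = tn->parent;
        int slot = 0;

        /* Remember which slot of the parent held tn before resizing. */
        if (parent != NULL && parent->child[1] == tn)
            slot = 1;

        tn = resize(tn);               /* may return a different node */
        tn->parent = parent;

        if (parent == NULL)
            return tn;                 /* tn is now the trie root */

        parent->child[slot] = tn;      /* re-link, since the root may have changed */
        tn = parent;
    }
}

/* Demo callback: replaces one chosen node with a copy, keeps all others. */
static struct tnode_walk replacement_walk;
static struct tnode_walk *victim_walk;

static struct tnode_walk *resize_walk(struct tnode_walk *tn)
{
    tn->resized++;
    if (tn != victim_walk)
        return tn;

    replacement_walk = *tn;
    for (int i = 0; i < 2; i++)        /* children must point at the new node */
        if (replacement_walk.child[i] != NULL)
            replacement_walk.child[i]->parent = &replacement_walk;
    return &replacement_walk;
}
```

The slot is recorded before resize runs because afterwards the old node pointer can no longer be found in the parent by comparing against the new one.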


where softirq is invoked

summary

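The function that does the real softirq work is __do_softirq.
In the Linux v3.11 kernel, __do_softirq can be reached from the call sites below.
These are the places where softirqs are actually executed, not where the softirq flag is raised (set)!

  1. When a hard interrupt exits.
  2. When bottom halves (bh) are enabled.
  3. When a loopback packet is sent.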


IPv4 route fib insert node

steps

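step 1. Walk down the fib_trie until the current node is NULL or a leaf.
1.a
Compare the first pos bits of the current node's key with those of the key to insert.
If they differ, the loop ends. If they are equal, go to 1.b.
(As an optimization, since the parent node has already compared the first posx bits,
only the bits in the range (posx, pos) need to be compared. Note that posx may equal pos.)
1.b
Record the number of bits compared so far (tn->pos + tn->bits),
pick one child of the current node with
tkey_extract_bits(key, tn->pos, tn->bits)
and continue.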
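
The two bit operations used in step 1, selecting a child index and comparing a range of key bits, can be sketched as below. These are illustrative helpers with hypothetical names (extract_bits_demo, sub_equals_demo), assuming 32-bit keys with bit 0 as the most significant bit; they are not copies of the kernel functions.

```c
#include <stdint.h>

typedef uint32_t tkey_demo;
#define KEYLENGTH_DEMO 32u

/* Return `bits` bits of `key`, starting `offset` bits from the MSB. */
static tkey_demo extract_bits_demo(tkey_demo key, unsigned int offset,
                                   unsigned int bits)
{
    if (bits == 0 || offset >= KEYLENGTH_DEMO)
        return 0;
    if (bits > KEYLENGTH_DEMO - offset)
        bits = KEYLENGTH_DEMO - offset;  /* clamp to the end of the key */
    /* Drop the leading bits, then shift the wanted bits down. */
    return (tkey_demo)(key << offset) >> (KEYLENGTH_DEMO - bits);
}

/* Nonzero when a and b agree on the bits [offset, offset + bits). */
static int sub_equals_demo(tkey_demo a, tkey_demo b, unsigned int offset,
                           unsigned int bits)
{
    return extract_bits_demo(a ^ b, offset, bits) == 0;
}
```

With the key 0xC0A80100 (192.168.1.0), extract_bits_demo(key, 8, 8) yields 0xA8, the second octet. The optimization in 1.a corresponds to calling sub_equals_demo(a, b, posx, pos - posx) rather than comparing from bit 0.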

step 2. When the loop ends, the current node n may be NULL, a leaf, or an internal node (see 1.a).

if (n == NULL)
    if (fib_trie == NULL)
        the trie is empty: this is its very first node.
    else
        the current node is an empty child slot of its parent.
        Because the trie is not empty, the parent must exist.
        See case 2.
else if n is a leaf
    if (tkey_equals(key, n->key)) // same key value
        case 1:
        same route prefix (the mask may differ)
    else
        case 3b.
else
    // assert n is an internal node
    and the first pos bits of the current node's key differ from those of the key to insert.
    See case 3a.
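
The case analysis above maps directly onto a small classification function. The sketch below assumes a hypothetical node type with only a kind and a key; the enum names are made up for illustration, and the actual handling of each case is left out.

```c
#include <stddef.h>
#include <stdint.h>

enum node_kind_demo { NODE_LEAF_DEMO, NODE_INTERNAL_DEMO };

struct node_demo {
    enum node_kind_demo kind;
    uint32_t key;
};

enum insert_case_demo {
    CASE_EMPTY_TRIE,  /* n == NULL and the trie has no root yet */
    CASE_2,           /* n == NULL: empty child slot of an existing parent */
    CASE_1,           /* leaf with the same key */
    CASE_3B,          /* leaf with a different key */
    CASE_3A           /* internal node whose prefix differs from the key */
};

/* Decide which insertion case applies once the walk of step 1 has stopped. */
static enum insert_case_demo classify_demo(const struct node_demo *root,
                                           const struct node_demo *n,
                                           uint32_t key)
{
    if (n == NULL)
        return root == NULL ? CASE_EMPTY_TRIE : CASE_2;
    if (n->kind == NODE_LEAF_DEMO)
        return n->key == key ? CASE_1 : CASE_3B;
    return CASE_3A;   /* the walk stopped at an internal node in step 1.a */
}
```

The last branch relies on step 1: the walk only stops at an internal node when the prefix comparison in 1.a fails, so no further test is needed there.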
