IPv4 route fib insert node

##steps

step 1. 循环fib_trie, 直到当前节点为空或者是叶子节点时,停止。
1.a
比较当前节点的key跟待插入的key的前pos位,
如果不相等,循环结束。如果相等执行1.b
(作为改进,加上父节点已经比较了前posx位, 那么只需比较
(posx,pos)这区间的位即可。 注意posx有可能与pos相等。
1.b
记录已经比较的位数(tn->pos + tn->bits)
取当前节点的一个孩子节点,
tkey_extract_bits(key, tn->pos, tn->bits)
继续执行。

step 2. 循环结束,当前节点n有可能为空,也有可能是个叶子节点,或者中间节点(1.a).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
if (n == NULL)
if (fib_trie == NULL)
shit! 树的第一个节点。
else
当前节点是父节点的一个空孩子节点。
以为树不是空的,所以父节点肯定存在。
见 case2
else n is a leaf
if (tkey_equals(key, n->key)) //'key'值相同
case 1:
路由前缀相同(可能掩码不一样)
else
case 3b.
else
//assert n is a internal node
并且,当前节点的key的pos位跟待插入的key的前pos位不相等。
见case3a

Read More

IPv4-route-fib-create-info

summary

根据用户参数,创建并返回fib_create_info节点。

  1. 如果用户参数有问题,则返回空。
    参数检查过程中,会创建一个新的struct fib_info 节点,如果参数检查失败,
    该节点会被free_fib_info释放(通过rcu模式)
  2. 如果存在一个相同配置参数的节点,则返回已有的struct fib_info 节点。
    其实已经创建了一个新的struct fib_info 节点,该节点会被free_fib_info释放(通过rcu模式)同1。
  3. 否则,创建一个struct fib_info 节点,并初始化,
    节点里相关的ref会被increase,同时将该节点链接到 struct fib_info
    相关的3个hash链(fib_info_hash, fib_info_laddrhash, fib_info_devhash)上.
  4. 最后返回新建的节点。

Read More

IPv4 route fib find node

###summary
遍历路由树,查找是否存在key这条路由。
如果有则返回对应的节点 struct leaf *
否则返回NULL。

中间节点 和 叶子节点

路由树(fib lc tree)是个多叉树,树的每个节点是struct rt_trie_node
每个节点的分叉个数(X)是可变的, 由 struct rt_trie_node里的bits决定。
X = 2 的bits次幂次方。

叶子节点的首个子类型也是struct rt_trie_node,所以可以看成是其派生。

树的遍历

  1. 如果当前节点为空, 结束循环。

  2. 如果当前节点不是T_TNODE,结束循环。
    即:当前节点为 叶子节点T_LEAF.

  3. 比较当前节点tn和参数key未比较过的前缀是否匹配。
    tkey_sub_equals(tn->key, x, y, key))

    未比较过的前缀[x-y)这个区间的位,包括x,不包括y
    x = 父节点的pos + 父节点的bits (父节点为空时,x取0)
    y = 当前节点的pos + 父节点的pos
    这里,具体取哪些位石油父节点和当前节点刚提决定的。

  4. 如果相等,则结束循环,判断当前节点是否真正匹配。

  5. 如果不相等,则跳到第N个一个孩子节点。
    N = tkey_extract_bits(key, tn->pos, tn->bits)
    N是取从参数key的第tn->pos位开始, 共tn->bits位转化出来的值。
    注意,此处是取key的某一位或者几位的值,而具体怎么取值是有当前节点的(pos,bits)决定的。

完整判断

当前节点n是叶子节点,并且key完全相同
n != NULL && IS_LEAF(n) && tkey_equals(key, n->key))

Read More

pfn_to_page and page_to_pfn

###summary
page_to_pfn and pfn_to_page are often used in kernel.

for PP81, kernel uses CONFIG_DISCONTIGMEM + NUMA.
Every node has node_mem_map and node_start_pfnto help the page/pfn covertion.
node_start_pfn is the first page’s pfn of this node.
node_mem_map stores all the struct page of this node.

so thete is a map between them:

`node_start_pfn + i` <===> `node_mem_map[i]`

The key is how to get node id by pfn or page.
the corresponding function is pfn_to_nid and page_to_nid

page_to_nid is simpile.
the node id is store in struct page ->flags
while pfn_to_nid(for pp81), it convert to pageaddress and then turn to pa_to_nid.

感觉这个实现有点罗嗦!
直接在include/asm-generic/memory_model.h 把pa_to_nid 定义一个 arch_pfn_to_nid宏。
省得pfn转为pa以后,又转为pfn。

Read More

How does xfrm km work

summary

There is a list xfrm_km_list in kernel.
Each node of the list is struct xfrm_mgr,
which has several methods to notify usersapce by netlink message.
Different methods has corresponding method,
and it broadcast the netlink message with different xfrm groups.

struct xfrm_mgr has many methods, for example:
1. notify: notify the sa change, ex: add, delete, expire ..
2. acquire: notify when sp is match, while no SA is got.
3. compile_policy:
4. new_mapping:
5: notify_policy: notify sp change. add, delete, expire.
6. report:
7. mirgrate:

Read More

how to use git bisect

usage:

In version management, we often meet this case:

1
2
3
When a line/file is deleted for git repo?
or
who did it ?

In this case, we need git bisect.

  1. git bisect start
  2. git bisect good commit_id1
  3. git bisect bad commit_id2
  4. git bisect bad/good

repeat to step 4, until finish.

more auto method for step4 is run with git bisect run command.

by the way git blame is used to check the added line/file.

Read More

Where is IPv6 route cache

Q: Does kernel IPv6 route support route cache ?

yes. but it is very different with IPv4.
It should be ‘clone’ strictly speaking.

For simple let us only focus on ip6_pol_route, and variable struct rt6_info *nrt.

  1. When a route is matched, a nrt will be created by rt6_alloc_cow or rt6_alloc_clone.

  2. the nrt will be set with flag RTF_CACHE;

  3. ip6_ins_rt insert the new nrt to fib tree.

  4. kernel starts fib6 gc for nrt.

so there will be a new cache route, and after a period it will disappear.

Read More