draft: isolcpus

Isolcpus

Kernel threads have the wrong affinity when isolcpus is used on the boot command line

When the kernel is booted with isolcpus on the GRUB command line, only the init thread has the expected affinity, which excludes the isolated CPUs.

The affinity of the kthreads still includes the isolated CPUs.


PROMISC in net_device->flags

summary

IFF_PROMISC is one bit of the flags field of struct net_device. It indicates whether a device is in promiscuous mode.

    /* Standard interface flags (netdevice->flags). */
    #define IFF_UP          0x1     /* interface is up */
    #define IFF_BROADCAST   0x2     /* broadcast address valid */
    #define IFF_DEBUG       0x4     /* turn on debugging */
    #define IFF_LOOPBACK    0x8     /* is a loopback net */
    #define IFF_POINTOPOINT 0x10    /* interface is has p-p link */
    #define IFF_NOTRAILERS  0x20    /* avoid use of trailers */
    #define IFF_RUNNING     0x40    /* interface RFC2863 OPER_UP */
    #define IFF_NOARP       0x80    /* no ARP protocol */
    #define IFF_PROMISC     0x100   /* receive all packets */
    #define IFF_ALLMULTI    0x200   /* receive all multicast packets*/
    ...

Two kinds of operation can make a NIC enter or leave promiscuous mode.

  1. ip command
    Running the on command several times still needs only one off to restore the device.

        ip link set dev eth0 promisc on
        ip link set dev eth0 promisc off
  2. tcpdump command
    When tcpdump starts, it puts the device into promiscuous mode,
    and just before it exits, it takes the device out of promiscuous mode.
    All of this is done by calling the kernel API dev_set_promiscuity.
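
The difference between the two paths can be sketched with a small userspace model. This is only an illustration, not kernel code: `struct dev_model` and all `model_*` functions are hypothetical names, and the model assumes that the ip path takes or drops a reference only when the requested bit actually changes, while each direct caller of dev_set_promiscuity (tcpdump in the text above) takes its own reference.

```c
#include <assert.h>
#include <stdbool.h>

#define IFF_PROMISC 0x100 /* same value as in the flag listing */

/* Hypothetical userspace model of a net device, not the kernel struct. */
struct dev_model {
    unsigned int flags;  /* effective flags, including IFF_PROMISC */
    unsigned int gflags; /* flags requested by the administrator (ip) */
    int promiscuity;     /* number of promiscuous-mode users */
};

/* Models dev_set_promiscuity: inc is +1 to take a reference, -1 to drop one. */
void model_set_promiscuity(struct dev_model *dev, int inc)
{
    assert(dev->promiscuity + inc >= 0); /* the count never goes negative */
    dev->promiscuity += inc;
    if (dev->promiscuity > 0)
        dev->flags |= IFF_PROMISC;
    else
        dev->flags &= ~(unsigned int)IFF_PROMISC;
}

/* Models "ip link set dev ... promisc on|off". Assumption: only a change
 * of the requested bit takes or drops a reference, so repeated "on"
 * commands count once and a single "off" undoes them. */
void model_ip_set_promisc(struct dev_model *dev, bool on)
{
    bool was_on = (dev->gflags & IFF_PROMISC) != 0;

    if (on == was_on)
        return; /* nothing changes, no reference is touched */
    dev->gflags ^= IFF_PROMISC;
    model_set_promiscuity(dev, on ? 1 : -1);
}

/* True while at least one user keeps the device promiscuous. */
bool model_is_promisc(const struct dev_model *dev)
{
    return (dev->flags & IFF_PROMISC) != 0;
}
```

In this model, three calls to `model_ip_set_promisc(&dev, true)` leave the count at 1, and if a tcpdump-like user has also called `model_set_promiscuity(&dev, 1)`, the device stays promiscuous after the ip "off" until that user drops its reference.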


Netlink in kernel (continued)

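##Introduction to netlink
netlink is a kind of socket used to exchange data between the kernel and user space. For a detailed introduction to netlink, Google gives a better explanation.
netlink is one kind of socket, and its family number is PF_NETLINK. netlink includes many protos, and users can add their own as needed.
Every netlink socket has a pid, which is unique within the proto it belongs to. When netlink messages
are delivered, the pid is often used to identify the destination socket.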

Every netlink socket is identified by (net, proto, pid).

  1. net: the net namespace.
  2. proto: netlink proto.
  3. pid:


netlink grab

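Operations on nl_table:
Read operations: netlink_lock_table / netlink_unlock_table
Write operations: netlink_table_grab / netlink_table_ungrab

The atomic variable nl_table_users holds the reference count of readers of nl_table.

As long as no writer is in progress, m readers can read at the same time, even if n writers are waiting.
A writer can gain access only when there are no readers at all.
Once a writer has gained access, all other readers and writers have to wait.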
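
These three rules can be sketched as a single-threaded state model. It is an illustration under stated assumptions, not the kernel implementation: `struct table_model` and the `model_*` functions are hypothetical, and a `false` return stands in for "the caller would have to wait" instead of actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the nl_table access state. */
struct table_model {
    int users;          /* plays the role of nl_table_users (active readers) */
    bool writer_active; /* true while a writer holds the table */
};

/* Reader entry, modeled on netlink_lock_table.
 * A reader is refused only while a writer is active; writers that are
 * merely waiting do not block it. */
bool model_lock_table(struct table_model *t)
{
    if (t->writer_active)
        return false;
    t->users++;
    return true;
}

/* Reader exit, modeled on netlink_unlock_table. */
void model_unlock_table(struct table_model *t)
{
    assert(t->users > 0);
    t->users--;
}

/* Writer entry, modeled on netlink_table_grab.
 * A writer gets in only when there is no reader and no other writer. */
bool model_table_grab(struct table_model *t)
{
    if (t->writer_active || t->users > 0)
        return false;
    t->writer_active = true;
    return true;
}

/* Writer exit, modeled on netlink_table_ungrab. */
void model_table_ungrab(struct table_model *t)
{
    assert(t->writer_active);
    t->writer_active = false;
}
```

With two readers inside, `model_table_grab` fails while a third `model_lock_table` still succeeds. Once all readers have left and a writer has grabbed the table, both new readers and new writers are refused until `model_table_ungrab`.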


netlink bulk dump

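Requirement

Sometimes we need to output a large number of messages from the kernel, for example when dumping interfaces or xfrm SAs and SPs (thousands or even tens of thousands of entries). This information obviously cannot fit into a single skb.

This is where we need the netlink callback mechanism. How it works: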

DATA STRUCTURE

    struct netlink_callback {
            struct sk_buff          *skb;
            const struct nlmsghdr   *nlh;
            int                     (*dump)(struct sk_buff * skb,
                                            struct netlink_callback *cb);
            int                     (*done)(struct netlink_callback *cb);
            void                    *data;
            /* the module that dump function belong to */
            struct module           *module;
            u16                     family;
            u16                     min_dump_alloc;
            unsigned int            prev_seq, seq;
            long                    args[6];
    };

    struct netlink_dump_control {
            int (*dump)(struct sk_buff *skb, struct netlink_callback *);
            int (*done)(struct netlink_callback *);
            void *data;
            struct module *module;
            u16 min_dump_alloc;
    };
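
The two important callback functions that support dump are `dump` and `done`:

dump: called for each round of output, continuing from where the previous round stopped. It returns 0 once everything has been output.
done: called after all output is complete.

The dump process

1. When there are many messages to dump, first create a struct netlink_callback and attach this cb to the netlink socket (nlk).
Here nlk is the socket that issued the dump request.
2. Call the dump function to produce the first batch of results, put them on nlk's receive queue, and trigger data ready.
3. The application's recvmsg call then returns with the first batch of results, and netlink_dump is called again inside the recvmsg system call.

The closed feedback loop:

Each time the application calls recvmsg, it pulls data up from the kernel,
and that same system call also triggers one more netlink_dump.
The new data is appended to the socket's receive queue. This repeats until all data has been dumped and cb->dump(skb, cb) returns 0.
The kernel then calls cb->done(cb), removes the cb from the netlink socket, and frees its memory.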

    nlk->cb = NULL;
    netlink_consume_callback(cb);

    static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
                               int flags)
    {
    ...
            if (nlk->cb_running &&
                atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
                    ret = netlink_dump(sk);
                    if (ret) {
                            sk->sk_err = -ret;
                            sk->sk_error_report(sk);
                    }
            }
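
The feedback loop described in this post can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel API: `struct cb_model`, `MODEL_BATCH`, and every `model_*` function are hypothetical, a plain int array stands in for the skb, and `args[0]` is used as the resume cursor in the way dump callbacks commonly use `cb->args`.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_BATCH 4 /* hypothetical capacity of one buffer ("skb") */

/* Hypothetical stand-in for struct netlink_callback. */
struct cb_model {
    int (*dump)(int *buf, size_t cap, struct cb_model *cb);
    int (*done)(struct cb_model *cb);
    long args[6];   /* args[0]: index of the next item to emit */
    int total;      /* number of items that exist in the "kernel" */
    int done_calls; /* how many times done() has run */
};

/* dump: continue from the previous position, fill at most cap items.
 * Returns the number of items written, or 0 when nothing is left. */
int model_dump(int *buf, size_t cap, struct cb_model *cb)
{
    size_t n = 0;

    while (n < cap && cb->args[0] < cb->total) {
        buf[n++] = (int)cb->args[0];
        cb->args[0]++;
    }
    return (int)n;
}

/* done: called once after the last round. */
int model_done(struct cb_model *cb)
{
    cb->done_calls++;
    return 0;
}

/* One round, as triggered by each receive call: run dump, and when it
 * reports 0, finish by calling done. */
int model_dump_round(struct cb_model *cb, int *buf, size_t cap)
{
    int n;

    assert(cap > 0);
    n = cb->dump(buf, cap, cb);
    if (n == 0)
        cb->done(cb);
    return n;
}

/* The application side: keep receiving until a round delivers nothing.
 * Returns the number of rounds that delivered data. */
int model_receive_all(struct cb_model *cb)
{
    int buf[MODEL_BATCH];
    int rounds = 0;

    while (model_dump_round(cb, buf, MODEL_BATCH) > 0)
        rounds++;
    return rounds;
}
```

With 10 items and a batch size of 4, the model delivers data in three rounds (4, 4, and 2 items), and the fourth round returns 0 and calls done exactly once.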

Netlink in kernel

#netlink socket framework.

##netlink socket proto
A netlink socket is a kind of socket. Netlink has many protos, and each proto may have many groups. See the example that follows.

Every netlink socket is identified by <net, proto, pid>.

  1. net: the net namespace.
  2. proto: netlink proto.
  3. pid:


Install html2markdown in Ubuntu (12.10)

###Edit sources.list
Ensure the universe component is enabled.
martin@PC:~/git/blog/_posts$ cat /etc/apt/sources.list
deb http://ubuntu.cn99.com/ubuntu/ quantal main
deb-src http://ubuntu.cn99.com/ubuntu/ quantal main

deb http://ubuntu.cn99.com/ubuntu/ quantal universe
deb-src http://ubuntu.cn99.com/ubuntu/ quantal universe

###update apt
martin@PC:~/git/blog/_posts$ sudo apt-get update

###install
martin@PC:~/git/blog/_posts$ sudo apt-get install python-html2text

html2markdown ready

martin@PC:~/git/blog/_posts$ html2markdown --version
html2markdown 3.200.3