tcpdump on a bonding interface

test case

#### On Red Hat 5, why does tcpdump not work properly with a bonding interface?
OS: Red Hat 5.
There are two 82599 interfaces, eth0 and eth1.
Both interfaces are slaves of bond0;
eth1 is the backup of eth0.

We ping the default gateway from the test machine.
The ping works, and tcpdump on bond0 shows both the ICMP request and the ICMP reply packets,
while tcpdump on eth0 shows only the ICMP requests, and eth1 shows no packets at all.


register irq handler

call trace

Take handle_level_irq as an example.

==> handle_level_irq
==> ==> handle_irq_event
==> ==> ==> handle_irq_event_percpu
==> ==> ==> ==> action->handler

where handler is registered

static inline int __must_check
request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags,
            const char *name, void *dev)
{
        return request_threaded_irq(irq, handler, NULL, flags, name, dev);
}
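To make the registration step concrete, here is a minimal userspace sketch of the wrapper above. It is not kernel code: `model_request_irq`, `model_request_threaded_irq`, `model_actions` and `demo_handler` are hypothetical names, and the real kernel does considerably more (locking, shared IRQs, thread creation). It only shows that `request_irq` forwards to the threaded variant with a NULL thread function, and that the handler ends up stored in an action record.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
typedef int irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

struct irqaction {
    irq_handler_t handler;   /* primary handler */
    irq_handler_t thread_fn; /* optional threaded handler, NULL here */
    unsigned long flags;
    const char *name;
    void *dev_id;
};

#define MODEL_NR_IRQS 16
static struct irqaction model_actions[MODEL_NR_IRQS];

/* Hypothetical stand-in for request_threaded_irq(): records the action. */
static int model_request_threaded_irq(unsigned int irq, irq_handler_t handler,
                                      irq_handler_t thread_fn,
                                      unsigned long flags,
                                      const char *name, void *dev)
{
    if (irq >= MODEL_NR_IRQS || handler == NULL)
        return -EINVAL;
    if (model_actions[irq].handler != NULL)
        return -EBUSY; /* this model does not support shared IRQs */
    model_actions[irq] =
        (struct irqaction){ handler, thread_fn, flags, name, dev };
    return 0;
}

/* Same shape as the wrapper quoted above: thread_fn is always NULL. */
static int model_request_irq(unsigned int irq, irq_handler_t handler,
                             unsigned long flags, const char *name, void *dev)
{
    return model_request_threaded_irq(irq, handler, NULL, flags, name, dev);
}

/* Example driver handler: counts interrupts in its private data. */
static irqreturn_t demo_handler(int irq, void *dev_id)
{
    (void)irq;
    assert(dev_id != NULL);
    ++*(int *)dev_id;
    return 1; /* "handled" in this model */
}
```

A caller would register with `model_request_irq(3, demo_handler, 0, "demo", &counter)` and later invoke `model_actions[3].handler(3, model_actions[3].dev_id)` to simulate the interrupt firing.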


irq framework

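Interrupt handling flow:

From the hardware interrupt to the interrupt controller

reg value --> irq (int) --> struct irq_desc

==> When an interrupt occurs, a register holds the vector value of the interrupt source.
==> ==> `arch/x86/kernel/entry_64.S` calls the function `do_IRQ`.
==> ==> ==> `do_IRQ` uses `vector_irq` and the vector value to find the corresponding IRQ number, then calls `handle_irq`.
==> ==> ==> ==> `handle_irq` uses the function `irq_to_desc` to convert the IRQ number into a `struct irq_desc`.
==> ==> ==> ==> generic_handle_irq_desc(irq, desc);
==> ==> ==> ==> ==> `generic_handle_irq_desc` calls desc->handle_irq(irq, desc);

Note: the handle_irq here is not the real interrupt handler, but one of a few classes of interrupt-controller handling functions,
such as 82599, MSI, etc.
For a detailed analysis, see: irq study1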
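The lookup chain above (vector to IRQ number to descriptor to flow handler) can be sketched in plain userspace C. This is an illustrative model under stated assumptions, not the kernel implementation: every `model_` name is hypothetical, the table sizes are arbitrary, and the flow handler just counts calls.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_VECTORS 256
#define MODEL_NR_IRQS    16

struct irq_desc;
typedef void (*irq_flow_handler_t)(unsigned int irq, struct irq_desc *desc);

struct irq_desc {
    irq_flow_handler_t handle_irq; /* controller-level flow handler */
    unsigned int hits;             /* model-only counter */
};

static int model_vector_irq[MODEL_NR_VECTORS]; /* vector -> irq, -1 = unused */
static struct irq_desc model_descs[MODEL_NR_IRQS];

/* Mark every vector as unmapped before use. */
static void model_setup(void)
{
    for (int v = 0; v < MODEL_NR_VECTORS; v++)
        model_vector_irq[v] = -1;
}

/* Plays the role of irq_to_desc(): IRQ number -> descriptor. */
static struct irq_desc *model_irq_to_desc(unsigned int irq)
{
    return irq < MODEL_NR_IRQS ? &model_descs[irq] : NULL;
}

/* Stand-in flow handler; a real one would go on to run the actions. */
static void model_handle_level_irq(unsigned int irq, struct irq_desc *desc)
{
    (void)irq;
    assert(desc != NULL);
    desc->hits++;
}

/* Plays the role of do_IRQ(): returns 0 if dispatched, -1 if unmapped. */
static int model_do_IRQ(unsigned int vector)
{
    if (vector >= MODEL_NR_VECTORS)
        return -1;
    int irq = model_vector_irq[vector];
    if (irq < 0)
        return -1;
    struct irq_desc *desc = model_irq_to_desc((unsigned int)irq);
    if (desc == NULL || desc->handle_irq == NULL)
        return -1;
    desc->handle_irq((unsigned int)irq, desc);
    return 0;
}
```

After `model_setup()`, mapping a vector with `model_vector_irq[0x31] = 5` and setting `model_descs[5].handle_irq = model_handle_level_irq` lets `model_do_IRQ(0x31)` reach the flow handler, while an unmapped vector is rejected.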

From the interrupt controller to the specific interrupt handler

==> handle_level_irq
==> ==> irqreturn_t handle_irq_event(struct irq_desc *desc)
==> ==> ==> struct irqaction *action = desc->action
==> ==> ==> ret = handle_irq_event_percpu(desc, action);
==> ==> ==> ==> action->handler(irq, action->dev_id);

The action->handler here is the actual interrupt handler that we registered with request_irq.
For a detailed analysis, see: irq study2
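As a rough illustration of the last step, the sketch below walks a chain of registered actions and calls each `handler` with its own `dev_id`, the way a shared IRQ line is serviced. It is a simplified userspace model: all `model_` names are hypothetical, and the real `handle_irq_event_percpu` also deals with threaded handlers and other details omitted here.

```c
#include <assert.h>
#include <stddef.h>

enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };
typedef enum model_irqreturn (*model_handler_t)(int irq, void *dev_id);

struct model_irqaction {
    model_handler_t handler;      /* what request_irq registered */
    void *dev_id;                 /* cookie handed back to the handler */
    struct model_irqaction *next; /* next action sharing this IRQ */
};

struct model_irq_desc {
    struct model_irqaction *action; /* head of the action chain */
};

/* Hypothetical device state used by the example handler. */
struct model_dev {
    int pending;  /* non-zero when this device raised the interrupt */
    int serviced; /* how many interrupts the handler has serviced */
};

/* Example handler: only claims the interrupt if its device is pending. */
static enum model_irqreturn model_dev_handler(int irq, void *dev_id)
{
    struct model_dev *dev = dev_id;

    (void)irq;
    assert(dev != NULL);
    if (!dev->pending)
        return MODEL_IRQ_NONE;
    dev->pending = 0;
    dev->serviced++;
    return MODEL_IRQ_HANDLED;
}

/* Calls every action on the chain and ORs the results together. */
static int model_handle_irq_event_percpu(int irq, struct model_irq_desc *desc)
{
    int ret = MODEL_IRQ_NONE;

    for (struct model_irqaction *a = desc->action; a != NULL; a = a->next)
        ret |= a->handler(irq, a->dev_id);
    return ret;
}
```

Passing `dev_id` back to the handler is what lets one handler function serve several devices on the same line: each action carries its own cookie.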

irq vector

Interrupt handling flow:

reg value --> irq (int) --> struct irq_desc

==> When an interrupt occurs, a register holds the vector value of the interrupt source.
==> ==> `arch/x86/kernel/entry_64.S` calls the function `do_IRQ`.
==> ==> ==> `do_IRQ` uses `vector_irq` and the vector value to find the corresponding IRQ number, then calls `handle_irq`.
==> ==> ==> ==> `handle_irq` uses the function `irq_to_desc` to convert the IRQ number into a `struct irq_desc`.
==> ==> ==> ==> generic_handle_irq_desc(irq, desc);
==> ==> ==> ==> ==> `generic_handle_irq_desc` calls desc->handle_irq(irq, desc);

Note: the handle_irq here is not the real interrupt handler, but one of a few classes of interrupt-controller handling functions,
such as 82599, MSI, etc.

`do_IRQ(struct pt_regs *regs)`

File: arch/x86/kernel/irq.c

`arch/x86/kernel/entry_64.S` calls `do_IRQ`.


Delayed work: dst_gc_work

summary

A delayed work first starts a timer;
when the timer expires, the work is put on a worker_pool's
worklist or on a pool_workqueue's delayed_works list.

how to use delayed work

data structure

struct delayed_work {
        struct work_struct work;
        struct timer_list timer;

        /* target workqueue and CPU ->timer uses to queue ->work */
        struct workqueue_struct *wq;
        int cpu;
};
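The two-stage behaviour (timer first, queue second) can be modelled in userspace as follows. This is only a sketch under simplifying assumptions: time is an integer passed in by the caller, there is a single pool, and all `model_` names are hypothetical rather than kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

struct model_work {
    void (*func)(struct model_work *work);
    struct model_work *next;
};

/* Mirrors the shape of struct delayed_work: a work item plus a timer. */
struct model_delayed_work {
    struct model_work work;
    unsigned long expires; /* model timer: absolute expiry time */
    int timer_pending;
};

struct model_pool {
    struct model_work *head, *tail; /* FIFO worklist */
};

/* Stage 1: arm the timer only; nothing is queued yet. */
static void model_queue_delayed_work(struct model_delayed_work *dw,
                                     unsigned long now, unsigned long delay)
{
    dw->expires = now + delay;
    dw->timer_pending = 1;
}

/* Append a work item to the tail of the pool's worklist. */
static void model_enqueue(struct model_pool *pool, struct model_work *w)
{
    assert(w != NULL);
    w->next = NULL;
    if (pool->tail != NULL)
        pool->tail->next = w;
    else
        pool->head = w;
    pool->tail = w;
}

/* Stage 2: on expiry the timer moves the work onto the worklist. */
static int model_timer_tick(struct model_pool *pool,
                            struct model_delayed_work *dw, unsigned long now)
{
    if (!dw->timer_pending || now < dw->expires)
        return 0;
    dw->timer_pending = 0;
    model_enqueue(pool, &dw->work);
    return 1;
}

/* Runs everything on the worklist; returns how many items ran. */
static int model_run_pool(struct model_pool *pool)
{
    int n = 0;

    while (pool->head != NULL) {
        struct model_work *w = pool->head;

        pool->head = w->next;
        if (pool->head == NULL)
            pool->tail = NULL;
        w->func(w);
        n++;
    }
    return n;
}

static int model_gc_runs;

/* Example work function: counts how often it has been executed. */
static void model_gc_func(struct model_work *w)
{
    (void)w;
    model_gc_runs++;
}
```

With a delay of 10 armed at time 100, a tick at time 105 does nothing, and a tick at time 110 places the work on the list so that the next `model_run_pool` call executes it.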


worker and worker_thread

### Summary
The struct worker is the real scheduling unit in a workqueue.
Each struct worker has a corresponding thread (task), referenced by worker->task.
A struct worker is linked into struct worker_pool->idle_list when it is idle,
and moved to struct worker_pool->busy_hash when it is running a work item.

worker_thread

  1. Remove the worker from pool->idle_list and clear the worker's WORKER_IDLE flag.
  2. Check the pool and manage the workers (create/destroy).
  3. Iterate over every `struct work_struct *work` in `struct worker_pool->worklist`
    and run them in sequence with `process_one_work(worker, work);`.
  4. Move the worker into the idle list again.
  5. schedule();
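The five steps above can be sketched as a single pass of a worker loop in userspace C. This is a simplified model, not the kernel's `worker_thread`: there is no locking, worker management (step 2) and the sleep in `schedule()` (step 5) are only marked by comments, and all `model_` and `MODEL_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_WORKER_IDLE 0x1u

struct model_work {
    void (*func)(struct model_work *work);
    struct model_work *next;
};

struct model_pool {
    struct model_work *worklist; /* pending work items */
    int nr_idle;                 /* number of idle workers in the pool */
};

struct model_worker {
    struct model_pool *pool;
    unsigned int flags;
    int processed; /* model-only: items this worker has run */
};

static void model_worker_leave_idle(struct model_worker *worker)
{
    assert(worker->flags & MODEL_WORKER_IDLE);
    worker->flags &= ~MODEL_WORKER_IDLE;
    worker->pool->nr_idle--;
}

static void model_worker_enter_idle(struct model_worker *worker)
{
    worker->flags |= MODEL_WORKER_IDLE;
    worker->pool->nr_idle++;
}

static void model_process_one_work(struct model_worker *worker,
                                   struct model_work *work)
{
    work->func(work);
    worker->processed++;
}

/* One wake-up of the worker thread, following steps 1 to 5. */
static void model_worker_thread_once(struct model_worker *worker)
{
    struct model_pool *pool = worker->pool;

    model_worker_leave_idle(worker);     /* step 1 */
    /* step 2: creating/destroying workers is omitted in this model */
    while (pool->worklist != NULL) {     /* step 3 */
        struct model_work *work = pool->worklist;

        pool->worklist = work->next;
        model_process_one_work(worker, work);
    }
    model_worker_enter_idle(worker);     /* step 4 */
    /* step 5: the real thread would call schedule() and sleep here */
}

static int model_work_runs;

/* Example work function: counts executions. */
static void model_count_func(struct model_work *work)
{
    (void)work;
    model_work_runs++;
}
```

The worker starts and ends each pass in the idle state, which is why the idle count of the pool is unchanged after `model_worker_thread_once` returns.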


dst garbage

dst garbage summary

Garbage collection is a common technique in the kernel.
When an object (struct, memory) becomes invalid, we need to
free it, but the object may still be referenced by others.

For example, a dst_entry may be invalid while it is still
referenced (used) by others.

__dst_free is called for this case.
It first marks the dst as dirty (dead),
and then puts it into dst_garbage.list via dst->next.

A workqueue task then checks the dst's references,
and frees (destroys) it once nothing references it.

Two key structures: struct dst_garbage and dst_gc_work
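A minimal userspace sketch of this deferred-free pattern is shown below. It is an illustration, not the kernel's dst code: the `model_` names are hypothetical, the reference count is a plain int with no atomics or locking, and the periodic work is reduced to a function the caller invokes by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_dst {
    int refcnt;             /* users still holding this entry */
    int dead;               /* set once the entry is invalid */
    struct model_dst *next; /* link on the garbage list */
};

static struct model_dst *model_garbage_list;

/* Allocate an entry with the given initial reference count. */
static struct model_dst *model_dst_alloc(int refcnt)
{
    struct model_dst *dst = calloc(1, sizeof(*dst));

    if (dst != NULL)
        dst->refcnt = refcnt;
    return dst;
}

/* Like the __dst_free step: mark dead and park on the garbage list. */
static void model_dst_free(struct model_dst *dst)
{
    assert(dst != NULL && !dst->dead);
    dst->dead = 1;
    dst->next = model_garbage_list;
    model_garbage_list = dst;
}

/*
 * Like the periodic gc work: destroy entries nobody references,
 * keep the rest for a later pass. Returns the number destroyed.
 */
static int model_dst_gc_task(void)
{
    struct model_dst **pp = &model_garbage_list;
    int freed = 0;

    while (*pp != NULL) {
        struct model_dst *dst = *pp;

        if (dst->refcnt > 0) {
            pp = &dst->next; /* still in use: skip it */
            continue;
        }
        *pp = dst->next;     /* unlink, then destroy */
        free(dst);
        freed++;
    }
    return freed;
}
```

An entry that is still referenced survives a gc pass and is only destroyed on a later pass, after its last reference has been dropped.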


dst ops

Call trace

Forwarding a packet:

> ip_rcv_finish
> > ip_route_input_noref
> > > ip_route_input_slow
> > > > fib_lookup
> > > > ip_mkroute_input
> > dst_input(skb)

> > > > ip_mkroute_input
> > > > > __mkroute_input
> > > > > > rth = rt_dst_alloc(...)
> > > > > > skb_dst_set(skb, &rth->dst);
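The point of the trace is that route lookup attaches a dst to the skb, and `dst_input` then calls through a function pointer stored in that dst. The sketch below models this in userspace. It is an assumption-laden simplification: all `model_` names are hypothetical, the lookup always returns one static forwarding route, and the `input` pointer is assumed to be set to a forwarding function for forwarded packets.

```c
#include <assert.h>
#include <stddef.h>

struct model_sk_buff;

struct model_dst_entry {
    int (*input)(struct model_sk_buff *skb); /* chosen at route lookup */
};

struct model_rtable {
    struct model_dst_entry dst; /* embedded, as in rth->dst */
};

struct model_sk_buff {
    struct model_dst_entry *dst;
    int forwarded; /* model-only flag */
};

/* Stand-in for the forwarding path reached through dst->input. */
static int model_ip_forward(struct model_sk_buff *skb)
{
    skb->forwarded = 1;
    return 0;
}

/* One pre-built route whose input hook forwards the packet. */
static struct model_rtable model_forward_route = { { model_ip_forward } };

/* Stand-in for the lookup chain ending in skb_dst_set(skb, &rth->dst). */
static int model_ip_route_input(struct model_sk_buff *skb)
{
    struct model_rtable *rth = &model_forward_route;

    skb->dst = &rth->dst;
    return 0;
}

/* Same idea as dst_input(): dispatch through the attached dst. */
static int model_dst_input(struct model_sk_buff *skb)
{
    assert(skb->dst != NULL && skb->dst->input != NULL);
    return skb->dst->input(skb);
}

/* Route the packet if it has no dst yet, then hand it to dst_input. */
static int model_ip_rcv_finish(struct model_sk_buff *skb)
{
    if (skb->dst == NULL && model_ip_route_input(skb) != 0)
        return -1;
    return model_dst_input(skb);
}
```

Because the decision is stored as a function pointer, the receive path does not need to know whether the packet is delivered locally or forwarded; it only calls whatever the route lookup installed.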
