irq vector

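Interrupt handling flow:

register value --> irq (int) --> struct irq_desc

==> When an interrupt fires, a register holds the vector value of the interrupt source.
==> ==> `arch/x86/kernel/entry_64.S` calls `do_IRQ`.
==> ==> ==> `do_IRQ` uses `vector_irq` and the vector value to find the matching IRQ number, then calls `handle_irq`.
==> ==> ==> ==> `handle_irq` converts the IRQ number to a `struct irq_desc` via `irq_to_desc`.
==> ==> ==> ==> generic_handle_irq_desc(irq, desc);
==> ==> ==> ==> ==> `generic_handle_irq_desc` calls desc->handle_irq(irq, desc);

Note: the handle_irq here is not the actual interrupt handler. It is the handler for one of a few broad classes of interrupt controller,
such as 82599, msi, etc.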
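The lookup chain above can be sketched in user space. This is only a toy model, not kernel code: the table sizes, `irq_model_init`, `bind_vector`, `counting_flow_handler` and the `count` field are hypothetical names made up for illustration, and the real `do_IRQ` takes a `struct pt_regs *` rather than a bare vector.

```c
#include <assert.h>
#include <stddef.h>

#define NR_VECTORS 256   /* assumed table size for this sketch */
#define NR_IRQS    16    /* assumed table size for this sketch */

struct irq_desc;
typedef void (*irq_flow_handler_t)(unsigned int irq, struct irq_desc *desc);

/* Toy descriptor; the real struct irq_desc carries far more state. */
struct irq_desc {
    irq_flow_handler_t handle_irq;   /* per-controller-class flow handler */
    unsigned int count;              /* hypothetical hit counter */
};

int vector_irq[NR_VECTORS];          /* vector -> irq number, -1 if unused */
struct irq_desc irq_descs[NR_IRQS];  /* irq number -> descriptor */

void irq_model_init(void)
{
    for (size_t i = 0; i < NR_VECTORS; i++)
        vector_irq[i] = -1;
}

struct irq_desc *irq_to_desc(unsigned int irq)
{
    return irq < NR_IRQS ? &irq_descs[irq] : NULL;
}

/* Hypothetical flow handler that only counts how often it ran. */
void counting_flow_handler(unsigned int irq, struct irq_desc *desc)
{
    (void)irq;
    desc->count++;
}

void bind_vector(unsigned int vector, unsigned int irq, irq_flow_handler_t h)
{
    assert(vector < NR_VECTORS && irq < NR_IRQS);
    vector_irq[vector] = (int)irq;
    irq_descs[irq].handle_irq = h;
}

/* Models handle_irq: irq number -> descriptor -> desc->handle_irq(). */
int handle_irq(unsigned int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);

    if (desc == NULL || desc->handle_irq == NULL)
        return 0;
    desc->handle_irq(irq, desc);
    return 1;
}

/* Models do_IRQ: vector -> irq number -> handle_irq(). */
int do_irq_model(unsigned int vector)
{
    if (vector >= NR_VECTORS || vector_irq[vector] < 0)
        return 0;
    return handle_irq((unsigned int)vector_irq[vector]);
}
```

Calling `irq_model_init()`, then `bind_vector(0x31, 5, counting_flow_handler)` and `do_irq_model(0x31)` walks the same three hops as the trace: vector, IRQ number, descriptor.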

`do_IRQ(struct pt_regs *regs)`

File: arch/x86/kernel/irq.c

The entry code in arch/x86/kernel/entry_64.S calls do_IRQ.


Delayed work: dst_gc_work

summary

A delayed work first starts a timer.
When the timer expires, the work is put on a worker_pool's
worklist or a pool_workqueue's delayed_works list.

how to use delayed work

data structure

struct delayed_work {
    struct work_struct work;
    struct timer_list timer;

    /* target workqueue and CPU ->timer uses to queue ->work */
    struct workqueue_struct *wq;
    int cpu;
};
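To make the "timer first, worklist later" behaviour concrete, here is a small user-space sketch. It is a model under stated assumptions, not the kernel implementation: `toy_timer`, `toy_jiffies`, `toy_tick`, `queue_delayed_work_model`, `run_worklist` and `sample_work` are hypothetical names, and the single global `toy_worklist` stands in for a worker_pool's worklist.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARMED 8

struct work_struct {
    void (*func)(struct work_struct *work);
    struct work_struct *next;        /* link in the toy worklist */
};

struct toy_timer {                   /* stand-in for struct timer_list */
    unsigned long expires;           /* tick at which the timer fires */
    int pending;
};

struct delayed_work {
    struct work_struct work;
    struct toy_timer timer;
};

unsigned long toy_jiffies;               /* hypothetical tick counter */
struct work_struct *toy_worklist;        /* stands in for worker_pool->worklist */
struct delayed_work *armed[MAX_ARMED];   /* delayed works whose timer is running */
int sample_runs;

void sample_work(struct work_struct *work)
{
    (void)work;
    sample_runs++;
}

/* Arm the timer only; the work is not on any worklist yet. */
int queue_delayed_work_model(struct delayed_work *dw, unsigned long delay)
{
    if (dw->timer.pending)
        return 0;                    /* already queued */
    for (size_t i = 0; i < MAX_ARMED; i++) {
        if (armed[i] == NULL) {
            dw->timer.expires = toy_jiffies + delay;
            dw->timer.pending = 1;
            armed[i] = dw;
            return 1;
        }
    }
    return 0;
}

/* Advance time by one tick; expired timers append their work to the worklist. */
void toy_tick(void)
{
    toy_jiffies++;
    for (size_t i = 0; i < MAX_ARMED; i++) {
        struct delayed_work *dw = armed[i];
        struct work_struct **tail = &toy_worklist;

        if (dw == NULL || toy_jiffies < dw->timer.expires)
            continue;
        armed[i] = NULL;
        dw->timer.pending = 0;
        dw->work.next = NULL;
        while (*tail != NULL)
            tail = &(*tail)->next;
        *tail = &dw->work;
    }
}

/* Run everything currently on the worklist; returns how many works ran. */
int run_worklist(void)
{
    int ran = 0;

    while (toy_worklist != NULL) {
        struct work_struct *work = toy_worklist;

        toy_worklist = work->next;
        assert(work->func != NULL);
        work->func(work);
        ran++;
    }
    return ran;
}
```

In the kernel itself the usual entry points are `INIT_DELAYED_WORK` and `queue_delayed_work` / `schedule_delayed_work`; the sketch only imitates their two-stage behaviour.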


worker and worker_thread

### Summary
struct worker is the real scheduling unit in a workqueue.
Each struct worker has a corresponding thread (task), reachable via worker->task.
A struct worker is linked into struct worker_pool->idle_list while it is idle,
and moved to struct worker_pool->busy_hash while it is running a work item.

worker_thread

  1. Remove the worker from pool->idle_list and clear the worker's WORKER_IDLE flag.
  2. Check the pool and manage the workers (create/destroy).
  3. Iterate over every `struct work_struct *work` in `struct worker_pool->worklist`
     and run them in sequence with process_one_work(worker, work).
  4. Move the worker back onto the idle list.
  5. schedule();
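The five steps above can be sketched as one pass of the loop in user space. This is a simplified model, not the kernel function: `worker_thread_once`, `queue_work_model`, `tagged_work`, `log_tag`, the `idle` bool and the `nr_idle` counter are hypothetical stand-ins (for the WORKER_IDLE flag and the idle_list), and worker management in step 2 is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct work_struct {
    void (*func)(struct work_struct *work);
    struct work_struct *next;
};

struct worker_pool {
    struct work_struct *worklist;   /* pending work, in FIFO order */
    int nr_idle;                    /* stands in for the idle_list length */
};

struct worker {
    struct worker_pool *pool;
    bool idle;                      /* stands in for the WORKER_IDLE flag */
};

/* Hypothetical work type that records the order in which works ran. */
struct tagged_work {
    struct work_struct work;        /* first member, so the cast below is valid */
    int tag;
};

int run_log[8];
int run_len;

void log_tag(struct work_struct *work)
{
    struct tagged_work *tw = (struct tagged_work *)work;

    if (run_len < 8)
        run_log[run_len++] = tw->tag;
}

void queue_work_model(struct worker_pool *pool, struct work_struct *work)
{
    struct work_struct **tail = &pool->worklist;

    work->next = NULL;
    while (*tail != NULL)
        tail = &(*tail)->next;
    *tail = work;
}

void process_one_work(struct worker *worker, struct work_struct *work)
{
    (void)worker;
    assert(work->func != NULL);
    work->func(work);
}

/* One wake-up of the worker loop; returns the number of works run. */
int worker_thread_once(struct worker *worker)
{
    struct worker_pool *pool = worker->pool;
    int ran = 0;

    if (worker->idle) {             /* step 1: leave the idle list */
        worker->idle = false;
        pool->nr_idle--;
    }
    /* step 2: worker creation/destruction is omitted in this model */
    while (pool->worklist != NULL) {   /* step 3: drain the worklist in order */
        struct work_struct *work = pool->worklist;

        pool->worklist = work->next;
        process_one_work(worker, work);
        ran++;
    }
    worker->idle = true;            /* step 4: back onto the idle list */
    pool->nr_idle++;
    return ran;                     /* step 5: the real thread calls schedule() here */
}
```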


dst garbage

dst garbage summary

Garbage collection is a common technique in the kernel.
When an object (a struct, a chunk of memory) becomes invalid, it needs
to be freed, but it may still be referenced by others.

For example, a dst_entry is no longer valid but is still
referenced (used) by others.

__dst_free is called for this case.
It first marks the dst as dirty (dead),
and then puts it on dst_garbage.list, linked through dst->next.

Later a workqueue task checks the dst's reference count
and frees (destroys) it once nothing references it.

The two key pieces are struct dst_garbage and dst_gc_work.
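The deferred-free idea can be sketched in user space as below. It is an illustration under assumptions, not the kernel code: the plain `int refcnt` replaces the kernel's atomic counter, locking is omitted, and `dst_alloc_model`, `dst_hold_model`, `dst_release_model`, `dst_free_model` and `dst_gc_pass` are hypothetical names. `dst_gc_pass` plays the role of one run of the periodic work.

```c
#include <assert.h>
#include <stdlib.h>

struct dst_entry {
    int refcnt;                 /* simplified; not atomic in this sketch */
    int dead;                   /* set once the entry is marked dead */
    struct dst_entry *next;     /* link in the garbage list */
};

struct dst_entry *garbage_list; /* stands in for dst_garbage.list */

struct dst_entry *dst_alloc_model(void)
{
    return calloc(1, sizeof(struct dst_entry));
}

void dst_hold_model(struct dst_entry *dst)
{
    dst->refcnt++;
}

void dst_release_model(struct dst_entry *dst)
{
    assert(dst->refcnt > 0);
    dst->refcnt--;
}

/* Models __dst_free: mark the entry dead and push it on the garbage list. */
void dst_free_model(struct dst_entry *dst)
{
    dst->dead = 1;
    dst->next = garbage_list;
    garbage_list = dst;
}

/* One gc pass: free unreferenced entries, keep the rest for a later pass. */
int dst_gc_pass(void)
{
    struct dst_entry **pp = &garbage_list;
    int freed = 0;

    while (*pp != NULL) {
        struct dst_entry *dst = *pp;

        if (dst->refcnt == 0) {
            *pp = dst->next;    /* unlink, then destroy */
            free(dst);
            freed++;
        } else {
            pp = &dst->next;    /* still in use, retry next pass */
        }
    }
    return freed;
}
```

An entry that is still held survives the first pass and is freed by a later one, after its last user drops the reference.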


dst ops

Call trace

Forwarding a packet:

> ip_rcv_finish
> > ip_route_input_noref
> > > ip_route_input_slow
> > > > fib_lookup
> > > > ip_mkroute_input
> > dst_input(skb)

> > > > ip_mkroute_input
> > > > > __mkroute_input
> > > > > > rth = rt_dst_alloc(...)
> > > > > > skb_dst_set(skb, &rth->dst);
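The reason the route lookup matters for `dst_input(skb)` is that the dst carries a function pointer which `dst_input` simply invokes. A minimal user-space sketch, assuming the route built for a forwarded packet installs a forwarding handler as its `input` callback: the structs are stripped down, and `ip_forward_model`, `mkroute_input_model`, `ip_rcv_finish_model` and the `forwarded` field are hypothetical names for illustration. The caller supplies the `rtable` here instead of allocating it.

```c
#include <assert.h>
#include <stddef.h>

struct sk_buff;

struct dst_entry {
    int (*input)(struct sk_buff *skb);   /* the "dst op" used on receive */
};

struct rtable {
    struct dst_entry dst;
};

struct sk_buff {
    struct dst_entry *dst;
    int forwarded;                       /* hypothetical marker for this sketch */
};

/* Stand-in for the forwarding path; it only records that it ran. */
int ip_forward_model(struct sk_buff *skb)
{
    skb->forwarded = 1;
    return 0;
}

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
    skb->dst = dst;
}

struct dst_entry *skb_dst(const struct sk_buff *skb)
{
    return skb->dst;
}

/* Stand-in for __mkroute_input: fill in the route and attach it to the skb. */
void mkroute_input_model(struct sk_buff *skb, struct rtable *rth)
{
    rth->dst.input = ip_forward_model;
    skb_dst_set(skb, &rth->dst);
}

/* Dispatch through whatever handler the route lookup installed. */
int dst_input(struct sk_buff *skb)
{
    assert(skb_dst(skb) != NULL);
    return skb_dst(skb)->input(skb);
}

/* Stand-in for ip_rcv_finish: route the packet if needed, then dispatch. */
int ip_rcv_finish_model(struct sk_buff *skb, struct rtable *rth)
{
    if (skb_dst(skb) == NULL)
        mkroute_input_model(skb, rth);
    return dst_input(skb);
}
```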


how to xmit a packet with Qdisc

summary

We consider an ideal, simple case:

Call Trace:

> dev_queue_xmit
> > __dev_queue_xmit(skb, NULL);
> > > rcu_read_lock_bh();
> > > txq = netdev_pick_tx(dev, skb, accel_priv);
> > > q = rcu_dereference_bh(txq->qdisc);
> > > rc = __dev_xmit_skb(skb, q, dev, txq);
> > > > skb_dst_force(skb);
> > > > q->enqueue(skb, q);
> > > > qdisc_run_begin(q)
> > > > __qdisc_run(q);
> > > > > while (qdisc_restart(q))
> > > > > > __netif_schedule
> > > > > qdisc_run_end(q)
> > > rcu_read_unlock_bh();
> > > return rc;
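The nesting of `__netif_schedule` inside the `while (qdisc_restart(q))` loop is easier to see with a quota. The sketch below is a user-space model under assumptions, not the kernel code: packets are plain ints in a toy FIFO, the `sent` and `rescheduled` counters and all `*_model` names are hypothetical, and the quota is passed in as a parameter. The idea it shows: each restart sends one packet, and if the quota runs out while packets remain, the rest is deferred instead of looping forever.

```c
#include <assert.h>
#include <stdbool.h>

#define QLEN 16

struct Qdisc {
    int pkts[QLEN];          /* toy FIFO of packet ids */
    unsigned int head, tail; /* dequeue / enqueue counters */
    bool running;
    int sent;                /* packets handed to the "driver" */
    int rescheduled;         /* times the remaining work was deferred */
};

bool qdisc_enqueue_model(struct Qdisc *q, int pkt)
{
    if (q->tail - q->head >= QLEN)
        return false;        /* queue full, packet dropped */
    q->pkts[q->tail % QLEN] = pkt;
    q->tail++;
    return true;
}

/* Dequeue and "transmit" one packet; returns true if more packets remain. */
bool qdisc_restart_model(struct Qdisc *q)
{
    if (q->head == q->tail)
        return false;
    q->head++;
    q->sent++;
    return q->head != q->tail;
}

/* Models __qdisc_run: send until empty or until the quota is used up. */
void qdisc_run_model(struct Qdisc *q, int quota)
{
    assert(q->running);
    while (qdisc_restart_model(q)) {
        if (--quota <= 0) {
            q->rescheduled++;    /* where __netif_schedule would be called */
            break;
        }
    }
    q->running = false;          /* qdisc_run_end */
}

/* Models __dev_xmit_skb for the simple case: enqueue, then try to run. */
int dev_xmit_skb_model(struct Qdisc *q, int pkt, int quota)
{
    if (!qdisc_enqueue_model(q, pkt))
        return -1;
    if (!q->running) {           /* qdisc_run_begin */
        q->running = true;
        qdisc_run_model(q, quota);
    }
    return 0;
}
```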


Qdisc running flag

Summary

struct Qdisc has two similarly named fields.
The running flag is stored in __state of struct Qdisc, NOT in state.
Each time a packet is sent from the qdisc, the running flag is
set by qdisc_run_begin and cleared afterwards by qdisc_run_end.

unsigned long           state;
...
unsigned int            __state;
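A minimal sketch of the flag handling, assuming the flag is a single bit in `__state`. The bit value `QDISC_RUNNING_BIT` is a made-up constant for this model, and the real helpers run under the qdisc lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

#define QDISC_RUNNING_BIT 1u     /* hypothetical bit value for this sketch */

struct Qdisc {
    unsigned long state;         /* other state bits, untouched here */
    unsigned int  __state;       /* holds the running flag */
};

bool qdisc_is_running(const struct Qdisc *q)
{
    return (q->__state & QDISC_RUNNING_BIT) != 0;
}

/* Returns false if someone else is already running this qdisc. */
bool qdisc_run_begin(struct Qdisc *q)
{
    if (qdisc_is_running(q))
        return false;
    q->__state |= QDISC_RUNNING_BIT;
    return true;
}

void qdisc_run_end(struct Qdisc *q)
{
    assert(qdisc_is_running(q));
    q->__state &= ~QDISC_RUNNING_BIT;
}
```

A second `qdisc_run_begin` on the same qdisc fails until `qdisc_run_end` has cleared the bit, which is what keeps two senders from dequeuing from one qdisc at the same time.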

todo

Why is busylock needed?


how to create dev qdisc

Summary

Part 1: Register a multi-queue net device.

In this part, only the framework for the qdisc is prepared,
and noop_qdisc is set as the default.

Prepare the netdev_queues.

For example, Intel igb hardware has 8 hardware tx queues,
and the NIC driver creates 8 corresponding struct netdev_queue
entries in the _tx array of struct net_device.

Prepare the mq_qdisc.

The mq_qdisc is attached to the corresponding device.
In the mq_qdisc private data, a default qdisc is
created for each of the NIC's hardware queues.
This is done in mq_init.
The default qdisc uses pfifo_fast_ops.

Attach the mq_qdisc to the netdev_queues.

In mq_attach, these qdiscs are attached to their corresponding
struct netdev_queue.
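Part 1 can be sketched in user space as three small steps: allocate the queues with noop as the placeholder, create one default qdisc per queue, then attach them. This is a model under assumptions: the structs keep only the fields needed here, a qdisc is reduced to the name of its ops, and `alloc_netdev_queues_model`, `mq_init_model`, `mq_attach_model` and `queue_uses` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Qdisc {
    const char *ops_id;              /* name of the ops behind this qdisc */
};

struct netdev_queue {
    struct Qdisc *qdisc;
};

struct net_device {
    unsigned int num_tx_queues;
    struct netdev_queue *_tx;        /* one entry per hardware tx queue */
};

struct mq_sched {                    /* private data of the mq qdisc */
    struct Qdisc **qdiscs;           /* one child qdisc per tx queue */
};

struct Qdisc noop_qdisc = { "noop" };

/* Registration step: every queue starts out pointing at noop_qdisc. */
int alloc_netdev_queues_model(struct net_device *dev, unsigned int count)
{
    dev->_tx = calloc(count, sizeof(*dev->_tx));
    if (dev->_tx == NULL)
        return -1;
    dev->num_tx_queues = count;
    for (unsigned int i = 0; i < count; i++)
        dev->_tx[i].qdisc = &noop_qdisc;
    return 0;
}

/* Models mq_init: create one default (pfifo_fast) qdisc per tx queue. */
int mq_init_model(struct mq_sched *priv, const struct net_device *dev)
{
    assert(dev->_tx != NULL);
    priv->qdiscs = calloc(dev->num_tx_queues, sizeof(*priv->qdiscs));
    if (priv->qdiscs == NULL)
        return -1;
    for (unsigned int i = 0; i < dev->num_tx_queues; i++) {
        struct Qdisc *q = malloc(sizeof(*q));

        if (q == NULL) {             /* undo the partial allocation */
            while (i > 0)
                free(priv->qdiscs[--i]);
            free(priv->qdiscs);
            priv->qdiscs = NULL;
            return -1;
        }
        q->ops_id = "pfifo_fast";
        priv->qdiscs[i] = q;
    }
    return 0;
}

/* Models mq_attach: hand each child qdisc to its netdev_queue. */
void mq_attach_model(const struct mq_sched *priv, struct net_device *dev)
{
    for (unsigned int i = 0; i < dev->num_tx_queues; i++)
        dev->_tx[i].qdisc = priv->qdiscs[i];
}

/* Helper: does tx queue i currently use a qdisc with the given ops name? */
int queue_uses(const struct net_device *dev, unsigned int i, const char *id)
{
    return strcmp(dev->_tx[i].qdisc->ops_id, id) == 0;
}
```

With 8 queues, as in the igb example, each `_tx[i]` ends up with its own pfifo_fast instance rather than sharing one.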

Part 2: Activate a net device with the right qdiscs

Only the mq_qdisc case is traced here.
When the dev is brought up, dev_open is called, which in turn calls dev_activate.


qdisc study part1: qdisc_base

### Qdisc_ops is the core of a Qdisc.
Every kind of Qdisc_ops is linked into a list headed by qdisc_base.
The key that distinguishes one Qdisc_ops from another is id[IFNAMSIZ].

Note: the list is a plain singly-linked list, not the kernel's usual list_head list.

struct Qdisc_ops {
    struct Qdisc_ops *next;
    const struct Qdisc_class_ops *cl_ops;
    char id[IFNAMSIZ];
    int priv_size;

    int (*enqueue)(struct sk_buff *, struct Qdisc *);
    struct sk_buff * (*dequeue)(struct Qdisc *);
    struct sk_buff * (*peek)(struct Qdisc *);
    unsigned int (*drop)(struct Qdisc *);

    int (*init)(struct Qdisc *, struct nlattr *arg);
    void (*reset)(struct Qdisc *);
    void (*destroy)(struct Qdisc *);
    int (*change)(struct Qdisc *, struct nlattr *arg);
    void (*attach)(struct Qdisc *);

    int (*dump)(struct Qdisc *, struct sk_buff *);
    int (*dump_stats)(struct Qdisc *, struct gnet_dump *);

    struct module *owner;
};

qdisc_base

/* The list of all installed queueing disciplines. */

static struct Qdisc_ops *qdisc_base;
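Since the list is singly linked through `next` and keyed by `id`, registration amounts to walking to the tail while checking for a duplicate id. A user-space sketch of that idea: `register_qdisc_model` and `qdisc_lookup_ops_model` are hypothetical names, the struct keeps only `next` and `id`, and the locking the kernel needs around the list is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16

struct Qdisc_ops {
    struct Qdisc_ops *next;      /* singly-linked, NULL terminated */
    char id[IFNAMSIZ];           /* the key: must be unique in the list */
};

struct Qdisc_ops *qdisc_base;    /* head of the list of installed ops */

/* Append qops to the list unless an entry with the same id already exists. */
int register_qdisc_model(struct Qdisc_ops *qops)
{
    struct Qdisc_ops *q, **qp;

    assert(qops != NULL);
    for (qp = &qdisc_base; (q = *qp) != NULL; qp = &q->next)
        if (strcmp(qops->id, q->id) == 0)
            return -EEXIST;

    qops->next = NULL;
    *qp = qops;                  /* qp now points at the tail's next slot */
    return 0;
}

/* Find an installed ops entry by id, or NULL if none matches. */
struct Qdisc_ops *qdisc_lookup_ops_model(const char *id)
{
    struct Qdisc_ops *q;

    for (q = qdisc_base; q != NULL; q = q->next)
        if (strcmp(id, q->id) == 0)
            return q;
    return NULL;
}
```

The pointer-to-pointer walk means the empty list and the append-at-tail case share one code path, which is a common reason to prefer this pattern for small singly-linked lists.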
