For the ixgbe NIC, we want to assign a tx hardware queue to each CPU, so that the tx softirq on each CPU uses its own corresponding hardware queue.
Each packet selects a software queue in dev_queue_xmit(). We rewrote the ixgbe driver's ndo_select_queue hook (ixgbe_select_queue) to return the current CPU index (0-based) at queue-selection time, so that each CPU uses its own tx queue.
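A minimal sketch of such a hook follows; the body is our assumption of what the rewrite looks like, and the exact ndo_select_queue prototype varies across kernel versions:

static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
{
	/* map each transmitting CPU to its own tx queue; the modulo
	 * guards against having fewer tx queues than online CPUs */
	return smp_processor_id() % dev->real_num_tx_queues;
}

smp_processor_id() is safe here because dev_queue_xmit() runs with bottom halves disabled, so the task cannot migrate to another CPU mid-selection.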
However, we found that some packets had a mismatched queue index when sent on a specific CPU. For example, a packet whose queue index is 5 may be sent by cpu3; cpu3 then operates tx hw queue5, which should only ever be touched by cpu5.
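One way to observe the mismatch is a hypothetical debug check in the driver's xmit path (this check is our illustration, not part of the original driver):

	/* warn when the queue chosen for this skb does not match the
	 * CPU that is actually transmitting it */
	if (skb_get_queue_mapping(skb) != smp_processor_id())
		pr_warn("ixgbe: skb for queue %u transmitted on cpu %d\n",
			skb_get_queue_mapping(skb), smp_processor_id());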
Analysis
When the watchdog starts, it first freezes all subqueues and then does the timeout check. At the end, it resumes the subqueues and reschedules them.
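This is dev_watchdog() in net/sched/sch_generic.c, which brackets the check with netif_tx_lock()/netif_tx_unlock(). Roughly (simplified from the kernel of that era; details vary by version):

static void dev_watchdog(unsigned long arg)
{
	struct net_device *dev = (struct net_device *)arg;

	netif_tx_lock(dev);	/* freeze every subqueue */
	/* ... walk dev->num_tx_queues, detect a stopped queue whose
	 * trans_start is older than watchdog_timeo, and call the
	 * driver's ndo_tx_timeout() if one is found ... */
	netif_tx_unlock(dev);	/* unfreeze and reschedule every subqueue */
	/* ... re-arm the watchdog timer as needed ... */
}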
Because the watchdog runs from a timer, the reschedule of a queue can happen on a CPU that does not match the packets' queue index.
For example: a packet runs select queue on cpu1 while cpu2 runs the watchdog. The packet is stored in queue1 but not yet sent. When cpu2 finishes the watchdog, queue1 is rescheduled. Note that queue1 then starts running on cpu2, not cpu1, which is neither expected nor safe, and it can cause the tx ring buffer to hang.
static inline void netif_tx_lock(struct net_device *dev)
{
	unsigned int i;
	int cpu;

	spin_lock(&dev->tx_global_lock);
	cpu = smp_processor_id();
	for (i = 0; i < dev->num_tx_queues; i++) {
		struct netdev_queue *txq = netdev_get_tx_queue(dev, i);

		/* We are the only thread of execution doing a
		 * freeze, but we have to grab the _xmit_lock in
		 * order to synchronize with threads which are in
		 * the ->hard_start_xmit() handler and already
		 * checked the frozen bit.
		 */
		__netif_tx_lock(txq, cpu);
		set_bit(__QUEUE_STATE_FROZEN, &txq->state);
		__netif_tx_unlock(txq);
	}
}
static inline void netif_tx_unlock(struct net_device *dev)
{
	unsigned int i;

	for (i = 0; i < dev->num_tx_queues; i++) {
		struct netdev_queue *txq = netdev_get_tx_queue(dev, i);

		/* No need to grab the _xmit_lock here. If the
		 * queue is not stopped for another reason, we
		 * force a schedule.
		 */
		clear_bit(__QUEUE_STATE_FROZEN, &txq->state);
		netif_schedule_queue(txq);
	}
	spin_unlock(&dev->tx_global_lock);
}
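The netif_schedule_queue() call above is where the CPU mismatch originates. It leads (via __netif_schedule()) to __netif_reschedule(), which puts the qdisc on the softnet_data output_queue of the CPU that is currently running, i.e. the watchdog CPU, and raises NET_TX_SOFTIRQ there. Simplified from net/core/dev.c (the exact output_queue list handling differs across kernel versions):

static inline void __netif_reschedule(struct Qdisc *q)
{
	struct softnet_data *sd;
	unsigned long flags;

	local_irq_save(flags);
	/* note: the *current* CPU's softnet_data, so the later tx
	 * softirq runs on whichever CPU called netif_schedule_queue() */
	sd = &__get_cpu_var(softnet_data);
	q->next_sched = sd->output_queue;
	sd->output_queue = q;
	raise_softirq_irqoff(NET_TX_SOFTIRQ);
	local_irq_restore(flags);
}

The subsequent NET_TX_SOFTIRQ therefore runs the qdisc, and hence the driver's xmit path for queue1, on cpu2 rather than cpu1.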