isolcpus=       [KNL,SMP] Isolate CPUs from the general scheduler.
                Format:
                <cpu number>,...,<cpu number>
                or
                <cpu number>-<cpu number>
                (must be a positive range in ascending order)
                or a mixture
                <cpu number>,...,<cpu number>-<cpu number>

                This option can be used to specify one or more CPUs
                to isolate from the general SMP balancing and scheduling
                algorithms. You can move a process onto or off an
                "isolated" CPU via the CPU affinity syscalls or cpuset.
                <cpu number> begins at 0 and the maximum value is
                "number of CPUs in system - 1".

                This option is the preferred way to isolate CPUs. The
                alternative -- manually setting the CPU mask of all
                tasks in the system -- can cause problems and
                suboptimal load balancer performance.
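To make the list format above concrete, here is a minimal user-space sketch of a parser for it. This is not the kernel's own parsing code; the helper name parse_cpu_list and the MAX_CPUS bound are assumptions made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 1024   /* arbitrary bound chosen for this sketch */

/*
 * Hypothetical helper: parse a list such as "1,3,5-7" into mask[],
 * where mask[i] is true when CPU i is listed.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_cpu_list(const char *s, bool mask[MAX_CPUS])
{
    assert(s != NULL && mask != NULL);
    memset(mask, 0, MAX_CPUS * sizeof(mask[0]));

    if (*s == '\0')
        return -1;                      /* an empty list is rejected */

    while (*s) {
        char *end;
        long lo, hi;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = strtol(s, &end, 10);
        hi = lo;

        if (*end == '-') {              /* "<cpu number>-<cpu number>" */
            s = end + 1;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtol(s, &end, 10);
        }

        /* ranges must be ascending and inside the sketch's bound */
        if (lo > hi || hi >= MAX_CPUS)
            return -1;
        for (long cpu = lo; cpu <= hi; cpu++)
            mask[cpu] = true;

        if (*end == ',') {
            end++;
            if (*end == '\0')
                return -1;              /* trailing comma */
        } else if (*end != '\0') {
            return -1;                  /* unexpected character */
        }
        s = end;
    }
    return 0;
}
```

Calling parse_cpu_list("1,3,5-7", mask) marks CPUs 1, 3, 5, 6 and 7, while a descending range such as "7-5" is rejected, matching the "ascending order" rule quoted above.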
Analysis:
This is a common problem; kernel 3.10 is used to analyze it.
Why does kthreadd (pid 2) have the wrong affinity? When the kernel boots, pid 0 creates the init process (pid 1) and the kthreadd process (pid 2). Both are created by rest_init in init/main.c.
All kernel threads created by kthread_create/kthread_run are children of kthreadd (pid 2).
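The parent/child relationship can be checked from user space. The following is a minimal sketch that reads the parent pid from /proc/<pid>/stat; the helper name parent_of is hypothetical, and the sketch assumes a Linux system with /proc mounted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the parent pid of `pid` (0 means the
 * calling process) by reading /proc/<pid>/stat, or -1 on error.
 */
static pid_t parent_of(pid_t pid)
{
    char path[64], buf[1024];
    FILE *f;
    size_t n;
    char *p;
    char state;
    int ppid;

    assert(pid >= 0);
    if (pid == 0)
        pid = getpid();

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The comm field may contain spaces or ')', so skip past the last ')'. */
    p = strrchr(buf, ')');
    if (!p)
        return -1;

    /* The fields after comm are: state, then ppid. */
    if (sscanf(p + 1, " %c %d", &state, &ppid) != 2)
        return -1;
    return (pid_t)ppid;
}
```

On a system like the one analyzed here, calling parent_of() with the pid of a thread created through kthread_create/kthread_run would be expected to return 2.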
In the kernel source, kthreadd (pid 2) is identified by ‘kthreadd_task’ and runs the function ‘kthreadd’.
The body of ‘kthreadd’ is mainly an infinite loop that checks whether any new kernel threads need to be created and, if so, creates them.
Before entering the infinite loop, the kernel changes the cpumask/affinity of pid 2 to all CPUs; see line 453 of kernel/kthread.c.
    static noinline void __init_refok rest_init(void)
    {
            int pid;

            rcu_scheduler_starting();
            /*
             * We need to spawn init first so that it obtains pid 1, however
             * the init task will end up wanting to create kthreads, which, if
             * we schedule it before we create kthreadd, will OOPS.
             */
            kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);      /* <== pid 1 */
            numa_default_policy();
            pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);     /* <== pid 2 */
            rcu_read_lock();
            kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
            rcu_read_unlock();
            complete(&kthreadd_done);
In function ‘kthreadd’, file kernel/kthread.c
    446 int kthreadd(void *unused)
    447 {
    448         struct task_struct *tsk = current;
    449
    450         /* Setup a clean context for our children to inherit. */
    451         set_task_comm(tsk, "kthreadd");
    452         ignore_signals(tsk);
    453         set_cpus_allowed_ptr(tsk, cpu_all_mask);      <========
    454         set_mems_allowed(node_states[N_MEMORY]);
    455
    456         current->flags |= PF_NOFREEZE;
    457
    458         for (;;) {
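The effect of that set_cpus_allowed_ptr call can be observed from user space with the CPU affinity syscalls. Below is a minimal sketch, assuming glibc's sched_getaffinity wrapper and a system with no more CPUs than a fixed-size cpu_set_t can hold; the helper name allowed_cpu_count is hypothetical.

```c
#define _GNU_SOURCE             /* needed for cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: number of CPUs in the affinity mask of `pid`
 * (0 means the calling thread), or -1 if the mask cannot be read.
 * A fixed-size cpu_set_t is used, so the call can fail on machines
 * with more CPUs than the set can describe.
 */
static int allowed_cpu_count(pid_t pid)
{
    cpu_set_t set;

    assert(pid >= 0);
    CPU_ZERO(&set);
    if (sched_getaffinity(pid, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

If the analysis here holds, allowed_cpu_count(2) on a machine booted with isolcpus= should report every CPU, isolated ones included, whereas an ordinary user process started by init should report only the non-isolated CPUs.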
Why do kernel threads created by ‘kthread_create’ still have the wrong affinity, even after kthreadd (pid 2) has been set to the right affinity? kthread_create is just a macro that wraps kthread_create_on_node. In kthread_create_on_node, the new kernel thread's affinity/cpumask is set to cpu_all_mask (all CPUs) right after the thread is created. See line 285.