These patches implement a user interface for setting task CPU affinity.

  The kernel already respects CPU affinity, which is stored in the cpus_allowed
  member of the task_struct.

  I have two implementations: a proc-based interface and a syscall-based
  interface, both for 2.4 and 2.5.  The syscall-based implementation was merged
  as of 2.5.8-pre3.

syscall-based user interface for setting CPU affinity

  This patch implements the following syscalls:

  int sched_setaffinity(pid_t pid, unsigned int len,
                        unsigned long *new_mask_ptr)
  int sched_getaffinity(pid_t pid, unsigned int len,
                       unsigned long *user_mask_ptr)

  which set and get affinity, respectively.  They use the set_cpus_allowed
  method in Ingo Molnar's new O(1) scheduler to simplify the work they
  need to do.
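
  As a minimal sketch of calling the two syscalls directly, assuming a Linux
  system whose <sys/syscall.h> defines SYS_sched_setaffinity and
  SYS_sched_getaffinity: the helper names (raw_setaffinity, raw_getaffinity,
  pin_self_to_lowest_cpu) and the AFF_WORDS buffer size are made up for
  illustration and are not taken from the patch.  A pid of 0 refers to the
  calling task on current kernels.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffer size: 128 words is assumed to be at least as large
 * as the kernel's mask on the machine this runs on. */
#define AFF_WORDS 128
#define BITS_PER_WORD (8 * sizeof(unsigned long))

/* Thin wrappers with the same argument order as the prototypes above. */
static long raw_setaffinity(pid_t pid, unsigned int len,
                            unsigned long *new_mask_ptr)
{
    return syscall(SYS_sched_setaffinity, pid, len, new_mask_ptr);
}

static long raw_getaffinity(pid_t pid, unsigned int len,
                            unsigned long *user_mask_ptr)
{
    return syscall(SYS_sched_getaffinity, pid, len, user_mask_ptr);
}

/* Restrict the calling task to the lowest-numbered CPU it is currently
 * allowed to run on.  Returns that CPU number, or -1 on failure. */
static int pin_self_to_lowest_cpu(void)
{
    unsigned long mask[AFF_WORDS] = {0};
    unsigned long one[AFF_WORDS] = {0};
    size_t w;
    unsigned int b;

    if (raw_getaffinity(0, sizeof(mask), mask) < 0)
        return -1;

    for (w = 0; w < AFF_WORDS; w++) {
        if (mask[w] == 0)
            continue;
        for (b = 0; b < BITS_PER_WORD; b++) {
            if (mask[w] & (1UL << b)) {
                one[w] = 1UL << b;   /* keep only this CPU's bit */
                if (raw_setaffinity(0, sizeof(one), one) < 0)
                    return -1;
                return (int)(w * BITS_PER_WORD + b);
            }
        }
    }
    return -1;                       /* no bit was set in the mask */
}
```

  A caller would simply invoke pin_self_to_lowest_cpu() and treat a negative
  return as failure, with errno set by the failing syscall.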

  The set syscall implements security: the caller must possess CAP_SYS_NICE or
  be the same uid as the task in question.  Anyone can call the get method.
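
  The rule above can be modelled in user space as a small predicate.  This is
  only a sketch of the rule as stated here, not the kernel's code; the
  function name and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <sys/types.h>

/* User-space model of the permission rule: setting affinity is allowed if
 * the caller holds CAP_SYS_NICE or has the same uid as the target task.
 * has_cap_sys_nice is a stand-in for a real capability check. */
static bool may_set_affinity(uid_t caller_uid, bool has_cap_sys_nice,
                             uid_t target_uid)
{
    return has_cap_sys_nice || caller_uid == target_uid;
}
```

  Getting the affinity needs no such check, since anyone may call the get
  method.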

  The `len' parameter allows for future changes in the size of the cpus_allowed
  bitmask.  Note this used to be an `unsigned int *' where you passed a pointer
  to your size and got in return a pointer to the system size.  Linus did not
  like this solution and wanted just the length itself passed in.  Now the
  function returns the number of bytes of the mask on success.  Don't ask how
  you are supposed to learn that size in advance.  Some of the patches here
  still have the old pointer-based behavior.
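
  One way to learn the mask size, sketched here for current Linux kernels
  (behavior of the patches in this directory is not verified), is to call the
  get syscall with a deliberately oversized buffer and read the byte count
  from the return value.  The raw syscall is used because the glibc wrapper
  returns 0 on success instead.  The name kernel_mask_bytes and the
  PROBE_WORDS size are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical probe size, assumed to be larger than the kernel's mask. */
#define PROBE_WORDS 128

/* Returns the number of mask bytes the kernel copied out for the calling
 * task (pid 0), or -1 with errno set if the call failed. */
static long kernel_mask_bytes(void)
{
    unsigned long buf[PROBE_WORDS] = {0};

    return syscall(SYS_sched_getaffinity, 0,
                   (unsigned int)sizeof(buf), buf);
}
```

  The returned count can then be passed as `len' in later calls.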

  The affinity-test.c and affinity-run.c programs provide example code using
  the new syscalls.  The former is a simple test; the latter can be used for
  changing the affinity of already running tasks.

  This patch was merged into 2.5.8-pre3.

User-space use of the syscalls

  The `schedutils' package of scheduling utilities contains a program to
  manipulate task CPU affinity using the syscalls.  There are also example
  programs here.

  At the time of this writing, glibc has not yet been updated with support for
  the new syscalls.  There is a glibc patch in this directory to add support.
  If you do not feel like recompiling your C library, there is also an
  affinity.h header to do the work.

proc-based user interface for setting CPU affinity

  With this patch, reading and writing /proc/<pid>/affinity will get and set
  the affinity.

  The read mask will be ANDed with cpu_online_map, so that only valid bits are
  returned.  The written data must have _some_ valid bits in it.  For example,
  ffffffff is valid on my 2-way system but 01000000 probably is not.  When a
  new mask is set, a reschedule is forced to put the task on a legal CPU.
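
  The masking rules can be illustrated with a user-space sketch.  It assumes
  the mask is written as a hexadecimal string, as the examples above suggest,
  and that it fits in one unsigned long; the function names are hypothetical
  and this is not the patch's code.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* What a reader would see: allowed bits limited to online CPUs. */
static unsigned long visible_mask(unsigned long cpus_allowed,
                                  unsigned long online_map)
{
    return cpus_allowed & online_map;
}

/* Classify a written hex mask against the online map.
 * Returns 1 if it names at least one online CPU, 0 if it names none,
 * and -1 if the text is not a hex number that fits in an unsigned long. */
static int check_written_mask(const char *text, unsigned long online_map)
{
    char *end;
    unsigned long mask;

    if (!isxdigit((unsigned char)text[0]))
        return -1;

    errno = 0;
    mask = strtoul(text, &end, 16);
    if (errno != 0)
        return -1;
    while (*end == '\n' || *end == ' ')   /* tolerate trailing whitespace */
        end++;
    if (*end != '\0')
        return -1;

    return (mask & online_map) != 0;
}
```

  With an online map of 0x3 (a 2-way system), check_written_mask("ffffffff",
  0x3) accepts the mask while check_written_mask("01000000", 0x3) rejects it.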

  Security is implemented: the writer must possess CAP_SYS_NICE or be the same
  uid as the task in question.  Anyone can read the data.

  Note I had to implement a proc_write function for the procfs (pid) code.
  This is generic and can be used by other writable entries.

Both of these patches borrow from each other and from Ingo Molnar's CPU
affinity syscall work.

Robert Love
10 April 2002