1. 05 Oct, 2014 1 commit
  2. 17 Sep, 2014 1 commit
  3. 05 Sep, 2014 2 commits
  4. 07 Aug, 2014 1 commit
    • H. Peter Anvin's avatar
      x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack · 7ecbe6e0
      H. Peter Anvin authored
      commit 3891a04a upstream.
      
      The IRET instruction, when returning to a 16-bit segment, only
      restores the bottom 16 bits of the user space stack pointer.  This
      causes some 16-bit software to break, but it also leaks kernel state
      to user space.  We have a software workaround for that ("espfix") for
      the 32-bit kernel, but it relies on a nonzero stack segment base which
      is not available in 64-bit mode.
      
      In checkin:
      
          b3b42ac2
      
       x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
      the logic that 16-bit support is crippled on 64-bit kernels anyway (no
      V86 support), but it turns out that people are doing stuff like
      running old Win16 binaries under Wine and expect it to work.
      
      This works around this by creating percpu "ministacks", each of which
      is mapped 2^16 times 64K apart.  When we detect that the return SS is
      on the LDT, we copy the IRET frame to the ministack and use the
      relevant alias to return to userspace.  The ministacks are mapped
      readonly, so if IRET faults we promote #GP to #DF which is an IST
      vector and thus has its own stack; we then do the fixup in the #DF
      handler.
      
      (Making #GP an IST exception would make the msr_safe functions unsafe
      in NMI/MC context, and quite possibly have other effects.)
      
      Special thanks to:
      
      - Andy Lutomirski, for the suggestion of using very small stack slots
        and copy (as opposed to map) the IRET frame there, and for the
        suggestion to mark them readonly and let the fault promote to #DF.
      - Konrad Wilk for paravirt fixup and testing.
      - Borislav Petkov for testing help and useful comments.
      Reported-by: default avatarBrian Gerst <brgerst@gmail.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      
      
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Lutomriski <amluto@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Dirk Hohndel <dirk@hohndel.org>
      Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
      Cc: comex <comexk@gmail.com>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # consider after upstream merge
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ecbe6e0
  5. 17 Jul, 2014 1 commit
  6. 09 Jul, 2014 1 commit
    • David Rientjes's avatar
      mm, pcp: allow restoring percpu_pagelist_fraction default · 1f8e5d43
      David Rientjes authored
      commit 7cd2b0a3
      
       upstream.
      
      Oleg reports a division by zero error on zero-length write() to the
      percpu_pagelist_fraction sysctl:
      
          divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
          CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
          Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
          task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
          RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
          RSP: 0018:ffff8800d87a3e78  EFLAGS: 00010246
          RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
          RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
          RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
          R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
          R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
          FS:  00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
          Call Trace:
            proc_sys_call_handler+0xb3/0xc0
            proc_sys_write+0x14/0x20
            vfs_write+0xba/0x1e0
            SyS_write+0x46/0xb0
            tracesys+0xe1/0xe6
      
      However, if the percpu_pagelist_fraction sysctl is set by the user, it
      is also impossible to restore it to the kernel default since the user
      cannot write 0 to the sysctl.
      
      This patch allows the user to write 0 to restore the default behavior.
      It still requires a fraction equal to or larger than 8, however, as
      stated by the documentation for sanity.  If a value in the range [1, 7]
      is written, the sysctl will return EINVAL.
      
      This successfully solves the divide by zero issue at the same time.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Reported-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f8e5d43
  7. 07 Jul, 2014 1 commit
  8. 01 Jul, 2014 1 commit
    • Naoya Horiguchi's avatar
      mm/memory-failure.c: support use of a dedicated thread to handle SIGBUS(BUS_MCEERR_AO) · 98143390
      Naoya Horiguchi authored
      commit 3ba08129
      
       upstream.
      
      Currently memory error handler handles action optional errors in the
      deferred manner by default.  And if a recovery aware application wants
      to handle it immediately, it can do it by setting PF_MCE_EARLY flag.
      However, such signal can be sent only to the main thread, so it's
      problematic if the application wants to have a dedicated thread to
      handler such signals.
      
      So this patch adds dedicated thread support to memory error handler.  We
      have PF_MCE_EARLY flags for each thread separately, so with this patch
      AO signal is sent to the thread with PF_MCE_EARLY flag set, not the main
      thread.  If you want to implement a dedicated thread, you call prctl()
      to set PF_MCE_EARLY on the thread.
      
      Memory error handler collects processes to be killed, so this patch lets
      it check PF_MCE_EARLY flag on each thread in the collecting routines.
      
      No behavioral change for all non-early kill cases.
      
      Tony said:
      
      : The old behavior was crazy - someone with a multithreaded process might
      : well expect that if they call prctl(PF_MCE_EARLY) in just one thread, then
      : that thread would see the SIGBUS with si_code = BUS_MCEERR_A0 - even if
      : that thread wasn't the main thread for the process.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: default avatarTony Luck <tony.luck@intel.com>
      Cc: Kamil Iskra <iskra@mcs.anl.gov>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chen Gong <gong.chen@linux.jf.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98143390
  9. 26 Jun, 2014 1 commit
    • Mimi Zohar's avatar
      ima: audit log files opened with O_DIRECT flag · ca7919db
      Mimi Zohar authored
      commit f9b2a735
      
       upstream.
      
      Files are measured or appraised based on the IMA policy.  When a
      file, in policy, is opened with the O_DIRECT flag, a deadlock
      occurs.
      
      The first attempt at resolving this lockdep temporarily removed the
      O_DIRECT flag and restored it, after calculating the hash.  The
      second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this
      flag, do_blockdev_direct_IO() would skip taking the i_mutex a second
      time.  The third attempt, by Dmitry Kasatkin, resolves the i_mutex
      locking issue, by re-introducing the IMA mutex, but uncovered
      another problem.  Reading a file with O_DIRECT flag set, writes
      directly to userspace pages.  A second patch allocates a user-space
      like memory.  This works for all IMA hooks, except ima_file_free(),
      which is called on __fput() to recalculate the file hash.
      
      Until this last issue is addressed, do not 'collect' the
      measurement for measuring, appraising, or auditing files opened
      with the O_DIRECT flag set.  Based on policy, permit or deny file
      access.  This patch defines a new IMA policy rule option named
      'permit_directio'.  Policy rules could be defined, based on LSM
      or other criteria, to permit specific applications to open files
      with the O_DIRECT flag set.
      
      Changelog v1:
      - permit or deny file access based IMA policy rules
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Acked-by: default avatarDmitry Kasatkin <d.kasatkin@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca7919db
  10. 11 Jun, 2014 3 commits
  11. 07 Jun, 2014 4 commits
  12. 06 May, 2014 3 commits
  13. 24 Mar, 2014 1 commit
  14. 20 Mar, 2014 1 commit
  15. 12 Mar, 2014 1 commit
  16. 06 Mar, 2014 2 commits
  17. 05 Mar, 2014 1 commit
    • Mike Snitzer's avatar
      dm thin: ensure user takes action to validate data and metadata consistency · 07f2b6e0
      Mike Snitzer authored
      
      If a thin metadata operation fails the current transaction will abort,
      whereby causing potential for IO layers up the stack (e.g. filesystems)
      to have data loss.  As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
      thin metadata's superblock which:
      1) requires the user verify the thin metadata is consistent (e.g. use
         thin_check, etc)
      2) suggests the user verify the thin data is consistent (e.g. use fsck)
      
      The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
      to run thin_repair.
      
      On metadata operation failure: abort current metadata transaction, set
      pool in read-only mode, and now set the needs_check flag.
      
      As part of this change, constraints are introduced or relaxed:
      * don't allow a pool to transition to write mode if needs_check is set
      * don't allow data or metadata space to be resized if needs_check is set
      * if a thin pool's metadata space is exhausted: the kernel will now
        force the user to take the pool offline for repair before the kernel
        will allow the metadata space to be extended.
      
      Also, update Documentation to include information about when the thin
      provisioning target commits metadata, how it handles metadata failures
      and running out of space.
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      07f2b6e0
  18. 03 Mar, 2014 1 commit
    • Oliver Hartkopp's avatar
      can: remove CAN FD compatibility for CAN 2.0 sockets · 821047c4
      Oliver Hartkopp authored
      In commit e2d265d3
      
       (canfd: add support for CAN FD in CAN_RAW sockets)
      CAN FD frames with a payload length up to 8 byte are passed to legacy
      sockets where the CAN FD support was not enabled by the application.
      
      After some discussions with developers at a fair this well meant feature
      leads to confusion as no clean switch for CAN / CAN FD is provided to the
      application programmer. Additionally a compatibility like this for legacy
      CAN_RAW sockets requires some compatibility handling for the sending, e.g.
      make CAN2.0 frames a CAN FD frame with BRS at transmission time (?!?).
      
      This will become a mess when people start to develop applications with
      real CAN FD hardware. This patch reverts the bad compatibility code
      together with the documentation describing the removed feature.
      Acked-by: default avatarStephane Grosjean <s.grosjean@peak-system.com>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      821047c4
  19. 25 Feb, 2014 1 commit
  20. 24 Feb, 2014 1 commit
  21. 22 Feb, 2014 1 commit
  22. 19 Feb, 2014 1 commit
  23. 18 Feb, 2014 1 commit
  24. 15 Feb, 2014 2 commits
  25. 14 Feb, 2014 2 commits
  26. 13 Feb, 2014 4 commits