BoundHook: Exception-Based Kernel-Controlled User-Mode Hooking
Author: Kasif Dekel, Security Researcher at CyberArk
Prologue
In this article, we present a new hooking technique that we discovered during
our research work.
Hooking techniques give you control over the way an operating system or a piece
of software behaves. Software that uses hooks includes application security
solutions, system utilities, programming tools (e.g. for interception,
debugging and extending software), malicious software (e.g. rootkits) and many
others.
Please note that this is neither a privilege-elevation nor an exploitation
technique. It can be used in a post-exploitation scenario in which the attacker
already has control over the asset. Since malicious kernel code (rootkits) often
seeks to establish persistence in unfriendly territory, stealth technology plays
a fundamental role.
Technical Description
The idea behind the BoundHook technique is to cause an exception at a very
specific location in a user-mode context and then catch that exception to gain
control over the thread’s execution.
To do this, we can use the BOUND instruction, a legacy x86 instruction that
raises the same bound range exceeded exception (#BR) used by Intel MPX (Memory
Protection Extensions). Together with compiler, runtime library and OS support,
such bounds checks are designed to increase software security by checking
pointer references whose normal compile-time intentions are maliciously
exploited at runtime due to memory corruption vulnerabilities.
In a nutshell, the BOUND instruction checks an array index against a pair of
bounds and raises interrupt 5 (#BR) if the test fails (32-bit: nt!KiTrap05,
64-bit: nt!KiBoundFault).
Why not just do a comparison, you ask? Because the instruction is designed to
generate a fault that enables the OS to examine the bound check failure.
The instruction’s syntax is as follows:
BOUND r16, m16&16: checks whether r16 (array index) is within the bounds
specified by m16&16
BOUND r32, m32&32: checks whether r32 (array index) is within the bounds
specified by m32&32
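To make the check concrete, here is a minimal software model of the 16-bit
form. It is only a sketch: the function name bound16_faults is hypothetical, and
it assumes the usual semantics of the instruction, in which the memory operand
holds a signed lower bound followed by a signed upper bound and the fault is
raised when the index lies outside that inclusive range.

```c
#include <stdint.h>

/* Software model of BOUND r16, m16&16 (hypothetical helper, not an OS API).
 * bounds[0] is the signed lower bound, bounds[1] the signed upper bound.
 * Returns 1 where the CPU would raise #BR (vector 5), 0 where execution
 * would simply continue with the next instruction. */
int bound16_faults(int16_t index, const int16_t bounds[2])
{
    int16_t lower = bounds[0];
    int16_t upper = bounds[1];

    /* Both ends of the range are inclusive. */
    return (index < lower) || (index > upper);
}
```

Because the comparison is signed, an index of zero only faults when the lower
bound is positive or the upper bound is negative.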
When a bound fault occurs, the trap handler calls nt!KiHandleBound, which then
executes the registered bounds-exception callback routines.
A kernel-mode driver or a shellcode payload running in kernel mode can register
a callback routine for bound faults using nt!KeRegisterBoundCallback. This
function is not exposed through the WDK headers, so a pointer to it has to be
obtained dynamically.
The callback routine takes no parameters and returns a BOUND_CALLBACK_STATUS
value.
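The original listing for this type is not preserved here. As a rough sketch, the
enumeration looks like the following in the public Windows kernel headers as far
as we recall; treat the member names, their values and the comments on their
meaning as assumptions to verify against your own headers. The function pointer
type and the pass_through_callback routine are hypothetical and only illustrate
the shape described above.

```c
/* ASSUMPTION: reproduced from memory of the Windows kernel headers;
 * verify names and values before relying on them. */
typedef enum _BOUND_CALLBACK_STATUS {
    BoundExceptionContinueSearch = 0, /* not handled here, keep searching */
    BoundExceptionHandled,            /* fault was handled by this callback */
    BoundExceptionError,              /* callback hit an error */
    BoundExceptionMaximum             /* sentinel, not a real status */
} BOUND_CALLBACK_STATUS;

/* Shape of a bound callback: no parameters, returns a status.
 * (Hypothetical typedef name for this sketch.) */
typedef BOUND_CALLBACK_STATUS (*BOUND_CALLBACK_FN)(void);

/* Hypothetical callback that declines every fault it sees. */
BOUND_CALLBACK_STATUS pass_through_callback(void)
{
    return BoundExceptionContinueSearch;
}
```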
After registering the bound fault callback, the kernel-mode code should obtain
the base address of the user-mode DLL (or any other PE) and calculate the
address of the function that it is about to hook.
Obtaining a function address is a simple task and can be accomplished in various
ways, for example by parsing the PE header. Please note that parsing an image
that is loaded into a specific process should be done in that process’s context
or by using the appropriate APIs.
Once our code has calculated the function address, it would be nice to simply
start writing to that address. However, because this code resides in
read/execute-only memory, we are unable to do this.
Windows memory protection relies on the following factors:
· The R/W flag in PDEs and PTEs (read only = 0, read/write = 1).
· The U/S flag in PDEs and PTEs (supervisor mode = 0, user mode = 1).
· The WP flag in the CR0 register (bit 16, i.e. the 17th bit).
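How these three flags combine can be sketched with plain integer arithmetic. The
example below is a simplified model, not kernel code: the helper names are
hypothetical, it looks at a single paging entry only (real hardware combines the
flags of every level of the page-table walk), and it adds the Present bit (bit
0), which the list above does not mention.

```c
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)   /* P:   entry is valid (added assumption) */
#define PTE_RW      (1ULL << 1)   /* R/W: 0 = read only, 1 = read/write     */
#define PTE_US      (1ULL << 2)   /* U/S: 0 = supervisor, 1 = user          */
#define CR0_WP      (1ULL << 16)  /* WP:  bit 16 of CR0                     */

/* Would a supervisor-mode (ring 0) write through this entry succeed? */
int supervisor_can_write(uint64_t pte, uint64_t cr0)
{
    if (!(pte & PTE_PRESENT))
        return 0;
    if (pte & PTE_RW)
        return 1;
    /* Read-only page: writable from ring 0 only while WP is clear. */
    return (cr0 & CR0_WP) == 0;
}

/* Would a user-mode (ring 3) write through this entry succeed? */
int user_can_write(uint64_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_US) && (pte & PTE_RW);
}
```

For a read-only user page such as DLL code, user_can_write returns 0, and
supervisor_can_write returns 0 while WP is set and 1 once it is cleared.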
Now we have a few options. We can write to that address in a way that triggers
the COW (copy-on-write) protection or, to achieve maximum stealth, we can write
directly to the function address in one of two ways. We can either manipulate
the CR0 register using __readcr0() and __writecr0(), or we can allocate our own
memory descriptor list (MDL) to describe the memory pages and adjust the
permissions on the MDL by OR-ing the MDL_MAPPED_TO_SYSTEM_VA flag into its
flags. The MDL approach is much more “stealthy”, since it is completely
invisible to the current PatchGuard implementation.
First, here’s how we can use the CR0 approach. The description of the WP flag in
the CR0 register, taken from the Intel 64 and IA-32 Architectures Software
Developer’s Manual, reads:
“WP Write Protect (bit 16 of CR0): When set, inhibits supervisor-level
procedures from writing into read-only pages; when clear, allows
supervisor-level procedures to write into read-only pages (regardless of the
U/S bit setting; see Section 4.1.3 and Section 4.6).”
Here is an example of CR0 register manipulation:
Writing directly to the DLL’s COW page allows us to hook every process on the
system that uses this DLL, since the write affects the COW origin page.
Triggering a bound fault is easy. For example, this code will trigger a fault:
Thus, our kernel-mode code that performs the hooking should write similar
assembly code at the location where it wants to gain control over the thread’s
execution.
For example, if we want to hook KERNELBASE!CreateFileW, we can inject these
opcodes into the function’s prologue:
UCHAR opcodes[5] = {0x36, 0x66, 0x62, 0x0C, 0x24};
This is basically: BOUND CX, DWORD PTR SS:[ESP]. In this specific case, we
assume that CX will be zero (in real code this should be tested for every
function) and that the top of the stack will be greater than zero (this is a
proof of concept, not a released tool).
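The byte-by-byte reading of that sequence can be checked with a small decoder.
This is only a sketch for this exact five-byte layout (two prefixes, the opcode,
a ModRM byte and a SIB byte), not a general x86 decoder; the struct and function
names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int has_ss_prefix;           /* byte 0 == 0x36: SS segment override   */
    int has_opsize_prefix;       /* byte 1 == 0x66: 16-bit operand size   */
    int is_bound;                /* byte 2 == 0x62: BOUND opcode          */
    unsigned mod, reg, rm;       /* ModRM fields from byte 3              */
    unsigned scale, index, base; /* SIB fields from byte 4                */
} BoundDecode;

/* Split the fixed prefix/prefix/opcode/ModRM/SIB layout into its fields. */
BoundDecode decode_bound_bytes(const uint8_t b[5])
{
    BoundDecode d;

    d.has_ss_prefix     = (b[0] == 0x36);
    d.has_opsize_prefix = (b[1] == 0x66);
    d.is_bound          = (b[2] == 0x62);

    d.mod = b[3] >> 6;           /* 0: memory operand, no displacement    */
    d.reg = (b[3] >> 3) & 7;     /* 1: CX when the operand size is 16 bit */
    d.rm  = b[3] & 7;            /* 4: a SIB byte follows                 */

    d.scale = b[4] >> 6;         /* 0: scale factor 1                     */
    d.index = (b[4] >> 3) & 7;   /* 4: no index register                  */
    d.base  = b[4] & 7;          /* 4: ESP as the base register           */

    return d;
}
```

For the opcodes above, the ModRM byte 0x0C yields mod = 0, reg = 1, rm = 4 and
the SIB byte 0x24 yields scale = 0, index = 4, base = 4, which matches the
operands CX and SS:[ESP].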
Now, after this has been written to the KERNELBASE!CreateFileW prologue, our
kernel-mode callback function takes control of the thread whenever a user-mode
thread calls this function.
Doing this gives us several advantages, for example:
· The hooked page will still be COW, so anti-malware solutions and researchers
doing manual analysis won’t be able to notice that the page has been modified.
· Most AVs are unaware of this method and probably aren’t addressing it
(especially since the page is still COW).
· A user-mode debugger will not be able to catch this hook. A regular inline
hook makes the hooked routine jump to other user-mode code, whereas BoundHook
traps the execution flow through the kernel’s bound fault handler.
· This method is invisible to most PatchGuard (PG) protection mechanisms. CR0 is
protected by PG, but since it is modified only for a very short period of time,
the chance of being caught by PG is minimal. The MDL approach to bypassing the
COW mechanism is not detectable by PG.
Proof of concept, a call stack of a hooked thread:
We know that BoundHook does not meet Microsoft’s bar to be considered a
vulnerability, as machine administrator rights are already compromised.
Microsoft’s response to CyberArk’s responsible disclosure of a similar issue
(GhostHook) was as follows:
“We have completed our investigation of this issue and have found that it is not
a vulnerability but a technique to avoid detection once the machine is already
compromised. Because it’s a post-exploitation technique it doesn’t meet the bar
for servicing in a security update but we will consider fixing it in a future
version of Windows.”
In conclusion, this method will bring new capabilities to both software security
vendors and malware writers.