Saturday, July 11, 2020

Hyper-V #0x2 - Hypercalls part 2

Laziness and mistakes

Hi again, and sorry for the significant delay after the last post! The weather in Estonia was kind of weird for weeks (very sunny and warm), and this, combined with my laziness, made me a lot less productive. Now that the weather is back to its regular state (rainy and chilly), I hope to move on with the writing and cover hypercalls from the Hyper-V side.

But before I get to it, I must mend a massive oversight that I made last time. I highly recommended Alex Ionescu's blog posts, but I forgot to mention one other researcher whose blog posts I also highly recommend to anyone getting into Hyper-V. This person is Saar Amar (@AmarSaar). His blog post "First Steps in Hyper-V Research" (https://msrc-blog.microsoft.com/2018/12/10/first-steps-in-hyper-v-research/) is a definite must-read for anyone starting, and there are many other writings from him on the MSRC blog. Sorry for forgetting you!

Before beginning notes

  • I will be covering all opcodes etc. from an Intel CPU point of view. The logic with AMD should be much the same (hopefully).
  • I will only use the x64 architecture - who uses 32-bit for virtualization anyway?

Hypercalls from Hyper-V

Logically, the Hyper-V side of the hypercall flow is quite essential - Hyper-V is the thing that actually does all the work. So it makes sense to go through the basic logic of how handling goes on the hypervisor side:

  0. VM exit is triggered by the vmcall opcode (covered in previous posts).
  1. Hyper-V VM exit handler is triggered & CPU context is stored.
  2. Hyper-V handles the hypercall that was required.
  3. Hyper-V restores the CPU context and resumes the VM.

I will now try to go through all these steps and show how and where they are done in Hyper-V.

1. Hyper-V VM exit handler is triggered & CPU context is stored

The Hyper-V VM exit handler is the function to which code execution is directed right after the vmcall opcode (or after any other reason for a VM exit - this is not only a "vmcall handler"). It can be easily found in two different ways:
  1. Look for code that stores all the register values one by one somewhere.

  2. The handler function location is written into the VMCS (Virtual-Machine Control Structure) at field 0x6C16 (the host RIP field). In the current picture, rcx will contain a pointer to the handler function.

You might wonder where the stack pointer will point in such a situation (if you run "k" in WinDbg inside the handling function, it displays an empty stack trace). This value is also taken from the VMCS, but from field 0x6C14 (the host RSP field).
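The field numbers used in this post are not arbitrary: the Intel SDM defines a bit layout for VMCS field encodings. The following is a minimal sketch of a decoder for that layout. The function and dictionary names are my own, for illustration only, and nothing here is taken from Hyper-V itself.

```python
# Decode a VMCS field encoding into its components, following the
# encoding layout documented in the Intel SDM.
# Names below are illustrative and not part of any real tool.

WIDTHS = {0: "16-bit", 1: "64-bit", 2: "32-bit", 3: "natural-width"}
TYPES = {0: "control", 1: "VM-exit information", 2: "guest state", 3: "host state"}


def decode_vmcs_field(encoding):
    return {
        "high_access": bool(encoding & 1),        # bit 0: access type
        "index": (encoding >> 1) & 0x1FF,         # bits 9:1
        "type": TYPES[(encoding >> 10) & 0x3],    # bits 11:10
        "width": WIDTHS[(encoding >> 13) & 0x3],  # bits 14:13
    }


# The three fields mentioned in this post.
for field in (0x6C16, 0x6C14, 0x4402):
    print(hex(field), decode_vmcs_field(field))
```

Decoding 0x6C16 and 0x6C14 gives natural-width host-state fields, which fits their use as the handler address and the handler stack pointer.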

But as mentioned before, the handler function for VM exit is universal, so how does Hyper-V know the reason for the exit (it could also be a memory exception, an MSR read/write, a CPUID operation, etc.)? This can once again be found in the VMCS, at field 0x4402 (the exit reason field). For a vmcall exit, the value in that field is 0x12 (18). Based on this, it's easy to find the location where the value is read and look at the branching that follows (the path in my current version is shown):

[Screenshot: reading the VM exit reason]

[Some less important code]

[Screenshot: the hypercall handling code]
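To make the branching on the exit reason more concrete, here is a small sketch that interprets a raw value read from field 0x4402. It assumes the layout from the Intel SDM (basic exit reason in the low 16 bits, VM-entry failure flag in bit 31) and lists only a handful of reasons. The names are hypothetical, and this is not how Hyper-V itself implements the dispatch.

```python
# A few basic exit reasons from the Intel SDM. This table is deliberately
# incomplete; it only covers the kinds of exits mentioned in the post.
BASIC_EXIT_REASONS = {
    0: "exception or NMI",
    10: "CPUID",
    18: "VMCALL",
    31: "RDMSR",
    32: "WRMSR",
    48: "EPT violation",
}


def describe_exit_reason(raw):
    basic = raw & 0xFFFF                     # bits 15:0: basic exit reason
    entry_failure = bool((raw >> 31) & 1)    # bit 31: VM-entry failure
    name = BASIC_EXIT_REASONS.get(basic, "other (see the SDM)")
    return basic, name, entry_failure


print(describe_exit_reason(0x12))
```

For the value 0x12 from the text, the basic reason decodes to 18, the vmcall exit.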


2. Hyper-V handles the hypercall that was required

This was already covered in previous posts, but I will quickly go through it again. The hypercall number in the hypercall code parameter/structure is used as an index into an array inside Hyper-V at hv+0xC00000. The array elements are 0x18 (24) bytes each, and the first 8 bytes are a pointer to the handling function. The rest of the structure is, of course, not public, and I'm not sure about its overall layout, so I will not make any guesses before I'm more certain (it hasn't been that important yet).

That being said, if you want to quickly know which hypercalls have a handler and which have no actual functionality behind them, just find the handler for hypercall 0 and compare the others to it. Hypercall 0 is an invalid hypercall, so all other hypercall structures that point to the same handler are also not in use.
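The table lookup and the "compare against hypercall 0" trick can be sketched as follows. This assumes you have already dumped the raw bytes of the table (for example from the debugger) into a bytes object. The offset and entry size are the ones observed above and may differ between builds; the function names and the demo handler addresses are made up for illustration.

```python
import struct

TABLE_OFFSET = 0xC00000  # offset from the hv base, as observed in this post's build
ENTRY_SIZE = 0x18        # 24 bytes per entry; the first 8 bytes are the handler pointer


def entry_address(hv_base, call_code):
    """Address of the table entry for a given hypercall number."""
    return hv_base + TABLE_OFFSET + call_code * ENTRY_SIZE


def handlers_from_dump(dump):
    """dump: raw bytes of the table, starting at entry 0."""
    count = len(dump) // ENTRY_SIZE
    return [struct.unpack_from("<Q", dump, i * ENTRY_SIZE)[0] for i in range(count)]


def implemented_calls(dump):
    handlers = handlers_from_dump(dump)
    invalid = handlers[0]  # hypercall 0 is invalid, so its handler marks "unused"
    return [i for i, h in enumerate(handlers) if h != invalid]


# Synthetic demo data: handler addresses here are invented, not real ones.
def fake_entry(handler):
    return struct.pack("<Q", handler) + bytes(ENTRY_SIZE - 8)


demo = b"".join(fake_entry(h) for h in (0x1000, 0x2000, 0x1000, 0x3000))
print(implemented_calls(demo))  # [1, 3]
```

The remaining 16 bytes of each entry are skipped on purpose, since their meaning is not known.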


3. Hyper-V restores CPU context and resumes the VM

After finishing the hypercall handling, execution should return to the VM. For that, two things must happen: first, the CPU context has to be restored, and then VM execution must be continued with the vmresume opcode. In the current version of Hyper-V, there are only 4 instances of the vmresume opcode in the hypervisor executable, so it's not tricky to find out which one is used in the hypercall handling flow. Once you have found it, it's clear that Hyper-V restores the original CPU context before returning.
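One naive way to locate the vmresume instances is to search the hypervisor binary's bytes for the instruction encoding, which on Intel is 0F 01 C3. The sketch below does exactly that on a byte buffer. It is a plain pattern search, not a disassembler, so a hit may also lie inside another instruction or in data and should be verified in a disassembler. The function name and the sample buffer are illustrative.

```python
VMRESUME = bytes([0x0F, 0x01, 0xC3])  # Intel encoding of the vmresume instruction


def find_pattern(data, pattern=VMRESUME):
    """Return every offset in data where pattern starts."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


# Synthetic buffer standing in for the executable's contents.
sample = b"\x90\x90" + VMRESUME + b"\xCC" * 4 + VMRESUME
print([hex(o) for o in find_pattern(sample)])  # ['0x2', '0x9']
```

On a real file you would read the bytes with open(path, "rb").read() and translate the file offsets to addresses using the section headers.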




Reading VMCS fields

Since a lot of essential values are kept in VMCS fields, the question arises: how do we read these values? WinDbg itself does not seem to allow it, and while I'm quite sure that the MS-internal hvexts.dll extension gives this capability, that does not help anyone outside. So we have to make our own tool for it. Luckily, it's not hard. I have used the exact same method as in the kernel to inject code into Hyper-V. I take the end of the .text segment, where no more code exists on the last executable page:


As you can see, after the code there are more than 0x500 bytes of free memory that is still executable and can easily be filled by the debugger. To verify this on the live system as well, run "u hv+0x347A82; db hv+0x347A82" in WinDbg; the result confirms that the memory is available but contains nothing important:


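After that, it's easy to generate some code to read the value from the VMCS:
push rax          ; save the registers we are about to use
push rcx
mov rax, 0x6C16   ; VMCS field to read
vmread rcx, rax   ; rcx = value of the field
int3              ; break: inspect rcx in the debugger
pop rcx           ; restore the registers
pop rax
int3              ; break: return to the original rip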

This code saves the registers it uses, reads field 0x6C16 from the VMCS, breaks into the debugger (so you can inspect the value), restores the registers, and breaks again (so you can return to the original instruction pointer).

So the process is:
  1. Write the code to some executable location. For example, in the version where I'm testing it:
    eb hv+0x347B00 0x50 0x51 0x48 0xC7 0xC0 0x16 0x6C 0x00 0x00 0x0F 0x78 0xC1 0xCC 0x59 0x58 0xCC
  2. When you want to read the 0x6C16 value, point rip to the injected code (after saving the original rip).
  3. Execute until the first int3.
  4. Read rcx (it contains the VMCS field value).
  5. Execute until the second int3.
  6. Restore the original rip.

Steps 2-6 can all be combined into:
r $t1=rip; r rip=hv+0x347B00; gh; r rcx; gh; r rip = $t1
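Since only the 32-bit immediate changes between fields, the WinDbg commands for any other VMCS field can be generated rather than hand-assembled. The sketch below rebuilds the same 16-byte stub and the matching commands. The function names are hypothetical, and the address expression is whatever free executable location you verified on your own build.

```python
import struct


def vmread_stub(field):
    """Machine code for the stub above, with a configurable field number."""
    if not 0 <= field <= 0x7FFFFFFF:
        raise ValueError("field must fit in a positive 32-bit immediate")
    return b"".join([
        b"\x50",                                      # push rax
        b"\x51",                                      # push rcx
        b"\x48\xC7\xC0" + struct.pack("<I", field),   # mov rax, imm32
        b"\x0F\x78\xC1",                              # vmread rcx, rax
        b"\xCC",                                      # int3
        b"\x59",                                      # pop rcx
        b"\x58",                                      # pop rax
        b"\xCC",                                      # int3
    ])


def eb_command(address, field):
    """WinDbg 'eb' command that writes the stub to the given address expression."""
    body = " ".join(f"0x{b:02X}" for b in vmread_stub(field))
    return f"eb {address} {body}"


def read_command(address):
    """One-liner that runs the stub and restores rip afterwards."""
    return f"r $t1=rip; r rip={address}; gh; r rcx; gh; r rip=$t1"


print(eb_command("hv+0x347B00", 0x6C16))
print(read_command("hv+0x347B00"))
```

For field 0x6C16 the generated eb command matches the bytes listed in step 1. Each stub is 16 bytes long, so stubs for several fields can be placed back to back in the free area.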

What now

Hopefully, this post made things a bit clearer about hypercall handling on Hyper-V's own side and about how it's possible to read VMCS values using WinDbg. It's also clear that there are a lot more details to all of this, so I will reference a couple of additional resources that can help with reversing or with generally understanding how hypervisors should work in such places. I will also try to add some new tools to this series' GitHub repo to make VMCS reading more comfortable.

Resources:
  • https://rayanfam.com/topics/hypervisor-from-scratch-part-5/
    Sina & Shahriar's hypervisor series is an excellent read overall, but regarding VM exit handling and other such things, part 5 is the most valuable.
  • "Intel® 64 and IA-32 Architectures Software Developer’s Manual"
    Well... to understand as much as possible, you have to dig through this a bit. Volume 3C is probably the most reasonable place for hypervisor-related topics.