
[ Memory Forensics Mastery Part - 1 ] Understanding Memory & Basics of Memory Management

·4012 words·19 mins·
Author: TheDeadThinker

Overview
#

This is the first blog of a new series, Memory Forensics Mastery. In this post we try to understand why we need to learn memory forensics when there are already multiple forensic domains like disk, network, and mobile; what concepts are involved in memory management; how memory is managed in Windows; virtual and physical addresses; paging; hibernation; and a lot more.

[!NOTE] : If you spot any mistakes or areas for improvement in my blog, feel free to reach out to me via email or on any platform. I welcome constructive feedback and am always eager to enhance the quality of my content.

Why is Memory Forensics Required?
#

I know you are wondering why you should start learning memory forensics, and whether learning it is worth the effort.

Let’s understand this using a scenario. You are investigating an incident, and let’s assume it is a malware attack, but there is no evidence of the malware on disk. Network artifacts and packets show that a malware sample was downloaded from a C2 server over an encrypted channel, and the malware also deleted the event logs. Now how will you find the connections it established, the files it dropped, the persistence it achieved, or the actual malicious binary responsible?

Memory forensics can help us find the answers. When a system uses a file or opens handles to it, establishes network connections, or starts a process, many structures are created in memory, pointers are updated, and so on, which can help an investigator find evidence that has been deleted from disk. You will also find that obfuscated malware de-obfuscates itself in memory.

There are techniques like Sleep Obfuscation or the Sleep Mask Kit: when the malware is not executing tasks it re-obfuscates itself in memory, and if you acquire memory while the malware is in this sleep state you will have a hard time finding or de-obfuscating it.

When malware deletes files or event logs, or a process exits, they are not removed from memory immediately; they are overwritten only when the system needs that memory for another resource.

History of Memory
#

There are several great timelines available on multiple sites showing how it all started, from the use of vacuum tubes in digital computers during the 1940s to the introduction of DRAM (Dynamic RAM) in 1968 by Robert Dennard, which was a game changer, allowing more efficient, compact, and reliable memory storage using capacitors. By the late 1990s, DDR (Double Data Rate) RAM revolutionized speed, doubling data transfers per clock cycle.

Timelines :

computerhope Memory TimeLine

cs.odu.edu Memory TimeLine

TimeToast Memory TimeLine

Concept of Paging
#

Some processes require a huge amount of memory, like 2 GB or 4 GB, and it is hard to allocate 4 GB of RAM to a single process. Therefore paging is used, which creates the illusion of virtual memory for each process; the MMU (Memory Management Unit) maps virtual memory to physical memory and allows multiple processes to run with a small amount of physical memory.

  • Virtual memory is divided into fixed-size blocks called pages (4 KB / 4096 bytes).
  • Physical memory (RAM) is divided into similarly sized blocks called frames.
  • A page from virtual memory is mapped to a frame in physical memory.

We will cover virtual memory and its translation in a later section of this blog.

How do we find out whether paging is enabled? We can check the control registers.

According to the Intel 64 and IA-32 Architectures Software Developer’s Manual, the CR0 register is responsible for paging being enabled.

CR0.PG is bit 31 of the CR0 register. If this bit is set to 1, paging is enabled; if it is set to 0, paging is disabled.

The below image is from a default installation of Windows 10 x64, showing that bit 31 of CR0 is set to 1.

Source: Connor McGarr Blog
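To make the bit position concrete, here is a small Python sketch that tests whether CR0.PG (bit 31) is set. The CR0 value below is just a typical example value, not something read from a live system (on a real machine you would read CR0 from kernel mode or a kernel debugger):

cr0 = 0x80050033          # example CR0 value, for illustration only

PG_BIT = 31               # CR0.PG : paging enable
PE_BIT = 0                # CR0.PE : protected mode enable

paging_enabled = bool((cr0 >> PG_BIT) & 1)
protected_mode = bool((cr0 >> PE_BIT) & 1)

print(f"CR0 = {cr0:#010x}")
print(f"Paging enabled (CR0.PG, bit 31): {paging_enabled}")
print(f"Protected mode (CR0.PE, bit 0):  {protected_mode}")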

If you want to learn paging in depth, check Connor McGarr's Turning the Pages Blog

Virtual Address and Physical Address
#

After a lot of technological advancement, let’s discuss the maximum amount of memory an x64 system can have.

Theoretically, a 64-bit (x64) architecture can support up to 16 exabytes (about 16 million terabytes) of addressable memory, since 64-bit addressing allows 2^64 memory addresses. But practically it depends on the hardware: not every motherboard and operating system supports the full 16 exabytes.

Source: hownot2code

We can check the Microsoft documentation; it shows that Windows 11 Enterprise supports a maximum of 6 TB. So even though we theoretically have 16 exabytes, memory still has practical limitations, and if we talk more practically, developer, engineer, and corporate endpoints have at most around 64 GB of RAM, with 8 GB or 16 GB being the most common.

Memory Limits for Windows by Microsoft

Generally I have found that gamers' systems have huge amounts of RAM, but after reading multiple Reddit, Quora, and forum discussions I can say that most of them also top out at 32 GB.

32GB of RAM may sound like a lot, but having a surplus of available capacity can make a huge difference in performance and proves why so many gaming enthusiasts spend more money to add 32GB, 48GB, or even 64GB to their systems. The latest game releases are already starting to recommend a minimum 16GB of RAM. So, if you use your PC for more than just gaming or want to future-proof for upcoming releases, 32GB could be the right option for you.

Source : Kingston

OK, so you may be wondering why I am discussing the maximum memory limits of systems, gamer and developer PCs, and corporate endpoints. The reason is that even with 8 or 16 GB of RAM our systems run smoothly, while in reality the system uses or requires more than that.

To understand the above statement: if we have five processes, two of which require 2 GB each and the other three require 3 GB each, and the total RAM size is 8 GB, then we can run either the two 2 GB processes and one 3 GB process, or two 3 GB processes and one 2 GB process, but not all of them at once. To handle this problem, the concept of virtual memory is used.

The virtual memory assigned to each process on x32 and x64 systems has some limitations:

On a 32-bit system each process has 2 GB of user address space and 2 GB of kernel address space. [ Kernel space is shared between all processes ]

On a 64-bit system each x64 process has 128 TB of user address space and 128 TB of kernel address space, and an x32 process can have a maximum of 4 GB of user address space if the LARGEADDRESSAWARE flag was enabled when the binary was compiled; otherwise it gets 2 GB of user address space. [ Kernel space is shared between all processes ]

Virtual & Physical Address [ source: Microsoft ]

Virtual Address Spaces Microsoft

Stackoverflow discussion on Memory Size

Let’s use VMMap and check the x64 explorer.exe and x32 powershell processes to see how virtual memory is actually allocated to a process in Windows.

The below image shows that explorer's highest address is 0x00007FFFEF560000 and its size is 108K. Converting 0x00007FFFEF560000 to decimal gives 140737208778752, and 108K is 110592 bytes; now add both and convert to TB.

\[ 140737208778752 \text{ (bytes)} + 110592 \text{ (bytes)} = \frac{140737208889344}{1024^4} \text{TB} \approx 128 \text{TB} \]

VMMap : Explorer Process

The below image shows that the powershell x32 highest address is 0x7FFE7000 and its size is 4K. Converting 0x7FFE7000 to decimal gives 2147381248, and 4K is 4096 bytes; now add both and convert to GB.

\[ 2147381248 \text{ (bytes)} + 4096 \text{ (bytes)} = \frac{2147385344}{1024^3} \text{GB} \approx 2 \text{GB} \]

VMMap : Powershell x32 Process
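The same arithmetic can be reproduced with a few lines of Python (the base addresses and region sizes below are the ones read from the VMMap screenshots above):

# Reproduce the VMMap arithmetic: highest region base + region size,
# then convert the result to TB / GB.

def top_of_address_space(highest_base: int, region_size: int) -> int:
    """Return the end of the highest allocated region, in bytes."""
    return highest_base + region_size

# explorer.exe (x64): highest base 0x00007FFFEF560000, region size 108K
x64_top = top_of_address_space(0x00007FFFEF560000, 108 * 1024)
print(f"x64 user space ~ {x64_top / 1024**4:.3f} TB")   # ~128 TB

# powershell (x32): highest base 0x7FFE7000, region size 4K
x32_top = top_of_address_space(0x7FFE7000, 4 * 1024)
print(f"x32 user space ~ {x32_top / 1024**3:.3f} GB")   # ~2 GB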

Virtual Address To Physical Address
#

The actual data for any process, file, registry, or kernel structure is stored in physical memory [ RAM ]. Even though we have 128 TB or 2 GB of virtual space per process on x64 and x32 systems respectively, we have to somehow map that to the actual data location, the physical memory address; otherwise virtual memory is useless.

I will refer to the conversion of virtual to physical addresses with the acronym vtop throughout this blog.

Basic Understanding of vtop
#

To convert or map a virtual offset to a physical offset, the basic process is that a few bits of the virtual address are used as the offset within the physical page, and the other bits are used to find the physical page's location in RAM using the page table.

source: VMWARE Blog

Check the below image; it gives a clear idea of how the MMU converts a virtual offset to a physical offset.

  • The virtual address is divided into two chunks:
    • Virtual Page Number ( VPN )
    • Page Offset
  • The page table has an entry for each Virtual Page Number; this table is used to look up the corresponding Physical Page Number and access the data in physical memory. If the page is invalid or not present, that means it has been swapped to disk: a page fault occurs, the OS loads the page back into physical memory, updates the page table with the new physical location, and the process then accesses it.

source: Karthix Blog
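As a rough illustration of this single-level model, here is a small Python sketch that splits a virtual address into VPN and offset and resolves it through a toy page table (the page table contents are invented for the example):

PAGE_SIZE = 4096                      # 4 KB pages
OFFSET_BITS = 12                      # log2(4096)

# Toy page table: VPN -> physical frame number (values are made up).
page_table = {0x12345: 0x00ABC, 0x12346: 0x00ABD}

def vtop_single_level(vaddr: int) -> int:
    vpn = vaddr >> OFFSET_BITS        # Virtual Page Number
    offset = vaddr & (PAGE_SIZE - 1)  # offset inside the page
    if vpn not in page_table:
        # In a real system this would be a page fault: the OS loads the
        # page from disk and updates the page table before retrying.
        raise LookupError(f"page fault for VPN {vpn:#x}")
    frame = page_table[vpn]
    return (frame << OFFSET_BITS) | offset

print(hex(vtop_single_level(0x12345ABC)))   # -> 0xabcabc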

But in reality this is more complex than just a Virtual Offset -> page table/map -> Physical Address.

vtop in x32bit
#

On an x32 system the 32-bit virtual address is divided into three blocks: two 10-bit blocks and one 12-bit block.

  • First 10 bits belong to the PDI ( Page Directory Index ): they help us find which page table contains the mapping between the virtual address and the physical address.
  • Next 10 bits belong to the PTI ( Page Table Index ): they point to the corresponding physical page.
  • Last 12 bits -> Offset, to find where the data is stored in the frame.

VIRTUAL ADDRESS (32-bits)
  ┌────────────────────────────────────────────────────┐
   10-bit PDI      | 10-bit PTI       | 12-bit Offset 
  └────────────────────────────────────────────────────┘
                                           |
+------------------+    +------------------+ |  +-------------------+
| Page Directory   |--->| Page Table       |-|-->| Physical Page     |
|                  |    |                  | |   |                   |
| Points to Page   |    | Points to Phys.  | |---|-->Offset selects  |
| Table Base Addr  |    | Page Base Addr   |     |   exact location  |
+------------------+    +------------------+     +-------------------+
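A quick way to internalize the 10/10/12 split is to carve a 32-bit virtual address into its indexes in Python (the address used below is just an arbitrary example value):

def split_x86_vaddr(vaddr: int):
    """Split a 32-bit virtual address into PDI, PTI and page offset."""
    offset = vaddr & 0xFFF            # bits 0-11  : offset inside the 4 KB page
    pti    = (vaddr >> 12) & 0x3FF    # bits 12-21 : Page Table Index
    pdi    = (vaddr >> 22) & 0x3FF    # bits 22-31 : Page Directory Index
    return pdi, pti, offset

pdi, pti, offset = split_x86_vaddr(0x7FFDF123)
print(f"PDI={pdi:#x} PTI={pti:#x} offset={offset:#x}")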

Why is a Page Directory Required?
#

This is a valid question: why do we require an index of page tables and then check those page tables to find the physical address? A seemingly simpler approach would be to maintain a single page table that correlates every virtual address to a physical address.

The problem is that page tables are also stored in RAM, and storing a complete mapping of 4 GB would require an entry for every page, which takes a huge amount of RAM. This approach is not memory efficient, therefore multiple page tables are used.

\[ \text{Entries required} = \frac{4294967296 \text{ bytes [ whole 4GB virtual address space in one page table ]}}{4096 \text{ bytes [ 4 KB Page ]}} = 1048576 \text{ entries} \]

\[ \text{Total RAM Required} = 1048576 [\text{Page Table Entries}] \times 4\text{B} [\text{Size of Each Entry}] = 4 \text{MB} \]

source: David Black-Schaffer

Here the memory consumption looks similar, but remember that each process has its own page table, and storing a single 4 MB page table with all entries for every process is not memory efficient; the page tables would take a huge amount of space even if most entries are never used. With multiple page tables, tables that are not required can be swapped to disk and loaded back into RAM when needed. You will understand better how multi-level page tables are useful in the x64 vtop section.

Pages and frames [ in RAM ] are of the same size, as only then can we map them using the same offset [ 12 bits ].

We will take the example of a 4 KB page size. This means each page is 4 KB ( 4096 bytes ), and on an x32 system an address is 4 bytes [ 32 bits / 8 bits = 4 B ]. We already know that pages and frames are the same size, so a page table is also stored in a 4 KB frame; therefore the total number of entries a single page table can hold is 1024.

\[ \text{Entries per page table} = \frac{4096 \text{ bytes [ 4KB Frame ]}}{4 \text{ bytes [ x32 address ]}} = 1024 \text{ entries} \]

But our whole virtual address space is 4 GB, therefore we require multiple page tables to map the complete 4 GB virtual address space to RAM.

But now how will the system know which page table contains the mapping for which virtual address? That is why the Page Directory is used.

Calculation to cover the 4 GB virtual address space:

The Page Directory is stored in a 4 KB frame, so it can have a maximum of 1024 entries [ each entry = a page table's location ], each page table has 1024 entries, and each page covers 4 KB.

\[ \text{Total Virtual Address} = 1024[\text{Page Directory Entries}] \times 1024[\text{PT Entries}] \times 4\text{KB} [\text{Size of Each Page}] = 4 \text{GB} \]

Another example for explanation: if the system divides memory into 2 MB pages, the calculations look like this:

\[ \text{Entries per page table} = \frac{2097152 \text{ bytes [ 2MB Frame ]}}{4 \text{ bytes [ x32 address ]}} = 524288 \text{ entries} \]

\[ \text{Total Size Covered by Each Page Table} = 524288 [\text{PT Entries}] \times 2\text{MB} [\text{Size of Each Page}] = 1 \text{TB} \]

This means a single page table would be sufficient to cover the whole 4 GB virtual address space, although there can still be multiple page tables if the system uses multiple paging structures.

Source: osdev.org
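The sizing arguments above are easy to sanity-check in Python; the page size, entry size, and virtual address space values are the ones used in this section:

def page_table_stats(va_space: int, page_size: int, entry_size: int):
    """Return (entries per table, tables needed, size of one flat table in bytes)."""
    entries_per_table = page_size // entry_size            # a table lives in one frame
    total_pages = va_space // page_size                    # pages needed to cover the space
    tables_needed = -(-total_pages // entries_per_table)   # ceiling division
    flat_table_size = total_pages * entry_size             # one giant single-level table
    return entries_per_table, tables_needed, flat_table_size

GB = 1024**3
MB = 1024**2

# x32, 4 KB pages, 4-byte entries
entries, tables, flat = page_table_stats(4 * GB, 4096, 4)
print(entries, tables, flat // MB)   # 1024 entries/table, 1024 tables, 4 MB flat table

# x32, 2 MB pages, 4-byte entries
entries, tables, flat = page_table_stats(4 * GB, 2 * MB, 4)
print(entries, tables)               # 524288 entries/table, 1 table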

vtop in x64 bit
#

On an x64 system we are not using all 64 bits; we already verified that using VMMap. The virtual address is 48 bits in canonical form: the remaining upper bits are either all zeros or all ones, depending on bit 47.

Source: Connor McGarr Blog

48bit Virtual Address blocks

  • First 9 bits -> PML4 Index and it points to PDPT.
  • Next 9 bits -> PDPT Index and it points to PDT which helps to identify the page table for translation.
  • Next 9 bits -> PDT Index and it points to PT which stores the correlation between virtual and physical address.
  • Next 9 bits -> PT Index and it points to address in physical memory.
  • Last 12 bits -> Offset to find where data is stored in frame.
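Here is a small Python sketch of the 9/9/9/9/12 split from the list above, including the canonical-form check described earlier (the sample address is arbitrary and the helper name is my own):

def split_x64_vaddr(vaddr: int):
    """Split a 48-bit canonical virtual address into its paging indexes."""
    # Canonical check: bits 48-63 must all be copies of bit 47.
    sign = (vaddr >> 47) & 1
    expected_top = 0xFFFF if sign else 0x0000
    if (vaddr >> 48) != expected_top:
        raise ValueError("non-canonical address")

    offset = vaddr & 0xFFF           # bits 0-11
    pt     = (vaddr >> 12) & 0x1FF   # bits 12-20 : Page Table index
    pd     = (vaddr >> 21) & 0x1FF   # bits 21-29 : Page Directory index
    pdpt   = (vaddr >> 30) & 0x1FF   # bits 30-38 : PDPT index
    pml4   = (vaddr >> 39) & 0x1FF   # bits 39-47 : PML4 index
    return pml4, pdpt, pd, pt, offset

print(split_x64_vaddr(0x00007FFFEF560000))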

If you understood the concept of the Page Directory in x32, then you have already started to understand why we have a multi-level page table hierarchy; if not, no problem, I am here.

A single-level page table mapping the entire 48-bit address space (256 TB) would require a huge number of entries (each entry mapping a 4 KB page):

\[ \frac{256 \text{TB}}{4 \text{KB}} = 2^{36}\text{entries} \approx 68.7 \text{billion entries}. \]

Each entry in a page table is 8 bytes (64 bits). So, the total size of a single-level page table would be:

\[ 68.7 \text{billion entries} \times 8 \text{bytes} = 549 \text{GB}. \]

Clearly, this is impractical to fit into memory. Therefore multi-level page tables are used, as the parts that are not required can be swapped to disk or simply never allocated.

Since the address size also increased from 32 to 48 bits on x64, and each table can only store 512 entries, the mapping needs more levels of tables to cover the complete 256 TB.

\[ \text{Entries per page table} = \frac{4096 \text{ bytes [ 4KB Frame ]}}{8 \text{ bytes [ x64 address ]}} = 512 \text{ entries} \]

Therefore we need PML4 -> PDPT -> PDT -> Page Table -> Physical Page.

\[ \text{Total Virtual Address} = 512 [\text{PML4 Entries}] \times 512 [\text{PDPT Entries}] \times 512 [\text{PDT Entries}] \times 512 [\text{Page Table Entries}] \times 4 \text{KB} [\text{Size of Each Page}] = 256\text{TB} \]
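These x64 numbers can be checked quickly as well (2^36 eight-byte entries for a flat single-level table versus the four-level scheme covering 256 TB):

TB = 1024**4
GB = 1024**3
KB = 1024

entries_flat = (256 * TB) // (4 * KB)      # 2**36 entries for a single-level table
flat_size_gb = entries_flat * 8 / GB       # 8 bytes per entry
print(entries_flat == 2**36, f"{flat_size_gb:.0f} GiB")   # True, 512 GiB (~549 decimal GB)

covered = 512 * 512 * 512 * 512 * 4 * KB   # four levels of 512 entries, 4 KB pages
print(covered == 256 * TB)                 # True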

In the image below we can observe that on an x64 system the memory occupied by page tables is 202864 KB (~200 MB).

Some hardware supports 57-bit virtual addresses on x64 systems [ i.e., 57 - 48 = 9 extra bits ]. These extra 9 bits are used for another paging level known as PML5 [ which stores 512 entries, each pointing to a PML4 table ].

Source: osdev.org

Translation Lookaside Buffer (TLB)
#

The TLB is a cache memory that helps in vtop. As we discussed, on both x32 and x64 the system has to check multiple page tables to find the final physical address, but we can store recent virtual-to-physical translations in a place that is much faster and more direct to access, and that is the Translation Lookaside Buffer.

Source: VMWARE Blog

The TLB stores recent translations in the form of tags. If a process wants to access the same physical address again, that is a TLB hit, and the TLB can provide the physical address directly without traversing multiple page tables. If the TLB doesn't have that information, it is a TLB miss; the system then follows the normal page-table walk, gets the physical address, and updates the TLB.

Source: Wikipedia

TLBs typically have only 64 to 512 entries, because the TLB is cache memory and increasing its size increases the cost.
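The hit/miss logic can be sketched in a few lines of Python; the page-table walk is stubbed out and the capacity and contents are illustrative only:

PAGE_SHIFT = 12
TLB_CAPACITY = 64
tlb = {}                          # VPN -> frame number (a tiny toy cache)

def walk_page_tables(vpn: int) -> int:
    # Stub for the multi-level page-table walk described earlier.
    return vpn & 0xFFFF           # made-up frame number for the example

def translate(vaddr: int) -> int:
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & 0xFFF
    if vpn in tlb:                           # TLB hit: no table walk needed
        frame = tlb[vpn]
    else:                                    # TLB miss: walk the tables, then cache
        frame = walk_page_tables(vpn)
        if len(tlb) >= TLB_CAPACITY:
            tlb.pop(next(iter(tlb)))         # crude eviction, just for the sketch
        tlb[vpn] = frame
    return (frame << PAGE_SHIFT) | offset

print(hex(translate(0x12345ABC)))   # miss: walks the tables, fills the TLB
print(hex(translate(0x12345DEF)))   # hit: same page, served from the TLB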

Physical Address Extension ( PAE )
#

On an x32 system physical memory is limited to a maximum of 4 GB (2^32). But if PAE is enabled we can use more, for example 64 GB or even 256 GB of RAM, depending on the processor's physical address width.

But the question is how this is possible. As we discussed in an earlier section of the blog, an x32 system uses two levels of page tables (Page Directory & Page Table); with PAE this increases to three levels (Page Directory Pointer Table, Page Directory & Page Table).

  • First 2 bits -> Page Directory Pointer Table index; it points to a Page Directory.
  • Next 9 bits -> PDI; it points to a Page Table and helps find which page table contains the mapping.
  • Next 9 bits -> PTI; it points to the frame number in RAM.
  • Last 12 bits -> Offset, to find where the data is stored in the frame.

Earlier, each page table entry was 32 bits; once PAE is enabled this increases to 64 bits. Therefore the number of entries in the Page Directory and Page Table is reduced to 512, and the PDPT has 4 entries (2^2).

\[ \text{Total Virtual Address} = 4 [\text{PDPT Entries}] \times 512 [\text{PDT Entries}] \times 512 [\text{PT Entries}] \times 4 \text{KB} [\text{Size of Each Page}] = 4\text{GB} \]

It means we still cover the complete 4 GB of virtual address space, but since each entry is 64 bits, the system can store wider physical addresses and therefore map more physical memory.

To check whether PAE is enabled, we can check bit 5 (CR4.PAE) of the CR4 register.

VIRTUAL ADDRESS (32-bits)
  ┌─────────────────────────────────────────────────────────────────────────────┐
   2-bit PDPT  |      9-bit PDI     |       9-bit PTI           | 12-bit Offset 
  └─────────────────────────────────────────────────────────────────────────────┘
                                                                      |
+--------------------+   +------------------+    +------------------+ |   +-------------------+
| Page Directory     |-->| Page Directory   |--->| Page Table       |-|-->| Physical Page     |
| Pointer Table      |   |                  |    |                  | |   |                   |
|                    |   |                  |    |                  | |   |                   |
| Points to Page     |   | Points to Page   |    | Points to Phys.  | |---|-->Offset selects  |
| Directory Base Addr|   | Table Base Addr  |    | Page Base Addr   |     |   exact location  |
+--------------------+   +------------------+    +------------------+     +-------------------+

Fragmentation in Memory
#

Source: Energy-Efficient Dynamic Memory Allocators Paper

Let’s understand fragmentation using an example. We have 8 GB of RAM, and if the system has no concept of paging or non-contiguous memory allocation, then the system will allocate memory wherever it finds space.

For example, the current memory layout looks something like this:

|---- 2GB in use ----||---- 1GB Free----||---- 4GB in use ----||---- 1GB Free----|

A process needs 2 GB of contiguous space; the system has 2 GB of free space, but the process can't use it because it is not contiguous. Now suppose the 4 GB block becomes free and a new process requires 3 GB: the whole 4 GB block gets allocated to the 3 GB process. In both cases we are wasting memory; in the first we have memory but can't use it, in the second we are using more memory than required.

Segmentation of Memory
#

Source: Teach-ict.com

In paging we discussed that memory is divided into chunks of fixed size, but in segmentation memory is not divided into fixed sizes. One good example: when the system loads a program into memory, it has segments like the text segment (stores code) and the data segment (initialized global/static variables).

Segmentation Table To Physical Address
#

Source: binaryterms.com

When the CPU executes instructions it needs to fetch them from a physical address; to reach RAM/the physical address, the segmentation table is used.

The segmentation table holds two pieces of information, the limit (size of the segment) and the base address (where it is mapped in RAM), and each segment is referred to by a number so that it is easy for the CPU to check and access.

In the above image it can be observed that the CPU has two pieces of information, s (segment number) and d (instruction offset). The segmentation table is used to find the correct segment, then d (the instruction offset) is checked against the limit; if it is not greater than the limit, it is added to the base address to reach the physical address, otherwise the access is invalid.
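A minimal Python sketch of that lookup (the segment numbers, base addresses, and limits are invented for the example):

# Toy segment table: segment number -> (base address, limit in bytes).
segment_table = {
    0: (0x00400000, 0x2000),   # e.g. text segment (values are made up)
    1: (0x00600000, 0x1000),   # e.g. data segment
}

def seg_to_physical(s: int, d: int) -> int:
    """Translate (segment number s, offset d) into a physical address."""
    base, limit = segment_table[s]
    if d >= limit:                      # offset falls outside the segment
        raise ValueError("invalid access: offset exceeds segment limit")
    return base + d

print(hex(seg_to_physical(0, 0x123)))   # -> 0x400123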

Kernel Patch Protection ( KPP )
#

This is a security feature rolled out for x64 systems by Microsoft in 2005. Earlier, AV/EDR vendors and developers could easily modify kernel memory and critical kernel structures like _EPROCESS, which sometimes crashed the system. KPP was introduced to control this and limit modification of kernel memory, but it does not reduce the risk to zero: you can load a driver into the system, gain kernel memory access, and start modifying it, and if any critical structure is modified it causes a BSOD (Blue Screen of Death).

There are websites like LOLDrivers that catalog drivers that are vulnerable or have been exploited by malicious actors.

Paged Pool & Non-Paged Pool Memory
#

A memory pool is a chunk of memory allocated in virtual memory and used by kernel structures and drivers.

There are two types of pools, Paged and Non-Paged:

Paged Pool
#

  • This memory region is allocated in virtual memory, and it is allowed to be paged out, i.e., swapped to disk.

Non-Paged Pool
#

  • This memory region is allocated in virtual memory, but it is not allowed to be paged out or swapped to disk; it stays mapped to physical memory the whole time.

Stack & Heaps
#

+-------------------------+
| Kernel Space            | (Restricted to kernel operations)
+-------------------------+
| User Space              | 
|                         |
|  Stack (grows downward) | 
|                         |
|  Free Space             |
|                         |
|  Heap (grows upward)    |
|                         |
|  BSS Segment            | (uninitialized data)
|  Data Segment           | (initialized global/static variables)
|  Text Segment           | (program code)
+-------------------------+

Stack
#

This is memory allocated by the system automatically when a new function is invoked; it is used to store the function's return pointer and local variables, and it follows a LIFO [ Last In First Out ] approach. This is not a permanent memory space: once the function returns or its execution completes, the memory is deallocated (freed) and can be used by other resources.

Source: Wikipedia

Heap
#

This is memory allocated by the programmer or developer when they want to store data or need memory space for their program or software. Unlike the stack, it does not follow a strict order; allocations and deallocations can happen in any order. This memory is longer-lived: the programmer controls when it is deallocated.

Both stack and heap are allocated in user address space.

Concept of hibernation
#

This is a state of the operating system: when we choose to hibernate the system, all of the data in RAM is stored in a file named hiberfil.sys on Windows, and when the system comes back from hibernation it loads all the data back into memory from this file.

Conclusion
#

I hope you now have a better understanding of how memory actually works, how the CPU manages virtual and physical memory, the various concepts involved in memory management, and whether learning memory forensics is worth it. In the next part of this series, I will cover how to acquire and investigate memory, find artifacts, use popular memory forensics tools like Volatility and Rekall, and develop your own plugins for these tools.

References
#

memory-information-in-task-manager : Pavel Yosifovich

Process Address Space Size : Pavel Yosifovich

Virtual Memory Playlist: David Black-Schaffer

CO51 - Logical to physical address translation : EZCSE

x64 Virtual Address Translation : Pavel Yosifovich

osdev.org - Paging

vtop Translation : VMWARE Blog

vtop Translation : karthix25 Blog

PDT vs PT : Stackoverflow

Converting vtop using Debugger : Microsoft

Physical Address Extension : Microsoft

Segmentation in OS

Kernel Patch Protection : Microsoft Security Advisories

Kernel PatchGuard : Windows-internals.com

Bring Your Own Vulnerable Driver : fourcore.io