Linux Memory Cheat Sheet: useful tools and concepts for Linux kernel memory management.

Linux Virtual Memory Map: Virtual Address Ranges

0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0
ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space
ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
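
As a quick way to use the table above, the sketch below looks up which region a 64-bit virtual address falls into. `REGIONS` and `region_of` are illustrative names, not part of any tool. The boundaries are copied from the table and only apply to kernels that use this particular layout.

```python
# Region boundaries copied from the table above (inclusive start, inclusive end).
# REGIONS and region_of are hypothetical helper names used only for illustration.
REGIONS = [
    (0x0000000000000000, 0x00007fffffffffff, "user space"),
    (0xffff800000000000, 0xffff80ffffffffff, "guard hole"),
    (0xffff880000000000, 0xffffc7ffffffffff, "direct mapping of all phys. memory"),
    (0xffffc90000000000, 0xffffe8ffffffffff, "vmalloc/ioremap space"),
    (0xffffea0000000000, 0xffffeaffffffffff, "virtual memory map"),
    (0xffffff0000000000, 0xffffff7fffffffff, "%esp fixup stacks"),
    # The table lists ffffffffa0000000 as both the end of kernel text and the
    # start of module space; here it is assigned to module space.
    (0xffffffff80000000, 0xffffffff9fffffff, "kernel text mapping"),
    (0xffffffffa0000000, 0xffffffffff5fffff, "module mapping space"),
    (0xffffffffff600000, 0xffffffffffdfffff, "vsyscalls"),
]


def region_of(addr):
    """Return the name of the region containing addr, or a hole marker."""
    for start, end, name in REGIONS:
        if start <= addr <= end:
            return name
    return "hole / unused"


print(region_of(0xffffffff81c114c0))  # the init_task address used later in this sheet
print(region_of(0xffff880000000000))  # first byte of the direct mapping
```

A linear scan is enough here because the table has only a handful of entries.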

Notably, the range ffff880000000000-ffffc7ffffffffff directly maps physical RAM, up to 64 TB. For example, in the Volatility volshell of a RAM dump taken from an emulator with 1.5 GB of RAM, the last read allowed in that virtual address range is:

db(0xffff880000000000+0x5FFD5Fff,0x1)

The direct mapping is a fixed-offset view of the physical address space: virtual address 0xffff880000000000 + P maps physical address P, as the __va() macro shown later in this sheet spells out. The gaps between the System RAM ranges listed in /proc/iomem are therefore not squeezed out of the mapping. If /proc/iomem shows two RAM ranges, one from 0x1000 to 0x1fff and the other from 0x3000 to 0x3fff, reading 0xffff880000001000 accesses physical address 0x1000 and reading 0xffff880000003000 accesses physical address 0x3000; 0xffff880000002000 corresponds to physical address 0x2000, which is not RAM.

The kernel text mapping range ffffffff80000000-ffffffffa0000000 points to the same physical addresses as the first 512 MB of the direct mapping range.

The kernel's most frequently used data structures, such as the task_struct of each running process, are allocated in the direct mapping area.

Is it possible to translate pointer values into physical addresses by hand?

/proc/iomem

bp@home:~/Public/thesis/volatility$ sudo cat /proc/iomem | grep RAM
00001000-0009d3ff : System RAM
00100000-78327fff : System RAM
7ac0e000-7ac0efff : System RAM
100000000-47dffffff : System RAM
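
To work with these ranges programmatically, a minimal sketch like the one below can parse the `System RAM` lines and compute the size of each range. `SAMPLE` and `parse_system_ram` are illustrative names. The sample text is the output shown above; on a live system the text would come from /proc/iomem, which usually needs root to show real addresses.

```python
import re

# Sample input: the "System RAM" lines printed above.
SAMPLE = """\
00001000-0009d3ff : System RAM
00100000-78327fff : System RAM
7ac0e000-7ac0efff : System RAM
100000000-47dffffff : System RAM
"""

LINE_RE = re.compile(r"\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : System RAM\s*$")


def parse_system_ram(text):
    """Return a list of (start, end) physical ranges marked 'System RAM'."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges


ranges = parse_system_ram(SAMPLE)
# End addresses are inclusive, hence the +1.
total = sum(end - start + 1 for start, end in ranges)

for start, end in ranges:
    print("0x%012x-0x%012x  %d bytes" % (start, end, end - start + 1))
print("total System RAM: %.2f GiB" % (total / float(2 ** 30)))
```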

/boot/System.map and /proc/kallsyms Files

Kernel Linear Mapping and Kernel Virtual Memory Accesses

Because paging is enabled, the kernel uses a workaround called linear mapping to keep address translation easy: kernel virtual addresses are mapped 1:1 to physical addresses, so translating one only involves subtracting an offset. Furthermore, the Linux kernel memory region is not swappable, which means it is always resident in RAM.

The Linux kernel's phys_to_virt() and virt_to_phys() helpers (arch/x86/include/asm/io.h) translate kernel addresses. The names are evocative enough: the first translates physical addresses to virtual ones, and the second does the reverse.

static inline void *phys_to_virt(phys_addr_t address)
{
    return __va(address);
}

static inline phys_addr_t virt_to_phys(volatile void *address)
{
    return __pa(address);
}

// Following the Rabbit:
#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))
#define __pa(x) __phys_addr((unsigned long)(x))
#define __phys_addr(x) __phys_addr_nodebug(x)

static inline unsigned long __phys_addr_nodebug(unsigned long x)
{
    unsigned long y = x - __START_KERNEL_map;

    /* use the carry flag to determine if x was < __START_KERNEL_map */
    x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));

    return x;
}

I don’t really know why the virtual-to-physical translation involves more logic than a single arithmetic operation, or why x > y isn’t simply resolved arithmetically as __START_KERNEL_map > 0 (if anyone knows or can share useful pointers, please comment below!).
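
One way to read the extra logic is that __pa() has to handle two different mappings in a single expression. Addresses in the kernel text mapping (at or above __START_KERNEL_map) are translated by subtracting __START_KERNEL_map and adding phys_base, while direct-mapping addresses are translated by subtracting PAGE_OFFSET. Because x is an unsigned long, x - __START_KERNEL_map wraps around when x is below __START_KERNEL_map, so the comparison x > y tells the two cases apart. The Python sketch below models this with explicit 64-bit wrap-around. The constants are assumptions taken from the table at the top of this sheet, and phys_base = 0 is an assumed value (a relocated kernel would use a different one).

```python
MASK64 = (1 << 64) - 1

# Assumed constants for the layout shown in the table at the top of this sheet.
START_KERNEL_MAP = 0xffffffff80000000   # __START_KERNEL_map
PAGE_OFFSET = 0xffff880000000000        # start of the direct mapping
PHYS_BASE = 0                           # assumption: kernel not relocated


def phys_addr_nodebug(x):
    """Model of __phys_addr_nodebug() using 64-bit unsigned arithmetic."""
    y = (x - START_KERNEL_MAP) & MASK64   # wraps around if x < START_KERNEL_MAP
    if x > y:
        # No wrap-around: x lies in the kernel text mapping.
        return (y + PHYS_BASE) & MASK64
    # Wrap-around: x lies in the direct mapping; the sum equals x - PAGE_OFFSET.
    return (y + (START_KERNEL_MAP - PAGE_OFFSET)) & MASK64


print(hex(phys_addr_nodebug(0xffffffff81c114c0)))  # kernel text address
print(hex(phys_addr_nodebug(0xffff880000001000)))  # direct-mapping address
```

Under these assumptions the first call yields physical address 0x1c114c0 and the second yields 0x1000, so both mappings are handled by the same function.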

Tools Cheat Sheet

volatility -f memory.dmp --profile=<dump-profile> imagecopy -O memory.raw

Commands within VolShell:

# Convert Virtual Address to Physical Address
# Kernel Virtual Addresses are simply shifted
# by the __START_KERNEL_map( = 0xffffffff80000000 ) offset
addrspace().vtop(0xffffffff81c114c0)
# Dump Structure from RAM at a given Virtual Address
dt("task_struct",0xffffffff81c114c0)
# Get Structure from RAM at a given Virtual Address
obj.Object("task_struct",vm=addrspace(),offset=0xffffffff81c114c0)
# Utility function, converts page value to page virt addr
def pageval_to_address(val):
    # zero-pad the page value to a 48-bit binary string
    val = bin(val)[2:].zfill(48)
    # page tables use only 48 bits: sign-extend the top bit to 64 bits
    return int(val[0] * 16 + val, 2)
# Get all available pages addresses (page table uses 48 bit
# addresses, we have to sign-extend the last bit)
# pages will contain tuples such as:
# (<virtual-addr>,<physical-addr>,<size>)
pages = []
for addr, size in addrspace().get_available_pages():
    vaddr = pageval_to_address(addr)
    pages.append((vaddr, addrspace().vtop(vaddr), size))
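# volatility knows that page value != page virtual address:
# vtop() gives the same result for a raw page value and
# for its sign-extended virtual address
raw_addr, _ = next(iter(addrspace().get_available_pages()))
addrspace().vtop(raw_addr) == addrspace().vtop(pageval_to_address(raw_addr))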
# If the 48th bit (bit 47) is set then we are in kernel memory
isKernelMemory = lambda addr: addr >= 0x800000000000
# Iterate over task_struct-s and obtain their bytes
init_task_addr = self.addr_space.profile.get_symbol("init_task")
init_task = obj.Object("task_struct", vm=self.addr_space, offset=init_task_addr)
for task in init_task.tasks:
    data = self.addr_space.read(task.obj_offset, task.size())
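
The string-based sign extension in pageval_to_address can also be done with integer masks. The sketch below is a standalone alternative that runs outside volshell. `sign_extend_48` is an illustrative name, not a Volatility function, and it assumes 48-bit canonical addresses as in the rest of this sheet.

```python
def sign_extend_48(val):
    """Sign-extend a 48-bit page value to a 64-bit canonical address."""
    val &= (1 << 48) - 1      # keep only the low 48 bits
    if val & (1 << 47):       # bit 47 set: upper (kernel) half
        val |= 0xffff << 48   # copy bit 47 into bits 48..63
    return val


print(hex(sign_extend_48(0x880000000000)))  # start of the direct mapping
print(hex(sign_extend_48(0x7fffffffffff)))  # top of user space, left unchanged
```

Working on integers avoids building and re-parsing a binary string for every page, which may matter when iterating over all available pages of a large dump.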

Print the kernel call stack for a symbol (e.g. vfs_read) of a given pid with bcc-tools

sudo trace-bpfcc vfs_read -p 29578 -K
