Does mmap() update the page table after every page fault?

Based on my research on mmap(), I understand that mmap() uses demand paging: file data is brought into the kernel page cache only when the corresponding virtual memory address is touched, via a page fault.

If we are reading files that are bigger than the page cache, then some stale page in the page cache will have to be evicted (reclaimed). So my question is, will the page table be updated to map the corresponding virtual memory address to the address of the old stale page in the cache (now containing new data)? How does this happen? Is this part of the mmap() system call?

Asked By: prajasek

||

will the page table be updated to map the corresponding virtual memory address to the address of the old stale page in the cache (now containing new data)? How does this happen?

When mmap() is called, it creates a mapping in the process’s virtual address space to the file specified. This mapping merely sets up the ability for these pages to be loaded when they are actually accessed; it doesn’t load anything into memory yet. When you then access the pages, a page fault is generated, the page table entries are updated to map the virtual addresses to the physical addresses of the newly loaded pages, and you can then access the file. In the Linux kernel, this fault handling happens in filemap_fault().
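To see this from userspace, here is a minimal sketch in Python, assuming a Unix-like system where the resource module reports fault counters. The helper names page_faults and touch_mapped_file are made up for illustration. It maps a scratch file, then touches one byte per page and reports how many page faults the process took during each step.

```python
import mmap
import resource
import tempfile


def page_faults():
    """Minor + major page faults taken by this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt + usage.ru_majflt


def touch_mapped_file(num_pages=256):
    """Map a scratch file, then touch one byte per page.

    Returns (faults_during_mmap_call, faults_during_first_touch, bytes_read).
    """
    page = mmap.PAGESIZE
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * (num_pages * page))
        f.flush()

        before_map = page_faults()
        # Creating the mapping only sets up the address range.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        after_map = page_faults()

        # First access to each page: a page with no page table entry faults.
        data = bytes(mm[i * page] for i in range(num_pages))
        after_touch = page_faults()
        mm.close()
    return after_map - before_map, after_touch - after_map, data


if __name__ == "__main__":
    map_faults, touch_faults, data = touch_mapped_file()
    print("faults during mmap():", map_faults)
    print("faults during first touch:", touch_faults)
```

The numbers are noisy: the interpreter takes faults of its own, and the kernel may map several neighbouring cached pages while handling a single fault, so don’t expect exactly one fault per page.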

This is also how it works if you access a mapped page which has been evicted: the kernel handles the page fault, puts the file content back into the pages, and from the application’s perspective, nothing happened.
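You can approximate this from userspace, with a caveat: the sketch below uses madvise(MADV_DONTNEED) as a stand-in. That call asks the kernel to drop this process’s pages for the range, which is not the same as reclaim under memory pressure, and the data may well still be in the page cache. What it does show is the transparency: the next access goes back through the fault path and the content is the same. The function name refault_after_drop is hypothetical, and mmap.madvise() needs Python 3.8 or later on a platform that defines MADV_DONTNEED.

```python
import mmap
import tempfile


def refault_after_drop(num_pages=16):
    """Read a mapped page, drop the mapping's pages, then read it again."""
    page = mmap.PAGESIZE
    payload = bytes(range(256)) * (num_pages * page // 256)
    with tempfile.NamedTemporaryFile() as f:
        f.write(payload)
        f.flush()
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

        first = mm[:page]  # faults the page in

        # Assumption: MADV_DONTNEED only exists on some platforms (e.g. Linux).
        if hasattr(mmap, "MADV_DONTNEED"):
            mm.madvise(mmap.MADV_DONTNEED)  # drop this mapping's pages

        second = mm[:page]  # faults again if the pages were dropped
        mm.close()
    return first, second


if __name__ == "__main__":
    first, second = refault_after_drop()
    print("same content after re-fault:", first == second)
```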

There’s nothing special about mmap() here per se — this is how demand paging works inside the Linux kernel in general, as used for almost everything — even regular program memory and file cache entries.

[…] map the corresponding virtual memory address […]

Note that, when reading in with mmap(), the kernel typically will use readahead in order to load more content than just the single page you’ve generated a page fault on, unless there is an indication that this would be unhelpful, like MADV_RANDOM (indicated by user), or MMAP_LOTSAMISS (kernel heuristic).
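For the user-side hint, a minimal sketch of passing MADV_RANDOM from Python follows. The function name read_with_random_hint and the scratch file layout are made up for the example, and mmap.madvise() requires Python 3.8 or later on a platform that defines the constant. The hint is advisory, so the code only shows how to pass it, not any measurable effect on readahead.

```python
import mmap
import tempfile


def read_with_random_hint(offsets, size=1 << 20):
    """Map a scratch file, declare random access, and read single bytes."""
    # Byte at offset i is i % 251, so results are easy to check.
    payload = (bytes(range(251)) * (size // 251 + 1))[:size]
    with tempfile.NamedTemporaryFile() as f:
        f.write(payload)
        f.flush()
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

        # Assumption: MADV_RANDOM is only defined on some platforms.
        if hasattr(mmap, "MADV_RANDOM"):
            mm.madvise(mmap.MADV_RANDOM)  # tell the kernel access is random

        values = [mm[off] for off in offsets]
        mm.close()
    return values


if __name__ == "__main__":
    print(read_with_random_hint([0, 4096, 500000]))
```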

Answered By: Chris Down