In a virtual memory system, the address space a program sees does not correspond directly to physical memory. On a computer architecture where the word size is 32 bits, each process can "see" up to 2^32 bytes = 4G of memory virtually. A program can use any memory address in this 4G address space, but the address has no physical meaning by itself; it only identifies a location in the virtual address space.
Of course, as of 2004 many computers do not actually have 4G of physical memory. However, most programs do not use all 4G of their address space, only parts of it at a time. This is the key idea behind virtual memory.
In virtual memory systems, the 4G address space is split up into small chunks (commonly 4K in size), called pages. The physical memory is also split up into chunks of the same size, called frames. The virtual memory system keeps track of which pages are in use, and maps each virtual address to a location in one of these frames.
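The page and frame split can be illustrated with a little arithmetic. The following is a minimal sketch, assuming 4K pages and a 32-bit address; the names `split`, `translate` and `mapping` are illustrative and do not come from any particular system.

```python
PAGE_SIZE = 4096                           # 4K pages, as in the text
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 low bits select a byte within a page

def split(vaddr):
    """Split a virtual address into (page number, offset within the page)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

def translate(vaddr, page_to_frame):
    """Map a virtual address to a physical one via a page -> frame dictionary."""
    page, offset = split(vaddr)
    frame = page_to_frame[page]            # raises KeyError if the page is unmapped
    return (frame << OFFSET_BITS) | offset

# Hypothetical mapping: virtual page 2 is held in physical frame 7.
mapping = {2: 7}
paddr = translate(0x2ABC, mapping)         # page 2, offset 0xABC
```

Only the page number is translated; the offset is carried over unchanged, which is why pages and frames must have the same size. With 4K pages, a 4G address space contains 2^32 / 2^12 = 2^20 pages.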
Many architectures have direct hardware support for virtual memory in the form of a translation lookaside buffer (TLB), which holds a set of page-frame mappings. Instead of the virtual memory system being implemented entirely in software, the hardware looks up each memory address in the TLB and performs the page-frame translation itself, which gives a performance increase.
However, the TLB can only hold a fixed number of page-frame mappings. It is the job of the virtual memory system to extend this into software, and to hold extra page-frame mappings. The virtual memory system does so by means of a page table.
The page table is at the heart of the virtual memory system, and is the second port of call when an address translation cannot be found in the TLB, an event called a TLB miss.
Hardware architectures may allow an interrupt handler to be installed so that a TLB miss can be handled in software. The handler looks up the address in the page table to see whether a mapping exists. If one exists, it is written back to the TLB, since the hardware accesses memory through the TLB in a virtual memory system, and the faulting instruction is restarted. The hardware then looks in the TLB again, finds the mapping, and the translation succeeds.
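The miss-and-refill sequence described above can be sketched as follows. This is a simplified software model, not how any particular processor works: the class `TLB`, the function `access`, and the choice to discard the oldest entry when the TLB is full are all assumptions made for illustration.

```python
from collections import OrderedDict

class TLB:
    """A tiny fixed-size cache of page -> frame mappings."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, page):
        return self.entries.get(page)          # None signals a TLB miss

    def insert(self, page, frame):
        if page not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # drop the oldest entry (assumed policy)
        self.entries[page] = frame

def access(page, tlb, page_table, stats):
    """Translate a page number, refilling the TLB from the page table on a miss."""
    frame = tlb.lookup(page)
    if frame is None:
        stats["misses"] += 1
        tlb.insert(page, page_table[page])     # the miss handler consults the page table
        frame = tlb.lookup(page)               # the restarted access now hits
    return frame

page_table = {0: 5, 1: 9, 2: 3}                # hypothetical page -> frame mappings
tlb = TLB(capacity=2)
stats = {"misses": 0}
frames = [access(p, tlb, page_table, stats) for p in (0, 0, 1, 2, 0)]
```

In this run the second access to page 0 hits in the TLB, while the final access to page 0 misses again because the two-entry TLB had to discard it to make room for page 2.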
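However, the page table lookup may not be successful, for one of two reasons: either no translation exists for the virtual address, or the page is not currently present in physical memory.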
In the first case, the memory access is invalid and an error may be triggered (the exact behaviour depends on the operating system). In the second case, the page is elsewhere, such as on a disk. To handle this case, the page needs to be taken from disk and put into physical memory. When physical memory is not full, this is quite simple: one writes the page into a free frame of physical memory, modifies the entry in the page table to say that the page is present in physical memory (see the next section), writes the mapping into the TLB and restarts the instruction.
However, when physical memory is full and there are no free frames available, a page in physical memory must be swapped with the page that needs to be brought in. The page table needs to be updated to mark that the page that was previously in physical memory is no longer there, and that the page that was on disk is now in physical memory (and, of course, the mapping must be written into the TLB and the instruction restarted). This process of exchanging pages between physical memory and disk is sometimes known as swapping (though the term is also used to describe swapping entire processes). It is, however, extremely slow in comparison to memory access via the TLB or even the page table, which lies in physical memory.
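The two situations, free frames available and physical memory full, can be sketched as follows. This is an illustrative model only: the class `Memory`, the exception `InvalidAccess`, and the first-in-first-out choice of which page to evict are assumptions, since the text does not specify a replacement policy.

```python
from collections import deque

class InvalidAccess(Exception):
    """Raised when no mapping exists for the page at all."""

class Memory:
    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.resident = {}           # page -> frame, for pages in physical memory
        self.on_disk = set()         # pages currently held on disk
        self.load_order = deque()    # resident pages, oldest first (assumed FIFO policy)

    def page_fault(self, page):
        """Bring `page` in from disk, evicting a resident page if memory is full."""
        if page not in self.on_disk:
            raise InvalidAccess(page)
        evicted = None
        if self.free_frames:
            frame = self.free_frames.pop(0)       # simple case: a frame is free
        else:
            evicted = self.load_order.popleft()   # choose a victim page
            frame = self.resident.pop(evicted)    # it is no longer in memory ...
            self.on_disk.add(evicted)             # ... and now lives on disk
        self.on_disk.discard(page)
        self.resident[page] = frame               # mark the faulting page present
        self.load_order.append(page)
        return frame, evicted

mem = Memory(num_frames=2)
mem.on_disk = {"A", "B", "C"}                     # hypothetical pages
results = [mem.page_fault(p) for p in ("A", "B", "C")]
```

The first two faults are served from free frames. The third finds memory full, so page "A" is moved out to disk and its frame is reused for page "C".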
The simplest page table systems often maintain a frame table and a page table.
The frame table, in the most basic system, holds information about which frames are mapped. In more advanced systems, the frame table can also record which address space a page belongs to, usage statistics, or other background information.
The page table holds the mapping between the virtual address of a page and the address of a physical frame. It also holds auxiliary information about the page, such as a present bit, a dirty or modified bit, and address space or process ID information, amongst others.
Secondary storage, such as a hard disk, can be used to augment physical memory. Pages can be swapped between physical memory and the disk. The present bit indicates which pages are currently in physical memory and which are on disk, and so determines how to treat each page, i.e. whether a page must be loaded from disk, with another page in physical memory swapped out to make room.
The dirty bit allows us a performance optimization. Say we have a page on disk that we swap in to physical memory. We can either write to this page, or we can just read from it. If we just read from it, and we need to replace this page with another, we don't need to write this page back to disk since the page hasn't changed (if we want to reload the page, we can just do so from disk again). However, if we did write to the page, we would raise the dirty flag, and this would mean we would need to write the page back so if we reload the page, we get the correct information back.
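The saving from the dirty bit can be shown with a small sketch. The `PageEntry` fields and the `write` and `evict` functions below are hypothetical, and a list of frame numbers stands in for actual disk writes.

```python
from dataclasses import dataclass

@dataclass
class PageEntry:
    frame: int
    present: bool = True
    dirty: bool = False

def write(entry):
    entry.dirty = True                    # any store to the page raises the dirty flag

def evict(entry, disk_writes):
    """Remove a page from memory, writing it back only if it was modified."""
    if entry.dirty:
        disk_writes.append(entry.frame)   # stand-in for the actual disk write
        entry.dirty = False
    entry.present = False

disk_writes = []
read_only = PageEntry(frame=3)            # loaded from disk and only read
modified = PageEntry(frame=4)             # loaded from disk and then written to
write(modified)
evict(read_only, disk_writes)
evict(modified, disk_writes)
```

Only the modified page causes a write-back; the page that was merely read is dropped, because the copy on disk is still correct.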
Address space or process ID information is necessary so that the virtual memory management system knows which pages to associate with which process. Since every process sees the same range of virtual addresses, two processes may use identical virtual addresses for different purposes, so each entry in the page table must be distinguished by identifying it with its process. This can be done by using a unique address space identifier, or by using process IDs.
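One way to picture this is a single page table keyed by both an identifier and a page number. The sketch below is an assumption-laden illustration: the names `map_page`, `lookup` and `asid` are hypothetical, and real systems may instead keep a separate page table per process.

```python
page_table = {}                             # (asid, page) -> frame

def map_page(asid, page, frame):
    page_table[(asid, page)] = frame

def lookup(asid, page):
    return page_table.get((asid, page))     # None if that process has no such mapping

# Two processes use the same virtual page number for different frames.
map_page(asid=1, page=0x10, frame=7)
map_page(asid=2, page=0x10, frame=12)
```

Here `lookup(1, 0x10)` and `lookup(2, 0x10)` resolve to different frames even though the virtual page number is the same, and a process with no mapping for that page gets nothing back.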