(Latest Revision: Sun Sep 19, 2005)
Chapter Eight
--
Memory Management
--
Chapter Notes
- Section 8.1 -- Background
- Section 8.1.1 -- Address Binding
- Addresses in the source program are generally symbolic --
e.g. count
- Typically the compiler binds symbolic addresses to
relocatable, relative addresses, given as offsets from the
base address of the program or the containing module.
- The relative addresses may be converted to absolute
addresses by the linkage editor or loader.
- If we want to be able to move programs from one location in
memory to another then it will not be workable to load
programs containing absolute addresses. In a modern
operating system processes are usually relocatable.
- Section 8.1.2 -- Logical- versus Physical-Address Space
- Under execution-time binding (the dominant paradigm) while a
process is executing, the memory management unit (MMU)
hardware is responsible for any required mapping from
logical address to physical address.
- The user program deals with the logical addresses
exclusively. The (MMU) hardware translates a logical address
only when a memory access is performed.
- In a simple example situation, the MMU might translate
logical addresses in the range 0..max to the range R..R+max,
where R is the value stored in the relocation register.
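The relocation-register mapping can be sketched in a few lines of Python; the values R = 14000 and max = 3000 here are made-up numbers for illustration:

```python
# Model of relocation-register translation. R and MAX_ADDR are
# hypothetical values; real hardware does this addition in the MMU
# on every memory access.
R = 14000        # relocation register
MAX_ADDR = 3000  # largest legal logical address

def translate(logical):
    """Map a logical address in 0..MAX_ADDR to R..R+MAX_ADDR."""
    if not 0 <= logical <= MAX_ADDR:
        raise ValueError("logical address out of range")
    return R + logical
```

So a logical address of 346 would go out on the bus as physical address 14346.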
- Section 8.1.3 -- Dynamic Loading
- Under dynamic loading, a routine is not loaded until it is
called.
- Each routine has a disk image represented in a relocatable
  load format.
- Routines that are never called are never loaded. This may
result in considerable savings in memory usage.
- Section 8.1.4 -- Dynamic Linking and Shared Libraries
- The in-memory program text originally contains a stub for
each reference the program has to a library routine. The
stub is a piece of code that tells where in memory or on
disk to locate the library routine.
- When the stub is first executed it is replaced with the
address of the routine. (If need be, the routine is loaded
first.)
- All processes share the same copy of each library routine.
- Section 8.1.5 -- Overlays
- Can we run a process that is larger than the physical
memory?
- One "primitive" way to do this is through the use of
overlays.
- For example you could write the program so it loads part of
itself, runs for a while, then loads some more of itself on
top of a part of itself that is no longer needed, and then
runs some more.
- It is difficult but not impossible to write such a program.
It used to be done fairly commonly. It is still done on a
limited basis.
- The idea of virtual memory is to automate this task so that
the programmer doesn't have to do anything special with
programs that are bigger than physical memory.
- Section 8.2 -- Swapping
- Some old multiprogramming systems did swapping to the disk
each time there was a context switch. This made some sense
back when memories were so small that very few processes
could be loaded in memory simultaneously.
- (When you read about this don't let it get you confused
about the difference between swapping and context switching.
They are completely different things.)
- It isn't practical to do swapping this often in a modern
system -- the system would be too unresponsive.
- It is common for an OS to swap out one or more processes
when the system has begun to run out of physical memory.
- Windows 3.1 used swapping. When you clicked on a window
the associated process would be swapped in, if it was not
already in memory.
- Section 8.3 -- Contiguous Memory Allocation
- In a contiguous memory allocation set-up each process resides in
some contiguous address range in memory (e.g. in the L addresses
from base address B to B+L-1). The OS would typically reside in low
memory, along with the interrupt vector.
- Section 8.3.1 -- Memory Protection
- A scheme similar to the base-limit registers idea discussed
in chapter two will suffice to keep track of and enforce
memory allocations.
- By changing the values of the base and limit, the OS can
keep track of processes as it relocates or resizes them.
- Section 8.3.2 -- Memory Allocation
- Fixed-size partitioning is a very simple memory allocation
methodology.
- Variable-sized partitioning is more flexible than fixed-size
partitioning.
- The OS maintains a list of available "holes" in memory.
- A process is placed in a hole big enough for it, and
the remainder not used is returned to the list as a
smaller hole.
- Holes are returned to the list when the process
terminates and adjacent holes are merged into one.
- The job of allocating the memory under these conditions
is known as the dynamic storage allocation
problem.
- The strategy of searching for a hole may affect
performance. First fit, best fit, and worst fit are
possibilities.
- If we order the list of holes by size, we can decrease
the time required to find a suitable hole for a
process, but keeping the list in order requires extra
time.
- First fit is generally faster than best fit. Both
first fit and best fit are better than worst fit in
terms of storage utilization and speed.
- All three algorithms suffer from the effects of
external fragmentation. For example, 1/3 of the memory
may be wasted (unusable).
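The three placement strategies above can be sketched as follows; holes are (base, size) pairs, and the sizes are made up for illustration:

```python
# Hypothetical free list and the three classic placement strategies.
def pick_hole(holes, request, strategy):
    """Return the (base, size) hole chosen for a request, or None."""
    candidates = [h for h in holes if h[1] >= request]
    if not candidates:
        return None
    if strategy == "first":
        return candidates[0]                        # first adequate hole
    if strategy == "best":
        return min(candidates, key=lambda h: h[1])  # smallest adequate hole
    if strategy == "worst":
        return max(candidates, key=lambda h: h[1])  # largest hole
    raise ValueError("unknown strategy")

holes = [(0, 100), (300, 500), (1000, 200)]  # kept in base-address order
```

For a request of 150 bytes, first fit and worst fit both pick the 500-byte hole at base 300, while best fit picks the 200-byte hole at base 1000, leaving a smaller leftover.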
- Section 8.3.3 -- Fragmentation
- Fragmentation can be external or internal.
- Internal fragmentation is memory that is allocated but not
used.
- If processes are relocatable then the OS can move them
around to compact external fragmentation into usable
holes. PROBLEM WITH THIS: it can take a long time and if
you try to do it piecemeal it becomes a complex job that is
very difficult to do correctly.
- The idea of paging and segmentation is to do an "end run"
around these problems by allowing the memory allocation of a
process to consist of non-contiguous chunks.
- Section 8.4 -- Paging
- When swapping is done in conjunction with variable-size
partitioning, there is typically a dynamic storage allocation
problem to solve on the swap space device in addition to the
problem in main memory. Backing stores are very slow compared
to main memories so compaction is not a realistic option. This
makes it all the more attractive to use paging or segmentation
instead of variable-size partitioning.
- Section 8.4.1 -- Basic Method
- The hardware has a given page size such as 4 Kbytes (in other
  words, 4096 bytes). Physical memory and the backing store are
  divided into page-sized contiguous chunks called frames, and the
  logical address space is divided into pages of the same size. For
  example page #0 runs from byte #0 through byte #4095, and page
  #1 runs from byte #4096 through byte (4096+4095)=8191.
- Processes are loaded into a number of frames. The frames don't
have to be contiguous with each other. For example the frame
used for the first 4096 bytes of the program could be frame
#17, which has base address 17*(4096)=69632 and runs up through
byte 69632+4095=73727. The second 4096 bytes of the program
could be in frame #3, which runs from byte 3*4096=12288 to byte
12288+4095=16383.
- The logical address space is a contiguous extent ranging in
address from 0 to some upper limit.
- As a program runs the hardware does all the
routine translation of logical addresses to physical addresses
by using a page table. The operating system does
not perform this routine address translation -- that
would be extremely slow!
- The entries of the page table are set up when each page is
first loaded. When a process attempts a memory access,
hardware uses the first part of the logical address as an
index into the page table. The hardware finds the base
address of a frame in the page table and combines it with
the offset in the logical address. This result is the
physical address translation of the logical address.
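The translation described above can be sketched with the frame numbers from the earlier example (page #0 in frame #17, page #1 in frame #3, 4096-byte pages):

```python
# Sketch of paged address translation with a toy page table.
PAGE_SIZE = 4096
page_table = {0: 17, 1: 3}  # page number -> frame number

def translate(logical):
    """Split a logical address into (page, offset) and rebase it."""
    page, offset = divmod(logical, PAGE_SIZE)
    frame = page_table[page]   # a missing entry would trap to the OS
    return frame * PAGE_SIZE + offset
```

Logical address 0 maps to 17*4096 = 69632, and logical address 4096 maps to 3*4096 = 12288, matching the frame layout above.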
- There is no external fragmentation with paging but each
allocation can create almost a full page of internal
fragmentation.
- A small page size reduces internal fragmentation. A large
page size keeps the page table smaller and reduces the total
amount of I/O overhead for copying pages to and from the
backing store.
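As a worked example of the trade-off (the 72,766-byte process size is a made-up figure), with 4096-byte pages:

```python
# Internal fragmentation for a hypothetical process size.
PAGE_SIZE = 4096
size = 72766                        # hypothetical process size in bytes
pages = -(-size // PAGE_SIZE)       # ceiling division -> pages needed
waste = pages * PAGE_SIZE - size    # unused bytes in the last page
```

The process needs 18 frames and wastes 962 bytes of the last one; on average a process wastes about half a page.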
- Memory protection with paging is pretty straightforward. The
OS creates the page table. It works as a set of base-limit
registers.
- The OS has to keep track of all the allocations of the
physical frames.
- The OS keeps track of the page tables for each process.
- Suppose a process gives an address as a parameter when
communicating with the OS. For example the address could be
the base address of an array that the process wants to use
as an I/O buffer. The process gives the OS a logical
address. (The process only knows about logical addresses.)
The operating system needs to know the physical
address. The OS will use the page table of the process to
translate.
- Section 8.4.2 -- Hardware Support
- The copy of the page table used by the hardware might be a
set of dedicated CPU registers.
- In a modern general-purpose system the CPU contains a
page-table base register (PTBR) pointing to a large page table
resident in the main memory.
- Such a 'modern' system also uses an address cache, the
  translation look-aside buffer (TLB), so that the MMU does not
  usually have to take the time to access the page table when
  performing an address translation.
- ASID technology allows the TLB to contain address
translation information for several different processes.
- ASID technology also cuts down on the necessity to do
time-consuming cache flushes during a context switch.
- Effective memory access time is a function of the hit ratio,
memory access time, and TLB search time.
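A sketch of the effective-access-time calculation, assuming a TLB miss costs one extra memory reference for the page-table lookup (the 100 ns and 20 ns figures are illustrative):

```python
def eat(hit_ratio, mem=100, tlb=20):
    """Effective access time in ns: a TLB hit costs tlb + mem, a
    miss costs tlb + 2*mem (page-table access plus the access itself)."""
    return hit_ratio * (tlb + mem) + (1 - hit_ratio) * (tlb + 2 * mem)
```

With an 80% hit ratio this gives 140 ns; raising the hit ratio to 98% brings it down to 122 ns.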
- Section 8.4.3 -- Protection
- Some systems make the page table only as long as is
necessary for the size of the process. Such a system would
typically have a page-table length register (PTLR). A
process attempting to access an address "past the end of the
table" would generate a trap to the OS.
- In any case, the valid bit in "extra" page table entries can
be cleared by the OS so that the process will trap if it
tries to use one of those entries.
- Unfortunately a process generally can access the
internal fragment in its last page.
- Section 8.4.4 -- Structure of the Page Table
- Section 8.4.4.1 --- Hierarchical Paging
- Page tables may be quite large. In that case we may
divide the page table into pages.
- In one scheme, the logical address is partitioned as
  (p1|p2|d). The p1 field is used as an index into an outer
  page table. This leads us to one page of the page table.
  The p2 field and the offset d are then used in the "normal
  way" to complete the address translation.
- For still larger page tables, SPARC systems support
  three-level paging and the Motorola 68030 supports
  four-level paging.
- Generally it is not considered appropriate to map a
64-bit paged address space with "traditional"
hierarchical page tables. It requires a "ridiculous"
number of levels of page tables -- e.g. seven levels.
- Section 8.4.4.2 -- Hashed Page Tables
- Hashed page tables are an alternative to hierarchical
page tables. A hash function is applied to the virtual
address. Collisions are resolved with external
chaining. Each entry on a chain contains a virtual
address, frame number, and pointer for the next item on
the chain.
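A minimal sketch of the chained lookup; the table size and hash function here are made up, and real implementations hash differently:

```python
# Toy hashed page table with external chaining.
TABLE_SIZE = 8
table = [[] for _ in range(TABLE_SIZE)]   # each slot holds a chain

def insert(page, frame):
    table[page % TABLE_SIZE].append((page, frame))

def lookup(page):
    for p, f in table[page % TABLE_SIZE]:  # walk the collision chain
        if p == page:
            return f
    return None                            # unmapped -> trap to the OS

insert(3, 17)
insert(11, 4)   # 11 % 8 == 3, so this collides with page 3 and is chained
```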
- Clustered page tables are a variant in which each entry
in the page table refers to several pages.
- Section 8.4.4.3 -- Inverted Page Table
- The UltraSPARC and PowerPC use an inverted page
table. This table has one entry for each frame. The
entry contains the virtual address for the frame and
info on the process that is using the frame.
- There may be some total space savings with this set-up,
but hardware and OS cannot directly index into the table
using the page number, so it takes a long time to do
forward address translation.
- The idea of the hashed page table is used in
conjunction with the inverted page table to speed the
search for the correct table entry.
- Of course if there is a cache hit in the TLB, the page
table is not consulted and effective memory
access time is nearly equal to memory access time. If
the page table is consulted, then address translation
requires one memory access for each probed link in the
hash-overflow chain.
- Section 8.4.5 -- Shared Pages
- The paging paradigm easily supports shared memory (at least
when "traditional" hierarchical page tables are used.)
- If two processes have the same frame number in both their
page tables then they are able to share that frame.
- The OS can use this idea to allow many processes to share
the same read-only program text.
- Writable memory may be shared as a means of interprocess
  communication.
- Inverted page tables are set up to have just one virtual
  page number for each frame, which makes it difficult to
  implement shared memory.
- Section 8.5 -- Segmentation
- Section 8.5.1 -- Basic Method
- Programmers tend to think of their programs as a collection
of named functions, modules, and data structures -- not
arranged in any particular order.
- Maybe it is not so "natural" to think of the program as
occupying a linear array of word cells starting at address 0
and running to some upper limit.
- The segmentation memory management scheme makes it a
  little easier for the programmer to view the program as that
  unordered collection.
- Instead of pages we have variable length named segments of
memory. Logical addresses consist of a segment name
followed by an offset within the segment. (Well,
really we don't use a segment name -- we use a
segment number :-)
- Section 8.5.2 -- Hardware
- A segment table is indexed by segment number (name). Each
entry of the table contains the base address and limit
(length) of a segment.
- To translate an address we compare the offset part with the
limit in the segment table entry. If the offset is not too
large, the physical address is the sum of the segment base
plus the offset. Otherwise we have to trap a violation of a
segment limit.
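The check-then-add translation can be sketched as follows; the segment bases and limits are made-up example values:

```python
# segment number -> (base, limit); hypothetical values
segment_table = [(1400, 1000), (6300, 400), (4300, 1100)]

def translate(seg, offset):
    """Check the offset against the limit, then rebase it."""
    base, limit = segment_table[seg]
    if offset >= limit:
        raise MemoryError("segment limit violation -> trap to the OS")
    return base + offset
```

A reference to byte 53 of segment 2 maps to 4300 + 53 = 4353, while byte 400 of segment 1 traps, since segment 1 is only 400 bytes long.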
- Section 8.5.3 -- Protection and Sharing
- Often it does not work well to mark pages as read-only or
execute-only -- because a page may contain different kinds
of things.
- On the other hand a segment is likely to contain just one
  kind of thing, and so it usually works out better if we mark
  segments as read-only, read/write, execute-only, and so on.
- Since we can define segments to contain precisely what we
want to share, segmentation offers some advantages over
paging with regard to supporting sharing.
- However if shared code contains references to itself then
all processes sharing the code will need to refer to the
segment using the same segment number -- the one used in the
code itself.
- On the other hand there is no problem sharing read-only code
  that contains no pointers, or that uses only relative
  addressing.
- Section 8.5.4 -- Fragmentation
- Since segments are variable in size, allocating segments is
a dynamic storage allocation problem subject to external
fragmentation.
- It is not hard to relocate segments so it is pretty easy to
do compaction of segmented memory.
- Nevertheless it remains impractical to do compaction of the
segment images on swap space.
- The smaller the segments the less severe will be the
problems with external fragmentation.
- Section 8.6 -- Segmentation with Paging
- If we use segments, but page the segments, we can get the
advantages of segmentation while eliminating external
fragmentation. However, if there are many segments the internal
fragmentation problem can be non-trivial.
- Address translation is more complicated in this setting.
- Section 8.7 -- Summary