[Operating System] Virtual Memory Management
Introduction
Concepts
- Virtual memory
- Partition user program into blocks and load them into memory on a block basis
- Noncontiguous allocation
- Paging/segmentation
- Goals of virtual memory management
- Virtual memory system performance improvements
- Cost model
- Various optimization schemes
Virtual memory management strategies
- Allocation strategies
- Fetch strategies
- Placement strategies
- Replacement strategies
- Cleaning strategies
- Load control strategies
Cost Model
Cost model for virtual memory system
- Page fault frequency
- Page fault rate
- The system should be designed to reduce operation cost, which is usually measured by the number of page faults generated during process execution
- Reduction of context switching overhead or kernel intervention overhead
- System performance improvements
Page reference string
- Sequence (string) of page numbers that are referenced during process execution
- Page reference string : $\omega$
- $\omega = r_1 r_2 \cdots r_i \cdots r_T$
- $r_i$ : page number, $r_i \in \{0, 1, 2, \cdots, N-1\}$
- N : number of pages of the process (0 ~ N-1)
- T : total number of memory accesses of the process
- Page fault rate : $F(\omega)$
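The cost model can be made concrete with a small simulation: given a reference string $\omega$ of length T and a fixed number of allocated frames, the page fault rate is $F(\omega)$ = (number of page faults) / T. A minimal Python sketch, with illustrative function and variable names; the FIFO eviction used here is only a placeholder policy (replacement strategies are covered later):

```python
from collections import deque

def page_fault_rate(omega, num_frames):
    """Return F(omega) = (number of page faults) / T under demand paging."""
    frames = deque()            # resident pages, oldest first (placeholder FIFO policy)
    faults = 0
    for page in omega:
        if page not in frames:  # page fault: page not resident
            faults += 1
            if len(frames) == num_frames:
                frames.popleft()        # evict the oldest resident page
            frames.append(page)
    return faults / len(omega)

# Example reference string over pages 0..N-1
omega = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(page_fault_rate(omega, 3))
```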
Hardware/Software Components
- Hardware components
- Address translation device
- Used for efficient address mapping
- Examples
- Dedicated page-table registers
- Cache memories
- TLB (associative memories)
Bit vectors
- Record whether the page in each page frame of memory has been referenced or updated
- Reference bits (use bits)
- Update bits (modify bits, write bits, dirty bits)
Hardware components
- Reference bit vector
- Indicates whether the contents of each page frame in memory have been referenced recently
- Operation
- Reference bit is set to 1 when it is referenced by a process
- Periodically reset all reference bits
- Reference bit = 1 at time t:
- The page frame has been referenced by the running process since the last reset point
- Update bit vector
- Indicates whether the page frame has been modified since it was loaded into memory
- No periodical reset
- Page frame with update bit 1
- The contents of the page in memory and on disk are not the same
- Write-back (to disk) is necessary for the page
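A minimal sketch of how these bit vectors behave, assuming a hypothetical FrameTable abstraction (in real hardware the reference/update bits live in the page-table or TLB entries and are set by the MMU):

```python
class FrameTable:
    """Illustrative model of per-frame (reference bit r, update bit m) maintenance."""
    def __init__(self, num_frames):
        self.bits = [(0, 0) for _ in range(num_frames)]   # all frames start unreferenced/clean

    def access(self, frame, write=False):
        r, m = self.bits[frame]
        r = 1                        # reference bit set on every access
        if write:
            m = 1                    # update (dirty) bit set on writes; never reset periodically
        self.bits[frame] = (r, m)

    def reset_reference_bits(self):
        # invoked periodically (e.g., by a timer); update bits are left untouched
        self.bits = [(0, m) for (_, m) in self.bits]
```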
Software components
- Virtual memory management modules (for system performance)
- Allocation strategies (HOW MUCH)
- Fetch strategies (WHEN)
- Placement strategies (WHERE)
- Replacement strategies (WHICH/WHO)
- Cleaning strategies (WHEN)
- Load control strategies (HOW MANY)
Allocation strategies
- How much space to allocate to each process
- Fixed allocation
- Fixed amount of memory is allocated to each process during entire execution of the process
- Variable allocation
- A variable amount of memory is allocated to each process; the amount may change during execution
- Fixed allocation
- Considerations
- It is necessary to estimate the space needs of the processes
- Too much allocation
- Memory waste
- Too little allocation
- Increased page fault rate, performance degradation
Fetch strategies
- When to bring a page into memory
- Demand fetch
- Fetch the page that is referenced by a process
- Only the pages referenced are loaded into memory
- Page fault overhead
- Anticipatory fetch (pre-fetch)
- Prediction on page reference in the future
- Pre-fetch the pages that have a high probability of being referenced in the near future
- Prediction overhead, uncertain hit ratio
- In practice
- Demand fetch is used in most systems
- Pre-fetch: prediction overhead, resource waste when predictions miss
- Demand fetch: reasonable performance in normal situations
Placement strategies
- Where to place the incoming page
- Segmentation system
- First-fit
- Best-fit
- Worst-fit
- Next-fit
- Not necessary in paging systems (all page frames are the same size, so any free frame will do)
Replacement strategies
- Which page to displace for incoming page
- Fixed allocation based replacement strategies
- MIN(OPT, B0) algorithm
- Random algorithm
- FIFO(First In First Out) algorithm
- LRU(Least Recently Used) algorithm
- Additional reference-bits algorithm
- LFU(Least Frequently Used) algorithm
- NUR(Not Used Recently) algorithm
- Clock algorithm
- Enhanced clock algorithm
- Variable allocation based replacement strategies
- VMIN(Variable MIN) algorithm
- WS(Working Set) algorithm
- PFF(Page Fault Frequency) algorithm
Cleaning strategies
- When to write-back the updated page
- Demand cleaning
- Write-back when the updated page is to be replaced
- Write-back only once before replacement
- Anticipatory cleaning (pre-cleaning)
- Prediction-based write-back
- Reduces write-back time at the time of replacement
- Resource waste if a cleaned page is modified again before replacement (repeated write-backs)
- Demand cleaning is commonly used
Load control strategies
- Control the multiprogramming degree of a system
- Related to allocation strategy
- Should maintain multiprogramming degree at the appropriate range
- At the plateau of the multiprogramming degree vs. throughput curve
- Underloaded: System resource waste, performance degradation
- Overloaded: System resource contention, performance degradation
- Thrashing (excessive paging activity)
- Plateau ranges are located at different positions for different systems
Schemes that have critical impacts on system performance
- Allocation strategy
- Replacement strategy
Locality
Definition
- A process intensively accesses some portions of the program/data it executes/references
- Programs tend to use data and instructions with addresses near or equal to those they have used recently
- Loop structure in program
- Use of data structures such as array and structure
Locality
- Classification of locality
- Temporal locality
- Recently referenced items are likely to be referenced again in the near future
- Spatial locality
- Items with nearby addresses tend to be referenced close together in time
Locality example
- Data references
- Reference array elements in succession / Spatial locality
- Reference variable sum each iteration / Temporal locality
- Instruction references
- Reference instructions in sequence / Spatial locality
- Cycle through the loop repeatedly / Temporal locality
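The bullets above correspond to the classic sum-over-array example. A Python version of the access pattern (in a language with contiguous arrays such as C the spatial-locality effect is more direct, but the reference pattern is the same):

```python
def sum_array(a):
    total = 0
    for x in a:        # array elements referenced in succession -> spatial locality
        total += x     # 'total' referenced on every iteration   -> temporal locality
    return total       # loop instructions fetched in sequence (spatial locality)
                       # and repeatedly across iterations (temporal locality)
```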
Replacement Strategies
Local replacement
- Each process selects a victim (replacement frame) only from its own set of allocated frames
- Fixed allocation based replacement
- MIN(OPT, B0) algorithm
- Random algorithm
- FIFO(First In First Out) algorithm
- LRU(Least Recently Used) algorithm
- Additional reference-bits algorithm
- LFU(Least Frequently Used) algorithm
- NUR(Not Used Recently) algorithm
- Clock algorithm
- Enhanced clock algorithm
Global replacement
- Allows a process to select a victim (replacement frame) from the set of all frames, even if that frame is currently allocated to another process
- One process can take a frame from another
- Variable allocation based replacement
- VMIN(Variable MIN) algorithm
- WS(Working Set) algorithm
- PFF(Page Fault Frequency) algorithm
- More common method
- Generally results in greater system throughput
Replacement Strategies (FA)
MIN algorithm (OPT, B0 algorithm)
- Proposed by Belady in 1966
- Minimizes page fault frequency (provably optimal)
- Scheme
- Replace the page that will not be used for the longest period of time
- Tie-breaking rule
- Page with greatest (or smallest) page number
- Unrealizable
- Can be used only when the process’s reference string is known a priori
- Usage
- Performance measurement tool for replacement schemes
- Example: 4 page frames allocated, initially empty
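Since the future reference string must be known, MIN is easy to express as an offline simulation. A minimal sketch (names are illustrative; ties are broken by greatest page number, matching the rule above):

```python
def min_opt_faults(omega, num_frames):
    """MIN/OPT sketch: on a fault with no free frame, evict the resident page whose
    next reference is farthest in the future (ties broken by greatest page number)."""
    frames, faults = [], 0
    for i, page in enumerate(omega):
        if page in frames:
            continue                           # hit: nothing to do
        faults += 1                            # page fault
        if len(frames) < num_frames:
            frames.append(page)                # free frame available
            continue
        rest = omega[i + 1:]
        def next_use(p):                       # distance to the next reference of p
            return rest.index(p) if p in rest else float("inf")
        victim = max(frames, key=lambda p: (next_use(p), p))
        frames[frames.index(victim)] = page    # replace the victim with the new page
    return faults
```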
Random algorithm
- Randomly select the page to be displaced
- Low overhead
- No policy
FIFO algorithm
- Choose the page to be replaced based on when the page was loaded into memory
- Scheme
- Replace the oldest page
- Requirements
- Timestamping (memory load time for each page) is necessary
- Characteristics
- May replace frequently used pages
- FIFO anomaly (Belady’s anomaly)
- In FIFO algorithm, page fault frequency may increase even if more memory frames are allocated
- Example: 4 page frames allocated, initially empty
- Anomaly example (see the sketch below)
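A minimal FIFO sketch, including the classic reference string that exhibits Belady’s anomaly (9 faults with 3 frames but 10 faults with 4 frames):

```python
from collections import deque

def fifo_faults(omega, num_frames):
    frames, faults = deque(), 0
    for page in omega:
        if page in frames:
            continue                 # hit: FIFO order is NOT updated on a hit
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()         # evict the page loaded earliest
        frames.append(page)
    return faults

# Classic anomaly reference string: more frames, more faults
omega = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(omega, 3))   # 9 faults
print(fifo_faults(omega, 4))   # 10 faults -> Belady's anomaly
```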
LRU(Least Recently Used) algorithm
- Choose the page to be replaced based on the reference time
- Scheme
- Replace the page that has not been used for the longest period of time
- Requirements
- Timestamping (page reference time) is necessary
- Characteristics
- Based on program locality
- Approximates the performance of the MIN algorithm
- Used in most practical systems
- Example: 4 page frames allocated, initially empty
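A minimal LRU sketch using an ordered map as the LRU stack (names are illustrative):

```python
from collections import OrderedDict

def lru_faults(omega, num_frames):
    frames, faults = OrderedDict(), 0      # keys kept in recency order (oldest first)
    for page in omega:
        if page in frames:
            frames.move_to_end(page)       # a hit makes the page most recently used
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)     # evict the least recently used page
        frames[page] = None
    return faults
```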
LFU(Least Frequently Used) algorithm
- Counting-based page replacement
- Choose the page to be replaced based on the page reference frequency
- Scheme
- Replace the page with the smallest reference count
- Tie-breaking rule
- LRU
- Requirements
- Page reference count should be accumulated
- Characteristics
- Less overhead than LRU, while exploiting program locality
- Drawbacks
- May replace recently loaded pages even if they have high probability of reference
- Page reference count accumulation overhead
- Example: 4 page frames allocated, initially empty
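A minimal LFU sketch with the LRU tie-breaking rule above (names are illustrative; real implementations approximate the counts rather than keeping exact per-page counters):

```python
def lfu_faults(omega, num_frames):
    """LFU sketch: evict the page with the smallest reference count, LRU as tie-breaker."""
    count, last_use, frames, faults = {}, {}, set(), 0
    for t, page in enumerate(omega):
        count[page] = count.get(page, 0) + 1   # accumulate per-page reference counts
        last_use[page] = t                     # remember the most recent reference time
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            # smallest count first; among equal counts, the least recently used page
            victim = min(frames, key=lambda p: (count[p], last_use[p]))
            frames.remove(victim)
        frames.add(page)
    return faults
```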
NUR(Not Used Recently) algorithm
- LRU approximation scheme
- Lower overhead than LRU algorithm
- Uses bit vectors
- Reference bit vector
- Update bit vector
- Scheme
- Check reference/update bit and choose a victim page
- Order of replacement (reference bit: r, update bit: m)
- ① Replace the page with (r, m) = (0, 0)
- ② Replace the page with (r, m) = (0, 1)
- ③ Replace the page with (r, m) = (1, 0)
- ④ Replace the page with (r, m) = (1, 1)
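A minimal sketch of this victim-selection order, assuming the reference/update bits are available as simple per-page maps (illustrative names):

```python
def nur_victim(frames, ref_bit, upd_bit):
    """NUR sketch: frames is a list of resident pages; ref_bit/upd_bit map page -> 0 or 1."""
    # Preference order: (0, 0) -> (0, 1) -> (1, 0) -> (1, 1)
    for wanted in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        for page in frames:
            if (ref_bit[page], upd_bit[page]) == wanted:
                return page

# Example (illustrative bit values)
frames = [3, 7, 1, 9]
ref_bit = {3: 1, 7: 0, 1: 1, 9: 0}
upd_bit = {3: 0, 7: 1, 1: 1, 9: 1}
print(nur_victim(frames, ref_bit, upd_bit))   # 7: first page found in class (0, 1)
```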
Additional reference-bits algorithm
- LRU approximation
- History register for each page
- Recording the reference bits at regular intervals
- Timer interrupts at regular intervals
- OS shifts the reference bit for each page into the high-order bit of its history register, shifting the other bits right by one bit and discarding the low-order bit
- Replacement based on the value of history register
- Interpret the value of the history register as an unsigned integer
- Choose the page with smallest value as a victim
- Example
- Value of the 8-bit history registers
- 00000000
- No reference during 8 time periods
- 11111111
- Referenced at least once in each period
- Page a with 01100100 and page b with 00110111
- Page a has been used more recently than page b
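A minimal sketch of the shifting and selection steps, assuming 8-bit history registers kept in a per-page map (illustrative names; clearing the hardware reference bit after sampling is an assumption here):

```python
def shift_history(history, ref_bit, bits=8):
    """On each timer interrupt, shift the current reference bit into the high-order
    bit of every page's history register and discard the low-order bit."""
    mask = (1 << bits) - 1
    for page in history:
        history[page] = ((ref_bit[page] << (bits - 1)) | (history[page] >> 1)) & mask
        ref_bit[page] = 0                      # assumed: reference bit cleared after sampling

def aging_victim(history):
    return min(history, key=history.get)       # smallest unsigned value = used least recently

# Example from the text: page a = 0b01100100, page b = 0b00110111
history = {"a": 0b01100100, "b": 0b00110111}
print(aging_victim(history))                   # 'b' (page a has been used more recently)
```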
- 00000000
- Value of the 8-bit history registers
Clock algorithm
- Used in the IBM VM/370 operating system
- Uses reference bit
- No periodical reset for reference bits
- Choose the page to be replaced using a pointer that circulates through the list of page frames in memory
- Scheme
- Choose the page to be replaced with the pointer moving clockwise
- Mechanism
- ① Check the reference bit r of the page that the pointer points to
- ② If r == 0, select the page as a victim
- ③ If r == 1, reset the reference bit to 0, advance the pointer, and go to step ①
- Characteristics
- Earlier memory load time
- → higher probability of displacement
- Similar to FIFO algorithm
- Page replacement based on the reference bit
- Similar to LRU (or NUR) algorithm
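A minimal sketch of the pointer sweep (illustrative names; the resident pages are modeled as a circular list and the reference bits as a per-page map):

```python
def clock_victim(frames, ref_bit, hand):
    """Clock sketch: frames is a circular list of resident pages, hand is the pointer index.
    Returns (victim_page, new_hand_position)."""
    while True:
        page = frames[hand]
        if ref_bit[page] == 0:
            return page, (hand + 1) % len(frames)   # r == 0: victim found
        ref_bit[page] = 0                           # r == 1: give the page a second chance
        hand = (hand + 1) % len(frames)             # advance the pointer and re-check
```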
Enhanced clock algorithm
- Similar to clock algorithm
- Considers update bit as well as reference bit
Scheme
- Choose the page to be replaced with the pointer moving clockwise
- Mechanism
- ① Check (r, m) of the page that the pointer points to
- ② (0, 0): select the page as a victim, advance the pointer
- ③ (0, 1): set (r, m) to (0, 0), put the page on the cleaning list, go to step ⑥
- ④ (1, 0): set (r, m) to (0, 0), go to step ⑥
- ⑤ (1, 1): set (r, m) to (0, 1), go to step ⑥
- ⑥ Advance the pointer, go to step ①
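A minimal sketch of the four-case sweep, assuming the same circular-list model as the clock sketch above and a cleaning list for pages scheduled for write-back (illustrative names):

```python
def enhanced_clock_victim(frames, r, m, hand, cleaning_list):
    """Enhanced clock sketch following the four (r, m) cases above.
    frames: circular list of resident pages; r, m: per-page bits; hand: pointer index."""
    while True:
        page = frames[hand]
        if (r[page], m[page]) == (0, 0):
            return page, (hand + 1) % len(frames)     # victim found, advance the pointer
        if (r[page], m[page]) == (0, 1):
            r[page], m[page] = 0, 0
            cleaning_list.append(page)                # schedule the dirty page for write-back
        elif (r[page], m[page]) == (1, 0):
            r[page], m[page] = 0, 0
        else:                                         # (1, 1)
            r[page], m[page] = 0, 1
        hand = (hand + 1) % len(frames)               # step ⑥: advance and re-check
```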
Other algorithms
- MRU(Most Recently Used) algorithm
- $\leftrightarrow$ LRU algorithm
- MFU(Most Frequently Used) algorithm
- $\leftrightarrow$ LFU algorithm
- Based on the argument that the page with the smallest reference count was probably just brought in and has yet to be used
Consideration on page replacement algorithms
- Whether the scheme is based on reasonable criteria
- Overhead
Implementation of page replacement schemes
- FIFO algorithm
- Time-stamping
- Queue
- LRU algorithm
- Time-stamping
- Stack (LRU stack)
…
Replacement Strategies (VA)
Working set memory management
- Proposed by Denning in 1968
- Working set
- Set of pages that the process intensively references at a time
- Set of distinct pages referenced in the most recent time interval of length $\Delta$ (the working-set window)
- Working set of a process changes as time goes
- Based on locality
- Keep the working set resident in memory at all times
- Reduces page fault rate (reduces thrashing)
- Improve system performance
- Should consider other measures as well as “number of page faults”
- Characteristics
- A page may be displaced even when there is no incoming page (when it drops out of the working set)
- An incoming page may displace no page at all (the resident set simply grows)
- Disadvantages
- Overhead
- The residence set (the set of pages residing in memory) must be updated on every page reference
- Residence set update independent of page faults
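A minimal sketch of the working set $W(t, \Delta)$ as the set of distinct pages referenced in the last $\Delta$ references (illustrative names; the loop shows that the set changes on every reference, not only on page faults):

```python
def working_set(omega, t, delta):
    """W(t, delta): distinct pages referenced in the window (t - delta, t].
    omega[i] is the page referenced at (1-based) time i + 1."""
    lo = max(0, t - delta)
    return set(omega[lo:t])

omega = [0, 1, 2, 0, 1, 3, 0, 3, 1, 2]
for t in range(1, len(omega) + 1):
    print(t, working_set(omega, t, delta=4))   # resident set tracked at every reference
```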
PFF(Page Fault Frequency) scheme
VMIN(Variable MIN) algorithm
Other Considerations
- Page size
- Program restructuring
- TLB reach
Page size
- General range of page size
- $2^7$ bytes (128 B) ~ $2^{22}$ bytes (4 MB)
- Pros and cons
- Smaller page size
- Pros
- Smaller internal fragmentation
- Matches program locality more accurately
- Reduction in total I/O
- Better utilization of memory (less total allocation of memory)
- Cons
- Larger page table size
- Increase in I/O time
- Increase in the number of page faults
Program restructuring
- System performance can be improved if the user (or compiler/linker/loader) has an awareness of the paged nature of memory or underlying demand paging system
- Program restructuring by user or compiler/linker/loader
…
TLB Reach
- TLB Reach
- The amount of memory accessible from the TLB
- TLB reach = (number of TLB entries) × (page size)
- For increasing TLB hit-ratio
- Increase the number of entries in TLB
- Expensive
- Increase the size of page or provide multiple page sizes
- Solaris: 8KB/4MB
- With a 64-entry TLB, TLB reach ranges from 512KB to 256MB
- OS should manage TLB to support multiple page sizes
- Multiple page size
- Requires OS support
- One of the fields in TLB entry must indicate the size of the page corresponding to the TLB entry
- Software managed TLB
- UltraSparc, MIPS, Alpha
- Hardware managed TLB
- PowerPC, Pentium
Summary
- Virtual memory management
- Cost model
- HW/SW components
- Locality
- Virtual memory management schemes
- FA-based schemes
- VA-based schemes
- Other considerations
- Case studies
- MS Windows, Solaris, Linux