Monday, April 7, 2014

Virtual memory, page fault and thrashing

Problem

Explain the following terms: virtual memory, page fault, thrashing.

Solution

In short:
  1. Virtual memory is a technique that uses hard disk pages as main memory.
  2. Page fault is the error occurred when transferring pages.
  3. Thrashing - When paging occurs very very frequently, causing performance degradation.
Now lets go in details:

Virtual Memory
Virtual memory is a computer system technique which gives an application program the impression that it has contiguous working memory (an address space), while in fact it may be physically fragmented and may even overflow on to disk storage. Systems that use this technique make programming of large applications easier and use real physical memory (e.g. RAM) more efficiently than those without virtual memory.

It is true that with virtual memory, you are able to have your programs commit (i.e. allocate) a total of more memory that physically available. However, this is only one of many benefits if having virtual memory and it's not even the most important one. Personally, when I use a PC, I periodically check task manager to see how close I come to using my actual RAM. If I constantly go over, I go and I buy more RAM.
The key attribute of all OSes that use virtual memory is that every process has its own isolated address space. That means you can have a machine with 1GB of RAM and have 50 processes running but each one will still have 4GB of addressable memory space (32-bit OS assumed). Why is it important? It's not that you can "fake things out" and use RAM that isn't there. As soon as you go over and swapping starts, your virtual memory manager will begin thrashing and performance will come a halt. A much more important implication of this is that if each program has it's own address space, there's no way it can write to any random memory location and affect another program.

That's the main advantage: stability/reliability. In Windows 95, you could write an application that would crash entire operating system. In W2K+, it is simply impossible to write a program that paves all over its own address space and crashes anything other than self.
There are few other advantages as well. When executables and DLLs are loaded into RAM, virtual memory manager can detect when the same binary is loaded more than once and it will make multiple processes share the same physical RAM. At virtual memory level, it appears as if each process has its own copy, but at a lower level, it all gets mapped to one spot. This speeds up program startup and also optimizes memory usage since each DLL is only loaded once.
Virtual memory managers also allow you to perform file I/O by simply mapping files to pages in the virtual address space. In addition to introducing interesting alternative to working with files, this also allows for implementations of shared memory segments which is when physical RAM with read/write pages is intentionally shared between processes for extremely efficient inter-process communications (IPC).
With all these benefits, if we consider that most of the time you still want to shoot for having more physical RAM than total commit size and consider that modern CPUs have support for virtual address mapping built directly into the hardware, the overhead of having virtual memory manager is actually very minimal. On the other hand, in environments where many applications from many different vendors run concurrently, process address space is priceless.
Page fault
A page is a fixed-length block of memory that is used as a unit of transfer between physical memory and external storage like a disk. According to Wikipedia, page fault is an interrupt (or exception) to the software raised by the hardware, when a program accesses a page that is mapped in address space, but not loaded in physical memory.
That's not entirely correct, as explained later in the same article (Minor page fault). There are soft page faults, where all the kernel needs to do is add a page to the working set of the process.

How to reduce page fault?
  • It might also be helpful to make sure that memory that is accessed after each other is near to each other (eg if you have some objects, place them in an array) if these objects have lots of data that is very infrequently used, place it in another class and make the first class have a reference to the second one. This way you will use less memory most of the time.
  • Increasing the physical RAM on your machine could result in fewer page faults, although design changes (as mentioned in the previous point) to your application will do much better than adding RAM.

Thrashing
Thrash is the term used to describe a degenerate situation on a computer where increasing resources are used to do a decreasing amount of work. In this situation the system is said to be thrashing. Usually it refers to two or more processes accessing a shared resource repeatedly such that serious system performance degradation occurs because the system is spending a disproportionate amount of time just accessing the shared resource. Resource access time may generally be considered as wasted, since it does not contribute to the advancement of any process. In modern computers, thrashing may occur in the paging system (if there is not ‘sufficient’ physical memory or the disk access time is overly long), or in the communications system (especially in conflicts over internal bus access), etc.

In operating systems that implement a virtual memory space the programs allocate memory from an address space that may be much larger than the actual amount of RAM the system possesses. The OS is responsible for deciding which programs "memory" is in actual RAM. It needs a place to keep things while they are "out". This is what is called "swap space", as the OS is swapping things in and out as needed. When this swapping activity is occurring such that it is the major consumer of the CPU time, then you are effectively thrashing. You prevent it by running fewer programs, writing programs that use memory more efficiently, adding RAM to the system, or maybe even by increasing the swap size.
A page fault occurs when the memory access requested (from the virtual address space) does not map to something that is in RAM. A page must then be sent from RAM to swap, so that the requested new page can be brought from swap to RAM. As you might imagine, 2 disk I/Os for a RAM read tends to be pretty poor performance.


References
http://tianrunhe.wordpress.com/2012/04/20/virtual-memory-page-fault-and-thrashing/
http://stackoverflow.com/questions/19031902/what-is-thrashing-why-does-it-occur
http://stackoverflow.com/questions/7713867/is-virtual-memory-really-useful-all-the-time

0 comments:

Post a Comment