0
Cottontail Memory Management: A System for Allocation, Deallocation, and Accounting
By: J.Vinoth Kumar
The Cottontail memory management system tracks memory address space usage for a flat-mode address space. It is somewhat platform-dependent, as it is designed for an Intel(r)-style system with a 32-bit total address space of 4GB, taking advantage of the x86 processor's paging feature. This means there are approximately 4.3 billion individual addressable byte locations within this address space (4GB), broken down into approximately 1.05 million (1M) logical page locations of 4096 (4KB) bytes each. The only information one needs to know when allocating address space is whether or not a page of address space is allocated. This shall be represented by a 1 for those in use, and a 0 for those free for allocation. The size of the bitmap needed to store this information is calculated as follows:1 bit per page x 2^20 pages x 1 byte per 8 bits = 131,072 bytes, or 128KB
From a programming standpoint, a linear search of this bitmap would be ungodfully slow, and this would occur each time memory would need to be allocated. The following C snippet represents a search for a free page:
unsigned char page_usage_bitmap[131072]; int i=0; while(page_usage_bitmap[i / 8] & (1
unsigned long page_usage_bitmap[32768]; int i=0; while(page_usage_bitmap[i] == 0xFFFFFFFF) { i++; }Due to the more simplistic algorithm, the number of machine instructions needed to describe it are reduced further, to a mere 8. 8 x 1,563 = 12,504 instructions x 2 clock cycles per instruction = about 25,000 clock cycles per memory allocation, a far cry from the 2 million in the first algorithm. But this is still too slow. What if there was even more memory allocated already? What if the operating system needed to allocate memory thousands of times per second over and over? These are some very realistic considerations for the designer of a successful high-demand high-performance OS, and therefore more optimization is needed. Divide the address space into 1,024 4MB "superpages" and create a second bitmap which tracks only 1 simple fact: if the superpage has at least one free page (4KB) in it or not. The size of this bitmap will be 1,024 superpages x 1 bit per superpage x 1 byte per 8 bits = 128 bytes total, a nominal sum. With the 32-bit optimization applied, this results in an array of a mere 32 doublewords. By searching this array first, one may skip over a massive amount of pages at a time. Let's go back to our example of 200MB allocated by Half-Life. Since each of the 32 superpage doublewords describes 128MB of address space, this entire 200MB would fall into the first two superpage doublewords. The first 50 superpages would then be automatically skipped over for linear searching, and the 51st would have at least some free pages. The number of iterations to determine a free page then would be 32, since there are 1,024 pages per superpage, and this is a worst-case scenario! This yields 32 iterations x 8 instructions x 2 clock cycles = 512 clock cycles per memory allocation! But what if one needs to allocate more than just one page at a time, say 64? All you would then need to do is use the above algorithm to find two consecutive doublewords of 0, or a doubleword of 0 with a certain number of free pages before and after it. This would be slightly more complicated and may take longer, say 2,000 clock cycles, but still is incredibly fast considering even an ancient 80486 processor has 50,000 clock cycles in a millisecond. Once a memory allocation is complete, it needs to A.) Update the page bitmap with the pages it allocated and B.) Check within each of the superpage areas it allocated address space in to make sure at least one page is free. This is very quick and its overhead is immaterial to this discussion. Deallocation is slightly simpler; merely update the page bitmap with the pages being deallocated and then set the bits of all affected superpages in the superpage bitmap to 0, because we know intuitively from the deallocation that there is at least one free page in that superpage now.
In conclusion, the Cottontail memory management system is the best I have been able to think up at the moment. Constructive criticism and/or comments on this paper would be appreciated from members of this newsgroup. This is only a first draft. Once discussion of it has ceased, I will do an edit and then add it to the material for the first issue of _The OS Development Journal_.
e-mail: frankm29a (at) no canned meat dot hotmail dot com
http://cottontail-os.tripod.com/ Download(ZIP 5.5kb) the source to Frank Millea's memory managment system.
0Awesome Comments!