[1] [2] [3] [4] [5] [6] [7] [8] [9]
Define the term heap as it relates to memory.
The heap is the large area of unstructured memory for storing
data used by a program. In C, memory is allocated in the heap
via the malloc library function; in Java, heap
allocation is accomplished via the new
operator.
Describe the reference counting technique for garbage collection.
Each memory allocation has a count associated with it tracking how many pointers reference that memory allocation. When a pointer is modified, the allocation which the pointer was referencing has its counter decremented, and the allocation to which the pointer is now referencing has its counter incremented. When the counter reaches 0, the allocation can safely be freed automatically.
Describe two disadvantages of reference counting as a garbage collection technique.
First, when there is a cycle of memory blocks each referencing the next block in the cycle, with no other locations referencing any of the cycle's blocks, reference counting will not identify these blocks as being available.
Second, reference counting has a high overhead for changing pointers: With every change to a pointer, we must decrement the reference counter for the previously referenced block, increment the reference couner for the newly referenced block, and check whether the previous reference counter has reached zero (to see whether to free it). By contrast, the lazier techniques allow the pointers to be updated directly without such an overhead.
Describe the Mark and Sweep algorithm for garbage collection.
With each memory allocation, we include a one-bit header. Periodically, we stop the process, and go through the following steps.
Set all of the currently allocated blocks' bit-headers to 0.
From each reference root (that is, all variable names that are accessible anywhere on the stack), we traverse to find the reachable memory: With each block we can find, we check whether its bit-header is 1, and if it is not, we set it to one and then recurse for every reference found within that block.
Finally, we sweep through all currently allocated blocks and free any blocks whose bit-headers are still 0.
Explain why a process must be frozen while the mark and sweep algorithm (in its simplest form) locates unused memory.
Suppose we have an array of pointers whose first location references an object A and whose last location references an object B, and suppose that A and B have no other references to them. If the marking phase runs while the program continues, then the program might decide to swap the first and last locations of the array while the marking phase is in the middle of the array. As a result, it would mark A when it reaches the array's beginning, and it would mark A again when it reaches the array's end. It would never mark B, and thus it would identify the memory as unused, and it would be deallocated even though the memory is in fact referenced.
Copy collection makes use of what's called a tombstone. What is the purpose of this tombstone?
The tombstone is a marker in from-space placed at the location of a memory block that has already been copied into to-space. As copy collection continues, other pointers to the old from-space location may be found. Finding a tombstone there will allow the pointer to be corrected to the to-space copy without performing any additional copies.
For a program that has many short-lived memory allocations, why is the process of collecting garbage much faster with copy collection than with mark and sweep?
Mark and sweep involves checking every memory allocation, whether it is active or not. With lots of short-lived memory allocations, most memory allocations are not active. With copy collection, the only memory traversed is that which is currently active.
Traditional thought regards automatic garbage collection
as automatically slower than explicit memory deallocation, since
garbage collection systems must spend time identifying unused
memory, which is unnecessary when programs use explicit memory
allocation.
However, we saw that copying collection actually allows
some programs to run faster than equivalent programs
running on systems using malloc and free, even
if the equivalent programs deallocate memory efficiently.
Why?
With explicit memory deallocation, memory can become highly fragmented, and requests for allocating memory blocks require some computation to find a block that is big enough. Copying collection, however, defragments the memory in the course of collecting garbage, so that there is always just one large block of available memory from which all memory allocations are taken.
(This would be particularly noticeable if we had a program that
allocated many short-lived blocks of memory. In such a case,
copying collection would require little time for copying the
used blocks over into to-space,
and memory allocation would
go quite quickly. With explicit deallocation, though, memory
would be freed frequently.)
What is a weak reference (as used in Python or Java)?
A weak reference to a memory block is ignored when
the garbage collector determines whether the memory block should be kept.
Thus, when we have a weak reference, it points to a memory
block that could potentially disappear once there are no
normal
references to the block.