Memory Management: Analysis of the autorelease Principle

Developers who worked in the MRC era will have used the autorelease method, which hands an object over to the autorelease pool so that it is released automatically at an appropriate time. This so-called automatic release simply means that the release method is called on each managed object. To understand how autorelease works, we first need to understand what the autorelease pool is.

Let's look at a piece of code in an MRC environment. Why MRC? Because under ARC the compiler inserts the autorelease calls in the right places and does not allow us to call the method manually, so we have to go back to MRC to study how autorelease works.

****************** main.m *****************
#import <Foundation/Foundation.h>
#import "CLPerson.h"

int main(int argc, const char * argv[]) {

    NSLog(@"pool--start");
    @autoreleasepool { 
        CLPerson *p = [[[CLPerson alloc] init] autorelease];
    } 
    NSLog(@"pool--end");

    return 0;
}

************** CLPerson.m **************
#import "CLPerson.h"

@implementation CLPerson

- (void)dealloc
{
    NSLog(@"%s", __func__);
    
    [super dealloc];
}
@end

****************** Print results *******************
2019-08-27 16:37:15.141523+0800 Interview16-autorelease[11602:772121] pool--start
2019-08-27 16:37:15.141763+0800 Interview16-autorelease[11602:772121] -[CLPerson dealloc]
2019-08-27 16:37:15.141775+0800 Interview16-autorelease[11602:772121] pool--end

To summarize what we see: the CLPerson instance p is released when execution reaches the closing brace of @autoreleasepool {}.
So what does @autoreleasepool {} actually do? Run the following command on main.m in a terminal window:

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main.cpp

In the generated intermediate file main.cpp, the underlying implementation of the main function looks like this:

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 
        CLPerson *p = ((CLPerson *(*)(id, SEL))(void *)objc_msgSend)((id)((CLPerson *(*)(id, SEL))(void *)objc_msgSend)((id)((CLPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("CLPerson"), sel_registerName("alloc")), sel_registerName("init")), sel_registerName("autorelease"));
    }
    return 0;
}

If you are familiar with the message-sending mechanism, you can see that the code above is equivalent to the following form:

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { 
        __AtAutoreleasePool __autoreleasepool; 
        CLPerson *p = [[[CLPerson alloc] init] autorelease];
    }
    return 0;
}

Comparing the two versions, we can see what @autoreleasepool {} turns into after compilation.

There is an extra __AtAutoreleasePool variable. __AtAutoreleasePool is a C++ struct; searching main.cpp gives its definition:

struct __AtAutoreleasePool {
    //Constructor -> comparable to an OC init method; called when the struct is created
    __AtAutoreleasePool()
    {
        atautoreleasepoolobj = objc_autoreleasePoolPush();
    }

    //Destructor -> comparable to OC's dealloc method; called when the struct is destroyed
    ~__AtAutoreleasePool()
    {
        objc_autoreleasePoolPop(atautoreleasepoolobj);
    }

    void * atautoreleasepoolobj;
};

If you are not familiar with C++ syntax: a C++ struct is similar to an OC class in that it can contain functions (methods). The __AtAutoreleasePool struct has two of them:

  • The constructor __AtAutoreleasePool(), which runs atautoreleasepoolobj = objc_autoreleasePoolPush(); and is called when the struct is created, to initialize it
  • The destructor ~__AtAutoreleasePool(), which runs objc_autoreleasePoolPop(atautoreleasepoolobj); and is called when the struct is destroyed

Back in our main function, a single-layer @autoreleasepool {} is therefore essentially equivalent to calling objc_autoreleasePoolPush() at the opening brace and objc_autoreleasePoolPop() at the closing brace. If several @autoreleasepool {} blocks are nested, each layer can be expanded by the same rule.
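
The push-at-open, pop-at-close behaviour can be tried out in plain C++. The following is only a sketch: fakePoolPush, fakePoolPop, ScopedPool and runNestedScopes are hypothetical names invented here, and the two fake functions merely record the order of calls instead of touching the real runtime.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for objc_autoreleasePoolPush()/objc_autoreleasePoolPop();
// they only record the order in which they are called.
static std::vector<std::string> callLog;

void *fakePoolPush() {
    callLog.push_back("push");
    return &callLog;                 // any non-null token is enough for the sketch
}

void fakePoolPop(void *token) {
    assert(token != nullptr);
    callLog.push_back("pop");
}

// Same shape as the generated __AtAutoreleasePool struct.
struct ScopedPool {
    ScopedPool()  { token = fakePoolPush(); }
    ~ScopedPool() { fakePoolPop(token); }
    void *token;
};

// Rough equivalent of: @autoreleasepool { @autoreleasepool { body } }
std::vector<std::string> runNestedScopes() {
    callLog.clear();
    {
        ScopedPool outer;
        {
            ScopedPool inner;
            callLog.push_back("body");
        }   // inner destructor runs here
    }       // outer destructor runs here
    return callLog;
}
```

Calling runNestedScopes() yields the sequence push, push, body, pop, pop: each scope pushes on entry and pops on exit, and the inner scope is popped before the outer one.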

objc_autoreleasePoolPush() & objc_autoreleasePoolPop()

Next, let's explore the implementation of these two functions. They can be found in the NSObject.mm file of the objc4 source code:

*************** NSObject.mm (objc4) ******************
void *
objc_autoreleasePoolPush(void)
{
    return AutoreleasePoolPage::push();
}

void
objc_autoreleasePoolPop(void *ctxt)
{
    AutoreleasePoolPage::pop(ctxt);
}

As you can see, they call the push() and pop() functions of the C++ class AutoreleasePoolPage, respectively. Before going further into those functions, we need to look at the internal structure of AutoreleasePoolPage. The class is large and has many functions, but what we need to understand first is its member variables, since they hold its mutable state. Leaving out the functions and some static constants, AutoreleasePoolPage can be simplified as follows:

class AutoreleasePoolPage 
{
    magic_t const magic;
    id *next;
    pthread_t const thread;
    AutoreleasePoolPage * const parent;
    AutoreleasePoolPage *child;
    uint32_t const depth;
    uint32_t hiwat;
};

Judging by its name, AutoreleasePoolPage is an autorelease pool page. We know that the autorelease pool is used to store objects; the word "page" suggests that each unit of the pool has a fixed amount of space (memory). How large is it? Let's look at two functions of AutoreleasePoolPage:

id * begin() {

        return (id *) ((uint8_t *)this+sizeof(*this));
}

id * end() {
        return (id *) ((uint8_t *)this+SIZE);
}

The begin() function returns a pointer to the address immediately after the object's last member variable (that is, just past the memory occupied by the object's own members).
end() uses a SIZE constant, so let's look at its definition:

static size_t const SIZE = 
#if PROTECT_AUTORELEASEPOOL
        PAGE_MAX_SIZE;  // must be multiple of vm page size
#else
        PAGE_MAX_SIZE;  // size and alignment, power of 2
#endif

********************************************
#define PAGE_MAX_SIZE           PAGE_SIZE
********************************************
#define PAGE_SIZE               I386_PGBYTES
********************************************
#define I386_PGBYTES            4096            /* bytes per 80386 page */

As you can see, SIZE is 4096. So end() returns a pointer to the address 4096 bytes after the start of the AutoreleasePoolPage object.

With this information, we state the conclusion first and then deepen our understanding through the source code.

Each AutoreleasePoolPage object occupies 4096 bytes. On a 64-bit platform its member variables take up 56 bytes: 16 bytes for magic, 8 bytes for each of the four pointer-sized members (next, thread, parent and child), and 4 bytes each for depth and hiwat. The remaining 4040 bytes are used to store autoreleased objects.
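
The arithmetic can be checked with a small stand-alone mirror of the member layout. This is a sketch under assumptions: PageHeader, kPageSize, slotsPerPage, beginOffset and endOffset are names made up for illustration, the platform is assumed to be 64-bit, magic_t is modelled as four 32-bit words, and a plain pointer stands in for pthread_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified mirror of the member variables (assumes 8-byte pointers).
struct PageHeader {
    uint32_t    magic[4];   // 16 bytes (assumed layout of magic_t)
    void      **next;       //  8 bytes
    void       *thread;     //  8 bytes (stands in for pthread_t)
    PageHeader *parent;     //  8 bytes
    PageHeader *child;      //  8 bytes
    uint32_t    depth;      //  4 bytes
    uint32_t    hiwat;      //  4 bytes
};

constexpr std::size_t kPageSize = 4096;   // SIZE in the listing above

// begin() and end() expressed as byte offsets from the start of the page.
constexpr std::size_t beginOffset() { return sizeof(PageHeader); }
constexpr std::size_t endOffset()   { return kPageSize; }

// Number of pointer-sized slots between begin() and end().
constexpr std::size_t slotsPerPage() {
    return (endOffset() - beginOffset()) / sizeof(void *);
}
```

On a typical 64-bit build, sizeof(PageHeader) is 56, the gap between the two offsets is 4040 bytes, and slotsPerPage() is 505.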

Because the memory of one AutoreleasePoolPage object is limited and a program may add many objects to the autorelease pool, several AutoreleasePoolPage objects may be needed to store them. All AutoreleasePoolPage objects are linked together as a doubly linked list.

The member variables of AutoreleasePoolPage have the following meanings:

  • magic_t const magic; used to check that the structure of the page is intact
  • id *next; points to the next free slot in the page where an autoreleased object can be stored
  • pthread_t const thread; the thread the page belongs to, which shows that a page is tied to a single thread and cannot be shared between threads
  • AutoreleasePoolPage * const parent; a pointer to the previous page
  • AutoreleasePoolPage *child; a pointer to the next page
  • uint32_t const depth; the depth of this page in the linked list
  • uint32_t hiwat; a high-water mark

[The first AutoreleasePoolPage::push()]

Next, we formally begin studying AutoreleasePoolPage::push(). Suppose we are at the start of the first @autoreleasepool {} in the project's main function, that is, push() is being called for the first time in the whole program:

#   define POOL_BOUNDARY nil

static inline void *push() 
    {
        id *dest;
        if (DebugPoolAllocation) {//In debug mode, each autorelease pool starts on a new page
            dest = autoreleaseNewPage(POOL_BOUNDARY);
        } else {//In the normal case, call autoreleaseFast()
            dest = autoreleaseFast(POOL_BOUNDARY);
        }
        assert(dest == EMPTY_POOL_PLACEHOLDER || *dest == POOL_BOUNDARY);
        return dest;
    }

POOL_BOUNDARY is a macro defined as nil. Ignoring debug mode and looking only at the normal case, push() calls autoreleaseFast(POOL_BOUNDARY) to obtain an id *dest and returns it to the caller. Let's look at autoreleaseFast() and see what it returns.

static inline id *autoreleaseFast(id obj)
    {
        //Get the AutoreleasePoolPage that is currently in use (the hot page)
        AutoreleasePoolPage *page = hotPage();
        //(1) If a page exists and is not full, add obj to it directly.
        if (page && !page->full()) {
            return page->add(obj);
        } else if (page) {//(2) If the page is full, call autoreleaseFullPage(obj, page)
            return autoreleaseFullPage(obj, page);
        } else {//(3) If there is no page, call autoreleaseNoPage(obj)
            return autoreleaseNoPage(obj);
        }
    }

Because this is the first push in the whole program, no page object exists yet, so execution takes case (3) and calls autoreleaseNoPage(obj), which is implemented as follows:

static __attribute__((noinline))
    id *autoreleaseNoPage(id obj)
    {
        
        /*--"No page" means either:
         1. No pool has been pushed yet, or
         2. An empty placeholder pool has been pushed, but nothing has been added to it yet.
         */
        assert(!hotPage());

        //Flag -> an extra POOL_BOUNDARY needs to be added
        bool pushExtraBoundary = false;
        if (haveEmptyPoolPlaceholder()) {
            /*
             An empty pool placeholder exists, so set the flag to true.
             An extra POOL_BOUNDARY will be added later based on this flag.
             */
            pushExtraBoundary = true;
        }

        /*
         If the incoming obj is not POOL_BOUNDARY (nil) and DebugMissingPools is enabled,
         report that the object was autoreleased with no pool in place and return nil
         */
        else if (obj != POOL_BOUNDARY  &&  DebugMissingPools) {
            _objc_inform("MISSING POOLS: (%p) Object %p of class %s "
                         "autoreleased with no pool in place - "
                         "just leaking - break on "
                         "objc_autoreleaseNoPool() to debug", 
                         pthread_self(), (void*)obj, object_getClassName(obj));
            objc_autoreleaseNoPool(obj);
            return nil;
        }

        /*
         ♥️♥️♥️♥️If the incoming obj is POOL_BOUNDARY and we are not in debug mode,
         call setEmptyPoolPlaceholder() to set an empty pool placeholder
         */
        else if (obj == POOL_BOUNDARY  &&  !DebugPoolAllocation) {
            return setEmptyPoolPlaceholder();
        }

        // Create the first AutoreleasePoolPage
        AutoreleasePoolPage *page = new AutoreleasePoolPage(nil);
        // Make it the current (hot) page
        setHotPage(page);

        // Push one extra POOL_BOUNDARY if the pushExtraBoundary flag is set
        if (pushExtraBoundary) {
            page->add(POOL_BOUNDARY);
        }

        // Push the incoming obj with add()
        return page->add(obj);
    }

Since no AutoreleasePoolPage has been created and no empty pool placeholder has been set at this point, execution reaches the branch marked with ♥️♥️♥️♥️, which calls setEmptyPoolPlaceholder(). It is implemented as follows:

#   define EMPTY_POOL_PLACEHOLDER ((id*)1)
static pthread_key_t const key = AUTORELEASE_POOL_KEY;
********************************************

static inline id* setEmptyPoolPlaceholder()
    {
        assert(tls_get_direct(key) == nil);
        tls_set_direct(key, (void *)EMPTY_POOL_PLACEHOLDER);
        return EMPTY_POOL_PLACEHOLDER;
    }

You can see that key, a static constant, is bound to (id*)1, and (id*)1 is returned as the empty pool placeholder. That completes the first push() of the whole program: its only result is that EMPTY_POOL_PLACEHOLDER (that is, (id*)1) is recorded as a placeholder for the pool.

[The first call to autorelease]

After push(), an object receives autorelease for the first time. Let's see what autorelease does internally; the source code is as follows:

- (id)autorelease {
    return ((id)self)->rootAutorelease();//Going down from here
}

************************************************
inline id 
objc_object::rootAutorelease()
{
    if (isTaggedPointer()) return (id)this;
    if (prepareOptimizedReturn(ReturnAtPlus1)) return (id)this;

    return rootAutorelease2();//Going down from here
}

************************************************
__attribute__((noinline,used))
id 
objc_object::rootAutorelease2()
{
    assert(!isTaggedPointer());
    return AutoreleasePoolPage::autorelease((id)this);//Going down from here
}

************************************************
static inline id autorelease(id obj)
    {
        assert(obj);
        assert(!obj->isTaggedPointer());
        id *dest __unused = autoreleaseFast(obj);//Finally came to this method.
        assert(!dest  ||  dest == EMPTY_POOL_PLACEHOLDER  ||  *dest == obj);
        return obj;
    }


Following the calls step by step, we see that autorelease eventually reaches autoreleaseFast().

static inline id *autoreleaseFast(id obj)
    {
        //Get the AutoreleasePoolPage that is currently in use (the hot page)
        AutoreleasePoolPage *page = hotPage();
        //(1) If a page exists and is not full, add obj to it directly.
        if (page && !page->full()) {
            return page->add(obj);
        } else if (page) {//(2) If the page is full, call autoreleaseFullPage(obj, page)
            return autoreleaseFullPage(obj, page);
        } else {//(3) If there is no page, call autoreleaseNoPage(obj)
            return autoreleaseNoPage(obj);
        }
    }

This time, let's look at what hotPage() on the first line returns:

static inline AutoreleasePoolPage *hotPage() 
    {
        AutoreleasePoolPage *result = (AutoreleasePoolPage *)
            tls_get_direct(key);
        //If key is bound to EMPTY_POOL_PLACEHOLDER, return nil
        if ((id *)result == EMPTY_POOL_PLACEHOLDER) return nil;
        
        if (result) result->fastcheck();
        return result;//Return the current page object
    }

Because key was bound to EMPTY_POOL_PLACEHOLDER earlier, hotPage() returns nil here, meaning that no page has been created yet. Back in autoreleaseFast(), execution therefore calls autoreleaseNoPage(obj) again. According to our comments on that function, this time execution reaches the last part of the function, which does the following:

  • Creates the first AutoreleasePoolPage
  • Makes it the current (hot) page
  • Because the EMPTY_POOL_PLACEHOLDER set earlier caused pushExtraBoundary to be true, first pushes a POOL_BOUNDARY onto the new page
  • Finally, pushes the incoming autoreleased object obj with add(obj)
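
The interplay between the placeholder set by the first push() and the page created by the first autorelease can be modelled in a few lines. This is a simplified sketch, not the runtime's code: LazyPool and kBoundary are hypothetical names, a vector of strings stands in for the single page, and the debug branches are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

const std::string kBoundary = "BOUNDARY";   // stands in for POOL_BOUNDARY

struct LazyPool {
    bool placeholderSet = false;        // is EMPTY_POOL_PLACEHOLDER recorded?
    bool pageCreated = false;           // does a real page exist yet?
    std::vector<std::string> page;      // contents of the (single) page

    // Plays the role of autoreleaseFast(): both push() and autorelease() end up here.
    void add(const std::string &entry) {
        if (!pageCreated) {             // the autoreleaseNoPage() path
            if (!placeholderSet && entry == kBoundary) {
                placeholderSet = true;  // first push: record the placeholder only
                return;
            }
            bool pushExtraBoundary = placeholderSet;
            pageCreated = true;         // create the first page and make it hot
            placeholderSet = false;     // the slot now refers to the page instead
            if (pushExtraBoundary) page.push_back(kBoundary);
        }
        page.push_back(entry);
    }

    void push() { add(kBoundary); }
    void autorelease(const std::string &name) { add(name); }
};
```

After push() the model holds only the placeholder and no page. After the first autorelease("p") the page exists and contains the deferred boundary followed by p.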

The add() function stores obj in the slot pointed to by the current page's next pointer and then advances next to the following free slot, ready for the next autoreleased object. It is implemented as follows:

id *add(id obj)
    {
        assert(!full());
        unprotect();
        id *ret = next;  // faster than `return next-1` because of aliasing
        *next++ = obj;//Assign first, then increment next
        protect();
        return ret;
    }
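
To see the pointer bump in isolation, here is a toy page with only three slots. It is a sketch: TinyPage and kCapacity are invented for illustration, the slots live in an ordinary array instead of the tail of a 4096-byte block, and protect()/unprotect() are left out.

```cpp
#include <cassert>
#include <cstddef>

struct TinyPage {
    static constexpr std::size_t kCapacity = 3;   // assumption: tiny on purpose
    const void *slots[kCapacity] = {};
    const void **next = slots;                    // next free slot

    const void **begin() { return slots; }
    const void **end()   { return slots + kCapacity; }
    bool empty() { return next == begin(); }
    bool full()  { return next == end(); }

    // Store obj in the slot next points at, then advance next by one slot.
    const void **add(const void *obj) {
        assert(!full());
        const void **ret = next;
        *next++ = obj;
        return ret;
    }
};
```

Adding one object to an empty TinyPage returns begin() and leaves next one slot further along; after three additions full() becomes true. Note that a TinyPage should not be copied, because next points into the object's own array.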

Also note the setHotPage(page) function here, which is implemented as follows

static inline void setHotPage(AutoreleasePoolPage *page) 
    {
        if (page) page->fastcheck();
        tls_set_direct(key, (void *)page);
    }

Its job is to bind the newly created AutoreleasePoolPage to key, so that from then on hotPage() can fetch the current page directly through key.

[Calling autorelease again]

If we go on to call autorelease on another object, we again arrive at autoreleaseFast(). This time an AutoreleasePoolPage object already exists, so if the current page is not full, execution takes case (1): the object is pushed directly with add(obj).

As we said before, the number of autoreleased objects that one AutoreleasePoolPage can store is limited. An autoreleased object is stored as a pointer, which takes 8 bytes, and the space available in one AutoreleasePoolPage is 4040 bytes, that is, room for 505 objects (pointers). So an AutoreleasePoolPage can become full. When that happens, autoreleaseFast() calls autoreleaseFullPage(obj, page), which is implemented as follows:

static __attribute__((noinline))
    id *autoreleaseFullPage(id obj, AutoreleasePoolPage *page)
    {
        // The hot page is full. 
        // Step to the next non-full page, adding a new page if necessary.
        // Then add the object to that page.
        assert(page == hotPage());
        assert(page->full()  ||  DebugPoolAllocation);

        do {//Follow the child pointers to the next page that is not full
            if (page->child) page = page->child;
            else page = new AutoreleasePoolPage(page);
        } while (page->full());

        setHotPage(page);//Set the page obtained above to the current page (hot)
        return page->add(obj);//Store obj into the page through the add function
    }

In other words, the function follows the child pointer of the AutoreleasePoolPage object to find the next page that is not full, creating a new page if there is none. AutoreleasePoolPage objects form a doubly linked list through their child and parent pointers, and this is where the child pointer is used. Similarly, when the pool is being emptied and the current page is completely empty, the parent pointer is used to move back to the previous page.
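
The page-chaining behaviour can be sketched with toy pages of three slots each. Everything here is a simplification for illustration: TinyPage, TinyPool, kCapacity and pageCount are hypothetical names, and the thread-local hot page is replaced by an ordinary member.

```cpp
#include <cassert>
#include <cstddef>

struct TinyPage {
    static constexpr std::size_t kCapacity = 3;   // assumption: 3 slots, not 505
    const void *slots[kCapacity] = {};
    std::size_t count = 0;
    TinyPage *parent;
    TinyPage *child = nullptr;

    // Like `new AutoreleasePoolPage(parent)`: link the new page behind its parent.
    explicit TinyPage(TinyPage *newParent) : parent(newParent) {
        if (parent) parent->child = this;
    }
    bool full() const { return count == kCapacity; }
    void add(const void *obj) {
        assert(!full());
        slots[count++] = obj;
    }
};

struct TinyPool {
    TinyPage *first = nullptr;   // head of the doubly linked list
    TinyPage *hot = nullptr;     // page currently being filled

    TinyPool() = default;
    TinyPool(const TinyPool &) = delete;
    TinyPool &operator=(const TinyPool &) = delete;
    ~TinyPool() {
        while (first) {
            TinyPage *nextPage = first->child;
            delete first;
            first = nextPage;
        }
    }

    void autorelease(const void *obj) {
        if (hot && hot->full()) {
            // Full hot page: walk the child pointers, appending a page if needed.
            TinyPage *page = hot;
            do {
                page = page->child ? page->child : new TinyPage(page);
            } while (page->full());
            hot = page;
        } else if (!hot) {
            // No page at all yet: create the first one.
            first = hot = new TinyPage(nullptr);
        }
        hot->add(obj);
    }

    std::size_t pageCount() const {
        std::size_t n = 0;
        for (const TinyPage *p = first; p; p = p->child) ++n;
        return n;
    }
};
```

Adding three objects keeps everything on one page; the fourth creates a second page whose parent is the first page and which becomes the new hot page.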

[AutoreleasePoolPage::push() once again]

Besides the initial @autoreleasepool {} that the system adds in the main function, our own code may also use @autoreleasepool {} to manage the memory of some objects more flexibly. An @autoreleasepool {} that we add ourselves is necessarily nested inside the one in main, which is equivalent to:

int main(int argc, const char * argv[]) {
        @autoreleasepool {//The outermost layer, added by the system
                @autoreleasepool {}//An inner, nested layer that we might add ourselves
        }

}

Now let's look again at how AutoreleasePoolPage::push() executes. As before, execution reaches autoreleaseFast(POOL_BOUNDARY): POOL_BOUNDARY is passed to autoreleaseFast() and stored in an AutoreleasePoolPage through add() or autoreleaseFullPage(). This is the same flow as an ordinary [obj autorelease], except that this time obj = POOL_BOUNDARY, evidently in preparation for a new @autoreleasepool {}.

What is POOL_BOUNDARY actually for? We will see shortly.

Having analyzed the source code, let's illustrate how @autoreleasepool works.
[Assumption] To keep the illustration small, suppose that each AutoreleasePoolPage can hold only three entries.
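
Under that three-entries-per-page assumption, the layout produced by nested pools can be reproduced with a small text model. This is only a sketch: PoolModel, kSlotsPerPage and kBoundary are names invented here, entries are plain strings, and the empty-placeholder optimisation described earlier is ignored, so every push() stores a boundary immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const std::size_t kSlotsPerPage = 3;          // the hypothetical page size
const std::string kBoundary = "BOUNDARY";     // stands in for POOL_BOUNDARY

struct PoolModel {
    std::vector<std::vector<std::string>> pages;

    // Append one entry, opening a new page when the last one is full.
    void add(const std::string &entry) {
        if (pages.empty() || pages.back().size() == kSlotsPerPage) {
            pages.emplace_back();
        }
        pages.back().push_back(entry);
    }

    void push() { add(kBoundary); }                           // start of @autoreleasepool {
    void autorelease(const std::string &name) { add(name); }  // [obj autorelease]

    // Render the pages left to right, for example "[BOUNDARY p1 p2][p3]".
    std::string layout() const {
        std::string out;
        for (const auto &page : pages) {
            out += "[";
            for (std::size_t i = 0; i < page.size(); ++i) {
                if (i) out += " ";
                out += page[i];
            }
            out += "]";
        }
        return out;
    }
};
```

For an outer pool holding p1 and p2 and an inner pool holding p3, p4 and p5, the model yields [BOUNDARY p1 p2][BOUNDARY p3 p4][p5]: each pool starts with its own boundary, and entries simply spill over onto the next page when a page is full.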

When is release called on an autoreleased object?

To answer this, we need to figure out what the other half of @autoreleasepool {}, namely AutoreleasePoolPage::pop(atautoreleasepoolobj), does. Its core is the function releaseUntil(stop), where stop points to the POOL_BOUNDARY stored by the matching push(). Let's step into the function:

void releaseUntil(id *stop) 
    {
        
        
        while (this->next != stop) {//Leave the loop once next reaches stop, the slot where POOL_BOUNDARY was stored

            //Get the current page
            AutoreleasePoolPage *page = hotPage();

            //If the current page is empty, step back through parent to the previous AutoreleasePoolPage and make it the current page
            while (page->empty()) {
                page = page->parent;
                setHotPage(page);
            }

            page->unprotect();

            //Take the object at the top of the current page's stack via next
            id obj = *--page->next;
            memset((void*)page->next, SCRIBBLE, sizeof(*page->next));
            page->protect();

            if (obj != POOL_BOUNDARY) {
                //Call [obj release] if obj is not POOL_BOUNDARY
                objc_release(obj);
            }
        }

        setHotPage(this);
    }

The core steps of pop() are reflected in the comments above. When the scope of the innermost @autoreleasepool {} ends and its pop() is called, objects are taken from the top of the stack, starting at the current page of the AutoreleasePoolPage list, and released one by one until POOL_BOUNDARY is reached. At that point every object contained in that layer of @autoreleasepool {} has received its release call.

When the program reaches the end of the enclosing @autoreleasepool {} scope, the same process runs again and release is called once on each object that pool contains. You can follow the process with the example in the figure below.
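
The unwinding can be sketched with a flat stack in place of the linked pages. The names PoolStack, entries and released are made up for this illustration; an empty string plays the role of POOL_BOUNDARY, the token returned by push() is an index rather than a slot address, and "releasing" an object only records its name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct PoolStack {
    std::vector<std::string> entries;    // "" plays the role of POOL_BOUNDARY
    std::vector<std::string> released;   // names in the order they were released

    // push(): store a boundary and return its position as the pool token.
    std::size_t push() {
        entries.push_back("");
        return entries.size() - 1;
    }

    void autorelease(const std::string &name) { entries.push_back(name); }

    // pop(token): unwind from the top until the boundary slot itself is removed.
    void pop(std::size_t token) {
        assert(token < entries.size() && entries[token].empty());
        while (entries.size() > token) {
            std::string obj = entries.back();
            entries.pop_back();
            if (!obj.empty()) released.push_back(obj);   // like objc_release(obj)
        }
    }
};
```

With an outer pool containing p1 and an inner pool containing p2 and p3, popping the inner token releases p3 and then p2 and leaves the outer boundary and p1 in place; popping the outer token then releases p1 and empties the stack.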


Autorelease Pool and Run Loop

From the analysis above, we know that @autoreleasepool {} simply calls objc_autoreleasePoolPush() at the start of its scope and objc_autoreleasePoolPop() at the end. But in an iOS project, when does the scope of @autoreleasepool {} start and end? Answering that requires RunLoop, another topic we have studied before. We know that, unless we start one manually for another thread, only the main thread has a running RunLoop, which is started by default. Let's take a look at what is inside the main thread's RunLoop.

We can easily create a new iOS project and print the current RunLoop object (that is, the RunLoop object of the main thread) directly in ViewController's viewDidLoad method.

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"%@",[NSRunLoop currentRunLoop]);
}

@end

The printed output is very long. If you are not familiar with the structure of RunLoop, you can refer to my earlier article, Runloop's Internal Structure and Operating Principle, which explains it fairly clearly. In the common mode items section of the output we can find two observers related to autorelease, as shown in the following figure

Specifically as follows

<CFRunLoopObserver 0x600003f3c640 [0x10a2fdae8]>
{
valid = Yes, activities = 0xa0, repeats = Yes, order = 2147483647, 
callout = _wrapRunLoopWithAutoreleasePoolHandler (0x10e17ac9d), 
context = 
<CFArray 0x6000000353b0 [0x10a2fdae8]>
    {
    type = mutable-small, count = 1, values = (0 : <0x7f91ff802058>)
    }
}


<CFRunLoopObserver 0x600003f3c500 [0x10a2fdae8]>
{
valid = Yes, activities = 0x1, repeats = Yes, order = -2147483647, 
callout = _wrapRunLoopWithAutoreleasePoolHandler (0x10e17ac9d), 
context = 
<CFArray 0x6000000353b0 [0x10a2fdae8]>
    {
    type = mutable-small, count = 1, values = (0 : <0x7f91ff802058> )
    }
}

As we can see, the two observers monitor the following activities, respectively:

  • activities = 0xa0 (decimal 160)
  • activities = 0x1 (decimal 1)

How should these two values be interpreted? We can find the corresponding definition in the RunLoop source code of the CF framework:

typedef CF_OPTIONS(CFOptionFlags, CFRunLoopActivity) {
    kCFRunLoopEntry = (1UL << 0),          // decimal 1   (entering the loop)
    kCFRunLoopBeforeTimers = (1UL << 1),   // decimal 2
    kCFRunLoopBeforeSources = (1UL << 2),  // decimal 4
    kCFRunLoopBeforeWaiting = (1UL << 5),  // decimal 32  (loop about to sleep)
    kCFRunLoopAfterWaiting = (1UL << 6),   // decimal 64
    kCFRunLoopExit = (1UL << 7),           // decimal 128 (exiting the loop)
    kCFRunLoopAllActivities = 0x0FFFFFFFU
};

According to the enumeration values of the RunLoop activities, 160 = 128 + 32, that is:

  • activities = 0xa0 = (kCFRunLoopExit | kCFRunLoopBeforeWaiting)
  • activities = 0x1 = (kCFRunLoopEntry)

So whenever one of these three activities is observed, the _wrapRunLoopWithAutoreleasePoolHandler function is called. What this function does is shown in the figure below:

  • On kCFRunLoopEntry, it calls objc_autoreleasePoolPush()
  • On kCFRunLoopBeforeWaiting, it calls objc_autoreleasePoolPop() and then objc_autoreleasePoolPush()
  • On kCFRunLoopExit, it calls objc_autoreleasePoolPop()
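
The decomposition of the two activities masks can be double-checked mechanically. In this sketch the constants repeat the values from the CFRunLoopActivity enumeration quoted above under shortened, made-up names, and describeActivities is a hypothetical helper, not a Core Foundation function.

```cpp
#include <cassert>
#include <string>

// Values copied from the CFRunLoopActivity enumeration quoted above.
constexpr unsigned long kEntry         = 1UL << 0;
constexpr unsigned long kBeforeTimers  = 1UL << 1;
constexpr unsigned long kBeforeSources = 1UL << 2;
constexpr unsigned long kBeforeWaiting = 1UL << 5;
constexpr unsigned long kAfterWaiting  = 1UL << 6;
constexpr unsigned long kExit          = 1UL << 7;

// Turn an activities mask into a readable list of flag names.
std::string describeActivities(unsigned long mask) {
    struct Flag { unsigned long bit; const char *name; };
    const Flag flags[] = {
        {kEntry, "Entry"},
        {kBeforeTimers, "BeforeTimers"},
        {kBeforeSources, "BeforeSources"},
        {kBeforeWaiting, "BeforeWaiting"},
        {kAfterWaiting, "AfterWaiting"},
        {kExit, "Exit"},
    };
    std::string out;
    for (const Flag &flag : flags) {
        if (mask & flag.bit) {
            if (!out.empty()) out += " | ";
            out += flag.name;
        }
    }
    return out;
}
```

describeActivities(0xa0) gives "BeforeWaiting | Exit" and describeActivities(0x1) gives "Entry", matching the reading above.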

Based on the above analysis, we can conclude the following. When the RunLoop is entered at program start (kCFRunLoopEntry), objc_autoreleasePoolPush() is called once, and when it exits at program exit (kCFRunLoopExit), objc_autoreleasePoolPop() is called once. In between, while the program is running, every time the RunLoop is about to sleep the observer sees kCFRunLoopBeforeWaiting and calls objc_autoreleasePoolPop(), which calls release on the objects in the current autorelease pool one by one, in effect emptying the pool. It then calls objc_autoreleasePoolPush() again, which opens a new pool for the next iteration after the RunLoop wakes up.
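
The resulting push/pop rhythm can be modelled as a tiny state machine. This is a sketch of the behaviour described above, not the handler's real code: Activity and ObserverModel are hypothetical names, and push and pop only count open pools and log the calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Activity { Entry, BeforeWaiting, Exit };

struct ObserverModel {
    int openPools = 0;                 // pools pushed but not yet popped
    std::vector<std::string> calls;    // log of the push/pop calls made

    void push() { ++openPools; calls.push_back("push"); }
    void pop() {
        assert(openPools > 0);
        --openPools;
        calls.push_back("pop");
    }

    // What the observer callback does for each activity it is registered for.
    void handle(Activity activity) {
        switch (activity) {
        case Activity::Entry:         push();         break;
        case Activity::BeforeWaiting: pop(); push();  break;
        case Activity::Exit:          pop();          break;
        }
    }
};
```

Feeding the model Entry, any number of BeforeWaiting events, and finally Exit always leaves exactly one pool open between events and none at the end, which is why objects autoreleased during one loop iteration are released before the loop goes to sleep.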

When is release called on the objects in the autorelease pool?
In each iteration of the RunLoop, the objects that have received autorelease (that is, the objects added to an AutoreleasePoolPage) have release called on them when the loop is about to go to sleep.

That covers the principle of the autorelease pool and its relationship with RunLoop.

Tags: iOS SDK

Posted on Thu, 29 Aug 2019 22:16:52 -0700 by paulchen