Friday, 16 March 2012

Deferring execution

The Go language has defer statements, which postpone execution of a function or method call until the surrounding function returns. Similarly, D has scope guard statements, which delay execution of a statement until the end of the enclosing scope (optionally only when the scope exits successfully, or only when it fails).

Go:

lock(l)
// unlocking happens before
// surrounding function returns
defer unlock(l)

D:

lock(l);
// unlock on leaving the scope
scope(exit) unlock(l);

This is similar to the 'finally' blocks found in languages like Java and Python. In C++, we have our trusty RAII technique to execute cleanup code at the end of a scope. But for the cases when that is not convenient, Boost provides the ScopeExit library to accomplish something akin to D:

lock(l);
BOOST_SCOPE_EXIT( (&l) )
{
    unlock(l);
} BOOST_SCOPE_EXIT_END

If you look at the Boost.ScopeExit source, it is filled with enough macro magic to make your head spin. The upside is that it works with C++03. But with C++11 and lambdas, we can build the same functionality and more with easy-to-read, macro-free code.

The basic idea is to put deferred code into a lambda function and then store the lambda in a class that will execute it when it destructs. Let's start with this:

template <typename F>
struct deferrer {
    F fun;
    ~deferrer() { fun(); }
};

template <typename F>
deferrer<F> defer(F const& f) {
    return deferrer<F>{ f };
}
With this simple class (struct for now) and helper function, we can defer code like this:
void foo() {
    lock_t l;
    lock(l);
    auto d = defer([&l] { unlock(l); });
    // ...
}

However, there are a number of problems with this code. First, if the lambda function throws, we get a throwing destructor. If the stack is already being unwound due to another exception, the program goes straight to terminate(). On top of that, C++11 makes destructors noexcept by default, so a conforming compiler calls terminate() even when no other exception is in flight. We can either leave things as they are and disallow deferred code from throwing, or surround the fun() invocation in a try/catch block. Since this problem plagues RAII as well, I'll just resort to the former.
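For completeness, the try/catch option could look roughly like the sketch below. The names quiet_deferrer and quiet_defer are made up for illustration, and silently swallowing the exception is an assumption about what "handling" should mean here; real code might prefer to log it.

```cpp
#include <stdexcept>

// Variant of the deferrer above whose destructor swallows exceptions.
// "quiet_deferrer" and "quiet_defer" are illustrative names only.
template <typename F>
struct quiet_deferrer {
    F fun;
    ~quiet_deferrer() {
        try {
            fun();
        } catch (...) {
            // Deliberately ignored: nothing may escape a destructor.
        }
    }
};

template <typename F>
quiet_deferrer<F> quiet_defer(F const& f) {
    return quiet_deferrer<F>{ f };
}

// Returns true if control reaches the end of the function, i.e. the
// throwing deferred action did not terminate the program.
bool survives_throwing_action() {
    {
        auto d = quiet_defer([] { throw std::runtime_error("cleanup failed"); });
        (void)d;
    }
    return true;
}
```

The price is that a failed cleanup goes unnoticed, which is one more reason to keep deferred code non-throwing in the first place.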

Second, and more serious, deferrer<F> is copyable, and since defer() returns the object by value, temporaries may be constructed and destroyed along the way. Since the destructor executes our function, our deferred code would get executed multiple times and prematurely! Well, that's the worst-case scenario. Most compilers implement copy elision, so in practice RVO keeps this scenario from appearing. But relying on a compiler optimization is bad practice anyway.
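The copyability problem is easy to provoke without relying on what the compiler elides: an explicit copy is enough. The sketch below repeats the naive deferrer so that it stands alone; runs_after_explicit_copy is a hypothetical helper that counts how often the deferred action fires.

```cpp
// The naive, copyable deferrer from above, repeated so the sketch stands alone.
template <typename F>
struct deferrer {
    F fun;
    ~deferrer() { fun(); }
};

template <typename F>
deferrer<F> defer(F const& f) {
    return deferrer<F>{ f };
}

// Returns how many times the deferred action ran for one defer() call
// followed by one explicit copy.
int runs_after_explicit_copy() {
    int runs = 0;
    {
        auto d = defer([&runs] { ++runs; });
        auto copy = d;   // compiles because deferrer<F> is copyable
        (void)copy;
    }
    return runs;         // at least 2: once for d, once for copy
}
```

With copy elision the count is two; any temporaries the compiler does not elide would push it higher.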

You might be saying right now, "wait a minute, this is C++11, shouldn't the return value be moved instead of copied?". Yes, if a move constructor is available. But by defining the destructor, we opted out of having the move constructor (and move assignment operator) implicitly defined for us. Remember, the C++ committee tightened the rules regarding when implicit generation is OK. Dave Abrahams has a nice post on the topic. BTW, gcc 4.6 still generates the move constructor, and I haven't tested the soon-to-be-released 4.7.
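One way to observe this rule is to count copies and moves of a member. The sketch below assumes a compiler that follows the final C++11 rules (unlike the gcc 4.6 behaviour just mentioned); all type and function names in it are hypothetical.

```cpp
#include <utility>

struct counts { int copies; int moves; };

// Single shared tally, reset by each probe below.
counts& tally() { static counts c; return c; }

// Member type that records whether it was copied or moved.
struct tracker {
    tracker() {}
    tracker(tracker const&) { ++tally().copies; }
    tracker(tracker&&) { ++tally().moves; }
};

// User-declared destructor: no implicit move constructor is generated,
// so "moving" falls back to the implicit copy constructor.
struct with_dtor {
    tracker t;
    ~with_dtor() {}
};

// No user-declared special members: the implicit move constructor exists.
struct without_dtor {
    tracker t;
};

counts probe_with_dtor() {
    tally() = counts();
    with_dtor a;
    with_dtor b(std::move(a));
    (void)b;
    return tally();
}

counts probe_without_dtor() {
    tally() = counts();
    without_dtor a;
    without_dtor b(std::move(a));
    (void)b;
    return tally();
}
```

On a conforming compiler, probe_with_dtor() reports one copy and no moves, while probe_without_dtor() reports one move and no copies.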

The best thing to do here is to forbid copying and define a move constructor:
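template <typename F>
class deferrer {
    F fun;
    bool enabled;

public:
    deferrer(F const& f) : fun(f), enabled(true) {}

    // move constructor: take over the action and disarm the source
    // (std::move requires <utility>)
    deferrer(deferrer<F>&& rhs) : fun(std::move(rhs.fun)), enabled(rhs.enabled) {
        rhs.enabled = false;
    }

    // move assignment: any action already held by *this is dropped
    // without being run. In C++11, closure types have a deleted copy
    // assignment operator, so this member only instantiates for
    // assignable callables (function pointers, std::function, ...).
    deferrer<F>& operator=(deferrer<F>&& rhs) {
        if( this != &rhs ) {
            fun = std::move(rhs.fun);
            enabled = rhs.enabled;
            rhs.enabled = false;
        }
        return *this;
    }

    // no copying
    deferrer(deferrer<F> const& ) = delete;
    deferrer<F>& operator=(deferrer<F> const&) = delete;

    ~deferrer() {
        if( enabled )
            fun();
    }

    // add this as a bonus
    void cancel() { enabled = false; }
};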
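As a quick illustration of what the cancel() bonus buys us, here is a rollback-on-failure sketch. It carries a condensed copy of the move-only deferrer so that it compiles on its own; append_pair and its fail_second flag are hypothetical and exist only for the demonstration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Condensed copy of the move-only deferrer, so the sketch compiles on its own.
template <typename F>
class deferrer {
    F fun;
    bool enabled;

public:
    explicit deferrer(F const& f) : fun(f), enabled(true) {}
    deferrer(deferrer&& rhs) : fun(std::move(rhs.fun)), enabled(rhs.enabled) {
        rhs.enabled = false;
    }
    deferrer(deferrer const&) = delete;
    deferrer& operator=(deferrer const&) = delete;
    ~deferrer() { if( enabled ) fun(); }
    void cancel() { enabled = false; }
};

template <typename F>
deferrer<F> defer(F const& f) {
    return deferrer<F>(f);
}

// Hypothetical helper: appends two items, undoing the first append
// if the caller asks the second step to "fail".
bool append_pair(std::vector<std::string>& out, bool fail_second) {
    out.push_back("first");
    auto rollback = defer([&out] { out.pop_back(); });
    if( fail_second )
        return false;        // rollback runs here and removes "first"
    out.push_back("second");
    rollback.cancel();       // success: keep both items
    return true;
}
```

This is roughly what D expresses with scope(failure), except that here the success path has to disarm the guard by hand.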

Now let's turn our attention to the case when the deferred action should run not at the end of the lexical scope in which it was declared, but at the end of the function or of some other enclosing scope. For example:
void foo(const char* path, bool cleanup) {
    int fd = creat(path, S_IRUSR);
    if( cleanup ) {
        auto d = defer([path, fd]{
            close(fd);
            unlink(path);
        });
    }
    else {
        auto d = defer([fd]{ close(fd); });
    }
    // oops, closed and maybe unlinked too early!
    // ....
}

We can fix this by creating another class that we can instantiate in the lexical scope at the end of which we want the deferred action to be executed:
void foo(const char* path, bool cleanup) {
    deferred d; // deferred action will execute at the end of foo()
    int fd = creat(path, S_IRUSR);
    if( cleanup ) {
        d = defer([path, fd]{
            close(fd);
            unlink(path);
        });
    }
    else {
        d = defer([fd]{ close(fd); });
    }
    // ....
}
To implement such a beast, we'll need to use std::function to type-erase the lambda into a nullary function:
template <typename F>
class deferrer {
    friend class deferred;
    // remainder not changed
};

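class deferred {
    std::function<void ()> fun;   // requires <functional>

public:
    deferred() = default;
    deferred(deferred const&) = delete;
    deferred& operator=(deferred const&) = delete;

    template <typename F>
    deferred(deferrer<F>&& d) {
        if( d.enabled ) {
            fun = std::move(d.fun);
            d.enabled = false;
        }
    }

    // Needed for "d = defer(...)": with the copy assignment deleted, no
    // move assignment is generated, so the call would otherwise resolve
    // to the deleted operator. Replaces any action already held.
    template <typename F>
    deferred& operator=(deferrer<F>&& d) {
        if( d.enabled ) {
            fun = std::move(d.fun);
            d.enabled = false;
        }
        return *this;
    }

    ~deferred() {
        if( fun )
            fun();
    }
};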
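To check that the action really fires at the end of the outer scope, here is a self-contained sketch that logs the order of events. It condenses both classes, and it assumes a converting assignment operator on deferred so that the d = defer(...) form compiles; trace_order is a hypothetical test helper.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class deferred;

// Condensed move-only deferrer that lets deferred take over its action.
template <typename F>
class deferrer {
    friend class deferred;
    F fun;
    bool enabled;

public:
    explicit deferrer(F const& f) : fun(f), enabled(true) {}
    deferrer(deferrer&& rhs) : fun(std::move(rhs.fun)), enabled(rhs.enabled) {
        rhs.enabled = false;
    }
    deferrer(deferrer const&) = delete;
    deferrer& operator=(deferrer const&) = delete;
    ~deferrer() { if( enabled ) fun(); }
};

template <typename F>
deferrer<F> defer(F const& f) { return deferrer<F>(f); }

class deferred {
    std::function<void ()> fun;

public:
    deferred() = default;
    deferred(deferred const&) = delete;
    deferred& operator=(deferred const&) = delete;

    // Assumption of this sketch: a converting assignment, so that
    // "d = defer(...)" takes over the pending action.
    template <typename F>
    deferred& operator=(deferrer<F>&& d) {
        if( d.enabled ) {
            fun = std::move(d.fun);
            d.enabled = false;   // the temporary must not run it as well
        }
        return *this;
    }

    ~deferred() { if( fun ) fun(); }
};

// Records the order of events; the deferred action should come last.
std::vector<std::string> trace_order(bool verbose) {
    std::vector<std::string> log;
    {
        deferred d;
        if( verbose )
            d = defer([&log] { log.push_back("verbose cleanup"); });
        else
            d = defer([&log] { log.push_back("cleanup"); });
        log.push_back("body");
    }   // d is destroyed here and runs whichever action was assigned
    return log;
}
```

Note that std::function requires its target to be copy-constructible, so lambdas that capture move-only objects cannot be stored this way.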

Conclusion

While I think that the time it takes to define an RAII object is well spent, I admit that the occasional use of a scope-exit facility can be convenient. It is also great to see that lambda functions allow us to implement, with a few simple utility classes, what other languages had to provide in the language itself.