2014年9月11日星期四

Fun with Lambdas: C++14 Style

It's common knowledge that Functional Programming is spreading like a wildfire in mainstream languages. Latest promoted languages: Java 8 and C++, both of which now support lambdas. So, let the lambdas begin! and may the fun be ever on your side. The same text is available in slides form on Slideshare. This blog post and the talk/slides are inspired by JSON inventor Douglas Crockford.

Write an Identity function that takes an argument and returns the same argument.
1
2
3
4
auto Identity = [](auto x) {
  return x;
};
Identity(3); // 3
Write 3 functions add, sub, and mul that take 2 parameters each and return their sum, difference, and product respectively.
1
2
3
4
5
6
7
8
9
auto add = [](auto x, auto y) {
  return x + y;
};
auto sub = [](auto x, auto y) {
  return x - y;
};
int mul (int x, int y) {
  return x * y;
};
Write a function, identityf, that takes an argument and returns an inner class object that returns that argument.
1
2
3
4
5
6
7
8
9
auto identityf = [](auto x) {
  class Inner {
    int x;
    public: Inner(int i): x(i) {}
    int operator() () { return x; }
  };
  return Inner(x);
};
identityf(5)(); // 5
Write a function, identityf, that takes an argument and returns a function that returns that argument.
1
2
3
4
auto identityf = [](auto x) {
  return [=]() { return x; };
};
identityf(5)(); // 5
Lambda != Closure
  • A lambda is just a syntax sugar to define anonymous functions and function objects. 
  • A closure in C++ is a function object which closes over the environment in which it was created. The line #2 above defines a closure that closes over x.
  • A lambda is a syntactic construct (expression), and a closure is a run-time object, an instance of a closure type. 
  • C++ closures do not extend the lifetime of their context. (If you need this use shared_ptr)
Write a function that produces a function that returns values in a range
1
2
3
4
5
6
7
8
9
10
11
auto fromto = [](auto start, auto finish) {   
  return [=]() mutable {     
    if(start < finish)       
      return start++;     
    else       
      throw std::runtime_error("Complete");   
  }; 
};
auto range = fromto(0, 10);
range(); // 0
range(); // 1
Write a function that adds from two invocations
1
2
3
4
5
6
auto addf = [](auto x) {
  return [=](auto y) {
    return x+y;
  };
};
addf(5)(4); // 9
Write a function swap that swaps the arguments of a binary function
1
2
3
4
5
6
auto swap =[](auto binary) {
  return [=](auto x, auto y) {
    return binary(y, x);
  };
};
swap(sub)(3, 2); // -1
Write a function twice that takes a binary function and returns a unary function that passes its argument to the binary function twice
1
2
3
4
5
6
auto twice =[](auto binary) {
  return [=](auto x) {
    return binary(x, x);
  };
};
twice(add)(11); // 22
Write a function that takes a binary function and makes it callable with two invocations
1
2
3
4
5
6
7
8
auto applyf = [](auto binary) {
  return [=](auto x) {
    return [=](auto y) {
      return binary(x, y);
    };
  };
};
applyf(mul)(3)(4); // 12
Write a function that takes a function and an argument and returns a function that takes the second argument and applies the function
1
2
3
4
5
6
auto curry = [](auto binary, auto x) {
  return [=](auto y) {
    return binary(x, y);
  };
};
curry(mul, 3)(4); // 12
Currying (schönfinkeling)
  • Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument.
  • In lambda calculus functions take a single argument only.
  • Must know Currying to understand Haskell.
  • Currying != Partial function application
Partial Function Application
1
2
3
4
5
6
7
8
9
10
auto addFour = [](auto a, auto b,
                  auto c, auto d) {
  return a+b+c+d;
};
auto partial = [](auto func, auto a, auto b) {
  return [=](auto c, auto d) {
    return func(a, b, c, d);
  };
};
partial(addFour,1,2)(3,4); //10
Without creating a new function show 3 ways to create the inc function
1
2
3
auto inc = curry(add, 1);
auto inc = addf(1);
auto inc = applyf(add)(1);
Write a function composeu that takes two unary functions and returns a unary function that calls them both
1
2
3
4
5
6
auto composeu =[](auto f1, auto f2) {
  return [=](auto x) {
    return f2(f1(x));
  };
};
composeu(inc1, curry(mul, 5))(3) // 20
Write a function that returns a function that allows a binary function to be called exactly once
1
2
3
4
5
6
7
8
9
10
11
12
auto once = [](auto binary) {   
  bool done = false;   
  return [=](auto x, auto y) mutable {
    if(!done) {       
      done = true;       
      return binary(x, y);     
    }     
    else       
      throw std::runtime_error("once!");    
  }; 
};
once(add)(3,4); // 7
Write a function that takes a binary function and returns a function that takes two arguments and a callback
1
2
3
4
5
6
7
auto binaryc = [](auto binary) {   
  return [=](auto x, auto y, auto callbk) {
   return callbk(binary(x,y));   
  }; 
};
binaryc(mul)(5, 6, inc) // 31
binaryc(mul)(5,6,[](int a) { return a+1; });
Write 3 functions: 
  • unit – same as Identityf
  • stringify – that stringifies its argument and applies unit to it
  • bind – that takes a result of unit and returns a function that takes a callback and returns the result of callback applied to the result of unit.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
auto unit = [](auto x) {
  return [=]() { return x; };
};
auto stringify = [](auto x) {   
  std::stringstream ss;
  ss << x;
  return unit(ss.str());
};
 
auto bind = [](auto u) {   
  return [=](auto callback) {
   return callback(u());   
  }; 
};
Then verify. 
1
2
3
4
5
6
7
8
9
10
11
std::cout << "Left Identity " 
          << stringify(15)()
          << "=="
          << bind(unit(15))(stringify)()
          << std::endl;
 
std::cout << "Right Identity "
          << stringify(5)()
          << "=="
          << bind(stringify(5))(unit)()
          << std::endl;
Why are unit and bind special? Read more about them here.

Look at some interesting examples of C++11/14 lambdas and how they interact with other language features and libraries. I hope to find some time to add some explanations. See part 1 if you missed it.

  • Associative containers and lambdas
    1
    2
    std::set<int, std::function<bool(int, int)>>
      numbers([](int i, int j) { return i < j; });
  • Recursive Lambdas (see Creating recursive lambdas and returning them too!)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    auto make_fibo()
    {
      return [](int n) {
        std::function<int(int)> recurse;
        recurse = [&](int n){
           return (n<=2)? 1 : recurse(n-1) + recurse(n-2);
        };
        return recurse(n);
      };
    }
  • Composable list manipulation (e.g., cpplinqnarlLEESA)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    Box boxes[] = { ... };
    int sum_of_weights =
         cpplinq::from_array(boxes)
      >> where([](const Box & box) {
           return box.color == Color.RED;
         })
      >> select([](const Box & box) {
           return box.get_weight();
         })
      >> sum();
  • Overloaded Lambdas
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    template <class... F>
    struct overload : F... {
      overload(F... f) : F(f)... {} 
    };
     
    template <class... F>
    auto make_overload(F... f) {
      return overload<F...>(f...);  
    }
     
    auto f =
        make_overload([](int i) { /* print */ },
                      [](double d) { /* print */ });
    f(10); // int
    f(9.99); // double
  • Type Switch (simple pattern matching) (see type_switch.cpp and this paper)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    struct Base {
      virtual ~Base() {}
    };
    struct Derived : Base {};
     
    template <class Poly>
    void test(Poly& p) { 
      match(p)(
        [](int i)             { cout << "int";       },
        [](std::string &)     { cout << "string";    },
        [](Derived &)         { cout << "Derived";   },    
        [](Base &)            { cout << "Base";      },   
        otherwise([](auto x)  { cout << "Otherwise"; })
      ); 
    }
    Derived d;
    Base &b = d;
    std::string cpptruths = "C++ Truths";
    boost::any something = cpptruths;
     
    test(10);        // int
    test(cpptruths); // string
    test(something); // string
    test(b);         // Derived
    test(9.99);      // Otherwise
  • Converting shared_ptr between boost and std (see StackOverflow)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    template <typename T>
    boost::shared_ptr<T>
    make_shared_ptr(std::shared_ptr<T> ptr)
    {     
      return boost::shared_ptr<T>(ptr.get(),
        [ptr](T*) mutable { ptr.reset(); });
    }
     
    template <typename T>
    std::shared_ptr<T>
    make_shared_ptr(boost::shared_ptr<T> ptr)
    {     
      return std::shared_ptr<T>(ptr.get(),
        [ptr](T*) mutable { ptr.reset(); });
    }
  • In-place parameter pack expansion 
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    template <class... T>
    void foreach(T... args)
      bool b[] = { [=](){
        std::cout << args << "\n";
        return true;
      }()... };
    }
     
    foreach(10, 20.2, true);
  • Memoization (see original)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    template <typename ReturnType,
              typename... Args>
    auto memoize(ReturnType (*func)(Args...))
    {
        std::map<std::tuple<Args...>, ReturnType> cache;
     
        return ([=](Args... args) mutable 
        {
            std::tuple<Args...> t(args...);
            if (cache.find(t) == cache.end())               
            {
              std::cout << "not found\n";
              cache[t] = func(args...);
            }
            return cache[t];
        });
    }
  • Finally, slides
  • http://www.slideshare.net/SumantTambe/fun-with-lambdas-c14-style-part-2?redirected_from=save_on_embed

Now that we have C++14, it has opened up doors for truly mind-bending uses of lambdas--more specifically--generic lambdas. This blog post is the third installment in the series of "Fun with Lambdas: C++14 Style". Check out part 1 and part 2 if you have not already.

This post is about "monadic tuples". 

Monad--a simple but powerful abstraction, however, considered quite difficult to understand in the imperative circles. We will look into what's know as the "continuation monad". As it turns out, in C++14, you need just a couple of lines of code to create an instance of a continuation monad.

I'm fairly new to the world of monads. So, things did not begin with great clarity for me. It all started with an intriguing question onStackoverflow. As it turns out the same "trick" is also used in Boost.Hana and discussed on boost mailing list here

What you see below is more or less how I came to understand the idiom as an instance of a monad. Some background in functional programming may be helpful in reading this post. A good understanding of nested generic lambdas is a must. If you are wondering if you should read the part 1 first, then you probably should. 

Ok, lets cut to the chase. 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
auto list = [](auto ...xs) {
    return [=](auto access) { return access(xs...); };
};
 
auto head = [](auto xs) {
    return xs([](auto first, auto ...rest) { return first; });
};
 
auto tail = [](auto xs) {
    return xs([](auto first, auto ...rest) { return list(rest...); });
};
 
auto length = [](auto xs) {
    return xs([](auto ...z) { return sizeof...(z); });
};
 
int len = length(list(1, '2', "3"));  // 3
list is a generic lambda that accepts a variable number of arguments and returns a closure (an instance of the inner lambda) that captures the arguments by value. The inner lambda accepts a parameter (called access) that must be callable with an arbitrary number of arguments. The inner lambda simply expands the parameter pack while calling the callable. That way it provides "access" to the captured parameter pack. 

If you squint a little, you will probably realize that list is like a constructor of a tuple. As a matter of fact, if you were to implement the inner lambda using a good old class template, you will most likely resort to using a std::tuple member. 

head, tail, and length are examples of operations that you may perform on a list. head returns the first element, tail returns the list excluding the first element and length returns the size of the parameter pack. For example, a three element list is passed to the length lambda. As every list itself is a closure, it is called with an "accessor" function. The accessor simply does a sizeof... and returns the result, which propagates all the way out. 

It is probably immediately apparent that this idiom adds life to otherwise drab variadic parameter packs. Don't get me wrong, variadic parameter packs are cool and we won't have other cool things like std::tuple without them. However, the point is that the language allows very few operations on a parameter pack. In general, you can't "store" them. Pretty much, you can expand a parameter pack, ask for its size, and unwind it using the car/cdr recursive style. And that's about it. Until now, To store a parameter pack you have to put it in a std::tuple. 

But now there is an alternative. You can capture it using a lambda and provide access to it as done in the list lambda. As it turns out, this seemingly innocuous and perhaps needlessly convoluted approach to "accessing" parameter packs is phenomenally powerful

WHY? ... the list lambda and the closure inside are special. Together, they form an implementation of a Continuation Monad. 

A great introduction for continuation monad for C++ programmers is here. In essence, the list lambda above takes a value (a variadic parameter-pack) and returns a simple "continuator" (the inner closure). This continuator, when given a callable (called access), passes the parameter pack into it and returns whatever that callable returns. 

Borrowing from the FPComplete blogpost, a continuator is more or less like the following. 
1
2
3
4
5
template<class R, class A>
struct Continuator {
   virtual ~Continuator() {}
   virtual R andThen(std::function<R(A)> access) = 0;
};
The Continuator above is abstract--does not provide an implementation. So, here is a simple one. 
1
2
3
4
5
6
7
8
9
template<class R, class A>
struct SimpleContinuator : Continuator<R, A>
{
   SimpleContinuator(A x) : _x(x) {}
   R andThen(std::function<R(A)> access) {
       return access(_x);
   }
   A _x;
};
The SimpleContinuator accepts one value of type A and passes it on to access when andThen is called. The closure returned by the list lambda is conceptually the same. It is more general. Instead of a single value, the inner closure captures a parameter-pack and passes it to the access function. Neat! 

Hopefully that explains what it means to be a continuator. but what does it mean to be a monad? Here is a good introduction using pictures. 

The inner closure returned by the list lambda is also a list monad, which is implemented as a continuation monad. Note that continuation monad is the mother of all monads. I.e., you can implement any monad with a continuation monad. Of course, list monad is not out of reach. 

As a parameter-pack is quite naturally a "list" (often of heterogeneous types), it makes sense for it to work like a list/sequence monad, where operations can be chained one after another. The list lambda above is a very interesting way of converting C++ parameter-packs to a monadic structure. 

The head and length lambdas above, however, are a bit disappointing because they break the monad and the nested lambda inside simply returns a non-monadic value (something you can't chain more operations to). There is arguably a better way to write a chain of "processing" operations as shown below. 

Functor 

Before we can say that the list lambda is a monad constructor, we have to show that it is a functor. I.e., fmap must be written for the inner closure. Note that "functor" is a category theoretic term. It has no direct correlation with a C++ functor (i.e., a function object) 

The list lambda above serves as the creator of the functor from a parameter pack---essentially it serves as the "return" in Haskell. That created functor keeps the parameter-pack with itself (capture) and it allows access to it provided you give a callable that accepts a variable number of arguments. Note that the callable is called EXACTLY-ONCE. 

Lets write fmap for such a functor. 
1
2
3
auto fmap = [](auto func) {
        return [=](auto ...z) { return list(func(z)...); };
    };
The type of the func must be (a -> b). I.e., in C++ speak, 
1
2
template <class a, class b>
b func(a);
The type of fmap is "fmap: (a -> b) -> list[a] -> list[b]" I.e., in C++ speak, 
1
2
template <class Func, class a, class b>
list<b> fmap(Func, list<a>);
I.e., fmap simply maps list-of-a to a list-of-b. 
Now you can do 
1
2
3
4
5
auto twice = [](auto i) { return 2*i; };
auto print = [](auto i) { std::cout << i << " "; return i;};
list(1, 2, 3, 4)
    (fmap(twice))
    (fmap(print)); // prints 2 4 6 8 on clang (g++ in reverse)
Therefore, it is a functor. 

Monad 

Now, lets try to write a flatmap (a.k.a. bind, selectmany) 

Type of flatmap is "flatmap: (a -> list[b]) -> list[a] -> list[b]" 

I.e., given a function that maps a to a list-of-b and a list-of-a, flatmap return a list-of-b. Essentially, it takes each element from list-of-a, calls func on it, receives (potentially empty) list-of-b one-by-one, then concatenates all the list-of-b, and finally returns the concatenated list-of-b. 

Here's an implementation of flatmap for list. 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
auto concat = [](auto l1, auto l2) {
    auto access1 = [=](auto... p) {
      auto access2 = [=](auto... q) {
        return list(p..., q...);
      };
      return l2(access2);
    };
    return l1(access1);
};
 
template <class Func>
auto flatten(Func)
{
  return list();
}
 
template <class Func, class A>
auto flatten(Func f, A a)
{
  return f(a);
}
 
template <class Func, class A, class... B>
auto flatten(Func f, A a, B... b)
{
  return concat(f(a), flatten(f, b...));
}
 
auto flatmap = [](auto func) {
  return [func](auto... a) { return flatten(func, a...); };
};
Now you can do a lot of powerful things with a list. For example, 
1
2
3
4
5
6
auto pair  = [](auto i) { return list(-i, i); };
auto count = [](auto... a) { return list(sizeof...(a)); };
list(10, 20, 30)
    (flatmap(pair))
    (count)
    (fmap(print)); // prints 6.
The count function is a monad-perserving operation because it returns a list of single element. If you really want to get the length (not wrapped in a list) you have to terminate the monadic chain and get the value as follows. 
1
2
3
4
auto len = [](auto ...z) { return sizeof...(z); };
std::cout << list(10, 20, 30)
                 (flatmap(pair))
                 (len); // prints 6
If done right, the collection pipeline pattern (e.g., filter, reduce) can now be applied to C++ parameter-packs. So lets try to do that but first, lets verify the monad laws.

Monad Laws 

Let's make sure the list monad satisfies all three monad laws
1
2
3
4
5
6
7
8
9
10
11
12
auto to_vector = [](auto... a) { return std::vector<int> { a... }; };
 
auto M = list(11);
std::cout << "Monad law (left identity)\n";
assert(M(flatmap(pair))(to_vector) == pair(11)(to_vector));
     
std::cout << "Monad law (right identity)\n";
assert(M(flatmap(list))(to_vector) == M(to_vector));
     
std::cout << "Monad law (associativity)\n";
assert(M(flatmap(pair))(flatmap(pair))(to_vector) ==
       M(flatmap([=](auto x) { return pair(x)(flatmap(pair)); }))(to_vector));
All assertions are satisfied. 

Collection Pipeline 

Although the above list lambda is provably a monad and shares characteristics of the proverbial list-monad, it is quite unpleasant to work with as a collection pipeline. Especially because the behavior of a common collection pipeline combinator filter (a.k.a where) does not meet common expectations. 

The reason is just how C++ lambdas work. Each lambda expression produces a function object of a unique type. Therefore, list(1,2,3) produces a type that has nothing to do with list(1) and an empty list, which in this case would be list(). 

The straight-forward implementation of `where` fails compilation because in C++ a function can not return two different types. 
1
2
3
4
5
auto where_broken = [](auto func) {
  return flatmap([func](auto i) {
      return func(i)? list(i) : list(); // broken :-(
  });
};
In the above implementation, func returns a boolean. It's a predicate that says true or false for each element. The ?: operator does not compile because the types of list(i) and list() (empty list) are different. 

So, a different trick can be used to allow continuation of the collection pipeline. Instead of actually filtering the elements, they are simply flagged as such---and that's what makes it unpleasant. 
1
2
3
4
5
auto where_unpleasant = [](auto func) {
  return [=](auto... i) {
      return list(std::make_pair(func(i), i)...);
  };
};
The where_unpleasant gets the job done but unpleasantly... For example, this is how you can filter negative elements. 
1
2
3
4
5
6
7
8
9
10
auto positive = [](auto i) { return i >= 0; };
auto pair_print = [](auto pair) {
  if(pair.first)
     std::cout << pair.second << " ";
  return pair;
};
list(10, 20)
    (flatmap(pair))
    (where_unpleasant(positive))
    (fmap(pair_print)); // prints 10 and 20 in some order


Heterogeneous Tuples 

So far the discussion was about homogeneous tuples. Now lets generalize it to true tuples. Note that fmap, flatmap, where take only one callback lambda. To provide multiple lambdas each working on one type, we can overload them. For example, 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
template <class A, class... B>
struct overload : overload<A>, overload<B...> {
  overload(A a, B... b)
      : overload<A>(a), overload<B...>(b...)
  {} 
  using overload<A>::operator ();
  using overload<B...>::operator ();
};
  
template <class A>
struct overload<A> : A {
  overload(A a)
      : A(a) {}
  using A::operator();
};
 
template <class... F>
auto make_overload(F... f) {
  return overload<F...>(f...);  
}
 
auto test =
   make_overload([](int i) { std::cout << "int = " << i << std::endl; },
                 [](double d) { std::cout << "double = " << d << std::endl; });
test(10); // int
test(9.99); // double   
Let's use the overloaded lambda technique to process a heterogeneous tuple continuator. 
1
2
3
4
5
6
auto int_or_string =
        make_overload([](int i) { return 5*i; },
                      [](std::string s) { return s+s; });
    list(10, "20")
        (fmap(int_or_string))
        (fmap(print)); // prints 50 and 2020 (gcc in reverse)
Finally, Here is the complete live example. For more relevant reading, also see the lambda-over-lambda

P.S. Why is the order of output not the same across compilers? The order of variadic pack expansion is defined in the standard which corresponds to the original order of the pack. The order of evaluating function argument expressions is, however, not standardized. For example, checkout the implementation of fmap. func(z) is called as many time as there are arguments. However, the order in which multiple calls to func are evaluated is not guaranteed. As the calls to func print the values out to the console, the output is unpredictable across compilers. See more discussion on reddit/r/cpp.

没有评论:

发表评论