cGA, Parallelism, Processes, and Erlang

Back in Fall 2006 I was lucky to be at the right place, at the right time. Kumara Sastry and David E. Goldberg were working to pulverize some preconceptions about how far you could scale genetic algorithms. As I said, I was lucky I could help the best I could. It turned out that the answer was pretty simple, as far as you want. The key to that result was, again, built on Georges Harik’s compact genetic algorithm. The results were published on a paper titled Toward routine billion-variable optimization using genetic algorithms if you are curious.

Anyway, back on track. A few days ago, I was playing with Erlang and I coded, just for fun, yet another cGA implementation, now in Erlang. The code was pretty straight forward, so why not take another crack at it and write an Erlang version that uses some of the ideas we used on that paper.

The idea we used on the paper was simple. Slice the probabilistic model into smaller segments and update all those model fragments in parallel. The only caveat, if you go over the cGA model, is that you need the evaluation of two individuals to decide which way to update the model. Also, you need to know when to stop, or when your global model has converged. The flow is pretty simple:

  1. Sample in parallel two individuals.
  2. Compute the partial evaluation (in the example below the beloved OneMax).
  3. Emit the partial evaluations.
  4. Collect the partial evaluation, and compute the final fitness.
  5. Rebroadcast the final evaluation to all model fragments.
  6. With the final evaluations at hand, just update the model fragments.
  7. Compute if the local fragment of the model has converged and emit the outcome.
  8. With all the partial convergence checks, decide if the global model has globally converged.
  9. If the global model has not converged, continue to (1).

The implementation below is quite rough. It could be cleaned up using functional interfaces to hide all the message passing between processes, but you get the picture. Also, if you look at the implementation below, you may find that the way global fitness and convergence are computed have only one process serializing each those request. You may remember Amdhal’s law, not a big problem with a few thousand model fragments, but as you scale up you are going to eventually have to worry about. For instance, you could improve it, for instance, by using a broadcast tree. Anyway, let’s put all those a side for now, and do a simple implementation to get the ball rolling.

-module(pcga).
-export([one_max/1, cga/6, accumulator/4, has_converged/3, cga_loop/8, time/4]).
 
 % Accumulates the partial evaluations.
accumulator(Pids, Values1, Values2, Groups) when length(Pids) == Groups ->
  Acc1 = lists:sum(Values1),
  Acc2 = lists:sum(Values2),
  lists:map(fun(P) -> P ! {final_eval, Acc1, Acc2} end, Pids),
  accumulator([], [], [], Groups);
accumulator(Pids, Values1, Values2, Groups) when length(Pids) < Groups ->
  receive
    {eval, Pid,	Value1, Value2} ->
        accumulator([Pid | Pids], [Value1 | Values1], [Value2 | Values2], Groups);
    stop -> ok
  end.

% Convergence checker.
has_converged(Pids, Votes, Groups) when length(Pids) == Groups ->
  FinalVote = lists:sum(Votes),
  lists:map(fun(P) -> P ! {final_converged, FinalVote == Groups} end, Pids),
  has_converged([], [], Groups);
has_converged(Pids, Votes, Groups) when length(Pids) < Groups ->
  receive
    {converged, Pid, Vote} ->
      has_converged([Pid | Pids], [Vote | Votes], Groups);
    stop -> ok
  end.

% OneMax function.
one_max(String) -> lists:sum(String).
 
% Generates random strings of length N given a Model.
random_string(Model) ->
  lists:map(fun (P) -> case random:uniform() < P of true -> 1; _ -> 0 end end,
            Model).
 
% Generates a random population of size Size and strings of length N.
initial_model(N) -> repeat(N, 0.5, []).
 
% Given a pair of evaluated strings, returns the update values.
update({_, Fit}, {_, Fit}, N, _) ->
  repeat(N, 0, []);
update({Str1, Fit1}, {Str2, Fit2}, _, Size) ->
  lists:map(fun ({Gene, Gene}) -> 0;
                ({Gene1, _}) when Fit1 > Fit2 -> ((Gene1 * 2) - 1) / Size;
                ({_, Gene2}) when Fit1 < Fit2 -> ((Gene2 * 2) - 1) / Size
            end,
            lists:zip(Str1, Str2)).

% Check if the model has converged.
converged(Model, Tolerance) ->
  lists:all(fun (P) -> (P < Tolerance) or (P > 1 - Tolerance) end, Model).

% The main cGA loop.
cga(N, GroupSize, Groups, Fun, Tolerance, Print) 
  when N > 0, GroupSize > 0, Groups > 0, Tolerance > 0, Tolerance < 0.5 ->
  Acc = spawn(pcga, accumulator, [[], [], [], Groups]),
  Con = spawn(pcga, has_converged, [[], [], Groups]),
  lists:foreach(
    fun(_) ->
      spawn(pcga, cga_loop, 
            [N, GroupSize, Fun, initial_model(GroupSize), Tolerance, Acc, Con, Print])
    end,
    repeat(Groups, 1, [])).
 
cga_loop(N, Size, Fitness, Model, Tolerance, Acc, Con, Print) ->
  [{Str1, P1}, {Str2, P2} | _] = lists:map(
    fun (_) -> Str = random_string(Model), {Str, Fitness(Str)} end,
    [1,2]),
  Acc ! {eval, self(), P1, P2},
  receive
    {final_eval, FF1, FF2} ->
      NewModel = lists:map(fun ({M, U}) -> M + U end,
      lists:zip(Model, update({Str1, FF1}, {Str2, FF2}, Size, Size))),
      case converged(NewModel, Tolerance) of
        true -> Con ! {converged, self(), 1};
        false ->  Con ! {converged, self(), 0}
      end,
      receive
        {final_converged, true} -> 
          case Print of 
            true -> io:fwrite("~p\n", [NewModel]);
            _ -> true
          end,
          Acc ! Con ! stop;
        {final_converged, false} -> 
          cga_loop(N, Size, Fitness, NewModel, Tolerance, Acc, Con, Print)
      end
  end.

The code above allows you to decide how many model fragments (Groups) you are going to create. Each fragment is assigned to a process. Each fragment has GroupSize variable of the model and N is the population size. A simple example on how to run the code:

c(pcga).
pcga:cga(50000, 500, 10, fun pcga:one_max/1, 0.01, true).

The model will contain 50,000 variables split into 10 process each of each containing a fragment of 50 variables. I guess now the only thing left is measure how this scales.