Erlang Notes on Concurrent Programming

One of the most complicated things I encountered while learning Haskell and Erlang (Erlang in particular) was the area server example, which I needed to understand in order to apply it to a homework assignment.

First I watched this video here: https://vimeo.com/37921309

It was extremely helpful in being able to understand the concepts behind setting up a spawned process and referencing it.

Below is the code for the area_server.erl example we (the class) were provided…

-module(area_server).
-export([start/0,area/2,loop/0]).

start() ->
    spawn(area_server, loop, []).

area(Pid, What) ->
    rpc(Pid, What).

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.

loop() ->
    receive
        {From, {rectangle, Width, Height}} ->
            From ! {self(), Width * Height};
        {From, {circle, R}} ->
            From ! {self(), 3.14159 * R * R};
        {From, Other} ->
            From ! {self(), {error, Other}}
    end,
    loop().

As notes to myself and to anyone learning Erlang headfirst, here is how to compile and run this in the Erlang shell once you have Erlang installed…

In a Linux Terminal:

erl

1>c(area_server).
2>Pidz = area_server:start().
3>area_server:area(Pidz,{circle,3}).
 28.274309999999996

Tada!!!!

So what just happened?!
From within the directory that contains the area_server.erl file, run the shell command that is installed with Erlang:

erl

This starts the Erlang shell (the interactive interpreter).

The 1> is the shell prompt, and the number is a counter for the expression you are about to enter (in our case the first).
An expression isn’t finished when you press return. The shell keeps waiting for more input under the same number UNTIL you end the expression with the . symbol.

The line itself of:

1>c(area_server).

will compile the area_server.erl file.

2>Pidz = area_server:start().

This runs the function start from the area_server module, which has been compiled and loaded into the shell for use.
(Note: the area_server:start() syntax is Module:Function(), so it may be easier to think of it as a scoped reference to a function inside that module.)

It saves the return value of that function in the variable Pidz, which the Erlang shell will hold on to for us.

So what just happened with that start() function anyways?

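Basically it spawns a new process that runs the function loop() from the module area_server, passing along an empty list of arguments (loop() takes none).

The return value is the PID (Process IDentifier) of the new process, for example <1.2.3>. An Erlang process is a lightweight process managed by the Erlang virtual machine rather than an operating system thread.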
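To see what spawn hands back, here is a minimal sketch. The module name spawn_demo and the functions wait/0 and run/0 are made up for this illustration and are not part of the class example.

```erlang
-module(spawn_demo).
-export([run/0, wait/0]).

%% wait/0 blocks until it is told to stop, so the spawned process
%% stays alive long enough for us to inspect it.
wait() ->
    receive
        stop -> ok
    end.

%% run/0 spawns wait/0 and reports what spawn/3 handed back.
run() ->
    Pid = spawn(spawn_demo, wait, []),
    IsPid = is_pid(Pid),             % spawn/3 returns a pid
    Alive = is_process_alive(Pid),   % the new process exists and has not exited
    Different = Pid =/= self(),      % it is not the process that called spawn
    Pid ! stop,                      % let the helper process finish
    {IsPid, Alive, Different}.
```

Calling spawn_demo:run() after compiling should give back a tuple of three true values.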

You can check which processes are running at any time within the Erlang shell by using the built-in shell command:

>i().

This will list all of the processes with their PIDs.
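i() is meant for typing into the shell. From inside a module, one way to check the same thing is erlang:processes/0, which returns the list of PIDs. The sketch below assumes a made-up module proc_list_demo with a helper idle/0.

```erlang
-module(proc_list_demo).
-export([run/0, idle/0]).

%% idle/0 just waits for a stop message so the process stays around.
idle() ->
    receive
        stop -> ok
    end.

run() ->
    Pid = spawn(proc_list_demo, idle, []),
    %% erlang:processes/0 returns the pids of all processes on this node.
    Listed = lists:member(Pid, erlang:processes()),
    Pid ! stop,
    Listed.
```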

To reference a PID manually within the Erlang shell you’d have to use something like pid(1,2,3) to build a PID value that Erlang can use to reference the process. (pid/3 is a shell helper meant for debugging.)

So now, sitting in the background in its own process, is the loop function from our area_server module, because the start() function spawned it.

Remember, Pidz is storing the PID of that process for us, so we don’t have to worry about using the pid() function to pass it along.

As for the third line:

3>area_server:area(Pidz,{circle,3}).

This line runs the area function within the area_server module. It passes two arguments: Pidz (the PID of the loop process spawned on line 2>) and the tuple {circle,3}.

circle is an atom, which I personally remember as being something like an enumeration value. You can’t assign circle a value, and it’s not a string either. It’s a named constant whose only value is its own name. I’m not the best at explaining it perhaps, but I would suggest looking into it a little if this is confusing, since atoms are very important in Erlang.

Inside area, Pid is bound to our Pidz value and What is bound to {circle,3}. The area function simply hands both to rpc(), a function in the area_server module that isn’t exported.
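A few checks may make atoms less mysterious. This is only a sketch: the module atom_demo and the function shape_name/1 are invented for illustration.

```erlang
-module(atom_demo).
-export([run/0, shape_name/1]).

%% Atoms are mostly used as tags to pattern match on.
shape_name({circle, _R}) -> "a circle";
shape_name({rectangle, _W, _H}) -> "a rectangle";
shape_name(_Other) -> "something else".

run() ->
    [is_atom(circle),         % circle is an atom
     is_atom("circle"),       % a string is a list of characters, not an atom
     circle =:= circle,       % an atom is only equal to itself
     circle =:= rectangle,    % different names, different atoms
     atom_to_list(circle)].   % the name can be turned into a string
```

atom_demo:run() should return [true,false,true,false,"circle"].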

area(Pid, What) ->
    rpc(Pid, What).

So then within the rpc function:

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.

The first thing to note is the Pid ! {} portion.

What this means is that whatever comes after the ! character will be sent as a message to the process identified by Pid.

So our Pidz (the process we spawned running loop) will be sent the tuple {self(), Request}, where Request contains our {circle,3} tuple.

The self() function is a built-in Erlang function that returns the PID of the process that calls it. Since our Erlang shell process is the one executing the rpc function, self() returns the shell’s PID in this case.
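Here is a small sketch of that idea, assuming a made-up module send_demo. A helper process reports back which PID it saw in the first slot of the message, and that PID turns out to be the caller’s own.

```erlang
-module(send_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% The helper waits for one {Sender, Req} message and reports what it saw.
    Child = spawn(fun() ->
        receive
            {Sender, Req} -> Sender ! {seen, Sender, Req}
        end
    end),
    %% Same shape as in rpc/2: our own pid first, then the request.
    Child ! {self(), {circle, 3}},
    receive
        {seen, From, Request} -> {From =:= Parent, Request}
    end.
```

send_demo:run() should return {true,{circle,3}}.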

For our example, let’s say that the shell’s PID is: <9.9.9>

Let’s come back to the receive … end part of rpc() in a second though.

Let’s trace where our {self(),{circle,3}} (or more explicitly: {pid(9,9,9),{circle,3}}) was sent to first.

Since it was sent to Pidz, the process running loop() is what will be receiving this.

To pick up the message, loop() uses the receive syntax.

By focusing on just the relevant portions of the loop function…

loop() ->
    receive
        {From, {circle, R}} ->
            From ! {self(), 3.14159 * R * R}
    end,
    loop().

You can see the pattern matching emerge.

It expects a tuple with two elements: anything at all as the first element (bound to From), and another tuple as the second element.

That inner tuple must contain the circle atom and some value R.

This matches the {<9.9.9>, {circle,3}} tuple we sent, so the body of that clause is executed.
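The same patterns can be tried without any processes by using a case expression. This is only a sketch: match_demo and describe/1 are names made up here, and the returned atoms just say which clause matched.

```erlang
-module(match_demo).
-export([describe/1]).

%% The patterns are tried from top to bottom, just like in the receive.
describe(Message) ->
    case Message of
        {_From, {rectangle, W, H}} -> {rectangle_clause, W * H};
        {_From, {circle, R}}       -> {circle_clause, R};
        {_From, Other}             -> {catch_all_clause, Other};
        _                          -> no_clause_matches
    end.
```

For example, match_demo:describe({self(), {circle, 3}}) should return {circle_clause,3}, while a request with an unknown tag falls through to the catch-all clause.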

Right now we’re inside our spawned loop process, <1.2.3>, off in the vast distance of space (imagine), so it needs to send a message BACK to our shell process, <9.9.9>, containing the result.

So it uses From (the PID <9.9.9>) and sends (using !) the tuple {self(), 3.14159 * R * R}.

That is an approximation of the area of a circle, using 3.14159 for pi. The R value in our case is 3.

The self() in that tuple is the PID of the spawned process, and the arithmetic is evaluated before the message is sent, so what actually gets sent to our shell is:
{<1.2.3>, 3.14159 * 3 * 3}, already evaluated to 28.274309999999996

The rpc() function has a receive block that picks this up (remember, earlier I mentioned we’d come back to this):

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.

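So the tuple that loop() sent back gets matched against {Pid, Response}. Pid is already bound to the server’s PID (Pidz) at this point, so the pattern only matches a reply tagged with that same PID, which is how rpc() knows the reply came from the process it asked. Response is bound to the area, which loop() already calculated before sending it, and since Response is the last expression in rpc() it becomes the return value that the shell prints.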
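To check that the bound Pid in the pattern really filters replies, here is a sketch under some assumptions: rpc_demo is a made-up module, server/0 is a cut-down stand-in for loop() that answers a single circle request, and the not_the_server message is a stray message planted on purpose.

```erlang
-module(rpc_demo).
-export([run/0]).

%% Same shape as rpc/2 in area_server.
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

%% Stand-in for loop(): answers one circle request, then finishes.
server() ->
    receive
        {From, {circle, R}} ->
            From ! {self(), 3.14159 * R * R}
    end.

run() ->
    Server = spawn(fun server/0),
    %% A stray two-element tuple that is NOT tagged with the server's pid.
    self() ! {not_the_server, 42},
    Area = rpc(Server, {circle, 3}),
    %% rpc/2 skipped the stray message, so it is still in our mailbox.
    Stray = receive
                {not_the_server, N} -> N
            after 0 ->
                none
            end,
    {Area, Stray}.
```

The first element of the result should be the area (about 28.27) and the second should be 42, which shows the stray message was left alone instead of being mistaken for the reply.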

I don’t want to give away how to do the homework assignment, in case it’s reused in future classes, so I won’t spell out exactly how to do it.

But the key difference is that the start() function registers the spawned loop() process under a named atom.

Because of this, you don’t have to store the return value of start() in a variable like Pidz.

The only hint I’ll give is one that was given by the TA at the time, which was absolutely huge in helping me figure out how to pull this off: a catch-all example.

Within the loop function, you can have a clause at the VERY end of the receive, after all of your other clauses, that looks like:

{From, Any} ->
 From ! {db, Any}

Assume db is the atom the spawned loop() process is registered under.
The purpose of this clause is to catch anything that wasn’t matched by the other patterns (think circle and rectangle from the area_server module). Without it, an unmatched request would never get a reply, and your shell would freeze, because rpc() is waiting on a response before it does anything else.

It binds From to the shell’s PID and Any to whatever tuple (or what-have-you) wasn’t caught by any of the preceding clauses, and then runs From ! {db, Any}.

In turn, this sends back the tuple {db, Any}, where the db atom is what rpc() in that module pattern matches on in its receive (just like loop() did).

And rpc will just return whatever Any’s value was, since loop() did no processing on it.
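To show the registered-name idea and the catch-all clause together without touching the homework itself, here is a toy sketch. Everything in it is hypothetical: the module echo_demo, the registered name echo, and the {double, N} request are invented for this note.

```erlang
-module(echo_demo).
-export([start/0, stop/0, call/1, loop/0]).

%% Register the spawned process under the atom echo,
%% so callers never need to hold on to its pid.
start() ->
    register(echo, spawn(echo_demo, loop, [])),
    ok.

stop() ->
    echo ! stop,
    ok.

%% Send to the registered name and wait for a reply tagged with that name.
call(Request) ->
    echo ! {self(), Request},
    receive
        {echo, Response} -> Response
    end.

loop() ->
    receive
        stop ->
            ok;
        {From, {double, N}} ->
            From ! {echo, 2 * N},
            loop();
        {From, Any} ->              % catch-all: always answer something
            From ! {echo, Any},
            loop()
    end.
```

After echo_demo:start(), a call such as echo_demo:call({double, 5}) should return 10, and a request that no other clause understands comes straight back unchanged instead of leaving the caller waiting. Note that calling start() a second time while the process is still registered would fail, because a name can only be registered once.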

One final hint about Erlang syntax that was difficult for me to catch and learn: the use of semicolons, periods, and commas.

rpc() uses commas to run multiple expressions in succession within its body.

So run this, then run that, then run that. It runs Pid ! {self(), Request}

and then it runs receive (which waits idly, forever if need be, until a message arrives that it can handle).

The last clause of the receive in the area_server example is the Other clause. Note that it doesn’t end with a semicolon because it’s the last clause.

I never really mastered the intuition as to WHY that’s the case, just the fact that it IS the case.
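One way to think about it, as far as I understand the syntax: a comma means another expression follows, a semicolon means another clause follows, and a period ends the whole function. The last clause has nothing after it to be separated from, so it gets no semicolon. The sketch below marks each case in comments. The module punct_demo and its functions are made up for illustration.

```erlang
-module(punct_demo).
-export([sign/1, describe/1]).

sign(N) when N > 0 -> positive;   % ; another clause of sign/1 follows
sign(N) when N < 0 -> negative;   % ; and another
sign(_) -> zero.                  % . last clause, the function ends here

describe(N) ->
    Sign = sign(N),               % , another expression follows in this body
    case Sign of
        positive -> {N, "above zero"};   % ; another case clause follows
        negative -> {N, "below zero"};   % ; and another
        zero     -> {N, "zero"}          % nothing: end closes the clause list
    end.                          % . ends describe/1
```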

The trick for PUTting and GETting can be very tricky for that homework assignment. However, Hoogle might be able to help with a function that lets you PUT or GET things somewhere based on a Key, Value pair.

I hope this information can help someone, or at least myself in the future if I ever need use of Erlang again.