Free Tracing and Event Logging with Sys

2 minute read

In Erlang/OTP both the server and statem behaviours have builtin tracing and event logging. No additional code required. Here is how.

sys:trace

To enable tracing on any server or statem using sys:trace/2:

sys:trace(Pid, true).

Lets look at running tracing on a simple statem as follows:

-module(simple).


-export([add/2]).
-export([callback_mode/0]).
-export([handle_event/4]).
-export([init/1]).
-export([multiply/2]).
-export([start/0]).


start() ->
    gen_statem:start({local, ?MODULE}, ?MODULE, [], []).


add(X, Y) ->
    gen_statem:call(?MODULE, {?FUNCTION_NAME, X, Y}).


multiply(X, Y) ->
    gen_statem:call(?MODULE, {?FUNCTION_NAME, X, Y}).


init([]) ->
    {ok, ready, #{}}.

callback_mode() ->
    handle_event_function.

handle_event({call, From}, {add, X, Y}, _, _) ->
    {keep_state_and_data, {reply, From, X + Y}}.

Note that simple is being registered in the local registry, so we will use simple in the following rather than the PID (either will work!).

Lets start the process up and running:

1> {ok, _} = simple:start().

Without tracing:

2> simple:add(2, 3).
5

With tracing:

4> sys:trace(simple, true).
ok

5> simple:add(5, 6).
*DBG* simple receive call {add,5,6} from <0.1486.0> in state ready
*DBG* simple send 11 to <0.1486.0>
*DBG* simple consume call {add,5,6} from <0.1486.0> in state ready
11

As simple as that, we can dynamically switch on (and off) tracing. A description of the messages traced can be found in the manual page. If you want to quickly understand what a process is doing, without enabling code level tracing for all processes, switching on sys:trace/2 can be a really nice option.

sys:log

To enable event logging on any server or statem using sys:log/2:

sys:log(Name, true).

In the event of the process crashing, the last 10 (default) events processed will be logged. Lets turn off tracing, and switch on logging - note that we didn’t implement multiply in the above code, so (spoiler!) crash ahead:

6> sys:trace(simple, false).
ok
7> sys:log(simple, true).
ok
8> simple:add(21, 21).
42
9> simple:multiply(5, 6).

Lets look at the crash log:


** State machine simple terminating
** Last event = {{call,{<0.1486.0>,#Ref<0.2417936041.2661548033.194517>}},
                 {multiply,5,6}}
** When server state  = {ready,#{}}
** Reason for termination = error:function_clause
...
** Stacktrace =
**  [{simple,handle_event,
             [{call,{<0.1486.0>,#Ref<0.2417936041.2661548033.194517>}},
              {multiply,5,6},
              ready,#{}],
...
** Log =
**  [{in,{{call,{<0.1486.0>,#Ref<0.2417936041.2661548033.180664>}},
          {add,21,21}},
         ready},
     {out,42,{<0.1486.0>,#Ref<0.2417936041.2661548033.180664>}},
     {consume,{{call,{<0.1486.0>,#Ref<0.2417936041.2661548033.180664>}},
               {add,21,21}},
              ready,ready},
     {in,{{call,{<0.1486.0>,#Ref<0.2417936041.2661548033.194517>}},
          {multiply,5,6}},
         ready}]

As well as the usual Stacktrace saying where the actual crash happened, in the Log we get a history of the last 10 events, that we can examine and understand the events that led to the crash.

Enabling sys:log/2 can really help when a crash is a bit more complex, event A, followed by B, causing a crash when C happens, and because all the information is in one place, it can be simpler than bringing all the previous events together. The main thing to watch for is the memory overhead of retaining the last N events.

To retain a different number of events with sys:log/2:

sys:log(Pid, {true, NoOfEvents}).

I think of sys:log/2 as a kind of flight recorder for processes, and it can make understanding the cause(s) of a crash much simpler.

The last parameter of start and start_link are list of start_op() which determine whether trace or log should be enabled when the process is launched.

In our simple module above:

    gen_statem:start({local, ?MODULE}, ?MODULE, [], []).

We could enable log by default, by instead having:

    gen_statem:start({local, ?MODULE}, ?MODULE, [], [{debug, [log]}]).

The above will enable log immediately on startup for all processes in that module. In many cases, this can be a good idea, providing extra “flight recorder” data into the crash log, with little overhead. The other options for can be found in the manual page.