Erlang trace追踪调用
核心函数 erlang:trace/3 和 erlang:trace_pattern/2-3
erlang:trace/3
最基本的跟踪工具,能够使当前进程跟踪指定(其他进程,端口等)的内容。
erlang:trace(PidPortSpec, OnOffBoolean, FlagList) -> integer()
PidPortSpec = pid() | port() | all | processes | ports | existing | existing_process | existing_ports | new | new_process | new_ports
参数1一般来说只需要用到这几个
all
跟踪所有进程和Portprocess
这跟踪进程new
只跟踪新建的进程和Portexisting
只跟踪当前已存在的进程和Port
参数2是boolean值,true
为打开跟踪,false
为关闭跟踪。
参数3 FlagList是标志开关,用来控制如何进行跟踪的:
all | send | 'receive' | procs | ports | call | arity |
return_to | silent | running | exiting | running_procs |
running_ports | garbage_collection | timestamp |
cpu_timestamp | monotonic_timestamp |
strict_monotonic_timestamp | set_on_spawn |
set_on_first_spawn | set_on_link | set_on_first_link |
{tracer, pid() | port()} |
{tracer, module(), term()}
选项之间可以组合使用。文档trace/3有每个选项的含义。比较重要的几个分别为:
all
组合除了tracer
和cpu_timestamp
外所有的跟踪选项。send
,receive
跟踪发送和接收的消息。garbage_collection
跟踪垃圾回收情况。set_on_spawn
和set_on_link
让让已经被跟踪的进程创建出的进程或者link的进程也能被跟踪。{tracer, Pid}
指定一个进程去接收跟踪时收到的消息。当不想把信息全发给shell时可以这样干。{tracer, Module, State}
指定一个跟踪函数,用来处理跟踪得到的信息,可以参考erl_tracer编写的跟踪模块。
erlang:trace/3
是可以单独使用的,但是如果我们想跟踪函数,我们就需要使用erlang:trace_pattern/2-3
来告诉Erlang那些函数是我们想要跟踪的。erlang:trace/3
用来指定那些进程或者Port会被跟踪,而erlang:trace_pattern/2-3
是用来指定那些函数是我们要跟踪的。跟踪事件只会在被跟踪的进程调用被跟踪的函数时才会发生。巨日参数如下:
% 模块下所有名为Function的函数
{Module,Function,'_'}
% 模块下所有函数
{Module,'_','_'}
% 所有模块下的全部函数
{'_', '_', '_'}
参数2可以为true或者false用来开启或者关闭跟踪或者是符合match spec规范的模式跟踪。我们稍后再讨论match spec。第3个参数和模式跟踪有关系,这里我们只关注最重要的两个:
- global 跟踪所有通过模块名显示调用函数的情况,例如
trace:test_multiply(3, 4)
- local 跟踪所有调用该函数的场景,包括不使用模块名的情况,例如
test_multiply(3, 4)
其他请移步trace_pattern/3
%% 跟踪所有进程的call操作
1> erlang:trace(all, true, [call]).
393
%% 跟踪math模块的所有函数
2> erlang:trace_pattern({math, '_', '_'}, true, [local]).
27
%% 启动进程去调用
3> spawn(fun() -> math:pow(10, 2) end).
<0.5475.0>
4> spawn(fun() -> math:pow(10, 2) end).
<0.5477.0>
%% 查看进程消息
5> flush().
Shell got {trace,<9132.5475.0>,call,{math,pow,[10,2]}}
Shell got {trace,<9132.5477.0>,call,{math,pow,[10,2]}}
ok
%% 停止跟踪
7> erlang:trace(all, false, [call]).
ok
%% 停止跟踪后再调用
8> spawn(fun() -> math:pow(10, 2) end).
9> spawn(fun() -> math:pow(10, 2) end).
%% 没有内容
10> flush().
ok
%% 重启跟踪进程,因为没有关闭函数跟踪,所以不需要重启函数跟踪
11> erlang:trace(all, false, [call]).
12> spawn(fun() -> math:pow(10, 2) end).
13> spawn(fun() -> math:pow(10, 2) end).
14> flush().
Shell got {trace,<9132.5558.0>,call,{math,pow,[10,2]}}
Shell got {trace,<9132.5560.0>,call,{math,pow,[10,2]}}
ok
erlang:trace/3 和 erlang:trace_pattern/2-3的另一个例子
代码就不展示了,可以下载这个模块: 测试用的trace.erl
start_test_tracer() ->
spawn(fun() -> ?MODULE:test_tracer_loop() end).
test_tracer_loop() ->
receive
{trace, Pid, send, Msg, To} ->
% 处理send
io:format("Process ~p sent message ~p to process ~p~n", [Pid, Msg, To]);
{trace, Pid, 'receive', Msg} ->
% 处理receive
io:format("Process ~p received message ~p~n", [Pid, Msg]);
{trace, Pid, return_from, {Module, Function, Arity}, ReturnValue} ->
% 处理返回值
io:format("Function ~p:~p/~p returned ~p in process ~p~n",
[Module, Function, Arity, ReturnValue, Pid]);
{trace, Pid, call, {Module, Function, Arguments}} ->
% 处理调用
io:format("Function ~p:~p invoked with arguments ~p in process ~p~n",
[Module, Function, Arguments, Pid]);
Msg ->
io:format("Received unexpected message ~p~n", [Msg])
end.
跟踪测试
% 启动跟踪模块
1> Pid = trace:start_test_tracer().
% 启动算数服务器,此时我们还没有启动跟踪器
2> ServerPid = trace:start_test_loop_server().
% 开始进行跟踪
3> erlang:trace(ServerPid, true, [call, send, 'receive', {tracer, Pid}]).
% 设置模式跟踪
4> erlang:trace_pattern({trace, '_', '_'}, [{'_', [], [{return_trace}]}], [local]).
% 执行算数服务器中的函数
5> trace:test_multiply(ServerPid, 4, 3).
Process <0.89.0> received message {<0.85.0>,{test_multiply,4,3}}
Function trace:test_multiply invoked with arguments [4,3] in process <0.89.0>
12
Function trace:'-test_multiply/2-fun-0-' invoked with arguments [1,0,4] in process <0.89.0>
Function trace:'-test_multiply/2-fun-0-'/3 returned 4 in process <0.89.0>
Function trace:'-test_multiply/2-fun-0-' invoked with arguments [2,4,4] in process <0.89.0>
Function trace:'-test_multiply/2-fun-0-'/3 returned 8 in process <0.89.0>
Function trace:'-test_multiply/2-fun-0-' invoked with arguments [3,8,4] in process <0.89.0>
Function trace:'-test_multiply/2-fun-0-'/3 returned 12 in process <0.89.0>
Function trace:test_multiply/2 returned 12 in process <0.89.0>
Process <0.89.0> sent message 12 to process <0.85.0>
Function trace:loop invoked with arguments [] in process <0.89.0>
% 关闭跟踪
6> erlang:trace(all, false, [call, send, 'receive']).
Recon
Recon是Fred Hebert大神开发的一个开源库,其中包含很多模块用来对Erlang系统的生产环境进行诊断。其中的一个模块就是recon_trace,它的文档很好的概括了它的功能。
recon_trace可以在生产环境中安全跟踪单一Erlang节点上的函数调用,其中功能包括:
· 比dbg或者trace BIFs更易使用的接口
· 防呆设计(例如不小心跟踪了整个节点上所有的调用)
· 使用的跟踪事件计数或者限速进行安全保护
· 更好的日志格式
· recon_trace
用rebar3
可以很简单的使用recon,只需要加入的deps
配置
{deps, [{recon, "2.5.1"}]}.
核心函数 recon_trace:calls/2-3
1> recon_trace:calls({trace, '_', fun(_) -> return_trace() end}, 100, [{scope, local}]).
%% Erlang shell 调用,默认输出为 时间 调用者PID,Module:function + 各种信息。
2> trace:test().
17:23:56.767032 <0.725.0> trace:test()
17:23:56.767126 <0.725.0> trace:test/0 --> "I am Test"
"I am Test"
3> self().
<0.725.0>
4> spawn(fun() -> trace:test()end).
17:24:31.623102 <0.732.0> trace:test()
<0.732.0>
17:24:31.623191 <0.732.0> trace:test/0 --> "I am Test"
%% 停止监控
5> recon_trace:clear().
opt参数
-type pidspec() :: all | existing | new | recon:pid_term().
[{pid, pidspec() | [pidspec(), ...]} |
{timestamp, formatter | trace} |
{args, args | arity} |
{io_server, pid() | atom()} |
{formatter, formatterfun()} |
return_to |
{return_to, boolean()} |
{scope, global | local}].
pid
与erlang:trace
的pid一样args
控制监控什么类型的参数,可以是0-255的数字,表示监控参数为x个参数的函数,或者match才匹配例如:’_’。又或者是一个过滤函数自定义想要监控的参数。formatter
自定义输出格式的函数,建议从recon_trace:format/1里面抄一下然后再自行改动。scope
只能是global
和local
,与erlang:trace_pattern/3
的global
和local
一样。- 其他更多参数解释
recon_trace
的使用非常简单,几乎只需要用到有2个接口calls/2-3
和clear
,分别负责启动监控和停止监控。不过我个人觉得输出有点丑,而且停止监控一下只能停止全部有点傻逼。所以自己写了一个模块处理一下。
-module(tracer).
-include_lib("stdlib/include/ms_transform.hrl").
%% API
-export([
watch/1,
watch/2,
watch/3,
watch/4,
clear/0,
clear/1,
clear/2
]).
%% ===================== test recon_trace =====================================
watch(Mod) ->
watch(Mod, '_', all, {50, 100}).
watch(Mod, Max) when is_integer(Max) ->
watch(Mod, '_', all, Max);
watch(Mod, {_, _}=Max) ->
watch(Mod, '_', all, Max);
watch(Mod, Func) ->
watch(Mod, Func, all, {50, 100}).
watch(Mod, Func, {_, _}=Max) ->
watch(Mod, Func, all, Max);
watch(Mod, Func, Max) when is_integer(Max) ->
watch(Mod, Func, all, Max);
watch(Mod, Func, PID) when is_pid(PID) orelse PID=:=all orelse PID=:=new orelse PID=:=existing ->
watch(Mod, Func, PID, {50, 100}).
watch(Mod, Func, PID, Max) ->
recon_trace:calls({Mod, Func, match_spec()}, Max, [{scope, local}, {formatter, fun format/1}, {pid, PID}]).
match_spec() ->
dbg:fun2ms(fun(_) -> message(caller()), return_trace(), exception_trace() end).
clear() ->
recon_trace:clear().
clear(Mod) ->
erlang:trace_pattern({Mod,'_','_'}, false, [local,meta,call_count,call_time]),
erlang:trace_pattern({Mod,'_','_'}, false, []),
ok.
clear(Mod, Func) ->
erlang:trace_pattern({Mod,Func,'_'}, false, [local,meta,call_count,call_time]),
erlang:trace_pattern({Mod,Func,'_'}, false, []),
ok.
%% copy from recon_trace:format/1
format(TraceMsg) ->
{Type, Pid, {_, {Hour,Min,Sec}}, TraceInfo} = extract_info(TraceMsg),
{HeadStr, FormatStr, FormatArgs} = case {Type, TraceInfo} of
%% {trace, Pid, 'receive', Msg}
{'receive', [Msg]} ->
{"==Recv=======", "< ~99999p", [Msg]};
%% {trace, Pid, send, Msg, To}
{send, [Msg, To]} ->
{"==Send=======", " > ~99999p: ~99999p", [To, Msg]};
%% {trace, Pid, send_to_non_existing_process, Msg, To}
{send_to_non_existing_process, [Msg, To]} ->
{"==Send=======", " > (non_existent) ~99999p: ~99999p", [To, Msg]};
%% {trace, Pid, call, {M, F, Args}}
{call, [{M,F,Args}]} ->
{"==Call=======", "~99999p:~99999p/~99999p <-- ~99999p", [M,F,erlang:length(Args),Args]};
%% {trace, Pid, call, {M, F, Args}, Msg}
{call, [{M,F,Args}, Msg]} ->
{"==Call=======", "~99999p:~99999p/~99999p <-- ~99999p @~99999p", [M,F,erlang:length(Args),Args, Msg]};
%% {trace, Pid, return_to, {M, F, Arity}}
{return_to, [{M,F,Arity}]} ->
{"==Return To==", " '--> ~99999p:~99999p/~99999p", [M,F,Arity]};
%% {trace, Pid, return_from, {M, F, Arity}, ReturnValue}
{return_from, [{M,F,Arity}, Return]} ->
{"==Return=====", "~99999p:~99999p/~99999p --> ~99999p", [M,F,Arity, Return]};
%% {trace, Pid, exception_from, {M, F, Arity}, {Class, Value}}
{exception_from, [{M,F,Arity}, {Class,Val}]} ->
{"==Exception!=", "~99999p:~99999p/~99999p {~99999p,~99999p}", [M,F,Arity, Class, Val]};
%% {trace, Pid, spawn, Spawned, {M, F, Args}}
{spawn, [Spawned, {M,F,Args}]} ->
{"==Spawn======", "spawned ~99999p as ~99999p:~99999p~99999p", [Spawned, M, F, Args]};
%% {trace, Pid, exit, Reason}
{exit, [Reason]} ->
{"==Exit!======", "EXIT ~99999p", [Reason]};
%% {trace, Pid, link, Pid2}
{link, [Linked]} ->
{"==Link=======", "link(~99999p)", [Linked]};
%% {trace, Pid, unlink, Pid2}
{unlink, [Linked]} ->
{"==Unlink=====", "unlink(~99999p)", [Linked]};
%% {trace, Pid, getting_linked, Pid2}
{getting_linked, [Linker]} ->
{"==Get Link===", "getting linked by ~99999p", [Linker]};
%% {trace, Pid, getting_unlinked, Pid2}
{getting_unlinked, [Unlinker]} ->
{"==Get Unlink=", "getting unlinked by ~99999p", [Unlinker]};
%% {trace, Pid, register, RegName}
{register, [Name]} ->
{"==Register===", "registered as ~99999p", [Name]};
%% {trace, Pid, unregister, RegName}
{unregister, [Name]} ->
{"==Register===", "no longer registered as ~99999p", [Name]};
%% {trace, Pid, in, {M, F, Arity} | 0}
{in, [{M,F,Arity}]} ->
{"==In=========", "scheduled in for ~99999p:~99999p/~99999p", [M,F,Arity]};
{in, [0]} ->
{"==In=========", "scheduled in", []};
%% {trace, Pid, out, {M, F, Arity} | 0}
{out, [{M,F,Arity}]} ->
{"==Out========", "scheduled out from ~99999p:~99999p/~99999p", [M, F, Arity]};
{out, [0]} ->
{"==Out========", "scheduled out", []};
%% {trace, Pid, gc_start, Info}
{gc_start, [Info]} ->
HeapSize = proplists:get_value(heap_size, Info),
OldHeapSize = proplists:get_value(old_heap_size, Info),
MbufSize = proplists:get_value(mbuf_size, Info),
{"==GC Start===", "gc beginning -- heap ~99999p bytes",
[HeapSize + OldHeapSize + MbufSize]};
%% {trace, Pid, gc_end, Info}
{gc_end, [Info]} ->
HeapSize = proplists:get_value(heap_size, Info),
OldHeapSize = proplists:get_value(old_heap_size, Info),
MbufSize = proplists:get_value(mbuf_size, Info),
{"==GC End=====", "gc finished -- heap ~99999p bytes",
[HeapSize + OldHeapSize + MbufSize]};
_ ->
{"==Unknown====", "unknown trace type ~99999p -- ~99999p", [Type, TraceInfo]}
end,
io_lib:format(HeadStr ++ "~99999p:~99999p:~9.6.0f ~99999p " ++ FormatStr ++ "~n",
[Hour, Min, Sec, Pid] ++ FormatArgs).
extract_info(TraceMsg) ->
case tuple_to_list(TraceMsg) of
[trace_ts, Pid, Type | Info] ->
{TraceInfo, [Timestamp]} = lists:split(length(Info)-1, Info),
{Type, Pid, to_hms(Timestamp), TraceInfo};
[trace, Pid, Type | TraceInfo] ->
{Type, Pid, to_hms(os:timestamp()), TraceInfo}
end.
to_hms(Stamp = {_, _, Micro}) ->
{{Y, Mo, D},{H, M, Secs}} = calendar:now_to_local_time(Stamp),
Seconds = Secs rem 60 + (Micro / 1000000),
{{Y,Mo,D}, {H,M,Seconds}};
to_hms(_) ->
{{0,0,0}, {0,0,0}}.
Erlang中其它跟踪调试模块
这里依然还有好几个Erlang自身携带的可以用于跟踪调试的模块。我在这篇博文中并不打算详细介绍,也许将来的博文会介绍他们。他们分别是
dbg Erlang自带的跟踪模块。dbg是一个字符界面的跟踪器,通常它只是将采集到的数据发送到shell中。它是runtime_tools
应用的一部分,如果我们想在应用中使用dbg,就需要确保我们已经包含了runtime_tools
。
et 事件跟踪。用于记录任意事件。可以使用记录的事件生成序列。
ttb 跟踪构建工具。用于在分布式系统上创建跟踪。依赖dbg并且提供了很多简介的方法创建跟踪,并在分布式系统上进行跟踪。
dyntrace 这是个捐赠模块,它提供了使用类似dtrace
这种外部工具来进行动态跟踪的接口。
seq_trace 用于跟踪进程间消息传输的跟踪模块。
erl_tracer 用与实现自己跟踪器的behavior。
跟踪调试库
还有很多开源第三方的调试和跟踪调试库。这里介绍几个比较知名的库。
redbug 一个和Erlang跟踪函数交互的工具。它具有一些内建的安全特性,如只使用跟踪函数功能的安全子集,当数据量过大的时候自动退出,并且可以在跟踪一段时间或接收了一定数量的跟踪事件后自动退出。
一个使用Erlang跟踪BIF且具有界面的Erlang,Elixir和LFE调试器。