Anthropic / VO 面试真题解析:将采样调用栈转换为 Trace 事件

19次阅读
没有评论

Sampling profilers are a performance analysis tool for finding the slow parts of your code by periodically sampling the entire call stack. In this problem, the samples will be a list of Sample objects, each containing a float timestamp and a list of function names, in order by timestamp.

  • Sample stacks contain every function that is currently executing.
  • The stacks are ordered from the outermost function (like main) to the innermost currently executing function.
  • An unlimited amount of execution can happen between samples. We do not have all the function calls that happened, only some samples we want to visualize.

Sometimes it is useful to visualize these samples on a chronological timeline of the call stack using a trace visualizer UI. Time is on the X axis and nesting is on the Y axis.

These UIs were built for instrumentation that records the start and end of functions. We want to use this to visualize the data we have as best we can, and so need to convert the sorted samples into a list of start and end events for each function call.

The returned events should be in a list order that is properly nested as if they were recorded. For example, a nested function call’s start event is after the one from the enclosing function, and the nested call’s end event is before the enclosing call’s end event.

Assume call frames in the last sample haven’t finished. The resulting events should use this type:

@dataclass
class Event:
    kind: str  # either 'start' or 'end'
    ts: float
    name: str

Implement:

def convert_to_trace(samples: List[Sample]) -> List[Event]:
    ...

Example:

samples = [Sample(7.5, ["main"]), Sample(9.2, ["main", "my_fn"]), Sample(10.7, ["main"])]

Expected result:
[Event('start', 7.5, 'main'), Event('start', 9.2, 'my_fn'), Event('end', 10.7, 'my_fn')]

这道题的核心是把按时间排序的采样调用栈,转换成适合 trace 可视化的 start/end 事件序列。由于每个 sample 给出的是从外层到内层的完整栈帧,因此可以把相邻两个栈进行对比:公共前缀保持不变,前一个 sample 多出来的内层函数需要先补上结束事件,后一个 sample 新出现的函数则需要补上开始事件。最后一个 sample 中仍在执行的函数要保留为未结束状态,不要强行补 end。整体思路非常适合用栈来做,按层比较即可得到正确的嵌套顺序。

正文完
 0