Sampling profilers are a performance analysis tool for finding the slow parts of your code by periodically sampling the entire call stack. In this problem, the samples will be a list of Sample objects, each containing a float timestamp and a list of function names, in order by timestamp.
- Sample stacks contain every function that is currently executing.
- The stacks are ordered from the outermost function (like
main) to the innermost currently executing function. - An unlimited amount of execution can happen between samples. We do not have all the function calls that happened, only some samples we want to visualize.
Sometimes it is useful to visualize these samples on a chronological timeline of the call stack using a trace visualizer UI. Time is on the X axis and nesting is on the Y axis.
These UIs were built for instrumentation that records the start and end of functions. We want to use this to visualize the data we have as best we can, and so need to convert the sorted samples into a list of start and end events for each function call.
The returned events should be in a list order that is properly nested as if they were recorded. For example, a nested function call’s start event is after the one from the enclosing function, and the nested call’s end event is before the enclosing call’s end event.
Assume call frames in the last sample haven’t finished. The resulting events should use this type:
@dataclass
class Event:
kind: str # either 'start' or 'end'
ts: float
name: str
Implement:
def convert_to_trace(samples: List[Sample]) -> List[Event]:
...
Example:
samples = [Sample(7.5, ["main"]), Sample(9.2, ["main", "my_fn"]), Sample(10.7, ["main"])]
Expected result:
[Event('start', 7.5, 'main'), Event('start', 9.2, 'my_fn'), Event('end', 10.7, 'my_fn')]
This problem asks you to reconstruct a properly nested trace from sampled call stacks. The key insight is to compare each sample with the previous one, keep the shared prefix, close any functions that disappeared, and open any newly appearing functions in stack order. A stack-based diff over adjacent samples produces the correct start/end event sequence, while functions still active in the final sample remain open.