Amazon VO Interview Question: Median of a Data Stream

19 Views
No Comments

Given a data stream of CPU utilization values as doubles: 10.0, 20.5, 35.0, 45.0, 20.0, 19.0, 18.5, …

Find the median of the data stream.

The expectation is to write a class as below:

median_cpu_util = MedianCPUUtilization()
median_cpu_util.add(10.0)
median_cpu_util.add(20.5)
median_cpu_util.add(35.0)
median_cpu_util.get_median() => 20.5
median_cpu_util.add(45.0)
median_cpu_util.get_median() => ??
median_cpu_util.add(20.0)
median_cpu_util.get_median() => 20.5
median_cpu_util.add(21.0)
median_cpu_util.get_median() => (20.5 + 21) / 2

Java

class MedianCPUUtilization {void add() { }

    double getMedian() {}
}

This problem asks you to design a data structure that supports incremental insertion of CPU utilization values and fast median queries. The standard solution uses two heaps to keep the lower and upper halves of the stream balanced, allowing each new value to be inserted in logarithmic time while the median can be returned in constant time. It is a classic online data stream question that tests heap usage, balancing logic, and handling both odd and even counts correctly.

END
 0