Google Log Stream Design Interview Question | Designing a Chat Log Stream Processor (Most Active User Statistics)

You're given a log stream of a chat application.

Every log entry has the following fields:
- timestamp: long – the number of seconds that have elapsed since 1970-01-01.
- sender: string – username of the sender
- receiver: string – username of the receiver
- message_text: string – the message payload

We want to implement a log stream processor which supports two methods:
- RegisterEvent(timestamp, sender_username, receiver_username, message_text)
  - Registers the event that the message has been sent.
- GetMostActiveUser()
  - Returns the username with the largest number of conversations.
  - We count any number of messages exchanged between two unique users as a single conversation.

The key point fits in one sentence:

A conversation is an unordered pair of two distinct users, not a count of messages.

In other words:

  • A and B exchange 1,000 messages
  • B and A exchange another 1,000 messages

All of this counts as just 1 conversation.
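
One way to sketch this deduplication, assuming usernames are plain strings that can be compared: normalize each (sender, receiver) pair into an order-independent key and store the keys in a set. The helper name `conversation_key` is hypothetical and not part of the problem statement.

```python
def conversation_key(sender: str, receiver: str) -> tuple:
    """Return the same key for (A, B) and (B, A) (hypothetical helper)."""
    # Sorting the two names removes the direction of the message.
    return (sender, receiver) if sender <= receiver else (receiver, sender)


# 1,000 messages from A to B, then 1,000 from B to A.
events = [("A", "B")] * 1000 + [("B", "A")] * 1000

seen = set()
for sender, receiver in events:
    seen.add(conversation_key(sender, receiver))

print(len(seen))  # 1
```

A `frozenset({sender, receiver})` would work as a key too; the sorted tuple is used here only because it is easy to print and inspect.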

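At its core, this question tests whether you can model an "unordered user pair" as a unique conversation over streaming data, and maintain in real time the user with the most conversations.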
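
A minimal sketch of such a processor follows. It is one possible approach, not a reference solution. Assumptions that the problem statement does not settle: a message from a user to themselves is ignored, ties keep whichever user reached the top count first, and `None` is returned before any conversation exists. The class name and snake_case method names are illustrative stand-ins for RegisterEvent and GetMostActiveUser.

```python
from collections import defaultdict
from typing import Optional


class LogStreamProcessor:
    def __init__(self) -> None:
        # user -> set of distinct users they have a conversation with
        self._partners = defaultdict(set)
        self._best_user: Optional[str] = None
        self._best_count = 0

    def register_event(self, timestamp: int, sender: str,
                       receiver: str, message_text: str) -> None:
        # Assumption: a self-message does not form a conversation.
        if sender == receiver:
            return
        # The pair is already known, in either direction: nothing changes.
        if receiver in self._partners[sender]:
            return
        # Record the conversation symmetrically so direction is irrelevant.
        self._partners[sender].add(receiver)
        self._partners[receiver].add(sender)
        # Counts only ever grow, so a running maximum stays correct.
        for user in (sender, receiver):
            count = len(self._partners[user])
            if count > self._best_count:
                self._best_user, self._best_count = user, count

    def get_most_active_user(self) -> Optional[str]:
        return self._best_user


processor = LogStreamProcessor()
processor.register_event(1, "alice", "bob", "hi")
processor.register_event(2, "bob", "alice", "hello")  # same conversation
processor.register_event(3, "alice", "carol", "hey")
print(processor.get_most_active_user())  # alice
```

With this layout both methods run in O(1) expected time per call, at the cost of O(number of distinct conversations) memory. The running maximum is valid only because conversations are never removed; if events could expire, a different structure (for example a heap or count buckets) would be needed.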
