McKinsey / VO 面试真题解析：Data Cleaning 字符串博弈

29次阅读

In a data cleaning pipeline, engineers optimize text datasets by alternately removing substrings: Alex removes substrings with an odd number of vowels in the first step, followed by Chris removing substrings with an even number of vowels in the next step.

Alex always goes first, then the engineers take turns optimally until no valid substring remains. The goal is to determine which engineer removes the last substring.

Implement a function optimizeDataCleaning to determine the outcome of the cleaning process for each dataset, assuming alternating and optimal cleaning.

The function optimizeDataCleaning takes the following input:

string datasets[n]: each string represents a dataset

The function should return an array of strings specifying the result of the cleaning process for each dataset.

Note: Vowels are 'a', 'e', 'i', 'o', and 'u'.

Example

Given n = 2 and datasets = ["git", "dry"]

这道题本质上是一个字符串博弈：Alex 只能删除“元音数为奇数”的子串，Chris 只能删除“元音数为偶数”的子串，而且双方都采取最优策略，判断最终是谁删掉最后一个合法子串。解题时通常不需要真的模拟每一步删除，而是先分析哪些子串在当前规则下可删、谁能形成必胜态，再据此推导最终结果。关键点是字符串中元音分布对可操作子串的影响，并用博弈思维或数学归纳快速判断每个数据集的胜负。

Post Views: 33

正文完

发表至： Mckinsey VO

2026-04-27

TikTok OA 面试真题解析：Longest Consecutive Sequence（最长连续序列）

DoorDash VO 面试真题解析：Get User Bootstrap Response（聚合接口 / JSON 组装）

面经搜索

McKinsey / VO 面试真题解析：Data Cleaning 字符串博弈

联系我

友情提示