谷歌面试真题:支持“转写映射”的字符串匹配

30次阅读
没有评论
# String Matching With Transliteration
# You may or may not know that Google Docs has a feature to find and replace even for characters 
# not in the standard English alphabet ([A-Z]). To do this, we first need a transliteration map, e.g.

# Keys are single non-standard-English characters, values are
# strings consisting of standard English characters.

tl_map = {
    "ë": "e",
    "é": "e",
    "û": "u",
    "æ": "ae",
    "ž": "zed",   # Made up, meant as a harder case for testing.
    ...
}

1. Given a haystack string and a needle string and the transliteration mapping,
   return whether the needle is matched within the haystack.

Example 1:
haystack: "I love crème brûlée"
needle:   "eme"
return:   True

Example 2:
haystack: "I love crème brûlée"
needle:   "ême"
return:   True

Example 3:
haystack: "Ole Gunnar Solskjær is amazing"
needle:   "Solskja"
return:   True

Example 4:
haystack: "Ole Gunnar Solskjær is amazing"
needle:   "Solskjear"
return:   False

这道谷歌字符串题的核心在于:匹配时必须先根据给定的转写 tl_map 把所有“非标准英文字母”转换成等价的英文串。例如“crème”中的“è/é/ë”等都要转成“e”,而“æ”要转成“ae”。needle 与 haystack 都需要先(或在匹配时)被正确转写,然后再做子串匹配。因为映射可能是“一字符对应多字符”,要特别注意长度变化、边界处理和高效实现方式(例如预处理成新字符串,或在匹配过程中按需展开)。

正文完
 0