Amazon OA 面试真题解析:Result(正则表达式匹配)

14次阅读
没有评论

Amazon is developing a regex matching library for NLP use cases. A prototype regex matching implementation has the following requirements:

  • The regex expression contains lowercase English letters, ., (, ), and *.
  • . matches exactly one lowercase English letter.
  • A regular expression followed by * matches zero or more occurrences of the regular expression.
  • If an expression is enclosed in parentheses ( and ), it is treated as one expression and any * applies to the whole expression.
  • It is guaranteed that there is no nested parentheses in the given regex for the prototype.

For example:

  • Regex (ab)*d matches strings d, ababd, abd, but not abbd, abab.
  • Regex ab*cd matches strings abbcd, ad, abd, but not abbd.
  • Regex (a.b)* matches strings a, aa, aaa, b, bbb, and many more, but not ac, and, or bcd.

Given an array of strings, for each string determine whether the given regex matches the string or not, and report an array of strings "YES" or "NO" respectively.

这道 Amazon OA 题考察的是带有通配符 <code>.</code>、量词 <code>*</code> 以及小括号分组的正则匹配。核心难点不在于普通字符串比对,而在于如何正确处理“一个字符”“零次或多次重复”和“括号内整体重复”这三种规则。比较稳妥的做法是先把正则按可重复的最小单元切分,再用动态规划判断每个字符串是否能被完整匹配;对于带括号的片段,需要把括号内内容当成一个整体来处理。由于输入规模不大,枚举匹配状态或做区间 DP 都是可行方向。

正文完
 0