Microsoft OA Interview Question Walkthrough: Parsing Simple Markdown Formatting Tags


You’ll need to parse these tags: <p/>, <br/>, <blockquote/>, and <del/>.

A sequence of two or more consecutive newline characters starts a new paragraph. Other formatting commands do not cross paragraphs.

A single newline character is a soft line break, <br/>.

A blockquote is one or more consecutive lines where the first two characters on the line are "> " (a greater-than sign followed by a space).

Text in strikethrough (<del></del>) is surrounded by a pair of ~~ (two tildes).

markdown_input = """This is a paragraph with a soft
line break.

This is another paragraph that has
> Some text that
> is in a
> block quote.

This is another paragraph with a ~~strikethrough~~ word."""

output = """<p>This is a paragraph with a soft<br />line break.</p>

<p>This is another paragraph that has<br />
<blockquote>Some text that<br />is in a<br />block quote.</blockquote>
</p>

<p>This is another paragraph with a <del>strikethrough</del> word.</p>"""

Note: It’s not important to produce this specific output! We only care if the HTML is valid.

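This problem asks for a simplified Markdown-to-HTML converter covering four constructs: paragraphs, soft line breaks, blockquotes, and strikethrough. Two or more consecutive newlines start a new paragraph, a single newline becomes <br />, consecutive lines beginning with "> " form a blockquote, and text wrapped in ~~ becomes <del>. The key is to process the input one paragraph at a time so that no formatting leaks across paragraphs, to track during parsing whether you are currently inside a paragraph, a blockquote, or a strikethrough span, and finally to emit valid HTML.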
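The paragraph-by-paragraph approach can be sketched in Python as follows. This is one possible solution under stated assumptions, not a reference answer: the function names (markdown_to_html, render_paragraph, render_run) are made up for illustration, and HTML escaping of the input text is an addition the problem statement does not ask for. Because a <p> element cannot legally contain a <blockquote>, this sketch closes the paragraph and emits the blockquote as a sibling element, and it drops the soft line break next to a blockquote. That differs from the sample output, which the problem explicitly allows.

```python
import html
import re

QUOTE_PREFIX = "> "
DEL_PATTERN = re.compile(r"~~(.+?)~~")


def render_run(lines):
    """Render consecutive lines of one kind (all quoted or all plain).

    Each line is HTML-escaped (an assumption beyond the problem statement),
    lines are joined with soft breaks, and ~~text~~ becomes <del>text</del>.
    Strikethrough is matched only inside the run, so it can never straddle
    a blockquote boundary or a paragraph boundary.
    """
    joined = "<br />".join(html.escape(line, quote=False) for line in lines)
    return DEL_PATTERN.sub(r"<del>\1</del>", joined)


def render_paragraph(paragraph):
    """Split one paragraph into alternating plain and quoted runs."""
    lines = paragraph.split("\n")
    parts = []
    i = 0
    while i < len(lines):
        is_quote = lines[i].startswith(QUOTE_PREFIX)
        # Extend the run while the following lines are of the same kind.
        j = i
        while j < len(lines) and lines[j].startswith(QUOTE_PREFIX) == is_quote:
            j += 1
        run = lines[i:j]
        if is_quote:
            stripped = [line[len(QUOTE_PREFIX):] for line in run]
            parts.append(f"<blockquote>{render_run(stripped)}</blockquote>")
        else:
            parts.append(f"<p>{render_run(run)}</p>")
        i = j
    return "\n".join(parts)


def markdown_to_html(text):
    """Convert the simplified Markdown dialect from the problem to HTML."""
    # Two or more consecutive newlines separate paragraphs.
    paragraphs = [p for p in re.split(r"\n{2,}", text.strip("\n")) if p]
    return "\n\n".join(render_paragraph(p) for p in paragraphs)
```

Calling markdown_to_html(markdown_input) on the sample above yields one <p> per plain run and one <blockquote> per quoted run, with the second sample paragraph split into a <p> followed by a <blockquote>. An unpaired ~~ is left in the output as literal text, which keeps the result well-formed.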
