bookmark spec: Markdown-to-HTML Conversion
The parser processes input text by identifying blocks, which are sequences of non-empty lines separated by one or more blank lines. Leading and trailing whitespace are trimmed from these blocks before applying conversion rules.
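The block-splitting step described above can be sketched in Python as follows. This is a minimal illustration, not the spec's own implementation: the name `split_blocks` is hypothetical, and treating whitespace-only lines as blank is an assumption. It also ignores heading and horizontal-rule lines, which the Processing Order section handles separately.

```python
def split_blocks(text: str) -> list[str]:
    """Group consecutive non-empty lines into blocks separated by blank lines."""
    blocks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.strip():
            # Non-empty line: it belongs to the block being collected.
            current.append(line)
        elif current:
            # Blank line (assumption: whitespace-only counts as blank) ends the block.
            blocks.append("\n".join(current).strip())
            current = []
    if current:
        blocks.append("\n".join(current).strip())
    return blocks
```

Runs of several blank lines produce no empty blocks, because a block is only emitted when at least one line has been collected.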
Headings:
- Input: Lines starting with one to six `#` symbols, followed by a space, then the heading text.
- Output: Converts to HTML heading tags `<h1>` through `<h6>`. The number of `#` symbols determines the level.
- Example Markdown:
  ## Section Title
- Example HTML Output:
  <h2>Section Title</h2>
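One way to express the heading rule is a single regular expression. The sketch below is in Python, and `HEADING_RE` and `convert_heading` are hypothetical names. It assumes that a line with seven or more `#` symbols, or with no space after them, is not a heading, and that surrounding whitespace in the heading text is trimmed. The spec does not state either point explicitly.

```python
import re

# One to six '#' characters, exactly one space, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6}) (.*)$")


def convert_heading(line):
    """Return the <hN> element for a heading line, or None if the line is not a heading."""
    match = HEADING_RE.match(line)
    if match is None:
        return None
    level = len(match.group(1))  # number of '#' symbols sets the level
    return f"<h{level}>{match.group(2).strip()}</h{level}>"
```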
Horizontal Rules:
- Input: A line that begins with seven or more hyphens (e.g., `-------`).
- Output: Converts to an HTML `<hr>` tag.
- Example Markdown:
  -------
- Example HTML Output:
  <hr>
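A literal reading of this rule only requires the line to begin with seven hyphens, so trailing characters would not prevent a match. The Python sketch below follows that reading; `HR_RE` and `convert_horizontal_rule` are hypothetical names, and a stricter parser might require the whole line to consist of hyphens.

```python
import re

# Seven or more hyphens at the start of the line; anything after them is ignored.
HR_RE = re.compile(r"^-{7,}")


def convert_horizontal_rule(line):
    """Return "<hr>" for a horizontal-rule line, or None otherwise."""
    return "<hr>" if HR_RE.match(line) else None
```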
Italic Blocks (Multi-line Emphasis):
- Input: An entire block of text (potentially spanning multiple lines in the input, as long as they are not separated by blank lines) that starts exactly with a single underscore `_` and ends exactly with a single underscore `_`. The content between the underscores must be non-empty after trimming.
- Output: The content is wrapped in `<em>...</em>` tags. Newlines within the original italicized content are converted to `<br>` HTML tags. This rule takes precedence over paragraph and section number rules for a qualifying block.
- Example Markdown:
  _This is an
  emphasized block._
- Example HTML Output:
  <em>This is an<br>emphasized block.</em>
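The italic-block check could look roughly like the Python sketch below. The function name `convert_italic_block` is hypothetical. Two points are assumptions rather than stated behavior: "exactly a single underscore" is read as rejecting doubled delimiters such as `__text__`, and the emitted content is left untrimmed, since the spec only says the trimmed content must be non-empty.

```python
def convert_italic_block(block):
    """Return the <em> element for an italic block, or None if the block does not qualify."""
    block = block.strip()
    # Shortest qualifying block is "_x_".
    if len(block) < 3 or not (block.startswith("_") and block.endswith("_")):
        return None
    # Assumption: a doubled underscore at either end disqualifies the block.
    if block.startswith("__") or block.endswith("__"):
        return None
    content = block[1:-1]
    if not content.strip():
        return None  # nothing but whitespace between the underscores
    return "<em>" + content.replace("\n", "<br>") + "</em>"
```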
Section Numbers:
- Input: A block of text starting with a number enclosed in single asterisks (e.g., `*1*`, `*23*`), possibly followed by whitespace, and then the rest of the text for that block.
- Output: The number is wrapped in `<span class="section-number">...</span>`. The text following the section number (within the same block) is appended. Newlines within this appended text are converted to `<br>` HTML tags. This rule is checked after the Italic Block rule.
- Example Markdown:
  *7* Introduction to the Topic
  Further details.
- Example HTML Output:
  <span class="section-number">7</span>Introduction to the Topic<br>Further details.
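A regular expression is one plausible way to recognize a section number. In the Python sketch below, `SECTION_RE` and `convert_section_number` are hypothetical names. It assumes the whitespace between the closing asterisk and the text is dropped, which matches the example output above.

```python
import re

# '*', one or more digits, '*', optional whitespace, then the rest of the block.
# DOTALL lets '.' span the newlines inside a multi-line block.
SECTION_RE = re.compile(r"^\*(\d+)\*\s*(.*)$", re.DOTALL)


def convert_section_number(block):
    """Return the section-number markup for a block, or None if the block does not qualify."""
    match = SECTION_RE.match(block.strip())
    if match is None:
        return None
    number, rest = match.groups()
    return f'<span class="section-number">{number}</span>' + rest.replace("\n", "<br>")
```

Because `\s*` also matches a newline, a block whose number sits alone on the first line yields no `<br>` directly after the span under this sketch.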
Paragraphs:
- Input: Any block of text that does not match the rules for Headings, Horizontal Rules, Italic Blocks, or Section Numbers.
- Output: The text content is wrapped in `<p>...</p>` tags. Newlines within the original paragraph content are converted to `<br>` HTML tags.
- Example Markdown:
  This is a regular paragraph.
  It has multiple lines.
- Example HTML Output:
  <p>This is a regular paragraph.<br>It has multiple lines.</p>
Processing Order:
- The input text is split into lines.
- Lines are grouped into blocks based on blank line separators.
- For each block:
  - It's first checked if it's an Italic Block.
  - If not, it's checked if it's a Section Number.
  - If not, it's treated as a Paragraph.
- Lines starting with `#` (Headings) or `-------` (Horizontal Rules) are treated as distinct elements and terminate the preceding block.
- The `parseMarkdown` function ultimately returns an array of these generated HTML string elements.
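Putting the processing order together, a compact Python sketch might look like this. It is an illustration under stated assumptions, not the actual `parseMarkdown` implementation. `parse_markdown`, `convert_block`, and the regex names are hypothetical, whitespace-only lines are treated as blank, and no HTML escaping is performed because the spec does not mention any.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6}) (.*)$")                 # 1-6 '#', a space, the text
HR_RE = re.compile(r"^-{7,}")                               # line begins with 7+ hyphens
SECTION_RE = re.compile(r"^\*(\d+)\*\s*(.*)$", re.DOTALL)   # *N* followed by block text


def convert_block(block: str) -> str:
    """Apply the block rules in order: Italic Block, Section Number, Paragraph."""
    if (len(block) >= 3 and block[0] == "_" and block[-1] == "_"
            and not block.startswith("__") and not block.endswith("__")
            and block[1:-1].strip()):
        return "<em>" + block[1:-1].replace("\n", "<br>") + "</em>"
    match = SECTION_RE.match(block)
    if match:
        number, rest = match.groups()
        return f'<span class="section-number">{number}</span>' + rest.replace("\n", "<br>")
    return "<p>" + block.replace("\n", "<br>") + "</p>"


def parse_markdown(text: str) -> list[str]:
    """Return the list of HTML strings generated from the input text."""
    elements: list[str] = []
    pending: list[str] = []  # lines of the block currently being collected

    def flush() -> None:
        # Emit the pending block, if any, and start a new one.
        block = "\n".join(pending).strip()
        pending.clear()
        if block:
            elements.append(convert_block(block))

    for line in text.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            flush()  # a heading terminates the preceding block
            level = len(heading.group(1))
            elements.append(f"<h{level}>{heading.group(2).strip()}</h{level}>")
        elif HR_RE.match(line):
            flush()  # so does a horizontal rule
            elements.append("<hr>")
        elif not line.strip():
            flush()  # blank line: block separator
        else:
            pending.append(line)
    flush()
    return elements
```

For example, `parse_markdown("# Title\nBody text.")` yields a heading element followed by a paragraph element, even though no blank line separates the two input lines.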