I have a lot of blog posts stored in a database. They are written in Markdown and I want to convert them to HTML, but I am having trouble with the ul tag.
For instance, in simplified form, I have this:
Something
- First
- Second
Something else
Second ul
- First
- Second
Some final text
I can add the li tags like this:
$text = preg_replace("/^- (.*)/m", "<li>$1</li>", $text);
But how can I identify the beginning and end of each list so that I can add the ul tags?
I want this result:
Something
<ul>
<li>First</li>
<li>Second</li>
</ul>
Something else
Second ul
<ul>
<li>First</li>
<li>Second</li>
</ul>
Some final text
You need to do this in two passes, that is, with two preg_replace calls:
$text = preg_replace("/^- (.*)/m", "<li>$1</li>",
    preg_replace("/^- .*(\R- .*)*/m", "<ul>\n$0\n</ul>", $text)
);
The inner one executes first. It finds a line that starts with a hyphen and a space (- ) and matches all the following lines that also start with - , stopping as soon as the next line does not.
The .* part matches everything up to the end of the current line, because the dot (.) does not match a newline character. The \R that follows therefore matches the line break itself (\R matches anything considered a linebreak sequence). After that, the hyphen check is done again and the remainder of that line is matched. This group, ( ... )*, repeats as many times as possible.
That matched block is then wrapped in a ul tag. The hyphens are not removed at this stage.
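To make the intermediate state concrete, here is a small sketch that runs only the inner pass on the sample text from the question. The variable name $wrapped is just an illustrative choice, and the input is assumed to use plain \n line endings.

```php
<?php
// Sample input taken from the question (assumed to use "\n" line endings).
$text = "Something\n- First\n- Second\nSomething else\nSecond ul\n- First\n- Second\nSome final text";

// Inner pass only: wrap each run of consecutive "- " lines in <ul> ... </ul>.
// $0 in the replacement refers to the whole matched block.
$wrapped = preg_replace("/^- .*(\R- .*)*/m", "<ul>\n$0\n</ul>", $text);

echo $wrapped, "\n";
```

At this point each block of list lines sits between a <ul> line and a </ul> line, and every item still carries its leading hyphen.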
The outer one then wraps each line that starts with - in li tags, removing the leading hyphen and space.
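Putting both passes together, a minimal runnable sketch could look like the following. The function name markdownListsToHtml is hypothetical and only there for illustration. The sketch handles flat lists whose items start with a hyphen and a space at the beginning of a line, as in the question, and does not attempt nested lists or other Markdown syntax.

```php
<?php
// Hypothetical helper: converts flat "- item" lists to <ul>/<li> markup.
function markdownListsToHtml(string $text): string
{
    // Pass 1: wrap each run of consecutive "- " lines in <ul> ... </ul>.
    $wrapped = preg_replace("/^- .*(\R- .*)*/m", "<ul>\n$0\n</ul>", $text);

    // Pass 2: turn every "- item" line into <li>item</li>, dropping the hyphen.
    return preg_replace("/^- (.*)/m", "<li>$1</li>", $wrapped);
}

$text = "Something\n- First\n- Second\nSomething else\nSecond ul\n- First\n- Second\nSome final text";
$html = markdownListsToHtml($text);

echo $html, "\n";
// Prints:
// Something
// <ul>
// <li>First</li>
// <li>Second</li>
// </ul>
// Something else
// Second ul
// <ul>
// <li>First</li>
// <li>Second</li>
// </ul>
// Some final text
```

The order of the passes matters: if the li replacement ran first, the lines would no longer start with a hyphen, and the ul pattern would have nothing to find.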