I've written a content generator tool for a project I'm working on, to help me batch-import fake content into the text fields of a database. It just helps make the site look populated.
I'm using an external class called lorem-php-sum to generate the strings that I insert. It's incredibly simple really: it returns paragraphs of text wrapped in <p> tags (a random number of them each time), and I then insert these strings into my chosen table within a big loop.
Now I want to make the generated content a bit richer by adding HTML list tags, horizontal rule tags and other elements. I want the new HTML elements to be placed randomly within the paragraphs returned by the paragraph generator class.
The problem is that whilst I can easily insert list tags into my big paragraph string at some random point, I fear it may sometimes insert the new tags inside the existing markup in a way that breaks the HTML.
Does anyone have a trick for inserting HTML into another string according to some rules? I imagine the PHP DOMDocument class could help with this, but I'm not sure how.
You'd need to incorporate some kind of state machine in your generator.
You can think of something like this:
Step 1: Choose which element to render: a text node, a paragraph, or a list node.
When you pick a text node, randomly generate some text and return to Step 1.
When you pick a paragraph, emit <p>, generate some text, emit </p> and return to Step 1.
In the case of a list node you can only create list items (<li>), so pick a random number of items and fill each one using the same rules from Step 1.
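The steps above could be sketched roughly as follows. This is only a minimal illustration: randomText() is a hypothetical stand-in for the lorem ipsum class, generateNodes() is a made-up name, and the depth limit of 2 is an assumption added so that nested lists eventually stop.

```php
<?php
// Hypothetical stand-in for the lorem ipsum generator: returns $words random words.
function randomText(int $words): string
{
    $pool = ['lorem', 'ipsum', 'dolor', 'sit', 'amet', 'consectetur'];
    $out = [];
    for ($i = 0; $i < $words; $i++) {
        $out[] = $pool[mt_rand(0, count($pool) - 1)];
    }
    return implode(' ', $out);
}

// Step 1, repeated $count times: pick an element type and emit it as a whole unit.
function generateNodes(int $count, int $depth = 0): string
{
    $html = '';
    for ($i = 0; $i < $count; $i++) {
        // Assumption: lists are no longer offered past depth 2, so recursion terminates.
        $choice = mt_rand(0, $depth < 2 ? 2 : 1);
        if ($choice === 0) {
            // Text node
            $html .= randomText(mt_rand(3, 8)) . "\n";
        } elseif ($choice === 1) {
            // Paragraph: the open and close tags are always emitted together
            $html .= '<p>' . randomText(mt_rand(5, 15)) . "</p>\n";
        } else {
            // List node: only <li> children, each filled using the Step 1 rules
            $html .= "<ul>\n";
            $items = mt_rand(1, 4);
            for ($j = 0; $j < $items; $j++) {
                $html .= '<li>' . generateNodes(1, $depth + 1) . "</li>\n";
            }
            $html .= "</ul>\n";
        }
    }
    return $html;
}

echo generateNodes(5);
```

Because every branch emits its opening and closing tags in one go, the output cannot end up with a tag inserted in the middle of another element.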
--
You can also allow nesting. In <li> you can add <strong> and <em>, and similarly for <p>.
You can make it as crazy as you want I guess :)
Tweak the coefficients a bit (for example, how often each element type gets picked) to get good results. Try to make a generator whose output is random but predictable; total length might be a good thing to control.
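One possible reading of "random but predictable" is output that is reproducible for a given seed and capped at a given length. The sketch below assumes that reading; generateUpTo() and its parameters are hypothetical names, and the word pool is a placeholder for the real text generator.

```php
<?php
// Hypothetical sketch: emit whole paragraphs until a character budget would be exceeded.
function generateUpTo(int $maxChars, int $seed): string
{
    mt_srand($seed); // the same seed yields the same sequence of mt_rand() values
    $words = ['lorem', 'ipsum', 'dolor', 'sit', 'amet'];
    $html = '';
    while (true) {
        $n = mt_rand(4, 10);
        $text = [];
        for ($i = 0; $i < $n; $i++) {
            $text[] = $words[mt_rand(0, count($words) - 1)];
        }
        $block = '<p>' . implode(' ', $text) . "</p>\n";
        // Stop before the budget is exceeded; a paragraph is never cut in half.
        if (strlen($html) + strlen($block) > $maxChars) {
            break;
        }
        $html .= $block;
    }
    return $html;
}

echo generateUpTo(300, 42);
```

Calling it twice with the same seed should give identical markup, which can make test imports easier to compare.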
You could loop hierarchically through multidimensional arrays. No cell without a row, no row without a table, and likewise no li without a ul.
$tags = array(
    "<table>%s</table>\n",
    array("  <tr>%s</tr>\n",
        array("    <td>%s</td>\n")),
    "<ul>%s</ul>\n",
    array("  <li>%s</li>\n") // continue with more tags
);
$tags_simple = array(
    "%s", "<strong>%s</strong>",
    "<i>%s</i>", "<p>%s</p>\n", "%s<br />\n"
); // etc.; "%s" means no tag, add more if you like
Pick a random entry from $tags, loop through its nested levels, sprintf the random sentences into them, and wrap some of the sentences in random simple tags. It's a standalone possibility.
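A rough sketch of how that loop might look is shown here. It regroups the array slightly so that each entry is a pair of template and child level, which is an assumption made to keep the recursion short; renderLevel() and sentence() are hypothetical names, and sentence() stands in for the real text generator.

```php
<?php
// Each level is array(template, childLevel); the innermost level has no child.
$tags = [
    ["<table>%s</table>\n", ["<tr>%s</tr>\n", ["<td>%s</td>\n"]]],
    ["<ul>%s</ul>\n", ["<li>%s</li>\n"]],
];
$tagsSimple = ["%s", "<strong>%s</strong>", "<i>%s</i>"];

// Hypothetical stand-in for the lorem ipsum generator.
function sentence(): string
{
    return 'Lorem ipsum dolor sit amet.';
}

// Render one level and, recursively, everything nested inside it.
function renderLevel(array $level, array $simple): string
{
    if (!isset($level[1])) {
        // Innermost tag: fill it with one sentence wrapped in a random simple tag.
        $wrapper = $simple[mt_rand(0, count($simple) - 1)];
        return sprintf($level[0], sprintf($wrapper, sentence()));
    }
    // Otherwise repeat the child level a random number of times inside this tag.
    $inner = '';
    $repeat = mt_rand(1, 3);
    for ($i = 0; $i < $repeat; $i++) {
        $inner .= renderLevel($level[1], $simple);
    }
    return sprintf($level[0], $inner);
}

// Pick a random top-level structure and render it.
$pick = $tags[mt_rand(0, count($tags) - 1)];
echo renderLevel($pick, $tagsSimple);
```

Since a child template is only ever reached through its parent, a <td> can never appear without its <tr> and <table>.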
So I managed to work this out with the help of other code samples and DOMDocument. I ended up writing a function that splits the string on paragraph tags and returns an array containing each paragraph as a separate item.
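function splitTextByPara($string, $split_on = "p") {
    // To split on additional tags, append them using the syntax: "p|//ul|//br"
    $dom = new DOMDocument();
    $dom->loadHTML($string);
    $domx = new DOMXPath($dom);
    // Select every matching element anywhere in the document
    $entries = $domx->evaluate("//" . $split_on);
    $result = array();
    foreach ($entries as $entry) {
        // Serialize each matched node, including its own tags
        $result[] = $entry->ownerDocument->saveHTML($entry);
    }
    // utf8_decode() converts UTF-8 to ISO-8859-1 (it does not encode to UTF-8);
    // presumably this undoes double encoding, since loadHTML() assumes ISO-8859-1 input by default.
    // Note: utf8_decode() is deprecated as of PHP 8.2.
    $result = array_map("utf8_decode", $result);
    return $result;
}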
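Once the paragraphs are in an array like the one splitTextByPara() returns, new elements can be dropped in at the boundaries between items, so they never land inside existing markup. A minimal sketch of that idea follows; insertBetweenParagraphs() is a hypothetical helper and the sample strings are placeholders.

```php
<?php
// Hypothetical helper: insert $snippet at a random boundary between complete elements.
function insertBetweenParagraphs(array $paragraphs, string $snippet): string
{
    // Positions 0..count($paragraphs) all fall between whole elements, never inside one.
    $pos = mt_rand(0, count($paragraphs));
    array_splice($paragraphs, $pos, 0, [$snippet]);
    return implode("\n", $paragraphs);
}

$paragraphs = ['<p>First paragraph.</p>', '<p>Second paragraph.</p>'];
$list = '<ul><li>One</li><li>Two</li></ul>';
echo insertBetweenParagraphs($paragraphs, $list);
```

The same call could be repeated with an <hr> or other snippet to mix several kinds of elements into the text.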