I have a string in which I store book pages. It looks something like this:
///0///
Page1 Text
///1///
Page2 Text
///2///
Page3 Text
///3///
I want to extract the page texts (Page1 Text, Page2 Text, Page3 Text). Here is the regular expression I am using:
$format = "%///\d*///(.*)///\d*///%";
preg_replace_callback($format, "process_page", $text);
According to this page, I can use a character other than / as the delimiter at the start and end of the expression. So I used % to simplify my pattern, so that I don't have to escape every slash like this: \/
It seems okay to me, but it returns nothing. Can somebody please tell me where the problem is?
I think preg_split might be a better option for you:
$text = '
Page1 Text
///1///
Page2 Text
///2///
Page3 Text
';
$format = "%///\d+///%";
$arr = preg_split($format, $text);
// $arr = Array
// (
// [0] => Page1 Text
//
// [1] =>
// Page2 Text
//
// [2] =>
// Page3 Text
// )
Each page is now in its own array element.
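If the string still has its leading and trailing markers, as in the question, the split produces empty pieces at both ends and every page keeps the newlines that surrounded its marker. A minimal sketch of one way to tidy that up, assuming the markers sit on their own lines (the `$pages` variable name is only illustrative):

```php
<?php
// Sample input in the question's format, including the first and last markers.
$text = "///0///\nPage1 Text\n///1///\nPage2 Text\n///2///\nPage3 Text\n///3///";

// PREG_SPLIT_NO_EMPTY drops the empty pieces before the first
// and after the last marker.
$pages = preg_split('%///\d+///%', $text, -1, PREG_SPLIT_NO_EMPTY);

// Trim the newlines that surrounded each marker.
$pages = array_map('trim', $pages);

print_r($pages);
```

The limit argument of -1 means "no limit"; it has to be passed explicitly so that the flag can be given as the fourth argument.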
I think you need the s modifier: $format = "%///\d*///(.*)///\d*///%s";
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
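To see the effect of the modifier, here is a small sketch that runs the question's pattern with and without it on a shortened sample string (the sample text and variable names are assumptions for illustration). Note that the greedy (.*) then captures everything between the first and the last marker as a single match, so the modifier alone does not yet separate the pages.

```php
<?php
$text = "///0///\nPage1 Text\n///1///\nPage2 Text\n///2///";

// Without s, "." stops at a newline, so the pattern can never
// bridge the gap between two markers on different lines.
$without = preg_match('%///\d*///(.*)///\d*///%', $text, $m1);

// With s, "." also matches newlines, so the pattern matches.
$with = preg_match('%///\d*///(.*)///\d*///%s', $text, $m2);

var_dump($without); // int(0)
var_dump($with);    // int(1)

// The greedy capture runs from the first marker to the last one,
// swallowing the ///1/// marker in the middle.
var_dump($m2[1]);
```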
I'm not sure what you're trying to do, but personally I wouldn't use a regex for this. You know the exact string to look for (e.g. ///4///) and, from there, the end string (///5/// or the end of the file). A simple substr with strpos might be a better option.
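A rough sketch of that idea, assuming the markers are numbered consecutively so that page N ends at marker N+1 (get_page is a hypothetical helper name, not something from the question):

```php
<?php
// Hypothetical helper: returns the text of page $n,
// or null if its start marker is not in the string.
function get_page(string $text, int $n): ?string
{
    $startMarker = "///{$n}///";
    $start = strpos($text, $startMarker);
    if ($start === false) {
        return null;
    }
    $start += strlen($startMarker);

    // The page ends at the next marker, or at the end of the string.
    $end = strpos($text, '///' . ($n + 1) . '///', $start);
    $page = ($end === false)
        ? substr($text, $start)
        : substr($text, $start, $end - $start);

    // Remove the newlines around the markers.
    return trim($page);
}

$text = "///0///\nPage1 Text\n///1///\nPage2 Text\n///2///\nPage3 Text\n///3///";
echo get_page($text, 1); // Page2 Text
```

The strict comparison with false matters here, because strpos returns 0 when the marker is at the very start of the string.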
I would use something like preg_split (see Tim Cooper's answer).
But if you want to stick with your regex, try this:
$format = "%///\d+///(.*?)(?=///\d+///)%s";
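This uses a lookahead assertion and the s modifier. The lookahead checks for the next marker without consuming it, so that marker can start the next match, and the lazy (.*?) stops at the first marker it finds.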
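As a quick check, here is a sketch that applies this pattern with preg_match_all to a string in the question's format (the sample text and the trimming step are additions for illustration):

```php
<?php
$text = "///0///\nPage1 Text\n///1///\nPage2 Text\n///2///\nPage3 Text\n///3///";

// Lazy capture up to, but not including, the next marker.
$format = '%///\d+///(.*?)(?=///\d+///)%s';

// $matches[1] holds the captured group of every match.
preg_match_all($format, $text, $matches);

// Strip the newlines that surround each marker.
$pages = array_map('trim', $matches[1]);

print_r($pages);
```

The last marker (///3///) produces no match, because no further marker follows it to satisfy the lookahead.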