A true multiline regexp in PHP. The "I miss U" technique
The following regular expression matches tags that are opened and closed on different lines, although the same pattern works for any other pair of delimiters. It is also ungreedy: the match ends at the first closing tag, so any later closing tags of the same kind are left out of it.
It is very easy to remember and to apply. I call it the "I MISS YOU" technique, and the reason is in the regexp modifiers: misU
    $html =<<<MULTILINE
    <p class="interesting">I am the
    <strong>interesting</strong>
    text</p>
    <p>But this should be ignored</p>
    MULTILINE;

    // Quote the literal tags; '~' is passed because it is the pattern delimiter.
    $open = preg_quote( '<p class="interesting">', '~' );
    $close = preg_quote( '</p>', '~' );
    $pattern = "~$open(.+)$close~misU";
    preg_match_all( $pattern, $html, $matches );
    var_dump( $matches[1] );
    die;
    // Displays (the line breaks inside the match are preserved):
    // array(1) {
    //   [0]=>
    //   string(42) "I am the
    // <strong>interesting</strong>
    // text"
    // }
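Since the pattern works for any pair of delimiters, it can be wrapped in a small function. The following is only a sketch: `extract_between` and the sample `$config` text are hypothetical names made up for illustration, not part of the original example.

```php
<?php
/**
 * Hypothetical helper (not from the original post): returns every piece of
 * text found between the literal strings $open and $close.
 */
function extract_between(string $subject, string $open, string $close): array
{
    // Quote both delimiters so they are matched literally. '~' is passed
    // as the pattern delimiter so it gets escaped if it appears in them.
    $pattern = '~' . preg_quote($open, '~') . '(.+)' . preg_quote($close, '~') . '~misU';
    preg_match_all($pattern, $subject, $matches);

    return $matches[1];
}

// Made-up input: two blocks, the second one written in a different case.
$config = "[begin]\nalpha\n[end]\nnoise\n[BEGIN]\nbeta\n[End]\n";
$blocks = extract_between($config, '[begin]', '[end]');

var_dump($blocks); // two strings: "\nalpha\n" and "\nbeta\n"
```

The second block is found only because of the "i" modifier, and neither block would be found without "s", since each one spans several lines.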
The "I miss you" name comes from what each letter in misU means:
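- m: Multiline modifier. It makes ^ and $ match at the start and end of every line, not only of the whole text. It does not let the dot cross line breaks; that is the job of "s"
- i: Case insensitive
- s: That's the important one, it makes the dot (.) match every character, including newlines
- U: Ungreedy, quantifiers match as little as possible by default (must be uppercase, lowercase u is for UTF-8)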
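To see what each modifier does on its own, here is a minimal sketch that toggles them one at a time. The sample strings `$text` and `$tags` are made up for the demonstration.

```php
<?php
// Hypothetical sample strings, chosen only to exercise each modifier.
$text = "first line\nSecond line";
$tags = '<b>one</b> and <b>two</b>';

// s: by default the dot stops at a newline; with s it crosses it.
$dotDefault = preg_match('~first.+Second~', $text);   // 0
$dotAll     = preg_match('~first.+Second~s', $text);  // 1

// m: ^ normally anchors only at the very start of the subject;
// with m it also anchors at the start of each line.
$anchorDefault = preg_match('~^Second~', $text);      // 0
$anchorMulti   = preg_match('~^Second~m', $text);     // 1

// i: letter case is ignored.
$caseDefault     = preg_match('~second~', $text);     // 0
$caseInsensitive = preg_match('~second~i', $text);    // 1

// U: quantifiers become ungreedy, so the match ends at the first </b>.
preg_match('~<b>(.+)</b>~', $tags, $greedy);
preg_match('~<b>(.+)</b>~U', $tags, $ungreedy);

var_dump($greedy[1]);   // string(18) "one</b> and <b>two"
var_dump($ungreedy[1]); // string(3) "one"
```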
Note: The "Us" modifiers alone would be enough for this specific example, because the pattern has no ^ or $ anchors and the letter case already matches, but they are less memorable.
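The note above can be checked with a quick comparison. This is a sketch under one assumption: the sample HTML is written here as a double-quoted string with `\n` line breaks placed inside the first paragraph. It also tries a third variant, a lazy quantifier (`.+?`) with only the "s" modifier, which is a common alternative to "U".

```php
<?php
// Sample HTML; the exact position of the line breaks is an assumption.
$html = "<p class=\"interesting\">I am the\n<strong>interesting</strong>\ntext</p>\n"
      . "<p>But this should be ignored</p>\n";

$open  = preg_quote('<p class="interesting">', '~');
$close = preg_quote('</p>', '~');

preg_match_all("~$open(.+)$close~misU", $html, $full);  // all four modifiers
preg_match_all("~$open(.+)$close~Us", $html, $short);   // only U and s
preg_match_all("~$open(.+?)$close~s", $html, $lazy);    // lazy quantifier, no U

var_dump($full[1] === $short[1]); // bool(true)
var_dump($short[1] === $lazy[1]); // bool(true)
```

Do not combine `.+?` with "U": under "U" the question mark flips the quantifier back to greedy.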
Both "Us" and "misU" are easy to remember. Happy scraping!