Extracting blocks of text using regular expressions

Using PHP, I want to compare two text files. The first file is the reference that the second one is compared against. If a line of first.txt does not exist in second.txt, or differs from it, the script should return the whole block containing that line. For example:

first.txt

interface Vlan11
 description xxx
 ip address 10.10.10.10 255.255.255.255
 shutdown
!
vlan 34
!
vlan 17
 name sth
!
route-map sth
 match ip address exm
 set ip next-hop 1.2.3.4
!

second.txt

interface Vlan11
 description xxx
 ip address 20.20.20.20 255.255.255.255
 shutdown
!
vlan 34
!
route-map sth
 match ip address exm
 set ip next-hop 1.2.3.4
!

For the comparison I read the lines of first.txt with file() and search for each of them in second.txt. Here the IP address on the third line of second.txt is different, so the script should return the block containing that line (from interface to the bang, !):

interface Vlan11
 description xxx
 ip address 20.20.20.20 255.255.255.255
 shutdown
!

Or, since one of the vlan blocks does not exist in second.txt, it should return:

vlan 17
 name sth
!

It's easy to write a regex that extracts the text between two bangs, but because I have to go back to the start of the block, I don't know what the pattern should start with.

I also had another idea: every block starts with a line that begins with a non-space character, followed by some lines that begin with a space, and ends with a bang. The problem is still how to start the pattern.

You could use the following regular expression to match blocks:

/.*?\R!\R*/s

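\R matches any newline sequence (\n, \r\n or \r), and the s modifier makes . match newlines as well, so the lazy .*? can span several lines and stops at the first newline that is followed by !.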
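To make the behaviour of the pattern concrete, here is a minimal sketch that runs it on a small inline string. The $sample text is a made-up example, not one of the files from the question. Note that the pattern only requires a ! at the start of a line, so a line such as "!comment" would also end a block.

```php
<?php
// Hypothetical sample text: two blocks, each closed by a line holding "!".
$sample = "vlan 34\n!\nvlan 17\n name sth\n!\n";

// .*? lazily consumes text (newlines included, because of the s modifier)
// up to the first newline followed by "!"; \R* then absorbs trailing newlines.
preg_match_all('/.*?\R!\R*/s', $sample, $matches);

// $matches[0] holds the full text of every matched block.
foreach ($matches[0] as $i => $block) {
    echo "Block $i:\n$block";
}
```

Each element of $matches[0] keeps its closing ! and the newline after it, which is why two blocks compare equal only if their trailing newlines also agree.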

You could then use preg_match_all to get all blocks from each text, and array_diff to extract the blocks of first.txt that have no identical counterpart in second.txt:

$text1 = file_get_contents("first.txt");
$text2 = file_get_contents("second.txt");

preg_match_all('/.*?\R!\R*/s', $text1, $blocks1);
preg_match_all('/.*?\R!\R*/s', $text2, $blocks2);

$result = array_diff($blocks1[0], $blocks2[0]);

print_r($result);

See it run on eval.in.
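array_diff returns the first file's copy of each differing block and does not say whether a block was changed or is missing altogether. If that distinction matters, one possible sketch is to index the blocks by their first line and compare the two maps. The function names blocksByHeader and compareConfigs are hypothetical, and the sketch assumes that the first line of a block identifies it uniquely and that both files use the same line endings.

```php
<?php
// Hypothetical helper: split a config into blocks with the same pattern
// and index each block by its first line, e.g. "interface Vlan11".
function blocksByHeader(string $text): array
{
    preg_match_all('/.*?\R!\R*/s', $text, $m);
    $blocks = [];
    foreach ($m[0] as $block) {
        $block = trim($block);                      // drop surrounding newlines
        $header = preg_split('/\R/', $block, 2)[0]; // first line of the block
        $blocks[$header] = $block;
    }
    return $blocks;
}

// Hypothetical helper: classify the blocks of $first against $second.
function compareConfigs(string $first, string $second): array
{
    $a = blocksByHeader($first);
    $b = blocksByHeader($second);
    $result = ['changed' => [], 'missing' => []];
    foreach ($a as $header => $block) {
        if (!array_key_exists($header, $b)) {
            $result['missing'][$header] = $block;      // only in the first text
        } elseif ($b[$header] !== $block) {
            $result['changed'][$header] = $b[$header]; // second text's version
        }
    }
    return $result;
}

// Shortened, made-up versions of the two files.
$first  = "interface Vlan11\n ip address 10.10.10.10 255.255.255.255\n!\nvlan 17\n name sth\n!\n";
$second = "interface Vlan11\n ip address 20.20.20.20 255.255.255.255\n!\n";

print_r(compareConfigs($first, $second));
```

For the sample files in the question this would report interface Vlan11 under changed (with the 20.20.20.20 address from second.txt) and vlan 17 under missing.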

This is one way to find the common and unique parts of the two files, where the parts are separated by '!'.

<?php

$first_txt = "interface Vlan11
 description xxx
 ip address 10.10.10.10 255.255.255.255
 shutdown
!
vlan 34
!
vlan 17
 name sth
!
route-map sth
 match ip address exm
 set ip next-hop 1.2.3.4
!
";


$second_txt = "interface Vlan11
 description xxx
 ip address 20.20.20.20 255.255.255.255
 shutdown
!
vlan 34
!
route-map sth
 match ip address exm
 set ip next-hop 1.2.3.4
!
";

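// Split each config on the '!' separator, trim the newlines around every
// part, and drop the empty part left over after the final '!'.
$first_parts = array_filter(array_map('trim', explode('!', $first_txt)), 'strlen');
$second_parts = array_filter(array_map('trim', explode('!', $second_txt)), 'strlen');

print_r($first_parts);
print_r($second_parts);

// Report which blocks of the first file also appear in the second.
foreach ($first_parts as $part)
{
    if (in_array($part, $second_parts, true))
    {
        echo "found in second_parts:\n$part\n\n";
    }
    else
    {
        echo "not found in second_parts:\n$part\n\n";
    }
}

// Repeat the check in the other direction.
foreach ($second_parts as $part)
{
    if (in_array($part, $first_parts, true))
    {
        echo "found in first_parts:\n$part\n\n";
    }
    else
    {
        echo "not found in first_parts:\n$part\n\n";
    }
}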
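One limitation of splitting with explode('!', ...) is that a ! inside a line (for example in a description) would also split the block. A line-by-line variant avoids that. The following is only a sketch, assuming a block ends at a line that consists solely of !; splitBlocks is a hypothetical name and the sample strings are made up.

```php
<?php
// Hypothetical helper: group lines into blocks, closing a block at each
// line that is exactly "!" once trailing whitespace is removed.
function splitBlocks(string $text): array
{
    $blocks = [];
    $current = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = rtrim($line);            // keep the leading indentation
        if ($line === '!') {
            if ($current !== []) {
                $blocks[] = implode("\n", $current);
            }
            $current = [];
        } elseif ($line !== '') {
            $current[] = $line;
        }
    }
    if ($current !== []) {               // trailing block without a closing "!"
        $blocks[] = implode("\n", $current);
    }
    return $blocks;
}

$first  = "vlan 34\n!\nvlan 17\n name sth!\n!\n";
$second = "vlan 34\n!\n";

// Blocks of $first that have no identical counterpart in $second.
print_r(array_values(array_diff(splitBlocks($first), splitBlocks($second))));
```

Because the blocks are rebuilt with "\n" and without the closing !, files with different line endings or trailing spaces still compare equal block by block.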