Alternative to regex for extracting information from a string

I'm attempting to extract information from a string that will always be in the same format.

The format will always be:

To:
                     Name here
Date:
                     26/08/2014 14:52
Order Number:
                     123456
Service Required:
                     Plumbing
Service Response:
                     48 Hour
Service Limit:
                     110.00

123 TEST ROAD
LEEDS
LS1 1HL

Contact:
                     Mr J Smith - 0777 123456
Telephone:
                     01921 123456

Work Details:

Notes here etc 

I have tried exploding the string by spaces and looping through the array, but I cannot structure it in such a way that I get the information I need.

For example, I want to retrieve "Name here" from after "To:" without also retrieving "Date:" and so on. The eventual idea is to create a variable for each piece of information so I can enter it into a database.

Any help, suggestions, or ideas are welcome.

Thanks for reading.

You could use regex easily.

If you use this regex, you can get the name here:

To:\s+(.*)

The idea of this regex is to look for the key you want and capture its value. For instance, the regex above looks for To: followed by whitespace, then stores the rest of the line in a capturing group.

To get a different field, just change To to the key you want; if you change it to Date, you will get the date.

As a note, this only works with single-line values.

The code to implement this regex in PHP is very straightforward:

$re = "/To:\\s+(.*)/";
$str = "YOUR STRING HERE";
preg_match($re, $str, $matches);
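
The same pattern can be wrapped in a small helper so the label becomes a parameter. This is only a sketch: the function name `fieldValue` and the sample `$order` string are made up for illustration, the `^` anchor with the `m` modifier is an added assumption so the label must start a line, and like the pattern above it only handles values that fit on one line.

```php
<?php
// Hypothetical helper: returns the single-line value that follows "$label:",
// or null when the label is not present in $text.
function fieldValue(string $text, string $label): ?string
{
    // preg_quote() escapes any regex metacharacters in the label.
    $re = '/^' . preg_quote($label, '/') . ':\s+(.*)/m';
    if (preg_match($re, $text, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// Made-up sample in the same layout as the question.
$order = "To:\n    Name here\nDate:\n    26/08/2014 14:52\nOrder Number:\n    123456\n";

echo fieldValue($order, 'To'), "\n";
echo fieldValue($order, 'Date'), "\n";
echo fieldValue($order, 'Order Number'), "\n";
```

Returning null for a missing label lets the caller decide whether a missing field is an error or just optional.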

On the other hand, the data below follows a different pattern:

123 TEST ROAD
LEEDS
LS1 1HL

You'd need a different regex pattern too, so to fetch that information you could use:

^(\w+[\w\s]+)(?!:)$
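
A sketch of how that pattern might be applied in PHP, assuming the `m` modifier so that `^` and `$` match at line boundaries. The `$order` sample is shortened and made up. Because `\s` also matches newlines, the consecutive address lines come back as a single match, which is then trimmed.

```php
<?php
// Shortened, made-up sample in the same layout as the question.
$order = "Service Limit:\n    110.00\n\n123 TEST ROAD\nLEEDS\nLS1 1HL\n\nContact:\n    Mr J Smith - 0777 123456\n";

// "m" lets ^ and $ match at the start and end of each line.
$re = '/^(\w+[\w\s]+)(?!:)$/m';

$address = null;
if (preg_match($re, $order, $matches) === 1) {
    // The match may carry a trailing newline, so trim it.
    $address = trim($matches[1]);
}

// Split into individual lines if that is more convenient for storage.
$addressLines = $address === null ? [] : explode("\n", $address);
print_r($addressLines);
```

preg_match() stops at the first match, which is the address here. Unindented free-text lines later in the message would also satisfy this pattern, so preg_match_all() would return more than the address.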

Later update (the work done)

Here's the fully working script. I think you'll appreciate how flexible it is!

$s = "To:
          Name here
Date:
          26/08/2014 14:52
Order Number:
          123456
Service Required:
          Plumbing
Service Response:
          48 Hour
Service Limit:
          110.00

123 TEST ROAD
LEEDS
LS1 1HL

Contact:
          Mr J Smith - 0777 123456
Telephone:
          01921 123456

Work Details:

Notes here etc ";


$a = Array(
  Array("To:", "Date:" ),
  Array("Date:", "Order Number:" ),
  Array("Order Number:", "Service Required:" ),
  Array("Service Limit:", 'Contact:' ),
  //etc
);  

foreach ($a as $anchors)  {
  $t = explode ($anchors[0], " ".$s );
  $t = explode ($anchors[1], $t[1]  );
  $r = trim($t[0]);
  echo $anchors[0] . " [" . $r . "]\n";
}

Which will produce:

augusto@cubo:~/Documents$ php script.php
To: [Name here]
Date: [26/08/2014 14:52]
Order Number: [123456]
Service Limit: [110.00

123 TEST ROAD
LEEDS
LS1 1HL]

Older answer (the concept)

This doesn't seem too difficult.

You have many good anchors to work on! explode() will be a good friend.

$tmp = explode ('anchor-before', $string  );
$tmp = explode ('anchor-after', $tmp[1]) ;
$res = trim($tmp[0]);
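
The double explode() can be wrapped in a function that also copes with a missing anchor. This is a minimal sketch; the name `extractBetween` and the sample `$order` string are made up for illustration.

```php
<?php
// Hypothetical helper: returns the trimmed text between two anchors,
// or null if either anchor is missing.
function extractBetween(string $text, string $before, string $after): ?string
{
    // Limit of 2: split only at the first occurrence of the anchor.
    $tmp = explode($before, $text, 2);
    if (count($tmp) < 2) {
        return null; // $before not found
    }
    $tmp = explode($after, $tmp[1], 2);
    if (count($tmp) < 2) {
        return null; // $after not found
    }
    return trim($tmp[0]);
}

// Made-up sample in the same layout as the question.
$order = "To:\n    Name here\nDate:\n    26/08/2014 14:52\nOrder Number:\n    123456\n";

var_dump(extractBetween($order, 'To:', 'Date:'));
var_dump(extractBetween($order, 'Date:', 'Order Number:'));
var_dump(extractBetween($order, 'Fax:', 'Date:'));
```

Without the count() checks, a missing anchor would make `$tmp[1]` an undefined index instead of a clean null.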

If you don't want to use a regex, since you are looking for the first field content, you can use a double explode:

$firstfield = trim(explode("\n", explode(':', $data, 3)[1])[1]);

var_dump($firstfield);

Otherwise to obtain fields and values with a regex, you can use this:

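// The lookahead requires the next label to start a line (^), so a value
// such as "14:52" is not mistaken for a label.
$pattern = '~^(\w+(?: \w+)*):\s*(.+?)\s*(?=^(?1):|\z)~ms';

preg_match_all($pattern, $data, $m, PREG_SET_ORDER);

$results = [];
foreach ($m as $v) {
    $results[$v[1]] = $v[2];
}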

echo $results['To'];
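
Since the goal in the question is to store the values in a database, the raw strings probably need converting first. The sketch below assumes an array in the shape the loop above produces; the `$results` contents are made up from the sample message, and the variable names are only illustrative. It also assumes the address block sits after the "Service Limit" value, separated by a blank line, as in the sample.

```php
<?php
// Made-up array in the shape the loop above produces (label => raw string).
$results = [
    'Date'          => '26/08/2014 14:52',
    'Order Number'  => '123456',
    'Service Limit' => "110.00\n\n123 TEST ROAD\nLEEDS\nLS1 1HL",
];

// The date in the question is day/month/year with a 24-hour time.
$date = DateTimeImmutable::createFromFormat('d/m/Y H:i', $results['Date']);
if ($date === false) {
    throw new RuntimeException('Unexpected date format');
}

// The address block follows the limit, so split at the first blank line.
$parts   = preg_split('/\R\s*\R/', $results['Service Limit'], 2);
$limit   = (float) trim($parts[0]);
$address = isset($parts[1]) ? trim($parts[1]) : '';

$orderNumber = (int) $results['Order Number'];

echo $date->format('Y-m-d H:i'), "\n";
echo $orderNumber, "\n";
echo number_format($limit, 2), "\n";
echo $address, "\n";
```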

A regex is not that difficult. Try this one.

 # '~(?msi)^To:\s*(.*?)\s*^Date:\s*(.*?)\s*^Order\ Number:\s*(.*?)\s*^Service\ Required:\s*(.*?)\s*^Service\ Response:\s*(.*?)\s*^Service\ Limit:\s*(.*?)\s*^Contact:\s*(.*?)\s*^Telephone:\s*(.*?)\s*^Work\ Details:\s*(.*?)\s*~'

 (?msi)
 ^ To: \s* 
 ( .*? )                            # (1)
 \s* 
 ^ Date: \s* 
 ( .*? )                            # (2)
 \s* 
 ^ Order\ Number: \s* 
 ( .*? )                            # (3)
 \s* 
 ^ Service\ Required: \s* 
 ( .*? )                            # (4)
 \s* 
 ^ Service\ Response: \s* 
 ( .*? )                            # (5)
 \s* 
 ^ Service\ Limit: \s* 
 ( .*? )                            # (6)
 \s* 
 ^ Contact: \s* 
 ( .*? )                            # (7)
 \s* 
 ^ Telephone: \s* 
 ( .*? )                            # (8)
 \s* 
 ^ Work\ Details: \s* 
 ( .*? )                            # (9)
 \s* 

Output

 **  Grp 1 -  ( pos 26 , len 9 ) 
Name here  
 **  Grp 2 -  ( pos 65 , len 16 ) 
26/08/2014 14:52  
 **  Grp 3 -  ( pos 119 , len 6 ) 
123456  
 **  Grp 4 -  ( pos 167 , len 8 ) 
Plumbing  
 **  Grp 5 -  ( pos 217 , len 7 ) 
48 Hour  
 **  Grp 6 -  ( pos 263 , len 39 ) 
110.00

123 TEST ROAD
LEEDS
LS1 1HL  
 **  Grp 7 -  ( pos 337 , len 24 ) 
Mr J Smith - 0777 123456  
 **  Grp 8 -  ( pos 396 , len 12 ) 
01921 123456  
 **  Grp 9 -  ( pos 427 , len 0 )  EMPTY
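
For use in PHP, the long pattern could be assembled from a list of labels rather than typed by hand. The sketch below is an assumption-laden variant, not the exact regex above: it drops the free-spacing layout, and it appends `\z` so that the last lazy group has to run to the end of the string instead of coming back empty as in the output above. The `$order` sample is rebuilt from the message in the question.

```php
<?php
// Labels in the order they appear in the message (from the question).
$labels = ['To', 'Date', 'Order Number', 'Service Required', 'Service Response',
           'Service Limit', 'Contact', 'Telephone', 'Work Details'];

// Build "^Label:\s*(.*?)\s*" for every label. The closing \z is an addition:
// it forces the last lazy group to run to the end of the string.
$pattern = '~';
foreach ($labels as $label) {
    $pattern .= '^' . preg_quote($label, '~') . ':\s*(.*?)\s*';
}
$pattern .= '\z~msi';

// Sample rebuilt from the question's message.
$order = implode("\n", [
    'To:', '    Name here',
    'Date:', '    26/08/2014 14:52',
    'Order Number:', '    123456',
    'Service Required:', '    Plumbing',
    'Service Response:', '    48 Hour',
    'Service Limit:', '    110.00',
    '', '123 TEST ROAD', 'LEEDS', 'LS1 1HL', '',
    'Contact:', '    Mr J Smith - 0777 123456',
    'Telephone:', '    01921 123456',
    '', 'Work Details:', '', 'Notes here etc ',
]);

$fields = [];
if (preg_match($pattern, $order, $m) === 1) {
    // $m[0] is the whole match; groups 1..9 line up with $labels.
    $fields = array_combine($labels, array_slice($m, 1));
}
print_r($fields);
```

Because every label must appear in order, a message with a missing or reordered field will not match at all, which leaves `$fields` empty rather than partially filled.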