I'm attempting to extract information from a string that will always be in the same format.
The format will always be:
To:
Name here
Date:
26/08/2014 14:52
Order Number:
123456
Service Required:
Plumbing
Service Response:
48 Hour
Service Limit:
110.00
123 TEST ROAD
LEEDS
LS1 1HL
Contact:
Mr J Smith - 0777 123456
Telephone:
01921 123456
Work Details:
Notes here etc
I have tried exploding the string on spaces and looping through the array, but I cannot structure it in a way that gives me the information.
E.g. I try to retrieve "Name here" from after "To:" without also retrieving "Date:" and the rest. The eventual idea is to create a variable for each piece of information so I can enter it into a database.
Any help, suggestions or ideas are really welcome.
Thanks for reading.
You could do this easily with a regex.
With this regex you can get the name:
To:\s+(.*)
The idea of this regex is to look for the key you want and fetch its value. For instance, the regex above looks for To: followed by whitespace and stores the content that comes after it in a capturing group.
You just need to change To to whatever key you want; if you change it to Date you will get the date.
As a note, this only works with single-line values.
The code to implement this regex in php is very straightforward, like this:
$re = "/To:\\s+(.*)/";
$str = "YOUR STRING HERE";
preg_match($re, $str, $matches);
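To avoid writing one pattern per key, the same idea can be wrapped in a small helper. This is only a minimal sketch and not part of the answer above: the function name fieldValue is hypothetical, and it assumes each label sits at the start of a line and is followed by a single-line value.

```php
<?php
// Hypothetical helper: returns the single-line value that follows "$key:".
function fieldValue(string $key, string $text): ?string
{
    // preg_quote() escapes any regex metacharacters in the key;
    // the "m" flag lets ^ match at the start of every line.
    $re = '/^' . preg_quote($key, '/') . ':\s+(.*)/m';
    if (preg_match($re, $text, $matches)) {
        return trim($matches[1]);
    }
    return null; // key not found
}

$str = "To:\nName here\nDate:\n26/08/2014 14:52\nOrder Number:\n123456";

echo fieldValue('To', $str), "\n";           // Name here
echo fieldValue('Date', $str), "\n";         // 26/08/2014 14:52
echo fieldValue('Order Number', $str), "\n"; // 123456
```

Returning null for a missing key lets the caller tell "not found" apart from an empty value.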
On the other hand, the data below follows a different pattern, because it has no label:
123 TEST ROAD
LEEDS
LS1 1HL
It needs a different regex pattern too, so to fetch that information you could use:
^(\w+[\w\s]+)(?!:)$
Later update (the work done)
Here's the fully working script. I guess you'll appreciate here how flexible it is!
$s = "To:
Name here
Date:
26/08/2014 14:52
Order Number:
123456
Service Required:
Plumbing
Service Response:
48 Hour
Service Limit:
110.00
123 TEST ROAD
LEEDS
LS1 1HL
Contact:
Mr J Smith - 0777 123456
Telephone:
01921 123456
Work Details:
Notes here etc ";
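// Each pair holds the label that precedes a value and the label that follows it.
$a = array(
    array("To:", "Date:"),
    array("Date:", "Order Number:"),
    array("Order Number:", "Service Required:"),
    array("Service Limit:", "Contact:"),
    // etc.
);
foreach ($a as $anchors) {
    // Keep everything after the first anchor...
    $t = explode($anchors[0], " " . $s);
    // ...then cut it off at the second anchor.
    $t = explode($anchors[1], $t[1]);
    $r = trim($t[0]);
    echo $anchors[0] . " [" . $r . "]\n";
}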
Which will produce:
augusto@cubo:~/Documents$ php script.php
To: [Name here]
Date: [26/08/2014 14:52]
Order Number: [123456]
Service Limit: [110.00
123 TEST ROAD
LEEDS
LS1 1HL]
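Because the address lines have no label of their own, they end up inside the Service Limit value, as the output above shows. One possible way to separate them is to split that block on line breaks. This is a sketch under the assumption that the limit is always the first line of the block; the variable names are illustrative.

```php
<?php
// Value captured between "Service Limit:" and "Contact:" by the script above.
$block = "110.00\n123 TEST ROAD\nLEEDS\nLS1 1HL";

// Split on any newline style (\R), then trim each line.
$lines = array_map('trim', preg_split('/\R/', $block));

// Assumption: the first line is the limit, the remaining lines are the address.
$limit   = array_shift($lines);
$address = implode(', ', $lines);

echo $limit, "\n";   // 110.00
echo $address, "\n"; // 123 TEST ROAD, LEEDS, LS1 1HL
```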
Older answer (the concept)
This doesn't seem too difficult. You have many good anchors to work with, and explode() will be a good friend.
$tmp = explode ('anchor-before', $string );
$tmp = explode ('anchor-after', $tmp[1]) ;
$res = trim($tmp[0]);
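The three lines above assume both anchors are present; if the first one is missing, $tmp[1] does not exist. A small wrapper can guard against that. This is a sketch only, and the function name between is hypothetical.

```php
<?php
// Hypothetical helper: text between two anchors, or null if either is missing.
function between(string $before, string $after, string $string): ?string
{
    $tmp = explode($before, $string, 2);
    if (count($tmp) < 2) {
        return null; // anchor-before not found
    }
    $tmp = explode($after, $tmp[1], 2);
    if (count($tmp) < 2) {
        return null; // anchor-after not found
    }
    return trim($tmp[0]);
}

$string = "To:\nName here\nDate:\n26/08/2014 14:52\nOrder Number:\n123456";

var_dump(between('To:', 'Date:', $string));           // string(9) "Name here"
var_dump(between('Date:', 'Order Number:', $string)); // string(16) "26/08/2014 14:52"
var_dump(between('Fax:', 'Date:', $string));          // NULL
```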
If you don't want to use a regex, since you are looking for the first field content, you can use a double explode:
$firstfield = trim(explode("\n", explode(':', $data, 3)[1])[1]);
var_dump($firstfield);
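Otherwise, to obtain the fields and values with a regex, you can use this:
// The ^ inside the lookahead makes sure the next label starts a line,
// so the "14:" in a time such as 14:52 is not mistaken for a label.
$pattern = '~^(\w+(?: \w+)*):\s*(.+?)\s*(?=^(?1):|\z)~ms';
preg_match_all($pattern, $data, $m, PREG_SET_ORDER);
$results = array();
foreach ($m as $v) {
    $results[$v[1]] = $v[2];
}
echo $results['To'];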
A regex is not that difficult. Try this one.
# '~(?msi)^To:\s*(.*?)\s*^Date:\s*(.*?)\s*^Order\ Number:\s*(.*?)\s*^Service\ Required:\s*(.*?)\s*^Service\ Response:\s*(.*?)\s*^Service\ Limit:\s*(.*?)\s*^Contact:\s*(.*?)\s*^Telephone:\s*(.*?)\s*^Work\ Details:\s*(.*?)\s*~'
(?msi)
^ To: \s*
( .*? ) # (1)
\s*
^ Date: \s*
( .*? ) # (2)
\s*
^ Order\ Number: \s*
( .*? ) # (3)
\s*
^ Service\ Required: \s*
( .*? ) # (4)
\s*
^ Service\ Response: \s*
( .*? ) # (5)
\s*
^ Service\ Limit: \s*
( .*? ) # (6)
\s*
^ Contact: \s*
( .*? ) # (7)
\s*
^ Telephone: \s*
( .*? ) # (8)
\s*
^ Work\ Details: \s*
( .*? ) # (9)
\s*
Output
** Grp 1 - ( pos 26 , len 9 )
Name here
** Grp 2 - ( pos 65 , len 16 )
26/08/2014 14:52
** Grp 3 - ( pos 119 , len 6 )
123456
** Grp 4 - ( pos 167 , len 8 )
Plumbing
** Grp 5 - ( pos 217 , len 7 )
48 Hour
** Grp 6 - ( pos 263 , len 39 )
110.00
123 TEST ROAD
LEEDS
LS1 1HL
** Grp 7 - ( pos 337 , len 24 )
Mr J Smith - 0777 123456
** Grp 8 - ( pos 396 , len 12 )
01921 123456
** Grp 9 - ( pos 427 , len 0 ) EMPTY
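Group 9 comes back empty because the last lazy group has nothing after it to force it to expand. One way to capture the notes, sketched here on a shortened version of the pattern that keeps only the last three fields, is to end the pattern with \z so the final group has to run to the end of the input. The $data string is sample input assumed for illustration.

```php
<?php
$data = "Contact:\nMr J Smith - 0777 123456\nTelephone:\n01921 123456\nWork Details:\nNotes here etc ";

// Same structure as the pattern above, shortened to the last three fields.
// The trailing \s*\z forces the final lazy group to reach the end of the input.
$pattern = '~(?msi)^Contact:\s*(.*?)\s*^Telephone:\s*(.*?)\s*^Work\ Details:\s*(.*?)\s*\z~';

if (preg_match($pattern, $data, $m)) {
    echo $m[1], "\n"; // Mr J Smith - 0777 123456
    echo $m[2], "\n"; // 01921 123456
    echo $m[3], "\n"; // Notes here etc
}
```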