How do I extract content from text?

I have a script that returns information about an IP address.

I want to extract the country from the text.

In the following text, the country line is "Country: US".

I want to display only: US

The text is:

[Querying whois.arin.net]
[whois.arin.net]
#
# Query terms are ambiguous.  The query is assumed to be:
#     "n 173.194.74.100"
#
# Use "?" to get help.
#

#
# The following results may also be obtained via:
# http://whois.arin.net/rest/nets;q=173.194.74.100?showDetails=true&showARIN=false&ext=netref2
#

NetRange:       173.194.0.0 - 173.194.255.255
CIDR:           173.194.0.0/16
OriginAS:       AS15169
NetName:        GOOGLE
NetHandle:      NET-173-194-0-0-1
Parent:         NET-173-0-0-0-0
NetType:        Direct Allocation
RegDate:        2009-08-17
Updated:        2012-02-24
Ref:            http://whois.arin.net/rest/net/NET-173-194-0-0-1


OrgName:        Google Inc.
OrgId:          GOGL
Address:        1600 Amphitheatre Parkway
City:           Mountain View
StateProv:      CA
PostalCode:     94043
Country:        US
RegDate:        2000-03-30
Updated:        2011-09-24
Ref:            http://whois.arin.net/rest/org/GOGL

OrgTechHandle: ZG39-ARIN
OrgTechName:   Google Inc
OrgTechPhone:  +1-650-253-0000 
OrgTechEmail:  arin-contact@google.com
OrgTechRef:    http://whois.arin.net/rest/poc/ZG39-ARIN

OrgAbuseHandle: ZG39-ARIN
OrgAbuseName:   Google Inc
OrgAbusePhone:  +1-650-253-0000 
OrgAbuseEmail:  arin-contact@google.com
OrgAbuseRef:    http://whois.arin.net/rest/poc/ZG39-ARIN

#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#

If you only need the regex, try this. The country code will be in the first capture group:

Country:\s*([A-Z]{2})
  • Country: - match the literal text
  • \s* - match any number of whitespace characters (spaces, tabs, etc.)
  • ([A-Z]{2}) - match and capture exactly two uppercase letters

Use preg_match_all if you need all occurrences of this pattern.
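A minimal sketch of the preg_match_all variant, assuming a response that contains more than one Country line. The sample text in $str is a shortened, made-up stand-in (the second organisation is hypothetical), not real whois output.

```php
<?php
// Shortened, hypothetical whois response with two "Country:" lines.
$str = "OrgName:        Google Inc.\nCountry:        US\n\n"
     . "OrgName:        Example Ltd.\nCountry:        GB\n";

// $matches[0] holds the full matches, $matches[1] the first capture group.
preg_match_all('/Country:\s*([A-Z]{2})/', $str, $matches);
$codes = $matches[1];

print_r($codes);
```

With the default PREG_PATTERN_ORDER flag, every captured country code ends up in $matches[1], so no extra loop is needed to collect them.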

With preg_match you can do something like:

if (preg_match('/^Country:\s*([A-Z]{2,3})$/m', $str, $match)) {
    echo $match[1];
}

There is a phpwhois library for working with whois data. It'll get you the response as an array.

Extract with preg_match:

// Capture the rest of the "Country:" line; trim() strips the padding.
if (preg_match('/Country:(.*)$/mi', $str, $match)) {
    echo trim($match[1]);
}

$regex = "/country:[\ \t\n\f][A-Z]+\s/";

$txt = "descr: NCC#200X44704917
country: FR
admin-c: ACPSA223-RIPE
tech-c: TCWQQP8-RIPE";

preg_match($regex, $txt, $result);

print_r($result);

------------------------------------
Array ( [0] => country: FR )
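The pattern above returns the whole "country: FR" fragment rather than the code alone. As a rough sketch, a capture group plus the i and m modifiers can handle both the ARIN-style "Country:" line and the RIPE-style "country:" line. The function name extractCountry and both sample strings are assumptions made for illustration; they are not part of any library.

```php
<?php
/**
 * Return the country code from a whois response, or null if none is found.
 * extractCountry is a hypothetical helper name used only for this example.
 */
function extractCountry(string $whois): ?string
{
    // m: ^ and $ match at line boundaries; i: accept "Country:" and "country:".
    // [ \t]* keeps the match on one line, which \s* would not guarantee.
    if (preg_match('/^country:[ \t]*([A-Z]{2})[ \t\r]*$/mi', $whois, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Shortened samples in the two layouts shown in this thread.
$arin = "PostalCode:     94043\nCountry:        US\nRegDate:        2000-03-30\n";
$ripe = "descr: NCC#200X44704917\ncountry: FR\nadmin-c: ACPSA223-RIPE";

echo extractCountry($arin), "\n";
echo extractCountry($ripe), "\n";
```

Returning null when nothing matches lets the caller tell "no country line" apart from an empty string.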