Capitalizing city names when they start with an apostrophe

I have an issue with the correct capitalization of (Dutch) city names when they start with an apostrophe. For instance, I could have the names:

'S-HERTOGENBOSCH or 's gravendeel or 'T Harde

What I would like to do is convert everything to lowercase and then capitalize the first letter after the prefix 'S or 's or 'T. So the outcome should be:

's-Hertogenbosch and 's Gravendeel and 't Harde

I'm thinking about using a Regex to do this but am not quite sure yet how this should be done. Could someone point me in the right direction?

Thanks!

Try the following function, which is based on sanchises' regex (I edited it slightly...):

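function dutch_city_name($name) {
    $name = strtolower(trim($name));
    $matches = array();
    // Anchored with ^ so that only a prefix at the start of the name matches.
    preg_match("/^'([a-z])( |-)[a-z]*/", $name, $matches);
    if(count($matches) == 0) {
        // No apostrophe prefix: the lowercased name is returned unchanged.
        return $name;
    }
    // Keep the 3-character prefix lowercase and capitalize the rest of the name.
    return "'".$matches[1].$matches[2].ucfirst(substr($name, 3));
}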

I tried it and it is working.

Firstly, I would like to recommend websites like regex101.com or an equivalent. Then, let me talk you through a very basic regex: you want the literal "'" character followed by exactly one other character, which you would like to match in order to 'uncapitalize' it, then a space or a dash, and then a whole word.

Basically, you need to match something of the form '([a-zA-Z])(?: |\-)[a-zA-Z]*. From left to right:

  • ' Literal '
  • ([a-zA-Z]) Single character in the alphabet, lower- or uppercase. Is a capturing group.
  • (?: |\-) Either a space or a dash. Is a non-capturing group.
  • [a-zA-Z]* A series of characters in the alphabet. Could be ([a-zA-Z]*) if you want to capture this bit too.

Now that you have your match, all you need to do is replace it with the uncapitalized version, for example using a PHP function.
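To see what the capturing group picks up, here is a minimal sketch using preg_match. The helper name prefix_letter is made up for illustration, and the ^ anchor is my own addition, assuming the prefix always sits at the start of the name.

```php
<?php
// Hypothetical helper (not part of the answer above): returns the single
// letter captured after the apostrophe, or null if the name has no prefix.
function prefix_letter(string $name): ?string {
    // ^'          literal apostrophe at the start (the ^ anchor is an assumption)
    // ([a-zA-Z])  capturing group: the one letter that should end up lowercase
    // (?: |\-)    non-capturing group: a space or a dash
    // [a-zA-Z]*   the rest of the first word
    if (preg_match("/^'([a-zA-Z])(?: |\\-)[a-zA-Z]*/", $name, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(prefix_letter("'S-HERTOGENBOSCH")); // string(1) "S"
var_dump(prefix_letter("'t Harde"));         // string(1) "t"
var_dump(prefix_letter("Amsterdam"));        // NULL
```

Only the letter ends up in $m[1]; the space or dash is matched but not captured, which is why you would turn (?: |\-) into a capturing group if the replacement needs to put the separator back.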

You could use preg_replace_callback.

$city = strtolower("'T-HERTOGENBOSCH");

echo preg_replace_callback("/('(s|t)( |\-))([a-z])/", function($matches) {
    return $matches[1] . ucfirst($matches[4]);
}, $city);

The pattern uses multiple subpatterns, whose results are reassembled in the callback function:

('(s|t)( |\-)) # Apostrophe, then 's' or 't', then '<space>' or '-'
([a-z])        # The following lowercased character

Note that I've wrapped the first part into an additional subpattern. This makes reassembling it simpler.
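For reuse, the same callback could be wrapped in a function and run over all three names from the question. This is only a sketch: the function name fix_prefixed_city is hypothetical, and the ^ anchor is an added assumption that the prefix only occurs at the start of the name.

```php
<?php
// Hypothetical wrapper around the preg_replace_callback approach.
function fix_prefixed_city(string $name): string {
    $name = strtolower(trim($name));
    // ^ ties the prefix to the start of the name (an added assumption).
    return preg_replace_callback("/^('(s|t)( |\\-))([a-z])/", function ($matches) {
        // $matches[1] is the whole prefix, $matches[4] the first letter of the name.
        return $matches[1] . ucfirst($matches[4]);
    }, $name);
}

foreach (["'S-HERTOGENBOSCH", "'s gravendeel", "'T Harde"] as $raw) {
    echo fix_prefixed_city($raw), "\n";
}
// 's-Hertogenbosch
// 's Gravendeel
// 't Harde
```

Names without the prefix simply come back lowercased, because the pattern does not match and preg_replace_callback leaves the string alone.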

Here's one without regex. It simply checks if the first character is an apostrophe and if so, skips the character after the apostrophe when searching for the first letter to capitalize.

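function capitalizeCityName($name) {

    $name = strtolower(trim($name));
    if($name === '') {
        return $name; // nothing to capitalize
    }
    // With a leading apostrophe, skip it and the prefix letter (start at index 2);
    // otherwise start at the very first character.
    $i = ($name[0] === "'") ? 2 : 0;

    for(; $i<strlen($name); $i++) {
        if(ctype_alpha($name[$i])) {
            $name[$i] = strtoupper($name[$i]);
            break;
        }
    }

    return $name;
}

print capitalizeCityName("'T Harde"); //'t Harde
print capitalizeCityName("Harde"); //Harde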

I don't know whether the PHP replace function you want to use supports changing the case of letters in a dynamic replacement string. But the following worked with the Perl regular expression engine in the text editor UltraEdit v21.10.

Search string:

'([STst])(\W)(\w)([\w\-]+)

Replace string:

'\L\1\E\2\U\3\E\L\4\E

or

'\l\1\2\u\3\L\4\E

The search string matches:

  • a straight apostrophe,
  • followed by character s or t in any case marked for backreferencing as string 1,
  • a single non-word character marked for backreferencing as string 2,
  • a single word character marked for backreferencing as string 3,
  • 1 or more additional word characters or hyphens marked for backreferencing as string 4.

The replace string:

  • keeps the apostrophe,
  • first marked string (character s or t in any case) converted to lower case,
  • second marked string unmodified,
  • third marked string (first word character of city name) converted to upper case,
  • fourth marked string converted to lower case.

Explanation of the special characters in replace string:

  • \l ... convert only next character to lower case.
  • \u ... convert only next character to upper case.
  • \L ... convert all characters up to \E to lower case.
  • \U ... convert all characters up to \E to upper case.

Note: The case conversion works only for the ASCII letters A-Za-z and not for language specific, localized letters like German umlauts, characters with an accent, etc.
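As far as I know, PHP's preg_replace does not interpret \l, \u, \L or \U in the replacement string, so in PHP the case changes would have to be done in a callback. A rough sketch that emulates the second replace string with the same search pattern (the function name apply_case_replace is made up for illustration):

```php
<?php
// Sketch only: emulates the replace string '\l\1\2\u\3\L\4\E with a callback.
function apply_case_replace(string $text): string {
    return preg_replace_callback("/'([STst])(\\W)(\\w)([\\w\\-]+)/", function ($m) {
        return "'" . strtolower($m[1])   // \l\1   : prefix letter to lower case
             . $m[2]                     // \2     : separator unchanged
             . strtoupper($m[3])         // \u\3   : first letter of the name to upper case
             . strtolower($m[4]);        // \L\4\E : rest of the name to lower case
    }, $text);
}

echo apply_case_replace("'S-HERTOGENBOSCH"), "\n"; // 's-Hertogenbosch
echo apply_case_replace("'T Harde"), "\n";         // 't Harde
```

strtolower and strtoupper are used here as byte-oriented functions, so the same caveat about non-ASCII letters applies.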