Regex - replacing spaces inside a stored element

I have an HTML page in a string, and I need to replace all the spaces in the a href references with %20 so my parser understands it.

So for example:

<a href="file with spaces.mp3">file with spaces.mp3</a>

needs to turn into

<a href="file%20with%20spaces.mp3">file with spaces.mp3</a>

One space works fine since I can just use

(.+?)([ *])(.+?)

and then substitute %20 in between $1 and $3.

But how would you do it for multiple spaces, when the number of spaces is unknown, while still keeping the file name to put the %20's in between?

While using regex to process HTML is not recommended, here's a regex that works for your example:

(?:<a href="|\G)\S*\K (?=[^">]*")


(?:
  <a href="    # Match <a href=" literally
|
  \G           # Or continue from where the previous match ended
)
\S*            # Match any non-space characters
\K             # Reset the match start so only what follows is replaced
[ ]            # Match a single space (the character that gets replaced)
(?=[^">]*")    # Ensure the space is still inside the href value
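To make the breakdown above concrete, here is a minimal sketch of applying the pattern with preg_replace. The variable names, the ~ delimiters and the sample strings are my own assumptions. The second pattern is a hardening I added, not part of the answer: \G also matches at the very start of the subject, so the original pattern can rewrite spaces in text that precedes the first link, and (?!\A)\G is one way to rule that out.

```php
<?php
// Sample input taken from the question.
$html = '<a href="file with spaces.mp3">file with spaces.mp3</a>';

// The pattern from the answer, wrapped in ~ delimiters so that the
// double quotes need no escaping. Each matched space becomes %20.
$pattern = '~(?:<a href="|\G)\S*\K (?=[^">]*")~';
$result = preg_replace($pattern, '%20', $html);

// Assumed hardening: (?!\A) stops \G from matching at offset 0,
// so plain text in front of the first link is left untouched.
$guardedPattern = '~(?:<a href="|(?!\A)\G)\S*\K (?=[^">]*")~';
$withPrefix = 'Hello world <a href="a b.mp3">x</a>';
$guardedResult = preg_replace($guardedPattern, '%20', $withPrefix);

echo $result, PHP_EOL;
echo $guardedResult, PHP_EOL;
```

Only the spaces inside the href value are replaced in both cases; the link text keeps its spaces.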

The above regex can still break on certain edge cases, so I recommend using DOMDocument, as in Amal's excellent answer, which is more robust.
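If a regex has to be used anyway, a different technique sidesteps the "unknown number of spaces" problem: capture the whole href value once and let str_replace handle the spaces inside a callback. This is a sketch under my own assumptions (the pattern, the variable names and the sample string are not from the question), and it shares the usual limits of regex on HTML, for example single-quoted or unquoted attributes.

```php
<?php
// Hypothetical sample with several spaces, including a double space.
$html = '<a href="a b  c.mp3">a b  c.mp3</a>';

// Group 1: the opening tag up to and including href="
// Group 2: the attribute value
// Group 3: the closing quote
$result = preg_replace_callback(
    '~(<a\b[^>]*?\bhref=")([^"]*)(")~i',
    function (array $m): string {
        // Replace every space in the value, however many there are.
        return $m[1] . str_replace(' ', '%20', $m[2]) . $m[3];
    },
    $html
);

echo $result, PHP_EOL;
```

Because the callback sees the complete value, there is no need to know in advance how many spaces it contains.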

HTML is not a regular language and cannot be properly parsed using a regular expression. Use a DOM parser instead. Here's a solution using PHP's built-in DOMDocument class:

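$dom = new DOMDocument;
$dom->loadHTML($html); // parse the HTML string into a DOM tree

// Walk every <a> element and encode the spaces in its href
foreach ($dom->getElementsByTagName('a') as $tag) {
    $href = $tag->getAttribute('href');
    $href = str_replace(' ', '%20', $href);
    $tag->setAttribute('href', $href);
}

$html = $dom->saveHTML(); // serialize the modified tree back to a string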

It iterates over all the links and rewrites each href attribute using str_replace.
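For anyone who wants to try this on a fragment rather than a full page, here is a self-contained sketch. The sample markup, the variable names and the hasAttribute guard are my own assumptions. The LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD flags ask libxml not to wrap the fragment in html/body tags or add a doctype; they behave best when the fragment has a single root element, as it does here.

```php
<?php
// Hypothetical fragment with a single root element.
$html = '<p>Listen: <a href="file with spaces.mp3">file with spaces.mp3</a></p>';

$dom = new DOMDocument;
// Parse the fragment without implied wrappers or a default doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

foreach ($dom->getElementsByTagName('a') as $tag) {
    // Skip anchors that have no href (for example named anchors).
    if (!$tag->hasAttribute('href')) {
        continue;
    }
    $tag->setAttribute('href', str_replace(' ', '%20', $tag->getAttribute('href')));
}

$link = $dom->getElementsByTagName('a')->item(0);
$newHref = $link->getAttribute('href');
$linkText = $link->textContent;
$result = $dom->saveHTML();

echo $newHref, PHP_EOL;
echo $result;
```

Reading the attribute back through the DOM confirms the change independently of how a particular libxml version serializes the markup.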
