Preg_match: remove the tracking code from all URLs in XML

I want to extract all the URLs from an XML file, excluding the tracking code in each URL.

Here's an example of a URL; they all follow the same format:

http://www.domain.com.au/category/pXXXXXX?uni_id=XXXXXX&cid=1_demo_1

So the only thing that changes between the URLs is XXXXXX, which is a numeric value.

The end result I want is:

http://www.domain.com.au/category/pXXXXXX

I tried to use preg_replace in the code below, but it ended up replacing the whole URL with a random (I think) number:

$data = preg_replace('/http\:\/\/www\.domain\.com.au\/[^\?]+([^.]+)/','',$data);

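Match all URLs in the XML with preg_match_all() (preg_match() stops after the first match). The character class also excludes < and " so that a closing tag or attribute quote right after the URL is not captured:

preg_match_all('~(?:http|ftp)://[^\s<"]+~', $input, $matches);

Then use preg_replace(), matching only the part of the string that needs to be removed. preg_replace() returns the new string rather than modifying its argument, so keep the return value:

$urls = array();
foreach ($matches[0] as $value)
{
    // Drop everything from the "?" onwards (the tracking code).
    $urls[] = preg_replace('~\?.*$~', '', $value);
}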
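
If the goal is to clean the URLs in place instead of collecting them into an array, a single preg_replace() call over the whole XML string may be enough. The following is only a minimal sketch under some assumptions: strip_tracking() is a hypothetical helper name, the sample XML is made up, and the host is assumed to always be www.domain.com.au as in the question. It captures the part of the URL before the "?" and replaces the whole match with that captured part.

```php
<?php
// Hypothetical helper (not from the original post): removes the query
// string from every http://www.domain.com.au/ URL found in $xml.
function strip_tracking(string $xml): string
{
    // Group 1 keeps everything up to, but not including, the "?".
    // The "?" and the query string after it are matched but not captured,
    // so replacing the match with '$1' drops them.
    $pattern = '~(http://www\.domain\.com\.au/[^\s?<"]+)\?[^\s<"]*~';

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($pattern, '$1', $xml) ?? $xml;
}

// Made-up sample input; in well-formed XML the "&" appears as "&amp;".
$xml = '<item><link>http://www.domain.com.au/category/p123456'
     . '?uni_id=123456&amp;cid=1_demo_1</link></item>';

echo strip_tracking($xml), "\n";
// prints: <item><link>http://www.domain.com.au/category/p123456</link></item>
```

URLs without a "?" are left untouched because the pattern requires a literal "?" to match. For anything beyond a simple, predictable feed like this one, parsing the XML with a proper XML parser and cleaning each URL value separately is likely to be more robust than a regex over the raw text.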