Problem parsing logs in combined log format

I have changed my nginx log format from the default to a custom one that adds two fields, $request_time and $upstream_response_time. I'm using PHP to parse the logs.

I'm not great with regexes, but I tried to modify a regex I picked up from the question "Parse Apache log in PHP using preg_match".

The regex there is:

$regex = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$/';

This is what I'm trying instead:

$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$ ^(\S+) ^(\S+) /';

Where my input looks something like this:

$line = "127.0.0.1 - - [12/Nov/2015:13:39:19 -0500] \"GET /mj/feed/ HTTP/1.1\" 200 3276 \"-\" \"rogerbot/1.0 (http://www.moz.com/dp/rogerbot, rogerbot-crawler@moz.com)\" 0.254 0.254";

The two extra fields are 0.254 and 0.254 above.

So I'm trying to obtain [14] = 0.254 and [15] = 0.254.

I've tried playing around with the regex through live online regex tools without any luck.

Any help would be appreciated.

The ^ anchors the match to the start of the string (or of a line if the m modifier is used). As the first character inside a character class it negates the class instead. So

^(\S+) ^(\S+)

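doesn't work in the middle of your regex. The $ you left before those two groups has the same problem: it asserts the end of the string, so nothing after it can match.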
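To see the anchor problem in isolation, here is a minimal sketch that uses only the two trailing timing fields. The shortened input and the variable names ($tail, $broken, $fixed) are made up for illustration and are not part of your code.

```php
<?php
// Hypothetical shortened input: just the two timing fields from the end of the line.
$tail = '0.254 0.254';

// A second ^ in the middle can never match without the m modifier,
// because the position after "0.254 " is not the start of the string.
$broken = '/^(\S+) ^(\S+)$/';

// Anchoring once at the start and once at the end works.
$fixed = '/^(\S+) (\S+)$/';

$brokenResult = preg_match($broken, $tail);          // 0, no match
$fixedResult  = preg_match($fixed, $tail, $matches); // 1, match

var_dump($brokenResult, $fixedResult, $matches);
```

preg_match returns 0 for the first pattern and 1 for the second, with both values captured in $matches[1] and $matches[2].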

Give this a try:

^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" (\S+) (\S+)$

Regex101 Demo: https://regex101.com/r/lQ6zX9/1
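As a rough sketch of how that pattern could be used from PHP, assuming the sample line from the question and a single preg_match call (the error handling here is only illustrative):

```php
<?php
// Pattern from above, wrapped in / delimiters. Single quotes keep the backslashes intact.
$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" (\S+) (\S+)$/';

// Sample line taken from the question.
$line = "127.0.0.1 - - [12/Nov/2015:13:39:19 -0500] \"GET /mj/feed/ HTTP/1.1\" 200 3276 \"-\" \"rogerbot/1.0 (http://www.moz.com/dp/rogerbot, rogerbot-crawler@moz.com)\" 0.254 0.254";

if (preg_match($pattern, $line, $matches) === 1) {
    // Groups 14 and 15 are the two fields appended to the combined format.
    echo "request_time: {$matches[14]}\n";
    echo "upstream_response_time: {$matches[15]}\n";
} else {
    echo "Line did not match the expected format\n";
}
```

For the sample line, $matches[14] and $matches[15] both contain 0.254, and the earlier indexes (1 through 13) stay the same as with the original regex.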

Or, written another way using a negated character class ([^\s] matches the same characters as \S):

^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" ([^\s]+) ([^\s]+)$
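If you want to convince yourself that the two spellings behave the same, a quick check like the following should do. It is only a sketch: the sample inputs in $samples are hypothetical, and it tests just the two trailing groups rather than the whole log line.

```php
<?php
// Hypothetical sample tails, including one that should not match (three fields).
$samples = ['0.254 0.254', '0.001 -', '1.5 2.75 3'];

$shorthand = '/^(\S+) (\S+)$/';       // \S: any non-whitespace character
$negated   = '/^([^\s]+) ([^\s]+)$/'; // [^\s]: the same set, written as a negated class

$allEqual = true;
foreach ($samples as $sample) {
    $a = [];
    $b = [];
    $ra = preg_match($shorthand, $sample, $a);
    $rb = preg_match($negated, $sample, $b);
    // Both the return value and the captured groups should agree.
    if ($ra !== $rb || $a !== $b) {
        $allEqual = false;
    }
}

var_dump($allEqual);
```

Which form you use is mostly a matter of taste; \S is shorter, while the negated class is easier to extend if you later need to exclude more characters.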