How can I intelligently remove all trailing hashtags from an Instagram caption?

Many Instagram posts end with a plethora of hashtags, for example:

"This is one of the amazing Mountains you can find in the National Forest Park in #Zhangjiajie #Chinawhich is where James Cameron drew his inspiration for the flying mountains in #Avatar..

Credit: @phototravelnomads 
#pictoura #gydr 
#destinationearth #earthpix #ourlonelyplanet#wonderful_earthLife #timeoutsociety#fantastic_earthpics #liveoutdoors #igglobalclub#awesomeearth #mist_vision #earthdeluxe
# #worldbestgram #mthrworld #fantastic_earth#famouscaptures #destination_wow #dreamlifepix#wonderful_places #igworldclub #ig_global_life
#natureaddict #beautifuldestinations #traveler #guider#locals"

I'm looking to process the captions to remove the hashtag collection at the end, while leaving the rest intact. What would be a good approach to doing this? I'm sure I can figure out a brute force way, but I'm hoping to get some thoughts on an elegant solution. Doesn't have to be actual code. :)

Edit per burna's comment: The expected result would be:

"This is one of the amazing Mountains you can find in the National Forest Park in #Zhangjiajie #Chinawhich is where James Cameron drew his inspiration for the flying mountains in #Avatar..

Credit: @phototravelnomads"

Edit per Alan Moore's answer: This works quite well, but not in every situation. For instance, if the input text were:

"This is one of the amazing Mountains you can find in the National Forest Park in #Zhangjiajie #Chinawhich is where James Cameron drew his inspiration for the flying mountains in #Avatar"

... it would be cut off from "#Zhangjiajie" on.

I'm thinking there's probably a bit more logic required, perhaps splitting the string into an array; checking if it ends in hashtags; if so then how many; if more than X (4?), cut it off from the first one in the last complete series.
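The logic described above could be sketched roughly like this. It is only a minimal sketch under a few assumptions: the function name stripTrailingHashtags and the $minTags parameter are made up for illustration, the default of 4 is just the "X" guessed at above, and hashtags are assumed to consist of ASCII letters, digits and underscores. Instead of splitting into an array it finds the run of hashtags that reaches the end of the string and counts the tags in it.

```php
<?php
// Sketch only: stripTrailingHashtags() and $minTags are hypothetical names,
// and the default threshold of 4 is a guess, not a tested value.
function stripTrailingHashtags(string $caption, int $minTags = 4): string
{
    // The run of hashtags and whitespace that reaches the end of the string.
    // \w* (rather than \w+) lets a stray lone "#" count as part of the run.
    $trailingRun = '/(?:\s*#\w*)+\s*\z/';

    $result = preg_replace_callback(
        $trailingRun,
        function (array $m) use ($minTags): string {
            // Count only real tags: "#" followed by at least one word character.
            $tagCount = preg_match_all('/#\w+/', $m[0]);
            // Drop the run if it is long enough, otherwise put it back untouched.
            return $tagCount >= $minTags ? '' : $m[0];
        },
        $caption
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $caption;
}
```

With the first caption above this should return everything up to "Credit: @phototravelnomads", while the second caption (which ends in a single "#Avatar") should come back unchanged, because a run of one tag is below the threshold.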

If I understand correctly, the following should work:

$hashTag="pictoura #gydr 

destinationearth #earthpix #ourlonelyplanet#wonderful_earthLife #timeoutsociety#fantastic_earthpics #liveoutdoors #igglobalclub#awesomeearth #mist_vision #earthdeluxe

 #worldbestgram #mthrworld #fantastic_earth#famouscaptures #destination_wow #dreamlifepix#wonderful_places #igworldclub #ig_global_life

natureaddict #beautifuldestinations #traveler #guider#locals";

echo preg_replace('/(#.*\s*)/','',$hashTag);

That outputs:

pictoura destinationearth natureaddict

Good luck!!

It looks like this will do it:

$result = preg_replace('/#[#\w\s]*\z/', '', $subject);

The regex matches a hash (#), followed by zero or more of the characters that make up hashtags plus the whitespace that separates them ([#\w\s]*), followed by the end of the string (\z).

\w is equivalent to [A-Za-z0-9_]. If there are other characters that are allowed in hashtags, or if digits are not allowed, let me know and I'll update the regex.
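To make the behaviour concrete, here is a small sketch that runs this same pattern over three made-up captions (the captions and array keys are invented for illustration). It shows the tag block being stripped, but also that a caption consisting only of words and inline hashtags is cut from its first hashtag on, and that a single legitimate trailing tag is removed as well.

```php
<?php
// The pattern from the answer, applied to three invented captions.
$pattern = '/#[#\w\s]*\z/';

$captions = [
    'tag block'        => "Great day. Credit: @x\n#a #b#c\n#d",
    'inline tags only' => "Park in #Zhangjiajie which inspired #Avatar",
    'punctuation'      => "Park in #Zhangjiajie, which inspired #Avatar",
];

$results = [];
foreach ($captions as $label => $caption) {
    // Remove everything from the first "#" whose remainder is only tags and whitespace.
    $results[$label] = preg_replace($pattern, '', $caption);
}

var_export($results);
// 'tag block'        => "Great day. Credit: @x\n"
// 'inline tags only' => "Park in "
// 'punctuation'      => "Park in #Zhangjiajie, which inspired "
```

In the third caption the comma after #Zhangjiajie stops the match from starting there, so only the final tag is removed.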


UPDATE: If you want to remove all robo-tags while leaving the legitimate ones, there's probably no reliable way--certainly not with regexes alone. However, this will remove all but the first line of hashtags:

$result = preg_replace('/^(#[#\w\h]+\R)#[#\w\s]*\z/m', '$1', $subject);

\h matches only horizontal whitespace (space, tab, nbsp...), and \R matches any line separator (\r\n or any single vertical whitespace character).

As for hashtag-like things in the text, this won't touch them because it's anchored to the end of the text. The beginning-of-line anchor (^ in multiline mode) isn't really necessary, but it may help future readers of the regex (including yourself) understand what it does. Of course, comments will help even more. ;)
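On the point about comments, here is one way the regex from the update could be written with the x (extended) modifier so that it carries its own comments. This is a sketch, and the wrapper function keepFirstHashtagLine is a hypothetical name. Note that in extended mode an unescaped # outside a character class starts a comment, so the literal hashes are written as \#, and the comments must not contain the / delimiter.

```php
<?php
// Hypothetical wrapper around the regex from the update, rewritten in extended mode.
function keepFirstHashtagLine(string $subject): string
{
    $pattern = '/
        ^                     # start of a line (m modifier)
        ( \# [#\w\h]+ \R )    # group 1: the first all-hashtag line and its line break
        \# [#\w\s]*           # every hashtag and whitespace character after it
        \z                    # up to the very end of the string
    /mx';

    // Keep group 1, drop the remaining hashtag lines.
    $result = preg_replace($pattern, '$1', $subject);

    // preg_replace() returns null on error; fall back to the input in that case.
    return $result ?? $subject;
}

echo keepFirstHashtagLine("Text here.\n\nCredit: @x \n#a #b \n#c #d#e\n# #f\n#g");
```

For that input only the "#a #b" line of the tag block survives. A caption with a single hashtag line is left alone, because the pattern needs a line break followed by at least one more hash.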