Okay, so when you post a link on Facebook, it does a quick scan of the page to pull out images, text, etc. and build a preview on their site. I'm sure other social networks such as Twitter do much the same.
Anyway, I created a sort of "one-time message" system, but when you create a message and send its link in a Facebook chat, Facebook's probe fetches the page and the message gets marked as "seen" before the recipient ever opens it.
I know that the Facebook probe uses the user agent facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php), so I could just block all requests from anything with that user agent, but I was wondering: is there a more robust way of achieving this that covers all sites that "probe" links for content?
No, there's no foolproof way to do this. The easiest approach is to manually block certain visitors (identified by user agent) from marking the content as seen.
Every client on the web identifies itself with a user agent. Not every non-human client identifies itself in a unique way, but there are online databases like this one that can help you achieve your goal.
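For example, here's a minimal sketch (assuming a Python backend; the signatures beyond facebookexternalhit are illustrative examples of common preview crawlers, not an exhaustive list) of deciding whether a request looks like a link-preview probe before you mark the message as seen:

```python
import re

# Illustrative (not exhaustive) substrings used by known link-preview crawlers;
# a real deployment would maintain this list from an online user-agent database.
BOT_SIGNATURES = [
    "facebookexternalhit",
    "Facebot",
    "Twitterbot",
    "LinkedInBot",
    "Slackbot",
    "WhatsApp",
]

def is_probe(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like a link-preview crawler."""
    if not user_agent:
        return False
    return any(re.search(sig, user_agent, re.IGNORECASE) for sig in BOT_SIGNATURES)

# Example: the Facebook preview crawler is detected, a normal browser is not.
print(is_probe("facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"))  # True
print(is_probe("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))                                  # False
```

In your request handler you would only flip the message to "seen" when is_probe() returns False, so ordinary readers are unaffected while automated probes still get a page to preview.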
As for trying to block all bots via robots.txt: not every bot honours that standard. I'd speculate that Facebook visits every shared link partly to prevent malware from being spread across their network.
You could try something like this in your robots.txt file:
User-agent: *
Disallow: /