Currently, I limit an address to 100 characters, with no rules about what it must be composed of. Punctuation, digits, letters: all are welcome.
I use strip_tags upon saving the address to my database (prepared statements). I use $this->escape() (Zend Framework) when echoing it to a page.
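For reference, the handling described above might look roughly like the sketch below. The function names, the `addresses` table, and its `address` column are hypothetical, and `escapeForHtml()` only stands in for the view's `$this->escape()` call; it is not the Zend Framework implementation.

```php
<?php
// Hypothetical helpers for illustration; none of these names come from Zend Framework.

const ADDRESS_MAX_LENGTH = 100;

/**
 * Strip markup, trim whitespace, and enforce the 100-character limit.
 * Returns the cleaned address, or null if it is empty or too long.
 */
function cleanAddress(string $raw): ?string
{
    $address = trim(strip_tags($raw));
    // strlen() counts bytes; mb_strlen() may be preferable if the limit
    // is meant in characters and the mbstring extension is available.
    if ($address === '' || strlen($address) > ADDRESS_MAX_LENGTH) {
        return null;
    }
    return $address;
}

/**
 * Store the address with a prepared statement.
 * The table and column names are assumptions.
 */
function saveAddress(PDO $db, string $address): void
{
    $stmt = $db->prepare('INSERT INTO addresses (address) VALUES (:address)');
    $stmt->execute([':address' => $address]);
}

/**
 * Escape the address for HTML output.
 */
function escapeForHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}
```

Note that nothing here checks whether the address is real; it only keeps the stored value bounded and safe to store and display.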
I don't want to go crazy, but I think that I need to be a little more restrictive. What am I missing?
If these are US addresses, you should use the United States Postal Service's APIs to look up and standardize addresses.
To build on Dan's answer, I would submit that the best solution for you is to perform a simple request to an address validation service, such as SmartyStreets' LiveAddress (it's totally free and uses official USPS data). I actually work for SmartyStreets, so I've been through these hoops time and time again.
I read the comments on your question, and even though it doesn't appear to be a matter of standardizing the address, it really is in the end. Users can enter bad data (and if they do it on purpose, that tells you something right there, as Catcall mentioned), and you want to make sure the address is correct and complete. CASS-Certified vendors can provide this service, and I think the easiest and most affordable one you'll find is LiveAddress.
No regular expression will do this, and even the USPS API has major drawbacks and limitations, despite being the authority (for example, it isn't always clear about whether an address is deliverable).
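To illustrate the point about regular expressions, here is a small sketch. The pattern and the function name are made up for this example; a pattern like this can only check the shape of the text, so it accepts an invented address while rejecting a legitimate format such as a PO box.

```php
<?php
// Hypothetical shape check: "house number, then street name".
// This is an illustration of the limitation, not a recommended validator.
function looksLikeStreetAddress(string $address): bool
{
    return preg_match('/^\d+\s+[A-Za-z0-9 .-]+$/', $address) === 1;
}

// A made-up address passes, because only the format is tested.
var_dump(looksLikeStreetAddress('123 Nonexistent Blvd'));

// A valid kind of address fails, because it does not start with a number.
var_dump(looksLikeStreetAddress('PO Box 123'));
```

Loosening the pattern to admit more formats lets in more junk, and tightening it rejects more real addresses, which is why a lookup against actual postal data is the only reliable check.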
Verifying any input for validity, even an address, also builds your users' trust: it shows them that you care about their interactions with you.