Bug #15054
closed
Could you give an example of an email-address which doesn't work?
Thorsten, I think that your false negatives are invalid. While the first one is a valid recipient in a mail header, it's not something that should be tested in t3lib_div::validEmail(). First extract the contents between <>, then test.
The second is valid by definition, but invalid for all practical purposes. 99.99999% of all cases don't need mail addresses of internal networks.
The false positives are correct except the last one. It's indeed valid to use IP addresses. So a correct test would be to check either for an IPv4 address or a proper domain name.
This one doesn't work: +43664498NNNN@A1plus.at
(NNNN = last part of disclosed phone number; A1 is an Austrian carrier)
Yes folks all these characters are allowed:
a-zA-Z\d!#$%&'*+\-/=?^_`{|}~
I don't care much about the back tick but I do care about the +
preg_match('/^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/',$email)
This allows all allowed characters for the local part and does reasonable checking for the domain part:
- must not start or end a domain part (e.g. "foo-bar", not "foo-")
. must not be repeated (e.g. "..")
The TLD contains only letters and must be at least two characters long
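For anyone who wants to try the rules above, here is a minimal sketch that wraps the pattern from this comment in a function and runs a few addresses through it. The function name matchesThreadRegex is hypothetical and is not part of t3lib_div; the pattern itself is copied unchanged.

```php
<?php
// Hypothetical helper (not part of t3lib_div): wraps the pattern proposed above.
function matchesThreadRegex(string $email): bool
{
    $pattern = '/^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/';
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $email) === 1;
}

$samples = [
    '+43664498NNNN@A1plus.at',   // "+" in the local part: accepted
    'user@foo-bar.example.com',  // "-" inside a domain part: accepted
    'user@foo-.example.com',     // domain part ends with "-": rejected
    'user@example..com',         // repeated ".": rejected
    'user@example.c',            // TLD shorter than two letters: rejected
];

foreach ($samples as $sample) {
    echo $sample, ' => ', matchesThreadRegex($sample) ? 'accepted' : 'rejected', "\n";
}
```

Note that the pattern, as written, allows at most one "-" per domain part, so a name like "a-b-c.example.com" is rejected.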
This one does not work with the following valid addresses:
quoting:
abc""def@domain.tld
spaces:
abc def@domain.tld
full header:
foo bar <local@domain.tld>
IP:
abc@[123.34.2.34]
The full header is IMHO not needed, as the calling code could extract the part within <..>. Quoting and IPs are rarely seen. The latter is easy to implement; the former is a PITA. Spaces can, according to the RFC, be collapsed (so "abc def" is equivalent to "abcdef").
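A minimal sketch of what "the calling code could extract the part within <..>" might look like. The function name extractAddress is hypothetical, and the sketch deliberately ignores header comments and quoted display names that themselves contain angle brackets.

```php
<?php
// Hypothetical helper: return the address inside a trailing <...>,
// or the trimmed input when no angle brackets are present.
function extractAddress(string $header): string
{
    if (preg_match('/<([^<>]*)>\s*$/', $header, $matches) === 1) {
        return trim($matches[1]);
    }
    return trim($header);
}

echo extractAddress('foo bar <local@domain.tld>'), "\n"; // local@domain.tld
echo extractAddress('local@domain.tld'), "\n";           // local@domain.tld
```

The result would then be passed to the actual validation, so the validator itself never has to deal with display names.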
Oh, and please don't forget things like
karsten@localhost
karsten@localhost.localdomain
Those are valid, and I use them a lot while testing. (I didn't check whether they work, but they definitely should!).
In addition to my last comment, there's a lot more to learn about email addresses... From http://www.faqs.org/faqs/mail/addressing/:
The following are all legal and equivalent addresses for me:
< eli @ netusa . net >
<eli(jah)@netusa.net>
< eli(Elijah)@netusa(not associated with usa.net).net >
(Elijah) <eli@(dougs-home)netusa.net>
< eli (the raw IP for mail (and thus subject to change)) @ [204.141.0.25] >
< eli@(a subtler variation on the above) [204.141.25] >
<eli (Pogonatus (latin for <the bearded>)) @ (qz (pronounced (queasy) ) \
.little-neck (I did not want that, but RFC1480 required it) .ny (New \
York) .us (USA) or ) netusa (Located on Long Island) . net> (Elijah)
Wow! ;)
Karsten, I know that the RFC allows very interesting formats. The problem is that a regexp that accepts all valid input is hard to write and certainly hard to read, and therefore not maintainable.
I suggest we either don't use a solution with a single regexp (this is by no means required) or support only the most common sane address formats.
So forget about all those mail header comments (the values in parentheses are comments!). Nobody uses them in real life in a header and nobody will enter them in a form when submitting his email address.
Spaces may be ignored as well, but we may support IP addresses: user@[127.0.0.1]. But again, for readability I'd use two regexps to handle this.
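A rough sketch of the two-regexp idea, assuming the local-part character class from the earlier comment is reused for both cases. The constant and function names are hypothetical, and this is only one possible way to split the check, not an agreed implementation.

```php
<?php
// Local-part character class, copied from the pattern earlier in this thread.
const LOCAL_PART = '[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+';
// One IPv4 octet, 0 to 255.
const OCTET = '(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)';

// Regexp 1: local part followed by a domain name.
const DOMAIN_RE = '/^' . LOCAL_PART . '@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/';
// Regexp 2: local part followed by a bracketed IPv4 literal.
const IP_RE = '/^' . LOCAL_PART . '@\[(?:' . OCTET . '\.){3}' . OCTET . '\]$/';

// Hypothetical helper: accept the address if either pattern matches.
function isDomainOrIpAddress(string $email): bool
{
    return preg_match(DOMAIN_RE, $email) === 1
        || preg_match(IP_RE, $email) === 1;
}

var_dump(isDomainOrIpAddress('user@[127.0.0.1]')); // bool(true)
var_dump(isDomainOrIpAddress('user@example.com')); // bool(true)
var_dump(isDomainOrIpAddress('user@[999.0.0.1]')); // bool(false)
```

Keeping the two patterns separate means each one stays short enough to read, and an unbracketed "user@127.0.0.1" is rejected by both.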
Martin, the examples with the very weird addresses were not meant to be implemented... :)
Of course we should make this readable and maintainable, so I'm all for using two (or more) regexps and whatever else is needed...
- Status changed from Resolved to Closed