Bug #15054
closed
Gremlin: Insufficient email validation
Description
The RegEx used in t3lib_div::validEmail() returns true only in some rather homebred cases. The check in the validateForm function in t3lib/jsfunc.validateform.js is even worse.
I'd suggest using one of those rated with four of those very popular green bars ;-) at [1]. One of these should also work in JS. And at least they're all closer to RFC 822 [2].
[1] http://www.regexlib.com/Search.aspx?k=email
[2] http://www.faqs.org/rfcs/rfc822.html
(issue imported from #M1606)
Updated by Schmid Valentin about 19 years ago
Could you give an example of an email-address which doesn't work?
Updated by Thorsten Kahler about 19 years ago
False positives:
-abc@def.gh
abc@def.ßcom
abc@....com
abc@def.24
abc@192.168.0
False negatives:
"abc" <def@ghi.jkl>
abc@intranet
These examples are not tested, just constructed looking at the regex.
Updated by Martin Kutschker about 19 years ago
Thorsten, I think that your false negatives are invalid. While the first one is a valid recipient in a mail header, it's nothing that should be tested in t3lib_div::validEmail(). First extract the contents between <>, then test.
The second is valid by definition, but invalid for all practical purposes. 99.99999% of all cases don't need mail addresses of internal networks.
The false positives are correct except the last one. It's indeed valid to use IP addresses. So a correct test would be to check either for an IPv4 address or a proper domain name.
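The "either an IPv4 address or a proper domain name" check could be sketched roughly like this in JS. The names isValidHost, IPV4 and DOMAIN are made up for illustration and are not part of t3lib; the domain pattern is only an assumption about what "proper" should mean here (labels of letters, digits and inner hyphens, followed by a letters-only TLD).

```javascript
// Hypothetical helper, not part of t3lib_div or jsfunc.validateform.js.
// A dotted quad: exactly four groups of one to three digits.
const IPV4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
// Assumed notion of a "proper" domain: labels made of letters, digits and
// inner hyphens, followed by a TLD of at least two letters.
const DOMAIN = /^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]{2,}$/;

function isValidHost(host) {
  const quad = IPV4.exec(host);
  if (quad) {
    // Every octet has to be in the range 0..255.
    return quad.slice(1).every((octet) => Number(octet) <= 255);
  }
  return DOMAIN.test(host);
}

// Host parts taken from the examples in this thread.
for (const host of ['def.gh', '192.168.0.1', '192.168.0', 'def.24', '....com']) {
  console.log(host, isValidHost(host));
}
```

With this sketch the incomplete address 192.168.0 and the numeric "TLD" in def.24 are rejected, while a full dotted quad passes.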
Updated by Martin Kutschker almost 19 years ago
This one doesn't work: +43664498NNNN@A1plus.at
(NNNN = last part of disclosed phone number; A1 is an Austrian carrier)
Yes folks, all these characters are allowed:
a-zA-Z\d!#$%&'*+\-/=?^_`{|}~
I don't care much about the backtick, but I do care about the +.
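As a quick illustration, the character list above can be dropped into a character class more or less as-is. This is only a sketch; the name LOCAL_PART_CHARS is hypothetical. Note that the dot is not in the list, so a class built from the list alone does not accept dotted local parts.

```javascript
// Character class copied from the list above; the constant name is made up.
const LOCAL_PART_CHARS = /^[a-zA-Z\d!#$%&'*+\-\/=?^_`{|}~]+$/;

console.log(LOCAL_PART_CHARS.test('+43664498NNNN')); // true: "+" is in the list
console.log(LOCAL_PART_CHARS.test('abc def'));       // false: space is not in the list
console.log(LOCAL_PART_CHARS.test('first.last'));    // false: "." is not in the list
```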
Updated by Martin Kutschker almost 19 years ago
preg_match('/^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/',$email)
This allows all allowed characters for the local part and does reasonable checking for the domain part:
"-" must not start or end a part (e.g. "foo-bar", not "foo-")
"." must not be repeated (e.g. "..")
the TLD contains only letters and must be at least two chars long
This one does not work with the following valid addresses:
quoting:
abc""def@domain.tld
spaces:
abc def@domain.tld
full header:
foo bar <local@domain.tld>
IP:
abc@[123.34.2.34]
The full header is IMHO not needed, as the calling code could extract the part within <..>. Quoting and IPs are rarely seen. The latter is easy to implement; the former is a PITA. Spaces can, according to the RFC, be collapsed (so "abc def" is equivalent to "abcdef").
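A rough sketch of how calling code might do that preprocessing before the pattern is applied. SIMPLE_EMAIL is a straight JS port of the preg_match pattern quoted above; normalizeAddress and isValidEmail are hypothetical names, and dropping whitespace follows the reading of the RFC given in this comment rather than anything verified here.

```javascript
// JS port of the preg_match pattern quoted above (same character classes).
const SIMPLE_EMAIL = /^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/;

// Hypothetical helper: reduce a header-style value to a bare address.
function normalizeAddress(input) {
  // Take the part within <..> if there is one, otherwise the whole input.
  const bracketed = /<([^<>]*)>/.exec(input);
  const candidate = bracketed ? bracketed[1] : input;
  // Drop all whitespace, so "abc def" becomes "abcdef".
  return candidate.replace(/\s+/g, '');
}

function isValidEmail(input) {
  return SIMPLE_EMAIL.test(normalizeAddress(input));
}

console.log(normalizeAddress('foo bar <local@domain.tld>')); // local@domain.tld
console.log(isValidEmail('foo bar <local@domain.tld>'));     // true
console.log(isValidEmail('abc def@domain.tld'));             // true
```

Keeping the extraction in a separate step leaves the pattern itself unchanged and readable.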
Updated by Wolfgang Klinger over 18 years ago
@Martin: the last one does not allow this valid E-Mail address: art@this-border-rocks.com
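The reason is that the domain part of that pattern allows at most one hyphen per label. A possible relaxation is sketched below as a JS port. ORIGINAL is the pattern as posted; RELAXED is only a suggestion, not a committed fix, and both constant names are made up.

```javascript
// The pattern as posted: each label is "alnum+" with at most one "-alnum+" pair.
const ORIGINAL = /^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/;
// Suggested variant: a label starts and ends with a letter or digit and may
// contain any number of hyphens in between.
const RELAXED = /^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]{2,}$/;

const address = 'art@this-border-rocks.com';
console.log(ORIGINAL.test(address));       // false: two hyphens in one label
console.log(RELAXED.test(address));        // true
console.log(RELAXED.test('art@foo-.com')); // false: label still must not end with "-"
```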
Updated by Karsten Dambekalns over 18 years ago
Oh, and please don't forget things like
karsten@localhost
karsten@localhost.localdomain
Those are valid, and I use them a lot while testing. (I didn't check whether they work, but they definitely should!).
Updated by Karsten Dambekalns over 18 years ago
In addition to my last comment, there's a lot more to learn about email addresses... From http://www.faqs.org/faqs/mail/addressing/:
The following are all legal and equivalent addresses for me:
< eli netusa . net >
(dougs-home)netusa.net>
<eli(jah)@netusa.net>
< eli(Elijah)@netusa(not associated with usa.net).net >
(Elijah) <eli
< eli (the raw IP for mail (and thus subject to change)) [204.141.0.25] >
(a subtler variation on the above) [204.141.25] >
< eli
<eli (Pogonatus (latin for <the bearded>)) (qz (pronounced (queasy) ) \
!: York) .us (USA) or ) netusa (Located on Long Island) . net> (Elijah)
.little-neck (I did not want that, but RFC1480 required it) .ny (New \
F%
Wow! ;)
Updated by Karsten Dambekalns over 18 years ago
And this one is nice:
http://www.remote.org/jochen/mail/info/chars.html
Updated by Martin Kutschker over 18 years ago
Karsten, I know that the RFC allows very interesting formats. The problem is that a regexp that accepts all valid input is hard to write and certainly hard to read, and therefore not maintainable.
I suggest we either don't use a solution with a single regexp (this is by no means required) or support only the most common sane address formats.
So forget about all those mail header comments (the values in brackets are comments!). Nobody uses them in real life in a header, and nobody will enter them in a form when submitting their email address.
Spaces may be ignored as well, but we may support IP addresses: user@[127.0.0.1]. But again, for readability I'd use two regexps to handle this.
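The two-regexp idea might look something like this in JS. It is a sketch only: DOMAIN_ADDRESS reuses the pattern posted earlier in this issue, IP_ADDRESS is an assumed pattern for the bracketed IPv4 form, and validEmail here is a stand-in name, not the actual t3lib function.

```javascript
// Regexp 1: local part @ domain name (the pattern posted earlier in this issue).
const DOMAIN_ADDRESS = /^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@(([A-Za-z0-9]+-)?[A-Za-z0-9]+\.)+[A-Za-z]{2,}$/;
// Regexp 2 (assumed): the same local part @ a bracketed IPv4 literal.
const IP_ADDRESS = /^[a-zA-Z\d!#$%&*=?^`\/{|}._~+-]+@\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]$/;

function validEmail(email) {
  if (DOMAIN_ADDRESS.test(email)) {
    return true;
  }
  const ip = IP_ADDRESS.exec(email);
  // The regexp only checks the shape; the octet range is checked separately.
  return ip !== null && ip.slice(1).every((octet) => Number(octet) <= 255);
}

console.log(validEmail('user@[127.0.0.1]'));  // true
console.log(validEmail('user@[300.0.0.1]'));  // false: octet out of range
console.log(validEmail('local@domain.tld'));  // true
```

Each pattern stays short enough to read on its own. Unbracketed IPs such as user@127.0.0.1 are not accepted by this sketch.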
Updated by Karsten Dambekalns over 18 years ago
Martin, the examples with the very weird addresses were not meant to be implemented... :)
Of course we should make this readable and maintainable, so I'm all for using two (or more) regexps and whatever else is needed...
Updated by Karsten Dambekalns about 16 years ago
Just to make sure potential synergy isn't lost:
http://forge.typo3.org/repositories/entry/package-flow3/trunk/Classes/Validation/Validator/F3_FLOW3_Validation_Validator_EmailAddress.php
But I just noticed that that one also misses "something@192.168.0.1", so keep an eye on updates to that code. :)