Re: [PHP] Help: Validate Domain Name by Regular Express
On Sun, 2011-01-09 at 11:37 +0800, WalkinRaven wrote:

> On 01/09/2011 01:09 AM, Ashley Sheridan wrote:
> > On Sat, 2011-01-08 at 16:55 +0800, WalkinRaven wrote:
> >
> >> PHP 5.3 PCRE
> >>
> >> Regular expression to match the domain name format according to
> >> RFC 1034 - DOMAIN NAMES - CONCEPTS AND FACILITIES
> >>
> >> /^
> >> (
> >> [a-z] |
> >> [a-z] (?:[a-z]|[0-9]) |
> >> [a-z] (?:[a-z]|[0-9]|\-){1,61} (?:[a-z]|[0-9]) ) # One label
> >>
> >> (?:\.(?1))*+ # More labels
> >> \.? # Root domain name
> >> $/iDx
> >>
> >> This rule matches only and. but not
> >>
> >> I don't know what's wrong with it.
> >>
> >> Thank you.
> >
> > I think trying to do all of this in one regex will prove more trouble
> > than it's worth. Maybe breaking it down into something like this:
> >
> > $domain = "www.ashleysheridan.co.uk";
> > $valid = false;
> >
> > $tlds = array('aero', 'asia', 'biz', 'cat', 'com', 'coop', 'edu', 'gov',
> > 'info', 'int', 'jobs', 'mil', 'mobi', 'museum', 'name', 'net', 'org',
> > 'pro', 'tel', 'travel', 'xxx', 'ac', 'ad', 'ae', 'af', 'ag', 'ai', 'al',
> > 'am', 'an', 'ao', 'aq', 'ar', 'as', 'at', 'au', 'aw', 'ax', 'az', 'ba',
> > 'bb', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bm', 'bn', 'bo', 'br',
> > 'bs', 'bt', 'bv', 'bw', 'by', 'bz', 'ca', 'cc', 'cd', 'cf', 'cg', 'ch',
> > 'ci', 'ck', 'cl', 'cm', 'cn', 'co', 'cr', 'cu', 'cv', 'cx', 'cy', 'cz',
> > 'de', 'dj', 'dk', 'dm', 'do', 'dz', 'ec', 'ee', 'eg', 'er', 'es', 'et',
> > 'eu', 'fi', 'fj', 'fk', 'fm', 'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf',
> > 'gg', 'gh', 'gi', 'gl', 'gm', 'gn', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu',
> > 'gw', 'gy', 'hk', 'hm', 'hn', 'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im',
> > 'in', 'io', 'iq', 'ir', 'is', 'it', 'je', 'jm', 'jo', 'jp', 'ke', 'kg',
> > 'kh', 'ki', 'km', 'kn', 'kp', 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc',
> > 'li', 'lk', 'lr', 'ls', 'lt', 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me',
> > 'mg', 'mh', 'mk', 'ml', 'mm', 'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt',
> > 'mu', 'mv', 'mw', 'mx', 'my', 'mz', 'na', 'nc', 'ne', 'nf', 'ng', 'ni',
> > 'nl', 'no', 'np', 'nr', 'nu', 'nz', 'om', 'pa', 'pe', 'pf', 'pg', 'ph',
> > 'pk', 'pl', 'pm', 'pn', 'pr', 'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro',
> > 'rs', 'ru', 'rw', 'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj',
> > 'sk', 'sl', 'sm', 'sn', 'so', 'sr', 'st', 'su', 'sv', 'sy', 'sz', 'tc',
> > 'td', 'tf', 'tg', 'th', 'tj', 'tk', 'tl', 'tm', 'tn', 'to', 'tp', 'tr',
> > 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'uk', 'us', 'uy', 'uz', 'va', 'vc',
> > 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws', 'ye', 'yt', 'za', 'zm',
> > 'zw');
> >
> > if(strlen($domain) <= 253)
> > {
> >     $labels = explode('.', $domain);
> >     if(in_array($labels[count($labels)-1], $tlds))
> >     {
> >         for($i=0; $i<count($labels); $i++)
> >         {
> >             if(strlen($labels[$i]) > 63 ||
> >                 !preg_match('/^[a-z0-9][a-z0-9\-]*?[a-z0-9]$/', $labels[$i]) ||
> >                 preg_match('/^[0-9]+$/', $labels[$i]))
> >             {
> >                 $valid = false;
> >                 break; // no point continuing if one label is wrong
> >             }
> >             else
> >             {
> >                 $valid = true;
> >             }
> >         }
> >     }
> > }
> >
> > var_dump($valid);
> >
> > This checks the last label against a list of TLDs, and each label
> > thereafter against the standard a-z, 0-9 and hyphen rule given as the
> > preferred characters allowed in a label (the LDH rule), requiring that
> > the start and end characters of a label aren't hyphens (oddly enough
> > it doesn't mention starting with a digit!)
> >
> > Also, each label is checked to ensure it doesn't run over 63 characters,
> > and the whole thing isn't over 253 characters. Lastly, each label is
> > checked to ensure it doesn't completely consist of digits.
> >
> > I've tested it only with my domain so far, but it should work fairly
> > well. As I said before, I couldn't think of a way to do it all with one
> > regex. It could probably be done, but would you really want to create a
> > huge and difficult to read/understand expression just because it's
> > possible?
> >
> > Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
>
> Thank you for replying, Ash.
>
> I know it may be better to pre-process the name with something like
> explode() first, which would leave a less complex regular expression.
> But I just want to know what the problem is in my regular expression.
>
> As for the code you've offered, I don't like the idea of a fixed set of
> suffixes, because the list may be updated from time to time. I just want
> to do format validation, not content validation.
>
> As for the regular expression itself: yes, it is complex, but I've
> checked it several times, very carefully, letter by letter, and I just
> don't understand what's wrong with it. Or is there some bug in the PCRE
> engine?

The list there
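One possible explanation for the behaviour described in the question, offered as an assumption rather than a confirmed diagnosis: in the PCRE versions of that era, a subpattern called with (?1) is treated as an atomic group. Once the first alternative [a-z] succeeds inside the call, the engine cannot go back into the call to try the longer alternatives, so every label after the first is limited to a single letter. The sketch below avoids relying on backtracking into a subroutine call by writing the label as one sequence with an optional tail. The names LABEL and matches_domain_format() are hypothetical, and the type declarations need PHP 7 or later.

```php
<?php
// One label: a letter, optionally followed by up to 61 letters, digits or
// hyphens and then a final letter or digit (1 to 63 characters in total).
// Assumption: this is the same label grammar the original pattern aims for.
const LABEL = '[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?';

// Hypothetical helper: format check only, no list of known TLDs.
function matches_domain_format(string $name): bool
{
    // First label, then zero or more ".label" parts, then an optional root dot.
    $pattern = '/^' . LABEL . '(?:\.' . LABEL . ')*+\.?$/iD';
    return preg_match($pattern, $name) === 1;
}

var_dump(matches_domain_format('example'));       // bool(true)
var_dump(matches_domain_format('example.com.'));  // bool(true)
var_dump(matches_domain_format('-bad.example'));  // bool(false)
```

Like the original pattern, this rejects labels that start with a digit and does not check the overall length of the name.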
Re: [PHP] Help: Validate Domain Name by Regular Express
On Sun, 2011-01-09 at 11:44 +0800, WalkinRaven wrote:

> Right, RFC 1034 allows any number of "." parts, as long as the total
> length does not go over 255.
>
> On 01/09/2011 01:21 AM, TR Shaw wrote:
> > On Jan 8, 2011, at 12:09 PM, Ashley Sheridan wrote:
> >
> >> On Sat, 2011-01-08 at 16:55 +0800, WalkinRaven wrote:
> >>
> >>> [snip]
> >>
> >> [snip]
> >>
> >> As I said before, I couldn't think of a way to do it all with one
> >> regex. It could probably be done, but would you really want to create a
> >> huge and difficult to read/understand expression just because it's
> >> possible?
> >
> > Ash
> >
> > I doubt it's possible, since the ccTLDs have valid domain names with
> > three or more dotted parts. You should see .us. And .uk doesn't follow
> > the same ccTLD rules as .tk, for example.
> >
> > Now, if the purpose is to write a regex for a host name then that's a
> > different story.
> >
> > Tom

Which is what my code does too, while also checking for label length.

Thanks,
Ash
http://www.ashleysheridan.co.uk
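The label-by-label approach discussed in this message can be sketched without the TLD list, for format validation only. This is a minimal sketch, not Ash's code: the function name is_valid_hostname() is hypothetical, the 253 and 63 limits and the all-digits rule are taken from the thread, and single-character labels are assumed to be acceptable. The type declarations need PHP 7 or later.

```php
<?php
// Hypothetical helper: checks the *format* of a host name, not whether the
// last label is a real TLD.
function is_valid_hostname(string $host): bool
{
    // A single trailing dot (the root) is allowed and ignored.
    if (substr($host, -1) === '.') {
        $host = substr($host, 0, -1);
    }

    // Overall length limit used in the thread.
    if ($host === '' || strlen($host) > 253) {
        return false;
    }

    foreach (explode('.', $host) as $label) {
        // Each label must be 1 to 63 characters long.
        if ($label === '' || strlen($label) > 63) {
            return false;
        }
        // LDH rule: letters, digits and hyphens, no leading or trailing hyphen.
        if (!preg_match('/^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/iD', $label)) {
            return false;
        }
        // Carried over from the thread's code: reject all-digit labels.
        if (preg_match('/^[0-9]+$/D', $label)) {
            return false;
        }
    }

    return true;
}

var_dump(is_valid_hostname('www.ashleysheridan.co.uk')); // bool(true)
var_dump(is_valid_hostname('bad-.example'));             // bool(false)
```

Returning early from the loop avoids the $valid flag, and the length test is a separate condition so an over-long label can never be reported as valid.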
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
Tamara Temple wrote:

> On Jan 8, 2011, at 2:22 PM, Al wrote:
>
>> On 1/8/2011 3:55 AM, WalkinRaven wrote:
>>> PHP 5.3 PCRE
>>>
>>> Regular expression to match the domain name format according to
>>> RFC 1034 - DOMAIN NAMES - CONCEPTS AND FACILITIES
>>>
>>> /^
>>> (
>>> [a-z] |
>>> [a-z] (?:[a-z]|[0-9]) |
>>> [a-z] (?:[a-z]|[0-9]|\-){1,61} (?:[a-z]|[0-9]) ) # One label
>>>
>>> (?:\.(?1))*+ # More labels
>>> \.? # Root domain name
>>> $/iDx
>>>
>>> This rule matches only and . but not
>>>
>>> I don't know what's wrong with it.
>>>
>>> Thank you.
>>
>> Look at filter_var()
>>
>> Validates value as URL (according to » http://www.faqs.org/rfcs/rfc2396),
>
> I'm wondering what mods to make for this now that unicode chars are
> allowed in domain names

You're talking about IDNs? The actual domain name is still US-ASCII; only when you decode punycode do you get UTF-8 characters.

--
Per Jessen, Zürich (10.1°C)

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
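For reference, a short sketch of the filter_var() suggestion above. FILTER_VALIDATE_URL validates a whole URL, so a bare host name without a scheme is rejected. The FILTER_VALIDATE_DOMAIN filter with FILTER_FLAG_HOSTNAME is an assumption about the reader's environment: it exists only in PHP 7.0 and later, so it was not available on the PHP 5.3 mentioned in the thread. The wrapper name looks_like_hostname() is hypothetical.

```php
<?php
// FILTER_VALIDATE_URL needs a scheme, so it is not a host-name check.
var_dump(filter_var('http://www.example.com/path', FILTER_VALIDATE_URL) !== false); // bool(true)
var_dump(filter_var('www.example.com', FILTER_VALIDATE_URL) !== false);             // bool(false)

// Hypothetical wrapper around the PHP 7.0+ domain filter.
function looks_like_hostname(string $host): bool
{
    return filter_var($host, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) !== false;
}

var_dump(looks_like_hostname('www.example.com')); // bool(true)
var_dump(looks_like_hostname('exa mple.com'));    // bool(false)
```

Because an IDN travels as US-ASCII in its punycode form, as the reply above points out, ASCII-only checks like these see the encoded name rather than the Unicode one.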
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
At 12:15 PM +0100 1/9/11, Per Jessen wrote:

> Tamara Temple wrote:
> > I'm wondering what mods to make for this now that unicode chars are
> > allowed in domain names
>
> You're talking about IDNs? The actual domain name is still US-ASCII;
> only when you decode punycode do you get UTF-8 characters.
>
> Per Jessen, Zürich (10.1°C)

Unfortunately, you are correct.

It was never the intention of the IDNS WG for the end-user to see PUNYCODE, but rather that all IDNS be seen by the end-user as actual Unicode code points (Unicode characters). The only browser that currently supports this is Safari.

For example --

http://xn--19g.com

-- is square-root dot com. In all browsers except Safari, PUNYCODE is shown in the address bar, but in Safari it's shown as √.com

The IDNS works, but for fear of homograph attacks IE (and other browsers) will not show the IDNS correctly.

Cheers,

tedd

--
---
http://sperling.com/
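To illustrate the point about punycode: an encoded label can be recognised by its "xn--" prefix, and decoding it is a separate step. The sketch below uses a hypothetical helper, has_ace_label(), for the prefix test. The decoding call idn_to_utf8() comes from PHP's intl extension, which is assumed to be optional here, so the call is guarded and no particular output is claimed.

```php
<?php
// Hypothetical helper: true if any label of the host name starts with the
// "xn--" prefix that marks a punycode-encoded (ASCII-compatible) label.
function has_ace_label(string $host): bool
{
    foreach (explode('.', $host) as $label) {
        if (strncasecmp($label, 'xn--', 4) === 0) {
            return true;
        }
    }
    return false;
}

$host = 'xn--19g.com'; // the example domain from the message above

var_dump(has_ace_label($host)); // bool(true)

// Decoding needs the intl extension; skip quietly when it is not loaded.
if (has_ace_label($host) && function_exists('idn_to_utf8')) {
    $unicode = idn_to_utf8($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($unicode !== false) {
        echo $unicode, "\n";
    }
}
```

A format validator that only accepts letters, digits and hyphens therefore already accepts the encoded form; whether to show the decoded form to users is the display question discussed in this message.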
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
On Sun, Jan 9, 2011 at 11:58, tedd wrote:
>
> For example --
>
> http://xn--19g.com
>
> -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> in the address bar, but in Safari it's shown as ˆ.com

Not sure if that's a typo or an issue in translation while the email was being relayed through the tubes, but ˆ.com directs to xn--wqa.com here.

--
Network Infrastructure Manager
Documentation, Webmaster Teams
http://www.php.net/
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
On Sun, Jan 9, 2011 at 12:32, Ashley Sheridan wrote:
>
> ^ is to the power of, not square root, which is √, which does translate to
> Tedds domain

Thanks for the math lesson, professor, but I already knew that. ;-P

My point is, and as you can see in the quoted text from my email, that I don't know if it was a typo on Tedd's part or what, but ^.com is what came through here.

--
Network Infrastructure Manager
Documentation, Webmaster Teams
http://www.php.net/
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
Daniel Brown wrote:

> On Sun, Jan 9, 2011 at 11:58, tedd wrote:
>> For example --
>>
>> http://xn--19g.com
>>
>> -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
>> in the address bar, but in Safari it's shown as ˆ.com
>
> Not sure if that's a typo or an issue in translation while the
> email was being relayed through the tubes, but ˆ.com directs to
> xn--wqa.com here.

error in translation.

I get the same domain for:

seamonkey
firefox
googlechrome
safari

but yes, the actual square root character appears in safari only. Interesting!

Donovan

--
D Brooke
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
On Sun, 2011-01-09 at 12:23 -0500, Daniel Brown wrote:

> On Sun, Jan 9, 2011 at 11:58, tedd wrote:
> >
> > For example --
> >
> > http://xn--19g.com
> >
> > -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> > in the address bar, but in Safari it's shown as ˆ.com
>
> Not sure if that's a typo or an issue in translation while the
> email was being relayed through the tubes, but ˆ.com directs to
> xn--wqa.com here.
>
> --
>
> Network Infrastructure Manager
> Documentation, Webmaster Teams
> http://www.php.net/

^ is to the power of, not square root, which is √, which does translate to Tedd's domain

Thanks,
Ash
http://www.ashleysheridan.co.uk
Re: [PHP] Re: Help: Validate Domain Name by Regular Express
On Sun, 2011-01-09 at 12:38 -0500, Daniel Brown wrote:

> On Sun, Jan 9, 2011 at 12:32, Ashley Sheridan wrote:
> >
> > ^ is to the power of, not square root, which is √, which does translate to
> > Tedds domain
>
> Thanks for the math lesson, professor, but I already knew that. ;-P
>
> My point is, and as you can see in the quoted text from my email,
> that I don't know if it was a typo on Tedd's part or what, but ^.com
> is what came through here.
>
> --
>
> Network Infrastructure Manager
> Documentation, Webmaster Teams
> http://www.php.net/

Sorry, lol! It came through as an unrecognised character for me, maybe some email issue then?

Thanks,
Ash
http://www.ashleysheridan.co.uk
[PHP] curl & rtmp
Does cURL support the RTMP protocol? If so, is there an example? Do we need to enable a different library? If not, can we save an RTMP stream with curl some other way? And if not, is there any other RTMP downloader that you know of?
Re: [PHP] curl & rtmp
On Sun, Jan 9, 2011 at 2:58 PM, Tontonq Tontonq wrote:

> does cUrl supports rtmp protocol? if so is there any example?

These can be answered by searching for the terms, which seem specific enough that the search engines should have turned up an answer.

> do we need enable different library? so if not can we save rtmp by curl?
> if not is there any other rtmp downloader that u know ?

You seem to know enough to have answered this by yourself, almost in your own questions.
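For what it is worth, whether a given PHP installation can speak RTMP through curl depends on how its libcurl was built: RTMP is only available when libcurl was compiled with librtmp support. A minimal sketch for checking this at run time, using the protocol list reported by curl_version(); the helper name curl_supports_protocol() is hypothetical, and the type declarations need PHP 7 or later.

```php
<?php
// Hypothetical helper: reports whether the libcurl behind PHP's curl
// extension lists the given protocol among those it was built with.
function curl_supports_protocol(string $protocol): bool
{
    // Without the curl extension there is nothing to ask.
    if (!function_exists('curl_version')) {
        return false;
    }
    $info = curl_version();
    return in_array(strtolower($protocol), $info['protocols'], true);
}

var_dump(curl_supports_protocol('rtmp'));
```

If this prints bool(false), an rtmp:// URL passed to curl would be rejected as an unsupported protocol, and a separate RTMP tool would be needed.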
[PHP] gzdeflate and file_get_contents memory leak?
I have a download script that streams the contents of multiple files into a zip archive before passing it on to the browser to be downloaded. The script uses file_get_contents() and gzdeflate() in a loop over multiple files to create the archive.

Everything works fine, except I have noticed that for a large number of files, this script will exceed the PHP memory limit. I have used memory_get_peak_usage() to narrow down the source of the high memory usage and found it to be the two functions above. They are called in a loop, and the variable containing the file data is unset() and not referenced between calls. The script's peak memory usage should be a function of the single largest file included in the archive, but it seems to be the aggregate of all files.

Here is the pseudo-code for this loop:

    header( /* specify header to indicate download */ );
    foreach( $files as $file )
    {
        echo zip_local_header_for($file);
        $data = file_get_contents( $file );
        $zdata = gzdeflate( $data );
        unset($data);
        unset($zdata);
    }
    echo zip_central_dir_for($files);

If I remove the gzdeflate() call and replace file_get_contents() with an fread()-based method, the script no longer experiences memory problems.

Is this behavior as designed for these two functions (because PHP scripts are usually short lived)? Is there a way to get them to release memory? Is there something I'm missing?

Thanks.

--
Ryan
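One way to keep peak memory tied to a chunk size rather than to whole files is to read and compress each file in pieces. The sketch below is an assumption-laden illustration, not a diagnosis of the behaviour described above: it uses deflate_init() and deflate_add() from the zlib extension, which exist only in PHP 7.0 and later, and the function name deflate_file_in_chunks() and the $emit callback are hypothetical. How the CRC and sizes reach the zip headers is left out.

```php
<?php
// Hypothetical helper: raw-deflates a file chunk by chunk and hands each
// compressed piece to $emit, so only one chunk is held in memory at a time.
function deflate_file_in_chunks(string $path, callable $emit, int $chunkSize = 65536): void
{
    $in = fopen($path, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // ZLIB_ENCODING_RAW produces the same raw deflate format as gzdeflate().
    $ctx = deflate_init(ZLIB_ENCODING_RAW);

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            break;
        }
        // ZLIB_NO_FLUSH lets zlib decide when it has enough data to output.
        $emit(deflate_add($ctx, $chunk, ZLIB_NO_FLUSH));
    }

    // Finish the stream so the remaining compressed data is written out.
    $emit(deflate_add($ctx, '', ZLIB_FINISH));
    fclose($in);
}
```

In a download script the callback could simply echo each piece, for example `deflate_file_in_chunks($file, function ($piece) { echo $piece; });`, with output buffering flushed as needed.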