 ID:               43098
 Updated by:       [EMAIL PROTECTED]
 Reported By:      harvie at email dot cz
-Status:           Open
+Status:           Feedback
 Bug Type:         Performance problem
 Operating System: Linux (Debian Etch) - php5-cli
 PHP Version:      5.2.4
 New Comment:

It won't matter what you put in the timeout if you wrap everything in a
while(1) loop. Of course it just sits there; try adding some error
checking there.


Previous Comments:
------------------------------------------------------------------------

[2007-10-25 17:21:10] harvie at email dot cz

[EMAIL PROTECTED]:
It's possible, but in that case default_socket_timeout should close the
socket and let the script continue with the next URL (the crawler also
freezes with many different URLs). Or does default_socket_timeout not
matter here?

It's true that my router sucks, but I don't see any reason why PHP
should crash at the first connectivity problem, ending in a total script
freeze. I thought that is why we have the socket timeout option. Or not?

------------------------------------------------------------------------

[2007-10-25 11:57:20] [EMAIL PROTECTED]

Are you sure it's not just your network connection freezing? E.g. some
kind of firewall stopping you from connecting to one site too many times
in too short a time?
(FYI: your script works fine for me; I stopped it after 10 minutes.)

------------------------------------------------------------------------

[2007-10-24 19:02:01] harvie at email dot cz

I tried to run this on PHP4 CLI (MS Windows Server 2003).
It returned these errors. Maybe this error is handled differently in
PHP5, and that causes the hang...

c:/>bugshow.php
#0#1#2#3#4#5#6#7#8#9#10#11#12#13#14#15
Warning: file_get_contents(res:///PHP/http:\\w.moreover.com\): failed to open stream: No such file or directory in D:\bug.php on line 12
#16#17#18#19#20#21#22#23#24#25#26#27#28#29#30#31#32#33#34#35#36#37#38#39#40#41#42#43#44#45#46#47
Warning: file_get_contents(res:///PHP/http:\\w.moreover.com\): failed to open stream: No such file or directory in D:\bug.php on line 12
#48#49#50

------------------------------------------------------------------------

[2007-10-24 18:16:26] harvie at email dot cz

I ran the script under the strace debugger (this traces the
interpreter's system calls, not the PHP code, of course), if you are
interested:

# strace ./bugshow.php
execve("./emails.php", ["./emails.php"], [/* 29 vars */]) = 0
uname({sys="Linux", node="harvie-ntb", ...}) = 0
brk(0) = 0x854c000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fa8000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=67381, ...}) = 0
mmap2(NULL, 67381, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f97000
close(3) = 0

...Lot of irrelevant stuff...

connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.2.1")}, 28) = 0
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
gettimeofday({1193256681, 718238}, NULL) = 0
poll([{fd=3, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
send(3, "\6?\1\0\0\1\0\0\0\0\0\0\1w\10moreover\3com\0\0\1\0\1", 32, 0) = 32
poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
ioctl(3, FIONREAD, [56]) = 0
recvfrom(3, "\6?\201\200\0\1\0\1\0\0\0\0\1w\10moreover\3com\0\0\1\0"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.2.1")}, [16]) = 56
close(3) = 0
gettimeofday({1193256681, 768001}, NULL) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("170.224.8.50")}, 16) = -1 EINPROGRESS (Operation now in progress)
poll([{fd=3, events=POLLIN|POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 1000) = 1
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
fcntl64(3, F_SETFL, O_RDWR) = 0
send(3, "GET / HTTP/1.0\r\n", 16, 0) = 16
send(3, "Host: w.moreover.com\r\n", 22, 0) = 22
send(3, "\r\n", 2, 0) = 2
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 1000) = 1
recv(3, "HTTP/1.1 200 OK\r\nDate: Wed, 24 O"..., 8192, 0) = 524
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 1000) = 1
recv(3, "s, online news, current awarenes"..., 8192, 0) = 524
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0

...This repeats a few times per second...

------------------------------------------------------------------------

[2007-10-24 18:00:48] harvie at email dot cz

Description:
------------
I have written a spider/crawler to build a web search engine as a
school project. I have a small problem: I am using file_get_contents()
(I've tried fopen() too). The crawler works great, but sometimes it
freezes. I tried to trace which function freezes and found that it's
file_get_contents(). So I googled and found the default_socket_timeout
setting. I set it to 1, but sometimes it still freezes and never
recovers.

I've made this example so you can see that it freezes after a few
iterations. I have supplied a URL that causes my crawler to freeze (I'm
not sure why):

Reproduce code:
---------------
#!/usr/bin/php
<?php
/*Run and wait for a while, this can totally stop the script at the dead point...*/
ini_set('default_socket_timeout',1);
set_time_limit(0);

//$url='http://ad.doubleclick.net/click';
$url='http://w.moreover.com/';

while(1) {
    @file_get_contents($url, false, null, 0, 10000);
    echo "#";
}

?>

Expected result:
----------------
It will download the file from the specified URL a few times, and after
that it will freeze and never recover... (It also happens if you use a
different URL each time, but it takes longer...)

Actual result:
--------------
harvie-ntb:/home/harvie/Desktop/crawler# ./bugshow.php
#1#2#3#4#5#6#7#8#9#10#11#12#13#14#15#16#17

And there it freezes for eternity (I thought it would continue after 1
second on failure, given ini_set('default_socket_timeout',1);, but the
whole script stops. I waited a really long time...)

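For reference, the error checking suggested in the new comment could be sketched roughly as below. This is only an illustration, not a confirmed fix for the hang: the helper name fetch_with_deadline() and its $deadline parameter are hypothetical and do not come from the report. It reads through fopen()/fread() instead of file_get_contents(), checks the 'timed_out' flag from stream_get_meta_data() after every read, and also enforces an overall deadline, so that a read which keeps timing out (as the repeated poll() calls in the strace output suggest) cannot loop forever. The 1-second values are assumptions that mirror the default_socket_timeout used in the reproduce code.

```php
<?php
/**
 * Hypothetical helper (not part of the original report): read at most
 * $maxLen bytes from $url and give up after $deadline seconds in total.
 * Returns the data read, or false on an open failure, read error or timeout.
 */
function fetch_with_deadline($url, $maxLen = 10000, $deadline = 5.0)
{
    $start = microtime(true);

    // Timeout for the http wrapper; other wrappers ignore this option.
    $context = stream_context_create(array(
        'http' => array('timeout' => 1.0),
    ));

    // Errors are suppressed here but reported to the caller via false.
    $fp = @fopen($url, 'rb', false, $context);
    if ($fp === false) {
        return false;
    }

    // Per-read timeout in seconds (has no effect on plain files).
    stream_set_timeout($fp, 1);

    $data = '';
    while (strlen($data) < $maxLen && !feof($fp)) {
        // Overall deadline: bail out even if single reads keep "almost" working.
        if (microtime(true) - $start > $deadline) {
            fclose($fp);
            return false;
        }

        $chunk = fread($fp, min(8192, $maxLen - strlen($data)));
        $meta  = stream_get_meta_data($fp);

        // Stop on a read error or when the last read timed out.
        if ($chunk === false || !empty($meta['timed_out'])) {
            fclose($fp);
            return false;
        }

        $data .= $chunk;
    }

    fclose($fp);
    return $data;
}
```

In the reproduce script, the loop body would then call fetch_with_deadline($url) and echo "#" on success or some other marker when it returns false, ideally in a bounded loop rather than while(1), so that failures become visible instead of being hidden by the @ operator.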
------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=43098&edit=1