Edit report at http://bugs.php.net/bug.php?id=46817&edit=1

 ID:                 46817
 Comment by:         vanessasold at yahoo dot com
 Reported by:        master dot jexus at gmail dot com
 Summary:            tokenizer misses last single-line comment (PHP 5.3+,
                     with re2c lexer)
 Status:             Closed
 Type:               Bug
 Package:            Scripting Engine problem
 Operating System:   *
 PHP Version:        5.3.0alpha3
 Assigned To:        shire
 Block user comment: N
 Private report:     N

Previous Comments:
------------------------------------------------------------------------
[2009-03-11 22:18:04] sh...@php.net

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.



------------------------------------------------------------------------
[2009-03-06 07:41:45] lu...@php.net

I'm seeing what could be a related, if not the same, problem when trying
to detect trailing Windows CR+LF in T_WHITESPACE:

Reproduce code:
---------------
<?php
// this comment and trailing blank contain windows CR+LF^M
^M

Expected result:
----------------
array(3) {
  [0]=>
  array(3) {
    [0]=>
    int(367)
    [1]=>
    string(6) "<?php
"
    [2]=>
    int(1)
  }
  [1]=>
  array(3) {
    [0]=>
    int(365)
    [1]=>
    string(57) "// this comment and trailing blank contain windows CR+LF
"
    [2]=>
    int(2)
  }
  [2]=>
  array(3) {
    [0]=>
    int(370)
    [1]=>
    string(3) "

"
    [2]=>
    int(2)
  }
}

Actual result:
--------------
array(2) {
  [0]=>
  array(3) {
    [0]=>
    int(368)
    [1]=>
    string(6) "<?php
"
    [2]=>
    int(1)
  }
  [1]=>
  array(3) {
    [0]=>
    int(366)
    [1]=>
    string(57) "// this comment and trailing blank contain windows CR+LF
"
    [2]=>
    int(2)
  }
}
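
The numeric token IDs differ between the two dumps (367/368, 365/366), so a check written against the named constants is easier to compare across builds. The following is only a sketch, assuming the tokenizer extension is loaded; `$source` and `$hasCrLfWhitespace` are illustrative names and the input is not the original test file:

```php
<?php
// Illustrative input: a comment line and a blank line, both ending in CR+LF.
$source = "<?php\r\n// comment\r\n\r\n";

// Look for a T_WHITESPACE token whose text contains a CR+LF pair.
$hasCrLfWhitespace = false;
foreach (token_get_all($source) as $token) {
    // Tokens are either [id, text, line] arrays or single-character strings.
    if (is_array($token)
        && $token[0] === T_WHITESPACE
        && strpos($token[1], "\r\n") !== false) {
        $hasCrLfWhitespace = true;
    }
}

var_dump($hasCrLfWhitespace);
```

If the trailing whitespace token is dropped, as in the actual result above, the flag would stay false.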

------------------------------------------------------------------------
[2008-12-10 10:25:25] nlop...@php.net

This is a problem in the new lexer. It is reproducible when the comment
is followed directly by EOF (with no \n after the comment).

This, again, is triggered by the difference in EOF handling between
flex and re2c.

A simple hack would be to detect the ST_ONE_LINE_COMMENT state on EOF
and return the correct value, but I would prefer a more general fix.
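
A minimal way to check for this from userland, assuming only the standard tokenizer extension: tokenize a string that ends right after a single-line comment and compare the joined token texts with the input. The helper name `join_tokens` is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: concatenate the text of every token from token_get_all().
function join_tokens(array $tokens): string
{
    $out = '';
    foreach ($tokens as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        $out .= is_array($token) ? $token[1] : $token;
    }
    return $out;
}

// The source hits EOF right after the comment, with no trailing "\n".
$source = "<?php\n\$var = 5;\n// test";
$tokens = token_get_all($source);

// If the last comment is dropped, the joined text is shorter than the input.
var_dump(join_tokens($tokens) === $source);
```

On a build without the EOF problem the comparison should hold, since the tokens together are expected to cover the whole input.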

------------------------------------------------------------------------
[2008-12-09 22:35:46] master dot jexus at gmail dot com

Description:
------------
When using the tokenizer to lex given text, the output seems to miss
the last token if it is a single-line comment.

It only seems to occur if there isn't a newline after the comment
lexeme.

Note the last entries in the arrays.

Reproduce code:
---------------
<?php
print_r(token_get_all(file_get_contents(__FILE__)));

// test
$var = 5;
// test

Expected result:
----------------
Array
(
    [0] => Array
        (
            [0] => 367
            [1] => <?php

            [2] => 1
        )

    [1] => Array
        (
            [0] => 307
            [1] => print_r
            [2] => 2
        )

    [2] => (
    [3] => Array
        (
            [0] => 307
            [1] => token_get_all
            [2] => 2
        )

    [4] => (
    [5] => Array
        (
            [0] => 307
            [1] => file_get_contents
            [2] => 2
        )

    [6] => (
    [7] => Array
        (
            [0] => 364
            [1] => __FILE__
            [2] => 2
        )

    [8] => )
    [9] => )
    [10] => )
    [11] => ;
    [12] => Array
        (
            [0] => 370
            [1] => 


            [2] => 2
        )

    [13] => Array
        (
            [0] => 365
            [1] => // test

            [2] => 4
        )

    [14] => Array
        (
            [0] => 309
            [1] => $var
            [2] => 5
        )

    [15] => Array
        (
            [0] => 370
            [1] =>  
            [2] => 5
        )

    [16] => =
    [17] => Array
        (
            [0] => 370
            [1] =>  
            [2] => 5
        )

    [18] => Array
        (
            [0] => 305
            [1] => 5
            [2] => 5
        )

    [19] => ;
    [20] => Array
        (
            [0] => 370
            [1] => 

            [2] => 5
        )

    [21] => Array
        (
            [0] => 365
            [1] => // test
            [2] => 6
        )

)

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => 368
            [1] => <?php

            [2] => 1
        )

    [1] => Array
        (
            [0] => 307
            [1] => print_r
            [2] => 2
        )

    [2] => (
    [3] => Array
        (
            [0] => 307
            [1] => token_get_all
            [2] => 2
        )

    [4] => (
    [5] => Array
        (
            [0] => 307
            [1] => file_get_contents
            [2] => 2
        )

    [6] => (
    [7] => Array
        (
            [0] => 365
            [1] => __FILE__
            [2] => 2
        )

    [8] => )
    [9] => )
    [10] => )
    [11] => ;
    [12] => Array
        (
            [0] => 371
            [1] => 


            [2] => 2
        )

    [13] => Array
        (
            [0] => 366
            [1] => // test

            [2] => 4
        )

    [14] => Array
        (
            [0] => 309
            [1] => $var
            [2] => 5
        )

    [15] => Array
        (
            [0] => 371
            [1] =>  
            [2] => 5
        )

    [16] => =
    [17] => Array
        (
            [0] => 371
            [1] =>  
            [2] => 5
        )

    [18] => Array
        (
            [0] => 305
            [1] => 5
            [2] => 5
        )

    [19] => ;
    [20] => Array
        (
            [0] => 371
            [1] => 

            [2] => 5
        )

)


------------------------------------------------------------------------


