Re: GSoC :Project Idea(Before final Submission) for review and feedback

Oleg Endo Sun, 25 Mar 2012 02:56:34 -0700

Please reply in CC to the GCC mailing list, so others can follow the
discussion.


On Sun, 2012-03-25 at 09:21 +0530, Subrata Biswas wrote:
> On 25 March 2012 03:59, Oleg Endo <oleg.e...@t-online.de> wrote:
> >
> > I might be misunderstanding the idea...
> > Let's assume you've got a program that doesn't compile, and you leave
> > out those erroneous blocks to enforce successful compilation of the
> > broken program.  How are you going to figure out which blocks are
> > actually safe to remove and which aren't?
> 
> I can do it by tracing the code blocks which are dependent on the
> erroneous block, i.e. any block that is data/control dependent on the
> erroneous block or line of code (it reads a value that the erroneous
> part outputs or writes) will be eliminated.
> 
> > Effectively, you'll
> > be changing the original semantics of a program, and those semantic
> > changes might not be at all what the programmer originally had in
> > mind.  In the worst case, something might end up with an (un)formatted
> > harddisk...*
> >
> > Cheers,
> > Oleg
> >
> Thank you sir for your great feedback. You have understood it
> correctly. The programmer will be informed about the change in the
> code and the semantics. (Notice that this plug-in is not going to
> modify the original code! It just copies the original code and performs
> all the operations on the temporary file!) Even from the partial
> execution of the code the programmer will get an overview of his
> actual progress.
> 
> Suppose the program written by the programmer is:
> 
> 1 int main(void)
> 2 {
> 3    int arr[]={3,4,-10,22,33,37,11};
> 4    sort(arr);
> 5    int a = arr[3] // Now suppose the programmer missed the semicolon
>                     // here, which generates a compilation error at line 5.
> 6    printf("%d\n",a);
> 7    for(i=0;i<7;i++)
> 8    {
> 9        printf("%d\n",arr[i]);
> 10    }
> 11  }
> 
> 
> Now if we just analyze the data (i.e. the variables), we can easily find
> that the only data dependency is between line 5 and line 6.
> The rest of the program is not affected by eliminating or
> commenting out line 5.
> 
> Hence the temporary source file, after commenting out the erroneous
> part of the code and the code segment that depends on this
> erroneous part, would be:
> 
> 1 int main(void)
> 2 {
> 3    int arr[]={3,4,-10,22,33,37,11};
> 4    sort(arr);
> 5    //int a = arr[3] // Now suppose the programmer missed the
>      // semicolon here, which generates a compilation error at line 5.
> 6   // printf("%d\n",a);
> 7    for(i=0;i<7;i++)
> 8    {
> 9        printf("%d\n",arr[i]);
> 10    }
> 11  }
> 
> Now this part of the program (the broken program) is error free, so we
> can compile it using GCC and get the partial executable.
> 
> The possible output after compiling with GCC and this plug-in (if the
> programmer uses it) would be:
> 
> "You have a syntax error at line 5. To generate the partial
> executable, lines 5 and 6 have been removed in the temporary source
> file. To run the partial executable, execute p.out"
> 
> Advantages to the Programmer:
> 1. If the programmer can see the result of the partial executable, he/she
> can actually quantify his/her progress in the code.
> 2. Debugging becomes easier, as this plug-in would suggest
> possible corrections to the code etc.

I don't think it will make the actual debugging task easier.  It might
make writing code easier (that's what IDEs are doing these days while
you're typing code...).  In order to debug a program, the actual bugs
need to be _in_ the program, otherwise there is nothing to debug.
Removing arbitrary parts of the program could potentially introduce new
artificial bugs, just because of a missing semicolon.

> * I did not understand the worst case that you mentioned, the
> (un)formatted hard disk. Can you kindly explain it?
> 

Let's say I'm writing a kind of disk utility that reads and writes
sectors...

---------------------
source1.c:

bool
copy_sector (void* outbuf, const void* inbuf, int bytecount)
{
  if (bytecount < 4)
    return false;
  
  if ((bytecount & 3) != 0)
    return false;

  int* out_ptr = (int*)outbuf;
  const int* in_ptr = (const int*)inbuf;
  int count = bytecount / 4;

  do
  {
    int i = *in_ptr++;
    if (i & 1)
      i = do_something_special0 (i);
    else if (i & (1 << 16))
      i = do_something_special1 (i);
    *out_ptr++ = i;
  } while (--count);

  return true;
}

---------------------
source0.c:

int main (void)
{
  ...
  int sector_size = get_sector_size (...);
  void* sector_read_buf = malloc (sector_size);
  void* sector_write_buf = malloc (sector_size);

  while (sector_count > 0)
  {
    read_next_sector (sector_read_buf);
    if (copy_sector (sector_write_buf, sector_read_buf, sector_size))
      write_next_sector (sector_write_buf);
  }
  ...
}


Let's assume that in the function copy_sector in source1.c there is a
compile error (a misspelled variable name):

  do
  {
    int i = *in_ptr++;
    if (i & 1)
      i = do_something_special0 (i);
    else if (i & (1 << 16))
      i = do_something_special1 (i);

    *outptr++ = i;  // misspelled 'out_ptr'.
                    // There is no such variable 'outptr'.
                    // This line will be left out to make it compile.

  } while (--count);


If this broken program is executed it will happily transform data into
garbage.
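
To make that a bit more concrete, here is a rough, self-contained sketch
of what is left of copy_sector once the offending line is dropped.  The
bodies of do_something_special0/1, the name copy_sector_stripped and the
helper count_stale_words are made up for the illustration, and it assumes
a 4-byte int like the original snippet does.

```c
#include <assert.h>
#include <stdbool.h>

_Static_assert (sizeof (int) == 4, "the example assumes a 4-byte int");

/* Hypothetical stand-ins: the mail does not say what
   do_something_special0/1 actually do.  */
static int do_something_special0 (int i) { return i + 1; }
static int do_something_special1 (int i) { return i ^ (1 << 16); }

/* copy_sector as it would look after the line with the undeclared
   'outptr' has been left out.  */
bool
copy_sector_stripped (void* outbuf, const void* inbuf, int bytecount)
{
  if (bytecount < 4)
    return false;

  if ((bytecount & 3) != 0)
    return false;

  const int* in_ptr = (const int*)inbuf;
  int count = bytecount / 4;

  (void)outbuf;  /* nothing is stored through it any more */

  do
  {
    int i = *in_ptr++;
    if (i & 1)
      i = do_something_special0 (i);
    else if (i & (1 << 16))
      i = do_something_special1 (i);
    (void)i;     /* was: *outptr++ = i;  */
  } while (--count);

  return true;
}

/* Returns how many words of the write buffer still hold stale data
   although copy_sector_stripped reported success.  */
int
count_stale_words (void)
{
  enum { WORDS = 4, STALE = 0x5A5A5A5A };
  int in[WORDS] = { 10, 20, 30, 40 };
  int out[WORDS] = { STALE, STALE, STALE, STALE };

  bool ok = copy_sector_stripped (out, in, (int)sizeof in);
  assert (ok);  /* the caller would now go on to write_next_sector */

  int stale = 0;
  for (int k = 0; k < WORDS; k++)
    if (out[k] == STALE)
      stale++;
  return stale;
}
```

The function still returns true, so the loop in source0.c would call
write_next_sector with whatever happened to be in the write buffer.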


Another example could be (copy_sector function again):

  if (bytcount < 4)  // compile error again, if block is removed
    return false;    // to enforce compilation.

Now the copy_sector function will happily accept values <= 0 for
'bytecount' (as long as they are multiples of 4).  'count' then starts
at zero or below, the do-while loop keeps decrementing it, and this will
most likely end up in an integer overflow and a page fault...
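
A small sketch of that case, which only counts loop iterations and does
not touch any memory.  The two function names and the iteration cap are
assumptions for the illustration, they are not part of the original
code.

```c
#include <assert.h>
#include <stdbool.h>

/* The checks that are left in copy_sector after the
   'if (bytcount < 4)' block has been dropped.  */
bool
passes_remaining_checks (int bytecount)
{
  if ((bytecount & 3) != 0)
    return false;
  return true;
}

/* Runs the same do/while control flow as copy_sector, but only counts
   the iterations.  Gives up after 'cap' iterations so that the
   demonstration itself terminates.  */
long
iterations_before_cap (int bytecount, long cap)
{
  assert (cap > 0);

  int count = bytecount / 4;
  long n = 0;

  do
  {
    n++;          /* stands in for one 4-byte read and write */
    if (n >= cap)
      break;
  } while (--count);

  return n;
}
```

For bytecount = 16 the loop stops after 4 iterations.  For bytecount = 0
(or -4, -8, ...) the remaining check passes and the loop is still running
when the cap is hit, because 'count' starts at zero or below and
'--count' never reaches zero by counting down.  In the real function each
of those iterations would read and write another 4 bytes.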

Those might be overly extreme and/or silly examples.  What I'm trying to
say is that by leaving out program parts, there is a risk of introducing
artificial data corruption or artificial infinite loops that
accidentally overwrite data.
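
For what it's worth, here is how I picture the data dependency tracing
from the quoted proposal, as a rough sketch.  It is not how any existing
GCC plug-in works.  Statements are reduced to "writes these variables /
reads these variables" bitmasks, and everything that reads a variable
written by a removed statement is removed as well.  All names in it are
hypothetical, and it ignores statement order and control flow on
purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Variables of the two examples, one bit each.  */
enum { VAR_ARR = 1, VAR_A = 2, VAR_I = 4, VAR_BYTECOUNT = 8, VAR_COUNT = 16 };

/* One statement: which variables it writes and which it reads.  */
struct stmt
{
  unsigned defs;
  unsigned uses;
};

/* Marks in removed[] every statement that (transitively) reads a
   variable written by an already removed statement.  Returns the total
   number of removed statements.  */
size_t
propagate_removal (const struct stmt* s, bool* removed, size_t n)
{
  assert (s != NULL && removed != NULL);

  bool changed = true;
  while (changed)
  {
    changed = false;

    unsigned tainted = 0;
    for (size_t k = 0; k < n; k++)
      if (removed[k])
        tainted |= s[k].defs;

    for (size_t k = 0; k < n; k++)
      if (!removed[k] && (s[k].uses & tainted) != 0)
      {
        removed[k] = true;
        changed = true;
      }
  }

  size_t total = 0;
  for (size_t k = 0; k < n; k++)
    total += removed[k] ? 1 : 0;
  return total;
}

/* The quoted main(): the statement with the missing semicolon is index 2.  */
size_t
quoted_example_removed (bool removed[5])
{
  static const struct stmt prog[5] = {
    { VAR_ARR, 0 },              /* int arr[] = {...};        */
    { VAR_ARR, VAR_ARR },        /* sort (arr);               */
    { VAR_A, VAR_ARR },          /* int a = arr[3]   (error)  */
    { 0, VAR_A },                /* printf ("%d\n", a);       */
    { VAR_I, VAR_I | VAR_ARR },  /* for loop printing arr[i]  */
  };
  for (size_t k = 0; k < 5; k++)
    removed[k] = (k == 2);
  return propagate_removal (prog, removed, 5);
}

/* copy_sector: the misspelled guard is index 0 and writes no variable.  */
size_t
guard_example_removed (bool removed[3])
{
  static const struct stmt prog[3] = {
    { 0, VAR_BYTECOUNT },          /* if (bytcount < 4) return false; */
    { VAR_COUNT, VAR_BYTECOUNT },  /* int count = bytecount / 4;      */
    { VAR_COUNT, VAR_COUNT },      /* do { ... } while (--count);     */
  };
  for (size_t k = 0; k < 3; k++)
    removed[k] = (k == 0);
  return propagate_removal (prog, removed, 3);
}
```

For the quoted main() this removes exactly the two statements from the
proposal.  For the copy_sector guard it removes only the guard itself,
because the guard writes no variable, and the loop that relied on it
stays in the program.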

Cheers,
Oleg
