Gary Feldman wrote:

Internally, we have a flag "isRecursive" that we set on patterns that reflects whether the pattern potentially matches more than a single directory:

Without studying the code to figure out the intent of the recursive flag, my gut intuition is that only the ** implies recursion. That is, given a pattern

     **/foo/bar*
you have to check every folder in the tree to see if it has a match for the foo/bar* pattern. In other words, the foo/bar* must be applied at various levels in the tree. But why should
    */foo/bar*
be considered recursive?

The "recursive" flag indicates to the scanning engine that we need to recursively scan a set of directories beneath the "base" directory. The base directory is the literal part of the pattern's path that is always matched. In your two examples, since the wildcard is at the start of the pattern, the base directory in both cases is the base directory of the fileset itself.
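To make the distinction in your two examples concrete, here's a rough sketch of the matching semantics in Python (NAnt's scanner is C# and works quite differently internally; this is just an illustration of what the patterns mean):

```python
import re

def glob_to_regex(pattern):
    """Translate an Ant-style pattern into a regex (illustration, not NAnt's code)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern[i:i + 3] == "**/":
            parts.append("(.*/)?")   # any number of directory levels, including none
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")    # any run of characters within a single path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")     # a single character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")

deep = glob_to_regex("**/foo/bar*")   # recursive: foo can sit at any depth
one = glob_to_regex("*/foo/bar*")     # one wildcard level, then a literal foo

print(bool(deep.match("a/b/c/foo/bar.cs")))  # True
print(bool(one.match("a/b/c/foo/bar.cs")))   # False
print(bool(one.match("a/foo/bar.cs")))       # True
```

The key point is that "*" can never cross a "/", while "**/" can absorb any number of directory levels; that is what makes a pattern recursive in the matching sense.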

For the pattern "foo/**/bar.cs", the base directory is "foo".

We use the concept of base directories to aggregate a number of potentially recursive scans into one. For instance, if you are searching for:

foo/**/bar.cs
  and
foo/bar/*/bar.cs

you'll need to recursively scan from foo/ once and only once to ensure that you've checked all the files against both patterns.
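Sketched out, the aggregation amounts to collapsing nested base directories so that each base is walked only once. Illustrative Python, not NAnt's actual code, and it assumes the outer base's scan is recursive:

```python
def base_directory(pattern):
    """Literal directory prefix of a pattern (same idea as the earlier sketch)."""
    base = []
    for component in pattern.split("/")[:-1]:
        if "*" in component or "?" in component:
            break
        base.append(component)
    return "/".join(base)

def merge_bases(bases):
    """Collapse bases nested inside another base: one recursive walk from the
    outer base also covers everything beneath it (sketch only)."""
    merged = []
    for b in sorted(set(bases)):   # lexical sort puts parents before children
        if not any(b == m or b.startswith(m + "/") for m in merged):
            merged.append(b)
    return merged

patterns = ["foo/**/bar.cs", "foo/bar/*/bar.cs"]
print(merge_bases(base_directory(p) for p in patterns))  # ['foo']
```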

This algorithm works well for most cases, which is why it has existed for so long. :)

What we need to do is to recognize that with the following patterns:
foo/blah*/*/bar.cs
  and
foo/bar*/*/bar.cs

the base directory is still foo/, but we only need to recurse two directory levels deep.

Bonus points if you can recognize that you only need to recurse past a "blah*" or "bar*" node.
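The depth calculation itself isn't hard to sketch: once you hit the first wildcard component you start counting levels, and "**" makes the depth unbounded. Again, illustrative Python rather than anything in NAnt today:

```python
def max_depth(pattern):
    """How many directory levels below the base directory must be scanned.
    None means unbounded (the pattern contains "**"). Hypothetical helper."""
    depth = 0
    counting = False
    for component in pattern.split("/")[:-1]:   # skip the trailing file component
        if component == "**":
            return None                         # recursion of unbounded depth
        if "*" in component or "?" in component:
            counting = True                     # the literal base prefix has ended
        if counting:
            depth += 1
    return depth

print(max_depth("foo/blah*/*/bar.cs"))  # 2: recurse two levels below foo/
print(max_depth("foo/**/bar.cs"))       # None: unbounded
print(max_depth("foo/bar/baz.cs"))      # 0: no recursion needed at all
```

The bonus-points behavior would additionally check each directory name at the first level against "blah*" or "bar*" before descending, rather than only bounding the depth.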

I imagine the correct solution would involve some sort of tree structure in memory that gets built from the components of all the patterns and represents the ideal scanning strategy for the entire set. Dealing with patterns like "foo/**/bar/blah.cs" would be an interesting problem as well.
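Something like the following, where a component tree merged from all the patterns would let a single walk descend only into directories matching a child component such as "blah*" or "bar*". Purely hypothetical Python, just to show the shape of the idea:

```python
def build_scan_tree(patterns):
    """Merge pattern components into one tree that could drive a single directory
    walk. Hypothetical sketch of the tree-based strategy, not an implementation."""
    root = {}
    for p in patterns:
        node = root
        for component in p.split("/"):
            node = node.setdefault(component, {})  # share common prefixes
    return root

tree = build_scan_tree(["foo/blah*/*/bar.cs", "foo/bar*/*/bar.cs"])
print(tree)
# {'foo': {'blah*': {'*': {'bar.cs': {}}}, 'bar*': {'*': {'bar.cs': {}}}}}
```

During a walk, at the "foo" node the scanner would only descend into subdirectories whose names match one of the child components ("blah*" or "bar*"), skipping everything else.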

My explanation is all over the place, but I guess my final point is that the correct implementation is a Hard Problem and our current solution is Good Enough(tm) for most people. ;)

PS The tests should be there, anyway. One of the important points of agile methodologies is that a comprehensive test suite makes it much cheaper to fix design problems.

The DirectoryScanner code is one of the oldest pieces of NAnt and has a lot of black-box testing. :) While those tests show that the code produces correct results, it needs a lot more white-box testing to make sure that we're not wasting time scanning extra directories, etc. I strongly agree with comprehensive unit tests but, as a community-supported project, someone needs to step forward and either refactor the DirectoryScanner into a more testable object model or add appropriate debugging hooks to ensure that performance doesn't suffer.

The biggest reason for waiting is that NAnt 0.85 is getting very close to a release, and any major change could push the release back even further. After the release, I don't think anyone will mind a little refactoring.

Matt.


_______________________________________________
Nant-users mailing list
Nant-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nant-users
