Gary Feldman wrote:

Internally, we have a flag "isRecursive" that we set on patterns that reflects whether the pattern potentially matches more than a single directory:

Without studying the code to figure out the intent of the recursive flag, my gut intuition is that only the ** implies recursion. That is, given a pattern

     **/foo/bar*
you have to check every folder in the tree to see if it has a match for the foo/bar* pattern. In other words, the foo/bar* must be applied at various levels in the tree. But why should
    */foo/bar*
be considered recursive?

The "recursive" flag indicates to the scanning engine that we need to recursively scan a set of directories beneath the "base" directory. The base directory is the literal part of the pattern's path that is always matched. In your two examples, since the wildcard is at the start of the pattern, the base directory in both cases is the base directory of the fileset itself.
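To make the distinction in your two examples concrete, here's a rough sketch of the matching semantics in Python (NAnt's scanner is C# and works quite differently internally; this is just an illustration of what the patterns mean):

```python
import re

def glob_to_regex(pattern):
    """Translate an Ant-style pattern into a regex (illustration, not NAnt's code)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern[i:i + 3] == "**/":
            parts.append("(.*/)?")   # any number of directory levels, including none
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")    # any run of characters within a single path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")     # a single character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")

deep = glob_to_regex("**/foo/bar*")   # recursive: foo can sit at any depth
one = glob_to_regex("*/foo/bar*")     # one wildcard level, then a literal foo

print(bool(deep.match("a/b/c/foo/bar.cs")))  # True
print(bool(one.match("a/b/c/foo/bar.cs")))   # False
print(bool(one.match("a/foo/bar.cs")))       # True
```

The key point is that "*" can never cross a "/", while "**/" can absorb any number of directory levels; that is what makes a pattern recursive in the matching sense.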

For the pattern "foo/**/bar.cs", the base directory is "foo".

We use the concept of base directories to aggregate a number of potentially recursive scans into one. For instance, if you are searching for:

foo/**/bar.cs
  and
foo/bar/*/bar.cs

you'll need to recursively scan from foo/ once and only once to ensure that you've checked all the files against both patterns.
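Sketched out, the aggregation amounts to collapsing nested base directories so that each base is walked only once. Illustrative Python, not NAnt's actual code, and it assumes the outer base's scan is recursive:

```python
def base_directory(pattern):
    """Literal directory prefix of a pattern (same idea as the earlier sketch)."""
    base = []
    for component in pattern.split("/")[:-1]:
        if "*" in component or "?" in component:
            break
        base.append(component)
    return "/".join(base)

def merge_bases(bases):
    """Collapse bases nested inside another base: one recursive walk from the
    outer base also covers everything beneath it (sketch only)."""
    merged = []
    for b in sorted(set(bases)):   # lexical sort puts parents before children
        if not any(b == m or b.startswith(m + "/") for m in merged):
            merged.append(b)
    return merged

patterns = ["foo/**/bar.cs", "foo/bar/*/bar.cs"]
print(merge_bases(base_directory(p) for p in patterns))  # ['foo']
```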

This algorithm works well for most cases, which is why it has existed for so long. :)

What we need to do is to recognize that with the following patterns:
foo/blah*/*/bar.cs
  and
foo/bar*/*/bar.cs

the base directory is still foo/, but we only need to recurse two directory levels deep.

Bonus points if you can recognize that you only need to recurse past a "blah*" or "bar*" node.
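The depth calculation itself isn't hard to sketch: once you hit the first wildcard component you start counting levels, and "**" makes the depth unbounded. Again, illustrative Python rather than anything in NAnt today:

```python
def max_depth(pattern):
    """How many directory levels below the base directory must be scanned.
    None means unbounded (the pattern contains "**"). Hypothetical helper."""
    depth = 0
    counting = False
    for component in pattern.split("/")[:-1]:   # skip the trailing file component
        if component == "**":
            return None                         # recursion of unbounded depth
        if "*" in component or "?" in component:
            counting = True                     # the literal base prefix has ended
        if counting:
            depth += 1
    return depth

print(max_depth("foo/blah*/*/bar.cs"))  # 2: recurse two levels below foo/
print(max_depth("foo/**/bar.cs"))       # None: unbounded
print(max_depth("foo/bar/baz.cs"))      # 0: no recursion needed at all
```

The bonus-points behavior would additionally check each directory name at the first level against "blah*" or "bar*" before descending, rather than only bounding the depth.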

I imagine the correct solution would involve some sort of tree structure in memory that gets built from the components of all the patterns and represents the ideal scanning strategy for the entire set. Dealing with patterns like "foo/**/bar/blah.cs" would be an interesting problem as well.
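Something like the following, where a component tree merged from all the patterns would let a single walk descend only into directories matching a child component such as "blah*" or "bar*". Purely hypothetical Python, just to show the shape of the idea:

```python
def build_scan_tree(patterns):
    """Merge pattern components into one tree that could drive a single directory
    walk. Hypothetical sketch of the tree-based strategy, not an implementation."""
    root = {}
    for p in patterns:
        node = root
        for component in p.split("/"):
            node = node.setdefault(component, {})  # share common prefixes
    return root

tree = build_scan_tree(["foo/blah*/*/bar.cs", "foo/bar*/*/bar.cs"])
print(tree)
# {'foo': {'blah*': {'*': {'bar.cs': {}}}, 'bar*': {'*': {'bar.cs': {}}}}}
```

During a walk, at the "foo" node the scanner would only descend into subdirectories whose names match one of the child components ("blah*" or "bar*"), skipping everything else.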

My explanation is all over the place, but I guess my final point is that the correct implementation is a Hard Problem and our current solution is Good Enough(tm) for most people. ;)

PS The tests should be there, anyway. One of the important points of agile methodologies is that a comprehensive test suite makes it much cheaper to fix design problems.

The DirectoryScanner code is one of the oldest pieces of NAnt and has a lot of black-box testing. :) While those tests show that the code produces correct results, it needs a lot more white-box testing to make sure that we're not wasting time scanning extra directories, etc. I strongly agree with comprehensive unit tests but, as a community-supported project, someone needs to step forward and either refactor the DirectoryScanner into a more testable object model or add appropriate debugging hooks to ensure that performance doesn't suffer.

The biggest reason for waiting is that NAnt 0.85 is getting very close to a release, and any major change could push the release back even further. After the release, I don't think anyone will mind a little refactoring.

Matt.


_______________________________________________
Nant-users mailing list
Nant-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nant-users
