Dumb glob question
I've run into an issue with glob and matching filenames with brackets '[]'
in them. The problem comes when I'm using part of such a filename as the
path I'm passing to glob. Here's a trimmed down dumb example. Let's say I
have a directory with the following files in it.
foo.par2
foo.vol0+1.par2
foo.vol1+1.par2
zzz [foo].par2
zzz [foo].vol0+1.par2
zzz [foo].vol1+1.par2
While processing one of the files I want to do certain things in batch so
I've been using glob as a means to get all of the files in a set. The
following code will print the filenames for parity volumes in each set
while working with the base checksum, unless there are brackets in the
name.
    import glob, os, re
    re2 = re.compile(r'vol', re.IGNORECASE)
    for nuke in glob.glob('*.par2'):
        # Only the base checksum file (no 'vol' in its name) starts a set.
        if not re2.search(nuke):
            list = glob.glob(nuke[:-5]+'*vol*')
            for name in list:
                print(os.path.join(os.getcwd(), name))
I'm sure there is something obvious I'm missing. I figured I could use
something like re.escape on the trimmed filename for matching but that
hasn't worked either. Using win32api.FindFiles instead of glob works but
I'd obviously rather do it the _right_ way and have it work properly in
*nix too.
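
A minimal sketch of why the bracketed names fail, assuming glob's documented behaviour of matching with fnmatch-style patterns. In such a pattern "[foo]" is a character class that matches one character, "f" or "o", rather than the literal text "[foo]". The variable names are illustrative only.

```python
import fnmatch

name = "zzz [foo].par2"

# Pattern built the same way as in the loop above: strip ".par2", add "*vol*".
naive = name[:-5] + "*vol*"            # "zzz [foo]*vol*"

# fnmatchcase is used so the result does not depend on the platform's
# case normalisation.
print(fnmatch.fnmatchcase("zzz [foo].vol0+1.par2", naive))   # False
print(fnmatch.fnmatchcase("zzz f.vol0+1.par2", naive))       # True

# Wrapping "[" in its own character class makes it literal.
literal = name[:-5].replace("[", "[[]") + "*vol*"   # "zzz [[]foo]*vol*"
print(fnmatch.fnmatchcase("zzz [foo].vol0+1.par2", literal)) # True
```

Only the opening bracket needs this treatment here; a "]" with no open class before it is taken literally.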
--
http://mail.python.org/mailman/listinfo/python-list
Re: Dumb glob question
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote in comp.lang.python:
> Code like the one below will print all files ending in 'par2', except
> those not containing 'vol' from the 5th position on. Is that what you need?
>     import glob
>     for nuke in glob.glob(r"""c:\temp\*.par2"""):
>         try:
>             nuke.index('vol', 5)
>             print(nuke)
>         except ValueError as e:
>             print(e)
Not quite. I'm sorry my example wasn't very clear. While working with any
single file I need to be able to build a list of all the other files in a
particular set. Basically I just need globbing of the base filename.
glob.glob(basename+'.*some_extension')
So if I was working with 'foo.par2' at the moment...
glob.glob(filename[:-5]+'.*par2')
would catch all of the files belonging to the set, including 'foo.par2',
'foo.vol0+1.par2', 'foo.vol1+1.par2', etc.
This works great (as expected) until you are working with a filename with
brackets '[]' in it. Then glob just returns an empty list. So if I happen
to be processing 'foo [bar].par2'
glob.glob(filename[:-5]+'.*par2')
doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2')
works perfectly, but I don't want to rely on win32api functions. I hope
that made more sense :).
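
For readers on Python 3.4 or later (an assumption, since this thread predates it), the standard library's glob.escape() can quote the base name before the wildcard part is appended. The sketch below is one possible way to collect a set; the helper name files_in_set and its directory parameter are hypothetical.

```python
import glob
import os
import tempfile

def files_in_set(filename, directory="."):
    """Return the names of all '<base>.*par2' files in filename's set."""
    base = filename[:-5]   # strip ".par2", as in the examples above
    # Escape the literal parts; only ".*par2" is left as a live pattern.
    pattern = os.path.join(glob.escape(directory),
                           glob.escape(base) + ".*par2")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for n in ("foo.par2", "foo.vol0+1.par2",
              "zzz [foo].par2", "zzz [foo].vol0+1.par2"):
        open(os.path.join(d, n), "w").close()
    print(files_in_set("zzz [foo].par2", d))
    # ['zzz [foo].par2', 'zzz [foo].vol0+1.par2']
```

The directory is escaped as well, so a bracket in the path itself does not break the match.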
Re: Dumb glob question
Michael Hoffman <[EMAIL PROTECTED]> wrote in comp.lang.python:
> Python Dunce wrote:
>
>> So if I happen
>> to be processing 'foo [bar].par2'
>>
>> glob.glob(filename[:-5]+'.*par2')
>>
>> doesn't return anything. Using
>> win32api.FindFiles(filename[:-5]+'.*par2') works perfectly, but I don't
>> want to rely on win32api functions. I hope that made more sense :).
>
> If you look in the source for glob.py, you will find that it calls the
> fnmatch module, and this is the docstring for fnmatch.translate():
>
> """Translate a shell PATTERN to a regular expression.
>
> There is no way to quote meta-characters.
> """
>
> So you cannot do what you want with glob.
>
> You can replace [] with ? in your glob string, if you are sure that
> there won't be other characters there. That's a bit of a hack, and I
> wouldn't do it.
>
> In my mind it would probably be best to do:
>
> re_vol = re.compile(re.escape(startpart) + ".*vol.*")
> lst = [filename for filename in os.listdir(".") if
> re_vol.match(filename)]
>
> I changed "list" to "lst" because the former shadows a built-in.
Thanks, that should do the trick! I had tried basically the same thing
once but I was getting back empty lists. I think it was just a brain fart
involving a case sensitive regex that didn't match the files I was testing
it on :/.
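
A self-contained sketch of the os.listdir() plus re.escape() approach quoted above. The function name volumes_for and its names parameter are hypothetical, and re.IGNORECASE is an added assumption meant to sidestep the case-sensitivity slip just mentioned.

```python
import os
import re

def volumes_for(filename, names=None):
    """Return the parity volumes that belong to filename's set."""
    if names is None:
        names = os.listdir(".")      # default: scan the current directory
    startpart = filename[:-5]        # strip ".par2"
    # re.escape() makes "[", "]", "+" and friends literal in the regex.
    re_vol = re.compile(re.escape(startpart) + ".*vol.*", re.IGNORECASE)
    return [name for name in names if re_vol.match(name)]

listing = ["foo.par2", "foo.vol0+1.par2", "foo.vol1+1.par2",
           "zzz [foo].par2", "zzz [foo].VOL0+1.PAR2", "zzz [foo].vol1+1.par2"]

print(volumes_for("zzz [foo].par2", listing))
# ['zzz [foo].VOL0+1.PAR2', 'zzz [foo].vol1+1.par2']
print(volumes_for("foo.par2", listing))
# ['foo.vol0+1.par2', 'foo.vol1+1.par2']
```

Passing the listing in explicitly keeps the example independent of what happens to be in the working directory.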
