On Sunday, February 4, 2018 at 5:32:51 PM UTC-6, Stanley Denman wrote:
> On Sunday, February 4, 2018 at 4:26:24 PM UTC-6, Stanley Denman wrote:
> > I am trying to parse a Python nested list, the result of the
> > getOutlines() method of the PyPDF2 module, using the pyparsing module. This
> > is the result I get. What in the world is 'expandtabs', and why is it making
> > a difference to my parse attempt?
> >
> > Python Code
> > import PyPDF2, pyparsing
> > from pyparsing import Word, alphas, nums
> > pdfFileObj=open('x.pdf','rb')
> > pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
> > List=pdfReader.getOutlines()
> > myparser = Word( alphas ) + Word(nums, exact=2) +"of" + Word(nums, exact=2)
> > myparser.parseString(List)
> >
> > This is the error I get:
> >
> > Traceback (most recent call last):
> > File "<pyshell#23>", line 1, in <module>
> > myparser.parseString(List)
> > File "C:\python\lib\site-packages\pyparsing.py", line 1620, in parseString
> > instring = instring.expandtabs()
> > AttributeError: 'list' object has no attribute 'expandtabs'
> >
> > Thanks so much, not getting any helpful responses from
> > https://python-forum.io.
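Following up on the traceback: expandtabs() is a str method, and the traceback shows parseString calling it on its input, so the error seems to mean I handed it a list where it wanted a single string. A minimal sketch of pulling the individual strings out of the nested list first. The titles are made up, iter_titles is just a helper name I picked, and plain dicts stand in for the outline entries, which I am assuming are dict-like objects with a '/Title' key:

```python
def iter_titles(outline):
    """Yield every '/Title' string from a nested outline list, depth first."""
    for item in outline:
        if isinstance(item, list):
            # A nested list holds the children of the entry before it.
            yield from iter_titles(item)
        else:
            yield item['/Title']

# Hypothetical stand-in for the result of pdfReader.getOutlines()
sample = [
    {'/Title': 'Section A'},
    [{'/Title': 'Exhibit 1A'}],
    {'/Title': 'Section F'},
    [{'/Title': 'Exhibit 1F'}, {'/Title': 'Exhibit 2F'}],
]

print(hasattr('a\tb', 'expandtabs'))   # a str has the method
print(hasattr(sample, 'expandtabs'))   # a list does not

titles = list(iter_titles(sample))
for title in titles:
    print(title)
```

Each title is a str, so something like myparser.parseString(title) could then be called on one title at a time instead of on the whole list.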
I have found that I can use the index values in the list to print out the
section I need. So print(MyList[7]) gets me to Section F, which is the one I
want, and print(MyList[9][1]), for example, gives me a string that is the
bookmark entry for Exhibit 1F. But these index values would presumably be
different for each pdf file: there may not always be Sections A-E, but there
will always be a Section F. In other words, the index values that get me to the
right section would be different in each pdf file.
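
One way around the shifting index values might be to look the section up by its title instead of its position. This is only a rough sketch under some assumptions: find_section is a name I made up, plain dicts with a '/Title' key stand in for the real outline entries, and I am assuming that the children of an entry show up as a nested list immediately after it.

```python
def find_section(outline, wanted):
    """Return the child list that follows the first entry whose title
    contains `wanted`, or None if no such entry exists."""
    for i, item in enumerate(outline):
        if isinstance(item, list):
            # Search inside nested levels as well.
            found = find_section(item, wanted)
            if found is not None:
                return found
        elif wanted in item['/Title']:
            nxt = outline[i + 1] if i + 1 < len(outline) else None
            # Assumption: children, if any, are the list right after the entry.
            return nxt if isinstance(nxt, list) else []
    return None

# Two hypothetical outlines: one with an earlier section, one without.
with_earlier = [
    {'/Title': 'Section A'},
    [{'/Title': 'Exhibit 1A'}],
    {'/Title': 'Section F'},
    [{'/Title': 'Exhibit 1F'}, {'/Title': 'Exhibit 2F'}],
]
f_only = [
    {'/Title': 'Section F'},
    [{'/Title': 'Exhibit 1F'}],
]

for outline in (with_earlier, f_only):
    exhibits = find_section(outline, 'Section F')
    print([entry['/Title'] for entry in exhibits])
```

The same call works on both lists even though Section F sits at a different index in each, which is the situation I expect across different pdf files.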
--
https://mail.python.org/mailman/listinfo/python-list