Re: [Tutor] PDF Scrapping

2015-11-25 Thread shawn wilson
On Nov 25, 2015 12:44 PM, "Francois Dion" wrote:
>
> if you have any choice at all, avoid PDF at all cost to get data.
>

Agreed, and IIRC all of that data should be available as XML somewhere (look
for their RPC pages). Probably start by searching for similar table names (and
Google dorking their site for appropriate APIs, and/or look through the code of
whatever tables you find). That's simpler than dealing with PDF. You might also
try emailing them and asking where the data came from (keeping in mind that
Thanksgiving is a federal holiday in the States, so you won't get a reply
until Monday at the earliest). OTOH, they can just tell you to go away since
PDF is "open" - YMMV.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] subprocess not returning

2014-04-22 Thread shawn wilson
This works when I have a class for ldd and nothing else, but when I
run it like this:
https://gist.github.com/ag4ve/11171201

I don't get any of the libraries and I can't figure out where it's failing.
['fattr', [['/testroot', 0, 0, 777], ['/bin/dash']]]
HERE1 [/testroot]
HERE2 [/bin/dash]
['ldd', ['/lib/x86_64-linux-gnu/libc.so.6', '/lib64/ld-linux-x86-64.so.2']]
HERE2 [/lib/x86_64-linux-gnu/libc.so.6]
['ldd', ['/lib64/ld-linux-x86-64.so.2']]
HERE2 [/lib64/ld-linux-x86-64.so.2]
['ldd', ['statically']]
HERE1 [statically]
HERE2 [/lib64/ld-linux-x86-64.so.2]
['ldd', ['statically']]
HERE1 [statically]
[   'filelist',
[['/testroot', 0, 0, 777], ['/bin/dash', 0, 0, '755'], [[[]], [

Obviously it's returning something - but no usable info.


Re: [Tutor] subprocess not returning

2014-04-22 Thread shawn wilson
On Tue, Apr 22, 2014 at 3:10 PM, Alan Gauld wrote:

> We have no clue what you are doing. You say "this works"
> but we can't see what 'this' is. Is the code on the
> pastebin link the working or the broken version?
>

As for the expected output (which I forgot to provide - sorry about
that), it should be something like this:
[
  ['/testroot', '0', '0', '777'],
  ['/bin/dash', 0, 0, '755'],
  ['/lib/x86_64-linux-gnu/libc.so.6', '0', '0', '777'],
  ['/lib64/ld-linux-x86-64.so.2', '0', '0', '777'],
  ['/lib/x86_64-linux-gnu/libc-2.17.so', '0', '0', '755'],
  ['/lib/x86_64-linux-gnu/ld-2.17.so', '0', '0', '755']
]

I.e., find the libraries a program is linked against (just run ldd against
every file, since I don't care about optimizing at this point), then find
their permissions, follow any symlinks, and do the same for the targets.
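
The "permissions plus follow symlinks" step described above could look roughly like the following. This is only a sketch under assumptions, not the gist's code: `attr_chain` is a hypothetical helper name, and it returns uid and gid as integers and the mode as an octal string, which may not match the exact types the real module produces.

```python
import os
import stat

def attr_chain(path):
    """Return [path, uid, gid, mode] for path and for each symlink target it leads to."""
    entries = []
    seen = set()  # guards against symlink loops
    while path not in seen:
        seen.add(path)
        st = os.lstat(path)  # lstat describes the link itself, not its target
        mode = format(stat.S_IMODE(st.st_mode), "o")  # permission bits as octal text
        entries.append([path, st.st_uid, st.st_gid, mode])
        if not stat.S_ISLNK(st.st_mode):
            break
        # A relative link target is resolved against the link's own directory.
        target = os.readlink(path)
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
    return entries
```

Calling `attr_chain('/lib/x86_64-linux-gnu/libc.so.6')` on a system where that path is a symlink should yield one entry for the link and one for the file it points to.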

Though, what I'm asking specifically is why __ldd isn't returning any
values in my module.


The simplest version I can come up with that shows the part working (and
which should also work in the gist code) is:
import subprocess
import sys
import pprint
pp = pprint.PrettyPrinter(indent=4)

class T:
  def ldd(filename):
    libs = []
    for x in filename:
      p = subprocess.Popen(["ldd", x],
        universal_newlines=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

      for line in p.stdout:
        s = line.split()
        pp.pprint(s)
        if "=>" in s:
          if len(s) == 3: # virtual library
            continue
          else:
            libs.append(s[2])
        else:
          if len(s) == 2:
            libs.append(s[0])

    return libs

if __name__ == "__main__":
  t = T
  fattr = [
    '/bin/dash'
  ]
  pp.pprint(["OUT", t.ldd(fattr)])


Past this, I can see that the ldd method is being called in my actual
code but nothing is being returned from it like it is here.
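
One way to narrow this down might be to split the parsing away from the subprocess call, so the parser can be checked against fixed text. The following is a minimal sketch and not the gist's code: `parse_ldd_output` is a hypothetical helper, and the sample lines (including the addresses) are made up to resemble typical glibc ldd output. It also assumes that the 'statically' entries in the debug output from the first message come from ldd printing "statically linked", which a two-field check would pick up as a library name.

```python
def parse_ldd_output(text):
    """Return the library paths found in ldd-style output text."""
    libs = []
    for line in text.splitlines():
        fields = line.split()
        if "=>" in fields:
            # "libc.so.6 => /lib/.../libc.so.6 (0x...)" carries a path in field 2;
            # "linux-vdso.so.1 => (0x...)" and "libfoo.so => not found" do not.
            if len(fields) >= 3 and fields[2].startswith("/"):
                libs.append(fields[2])
        elif len(fields) == 2 and fields[0].startswith("/"):
            # "/lib64/ld-linux-x86-64.so.2 (0x...)"; requiring a leading "/"
            # keeps "statically linked" from being read as a library.
            libs.append(fields[0])
    return libs

# Illustrative input only; the addresses are invented.
sample = (
    "\tlinux-vdso.so.1 =>  (0x00007fff)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f00)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f01)\n"
    "\tstatically linked\n"
)
print(parse_ldd_output(sample))
```

If the parser behaves on fixed text like this, the remaining suspects are how the result is passed back up and combined in the calling code.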

> It's also a very long listing. Can you produce a shorter
> example, perhaps with hard coded values that exhibits
> the problem?

I really did try to simplify t.py (where all of the data is included
in the script). I guess the reason for my question is that I'm not
sure what's not working or what to try next.