python guru.. for a short conversation regarding bittorrent..
hi..

i'm not a python developer, but i have a few questions regarding python (some quite basic), and bittorrent. i'm looking to talk to someone/anyone who has experience with the infrastructure of bittorrent, not just running a bittorrent client app... the bittorrent mailing lists/groups haven't been responsive, so i figured i'd try here...

if there's anyone here that i could talk with (phone) who's knowledgeable about these areas, i'd appreciate it. i'm trying to get a much better understanding of the actual underlying app.

thanks

-bruce
[EMAIL PROTECTED]
--
http://mail.python.org/mailman/listinfo/python-list
RE: Any college offering Python short term course?
hey...

i'm looking for classes (advanced) in python/php in the bay area as well... actually i'm looking for the students/teachers/profs of these classes... any idea as to how to find them. calling the various schools hasn't really been that helpful. The schools/institutions haven't had a good/large selection... it appears that some of the classes are taught by adjunct/part-time faculty, and they're not that easy to get to...

if anybody knows of user-groups that also have this kind of talent, i'd appreciate it as well...

send responses to the list as well!!!

thanks

-bruce

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of arches73
Sent: Sunday, November 20, 2005 4:04 PM
To: [email protected]
Subject: Any college offering Python short term course?

Hi,

I want to learn Python. I appreciate if someone point me to the colleges / institutions offering any type of course in Python programming in the Bay area CA. Please send me the links to my email.

Thanks,
Arches
amazon web services/api...
hi...

has anybody ever played (successfully) with the amazon web services/api? i'm trying to figure out how i can use the browsenodeid to generate ISBN information for a book. the examples i've seen on various sites haven't been much help yet.

thanks

-bruce
[EMAIL PROTECTED]
fsm - revert to previous state (newbie)
Hi all,

I'm working on an fsm that looks up its transitions in some dictionaries. I've defined a default transition to an "error_state" that is launched when there is no match of input_symbol, state etc. in the other dicts. The error state just prints out that user_input was not defined. After this I want to revert to the state before the error, so the user can try again.

My solution now is to update, inside the fsm loop, the value in the dictionary state_transitions whose key is "state_error". This is not very elegant, I think. Does anyone know of a better way?
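
One possible approach, sketched below under assumptions: treat an undefined input as something the loop reports while staying in the current state, so there is no error state to revert from and the transition table is never rewritten. The table layout, the state names and the helpers `step` and `run` are all hypothetical, since the original fsm code is not shown.

```python
# Hypothetical transition table: (state, input_symbol) -> next state.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "stop"): "idle",
}


def step(state, symbol):
    """Return (new_state, ok); on undefined input the state is unchanged."""
    try:
        return TRANSITIONS[(state, symbol)], True
    except KeyError:
        return state, False


def run(symbols, state="idle"):
    """Feed symbols through the machine and collect error messages."""
    errors = []
    for symbol in symbols:
        state, ok = step(state, symbol)
        if not ok:
            errors.append("input %r is not defined in state %r" % (symbol, state))
    return state, errors
```

If a real error state is needed (for example because it has its own actions), the same idea works by keeping a `previous_state` variable in the loop and assigning it back after the error action has run.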
Re: How to clear screen in Python interactive shell mode?
elif os.name in ("nt", "dos", "ce"):
# emacs/Windows
What's the right statement here?
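
For the `os.name` branch asked about above, a minimal sketch: Windows-family consoles clear with the `cls` command and POSIX terminals with `clear`. The helper names are hypothetical, and this does not cover a shell running inside an Emacs buffer, where neither command may have any visible effect.

```python
import os


def clear_command(os_name=os.name):
    """Return the shell command that clears the terminal for this OS name."""
    if os_name in ("nt", "dos", "ce"):
        return "cls"
    return "clear"


def clear_screen():
    """Clear the terminal by running the platform's clear command."""
    os.system(clear_command())
```

Calling `clear_screen()` from the interactive prompt runs the command in the terminal the interpreter is attached to.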
possible python/linux/gnome issue!!
hi...

i'm running rh8.0 with gnome.. i'm not sure of the version (it's whatever rh shipped). i've recently updated (or tried to update) python to the latest version. when i try to run the 'Server Settings/Services' Icon within gnome, nothing happens...

i tried to run the 'start services' command from a command line, and got the following... it appears to be a conflict somewhere...

[EMAIL PROTECTED] bin]# redhat-config-services
/usr/share/redhat-config-services/serviceconf.py:331: SyntaxWarning: argument named None
  def on_mnuRescan_activate(self,None):
/usr/share/redhat-config-services/serviceconf.py:342: SyntaxWarning: argument named None
  def on_optRL3_activate(self, None):
/usr/share/redhat-config-services/serviceconf.py:351: SyntaxWarning: argument named None
  def on_optRL4_activate(self, None):
/usr/share/redhat-config-services/serviceconf.py:375: SyntaxWarning: argument named None
  def on_optRL5_activate(self, None):
/usr/share/redhat-config-services/serviceconf.py:409: SyntaxWarning: argument named None
  def on_selectCursor(self,None):
/usr/share/redhat-config-services/serviceconf.py:419: SyntaxWarning: argument named None
  def on_btnSave_clicked(self, None):
/usr/share/redhat-config-services/serviceconf.py:435: SyntaxWarning: argument named None
  def on_btnRevert_clicked(self, None):
/usr/share/redhat-config-services/serviceconf.py:456: SyntaxWarning: argument named None
  def on_btnStart_clicked(self,None):
/usr/share/redhat-config-services/serviceconf.py:462: SyntaxWarning: argument named None
  def on_btnStop_clicked(self,None):
/usr/share/redhat-config-services/serviceconf.py:468: SyntaxWarning: argument named None
  def on_btnRestart_clicked(self,None):
Traceback (most recent call last):
  File "/usr/share/redhat-config-services/serviceconf.py", line 24, in ?
    import gtk
ImportError: No module named gtk
[EMAIL PROTECTED] bin]#

can someone perhaps suggest a solution, or point me to where i might find a solution to this issue.
could the python changes be causing a problem? would upgrading gnome potentially fix the issue? can you upgrade gnome without upgrading the rh kernel? i'm posting here, because it might be a python related issue, in that i might have to upgrade other apps to match the new version of python. any ideas/thoughts/etc.. would be appreciated... this is rather frustrating... thanks -bruce [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
python upgrade process...
hi... i currently have a rh8.0 server that is a little messed up!! i need to know what's the best/right process to upgrade to the latest python/mod_python, and any associated libs i need to make sure that i don't break the gnome/httpd, or any other python dependent app... currently i have different versions on the box, but it appears when i do a $>python, i get the older version... which is not what i need/want!!! so, i'm asking, what's the best process for doing an update. i've tried 'yum update' but it gives some sort of baseurl error... 'apt-get' doesn't seem to work either... however. if you suggest that these tools would be the best approach, would these tools actually get all required dependencies/etc... thanks.. bruce -- http://mail.python.org/mailman/listinfo/python-list
python/mod_python conflicts...
hi...

i have a linux redhat8 server. i'm trying to get python and mod_python to play nicely, meaning that i have the right mod_python for the python that i've installed. it appears that the box has multiple versions of python.

when i'm 'root' the python version is 2.2.1
when i'm a user 'test', the version is 2.3.5

when i do a 'rpm -q python', i get python-2.2.1-17 as being what was/is installed via 'rpm'
when i do a 'rpm -q mod_python', i get mod_python-3.1.3-5 as being what was/is installed via 'rpm'

---
i tried to do an upgrade of the python/mod_python using the rpms from redhat for RH8 and RH9 and got the following err msg from the command line python/interpreter:

>>> import mod_python
.
.
.
ImportError: No module named psp

This happened for both the RH8 and RH9 rpms that I used...

so... any ideas as to how to get this situation to work/resolved...

thanks

-bruce
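
When two accounts see different Python versions, it can help to have each account report which interpreter binary it is actually running and whether the module imports there. The sketch below is only an illustration: `interpreter_report` is a hypothetical helper, and it is written for a current Python 3, not for the 2.2/2.3 interpreters in the message (on those, `sys.executable` and `sys.version` can be read by hand at the prompt).

```python
import sys


def interpreter_report(module_name="mod_python"):
    """Describe the running interpreter and whether module_name imports."""
    report = {
        "executable": sys.executable,          # path of the binary in use
        "version": sys.version.split()[0],     # e.g. "3.11.4"
        "module": module_name,
    }
    try:
        __import__(module_name)
        report["importable"] = True
    except ImportError as exc:
        report["importable"] = False
        report["error"] = str(exc)
    return report
```

Running `interpreter_report()` once as each user and comparing the `executable` entries shows whether the two accounts resolve `python` to different installs through their `PATH`.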
python/linux guru needed.. now!!!!
hi...

i have a situation with a linux rh8 server. i can't seem to get python/mod_python/apache working as one... i can't seem to get 'import mod_python' to work from the python interpreter, and i'm not sure as to why. i'm fairly convinced that it's a conflict issue of some type, but i'm not sure as to how to resolve it...

if you are a guru with python/mod_python/linux then i'd like to talk with you... searching through google/mailing lists/etc... is getting me nowhere!!

thanks

bruce
RE: Working on a log in script to my webpage
hi...

regarding the issue of creating a login (user/passwd) script... there are numerous example scripts/apps written that use php/mysql... i suggest that you take a look at a few and then incorporate the features that you want into your script.

from your questions, it seems like this approach will give you a better/faster solution to your problem.

-regards

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steve Holden
Sent: Tuesday, March 08, 2005 4:02 PM
To: [email protected]
Subject: Re: Working on a log in script to my webpage

Pete. wrote:
> Hi all I am working on a log in script for my webpage.
>
> I have the username and the password stored in a PostgreSQL database.
>
> The first I do is I make a html form, where the user can type in his
> username and code, when this is done I want to run the
> script(testifcodeisokay) that verifies that the code and username are the
> right ones ( that means if they match the particular entered username and
> password) If they are then I want to load page1 if they are not I want to
> load the loginpage again.
>
> Login page:
>
> print '''
> Username:
> Code: '''
>
> print ''
> print ''' '''
>
> This works.
> Here I store the entered text in the variables "username" and "code"
> I then get the entered value by
>
> testifcodeisokay script
>
> connect = PgSQL.connect(user="user", password="password", host="host",
> database="databse")
> cur = connect.cursor()
>
> form = cgi.FieldStorage()
> username = form["username"].value
> code= form["code"].value
>
> I then want to test if they match the ones in the database
>
> insert_command = "SELECT username, code FROM codetable WHERE
> codetable.username = '%s' AND codetable.code = '%s' " %(username, code)
> cur.execute(insert_command)
>
This is an amazingly bad choice of variable name, since the command
doesn't actually insert anything into the database!

> I should then have found where the entered username,code (on the login page)
> is the same as those in the database.
>
> But now I am stuck.
>
> Does any know how I can then do something like:
>
> If the codes from the loginpage matches the users codes in the db
> Then the user should be taken to page1
> IF the codes arnt correct the login page should load again.
>
> The program dosnt need to remember who the user is, after the user has been
> loggen in, it is only used to log the user in.
>
> Thanks for your time..
>
The Python you want is almost certainly something like

if len(curs.fetchall()) == 1:
    # username/password was found in db

although unless your database is guarantees to contain only one of each
combination it might be better to test

if len(curs.fetchall()) != 0:
    # username/password was found in db

>
>
There are other matters of concern, however, the most pressing of which is:

How am I going to stop user from navigating directly to page1?

Answering this question will involve learning about HTTP session state
and writing web applications. I could write a book on that subject :-)

regards
 Steve
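
The quoted script pastes `username` and `code` into the SQL text with `%`, which lets a quote character in the form input change the query. A minimal sketch of the same check with bound parameters follows. It uses the standard-library `sqlite3` module and an in-memory table purely so that it runs on its own; that is an assumption, not the PgSQL driver from the message, and the placeholder style (`?` here) differs between DB-API drivers. The helper name `credentials_match` and the sample row are hypothetical.

```python
import sqlite3


def credentials_match(cursor, username, code):
    """Return True if at least one row matches the username/code pair."""
    # The driver binds the values, so quotes in the input stay plain data.
    cursor.execute(
        "SELECT 1 FROM codetable WHERE username = ? AND code = ?",
        (username, code),
    )
    return cursor.fetchone() is not None


# Stand-in database so the sketch is self-contained.
connection = sqlite3.connect(":memory:")
cur = connection.cursor()
cur.execute("CREATE TABLE codetable (username TEXT, code TEXT)")
cur.execute("INSERT INTO codetable VALUES (?, ?)", ("pete", "s3cret"))

print(credentials_match(cur, "pete", "s3cret"))       # True
print(credentials_match(cur, "pete", "' OR '1'='1"))  # False
```

The CGI script would then print page1 when the function returns True and the login form otherwise. This still does nothing about the direct-navigation problem raised at the end of the reply.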
RE: Working on a log in script to my webpage
pete...

simply use google and search for "php scripts login user authentication mysql session etc..."

these terms will give you lots of examples... you could also look at some of the bulletin board/forum apps that are open source to see what they use. or, you could also look through the code for some of the php content management apps... of course, there are also the open source ecommerce solutions. all of these types of apps have functionality to deal with the user login/registration issues...

-regards,,,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Pete.
Sent: Tuesday, March 08, 2005 6:26 PM
To: [email protected]
Subject: Re: Working on a log in script to my webpage

The 2 scripts I made is actually working the way they where meant to. So im kindda happy :)

The problem is, that I didnt think about the problem: as Steve wrote: "There are other matters of concern, however, the most pressing of which is: How am I going to stop user from navigating directly to page1?"

Maybee I can find some premade feature, that prevents users to go to page1 without logging in. Any ideas as to where I can find some information about this.

Nice that you all take time to help a newbie, so thanks to the helpfull people :)

> hi...
>
> regarding the issue of creating a login (user/passwd) script... there are
> numerous example scripts/apps written that use php/mysql... i suggest that
> you take a look at a few and then incoporate the features that you want
> into
> your script.
>
> from your questions, it seems like this approach will give you a
> better/faster solution to your problem.
>
> -regards
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf
> Of Steve Holden
> Sent: Tuesday, March 08, 2005 4:02 PM
> To: [email protected]
> Subject: Re: Working on a log in script to my webpage
>
>
> Pete. wrote:
>> Hi all I am working on a log in script for my webpage.
>>
>> I have the username and the password stored in a PostgreSQL database.
>>
>> The first I do is I make a html form, where the user can type in his
>> username and code, when this is done I want to run the
>> script(testifcodeisokay) that verifies that the code and username are the
>> right ones ( that means if they match the particular entered username and
>> password) If they are then I want to load page1 if they are not I want to
>> load the loginpage again.
>>
>> Login page:
>>
>> print '''
>> Username:
>> Code: '''
>>
>> print ''
>> print ''' '''
>>
>> This works.
>> Here I store the entered text in the variables "username" and "code"
>> I then get the entered value by
>>
>> testifcodeisokay script
>>
>> connect = PgSQL.connect(user="user", password="password", host="host",
>> database="databse")
>> cur = connect.cursor()
>>
>> form = cgi.FieldStorage()
>> username = form["username"].value
>> code= form["code"].value
>>
>> I then want to test if they match the ones in the database
>>
>> insert_command = "SELECT username, code FROM codetable WHERE
>> codetable.username = '%s' AND codetable.code = '%s' " %(username, code)
>> cur.execute(insert_command)
>>
> This is an amazingly bad choice of variable name, since the command
> doesn't actually insert anything into the database!
>
>> I should then have found where the entered username,code (on the login
> page)
>> is the same as those in the database.
>>
>> But now I am stuck.
>>
>> Does any know how I can then do something like:
>>
>> If the codes from the loginpage matches the users codes in the db
>> Then the user should be taken to page1
>> IF the codes arnt correct the login page should load again.
>>
>> The program dosnt need to remember who the user is, after the user has
> been
>> loggen in, it is only used to log the user in.
>>
>> Thanks for your time..
>>
> The Python you want is almost certainly something like
>
> if len(curs.fetchall()) == 1:
>     # username/password was found in db
>
> although unless your database is guarantees to contain only one of each
> combination it might be better to test
>
> if len(curs.fetchall()) != 0:
>     # username/password was found in db
>>
>>
> There are other matters of concern, however, the most pressing of which
> is:
>
> How am I going to stop user from navigating directly to page1?
>
> Answering this question will involve learning about HTTP session state
> and writing web applications. I could write a book on that subject :-)
>
> regards
> Steve
RE: How to create stuffit files on Linux?
noah,

i'm fairly certain that stuffit will accommodate a number of formats, including zip. if you look around, you probably have open source that will create zip, which can then be read by stuffit...

stuffit also provides an sdk that can probably be used to create what you need. check their site, or use google to get a better feel for what it provides..

-bruce

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Noah
Sent: Tuesday, March 15, 2005 2:03 PM
To: [email protected]
Subject: How to create stuffit files on Linux?

I need to create Stuffit (.sit) files on Linux. Does anyone have any ideas for how to do this? I checked the Python docs and on SourceForge, but I didn't see any open source stuffit compatible libraries. Are my Mac users out of luck?

Yours,
Noah
python/svn issues....
hi...

in trying to get viewcvs up/running, i tried to do the following:

[EMAIL PROTECTED] viewcvs-0.9.2]# python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import svn.repos
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages/svn/repos.py", line 19, in ?
  File "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages/svn/fs.py", line 28, in ?
  File "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages/libsvn/fs.py", line 4, in ?
ImportError: /usr/lib/libsvn_fs_base-1.so.0: undefined symbol: db_create
>>>

the undefined symbol is obviously a deal breaker!!!

so, i'm trying to figure out what's going wrong... i'm not sure if this is a python/svn error...

i'm on fedora core 2,
viewcvs from sourceforge.cvs
subversion-1.1.4-0.1.1.fc2 from dag repos (devel/perl)
db4-4.2.52-6

google turned up a few instances of this, but no solution..

any ideas as to what's occurring...

thanks

bruce
[EMAIL PROTECTED]
RE: python/svn issues....
david...

thanks for the reply...

it's starting to look as though the actual /usr/lib/libdb-4.2.so from the rpm isn't exporting any of the symbols...

when i do:
nm /usr/lib/libdb-4.2.so | grep db_create

i get
nm: /usr/lib/libdb-4.2.so: no symbols

which is strange... because i should be getting the db_create symbol...

i'll try to build berkeley db by hand and see what i get...

if you could try the 'nm' command against your berkeley db.. i'd appreciate you letting me know what you get..

thanks

bruce

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of David M. Cooke
Sent: Tuesday, April 12, 2005 12:46 PM
To: [email protected]
Subject: Re: python/svn issues

"bruce" <[EMAIL PROTECTED]> writes:

> hi...
>
> in trying to get viewcvs up/running, i tried to do the following:
>
> [EMAIL PROTECTED] viewcvs-0.9.2]# python
> Python 2.3.3 (#1, May 7 2004, 10:31:40)
> [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import svn.repos
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /svn/repos.py", line 19, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /svn/fs.py", line 28, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /libsvn/fs.py", line 4, in ?
> ImportError: /usr/lib/libsvn_fs_base-1.so.0: undefined symbol: db_create

This looks like a problem when Subversion was built: this library was not linked against the Berkeley DB libraries. You can check what's linked using ldd, and see the unresolved symbols with ldd -r.

For instance, on my AMD64 Debian system,

$ ldd -r /usr/lib/libsvn_fs_base-1.so.0
        libsvn_delta-1.so.0 => /usr/lib/libsvn_delta-1.so.0 (0x002a95696000)
        libsvn_subr-1.so.0 => /usr/lib/libsvn_subr-1.so.0 (0x002a9579f000)
        libaprutil-0.so.0 => /usr/lib/libaprutil-0.so.0 (0x002a958c8000)
        libldap.so.2 => /usr/lib/libldap.so.2 (0x002a959e)
        liblber.so.2 => /usr/lib/liblber.so.2 (0x002a95b19000)
        libdb-4.2.so => /usr/lib/libdb-4.2.so (0x002a95c27000)
        libexpat.so.1 => /usr/lib/libexpat.so.1 (0x002a95e05000)
        libapr-0.so.0 => /usr/lib/libapr-0.so.0 (0x002a95f29000)
        librt.so.1 => /lib/librt.so.1 (0x002a9604e000)
        libm.so.6 => /lib/libm.so.6 (0x002a96155000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x002a962dc000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x002a963f2000)
        libc.so.6 => /lib/libc.so.6 (0x002a96506000)
        libdl.so.2 => /lib/libdl.so.2 (0x002a96746000)
        libcrypt.so.1 => /lib/libcrypt.so.1 (0x002a96849000)
        libresolv.so.2 => /lib/libresolv.so.2 (0x002a9697c000)
        libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x002a96a91000)
        libgnutls.so.11 => /usr/lib/libgnutls.so.11 (0x002a96ba8000)
        /lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00552000)
        libtasn1.so.2 => /usr/lib/libtasn1.so.2 (0x002a96d1b000)
        libgcrypt.so.11 => /usr/lib/libgcrypt.so.11 (0x002a96e2b000)
        libgpg-error.so.0 => /usr/lib/libgpg-error.so.0 (0x002a96f77000)
        libz.so.1 => /usr/lib/libz.so.1 (0x002a9707b000)

If it doesn't look like that, then I'd say your Subversion package was built badly. You may also want to run ldd on your svn binary, to see what libraries it pulls in.

--
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
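
A note on the `nm` result in this message: plain `nm` reads a library's regular symbol table, which is removed when the library is stripped, so "no symbols" from a packaged `.so` does not by itself mean nothing is exported. The dynamic symbols are listed by `nm -D`. The same question can also be asked from Python with `ctypes`, as in the sketch below; `exports_symbol` is a hypothetical helper name and the path in the usage note is simply the one from the message.

```python
import ctypes


def exports_symbol(library_path, symbol):
    """Return True if the shared library loads and exports the symbol."""
    try:
        library = ctypes.CDLL(library_path)
    except OSError:
        # The library is missing or could not be loaded at all.
        return False
    # Attribute lookup on a CDLL asks the dynamic loader for the symbol
    # and raises AttributeError when it is absent, so hasattr() suffices.
    return hasattr(library, symbol)
```

For example, `exports_symbol("/usr/lib/libdb-4.2.so", "db_create")` would report whether that particular build exports the name the Subversion library is looking for.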
python guru... ViewCVS
hi... i'm playing/testing a python file (viewcvs.cgi/viewcvs.pyc) and i need to talk with someone for a few minutes... i have no knowledge of python. the file apparently opens up/reads a config file. however, i'm confused at figuring out how to debug the file, to see what/where it's going in the file... i'm pretty sure that there is an issue with the config file. but the weird thing is, i've actually changed the name of the config file, and i still get the same err from the app, which is weird... i've tried to print out a few err/msg statements, but i'm screwing something up, because it appears that the web page can't print the print statements... any thoughts/ideas/help would be greatly appreciated... thanks bruce [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
trying to parse a file...
hi,

i'm trying to modify an app (gforge) that uses python to do some file parsing/processing...

i have the following shell file that uses python. if i understand it correctly, it's supposed to modify the 'viewcvs.conf' file, and replace/update the section with 'svn_roots'.

it isn't working correctly... can anybody tell me what i need to do? basically, i'd like to continually add to the svn_root: block with an additional line as required. also, can someone tell me what i'd need to do, if i wanted to remove a line of text from the 'svn_root' block if i had a given 'test_x:'

thanks

bruce
[EMAIL PROTECTED]

---
viewcvs.conf:
.
.
.
#
# This setting specifies each of the Subversion roots (repositories)
# on your system and assigns names to them. Each root should be given
# by a "name: path" value. Multiple roots should be separated by
# commas and can be placed on separate lines.
#
#svn_roots = test2: /svn-gforge/uploadsvn
svn_roots = test5: /gforge-svn/test7/svn,
            test2: /gforge-svn/test7/svn,
            test3: /gforge-svn/test7/svn,

# The 'root_parents' setting specifies a list of directories in which
# any number of repositories may reside. Rather than force you to add
.
.
.
---

[EMAIL PROTECTED] bin]# cat test.sh
---
#! /bin/sh
python < /var/lib/gforge/etc/viewcvs.py
---
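
The two operations asked about above (add a "name: path" entry, remove the entry for a given name) can be sketched as plain text processing on the contents of viewcvs.conf. This is not the gforge script, which is not shown in the message. The function names are hypothetical, and the sketch assumes that continuation lines of the `svn_roots` value are indented, as in the sample, and that the commented-out `#svn_roots` line should be left alone.

```python
import re

# The active "svn_roots = ..." line plus any indented continuation lines.
BLOCK = re.compile(r"^svn_roots[ \t]*=.*(?:\n[ \t]+\S.*)*", re.MULTILINE)


def read_roots(conf_text):
    """Return the configured roots as a list of (name, path) tuples."""
    match = BLOCK.search(conf_text)
    if match is None:
        return []
    value = match.group(0).split("=", 1)[1]
    roots = []
    for item in value.split(","):
        item = item.strip()
        if ":" in item:
            name, path = item.split(":", 1)
            roots.append((name.strip(), path.strip()))
    return roots


def write_roots(conf_text, roots):
    """Return conf_text with the svn_roots block rebuilt from roots."""
    entries = ["%s: %s," % (name, path) for name, path in roots]
    block = "svn_roots = " + "\n            ".join(entries)
    if BLOCK.search(conf_text):
        return BLOCK.sub(lambda match: block, conf_text, count=1)
    return conf_text + "\n" + block + "\n"


def add_root(conf_text, name, path):
    """Add a root, replacing any existing entry with the same name."""
    roots = [(n, p) for n, p in read_roots(conf_text) if n != name]
    roots.append((name, path))
    return write_roots(conf_text, roots)


def remove_root(conf_text, name):
    """Drop the entry called name, if present."""
    roots = [(n, p) for n, p in read_roots(conf_text) if n != name]
    return write_roots(conf_text, roots)
```

A caller would read the file into a string, apply `add_root` or `remove_root`, and write the result back; working on a copy of the file first seems prudent.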
internet explorer/firefox plugin/toolbar
hi... this probably isn't the correct mailing list, but we're giving it a shot!! does anyone have any ideas as to how to go about creating a plugin/toolbar for both/either the IE/Firefox browsers? We're curious as to how to do this, and what languages/technologies you'd use to do this.. could it be done in python??? searching google didn't really turn up anything for the IE side of things... any comments/assistance/etc would be useful... thanks bruce [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
find matching contiguous text
Hi. I have a xpath test that generates the text/html between two xpath functions, basically a chunk of HTML between two dom elements. However, it's slow. As a test, I'd like to compare the speed if I get all the HTML following a given element, and then get all the HTML preceding a given element.. and then do a "union/join/intersection" of the text between the two text segments. I'm trying to find an efficient/effective approach to determining the contiguous matching text, where the text starts with the 1st line in the test from the following element test. Thanks -- https://mail.python.org/mailman/listinfo/python-list
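
For the comparison described above, the text between the two elements is the longest run of lines that both starts the "following" text and ends the "preceding" text. A minimal sketch, assuming both texts have already been split into lines (the function name is hypothetical):

```python
def overlap(following_lines, preceding_lines):
    """Longest run that starts following_lines and ends preceding_lines."""
    limit = min(len(following_lines), len(preceding_lines))
    # Try the largest candidate first so the longest overlap wins.
    for size in range(limit, 0, -1):
        if following_lines[:size] == preceding_lines[-size:]:
            return following_lines[:size]
    return []
```

If the total number of lines in the document is known, the size of the overlap can instead be computed directly as `len(following_lines) + len(preceding_lines) - total`, which avoids the repeated comparisons and is not fooled by repeated content.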
crawling/parsing a webpage based on dynamic javascript
Hi.

Looking at using python/celery/twisted to test in parsing a test site. Also looking at being able to parse a site created using dynamic javascript.

I've got test apps to parse a site, but I'm interested in getting a better understanding of using multi-thread/multi-processing approaches to spin out as many fetch processes as possible.

At the same time, I'm interested in understanding a bit better what's used for parsing the javascript pages in the py world.

Also, rather than just point me to something like "scrapy", I'm actually interested in finding someone who's done this that I can talk to. Heck, for the right person, I'll even toss some cash your way!!

Thanks
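
As a small illustration of the multi-thread side of the question only: fetching pages is mostly waiting on the network, so a thread pool from the standard library is one simple way to run many fetches at once. In the sketch below `fetch_all` is a hypothetical helper and `fetch` is whatever callable downloads one URL (for instance a wrapper around `urllib.request.urlopen`); neither comes from celery or twisted, and pages built by JavaScript are not handled.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for each URL on a thread pool; return {url: result}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))
```

An exception raised by `fetch` for one URL is re-raised when its result is collected, so a real crawler would probably catch errors inside `fetch` and return them as values.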
web2py - running on fedora
Hi. I know this is a python list, but hoping that I can find someone to help get a base install of web2py running. I've got an older version of fedora, running py 2.6.4 running apache v2.2 I'm simply trying to get web2py up/running, and then to interface it with apache, so that the existing webapps (php apps) don't get screwed up. Thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
trying to strip out non ascii.. or rather convert non ascii
hi.. getting some files via curl, and want to convert them from what i'm guessing to be unicode. I'd like to convert a string like this:: Alcántar, Iliana to:: Alcantar, Iliana where I convert the " á " to " a" which appears to be a shift of 128, but I'm not sure how to accomplish this.. I've tested using the different decode/encode functions using utf-8/ascii with no luck. I've reviewed stack overflow, as well as a few other sites, but haven't hit the aha moment. pointers/comments would be welcome. thanks -- https://mail.python.org/mailman/listinfo/python-list
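
The accented letter is not the plain letter shifted by 128: "á" is the code point U+00E1, which UTF-8 stores as the two bytes 0xC3 0xA1. One common way to get the unaccented form is to decompose each character into a base letter plus combining marks and drop the marks. A minimal sketch for Python 3, assuming the downloaded bytes are UTF-8 (the function name is hypothetical):

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFKD splits "á" into "a" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


raw = b"Alc\xc3\xa1ntar, Iliana"   # bytes as they might arrive from curl
name = raw.decode("utf-8")         # decode first, then strip
print(strip_accents(name))         # Alcantar, Iliana
```

Characters that have no decomposition (for example "ø") pass through unchanged, so a final `.encode("ascii", "ignore")` would still be needed if the output must be strictly ASCII.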
splitting file/content into lines based on regex termination
hi.
got a test file with the sample content listed below:
the content is one long string, and needs to be split into separate lines
I'm thinking the pattern to split on should be a kind of regex like::
#45 / 58#0#
or
#9 / 58#0
but i have no idea how to make this happen!!
if i read the content into a buf -> s
import re
dat = re.compile("what goes here??").split(s)
--i'm not sure what goes in the compile() to get the process to work..
thoughts/comments would be helpful.
thanks
test dat::
10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWF#08:00am#08:50am#3718 HBLL #45 /
58#0#10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWF#09:00am#09:50am#3718 HBLL #9 /
58#0#10178#000#C S#S#124##001##DAY#Computer Systems#Roper,
Paul#3#MWF#11:00am#11:50am#1170 TMCB #41 /
145#0#10178#000#C S#S#124##002##DAY#Computer Systems#Roper,
Paul#3#MWF#2:00pm#2:50pm#1170 TMCB #40 /
120#0#01489#002#C S#S#142##001##DAY#Intro to Computer
Programming#Burton, Robert Seppi, Kevin
Re: splitting file/content into lines based on regex termination
update...
dat=re.compile("#(\d+) / (\d+)#(\d+)#").split(s)
almost works..
except i get
m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWF#08:00am#08:50am#3718 HBLL
m = 45
m = 58
m = 0
m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWF#09:00am#09:50am#3718 HBLL
m = 9
m = 58
m = 0
and what i want is:
m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWF#08:00am#08:50am#3718 HBLL 45 / 58,0
m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWF#09:00am#09:50am#3718 HBLL 9 / 58,0
so i'd have the results of the "compile/regex process" to be added to
the split lines
thoughts/comments??
thanks
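
One way to get the output asked for above is to walk the matches with `finditer` instead of `split`, so that each record can be joined with the numbers captured from its own terminator. This is a sketch, not the method from the thread: the name `split_records` is hypothetical, and the `\s*` around the slash is an added assumption to tolerate a line break there.

```python
import re

# Terminator such as "#45 / 58#0#"; the three numbers are captured.
TERMINATOR = re.compile(r"#(\d+)\s*/\s*(\d+)#(\d+)#")


def split_records(s):
    """Return one string per record, with its terminator fields appended."""
    records = []
    position = 0
    for match in TERMINATOR.finditer(s):
        body = s[position:match.start()].strip()
        records.append("%s %s / %s,%s" % ((body,) + match.groups()))
        position = match.end()
    return records
```

Text after the last terminator (an incomplete final record) is ignored by this sketch.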
On Thu, Nov 7, 2013 at 12:15 PM, bruce wrote:
> hi.
>
> got a test file with the sample content listed below:
>
> the content is one long string, and needs to be split into separate lines
>
> I'm thinking the pattern to split on should be a kind of regex like::
> #45 / 58#0#
> or
> #9 / 58#0
> but i have no idea how to make this happen!!
>
> if i read the content into a buf -> s
>
> import re
> dat = re.compile("what goes here??").split(s)
>
> --i'm not sure what goes in the compile() to get the process to work..
>
> thoughts/comments would be helpful.
>
> thanks
>
>
> test dat::
> 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
> William#3#MWF#08:00am#08:50am#3718 HBLL #45 /
> 58#0#10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
> William#3#MWF#09:00am#09:50am#3718 HBLL #9 /
> 58#0#10178#000#C S#S#124##001##DAY#Computer Systems#Roper,
> Paul#3#MWF#11:00am#11:50am#1170 TMCB #41 /
> 145#0#10178#000#C S#S#124##002##DAY#Computer Systems#Roper,
> Paul#3#MWF#2:00pm#2:50pm#1170 TMCB #40 /
> 120#0#01489#002#C S#S#142##001##DAY#Intro to Computer
> Programming#Burton, Robert Seppi, Kevin
Re: splitting file/content into lines based on regex termination
hi.
thanks for the reply.
tried what you suggested. what I see now, is that I print out the
lines, but not the regex data at all. my initial try, gave me the
line, and then the next items , followed by the next line, etc...
what I then tried, was to do a capture/findall of the regex, and
combine the outputs in separate loops, which will be ugly but will
work
import re
import sys

ff = "byu2.dat"
#fff= "sdsu2.dat"
with open(ff, "r") as myfile:
    s = myfile.read()

# as archived, this strips every space, after which the " / " in the
# patterns below cannot match; the original argument may have been
# altered by the archive
s = s.replace(" ", "")

#with open(fff,"w") as myfile2:
#    myfile2.write(s)

# terminator looks like: #45 / 58#0#
#dat1=re.compile("#(\d+) / (\d+)#(\d+)#").search(s).findall()
dat1 = re.findall(r"#(\d+) / (\d+)#(\d+)#", s)        # captured numbers only
dat = re.compile(r"#(\d+) / (\d+)#(\d+)#").split(s)   # lines plus captures
dat2 = re.compile(r"#\d+ / \d+#\d+#").split(s)        # lines only
#dat=re.split('("#(\d+) / (\d+)#(\d+)#")',s)
#dat=re.compile("#(\d+)").split(s)

for m in dat:
    if m:
        print("m = " + m)
#sys.exit()

print("dat1")
print(dat1)
print(len(dat1))

print("dat2a")
#sys.exit()
#for m in dat1:
#    if m:
#        print "m = "+m
#
#sys.exit()

for m in dat2:
    if m:
        print("m = " + m)
#sys.exit()

sys.exit()
the test data is pasted to -->>> http://bpaste.net/show/kYzBUIfhc5023phOVmcu/
thanks
!!
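One way to keep the terminator with each record, sketched here in Python 3 and assuming every record ends with a "#NN / NN#N#" field as in the test data: put a single capture group around the whole terminator so that re.split() hands it back, then glue each piece of text to the terminator that followed it. DELIM and split_records are names made up for this illustration.

```python
import re

# One capture group around the whole terminator, e.g. "#45 / 58#0#".
# With a capture group, re.split() returns the matched text as well.
DELIM = re.compile(r"(#\d+ / \d+#\d+#)")

def split_records(s):
    """Split s after each terminator and keep the terminator on its record."""
    parts = DELIM.split(s)
    # parts alternates: text, terminator, text, terminator, ..., trailing text
    records = [text + term for text, term in zip(parts[0::2], parts[1::2])]
    if parts[-1]:
        # leftover text that was not followed by a terminator
        records.append(parts[-1])
    return records

sample = ("10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett, "
          "William#3#MWF#08:00am#08:50am#3718 HBLL #45 / 58#0#"
          "10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett, "
          "William#3#MWF#09:00am#09:50am#3718 HBLL #9 / 58#0#")
for rec in split_records(sample):
    print("m = " + rec)
```

This avoids the two separate loops: the split and the captured terminator come out of one call, so they cannot drift out of step.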
On Thu, Nov 7, 2013 at 1:13 PM, MRAB wrote:
> On 07/11/2013 17:45, bruce wrote:
>>
>> update...
>>
>>dat=re.compile("#(\d+) / (\d+)#(\d+)#").split(s)
>>
>> almost works..
>>
>> except i get
>> m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
>> William#3#MWF#08:00am#08:50am#3718 HBLL
>> m = 45
>> m = 58
>> m = 0
>> m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
>> William#3#MWF#09:00am#09:50am#3718 HBLL
>> m = 9
>> m = 58
>> m = 0
>>
>> and what i want is:
>> m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
>> William#3#MWF#08:00am#08:50am#3718 HBLL 45 / 58,0
>> m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
>> William#3#MWF#09:00am#09:50am#3718 HBLL 9 / 58,0
>>
>>
>> so i'd have the results of the "compile/regex process" to be added to
>> the split lines
>>
>> thoughts/comments??
>>
>> thanks
>>
> The split method also returns what's matched in any capture groups,
> i.e. "(\d+)". Try omitting the parentheses:
>
> dat = re.compile(r"#\d+ / \d+#\d+#").split(s)
>
> You should also be using raw string literals as above (r"..."). It
> doesn't matter in this instance, but it might in others.
>
>>
>>
>> On Thu, Nov 7, 2013 at 12:15 PM, bruce wrote:
>>>
>>> hi.
>>>
>>> got a test file with the sample content listed below:
>>>
>>> the content is one long string, and needs to be split into separate lines
>>>
>>> I'm thinking the pattern to split on should be a kind of regex like::
>>> #45 / 58#0#
>>> or
>>> #9 / 58#0
>>> but i have no idea how to make this happen!!
>>>
>>> if i read the content into a buf -> s
>>>
>>> import re
>>> dat = re.compile("what goes here??").split(s)
>>>
>>> --i'm not sure what goes in the compile() to get the process to work..
>>>
>>> thoughts/comments would be helpful.
>>>
>>> thanks
>>>
>>>
>>> test dat::
>>> 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
>>> William#3#MWF#08:00am#08:50am#3718 HBLL #45 /
>>> 58#0#10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
>>> William#3#MWF#09:00am#09:50am#3718 HBLL #9 /
>>> 58#0#10178#000#C S#S#124##001##DAY#Computer Systems#Roper,
>>> Paul#3#MWF#11:00am#11:50am#1170 TMCB #41 /
>>> 145#0#10178#000#C S#S#124##002##DAY#Computer Systems#Roper,
>>> Paul#3#MWF#2:00pm#2:50pm#1170 TMCB #40 /
>>> 120#0#01489#002#C S#S#142##001##DAY#Intro to Computer
>>> Programming#Burton, Robert Seppi, Kevin
>>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
try/exception - error block
Hi.
I have a long running process, it generates calls to a separate py
app. The py app appears to generate errors, as indicated in the
/var/log/messages file for the abrtd daemon.. The errors are
intermittent.
So, to quickly capture all possible exceptions/errors, I decided to
wrap the entire "main" block of the test py func in a try/exception
block.
This didn't work, as I'm not getting any output in the err file
generated in the exception block.
I'm posting the test code I'm using. Pointers/comments would be helpful/useful.
the if that gets run is the fac1 logic which operates on the input
packet/data..
elif (level=='collegeFaculty1'):
    #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
    ret=getParseCollegeFacultyList1(url,content)
Thanks.
if __name__ == "__main__":
    # main app
    try:
        #college="asu"
        #url="https://webapp4.asu.edu/catalog";
        #termurl="https://webapp4.asu.edu/catalog/TooltipTerms.ext";
        #termVal=2141
        #
        # get the input struct, parse it, determine the level
        #
        #cmd='cat /apps/parseapp2/asuclass1.dat'
        #print "cmd= "+cmd
        #proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        #content=proc.communicate()[0].strip()
        #print content
        #sys.exit()
        #s=getClasses(content)
        #print "arg1 =",sys.argv[0]
        if(len(sys.argv)<2):
            print "error\n"
            sys.exit()
        a=sys.argv[1]
        aaa=a
        #
        # data is coming from the parentApp.php
        #data has been rawurlencode(json_encode(t))
        #-reverse/split the data..
        #-do the fetch,
        #-save the fetched page/content if any
        #-create the returned struct
        #-echo/print/return the struct to the
        # calling parent/call
        #
        ##print urllib.unquote_plus(a).decode('utf8')
        #print "\n"
        #print simplejson.loads(urllib.unquote_plus(a))
        z=simplejson.loads(urllib.unquote_plus(a))
        ##z=simplejson.loads(urllib.unquote(a).decode('utf8'))
        #z=simplejson.loads(urllib2.unquote(a).decode('utf8'))
        #print "aa \n"
        print z
        #print "\n bb \n"
        #
        #-passed in
        #
        url=str(z['currentURL'])
        level=str(z['level'])
        cname=str(z['parseContentFileName'])
        #
        # need to check the contentFname
        # -should have been checked in the parentApp
        # -check it anyway, return err if required
        # -if valid, get/import the content into
        # the "content" var for the function/parsing
        #
        ##cmd='echo ${yolo_clientFetchOutputDir}/'
        cmd='echo ${yolo_clientParseInputDir}/'
        #print "cmd= "+cmd
        proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        cpath=proc.communicate()[0].strip()
        cname=cpath+cname
        #print "cn = "+cname+"\n"
        #sys.exit()
        cmd='test -e '+cname+' && echo 1'
        #print "cmd= "+cmd
        proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
        c1=proc.communicate()[0].strip()
        if(not c1):
            #got an error - process it, return
            print "error in parse"
        #
        # we're here, no err.. got content
        #
        #fff= "sdsu2.dat"
        with open(cname,"r") as myfile:
            content=myfile.read()
            myfile.close()
        #-passed in
        #college="louisville"
        #url="http://htmlaccess.louisville.edu/classSchedule/";
        #termVal="4138"
        #print "term = "+str(termVal)+"\n"
        #print "url = "+url+"\n"
        #jtest()
        #sys.exit()
        #getTerm(url,college,termVal)
        ret={} # null it out to start
        if (level=='rState'):
            #ret=getTerm(content,termVal)
            ret=getParseStates(content)
        elif (level=='stateCollegeList'):
            #getDepts(url,college, termValue,termName)
            ret=getParseStateCollegeList(url,content)
        elif (level=='collegeFaculty1'):
            #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
            ret=getParseCollegeFacultyList1(url,content)
        elif (level=='collegeFaculty2'):
            #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
            ret=getParseCollegeFacultyList2(content)
        #
        # the idea of this section.. we have the resulting
        # fetched content/page...
        #
        a={}
        status=False
        if(ret['status']==True):
            s=ascii_strip(ret['data'])
            if(((s.find("-1) or (s.find("-1)) and
                ((s.find("-1) or (s.find("-1)) and
                level=='classSectionDay'):
                status=True
            #print "herh"
            #sys.exit()
            #
            # build the returned struct
            #
            #
            a['Status']=True
            a['recCount']=ret['count']
            a['data']=ret['data']
            a['nextLevel']=''
            a['timestamp']=''
            a['macAddress']=''
        elif(ret['status']==False):
            a['Status']=False
            a['recCount']=0
            a['data']=''
            a['nextLevel']=''
            a['timestamp']=''
            a['macAddress']=''
        res=urllib.quote(simplejson.dumps(a))
        ##print res
        name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
        name=name.communicat
Re: try/exception - error block
chris.. my bad.. I wasnt intending to mail you personally.
Or I wouldn't have inserted the "thanks guys"!
> thanks guys...
>
> but in all that.. no one could tell me .. why i'm not getting any
> errs/exceptions in the err file which gets created on the exception!!!
>
> but thanks for the information on posting test code!
Don't email me privately - respond to the list :)
Also, please don't top-post.
ChrisA
On Sun, Aug 3, 2014 at 10:29 AM, bruce wrote:
> Hi.
>
> I have a long running process, it generates calls to a separate py
> app. The py app appears to generate errors, as indicated in the
> /var/log/messages file for the abrtd daemon.. The errors are
> intermittent.
>
> So, to quickly capture all possible exceptions/errors, I decided to
> wrap the entire "main" block of the test py func in a try/exception
> block.
>
> This didn't work, as I'm not getting any output in the err file
> generated in the exception block.
>
> I'm posting the test code I'm using. Pointers/comments would be
> helpful/useful.
>
>
> the if that gets run is the fac1 logic which operates on the input
> packet/data..
> elif (level=='collegeFaculty1'):
> #getClasses(url, college, termVal,termName,deptName,deptAbbrv)
> ret=getParseCollegeFacultyList1(url,content)
>
>
> Thanks.
>
> if __name__ == "__main__":
> # main app
>
> try:
> #college="asu"
> #url="https://webapp4.asu.edu/catalog";
> #termurl="https://webapp4.asu.edu/catalog/TooltipTerms.ext";
>
>
> #termVal=2141
> #
> # get the input struct, parse it, determine the level
> #
>
> #cmd='cat /apps/parseapp2/asuclass1.dat'
> #print "cmd= "+cmd
> #proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
> #content=proc.communicate()[0].strip()
> #print content
> #sys.exit()
>
> #s=getClasses(content)
>
> #print "arg1 =",sys.argv[0]
> if(len(sys.argv)<2):
> print "error\n"
> sys.exit()
>
> a=sys.argv[1]
> aaa=a
>
> #
> # data is coming from the parentApp.php
> #data has been rawurlencode(json_encode(t))
> #-reverse/split the data..
> #-do the fetch,
> #-save the fetched page/content if any
> #-create the returned struct
> #-echo/print/return the struct to the
> # calling parent/call
> #
>
> ##print urllib.unquote_plus(a).decode('utf8')
> #print "\n"
> #print simplejson.loads(urllib.unquote_plus(a))
> z=simplejson.loads(urllib.unquote_plus(a))
> ##z=simplejson.loads(urllib.unquote(a).decode('utf8'))
> #z=simplejson.loads(urllib2.unquote(a).decode('utf8'))
>
> #print "aa \n"
> print z
> #print "\n bb \n"
>
> #
> #-passed in
> #
> url=str(z['currentURL'])
> level=str(z['level'])
> cname=str(z['parseContentFileName'])
>
>
> #
> # need to check the contentFname
> # -should have been checked in the parentApp
> # -check it anyway, return err if required
> # -if valid, get/import the content into
> # the "content" var for the function/parsing
> #
>
> ##cmd='echo ${yolo_clientFetchOutputDir}/'
> cmd='echo ${yolo_clientParseInputDir}/'
> #print "cmd= "+cmd
> proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
> cpath=proc.communicate()[0].strip()
>
> cname=cpath+cname
> #print "cn = "+cname+"\n"
> #sys.exit()
>
>
> cmd='test -e '+cname+' && echo 1'
> #print "cmd= "+cmd
> proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
> c1=proc.communicate()[0].strip()
>
> if(not c1):
> #got an error - process it, return
> print "error in parse"
>
> #
> # we're here, no err.. got content
> #
>
> #fff= "sdsu2.dat"
> with open(cname,"r") as myfile:
> content=myfile.read()
> myfile.close()
>
>
> #-passed in
> #college="louisville"
> #url="http://htmlaccess.louisville.edu/classSchedule/";
> #termVal="4138"
>
>
> #print "term = "+str(termVal)+"\n"
> #print "url = "+url+"\n"
Re: try/exception - error block
Hi Alan.
Yep, the err file in the exception block gets created. and the weird
thing is it matches the time of the abrtd information in the
/var/log/messages log..
Just nothing in the file!
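A possible explanation for the empty file, offered as a guess from the except block quoted below rather than a confirmed diagnosis: ofile1.write(e) passes the exception object itself to write(), which only accepts strings, so a second error is raised inside the handler after the file has been created but before anything is written to it. Separately, sys.exit() raises SystemExit, which "except Exception" does not catch. A minimal Python 3 sketch that writes the full traceback instead; run_logged, failing_main and err_path are hypothetical names:

```python
import traceback

def run_logged(main, err_path):
    """Call main(); if it raises, save the formatted traceback to err_path."""
    try:
        return main()
    except Exception:
        # format_exc() returns a string, so write() accepts it
        with open(err_path, "w") as err_file:
            err_file.write(traceback.format_exc())
        return None

def failing_main():
    # stand-in for the real parsing code: raises KeyError
    return {}["status"]
```

Calling run_logged(failing_main, "/some/dir/err.dat") would leave the KeyError traceback, including the line number, in the file. In the Python 2 code from the post the equivalent change would be ofile1.write(traceback.format_exc()) in place of ofile1.write(e).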
On Sun, Aug 3, 2014 at 4:01 PM, Alan Gauld wrote:
> On 03/08/14 18:52, bruce wrote:
>
>>> but in all that.. no one could tell me .. why i'm not getting any
>>> errs/exceptions in the err file which gets created on the exception!!!
>
>
> Does the file actually get created?
> Do you see the print statement output - are they what you expect?
>
> Did you try the things Steven suggested.
>
>
>>>except Exception, e:
>>> print e
>>> print "pycolFac1 - error!! \n";
>>> name=subprocess.Popen('uuidgen -t',
>>> shell=True,stdout=subprocess.PIPE)
>>> name=name.communicate()[0].strip()
>>> name=name.replace("-","_")
>
>
> This is usually a bad idea. You are using name for the process and its
> output. Use more names...
> What about:
>
> uuid=subprocess.Popen('uuidgen -t',shell=True,stdout=subprocess.PIPE)
> output=uuid.communicate()[0].strip()
> name=output.replace("-","_")
>
>>> name2="/home/ihubuser/parseErrTest/pp_"+name+".dat"
>
>
> This would be a good place to insert a print
>
> print name2
>
>>> ofile1=open(name2,"w+")
>
>
> Why are you using w+ mode? You are only writing.
> Keep life as simple as possible.
>
>>> ofile1.write(e)
>
>
> e is quite likely to be empty
>
>>> ofile1.write(aaa)
>
>
> Are you sure aaa exists at this point? Remember you are catching all errors
> so if an error happens prior to aaa being created this will
> fail.
>
>>> ofile1.close()
>
>
> You used the with form earlier, why not here too.
> It's considered better style...
>
> Some final comments.
> 1) You call sys.exit() several times inside
> the try block. sys.exit will not be caught by your except block,
> is that what you expect?.
>
> 2) The combination of confusing naming of variables,
> reuse of names and poor code layout and excessive commented
> code makes it very difficult to read your code.
> That makes it hard to figure out what might be going on.
> - Use sensible variable names not a,aaa,z, etc
> - use 3 or 4 level indentation not 2
> - use a version control system (RCS,CVS, SVN,...) instead
> of commenting out big blocks
> - use consistent code style
> eg with f as ... or open(f)/close(f) but not both
> - use the os module (and friends) instead of subprocess if possible
>
> 3) Have you tried deleting all the files in the
> /home/ihubuser/parseErrTest/ folder and starting again,
> just to be sure that your current code is actually
> producing the empty files?
>
> 4) You use tmpParseDir in a couple of places but I don't
> see it being set anywhere?
>
>
> That's about the best I can offer based on the
> information available.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.flickr.com/photos/alangauldphotos
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
libxml2dom question
Hi.
Test question. Trying to see how to insert a test node into an
existing dom tree. For the test, it has a TR/TD with a
td[@class="foo"] that has an associated TR..
Trying to figure out how to insert a "" around the
tr/td in question...
Curious as to how to accomplish this.
Thoughts/pointers.
thanks
--
sample code/html chunk follows:
import libxml2dom
text is below
tt = libxml2dom.parseString(text, html=1)
t1path_=tt.xpath(t1path)
aa=tt.createElement("")
print len(t1path_)
for a in t1path_:
    tt.insertBefore(aa,None)
    print a.nodeName
    print a.toString()
sys.exit()
s3==
trying to get::
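A sketch of the usual DOM recipe for wrapping an existing node in a new parent: create the wrapper, put it where the node was, then move the node inside it. It uses xml.dom.minidom from the standard library (Python 3) rather than libxml2dom, on the assumption that libxml2dom follows the same DOM method names; that has not been checked here. The tag name "tbody" and the helper wrap_node are placeholders, since the element name in the post did not survive.

```python
from xml.dom import minidom

def wrap_node(doc, node, tag_name):
    """Wrap node in a new element named tag_name and return the wrapper."""
    wrapper = doc.createElement(tag_name)
    parent = node.parentNode
    parent.replaceChild(wrapper, node)  # wrapper takes the node's place
    wrapper.appendChild(node)           # node moves inside the wrapper
    return wrapper

doc = minidom.parseString('<table><tr><td class="foo">x</td></tr></table>')
# find the td with class="foo", then wrap its parent tr
td = [n for n in doc.getElementsByTagName("td")
      if n.getAttribute("class") == "foo"][0]
wrap_node(doc, td.parentNode, "tbody")
print(doc.documentElement.toxml())
```

Note that insertBefore(newNode, None) on the document, as in the post, appends at the end of the document instead of next to the target node; the call has to be made on the target's parent.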
--
http://mail.python.org/mailman/listinfo/python-list
generating unique set of dicts from a list of dicts
trying to figure out how to generate a unique set of dicts from a
json/list of dicts.
initial list :::
[{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""},
{"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""}]
As an example, the following is the test list:
[{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""},
{"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""}]
Trying to get the following, list of unique dicts, so there aren't
duplicate dicts.
Searched various sites/SO.. and still have a mental block.
[
{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""}]
I was considering iterating through the initial list, copying each
dict into a new list, and doing a basic comparison, adding the next
dict if it's not in the new list.. is there another/better way?
posted this to StackOverflow as well.
http://stackoverflow.com/questions/8808286/simplifying-a-json-list-to-the-unique-dict-items
<<<
There was a potential soln that I couldn't understand.
-
The simplest approach -- using list(set(your_list_of_dicts)) won't
work because Python dictionaries are mutable and not hashable (that
is, they don't implement __hash__). This is because Python can't
guarantee that the hash of a dictionary won't change after you insert
it into a set or dict.
However, in your case, since you (don't seem to be) modifying the data
at all, you can compute your own hash, and use this along with a
dictionary to relatively easily find the unique JSON objects without
having to do a full recursive comparison of each dictionary to the
others.
First, we need a function to compute a hash of the dictionary. Rather
than trying to build our own hash function, let's use one of the
built-in ones from hashlib:
import hashlib

def dict_hash(d):
    out = hashlib.md5()
    for key, value in d.iteritems():
        out.update(unicode(key))
        out.update(unicode(value))
    return out.hexdigest()
(Note that this relies on unicode(...) for each of your values
returning something unique -- if you have custom classes in the
dictionaries whose __unicode__ returns something like "MyClass
instance", this will fail or will require modification. Also, in your
example, your dictionaries are flat, but I'll leave it as an exercise
to the reader how to expand this solution to work with dictionaries
that contain other dicts or lists.)
Since dict_hash returns a string, which is immutable, you can now use
a dictionary to find the unique elements:
uniques_map = {}
for d in list_of_dicts:
    uniques[dict_hash(d)] = d
unique_dicts = uniques_map.values()
*** not sure what the "uniques" is, or what/how it should be defined
thoughts/comments are welcome
thanks
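On the question above: "uniques" in the quoted loop looks like a typo for "uniques_map", the dictionary created on the line before it. That reading is an assumption, but with it the loop stores each dict under its hash, so duplicates overwrite each other and uniques_map.values() is the de-duplicated list. A Python 3 sketch of the same idea that also copes with nested dicts, using json.dumps(..., sort_keys=True) as the key instead of a hand-built hash; unique_dicts is a made-up name:

```python
import json

def unique_dicts(dicts):
    """Return the distinct dicts from dicts, keeping first-seen order."""
    seen = {}
    for d in dicts:
        # sort_keys makes the text identical for equal dicts,
        # whatever order their keys were inserted in
        key = json.dumps(d, sort_keys=True)
        seen.setdefault(key, d)
    return list(seen.values())

data = [
    {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"},
     "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
    {"pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"},
     "pStart1a": {"instVal": "OSUSI", "termVal": "1122"}},
    {"pStart1": ""},
]
print(unique_dicts(data))
```

This only works when everything in the dicts is JSON-serialisable, which holds for data that was loaded from JSON in the first place.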
--
http://mail.python.org/mailman/listinfo/python-list
python-base rpm
hi... in trying to find the FC rpms for python-base, i can only seem to find rpms for mandrake can these rpms be used for FC4? would someone know of source rpms for python-base? thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
uTidylib question..
hi... you can do : import tidy s = tidy.parseString(foo) which runs the information in "foo" through tidy, for cleaning. this results in "s" being a "document object" print "s" will display the contents of the object. my question, can the "document object" be turned/translated into a string var? thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
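Since printing "s" shows the contents, the object presumably defines __str__, in which case str(s) would give the same text as an ordinary string. This has not been checked against uTidylib itself; the class below is a hypothetical stand-in that only illustrates the mechanism (Python 3 syntax).

```python
class FakeDocument(object):
    """Hypothetical stand-in for the object returned by tidy.parseString()."""

    def __init__(self, markup):
        self._markup = markup

    def __str__(self):
        # print uses this method, and so does str()
        return self._markup

s = FakeDocument("<html><body><p>hi</p></body></html>")
text = str(s)          # now a plain string variable
print(text.upper())    # string methods are available on it
```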
bay area based - python guru..
hi... is there a bay area based guru, or someone who's into mentoring that i/we can talk to... specifically someone who's experienced using mechanize/browser/urllib/urllib2/cookies/etc... we're running into small situations that are taking an inordinate amount of time to resolve, so we thought we'd toss this here and see if anyone has thoughts/comments. thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: Automate Mozilla Firefox
hi david... just what are you trying to do... are you trying to drive some functionality of firefox? are you trying to parse web files?? more information might allow someone to shed additional thoughts on a solution.. -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of david brochu jr Sent: Wednesday, July 26, 2006 11:05 AM To: [email protected] Subject: Automate Mozilla Firefox Hey everyone, I am trying to automate navigating to urls (all from a txt file) 1 at a time in Firefox and then killing firefox before navigating to the next. I think I might have to use PyXPCOM to do this but I have never used this package and cannot find any good resources on the net to help me along. Anyone have an idea as to how I would do this or have any good resources they could point me to? Thanks -- http://mail.python.org/mailman/listinfo/python-list
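If the goal is only to open each URL in turn and close the browser in between, PyXPCOM may not be needed. A rough Python 3 sketch with the subprocess module, assuming a "firefox" executable on the PATH that accepts a URL as its argument; visit_each, browser_cmd and dwell_seconds are names invented for the example. One caveat: if a Firefox instance is already running, the new process may hand the URL to it and exit, in which case terminating the new process closes nothing.

```python
import subprocess
import time

def visit_each(urls, browser_cmd=("firefox",), dwell_seconds=10.0):
    """Open each URL in a fresh browser process, then terminate it."""
    for url in urls:
        proc = subprocess.Popen(list(browser_cmd) + [url])
        try:
            time.sleep(dwell_seconds)   # give the page time to load
        finally:
            proc.terminate()            # kill before moving to the next URL
            proc.wait()

def read_urls(path):
    """Read one URL per line from a text file, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Usage would be along the lines of visit_each(read_urls("urls.txt")), where urls.txt is whatever file holds the list.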
RE: Client/Server Question
i'm not familiar with python's os/system/exec functions, but i'm willing to bet that you're running into an issue with how the "system" function works. look for a function that allows you to spawn an external app, but that doesn't wait for (or require) the child app to return.

i would also suggest that you work with a client/server setup on the same machine as a test. furthermore, i would suggest that you have your server do something like print put to a file/cmdline so you can actually verify that it's working as it should. once that's working well, then hook in the system/exec/spawn functionality...

good luck

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of faulkner
Sent: Friday, July 28, 2006 3:06 PM
To: [email protected]
Subject: Re: Client/Server Question

you might want to look at sshd. if you're on a windows box, you may need cygwin. if you're on linux, you either already have it, or it's in your package manager.

[EMAIL PROTECTED] wrote:
> My server.py looks like this
>
> -CODE-
> #!/usr/bin/env python
> import socket
> import sys
> import os
>
> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> host = ''
> port = 2000
>
> s.bind((host,port))
> s.listen(1)
> conn, addr = s.accept()
> print 'client is at', addr
>
> while True:
>     data = conn.recv(100)
>     if (data == 'MaxSim'):
>         print 'MaxiSim'
>         os.system('notepad')
>     elif (data == 'Driving Sim'):
>         print 'Driving Sim'
>         os.system('explorer')
>     elif (data == 'SHUTDOWN'):
>         print 'Shutting down...'
>         os.system('shutdown -s')
>         conn.close()
>         break
> ---CODE END-
>
> I am running this above program on a windows machine. My client is a
> Linux box. What I want to achieve is that server.py should follows
> instructions till I send a 'SHUTDOWN' command upon which it should shut
> down.
>
> When I run this program and suppose send 'MaxSim' to it, it launches
> notepad.exe fine, but then after that it doesn't accept subsequent
> command. I want is that it should accept subsequent commands like
> Driving Sim and launch windows explorer etc until I send a 'SHUTDOWN'
> command.
>
> Any help on this, it will be greatly appreciated.
--
http://mail.python.org/mailman/listinfo/python-list
--
http://mail.python.org/mailman/listinfo/python-list
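To make the "spawn but don't wait" suggestion concrete: os.system() returns only when the launched program exits, so the loop in server.py cannot get back to conn.recv() until Notepad is closed. subprocess.Popen starts the program and returns straight away. A small Python 3 sketch; COMMANDS, launch and handle are illustrative names, and the command table simply mirrors the quoted server.py.

```python
import subprocess

# Commands from the quoted server.py, keyed by the message that triggers them.
COMMANDS = {
    "MaxSim": ["notepad"],
    "Driving Sim": ["explorer"],
}

def launch(argv):
    """Start argv and return at once; the caller does not wait for it."""
    return subprocess.Popen(argv)

def handle(data, commands=COMMANDS):
    """Launch the program mapped to data; return the Popen, or None if unknown."""
    argv = commands.get(data)
    return launch(argv) if argv else None
```

Inside the receive loop this would be called as handle(data) in place of the os.system() calls. Under Python 3, conn.recv() returns bytes, so the data would need a .decode() before the lookup.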
running an app as user "foo"
hi. within python, what's the best way to automatically spawn an app as a given user/group. i'm testing an app, and i'm going to need to assign the app to a given user/group, as well as assign it certain access rights/modes (rwx) i then want to copy the test app to a given dir, and then spawn a process to run the app.. thanks -- http://mail.python.org/mailman/listinfo/python-list
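A rough Python 3 outline of the three steps (copy, set the rwx mode, start as another user), using only standard-library calls. It assumes a POSIX system; the user= and group= arguments to subprocess.Popen exist from Python 3.9, and switching to another account normally requires the caller to be root. install and spawn_as are names invented for this sketch.

```python
import os
import shutil
import subprocess

def install(src, dest_dir, mode=0o750):
    """Copy src into dest_dir and set its permission bits; return the new path."""
    dest = shutil.copy(src, dest_dir)
    os.chmod(dest, mode)   # e.g. 0o750 = rwxr-x---
    return dest

def spawn_as(argv, user=None, group=None):
    """Start argv, optionally as the given user/group (names or numeric ids)."""
    extra = {}
    if user is not None:
        extra["user"] = user      # Python 3.9+, POSIX only
    if group is not None:
        extra["group"] = group    # Python 3.9+, POSIX only
    return subprocess.Popen(argv, **extra)
```

Something like spawn_as([install("testapp", "/some/dir")], user="foo", group="foo") would then tie the steps together; shutil.chown() can be added if the copied file should also be owned by that user.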
upgrading python...
hi. i'm in a situation where i might need to upgrade python. i have the current version of python for FC3. i might need to have the version for FC4. i built the version that's on FC4 from the python source RPM. however, when i simply try to install the resulting RPM, the app gives me dependency issues from apps that are dependent on the previous/current version of python. i'm trying to figure out if there's a 'best' way to proceed. do i simply do the install, and force it to overwrite the current version of python? is there a way to point 'yum' at my new python RPM, and let yum take care of dealing with any dependency issues? and how would yum handle weird dependency issues with RPMs that don't exist.. does yum have the ability to actually build required apps from source? comments/thoughts/etc... thanks -bruce ps. the reason for this is that i'm looking at some of the newer functionality in the 2.4 version of python over the 2.3 -- http://mail.python.org/mailman/listinfo/python-list
searching a list...
hi...

i'm playing with a test sample. i have something like:

dog = mysql_get(.)
.
.
.

such that 'dog' will be an 'AxB' array of data from the tbls

further in the test app, i'm going to have a list, foo:

foo = 'a','b','c','d'

i'm trying to determine what's the fastest way of searching through the 'dog' array/list of information for the 'foo' list.

should i essentially make dog into A lists, where each list is B elements, or should it somehow combine all the elements/items in 'dog' into one large list, and then search through that for the 'foo' list...

also, in searching through google, i haven't come across the list.search function.. so just how do i search a list to see if it contains a sublist...

my real problem involves figuring out how to reduce the number of hits to the db/tbl...

thanks

ps. if this is confusing, i could provide pseudo-code to make it easier to see...
--
http://mail.python.org/mailman/listinfo/python-list
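If the rows do end up in memory, one common approach is to turn them into a set of tuples once, after which each "is this row present" check is a hash lookup rather than a scan. A Python 3 sketch under the assumption that 'dog' is a sequence of rows and each row is a sequence of hashable values; find_matches and the sample data are invented for illustration.

```python
def find_matches(db_rows, wanted_rows):
    """Return the rows from wanted_rows that also appear in db_rows."""
    existing = {tuple(row) for row in db_rows}   # built once
    return [row for row in wanted_rows if tuple(row) in existing]

dog = [("a", "b", "c", "d"), ("e", "f", "g", "h")]   # stand-in for the db result
foo = ("a", "b", "c", "d")
print(find_matches(dog, [foo, ("x", "y", "z", "w")]))
```

For a single lookup, plain "foo in dog" also works on a list of tuples; the set pays off when many rows are checked against the same result.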
RE: searching a list...
hi larry... thanks for the reply... the issue i'm having is that i'm going to have to compare multiple rows of information to the information in the db. so essentially i'd have to do a hit to the db, for each row of information i want to compare if i did it your way... (which was what i had thought about) the issue of doing the string/list compare/search is that i can get everything from the db with one call... i can then iterate through memory for each of my row information that i'm searching to see if it exists in the db... memory searches should be faster than the network overhead, and the associated multiple db calls... -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Larry Bates Sent: Thursday, August 10, 2006 4:28 PM To: [email protected] Subject: Re: seaching a list... bruce wrote: > hi... > > i'm playing with a test sample. i have somethhing like: > dog = mysql_get(.) > . > . > . > > such that 'dog' will be an 'AxB' array of data from the tbls > > furher in the test app, i'm going to have a list, foo: > foo = 'a','b','c','d' > > i'm trying to determine what's the fastest way of searching through the > 'dog' array/list of information for the 'foo' list. > > should i essentially make dog into A lists, where each list is B elements, > or should it somehow combine all the elements/items in 'dog' into one large > list, and then search through that for the 'foo' list... > > also, in searching through google, i haven't come across the list.search > function.. so just how do i search a list to see if it contains a sublist... > > my real problem involves figuring out how to reduce the number of hits to > the db/tbl... > > thanks > > ps. if this is confusing, i could provide psuedo-code to make it easier to > see... > > > > You should use the database for what it is good at storing and searching through data. Don't read all the data from a table and search through it. 
Rather, create indexes on the table so that you can locate the data quickly in the database by passing in something you are looking for and let the database do the searching. I can promise you this will almost always be faster and more flexible. Something like: Assume the columns are called rownumber, c1, c2, c3, c4 and the table is indexed on c1, c2, c3, and c4. This will happen almost instantly no matter how many rows you are searching for. select rownumber from database_table where c1="a" and c2="b" and c3="c" and c4="d" It takes one "hit" to the db/table to return the rowindex that matches. -Larry Bates -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
python/mysql/list question...
hi.
i have the following sample code. i'm trying to figure out if there's a way
to use a 'list of lists' in a mysql execute...
i've tried a variety of combinations but i get an error:
Error 1241: Operand should contain 1 column(s)
the test code is:
insertSQL = """insert into appTBL
    (appName, universityID)
    values(%s,%s)"""
a = []
b = []
a.append('qqa')
a.append(222)
b.append(a)
a=[]
a.append('bbb')
a.append(66)
b.append(a)
try:
    c.execute(insertSQL, (b[0],b[1]))  # <<<
except MySQLdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    print b
    sys.exit (1)
i've tried to use b, (b), etc...
using b[0] works for the 1st row...
any thoughts/comments...
thanks
--
http://mail.python.org/mailman/listinfo/python-list
RE: python/mysql/list question...
never mind... doh!!! executemany as opposed to execute!!! thanks -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of bruce Sent: Thursday, August 10, 2006 6:37 PM To: [email protected] Subject: python/mysql/list question... hi. i have the following sample code. i'm trying to figure out if there's a way to use a 'list of lists' in a mysql execute... i've tried a variety of combinations but i get an error: Error 1241: Operand should contain 1 column(s) the test code is: insertSQL = """insert into appTBL (appName, universityID) values(%s,%s)""" a = [] b = [] a.append('qqa') a.append(222) b.append(a) a=[] a.append('bbb') a.append(66) b.append(a) try: c.execute(insertSQL, (b[0],b[1])) <<<<<<<<<<<<<<<<<<< except MySQLdb.Error, e: print "Error %d: %s" % (e.args[0], e.args[1]) print b sys.exit (1) i've tried to use b, (b), etc... using b[0] works for the 1st row... any thoughts/comments... thanks -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
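For anyone landing on this thread later, a short illustration of the executemany() fix. execute() binds one row of parameters, so passing (b[0], b[1]) hands it two lists where two scalar values are expected; executemany() takes the list of rows and runs the statement once per row. The sketch below (Python 3) uses sqlite3 from the standard library as a stand-in so that it runs without a MySQL server; with MySQLdb the placeholders would be %s instead of ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table appTBL (appName text, universityID integer)")

# the same 'list of lists' idea as in the post: one inner sequence per row
rows = [("qqa", 222), ("bbb", 66)]
conn.executemany(
    "insert into appTBL (appName, universityID) values (?, ?)", rows)
conn.commit()

rows_back = conn.execute(
    "select appName, universityID from appTBL order by rowid").fetchall()
print(rows_back)
```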
Is there an alternative to os.walk?
Hi all, I have a question about traversing file systems, and could use some help. Because of directories with many files in them, os.walk appears to be rather slow. I'm thinking there is a potential for speed-up since I don't need os.walk to report filenames of all the files in every directory it visits. Is there some clever way to use os.walk or another tool that would provide functionality like os.walk except for the listing of the filenames? -- http://mail.python.org/mailman/listinfo/python-list
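For later readers: on Python 3.5 and newer, os.scandir() allows a directories-only walk that never builds the per-directory file lists. The operating system still has to read every directory entry, so how much this saves depends on the platform; treat it as a sketch to measure, not a guaranteed speed-up. walk_dirs is a made-up name.

```python
import os

def walk_dirs(top):
    """Yield top and every directory below it, without building file lists."""
    stack = [top]
    while stack:
        current = stack.pop()
        yield current
        with os.scandir(current) as entries:
            for entry in entries:
                # is_dir() usually needs no extra system call per entry
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
```

Each yielded path can then be tested directly, for example with os.path.exists() on the specific files of interest.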
Re: Is there an alternative to os.walk?
waylan wrote:
> Bruce wrote:
> > Hi all,
> > I have a question about traversing file systems, and could use some
> > help. Because of directories with many files in them, os.walk appears
> > to be rather slow. I'm thinking there is a potential for speed-up since
> > I don't need os.walk to report filenames of all the files in every
> > directory it visits. Is there some clever way to use os.walk or another
> > tool that would provide functionality like os.walk except for the
> > listing of the filenames?
>
> You might want to check out the path module [1] (not os.path). The
> following is from the docs:
>
> > The method path.walk() returns an iterator which steps recursively
> > through a whole directory tree. path.walkdirs() and path.walkfiles()
> > are the same, but they yield only the directories and only the files,
> > respectively.
>
> Oh, and you can thank Paul Bissex for pointing me to path [2].
>
> [1]: http://www.jorendorff.com/articles/python/path/
> [2]: http://e-scribe.com/news/289

A little late but.. thanks for the replies, was very useful. Here's what I do in this case:

import os

def search(a_dir):
    valid_dirs = []
    walker = os.walk(a_dir)
    while 1:
        try:
            dirpath, dirnames, filenames = walker.next()
        except StopIteration:
            break
        # dirtest() takes only the directory path
        if dirtest(dirpath):
            valid_dirs.append(dirpath)
    return valid_dirs

def dirtest(a_dir):
    testfiles = ['a','b','c']
    for f in testfiles:
        if not os.path.exists(os.path.join(a_dir,f)):
            return 0
    return 1

I think you're right - it's not os.walk that makes this slow, it's the dirtest method that takes so much more time when there are many files in a directory. Also, thanks for pointing me to the path module, was interesting.
--
http://mail.python.org/mailman/listinfo/python-list
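Since os.walk() already hands back the file names of each directory, the test can be done against that list without calling os.path.exists() again for every file. A Python 3 sketch of that variation; REQUIRED holds the test file names from the post, and the rest is illustrative. One difference in behaviour: os.path.exists() is also true for a subdirectory named 'a', whereas the filenames list contains only non-directories.

```python
import os

REQUIRED = {"a", "b", "c"}   # test file names from the post

def search(a_dir, required=REQUIRED):
    """Return every directory under a_dir that contains all required files."""
    return [dirpath
            for dirpath, dirnames, filenames in os.walk(a_dir)
            if required.issubset(filenames)]
```

Whether this is faster in practice would need timing on the real directory tree.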
RE: Python v PHP: fair comparison?
interesting ongoing thread... i've seen a number of these over the years.. my language is better than your language!! i'm sure this question on the php list would have findings/results that are essentially opposite of what is being discussed here! -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Bruno Desthuilliers Sent: Wednesday, November 15, 2006 4:35 PM To: [email protected] Subject: Re: Python v PHP: fair comparison? walterbyrd a écrit : > Michael Torrie wrote: > > >>Absolutely false. Most of my standalone, command-line scripts for >>manipulating my unix users in LDAP are written in PHP, although we're >>rewriting them in python. >> > > > I would say that you are one of very few who use PHP for sys-admin > tasks - and even you have switched to Python. In general, it does not > seem to me that PHP has caught on as a sys-admin language. > > However, as sys-admin scripting langanges go, I would also say that > Python is far less popular than butt-ugly Perl. Again - just based on > what I've seen. Perl is a scripting language. By 'design'. It's meant to be a better sh+sed+awk. Python is a general purpose programming language meant to fill the gap between shell scripts and C programs. So Perl is obviously a better scripting language than Python. The problem is that Q&D sys-admin scripts tend to become full-blown apps - and then, Perl starts to suck. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: Python v PHP: fair comparison?
ummm bruno... you don't 'need' apache to run php. in fact, although i'm from the old hard c/c++ world way before web apps, i haven't really found much for most general apps (not ui/not threaded stuff) that php can't do.. peace -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Bruno Desthuilliers Sent: Wednesday, November 15, 2006 3:46 PM To: [email protected] Subject: Re: Python v PHP: fair comparison? walterbyrd a écrit : > Bjoern Schliessmann wrote: > >>walterbyrd wrote: >> >> >>>- PHP has a lower barrier to entry >> >>Which kind of barrier do you mean -- syntax, availability, ...? > > > Putting php into a web-site is as easy as throwing some php code into a > my html file, and maybe giving the file a php extension. I can get php > hosting for $10 a year easy. > > This may not be what you want for a major developement project, Not even for a minor one. While one can write not-trivial applications in PHP, the kind of work this requires would greatly benefit from a real programming language (vs a Q&D web scripting language). > but the > barrier to entry is very low. > Ok. This observation is very specific to web development. Python is a general purpose programming language. And from this POV, it's IMHO much easier to learn. And not only because you don't need Apache to use it. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
beautifulsoup .vs tidy
hi... never used perl, but i have an issue trying to resolve some html that appears to be "dirty/malformed" regarding the overall structure. in researching validators, i came across the beautifulsoup app and wanted to know if anybody could give me pros/cons of the app as it relates to any of the other validation apps... the issue i'm facing involves parsing some websites, so i'm trying to extract information based on the DOM/XPath functions.. i'm using perl to handle the extraction thanks -bruce [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
RE: beautifulsoup .vs tidy
hi paddy... that's exactly what i'm trying to accomplish... i've used tidy, but it seems to still generate warnings... initFile -> tidy ->cleanFile -> perl app (using xpath/livxml) the xpath/linxml functions in the perl app complain regarding the file. my thought is that tidy isn't cleaning enough, or that the perl xpath/libxml functions are too strict! which is why i decided to see if anyone on the python side has experienced/solved this problem.. -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Paddy Sent: Saturday, July 01, 2006 1:09 AM To: [email protected] Subject: Re: beautifulsoup .vs tidy bruce wrote: > hi... > > never used perl, but i have an issue trying to resolve some html that > appears to be "dirty/malformed" regarding the overall structure. in > researching validators, i came across the beautifulsoup app and wanted to > know if anybody could give me pros/cons of the app as it relates to any of > the other validation apps... > I'm not too sure of what you are after. You mention tidy in the subject which made me think that maybe you were trying to generate well-formed HTML from malformed webppages that nonetheless browsers can interpret. If that is the case then try HTML tidy: http://www.w3.org/People/Raggett/tidy/ - Pad. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
building/installing python libs/modules...
paul, thanks for the replies to my issues!!! much appreciation. i've got the python app: http://www.boddie.org.uk/python/downloads/libxml2dom-0.3.3.tar.gz and i've downloaded it, an untarred it... i have the dir structure, but i don't know what needs to be done now!!! i have a setup.py. does it get run? the Readme file didn't tell me how to build the app i now, but everyone starts at something, sometime!! thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: building/installing python libs/modules...
my bad.. saw that.. completely missed it... i had guessed.. and am up/runing/testing.. -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of John Machin Sent: Saturday, July 01, 2006 12:53 PM To: [email protected] Subject: Re: building/installing python libs/modules... On 2/07/2006 5:37 AM, bruce wrote: > paul, > > thanks for the replies to my issues!!! much appreciation. > > i've got the python app: > http://www.boddie.org.uk/python/downloads/libxml2dom-0.3.3.tar.gz > > and i've downloaded it, an untarred it... > i have the dir structure, but i don't know what needs to be done now!!! i > have a setup.py. does it get run? > > the Readme file didn't tell me how to build the app > http://www.boddie.org.uk/python/libxml2dom.html (see section headed "Installation" at the bottom of the page) -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
really basic question regarding arrays/function output...
hi...
i have the following test python script i'm trying to figure out a
couple of things...
1st.. how can i write the output of the "label" to an array, and then how i
can select a given element of the array.. i know real basic..
2nd.. where can i go to find methods of libxml2dom. i've been looking using
google, but can't seem to find a site pointing out the underlying methods,
which is kind of strange...
-bruce
-
test.py
#! /usr/bin/env python
#test python script
import libxml2dom
import urllib
print "hello"
turl = "http://courses.tamu.edu/ViewSections.aspx?campus=CS&x=hvm4MX4PXIY9J8C2Dcxz50ncXTJdT7v2&type=M"
f = urllib.urlopen(turl)
s = f.read()
f.close()
# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)
# get the community-related links
for label in d.xpath("/html/body/table/tr/td[2]/table/tr[6]/td/table/child::tr/[EMAIL PROTECTED] 'sectionheading']/text()"):
    print label.nodeValue
--
http://mail.python.org/mailman/listinfo/python-list
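On the first question (saving the loop output in an "array" and picking out one element), a minimal sketch in Python 3 syntax. Python's built-in list plays the role of the array. The sample strings stand in for the nodeValue of each xpath result and are made up; the function name is hypothetical.

```python
def collect(values):
    """Gather the items produced by a loop into a list."""
    labels = []
    for value in values:
        labels.append(value)  # grow the list one element at a time
    return labels


# stand-in data; in the script above each item would be label.nodeValue
labels = collect(["ACCT 209", "ACCT 229", "ACCT 230"])
first = labels[0]    # indexing starts at 0
last = labels[-1]    # negative indexes count from the end
```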
python guru in the Bay Area
hi... is there someone in the Bay Area who knows python, that I can talk to ... I have the shell of a real basic app, and I'd like someone who can walk me through how to set it up. i'm talking about passing arrays, setting up global data, etc... basically, if i can create a shell for what i'm trying to create, i should be ok.. lunch of course, would be on me!! thanks -bruce ps. yeah.. i know i could eventually figure most of this by searching/experimenting/etc... using google.. but i'd rather move as fast as possible. -- http://mail.python.org/mailman/listinfo/python-list
size of an array..
hi..
another basic question...
in the following:
#test python script
import libxml2dom
import urllib
print "hello"
turl = "http://courses.tamu.edu/ViewSections.aspx?campus=CS&x=hvm4MX4PXIY9J8C2Dcxz50ncXTJdT7v2&type=M"
f = urllib.urlopen(turl)
s = f.read()
f.close()
# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)
# get the community-related links
for label in d.xpath("/html/body/table/tr/td[2]/table/tr[6]/td/table/child::tr/[EMAIL PROTECTED] 'sectionheading']/text()"):
    print label.nodeValue
print "/n/n/n/"
l = d.xpath("/html/body/table/tr/td[2]/table/tr[6]/td/table/child::tr/[EMAIL PROTECTED] 'sectionheading']/text()")
xx = =sizeof(l)
print "test\n"
print "l = ".xx." \n"
print l[1].nodeValue
=
how do i determine the size of "l"
from the "for label in d.xpath", i assume that "d.xpath" is returning
multiple strings/objects/etc... and that label is assigned to the next
"element during the loop.
is there a way to determine the number of elements in "l", or the number of
iterations for the "for loop" prior to running it...
the tutorials i've seen as of yet haven't mentioned this..
thanks
-bruce
--
http://mail.python.org/mailman/listinfo/python-list
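For the size question, the built-in len() works on any list, which is presumably what d.xpath() returns here (the post indexes it with l[1]). A small sketch in Python 3 syntax with made-up stand-in data; note that strings are joined with + rather than the . operator used in Perl and PHP, and that a number has to be converted with str() first.

```python
results = ["ACCT 209", "ACCT 229", "ACCT 230"]  # stand-in for the xpath result list

count = len(results)                 # number of elements, known before looping
message = "l = " + str(count)        # numbers must be converted before concatenation
second = results[1]                  # index 1 is the second element
```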
python/libxml2dom questions
hi paul...
in playing around with the test python app (see below) i've got a couple of
basic questions. i can't seem to find the answers via google, and when i've
looked in the libxml2dom stuff that i downloaded i didn't see answers
either...
for the list in the "for label in d.xpath" how can i find out the size of
the list a simple/basic question, but it's driving me up a wall!!!
also, how can i determine what methods are available for a libxml2dom
object?
thanks...
-bruce
sample code:
#test python script
import libxml2dom
import urllib
print "hello"
turl = "http://courses.tamu.edu/ViewSections.aspx?campus=CS&x=hvm4MX4PXIY9J8C2Dcxz50ncXTJdT7v2&type=M"
f = urllib.urlopen(turl)
s = f.read()
f.close()
# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)
# get the community-related links
for label in d.xpath("/html/body/table/tr/td[2]/table/tr[6]/td/table/child::tr/[EMAIL PROTECTED] 'sectionheading']/text()"):
    print label.nodeValue
print "/n/n/n/"
l = d.xpath("/html/body/table/tr/td[2]/table/tr[6]/td/table/child::tr/[EMAIL PROTECTED] 'sectionheading']/text()")
print l[1].nodeValue
--
http://mail.python.org/mailman/listinfo/python-list
RE: python/libxml2dom questions
thanks barry... now if i can figure out what the attributes do, and which ones i need to deal with!! -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Barry Kelly Sent: Saturday, July 01, 2006 10:24 PM To: [email protected] Subject: Re: python/libxml2dom questions "bruce" <[EMAIL PROTECTED]> wrote: > also, how can i determine what methods are available for a libxml2dom > object? Have you tried dir(object)? It works great on the command-line reply-eval-print loop. -- Barry -- http://barrkel.blogspot.com/ -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
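Barry's dir() suggestion can be narrowed down a little. The helper below is a hypothetical convenience, not part of any library: it keeps only the public, callable attributes of an object. It is shown on a plain list so it runs without libxml2dom installed; the same call should work on a libxml2dom node, though that is an assumption that has not been checked here.

```python
def public_methods(obj):
    """List the names of callable attributes that do not start with an underscore."""
    names = []
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and special names such as __len__
        if callable(getattr(obj, name)):
            names.append(name)
    return names


list_methods = public_methods([])   # includes 'append', 'sort', ...
```

In the interactive interpreter, help(obj) prints the docstrings for the same names, which helps with working out what each one does.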
RE: xpath question
hi is there anyone with XPath expertise here? i'm trying to figure out if there's a way to use regex expressions with an xpath query? i've seen references to the ability to use regex and xpath/xml, but i'm not sure how to do it... i have a situation where i have something like: /html/table//[EMAIL PROTECTED]'foo'] is it possible to do soomething like [EMAIL PROTECTED]/fo/] so i'd match the class attribute with fo i'm trying to parse HTML/Web docs... thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: xpath question
simon.. you may not.. but lot's of people use python and xpath for html/xml functionality.. check google "python xpath"... later.. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Simon Forman Sent: Sunday, July 02, 2006 2:10 PM To: [email protected] Subject: Re: xpath question bruce wrote: > hi > > is there anyone with XPath expertise here? i'm trying to figure out if > there's a way to use regex expressions with an xpath query? i've seen > references to the ability to use regex and xpath/xml, but i'm not sure how > to do it... > > i have a situation where i have something like: > /html/table//[EMAIL PROTECTED]'foo'] > > is it possible to do soomething like [EMAIL PROTECTED]/fo/] so i'd match the > class > attribute with fo > > i'm trying to parse HTML/Web docs... > > thanks > > -bruce I'll take this one... Dude, this is a *python* mailing list, not an xml/xpath/regex one. In addition, the regex syntax you're using above (~=/fo/) looks like *perl* code-- but I wouldn't know 'cause I don't use perl myself. Now it's entirely possible that there are *many* people here that are xml/xpath/regex Kung Fu Masters, *and* it's entirely possible that one or more of them are about to answer your question informatively and in exhaustive detail. It's also entirely possible that this is the most friendly and informative reply that you're going to get, here. Try a more appropriate newsgroup, and good luck. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
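For what it's worth, XPath 1.0 (the version libxml2 implements) has no regular-expression functions; it offers contains() and starts-with() for simple substring tests. One workaround is to select candidate nodes and filter them in Python with the re module. The sketch below uses the standard library's xml.etree.ElementTree instead of libxml2dom so that it is self-contained, and the markup and class names are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

# made-up document; the class values are only for illustration
doc = ET.fromstring(
    '<html><table><tr>'
    '<td class="foo">1</td><td class="fob">2</td><td class="bar">3</td>'
    '</tr></table></html>'
)

pattern = re.compile(r"fo")

# walk every element and keep those whose class attribute matches the regex
matches = [el for el in doc.iter() if pattern.search(el.get("class", ""))]
texts = [el.text for el in matches]
```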
python function defs/declarations
hi..
the docs state that the following is valid...
def foo():
    i = 2
    print "i = ", i
print "hello"
foo()
is there a way for me to do this..
print "hello"
foo()
def foo():
    i = 2
    print "i = ", i
ie, to use 'foo' prior to the declaration of 'foo'
thanks
-bruce
-- http://mail.python.org/mailman/listinfo/python-list
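A short sketch of the usual answer, in Python 3 syntax: a def is an ordinary statement that runs top to bottom, so a name must exist by the time it is *called*, not by the time the calling code is *written*. Wrapping the top-level code in a function (called main here by convention) lets it appear first in the file.

```python
def main():
    print("hello")
    # foo is looked up when main() runs, not when main is defined
    return foo()


def foo():
    i = 2
    print("i =", i)
    return i


# by this point both functions exist, so the call succeeds
result = main()
```

Calling foo() at module level above its def would raise NameError.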
how to stop python...
hi... perl has the concept of "die". does python have anything similar. how can a python app be stopped? the docs refer to a sys.stop.. but i can't find anything else... am i missing something... thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
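The standard-library call for this is sys.exit(), which works by raising SystemExit. A rough equivalent of Perl's die might look like the sketch below; the name die and its parameters are made up for illustration.

```python
import sys


def die(message, code=1):
    """Print message to stderr and stop the program with the given exit code."""
    sys.stderr.write(message + "\n")
    sys.exit(code)  # raises SystemExit, which ends the script unless caught
```

Raising an ordinary exception and letting it propagate is the other common way to stop, and it gives a traceback as well.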
ascii character - removing chars from string
hi... i'm running into a problem where i'm seeing non-ascii chars in the parsing i'm doing. in looking through various docs, i can't find functions to remove/restrict strings to valid ascii chars. i'm assuming python has something like valid_str = strip(invalid_str) where 'strip' removes/strips out the invalid chars... any ideas/thoughts/pointers... thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
hi...
update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;" which is valid HTML as the '&nbsp;' is a space.
i need a way of stripping/removing the '&nbsp;' from the string
the &nbsp; needs to be treated as a single char...
text = "foo cat&nbsp;"
ie ok_text = strip(text)
ok_text = "foo cat"
thanks
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Rune Strand
Sent: Monday, July 03, 2006 5:43 PM
To: [email protected]
Subject: Re: ascii character - removing chars from string
bruce wrote:
> hi...
>
> i'm running into a problem where i'm seeing non-ascii chars in the parsing
> i'm doing. in looking through various docs, i can't find functions to
> remove/restrict strings to valid ascii chars.
>
> i'm assuming python has something like
>
> valid_str = strip(invalid_str)
>
> where 'strip' removes/strips out the invalid chars...
>
> any ideas/thoughts/pointers...
If you're able to define the invalid_chars, the most convenient is probably to use the strip() method:
>>> a_string = "abcdef"
>>> invalid_chars = 'abc'
>>> a_string.strip(invalid_chars)
'def'
-- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
simon...
the '&nbsp;' is not to be seen/viewed as text/ascii.. it's a representation of a hex 'u\xa0' if i recall...
i'm looking to remove or replace the instances with a ' ' (space)
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Simon Forman
Sent: Monday, July 03, 2006 7:17 PM
To: [email protected]
Subject: Re: ascii character - removing chars from string
bruce wrote:
> hi...
>
> update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;"
> which is valid HTML as the '&nbsp;' is a space.
&, n, b, s, p, ; Those are all ascii characters.
> i need a way of stripping/removing the '&nbsp;' from the string
>
> the &nbsp; needs to be treated as a single char...
>
> text = "foo cat&nbsp;"
>
> ie ok_text = strip(text)
>
> ok_text = "foo cat"
Do you really want to remove those html entities? Or would you rather convert them back into the actual text they represent? Do you just want to deal with &nbsp;'s? Or maybe the other possible entities that might appear also?
Check out htmlentitydefs.entitydefs (see http://docs.python.org/lib/module-htmlentitydefs.html) it's kind of ugly looking so maybe use pprint to print it:
>>> import htmlentitydefs, pprint
>>> pprint.pprint(htmlentitydefs.entitydefs)
{'AElig': 'Æ',
 'Aacute': 'Á',
 'Acirc': 'Â',
 .
 .
 .
 'nbsp': '\xa0',
 .
 .
 .
 etc...
HTH,
~Simon
"You keep using that word. I do not think it means what you think it means." -Inigo Montoya, "The Princess Bride"
-- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string update
update...
here is a sample of the text i'm looking to do the search/replace for...
ACCT 209 - SURVEY OF ACCT PRIN
i'm trying to figure out how to replace the "&nbsp;" with a ''. in html, the '&nbsp;' char is not a valid ascii char...
in perl, i'd do 's/&nbsp;//' and be done with it!!!
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of bruce
Sent: Monday, July 03, 2006 8:26 PM
To: 'Simon Forman'
Cc: [email protected]
Subject: RE: ascii character - removing chars from string
simon...
the '&nbsp;' is not to be seen/viewed as text/ascii.. it's a representation of a hex 'u\xa0' if i recall...
i'm looking to remove or replace the instances with a ' ' (space)
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Simon Forman
Sent: Monday, July 03, 2006 7:17 PM
To: [email protected]
Subject: Re: ascii character - removing chars from string
bruce wrote:
> hi...
>
> update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;"
> which is valid HTML as the '&nbsp;' is a space.
&, n, b, s, p, ; Those are all ascii characters.
> i need a way of stripping/removing the '&nbsp;' from the string
>
> the &nbsp; needs to be treated as a single char...
>
> text = "foo cat&nbsp;"
>
> ie ok_text = strip(text)
>
> ok_text = "foo cat"
Do you really want to remove those html entities? Or would you rather convert them back into the actual text they represent? Do you just want to deal with &nbsp;'s? Or maybe the other possible entities that might appear also?
Check out htmlentitydefs.entitydefs (see http://docs.python.org/lib/module-htmlentitydefs.html) it's kind of ugly looking so maybe use pprint to print it:
>>> import htmlentitydefs, pprint
>>> pprint.pprint(htmlentitydefs.entitydefs)
{'AElig': 'Æ',
 'Aacute': 'Á',
 'Acirc': 'Â',
 .
 .
 .
 'nbsp': '\xa0',
 .
 .
 .
 etc...
HTH,
~Simon
"You keep using that word. I do not think it means what you think it means." -Inigo Montoya, "The Princess Bride"
-- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
simon...
the issue that i'm seeing is not a result of simply using the 'string.replace' function. it appears that there's something else going on in the text
although i can see the nbsp in the file, the file is manipulated by a number of other functions prior to me writing the information out to a file. somewhere the 'nbsp' is changed, so there's something else going on...
however, the error i get indicates that the char 'u\xa0' is what's causing the issue..
as far as i can determine, the string.replace can't/doesn't handle non-ascii chars. i'm still looking for a way to search/replace non-ascii chars...
this would/should resolve my issue..
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Simon Forman
Sent: Monday, July 03, 2006 11:28 PM
To: [email protected]
Subject: Re: ascii character - removing chars from string
bruce wrote:
> simon...
>
> the '&nbsp;' is not to be seen/viewed as text/ascii.. it's a representation
> of a hex 'u\xa0' if i recall...
Did you not see this part of the post that you're replying to?
> 'nbsp': '\xa0',
My point was not that '\xa0' is an ascii character... It was that your initial request was very misleading:
"i'm running into a problem where i'm seeing non-ascii chars in the parsing i'm doing. in looking through various docs, i can't find functions to remove/restrict strings to valid ascii chars."
That's why you got three different answers to the wrong question.
You weren't "seeing non-ascii chars" at all. You were seeing ascii representations of html entities that, in the case of '&nbsp;', happen to represent non-ascii values.
>
> i'm looking to remove or replace the instances with a ' ' (space)
Simplicity:
s.replace('&nbsp;', ' ')
~Simon
"You keep using that word. I do not think it means what you think it means." -Inigo Montoya, "The Princess Bride"
>
> -bruce
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf
> Of Simon Forman
> Sent: Monday, July 03, 2006 7:17 PM
> To: [email protected]
> Subject: Re: ascii character - removing chars from string
>
>
> bruce wrote:
> > hi...
> >
> > update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;"
> > which is valid HTML as the '&nbsp;' is a space.
>
> &, n, b, s, p, ; Those are all ascii characters.
>
> > i need a way of stripping/removing the '&nbsp;' from the string
> >
> > the &nbsp; needs to be treated as a single char...
> >
> > text = "foo cat&nbsp;"
> >
> > ie ok_text = strip(text)
> >
> > ok_text = "foo cat"
>
> Do you really want to remove those html entities? Or would you rather
> convert them back into the actual text they represent? Do you just
> want to deal with &nbsp;'s? Or maybe the other possible entities that
> might appear also?
>
> Check out htmlentitydefs.entitydefs (see
> http://docs.python.org/lib/module-htmlentitydefs.html) it's kind of
> ugly looking so maybe use pprint to print it:
>
> >>> import htmlentitydefs, pprint
> >>> pprint.pprint(htmlentitydefs.entitydefs)
> {'AElig': 'Æ',
> 'Aacute': 'Á',
> 'Acirc': 'Â',
> .
> .
> .
> 'nbsp': '\xa0',
> .
> .
> .
> etc...
>
>
> HTH,
> ~Simon
>
> "You keep using that word. I do not think it means what you think it
> means."
> -Inigo Montoya, "The Princess Bride"
>
> --
> http://mail.python.org/mailman/listinfo/python-list
-- http://mail.python.org/mailman/listinfo/python-list
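Simon's distinction between the entity text and the character it stands for can be sketched as follows. This uses Python 3, where the htmlentitydefs module mentioned above became html.entities and html.unescape() does the conversion in one call; the thread itself was on Python 2, so treat this as a translation rather than what was available at the time. The function name is hypothetical.

```python
import html


def clean_text(raw):
    """Turn HTML entities into characters, then make non-breaking spaces ordinary."""
    text = html.unescape(raw)          # '&nbsp;' becomes the single character '\xa0'
    text = text.replace("\xa0", " ")   # swap each non-breaking space for a plain space
    return text.strip()                # drop leading and trailing whitespace


cleaned = clean_text("foo&nbsp;cat&nbsp;")
```

If only the literal entity text matters, the plain s.replace() from Simon's message is enough; unescape is useful when other entities may turn up too.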
RE: ascii character - removing chars from string
update...
the error i'm getting...
>>>UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in
position 62: ordinal not in range(128)
is there a way i can tell/see what the exact char is at pos(62). i was
assuming that it's the hex \xa0.
i've done the s.replace('\xa0','') with no luck.
-bruce
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of Steven D'Aprano
Sent: Tuesday, July 04, 2006 8:45 AM
To: [email protected]
Subject: RE: ascii character - removing chars from string
On Tue, 04 Jul 2006 08:09:53 -0700, bruce wrote:
> simon...
>
> the issue that i'm seeing is not a result of simply using the
> 'string.replace' function. it appears that there's something else going on
> in the text
>
> although i can see the nbsp in the file, the file is manipulated by a number
> of other functions prior to me writing the information out to a file.
> somewhere the 'nbsp' is changed, so there's something else going on...
>
> however, the error i get indicates that the char 'u\xa0' is what's causing
> the issue..
As you have written it, that's not a character, it is a string of length
two. Did you perhaps mean the Unicode character u'\xa0'?
>>> len('u\xa0')
2
>>> len(u'\xa0')
1
> as far as i can determine, the string.replace can't/doesn't
> handle non-ascii chars. i'm still looking for a way to search/replace
> non-ascii chars...
Seems to work for me:
>>> c = u'\xa0'
>>> s = "hello " + c + " world"
>>> s
u'hello \xa0 world'
>>> s.replace(c, "?")
u'hello ? world'
--
Steven.
--
http://mail.python.org/mailman/listinfo/python-list
--
http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
steven... when you have the >>>u'hello ? world'<< in your interpreter/output, is the 'u' indicating that what you're displaying is unicode? i pretty much tried what you have in the replace.. and i got the same error regarding the unicodedecode error... -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven D'Aprano Sent: Tuesday, July 04, 2006 8:45 AM To: [email protected] Subject: RE: ascii character - removing chars from string On Tue, 04 Jul 2006 08:09:53 -0700, bruce wrote: > simon... > > the issue that i'm seeing is not a result of simply using the > 'string.replace' function. it appears that there's something else going on > in the text > > although i can see the nbsp in the file, the file is manipulated by a number > of other functions prior to me writing the information out to a file. > somewhere the 'nbsp' is changed, so there's something else going on... > > however, the error i get indicates that the char 'u\xa0' is what's causing > the issue.. As you have written it, that's not a character, it is a string of length two. Did you perhaps mean the Unicode character u'\xa0'? >>> len('u\xa0') 2 >>> len(u'\xa0') 1 > as far as i can determine, the string.replace can't/doesn't > handle non-ascii chars. i'm still looking for a way to search/replace > non-ascii chars... Seems to work for me: >>> c = u'\xa0' >>> s = "hello " + c + " world" >>> s u'hello \xa0 world' >>> s.replace(c, "?") u'hello ? world' -- Steven. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
yep! dang phat fingers!!! thanks everything's working as it should... 6 hours to track down this little issue!!! arrrgghhh.. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Fredrik Lundh Sent: Tuesday, July 04, 2006 9:09 AM To: [email protected] Subject: Re: ascii character - removing chars from string bruce wrote: > i've done the s.replace('\xa0','') with no luck. let me guess: you wrote s.replace("\xa0", "") instead of s = s.replace("\xa0", "") ? -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: ascii character - removing chars from string
thanks for your replies!! the solution.. dd = dd.replace(u'\xa0','') this allows the nbsp hex representation to be replaced with a ''. i thought i had tried this early in the process.. but i may have screwed up the typing... -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven D'Aprano Sent: Tuesday, July 04, 2006 9:35 AM To: [email protected] Subject: RE: ascii character - removing chars from string On Tue, 04 Jul 2006 09:01:15 -0700, bruce wrote: > update... > > the error i'm getting... > >>>>UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in > position 62: ordinal not in range(128) Okay, now we're making progress -- we know what exception you're getting. Now, how about telling us what you did to get that exception? > is there a way i can tell/see what the exact char is at pos(62). i was > assuming that it's the hex \xa0. That's what it's saying. > i've done the s.replace('\xa0','') with no luck. What does that mean? What does it do? Crash? Raise an exception? Return a string you weren't expecting? More detail please. Here is some background you might find useful. My apologies if you already know it: "Ordinary" strings in Python are delimited with quote marks, either matching " or '. At the risk of over-simplifying, these strings can contain only single-byte characters, i.e. ordinal values 0 through 255, or in hex, 0 through FF. The character you are having a problem with is within that range of single bytes: ord(u'\xa0') = 160. Notice that a string '\xa0' is a single byte; a Unicode string u'\xa0' is a different type of object, even though it has the same value. String methods will blindly operate on any string, regardless of what bytes are in them. However, converting from unicode to ordinary strings is NOT the same -- the *character* chr(160) is not a valid ASCII character, since ASCII only uses the range chr(0) through chr(127). If this is confusing to you, you're not alone. -- Steven. 
-- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
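The two points that resolved this thread, sketched in Python 3 syntax (where every str is Unicode, so the u'' prefix is no longer needed): strings are immutable, so replace() returns a new string that has to be assigned, and encoding to ASCII fails for any character above 127. The sample text and the helper name are made up.

```python
text = "ACCT\xa0209"                 # made-up sample containing U+00A0

text.replace("\xa0", " ")            # result discarded, text itself is unchanged
still_dirty = "\xa0" in text

text = text.replace("\xa0", " ")     # assigning the result is what does the work
clean = text


def is_ascii(s):
    """Report whether s can be encoded with the ascii codec."""
    try:
        s.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True
```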
python - regex handling
hi... does python provide regex handling similar to perl. can't find anything in the docs i've seen to indicate it does... -bruce -- http://mail.python.org/mailman/listinfo/python-list
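Python's regular expressions live in the standard re module, with syntax close to Perl's. A brief sketch in Python 3 syntax; the course-code pattern and the sample strings are made up for illustration.

```python
import re

text = "AST 1002, AST 1022L"

# findall returns every non-overlapping match as a list of strings
codes = re.findall(r"[A-Z]{3} \d{4}L?", text)

# sub is the counterpart of Perl's s/pattern/replacement/g
collapsed = re.sub(r"\s+", " ", "foo   cat ").strip()

# search returns a match object, or None when nothing matches
has_lab = re.search(r"\d{4}L\b", text) is not None
```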
defining multi dimensional array
hi... basic question.. how do i define a multi dimensional array a[10][10] is there a kind of a = array(10,10) thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: defining multi dimensional array
update... i need a multi dimensional array of lists... ie [q,a,d] [q1,a1,d1] [q2,a2,d2] [q3,a3,d3] which would be a (3,4) array... -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of bruce Sent: Tuesday, July 04, 2006 8:15 PM To: [email protected] Subject: defining multi dimensional array hi... basic question.. how do i define a multi dimensional array a[10][10] is there a kind of a = array(10,10) thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
numarray
hi... i'm trying to find numarray.. i found the numpy on sourceforge and downloaded/installed.. i did a python>> import numarray and got an error... the docs that i've seen point to the sourceforge area.. but i only see numpy.. which appears to incorporate numarray.. my goal is to somehow define multi-dimensional arrays of strengs... pointers... thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: defining multi dimensional array
i tried to do a1[10][10] = ['a','q'] and get an error saying a1 is not defined -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Erik Max Francis Sent: Tuesday, July 04, 2006 9:14 PM To: [email protected] Subject: Re: defining multi dimensional array bruce wrote: > basic question.. > > how do i define a multi dimensional array > > a[10][10] > > is there a kind of a = array(10,10) It's just a list of lists. -- Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/ San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis Have you ever loved somebody / Who didn't know -- Zhane -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: numarray
robert i did an python>>> import numpy a = array([['q','a'],['w','e']]) and it didn't work... i used >>from import numpy * and it seems to accept the 'array' word.. .looks like it will work... what's the difference between 'import numpy', and "from import numpy *" comments... thanks -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Robert Kern Sent: Tuesday, July 04, 2006 9:42 PM To: [email protected] Subject: Re: numarray bruce wrote: > hi... > > i'm trying to find numarray.. i found the numpy on sourceforge and > downloaded/installed.. > > i did a > python>> import numarray > > and got an error... Never just say "I got an error." It tells us nothing. Copy-and-paste the exact error message. I presume, however, that you installed numpy, not numarray. They are not the same thing. > the docs that i've seen point to the sourceforge area.. but i only see > numpy.. which appears to incorporate numarray.. No, it replaces numarray. http://www.scipy.org/NumPy > my goal is to somehow define multi-dimensional arrays of strengs... >>> from numpy import * >>> a = array([['some', 'strings'],['in an', 'array']], dtype=object) >>> a array([[some, strings], [in an, array]], dtype=object) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
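On the difference between the two import forms, a sketch using the standard math module as a stand-in so that it runs without numpy installed. The same rule applies to numpy: after a plain `import numpy` the function has to be written as numpy.array(...), while `from numpy import *` (or `from numpy import array`) copies the name into the current namespace.

```python
import math               # binds only the name 'math'
from math import sqrt     # copies the name 'sqrt' into this namespace

qualified = math.sqrt(16.0)   # must go through the module name
bare = sqrt(16.0)             # usable directly because of the from-import
```

The star form is convenient at the interactive prompt, but in scripts it can silently overwrite names that already exist, which is why the qualified form is usually preferred.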
RE: defining multi dimensional array
do i need to import anything for this.. or is it supposed to work out of the
box.. and just what is it doing!
-bruce
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of tac-tics
Sent: Tuesday, July 04, 2006 9:53 PM
To: [email protected]
Subject: Re: defining multi dimensional array
bruce wrote:
> hi...
>
> basic question..
>
> how do i define a multi dimensional array
>
> a[10][10]
I find that list comprehensions are useful for this.
[ [None for x in xrange(10)] for y in xrange(10)]
You could easily write a wrapper for it to clean the syntax a bit.
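For what it's worth, the list comprehension above needs no import. A minimal sketch of what it does, written for current Python (where `range` replaces `xrange`); the names `grid` and `shared` are illustrative only:

```python
ROWS, COLS = 10, 10

# Build a ROWS x COLS grid; each row is a separate list object.
grid = [[None for _ in range(COLS)] for _ in range(ROWS)]

# The grid must exist before indexing into it; valid indices run 0..9,
# so a1[10][10] would be out of range even on a 10x10 grid.
grid[9][9] = ['a', 'q']

# Pitfall: [[None] * COLS] * ROWS repeats one row object ROWS times,
# so a write to one row shows up in every row.
shared = [[None] * COLS] * ROWS
shared[0][0] = 'x'
print(shared[1][0])
```

This also explains the earlier "a1 is not defined" error: the name has to be bound to a list of lists before any element can be assigned.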
python list/array question...
hi...
i'm trying to deal with multi-dimension lists/arrays
i'd like to define a multi-dimension string list, and then manipulate the
list as i need... primarily to add lists/information to the 'list/array' and
to compare the existing list information to new lists
i'm not sure if i need to import modules, or if the base python install i
have is sufficient. an example, or pointer to examples would be good...
i'd like
define a[][]
#basically, i'd like a 3x3 array, where each element
#has one of the a,b,c items..
# |a1, b1, c1|
# |a2, b2, c2|
# |a3, b3, c3|
a[1][1] = ['a1','b1','c1']
a[1][2] = ['a2','b2','c2']
a[1][3] = ['a3','b3','c3']
b = ['f','g','h']
v = ['f1','g1','h1']
if a[1][2] == b
    print 'good!'
a[1][4] = b
x = 4
g = ['p1','l1','g1']
for i in range[g]
    a[x][i] = g[i]
these are the kinds of list/array functions i'd like to be able to accomplish
pointers/code samples/pointers to code would be helpful...
and yeah. i've been looking via google...
thanks
-bruce
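A rough sketch of how the operations listed above might look with plain nested lists, which need no extra modules. It is written for current Python; the variable names `table`, `new_row` and `already_there` are illustrative, and indices start at 0 rather than 1.

```python
# A 3x3 table of strings stored as a list of rows.
table = [
    ['a1', 'b1', 'c1'],
    ['a2', 'b2', 'c2'],
    ['a3', 'b3', 'c3'],
]

b = ['f', 'g', 'h']

# Lists compare element by element, so == works on whole rows.
if table[1] == ['a2', 'b2', 'c2']:
    print('good!')

# Grow the table by appending a new row.
table.append(b)

# Copy items one at a time into a fresh row.
g = ['p1', 'l1', 'g1']
new_row = []
for i in range(len(g)):
    new_row.append(g[i])
table.append(new_row)

# Check whether a row is already present before adding it again.
already_there = b in table
```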
python/xpath question...
for guys with python/xpath expertise..
i'm playing with xpath.. and i'm trying to solve an issue...
i have the following kind of situation where i'm trying to get certain data.
i have a bunch of tr/td...
i can create an xpath, that gets me all of the tr.. i only want to get the
sibling tr up until i hit a 'tr' that has a 'th'
anybody have an idea as to how this query might be created?..
the idea would be to start at the "Summer B", to skip the 1st "tr", to get
the next "tr"s until you get to the next "Summer" section...
sample data.
Summer B
Course Course Course number and suffix, if applicable. C = combined lecture and lab course L = laboratory course
AST 1002
AST 1022L
AST 1022L
AST 1022L
Summer C
Course
.
.
.
thanks...
-bruce
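One way to sidestep a complicated sibling query is to select all the rows and walk them in order, collecting rows after the wanted header until the next header row. The sketch below assumes a hypothetical table layout (the tags of the sample data did not survive in the archive, and the "AST 2003" row is made up) and uses the standard library's ElementTree instead of libxml2dom; `rows_in_section` is an illustrative helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the original sample lost its tags in the archive.
SAMPLE = """
<table>
  <tr><th>Summer B</th></tr>
  <tr><td>Course</td></tr>
  <tr><td>AST 1002</td></tr>
  <tr><td>AST 1022L</td></tr>
  <tr><th>Summer C</th></tr>
  <tr><td>Course</td></tr>
  <tr><td>AST 2003</td></tr>
</table>
"""

def rows_in_section(table, title, skip=1):
    """Return the text of the td rows that follow the th row named `title`,
    stopping at the next th row and skipping the first `skip` rows."""
    collected = []
    inside = False
    for tr in table.findall('tr'):
        th = tr.find('th')
        if th is not None:
            if inside:
                break                      # next section header: stop
            inside = (th.text or '').strip() == title
            continue
        if inside:
            collected.append(''.join(tr.itertext()).strip())
    return collected[skip:]

table = ET.fromstring(SAMPLE)
summer_b = rows_in_section(table, 'Summer B')
summer_c = rows_in_section(table, 'Summer C')
print(summer_b)
```

The same loop should carry over to the node list returned by an "all tr" XPath query, since it only relies on visiting the rows in document order.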
string/list comparison
hi...
i have the following piece of code that i'm testing...
it should be using/comparing two equal strings. apparently it doesn't. i've
tried to do a "strip" to remove pre/post whitespace.. but there appears to be
something else going on. i suspect that there is some type of unicode going
on.
is there some way that i can print out what's really in the string so i can
know what to remove...
in the test, the sstr/trstr should both be "Summer A"
for some reason i'm getting a len of 10 and 11 chars...
any thoughts/comments...
thanks
-bruce
#extract process from the class page
print "section"
f = urllib.urlopen(url)
s = f.read()
f.close()
#print s
# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)
#get the tr list
tr1 = d.xpath(alltr)
#get the sess list
sess1 = d.xpath(sess)
#build the tr list
trlist = []
for aaa in tr1:
    trlist.append(aaa.nodeValue)
    #print "aaa = ",aaa.nodeValue
#build the course list
sesslist = []
for in sess1:
    sesslist.append(.nodeValue)
    #print " = ",.nodeValue
print "sesstest = ",sesslist[0]
print "sesstest2 = ",sesslist[1]
print "trtest = ",trlist[3]
sstr = sesslist[0] <<<<<<<<<<<<<<<<<< these should be the same
trstr = trlist[3]<<<<<<<<<<<<<<<<<< "Summer A"
sstr.strip(sstr)
trstr.strip(trstr)
print "slen = ",len(sstr)
print "trlen = ",len(trstr)
if sesslist[0] == trlist[3]:
    print "okk"
sys.exit()
RE: string/list comparison
thanks tim...
the strip should have been 'sstr.strip()'
thanks
-Original Message-
From: Tim Chase [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 06, 2006 12:17 PM
To: [EMAIL PROTECTED]
Cc: [email protected]
Subject: Re: string/list comparison
> sstr = sesslist[0] << these should be the same
> trstr = trlist[3]<< "Summer A"
>
> sstr.strip(sstr)
> trstr.strip(trstr)
>
> print "slen = ",len(sstr)
> print "trlen = ",len(trstr)
Have you tried printing the repr(sstr) and repr(trstr) to see what Python
thinks they are? Without having these values posted to the list, it's near
impossible to be of more assistance. As soon as you do print them out, you
should be able to visdiff them and see where matters have gone awry.
-tkc
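Two points from this exchange can be shown in a few lines: `strip()` returns a new string instead of changing the one it is called on, and `repr()` exposes characters that a plain print hides. The value of `raw` below is a stand-in (the real strings were never posted), and the sketch is written for current Python.

```python
raw = "Summer A\r\n"           # stand-in value; the real strings were not posted

# Strings are immutable: strip() returns a new string and leaves `raw` alone,
# so calling it without keeping the result has no effect.
raw.strip()
unchanged_len = len(raw)

# Keep the result by assigning it.
clean = raw.strip()

# repr() shows characters that print() hides, such as '\r' and '\n'.
print(unchanged_len, len(clean), repr(raw))
```

Note also that `sstr.strip(sstr)` passes the string itself as the set of characters to remove, which is a different operation from the no-argument whitespace strip.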
for loop question
hi..
basic for/loop question..
i can do:
for a in foo
    print a
if i want to do something like
for a, 2, foo
    print foo
where go from 2, to foo..
i can't figure out how to accomplish this...
can someone point me to how/where this is demonstrated...
found plenty of google for for/loop.. just not this issue..
thanks
-bruce
RE: for loop question
'ppreciate the answers
duh...
-bruce
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Daniel Haus
Sent: Thursday, July 06, 2006 2:02 PM
To: [email protected]
Subject: Re: for loop question
just do:
for a in range(2, foo+1):
    print a
range(a, b) gives [a, a+1, a+2, ..., b-2, b-1]
bruce schrieb:
> hi..
>
> basic for/loop question..
>
> i can do:
>
> for a in foo
> print a
>
> if i want to do something like
> for a, 2, foo
> print foo
>
> where go from 2, to foo..
>
> i can't figure out how to accomplish this...
>
> can someone point me to how/where this is demonstrated...
>
> found plenty of google for for/loop.. just not this issue..
>
> thanks
>
> -bruce
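A short runnable version of the answer above, for current Python. It assumes `foo` is a number; if `foo` is instead a sequence and the goal is to start from its third item, a slice does the job (the `items` list is illustrative).

```python
foo = 6

# range(start, stop) yields start, start+1, ..., stop-1,
# so stop must be foo + 1 to include foo itself.
for a in range(2, foo + 1):
    print(a)

values = list(range(2, foo + 1))

# If foo is a sequence rather than a number, slice it to start at index 2.
items = ['zero', 'one', 'two', 'three']
tail = items[2:]
```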
fetching a POST webpage...
hi...
i have the basic code to fetch a url/web page. however, i'm trying to fetch a
page that uses a FORM/POST.
has anyone done this? i've tried a few times without success. i've analyzed
the data stream using Firefox/Livehttpheaders to get the HTTP stream.. but
i'm doing something wrong, and can't quite see what the err/issue is...
if you've done this kind of thing, and you have some thoughts, let me know. i
can send you the output of the livehttpheaders app, and the test code that i
have...
thanks..
-bruce
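For reference, a minimal sketch of a form POST with the standard library, written for current Python (in the Python 2 of this thread the equivalents are `urllib.urlencode` and `urllib2.Request`). The URL and field names are hypothetical; the real ones would come from the form's `action` attribute and its input names, which the LiveHTTPHeaders capture should show.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_post(url, fields):
    """Return a Request that urlopen() will send as a form POST."""
    body = urlencode(fields).encode('ascii')
    return Request(
        url,
        data=body,   # supplying data switches the method from GET to POST
        headers={'Content-Type': 'application/x-www-form-urlencoded'},
    )

# Hypothetical URL and field names, for illustration only.
req = build_post('http://example.com/search', {'term': '1068', 'action': 'go'})
print(req.get_method(), req.data)
# To send it: urllib.request.urlopen(req).read()
```

Sites that depend on cookies or hidden form fields will also need those sent back, which is one common reason a hand-built POST fails where the browser succeeds.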
IRC questions!!
hi... i'm trying to figure out what i have to do to setup mIRC to get the #python channel on IRC!! any pointers. the mIRC docs didn't get me very far. is there an irc.freenode.net that i need to connect to? how do i do it? thanks.. -bruce -- http://mail.python.org/mailman/listinfo/python-list
RE: IRC questions!!
given that nothing appears to be connecting.. should i have anything in the "group" window/dialog of the server setting... -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jon Clements Sent: Friday, July 07, 2006 10:57 AM To: [email protected] Subject: Re: IRC questions!! bruce wrote: > hi... > > i'm trying to figure out what i have to do to setup mIRC to get the #python > channel on IRC!! > > any pointers. the mIRC docs didn't get me very far. > > is there an irc.freenode.net that i need to connect to? how do i do it? > > thanks.. > > -bruce Assuming you're familiar with the basics of IRC. In mIRC, File->Select Server->Add, enter "Freenode" as description, enter "irc.freenode.net" as server. Leave the port as 6667, then change it later if server supports other ports. Click connect to server. Cheers, Jon. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: IRC questions!! (off topic)
hi jon i can sortof connect however, to join "irc.freenode.net", it states i have to register... the docs that i've seen say i have to : /msg nickserv register if my nickname is tom, do i do /msg tom register(if foo is the passwd i want) -or- /msg tom register foo or is there something else i need to do... thanks -bruce -Original Message- From: Jon Clements [mailto:[EMAIL PROTECTED] Sent: Friday, July 07, 2006 1:45 PM To: [EMAIL PROTECTED] Subject: Re: IRC questions!! Bruce, What happens when you attempt to connect? What messages do you get etc...? Cheers, Jon. On 07/07/06, bruce <[EMAIL PROTECTED]> wrote: hey jon... thanks for the reply.. and i fully understand!!! i've tried everything that i can think of with regards to setting up mIRC, and getting the firewall to behave. as far as i can tell, the firewall is setup to pass 113/6667/8001 though to the mIRC pc.. however, when i try to connect to any server, the mIRC client still has a status of "not connected" -bruce -Original Message- From: Jon Clements [mailto:[EMAIL PROTECTED] Sent: Friday, July 07, 2006 1:09 PM To: [EMAIL PROTECTED] Subject: Re: IRC questions!! Hi Bruce, Sorry, was AFK from keyboard for a while... Have you tried typing in the status window: /server irc.freenode.net And once connected, /join #python I don't use the freenode network myself but I think you need to also register, google for info on freenode. You also need to ensure you can respond to ident requests, and, in terms of registration, they may have a "nickserv" equivalent (like DALnet). I wouldn't have normally responded to your post as it's kind of off-topic, but I was an IRC administrator and operator for about 4 years on a network with roughly 8k users. Quite challenging when it's just the seven of you! I wish you luck, sorry I haven't been of much help. Jon. "The words of the prophets are written on the subway walls..." - Paul Simon On 07/07/06, bruce <[EMAIL PROTECTED]> wrote: hi jon... i followed your instructions.. 
nothing seems to happen...!!! simply has a window saying status not connected ( irc.freenode.net) -bruce -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jon Clements Sent: Friday, July 07, 2006 10:57 AM To: [email protected] Subject: Re: IRC questions!! bruce wrote: > hi... > > i'm trying to figure out what i have to do to setup mIRC to get the #python > channel on IRC!! > > any pointers. the mIRC docs didn't get me very far. > > is there an irc.freenode.net that i need to connect to? how do i do it? > > thanks.. > > -bruce Assuming you're familiar with the basics of IRC. In mIRC, File->Select Server->Add, enter "Freenode" as description, enter "irc.freenode.net " as server. Leave the port as 6667, then change it later if server supports other ports. Click connect to server. Cheers, Jon. -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
python guru for urllib/mechanize
hi... i'm trying to get the pages from a site "axess.stanford.edu", and i'm running into problems. i've got some test code that allows me to get the 1st few pages. i'm having an issue when i run into a page that somehow interprets a url from a src of a frameset. i can't seem to mimic/implement this kind of function... if you have expertise with http/web fetching, i'd appreciate any thoughts/comments/etc... i can provide the test code. i'm pretty sure the answer is fairly simple, but i just can't get my hands around it... thanks -bruce -- http://mail.python.org/mailman/listinfo/python-list
Mechanize-Browser question..
hi..
i have the following piece of test code. i'm trying to implement/check out
the follow-link method.
i'm just trying to figure out how to get a link from the page. i was hoping
that the regex would basically get the 1st url link...
any thoughts/comments/ideas as to what i'm doing wrong.
thanks
-bruce
br = Browser()
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.open(url2)
#br.set_cookiejar(cj)
br.set_debug_redirects(True)
# Log HTTP response bodies (ie. the HTML, most of the time).
br.set_debug_responses(True)
# Print HTTP headers.
br.set_debug_http(True)
r2 = br.follow_link(url_regex=re.compile(r"\*"),nr=1) <<<<<<<<<<<<<<<
response = br.response() # this is a copy of response
print response.read()
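A likely culprit in the marked line is the pattern itself: `r"\*"` matches only a literal asterisk, which few URLs contain, whereas `r".*"` matches anything. The sketch below shows the difference with the standard `re` module; the URL is hypothetical. Separately, and as far as I recall from the mechanize documentation, `nr` counts from 0, so `nr=1` would select the second matching link rather than the first.

```python
import re

# Hypothetical link target, for illustration.
url = "http://example.com/classes?term=1068"

literal_star = re.compile(r"\*")   # matches only a literal '*' character
anything = re.compile(r".*")       # matches any URL (even an empty one)
by_word = re.compile(r"classes")   # matches URLs containing 'classes'

found_star = bool(literal_star.search(url))
found_any = bool(anything.search(url))
found_word = bool(by_word.search(url))
print(found_star, found_any, found_word)
```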
xpath question...
hi...
i have the following section of test code where i'm trying to get the
attribute of a frame
i'm trying to print/get the src value. the xpath query that i have displays
the "src" attribute in the Xpather/Firefox plugin. however, i can't quite
figure out how to get the underlying value in my test app...
sxpath = "/html/frameset/frame[2]/attribute::src"
# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)
#get the tr list
tr1 = d.xpath(sxpath)
url = tr1[0]
#get the url/link >>semester page
#link = br.find_link(nr=1)
#url = link.url
print "link = ",url
sys.exit()
err output
link =
--
i'm not sure what i need to add to the line
url = tr1
to resolve the issue/error...
looking over google hasn't given any real pointers...
thanks
-bruce
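The query returns a list of attribute nodes, not strings, so the value still has to be read from the node. The later posts in this archive use `ff[0].nodeValue` on a similar result, which suggests `url = tr1[0].nodeValue` here. The same idea in a self-contained sketch, using the standard library's ElementTree and a hypothetical frameset page (the real page was not posted):

```python
import xml.etree.ElementTree as ET

# Hypothetical frameset page; the real page was not posted.
PAGE = """
<html>
  <frameset>
    <frame name="nav" src="/nav.html"/>
    <frame name="main" src="/semester.html"/>
  </frameset>
</html>
"""

root = ET.fromstring(PAGE)

# Select the second frame element, then read the attribute value from it.
frames = root.findall('./frameset/frame')
url = frames[1].get('src')
print("link =", url)
```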
RE: [wwwsearch-general] ClientForm request re ParseErrors
hi john...
not sure exactly who i should talk to about this.. but here goes...
i have the following piece of code... i'm trying to do a select form, and my
test throws an error...
i have the actual form "main" in the html, so it should find it... as far as
i can tell, i've followed the docs.. but i could be wrong. any thoughts?
the code, output, and partial html is below...
thoughts/comments/ideas/etc...
thanks
-bruce
test code
#get the semester page
#get the 2nd semester/frame src url page
br.open(url)
response = br.response() # this is a copy of response
s = response.read()
print response.read()
print s
#we now have the semester page...
d = libxml2dom.parseString(s, html=1)
ff = d.xpath(fnamepath)
fname = ff[0].nodeValue
print "fname = ",fname
br.select_form(name="main")<<<<<<<<<<<<<<< error happens
output
fname = main
Traceback (most recent call last):
File "./stest.py", line 156, in ?
br.select_form(name="main")
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 352, in
select_form
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 296, in
forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 510, in forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 226, in forms
File "build/bdist.linux-i686/egg/ClientForm.py", line 922, in
ParseResponse
File "build/bdist.linux-i686/egg/ClientForm.py", line 952, in ParseFile
File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
self.goahead(0)
File "/usr/lib/python2.4/sgmllib.py", line 165, in goahead
k = self.parse_declaration(i)
File "/usr/lib/python2.4/markupbase.py", line 89, in parse_declaration
decltype, j = self._scan_name(j, i)
File "/usr/lib/python2.4/markupbase.py", line 378, in _scan_name
self.error("expected name token at %r"
File "/usr/lib/python2.4/sgmllib.py", line 102, in error
raise SGMLParseError(message)
sgmllib.SGMLParseError: expected name token at '
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of
John J Lee
Sent: Sunday, July 09, 2006 9:51 AM
To: [EMAIL PROTECTED]
Subject: Re: [wwwsearch-general] ClientForm request re ParseErrors
On Sun, 9 Jul 2006, Titus Brown wrote:
[...]
> Define "better patch"...? The code I sent out before lets ClientForm
> parse otherwise unparseable HTML, and it works fine. I suppose it's
> less elegant than having two separate while loops; is that what you
> mean?
No, I just hate going one char at a time in Python. Surely this should be
fixed somewhere else? (I'm not sure where; I haven't looked recently)
If you've determined that fixing it elsewhere pulls in too much code or
requires a fix to stdlib code (if so, why?), maybe I should do as you
suggest anyway, but I don't like it.
> -> > The problem I have is that there's literally no way to pass
> -> > configuration parameters like 'ignore_errors' down from the
> -> > mechanize.Factory.forms() call.
> ->
> -> You can reimplement FormsFactory. It's a trivial (if slightly verbose)
> -> class, right?
>
> I could do that, yes. But I'd also need to redefine Factory.forms(),
> too, which calls FormsFactory.
Why? You can supply your own FormsFactory, as DefaultFactory does.
[...]
> -> > Separately, it'd be nice if ignore_errors wasn't hardcoded as False
in
> -> > ParseFile ;).
> ->
> -> I'm not sure what you want here. Could you send a patch?
>
> Line 914 of ClientForm.py should be changed to 'ignore_errors,'
Oh. Sure, if I apply a patch to enable ignore_errors, I'll of course do
that too.
John
___
wwwsearch-general mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/wwwsearch-general
uninstall python mechanize
hi..
i'm trying to figure out how to uninstall "mechanize". i don't see an
"uninstall" in the "python --help-commands" output...
i'm looking to rebuild/reinstall mechanize from the svn repos to see if an
apparent parsing issue that i mentioned is fixed...
thanks
-bruce
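As far as I know, the setup.py tooling of that era has no uninstall command, so the usual route is to find where the package was installed and remove its files (or its egg) by hand. A small sketch for locating an installed module, written for current Python; `where_is` is an illustrative helper, and `json` is only a stand-in for `mechanize`:

```python
import importlib.util

def where_is(module_name):
    """Return the file a module would be imported from, or None if absent."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None

# 'json' is a stand-in that always exists; substitute 'mechanize'.
found = where_is('json')
missing = where_is('no_such_module_xyz')
print(found, missing)
```

On the Python 2.4 setup in this thread, `import mechanize; print mechanize.__file__` gives the same information.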
RE: [wwwsearch-general] ClientForm request re ParseErrors
update.
out of curiosity, i fetched the latest mechanize from svn.. i get the same
error with the parse...
i've also tried to do:
br.select_form(nr = 1)
br.select_form(name="foo")
br.select_form(name=foo)
br.select_form(name="foo")
etc.... same err occurs...
-bruce
possible issue with mechanize/python parsing
hi...
it appears that i'm running into a possible problem with
mechanize/browser/python regarding the "select_form" method. i've tried the
following and get the error listed:
br.select_form(nr = 1)
br.select_form(name="foo")
br.select_form(name=foo)
br.select_form(name="foo")
here's a short test app, as well as the html to be placed in a test data
file
everything is straight forward...
any thoughts/comments/ideas would be helpful. i have the latest mechanize
from the svn repos.
thanks
-bruce
the error i get is:
Traceback (most recent call last):
File "./axess.py", line 127, in ?
br.select_form(name = "main")
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 354, in
select_form
mechanize._mechanize.BrowserStateError: not viewing HTML
test code
---
#! /usr/bin/env python
#test python script
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser
import mechanize
#
# Parsing App Information
# datafile
tfile = open("stanford.dat", 'wr+')
cj = mechanize.CookieJar()
br = Browser()
if __name__ == "__main__":
# main app
#
# start trying to get the stanford pages
cj = mechanize.CookieJar()
br = Browser()
fh = open('axess.dat')
s = fh.read()
fh.close()
br.open("file:///home/test/axess.dat")
print "foo"
# particular cookiejar)
br.set_cookiejar(cj)
# Log information about HTTP redirects and Refreshes.
##br.set_debug_redirects(True)
# Log HTTP response bodies (ie. the HTML, most of the time).
#WARNING!! using this will apparently
#kill the Browser instance!!!
#br.set_debug_responses(True)
# Print HTTP headers.
# br.set_debug_http(True)
# br.set_handle_redirect(True)
# br.set_handle_referer(True)
response = br.response() # this is a copy of response
#get the option/semester name
snamepath =
"/html/[EMAIL PROTECTED]'PSPAGE']/form[2]/table/tr[7]/td[3]/select/@name"
#get the form name
fnamepath = "/html/[EMAIL PROTECTED]'PSPAGE']/form[2]/attribute::name"
s = response.read()
print response.read()
print s
#we now have the semester page...
d = libxml2dom.parseString(s, html=1)
#get option name
sem_optname = d.xpath(snamepath)
sem_optname = sem_optname[0].nodeValue
print "sem = ",sem_optname
ff = d.xpath(fnamepath)
fname = ff[0].nodeValue
print "fname = ",fname
br.select_form(name = "main")
print "ss"
sys.exit()
data file
---
var totalTimeoutMilliseconds = 120;
var warningTimeoutMilliseconds = 108;
var timeOutURL =
'https://axess.stanford.edu/psp/psp_prd/?cmd=expire&languageCd=ENG';
var timeoutWarningPageURL =
'https://psweb.stanford.edu/servlets/iclientservlet/a2k_prd/?ICType=Script&ICScriptProgramName=WEBLIB_TIMEOUT.PT_TIMEOUTWARNING.FieldFormula.IScript_TIMEOUTWARNING';
View Schedule of Classes
var baseKey_main = "";
var altKey_main = "05678\xbc\xbe\xbf\xde";
var ctrlKey_main = "JK";
var bTabOverTB_main = false;
var bTabOverPg_main = false;
var bTabOverNonPS_main = false;
document.domain = "stanford.edu";
function submitAction_main(form, name)
{
form.ICAction.value=name;
form.ICXPos.value=getScrollX();
form.ICYPos.value=getScrollY();
form.submit();
}
http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=
CLASS_SRCH_ENTRY&LangCD=ENG">
http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=C
LASS_SRCH_ENTRY&LangCD=ENG" target='help' accesskey='9' tabindex='1'
class='PSHYPERLINK'>Help
Class Search
Select
Term
Select the term you wish to search, and
then click Basic Search,
Advanced Search, or Independent Study
Search to continue.
*Select a
Term:
(1068) 2005-2006 Summer
(1066) 2005-2006 Spring
(1064) 2005-2006 Winter
(1062) 2005-2006 Autumn
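A possible explanation for "not viewing HTML" (an assumption, not something confirmed in this thread): with a file:// URL there is no web server to send a Content-Type header, so the type is guessed from the file name, and a ".dat" name is not recognized as HTML. The standard library's guess can be checked directly:

```python
import mimetypes

# How the standard library guesses a content type from a file name.
html_type, _ = mimetypes.guess_type("axess.html")
dat_type, _ = mimetypes.guess_type("axess.dat")

print(html_type)   # text/html
print(dat_type)    # not text/html; the exact value depends on the system
```

If that is what is happening, renaming the test file to axess.html before opening it may be worth a try.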
RE: [wwwsearch-general] ClientForm request re ParseErrors
is this where i've seen references to integrating Beautifulsoup in the web
browsing app?
-bruce
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of John J Lee
Sent: Monday, July 10, 2006 2:29 AM
To: [EMAIL PROTECTED]
Cc: [email protected]
Subject: RE: [wwwsearch-general] ClientForm request re ParseErrors
On Sun, 9 Jul 2006, bruce wrote:
[...]
> sgmllib.SGMLParseError: expected name token at '
>
> partial html
> ---
>
>
>
Action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVI
> CES&Market=GBL&PanelGroupName=CLASS_SEARCH" autocomplete=off>
>
>
>
[...]
You don't include the HTML mentioned in the exception message (''. The comment sgmllib is complaining about is missing the '--'.
You can work around bad HTML using the .set_data() method on response
objects and the .set_response() method on Browser. Call the latter before
you call any other methods that would require parsing the HTML.
r = br.response()
r.set_data(clean_html(r.get_data()))
br.set_response(r)
You must write clean_html yourself (though you may use an external tool to
do so, of course).
Alternatively, use a more robust parser, e.g.
br = mechanize.Browser(factory=mechanize.RobustFactory())
(you may also integrate another parser of your choice with mechanize, with
more effort)
John
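As an illustration of the kind of `clean_html` function the reply above asks the reader to write, here is one possible sketch using only the `re` module. It assumes the bad markup is a "<! ... >" declaration that is not a proper "<!-- ... -->" comment; the actual offending HTML was stripped from the archive, so the sample input is hypothetical.

```python
import re

# Matches "<!" declarations that are neither real comments ("<!--"),
# a DOCTYPE, nor a marked section ("<!["), up to the closing ">".
_BAD_DECL = re.compile(r"<!(?!--|DOCTYPE|\[)[^>]*>", re.IGNORECASE)

def clean_html(html):
    """Drop malformed '<! ... >' declarations that strict parsers reject."""
    return _BAD_DECL.sub("", html)

# Hypothetical input: the offending markup was stripped from the archive.
sample = "<html><! not a real comment ><!-- fine --><form name='main'></form></html>"
cleaned = clean_html(sample)
print(cleaned)
```

A regex like this is a blunt tool; it is only meant to get a page past the parser, not to repair HTML in general.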
RE: [wwwsearch-general] ClientForm request re ParseErrors
hi john...
this is in regards to the web/parsing/factory/beautifulsoup
to reiterate, i have python 2.4, mechanize, browser, beautifulsoup installed.
i have the latest mech from svn.
i'm getting the same err as reported by john t. the code/err follows.. (i
can resend the test html if you need)
any thoughts/pointers/etc would be helpful...
thanks
-bruce
test code
#! /usr/bin/env python
#test python script
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser, RobustFactory
import mechanize
import BeautifulSoup
#
# Parsing App Information
# datafile
tfile = open("stanford.dat", 'wr+')
cj = mechanize.CookieJar()
br = Browser()
if __name__ == "__main__":
# main app
#
# start trying to get the stanford pages
cj = mechanize.CookieJar()
br = Browser(factory=RobustFactory())
fh = open('axess.dat')
s = fh.read()
fh.close()
br.open("file:///home/test/axess.dat")
.
.
.
.
err/output
Traceback (most recent call last):
File "./axess.py", line 45, in ?
br.open("file:///home/test/axess.dat")
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 130, in
open
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 170, in
_mech_open
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 213, in
set_response
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 577, in
set_response
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 316, in
__init__
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1326, in
__init__
BeautifulStoneSoup.__init__(self, *args, **kwargs)
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 973, in
__init__
self._feed()
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 987, in
_feed
smartQuotesTo=self.smartQuotesTo)
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1580, in
__init__
u = self._convertFrom(proposedEncoding)
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1614, in
_convertFrom
proposed = self.find_codec(proposed)
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1731, in
find_codec
return self._codec(self.CHARSET_ALIASES.get(charset, charset)) \
File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1740, in
_codec
codecs.lookup(charset)
TypeError: lookup() argument 1 must be string, not bool
Mechanize/Browser question
hi...
i can do the following
br = Browser()
br.open("http://www.yahoo.com")
br.open("file:///foo")
but can i do
s = "..." <<<< qualified html text
br.open(s)
i'm curious, if i have html from someother source, is there a way to simply
get it into the "Browser" so i can modify it...
thanks
-bruce
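One workaround that relies only on `br.open()` accepting a file:// URL (as in the example above) is to write the HTML string to a temporary file and open that. A minimal sketch for current Python; `html_to_file_url` is an illustrative helper, not a mechanize function:

```python
import os
import tempfile
from pathlib import Path

def html_to_file_url(html):
    """Write `html` to a temporary .html file and return its file:// URL."""
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(html)
    return Path(path).resolve().as_uri()

url = html_to_file_url("<html><body><form name='main'></form></body></html>")
print(url)   # pass this to br.open(url); delete the file when done
```

The other route mentioned elsewhere in this archive is to take an existing response, replace its body with `set_data()`, and hand it back with `br.set_response()`.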
mechanize select_form issue..
hi...
update to an ongoing issue i've been having regarding html/Browser and
selecting forms.
i've created a basic test app, and created a stripped down page of html. the
html has a single form.
i get the following error:
fname = main <<<< the app can find the frame from the XPath...
Traceback (most recent call last):
File "./axess.py", line 90, in ?
br.select_form(name = "main") <<<<< app is dying!!!
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 354, in
select_form
mechanize._mechanize.BrowserStateError: not viewing HTML
any thoughts/ideas/comments will be useful!!
thanks
-bruce
test code
---
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser, RobustFactory
import mechanize
from BeautifulSoup import *
#
# Parsing App Information
# datafile
tfile = open("stanford.dat", 'w+')   # was 'wr+', which is not a portable mode string
cj = mechanize.CookieJar()
br = Browser()
if __name__ == "__main__":
    # main app
    #
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    # br = Browser(factory=RobustFactory())
    br = Browser()
    fh = open('axess1.dat')
    s = fh.read()
    fh.close()
    br.open("file:///home/test/axess1.dat")
    # br.open(s)
    print "foo"
    # use this particular cookiejar
    br.set_cookiejar(cj)
    response = br.response() # this is a copy of response
    fnamepath = "/html/[EMAIL PROTECTED]'PSPAGE']/form[1]/attribute::name"
    s = response.read()
    print s   # print what was already read; a second response.read() returns nothing more
    d = libxml2dom.parseString(s, html=1)
    ff = d.xpath(fnamepath)
    fname = ff[0].nodeValue
    print "fname = ",fname
    br.select_form(name = "main")
    print "ss"
    sys.exit()
test html
---
View Schedule of Classes
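On the BrowserStateError above, one possible cause, offered as an assumption rather than a confirmed diagnosis: for file: URLs the content type is normally guessed from the file extension, and a ".dat" file is not guessed to be HTML, so the Browser may decide it is "not viewing HTML" even though the file contains a form. The sketch below only uses the standard library's mimetypes module to show the guess; the helper names are made up for illustration:

```python
import mimetypes


def guessed_type(path):
    """Return the MIME type the standard library guesses from the extension."""
    return mimetypes.guess_type(path)[0]


def looks_like_html(path):
    """True only when the extension alone suggests an HTML document."""
    return guessed_type(path) == "text/html"
```

If this is what is happening, a cheap experiment is to copy axess1.dat to axess1.html and open that instead.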
hi john...
this is in regards to the web/parsing/factory/beautifulsoup
to reiterate, i have python 2.4, mechanize, browser, beautifulsoup installed.
i have the latest mech from svn.
i'm getting the same err as reported by john t. the code/err follows.. (i
can resend the test html if you need)
any thoughts/pointers/etc would be helpful...
thanks
-bruce
test code
#! /usr/bin/env python
# test python script
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser, RobustFactory
import mechanize
import BeautifulSoup
#
# Parsing App Information
# datafile
tfile = open("stanford.dat", 'w+')   # was 'wr+', which is not a portable mode string
cj = mechanize.CookieJar()
br = Browser()
if __name__ == "__main__":
    # main app
    #
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    br = Browser(factory=RobustFactory())
    fh = open('axess.dat')
    s = fh.read()
    fh.close()
    br.open("file:///home/test/axess.dat")
.
.
.
.
err/output
Traceback (most recent call last):
  File "./axess.py", line 45, in ?
    br.open("file:///home/test/axess.dat")
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 130, in open
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 170, in _mech_open
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 213, in set_response
  File "build/bdist.linux-i686/egg/mechanize/_html.py", line 577, in set_response
  File "build/bdist.linux-i686/egg/mechanize/_html.py", line 316, in __init__
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1326, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 973, in __init__
    self._feed()
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 987, in _feed
    smartQuotesTo=self.smartQuotesTo)
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1580, in __init__
    u = self._convertFrom(proposedEncoding)
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1614, in _convertFrom
    proposed = self.find_codec(proposed)
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1731, in find_codec
    return self._codec(self.CHARSET_ALIASES.get(charset, charset)) \
  File "/usr/lib/python2.4/site-packages/BeautifulSoup.py", line 1740, in _codec
    codecs.lookup(charset)
TypeError: lookup() argument 1 must be string, not bool
is this where i've seen references to integrating Beautifulsoup in the web
browsing app?
-bruce
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of John J Lee
Sent: Monday, July 1
RE: Don't use regular expressions to "validate" email addresses (was: I need some help with a regexp please)
so ben...
if you were creating a web app with an email form... rather than try to
check if the email is valid... you'd create something to allow anyone to
potentially spam the hell out of a system...
my two cents worth... try to verify/validate that the email is valid, and
possibly belongs to the user...
peace...
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Ben Finney
Sent: Thursday, September 21, 2006 6:07 PM
To: [email protected]
Subject: Don't use regular expressions to "validate" email addresses (was: I need some help with a regexp please)
"John Machin" <[EMAIL PROTECTED]> writes:
> A little more is unfortunately not enough. The best advice you got was
> to use an existing e-mail address validator. The definition of a valid
> e-mail address is complicated. You may care to check out "Mastering
> Regular Expressions" by Jeffrey Friedl. In the first edition, at least
> (I haven't looked at the 2nd), he works through assembling a 4700+ byte
> regex for validating e-mail addresses. Yes, that's 4KB. It's the best
> advertisement for *not* using regexes for a task like that that I've
> ever seen.
The best advice I've seen when people ask "How do I validate whether
an email address is valid?" was "Try sending mail to it".
It's both Pythonic, and truly the best way. If you actually want to
confirm, don't try to validate it statically; *use* the email address,
and check the result. Send an email to that address, and don't use it
any further unless you get a reply saying "yes, this is the right
address to use" from the recipient.
The sending system's mail transport agent, not regular expressions,
determines which part is the domain to send the mail to.
The domain name system, not regular expressions, determines what
domains are valid, and what host should receive mail for that domain.
Most especially, the receiving mail system, not regular expressions,
determines what local-parts are valid.
--
 \     "I believe in making the world safe for our children, but not |
  `\   our children's children, because I don't think children should |
_o__)  be having sex." -- Jack Handey |
Ben Finney
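The "send mail and wait for a reply" approach described in this thread is often implemented with a one-time token. The following is only a sketch under stated assumptions: the class name and its methods are hypothetical, the pending tokens live in memory, and actually sending the message (for example with smtplib) is left out.

```python
import secrets


class PendingConfirmations:
    """Track addresses that were mailed a token but have not confirmed yet."""

    def __init__(self):
        self._pending = {}       # token -> address awaiting confirmation
        self.confirmed = set()   # addresses whose owner replied with the token

    def request(self, address):
        """Create a one-time token; the caller mails it to the address."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = address
        return token

    def confirm(self, token):
        """Accept the address only if this exact token was issued and unused."""
        address = self._pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True
```

Nothing is sent to an address more than once until its owner confirms, which also speaks to the spam concern raised in the reply: an unconfirmed address never becomes usable by the application.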
RE: client/server design and advice
hi irmen...
happened to come across this post. haven't looked at pyro.
regarding your 'work packets' could these essentially be 'programs/apps'
that are requested by the client apps, and are then granted by the
dispatch/server app?
i'm considering condor (univ of wisconsin) but am curious as to if pyro
might also work.
i'm looking to create a small distributed crawling app for crawling/scraping
of targeted websites
thanks
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Irmen de Jong
Sent: Friday, December 01, 2006 10:05 AM
To: [email protected]
Subject: Re: client/server design and advice
TonyM wrote:
> Lastly, as far as the networking goes, i have seen posts and such about
> something called Pyro (http://pyro.sourceforge.net) and wondered if
> that was worth looking into for the client/server interaction.
I'm currently busy with a new version of Pyro (3.6) and it already
includes a new 'distributed computing' example, where there is
a single dispatcher service and one or more 'worker' clients.
The clients request work 'packets' from the dispatcher and
process them in parallel.
Maybe this is a good starting point of your system?
Current code is available from Pyro's CVS repository.
--Irmen
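To make the dispatcher/worker idea concrete, here is a small sketch that uses only the standard library. It is not Pyro's API and not the example shipped with Pyro: threads in one process stand in for the remote worker clients, and the function name run_dispatcher and its parameters are made up for illustration.

```python
import queue
import threading


def run_dispatcher(packets, worker_count, process):
    """Hand out work packets to workers and collect (packet, result) pairs."""
    todo = queue.Queue()
    for packet in packets:
        todo.put(packet)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                packet = todo.get_nowait()   # a worker asks for its next packet
            except queue.Empty:
                return                       # no work left, this worker stops
            outcome = process(packet)
            with lock:
                results.append((packet, outcome))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because process is any callable, a packet could just as well be a URL to crawl as a number to crunch, which is roughly the crawling use case asked about above.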
update attribute - (newbie)
>>> class A:
...     def __init__(self):
...         self.t = 4
...         self.p = self._get_p()
...     def _get_p(self):
...         return self.t
...
>>> a = A()
>>> a.p
4
>>> a.t += 7
>>> a.p
4
I would like to have it that when I ask for p, method _get_p is always
called so that attribute can be updated. How can I have this functionality
here?
thanks
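One common way to get this behaviour is the built-in property, sketched below. The class is written as A(object) so that it is a new-style class on Python 2 as well; p is computed on every access instead of being stored once in __init__.

```python
class A(object):
    def __init__(self):
        self.t = 4

    @property
    def p(self):
        # Runs on every access to a.p, so it always reflects the current t.
        return self.t


a = A()
first = a.p     # value of p while t is still 4
a.t += 7
second = a.p    # value of p after t has changed
```

Here first is 4 and second is 11, because no stale copy of t is kept in the instance.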
RE: BitKeeper for Python?
john
you might check out trac and subversion/svn... CVS is often used as well...
keep in mind that any app requires that you be willing to put in a certain
amount of time for the admin of the project tool.
if what you're creating is open source, you might as well go ahead and use
the sourceforge.net site/app...
-bruce
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jp Calderone
Sent: Sunday, May 01, 2005 1:26 PM
To: [email protected]
Subject: Re: BitKeeper for Python?
On Sun, 01 May 2005 20:16:40 GMT, John Smith <[EMAIL PROTECTED]> wrote:
>I am going to be working with some people on a project that is going to be
>done over the internet. I am looking for a good method of keeping everyone's
>code up to date and have everyone be able to access all the code including
>all the changes and be able to determine what parts of the code were
>changed.
>
>If it were opensource that would be even better.
>
Have you checked out this page?
http://www.google.com/search?hl=en&lr=&q=version+control+system&btnG=Search
Jp
