Re: FTP example going through a FTP Proxy
On Jan 7, 12:32 pm, jakecjacobson wrote:
> Hi,
>
> I need to write a simple Python script that I can connect to a FTP
> server and download files from the server to my local box. I am
> required to go through a FTP Proxy and I don't see any examples on how
> to do this. The FTP proxy doesn't require username or password to
> connect but the FTP server that I am connecting to does.
>
> Any examples on how to do this would be greatly appreciated. I am
> limited to using Python version 2.4.3 on a Linux box.

This is what I have tried so far:

import urllib

proxies = {'ftp':'ftp://proxy_server:21'}
ftp_server = 'ftp.somecompany.com'
ftp_port='21'
username = ''
password = 'secretPW'

ftp_string='ftp://' + username + '@' + password + ftp_server + ':' + ftp_port

data = urllib.urlopen(ftp_string, proxies=proxies)

data=urllib.urlopen(req).read()

print data

I get the following error:

Traceback (most recent call last):
  File "./ftptest.py", line 22, in ?
    data = urllib.urlopen(ftp_server, proxies=proxies)
  File "/usr/lib/python2.4/urllib.py", line 82, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.4/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.4/urllib.py", line 470, in open_ftp
    host, path = splithost(url)
  File "/usr/lib/python2.4/urllib.py", line 949, in splithost
    match = _hostprog.match(url)
TypeError: expected string or buffer
--
http://mail.python.org/mailman/listinfo/python-list
Re: FTP example going through a FTP Proxy
On Jan 7, 2:11 pm, jakecjacobson wrote:
> [snip]

I might be getting closer. Now I am getting an "I/O error(ftp error):
(111, 'Connection refused')" error with the following code:

import urllib2

proxies = {'ftp':'ftp://proxy_server:21'}
ftp_server = 'ftp.somecompany.com'
ftp_port='21'
username = ''
password = 'secretPW'

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
top_level_url = ftp_server
password_mgr.add_password(None, top_level_url, username, password)

proxy_support = urllib2.ProxyHandler(proxies)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(proxy_support)
opener = urllib2.build_opener(handler)
a_url = 'ftp://' + ftp_server + ':' + ftp_port + '/'
print a_url

try:
    data = opener.open(a_url)
    print data
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
FTP example going through a FTP Proxy
Hi,

I need to write a simple Python script that I can connect to a FTP
server and download files from the server to my local box. I am
required to go through a FTP Proxy and I don't see any examples on how
to do this. The FTP proxy doesn't require username or password to
connect but the FTP server that I am connecting to does.

Any examples on how to do this would be greatly appreciated. I am
limited to using Python version 2.4.3 on a Linux box.
Re: FTP example going through a FTP Proxy
On Jan 7, 3:56 pm, jakecjacobson wrote:
> [snip]

I tried the same code from a different box and got a different error
message:

I/O error(ftp error): 501 USER format: proxy-user:auth-met...@destination. Closing connection.

My guess is that my original box couldn't connect with the firewall
proxy so I was getting a connection refused error. Now it appears
that the password mgr has an issue if I understand the error
correctly. I really hope that someone out in the Python Community can
give me a pointer.
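
A convention that many FTP proxies accept is to connect to the proxy itself and log in with the user name written as user@destination-host, so the proxy knows where to forward the session. The sketch below shows that idea with ftplib rather than urllib2. It is only a guess at what this particular proxy wants: the 501 reply asks for a longer proxy-user:...@destination form whose exact layout is cut off in the message, so the login format and the helper names proxy_login_user and fetch_via_proxy are assumptions to adapt, not a confirmed recipe. The code is written for current Python; as far as I know the same ftplib calls are also present in 2.4.

```python
import ftplib


def proxy_login_user(username, ftp_server, ftp_port=21):
    """Build the USER string for a 'user@host' style FTP proxy (assumed format)."""
    target = ftp_server
    if int(ftp_port) != 21:
        # Only spell out the port when it is not the FTP default.
        target = "%s:%d" % (ftp_server, int(ftp_port))
    return "%s@%s" % (username, target)


def fetch_via_proxy(proxy_host, proxy_port, ftp_server, username, password,
                    remote_name, local_name):
    """Download one file through the proxy (hypothetical helper, untested
    against a real proxy)."""
    ftp = ftplib.FTP()
    # Connect to the proxy, not to the destination server.
    ftp.connect(proxy_host, int(proxy_port))
    try:
        ftp.login(proxy_login_user(username, ftp_server), password)
        out = open(local_name, "wb")
        try:
            ftp.retrbinary("RETR " + remote_name, out.write)
        finally:
            out.close()
    finally:
        ftp.close()
```

A call would look like fetch_via_proxy('proxy_server', 21, 'ftp.somecompany.com', 'myuser', 'secretPW', 'file.txt', 'file.txt'), where 'myuser' and the file names are placeholders.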
Getting/Setting HTTP Headers
I need to write a feed parser that takes a URL for any Atom or RSS feed
and transforms it into an Atom feed. I have done the transformation part
but I want to support conditional HTTP requests. I have not been able to
find any examples that show:

1. How to read the Last-Modified or ETag header value from the requester
2. How to set the corresponding HTTP header value, either a 304 Not
   Modified or the new Last-Modified date and/or ETag values
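
For what it is worth, a client normally sends its cached validators back in the If-None-Match and If-Modified-Since request headers, and the server compares them with the current ETag and Last-Modified values. The sketch below shows that comparison in isolation, written for current Python. The function name conditional_response, the plain dict used for the request headers, and the exact-match ETag comparison are assumptions for illustration; a real server framework would supply its own header access.

```python
from email.utils import formatdate, mktime_tz, parsedate_tz


def conditional_response(request_headers, etag, last_modified):
    """Return (status, headers) for a feed with the given validators.

    request_headers: dict of incoming header names to values (assumed shape).
    etag: current entity tag, including its quotes.
    last_modified: modification time as seconds since the epoch (UTC).
    """
    headers = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified, usegmt=True),
    }
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # An ETag sent by the client takes precedence over the date check.
        if if_none_match.strip() == etag:
            return 304, headers
        return 200, headers
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since:
        parsed = parsedate_tz(if_modified_since)
        # Unparseable dates are ignored and the full feed is served.
        if parsed is not None and int(last_modified) <= mktime_tz(parsed):
            return 304, headers
    return 200, headers
```

On a 304 the body is left empty; on a 200 the transformed Atom feed is sent together with the two validator headers.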
Processing XML File
I need to take an XML web resource and split it up into smaller XML
files. I am able to retrieve the web resource but I can't find any
good XML examples. I am just learning Python so forgive me if this
question has been answered many times in the past.

My resource is like:

... ... ... ...

So in this example, I would need to output 2 files, each containing
what is between one pair of open and close document tags.
Re: Processing XML File
On Jan 29, 1:04 pm, Adam Tauno Williams wrote:
> On Fri, 2010-01-29 at 09:25 -0800, jakecjacobson wrote:
> > I need to take a XML web resource and split it up into smaller XML
> > files. I am able to retrieve the web resource but I can't find any
> > good XML examples. I am just learning Python so forgive me if this
> > question has been answered many times in the past.
> > My resource is like:
> >
> > ...
> > ...
> >
> > ...
> > ...
> >
> > So in this example, I would need to output 2 files with the contents
> > of each file what is between the open and close document tag.
>
> Do you want to parse the document or SaX?
>
> I have a SaX example at
> <http://coils.hg.sourceforge.net/hgweb/coils/coils/file/99b227b08f7f/s...>

Thanks but I am way over my head with XML, Python. I am working with
DDMS and need to output the individual resource nodes to their own
file. I hope that this helps and I need a good example and how to use
it.

Here is what a resource node looks like:

https://metadata.dod.mil/mdr/ns/DDMS/1.4/ https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
xmlns:ddms="https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ICISM="urn:us:gov:ic:ism:v2">
https://metadata.dod.mil/mdr/ns/MDR/1.0/MDR.owl#GovernanceNamespace" ddms:value="TBD"/>
Sample Taxonomy
This is a sample taxonomy created for the Help page.
Sample Developer
FGM, Inc.
703-885-1000
[email protected]

You can see the DDMS site at https://metadata.dod.mil/.
Re: Processing XML File
On Jan 29, 2:41 pm, Stefan Behnel wrote:
> Sells, Fred, 29.01.2010 20:31:
>
> > Google is your friend. Elementtree is one of the better documented
> > IMHO, but there are many modules to do this.
>
> Unless the OP provides some more information, "do this" is rather
> underdefined. And sending someone off to Google who is just learning the
> basics of Python and XML and trying to solve a very specific problem with
> them is not exactly the spirit I'm used to in this newsgroup.
>
> Stefan
Just want to thank everyone for their posts. I got it working after I
discovered a name space issue with this code.

xmlDoc = libxml2.parseDoc(guts)

# Ignore namespace and just get the Resource
resourceNodes = xmlDoc.xpathEval('//*[local-name()="Resource"]')
for rNode in resourceNodes:
    print rNode
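
For readers without libxml2, roughly the same namespace-agnostic selection can be sketched with the standard library's ElementTree, which is the module suggested earlier in the thread. This is written for current Python and is not the code from the post: the helper names local_name and split_resources and the made-up urn:example:ddms namespace in the sample are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def split_resources(xml_text, wanted="Resource"):
    """Return each <Resource> element, in any namespace, as its own XML string."""
    root = ET.fromstring(xml_text)
    return [
        ET.tostring(elem, encoding="unicode")
        for elem in root.iter()
        if local_name(elem.tag) == wanted
    ]


# Tiny stand-in document; the namespace URI here is invented.
sample = (
    '<docs xmlns:ddms="urn:example:ddms">'
    "<ddms:Resource><ddms:title>One</ddms:title></ddms:Resource>"
    "<ddms:Resource><ddms:title>Two</ddms:title></ddms:Resource>"
    "</docs>"
)
parts = split_resources(sample)
for number, part in enumerate(parts):
    # Each string could be written to its own file, e.g. resource_0.xml.
    print(number, part)
```

Comparing only the local part of the tag plays the same role as local-name() in the XPath expression above.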
Authenticating to web service using https and client certificate
Hi,

I need to post some XML files to a web service that requires a client
certificate to authenticate. I have some code that works on posting a
multipart form over http but I need to modify it to pass the proper
certificate and post the XML file. Is there any example code that will
point me in the correct direction?

Thanks for your help.
exceptions.TypeError an integer is required
I am trying to do a post to a REST API over HTTPS that requires the
script to pass a cert to the server. I am getting an
"exceptions.TypeError an integer is required" error and can't find the
reason. By commenting out lines of code, I found that it is happening
on the connection.request() line. Here is the problem code. Would love
some help if possible.

head = {"Content-Type" : "application/x-www-form-urlencoded",
        "Accept" : "text/plain"}
parameters = urlencode({"collection" : collection, "entryxml" : open(file,'r').read()})
try:
    connection = httplib.HTTPSConnection(host, port, key_file, cert_file)
    connection.request('POST', path, parameters, head)
    response = connection.getresponse()
    print response.status, response.reason
except:
    print sys.exc_type, sys.exc_value

connection.close()
Re: exceptions.TypeError an integer is required
On Jul 24, 3:11 pm, Steven D'Aprano wrote:
> On Fri, 24 Jul 2009 11:24:58 -0700, jakecjacobson wrote:
> > I am trying to do a post to a REST API over HTTPS and requires the
> > script to pass a cert to the server. I am getting "exceptions.TypeError
> > an integer is required" error and can't find the reason. I commenting
> > out the lines of code, it is happening on the connection.request() line.
> > Here is the problem code. Would love some help if possible.
>
> Please post the traceback that you get.
>
> My guess is that you are passing a string instead of an integer, probably
> for the port.
>
> [...]
>
> > except:
> >     print sys.exc_type, sys.exc_value
>
> As a general rule, a bare except of that fashion is bad practice. Unless
> you can explain why it is normally bad practice, *and* why your case is
> an exception (no pun intended) to the rule "never use bare except
> clauses", I suggest you either:
>
> * replace "except:" with "except Exception:" instead.
>
> * better still, re-write the entire try block as:
>
> try:
>     [code goes here]
> finally:
>     connection.close()
>
> and use the Python error-reporting mechanism instead of defeating it.
>
> --
> Steven

Steven,

You are quite correct in your statements. My goal was not to make great
code but something that I could quickly test. My assumption was that
the httplib.HTTPSConnection() would do the cast to int for me. As soon
as I cast it to an int, I was able to get past that issue. Still not
able to post because I am getting a bad cert error.

Jake Jacobson
bad certificate error
Hi,
I am getting the following error when doing a post to REST API,
Enter PEM pass phrase:
Traceback (most recent call last):
  File "./ices_catalog_feeder.py", line 193, in ?
    main(sys.argv[1])
  File "./ices_catalog_feeder.py", line 60, in main
    post2Catalog(catalog_host, catalog_port, catalog_path, os.path.join(input_dir, file), collection_name, key_file, cert_file)
  File "./ices_catalog_feeder.py", line 125, in post2Catalog
    connection.request('POST', path, parameters, head)
  File "/usr/lib/python2.4/httplib.py", line 810, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.4/httplib.py", line 833, in _send_request
    self.endheaders()
  File "/usr/lib/python2.4/httplib.py", line 804, in endheaders
    self._send_output()
  File "/usr/lib/python2.4/httplib.py", line 685, in _send_output
    self.send(msg)
  File "/usr/lib/python2.4/httplib.py", line 652, in send
    self.connect()
  File "/usr/lib/python2.4/httplib.py", line 1079, in connect
    ssl = socket.ssl(sock, self.key_file, self.cert_file)
  File "/usr/lib/python2.4/socket.py", line 74, in ssl
    return _realssl(sock, keyfile, certfile)
socket.sslerror: (1, 'error:14094412:SSL routines:SSL3_READ_BYTES:sslv3 alert bad certificate')

My code where this error occurs is:

head = {"Content-Type" : "application/x-www-form-urlencoded",
        "Accept" : "text/plain"}
parameters = urlencode({"collection" : collection, "entryxml" : open(file,'r').read()})
print "Sending the file to: " + host

try:
    try:
        # Default port is 443.
        # key_file is the name of a PEM formatted file that contains your
        # private key.
        # cert_file is a PEM formatted certificate chain file.
        connection = httplib.HTTPSConnection(host, int(port), key_file, cert_file)
        connection.request('POST', path, parameters, head)
        response = connection.getresponse()
        print response.status, response.reason
    except httplib.error, (value,message):
        print value + ':' + message
finally:
    connection.close()

I was wondering if this is due to the server having an invalid server
cert? If I go to this server in my browser, I get a "This server
tried to identify itself with invalid information". Is there a way to
ignore this issue with Python? Can I set up a trust store and add this
server to the trust store?
Re: bad certificate error
On Jul 27, 2:23 pm, "Gabriel Genellina" wrote:
> En Mon, 27 Jul 2009 12:57:40 -0300, jakecjacobson escribió:
>
> > I was wondering if this is due to the server having a invalid server
> > cert? If I go to this server in my browser, I get a "This server
> > tried to identify itself with invalid information". Is there a way to
> > ignore this issue with Python? Can I setup a trust store and add this
> > server to the trust store?
>
> I don't see the point in trusting someone that you know is telling lies
> about itself.
>
> --
> Gabriel Genellina

It is a test box that the team I am on runs. That is why I would trust
it.
Re: bad certificate error
On Jul 28, 3:29 am, Nick Craig-Wood wrote:
> jakecjacobson wrote:
> > [snip]
>
> > I was wondering if this is due to the server having a invalid server
> > cert?
>
> I'd say judging from the traceback you messed up key_file or cert_file
> somehow.
>
> Try using the openssl binary on them (read the man page to see how!)
> to check them out.
>
> > If I go to this server in my browser, I get a "This server tried to
> > identify itself with invalid information". Is there a way to
> > ignore this issue with Python? Can I setup a trust store and add
> > this server to the trust store?
>
> Invalid how? Self signed certificate? Domain mismatch? Expired certificate?
>
> --
> Nick Craig-Wood --http://www.craig-wood.com/nick
Nick,
Thanks for the help on this. I will check my steps on openssl again
and see if I messed up. What I tried to do was:
1. Save my PKI cert to disk. It was saved as a P12 file
2. Use openssl to convert it to the needed .pem file type
3. Saved the CA that my cert was signed by as a .crt file

These are the 2 files that I was using for key_file and cert_file:
* cert_file -> CA
* key_file -> my PKI cert converted to a .pem file

"Invalid how? Self signed certificate? Domain mismatch? Expired
certificate?" It is a server name mismatch.

For everyone that wants to discuss why we shouldn't do this, great, but
I can't change the fact that I need to do this. I can't use http or
even get a correct cert at this time. This is a quick and dirty project
to demonstrate capability. I need something more than slide show
briefs.
Re: bad certificate error
On Jul 28, 9:48 am, Jean-Paul Calderone wrote:
> On Tue, 28 Jul 2009 03:35:55 -0700 (PDT), jakecjacobson wrote:
> > [snip]
>
> >"Invalid how? Self signed certificate? Domain mismatch? Expired
> >certificate?" It is a server name mismatch.
>
> Python 2.4 is not capable of allowing you to customize this verification
> behavior. It is hard coded to let OpenSSL make the decision about whether
> to accept the certificate or not.
>
> Either M2Crypto or pyOpenSSL will let you ignore verification errors. The
> new ssl module in Python 2.6 may also as well.
>
> Jean-Paul

Thanks, I will look into these suggestions.
Re: bad certificate error
On Jul 29, 2:08 am, "Gabriel Genellina" wrote:
> En Tue, 28 Jul 2009 09:02:40 -0300, Steven D'Aprano escribió:
>
> > On Mon, 27 Jul 2009 23:16:39 -0300, Gabriel Genellina wrote:
>
> >> I don't see the point on "fixing" either the Python script or httplib to
> >> accommodate for an invalid server certificate... If it's just for
> >> internal testing, I'd use HTTP instead (at least until the certificate
> >> is fixed).
>
> > In real life, sometimes you need to drive with bad brakes on your car,
> > walk down dark alleys in the bad part of town, climb a tree without a
> > safety line, and use a hammer without wearing goggles. We can do all
> > these things.
>
> > The OP has said that, for whatever reason, he needs to ignore a bad
> > server certificate when connecting to HTTPS. Python is a language where
> > developers are allowed to shoot themselves in the foot, so long as they
> > do so in full knowledge of what they're doing.
>
> > So, putting aside all the millions of reasons why the OP shouldn't accept
> > an invalid certificate, how can he accept an invalid certificate?
>
> Yes, I understand the situation, but I'm afraid there is no way (that I
> know of). At least not without patching _ssl.c; all the SSL negotiation is
> handled by the OpenSSL library itself.
>
> I vaguely remember a pure Python SSL implementation somewhere that perhaps
> could be hacked to bypass all controls. But making it work properly will
> probably require a lot more effort than installing a self signed
> certificate in the server...
>
> --
> Gabriel Genellina

I have it working and I want to thank everyone for their efforts and
very helpful hints. The error was with me and not understanding the
documentation about the cert_file & key_file. After using openssl to
divide up my p12 file into a cert file and a key file, following the
instructions at
http://security.ncsa.uiuc.edu/research/grid-howtos/usefulopenssl.php,
I got everything working. Again, much thanks.

Jake
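
For later readers on a current Python, the two lessons from this thread (pass the port as an integer, and keep the client certificate and its private key in separate PEM files) can be sketched as below. This is not the 2.4 code from the post, where key_file and cert_file are passed positionally to httplib.HTTPSConnection. The helper name make_connection and its check_hostname switch are assumptions for illustration; switching off the host name check is only reasonable for a test box you control.

```python
import http.client
import ssl


def make_connection(host, port, cert_file=None, key_file=None,
                    check_hostname=True):
    """Build an HTTPSConnection that can present a client certificate.

    cert_file: PEM file with the client certificate (chain).
    key_file: PEM file with the matching private key.
    """
    context = ssl.create_default_context()
    if not check_hostname:
        # Skips the server-name comparison; the chain is still verified.
        context.check_hostname = False
    if cert_file is not None:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # A port read from a configuration file is a string, so cast it here.
    return http.client.HTTPSConnection(host, int(port), context=context)
```

No network traffic happens until request() is called on the returned object, so the connection can be built once and reused for several posts.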
Help making this script better
Hi,
After much Google searching and trial & error, I was able to write a
Python script that posts XML files to a REST API using HTTPS and
passing PEM cert & key files. It seems to be working but I would like
some pointers on how to handle errors. I am using Python 2.4; I don't
have the capability to upgrade even though I would like to. I am very
new to Python so help will be greatly appreciated and I hope others
can use this script.

#!/usr/bin/python
#
# catalog_feeder.py
#
# This script will process a directory of XML files and push them to
# the Enterprise Catalog.
# You configure this script by using a configuration file that
# describes the required variables.
# The path to this file is either passed into the script as a command
# line argument or hard coded in the script. The script will terminate
# with an error if it can't process the XML file.
#

# IMPORT STATEMENTS
import httplib
import mimetypes
import os
import sys
import shutil
import time
from urllib import *
from time import strftime
from xml.dom import minidom

def main(c):
    start_time = time.time()
    # Set configuration parameters
    try:
        # Process the XML conf file
        xmldoc = minidom.parse(c)
        catalog_host = readConfFile(xmldoc, 'catalog_host')
        catalog_port = int(readConfFile(xmldoc, 'catalog_port'))
        catalog_path = readConfFile(xmldoc, 'catalog_path')
        collection_name = readConfFile(xmldoc, 'collection_name')
        cert_file = readConfFile(xmldoc, 'cert_file')
        key_file = readConfFile(xmldoc, 'key_file')
        log_file = readConfFile(xmldoc, 'log_file')
        input_dir = readConfFile(xmldoc, 'input_dir')
        archive_dir = readConfFile(xmldoc, 'archive_dir')
        hold_dir = readConfFile(xmldoc, 'hold_dir')
    except Exception, inst:
        # I had an error so report it and exit script
        print "Unexpected error opening %s: %s" % (c, inst)
        sys.exit(1)
    # Log Starting
    logOut = verifyLogging(log_file)
    if logOut:
        log(logOut, "Processing Started ...")
    # Get list of XML files to process
    if os.path.exists(input_dir):
        files = getFiles2Post(input_dir)
    else:
        if logOut:
            log(logOut, "WARNING!!! Couldn't find input directory: " + input_dir)
            cleanup(logOut)
        else:
            print "Dir doesn't exist: " + input_dir
            sys.exit(1)
    try:
        # Process each file to the catalog
        connection = httplib.HTTPSConnection(catalog_host, catalog_port, key_file, cert_file)
        for file in files:
            log(logOut, "Processing " + file + " ...")
            try:
                response = post2Catalog(connection, catalog_path, os.path.join(input_dir, file), collection_name)
                if response.status == 200:
                    msg = "Successfully posted " + file + " to catalog ..."
                    print msg
                    log(logOut, msg)
                    # Move file to done directory
                    shutil.move(os.path.join(input_dir, file), os.path.join(archive_dir, file))
                else:
                    msg = "Error posting " + file + " to catalog [" + response.read() + "] ..."
                    print msg
                    # The response body can only be read once, so log the
                    # message built above instead of calling read() again
                    log(logOut, msg)
                    # Move file to error dir
                    shutil.move(os.path.join(input_dir, file), os.path.join(hold_dir, file))
            except IOError, (errno):
                print "%s" % (errno)
    except httplib.HTTPException, (e):
        print "Unexpected error %s " % (e)

    run_time = time.time() - start_time
    print 'Run time: %f seconds' % run_time

    # Clean up
    connection.close()
    cleanup(logOut)

# Get an array of files from the input_dir
def getFiles2Post(d):
    return (os.listdir(d))

# Read out the conf file and set the needed global variable
def readConfFile(xmldoc, tag):
    return (xmldoc.getElementsByTagName(tag)[0].firstChild.data)

# Write out the message to log file
def log(f, m):
    f.write(strftime("%Y-%m-%d %H:%M:%S") + " : " + m + '\n')

# Clean up and exit
def cleanup(logOut):
    if logOut:
        log(logOut, "Proce
How to unencode a string
This seems like a real simple newbie question but how can a person
unencode a string? In Perl I use something like:
"$part=~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;"

If I have a string like Word1%20Word2%20Word3 I want to get Word1
Word2 Word3. Would also like to handle special characters like
'",(){}[] etc.
Re: How to unencode a string
On Aug 27, 6:51 pm, Piet van Oostrum wrote:
> >>>>> jakecjacobson (j) wrote:
> >j> This seems like a real simple newbie question but how can a person
> >j> unencode a string? In Perl I use something like: "$part=~ s/\%([A-Fa-
> >j> f0-9]{2})/pack('C', hex($1))/seg;"
> >j> If I have a string like Word1%20Word2%20Word3 I want to get Word1
> >j> Word2 Word3.
>
> urllib.unquote(string)
>
> >j> Would also like to handle special characters like '",(){}
> >j> [] etc/
>
> What would you like to do with them? Or do you mean to replace %27 by ' etc?
> --
> Piet van Oostrum
> URL:http://pietvanoostrum.com[PGP 8DAE142BE17999C4]
> Private email: [email protected]
Yes, take '%27' and replace with ', etc.
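
A quick check of the unquote call suggested above, showing that it also decodes the punctuation from the question. On Python 2 (including 2.4) the function is urllib.unquote; on Python 3 it lives in urllib.parse, which is why the import below tries both. The sample strings are made up from the characters listed in the original post.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2, including 2.4

# %20 becomes a space.
print(unquote("Word1%20Word2%20Word3"))

# The same call handles quotes, commas, parentheses, braces and brackets.
specials = unquote("%27%22%2C%28%29%7B%7D%5B%5D")
print(specials)
```

Note that unquote leaves a plus sign alone; unquote_plus is the variant that also turns + into a space, as used in form data.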
How to Convert IO Stream to XML Document
I am trying to build a Python script that reads a Sitemap file and
pushes the URLs to a Google Search Appliance. I am able to fetch the
XML document and parse it with regular expressions but I want to move
to using native XML tools to do this. The problem is that if I use
urllib.urlopen(url) I can convert the IO Stream to an XML document,
but if I use urllib2.urlopen and then read the response, I get the
content but minidom.parse() raises an "IOError: [Errno 2] No such
file or directory:" error.

# THIS WORKS but will have issues if the IO Stream is a compressed file
def GetPageGuts(net, url):
    pageguts = urllib.urlopen(url)
    xmldoc = minidom.parse(pageguts)
    return xmldoc

# THIS DOESN'T WORK, but I don't understand why
def GetPageGuts(net, url):
    request=getRequest_obj(net, url)
    response = urllib2.urlopen(request)
    response.headers.items()
    pageguts = response.read()
    # Test to see if the response is a gzip/compressed data stream
    if isCompressedFile(response, url):
        compressedstream = StringIO.StringIO(pageguts)
        gzipper = gzip.GzipFile(fileobj = compressedstream)
        pageguts = gzipper.read()
    xmldoc = minidom.parse(pageguts)
    response.close()
    return xmldoc

# I am getting the following error
Starting SiteMap Manager ...
Traceback (most recent call last):
  File "./tester.py", line 267, in ?
    main()
  File "./tester.py", line 49, in main
    fetchSiteMap(ResourceDict, line)
  File "./tester.py", line 65, in fetchSiteMap
    pageguts = GetPageGuts(ResourceDict['NET'], url)
  File "./tester.py", line 89, in GetPageGuts
    xmldoc = minidom.parse(pageguts)
  File "/usr/lib/python2.4/xml/dom/minidom.py", line 1915, in parse
    return expatbuilder.parse(file)
  File "/usr/lib/python2.4/xml/dom/expatbuilder.py", line 922, in parse
    fp = open(file, 'rb')
IOError: [Errno 2] No such file or directory: '\nhttp://www.sitemaps.org/
schemas/sitemap/0.9">\n\nhttp://www.myorg.org/janes/
sitemaps/binder_sitemap.xml\n2010-09-09\n\n\nhttp://www.myorg.org/janes/sitemaps/
dir_sitemap.xml\n2010-05-05\n
\n\nhttp://www.myorg.org/janes/sitemaps/
mags_sitemap.xml\n2010-09-09\n
\n\nhttp://www.myorg.org/janes/sitemaps/
news_sitemap.xml\n2010-09-09\n
\n\nhttp://www.myorg.org/janes/sitemaps/
sent_sitemap.xml\n2010-09-09\n
\n\nhttp://www.myorg.org/janes/sitemaps/
srep_sitemap.xml\n2001-05-04\n
\n\nhttp://www.myorg.org/janes/sitemaps/yb_sitemap.xml\n2010-09-09\n\n\n'

# A couple of supporting things
def getRequest_obj(net, url):
    request = urllib2.Request(url)
    request.add_header('User-Agent', 'ICES Sitemap Bot dni-ices-[email protected]')
    request.add_header('Accept-encoding', 'gzip')
    return request

def isCompressedFile(r, u):
    answer=False
    if r.headers.has_key('Content-encoding'):
        answer=True
    else:
        # Check to see if the URL ends in .gz
        if u.endswith(".gz"):
            answer=True
    return answer
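
The traceback gives a strong hint about the cause: minidom.parse() expects a file name or an open file object, so when it is handed the text returned by response.read() it tries to open a file with that name (the fp = open(file, 'rb') frame). The first version works because urllib.urlopen() returns a file-like object. A likely fix is minidom.parseString(), which takes the document itself. The sketch below shows that with current Python; the helper name parse_page_guts, the sample sitemap, and the choice to detect gzip by its two magic bytes rather than by headers are assumptions for illustration.

```python
import gzip
import io
from xml.dom import minidom

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def parse_page_guts(raw_bytes):
    """Parse XML held in memory, gunzipping first when the data looks compressed."""
    if raw_bytes[:2] == GZIP_MAGIC:
        raw_bytes = gzip.GzipFile(fileobj=io.BytesIO(raw_bytes)).read()
    # parseString takes the document text; parse takes a file name or file object.
    return minidom.parseString(raw_bytes)


# Made-up sitemap index standing in for the real response body.
sitemap = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<sitemapindex><sitemap><loc>http://www.example.org/a.xml</loc>"
    b"</sitemap></sitemapindex>"
)
for payload in (sitemap, gzip.compress(sitemap)):
    doc = parse_page_guts(payload)
    print(doc.getElementsByTagName("loc")[0].firstChild.data)
```

In the Python 2.4 code above, the equivalent change would be to replace minidom.parse(pageguts) with minidom.parseString(pageguts) in the second GetPageGuts, keeping the StringIO and GzipFile steps as they are.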
