Re: sys.path in python3.3
On Mon, Aug 27, 2012 at 12:18 AM, Ned Deily wrote: > In article > , > Nicholas Cole wrote: >> It certainly does exist. Distutils will happily put packages into it, >> but import won't find them. > > That's odd! It works for me on 10.8 and it worked for me yesterday on > 10.7 which I tested just after completing the python.org installer > builds. Perhaps there is some permission issue. Or the path name isn't > quite correct. Or you have some PYTHON* environment variable set, like > PYTHONNOUSERSITE? I'm also on 10.8. NPSC: nicholas$ set | grep PYTHON NPSC: nicholas$ The only user configuration I've done is to create the following configuration file: NPSC:~ nicholas$ cat .pydistutils.cfg [install] install_lib = ~/Library/Python/$py_version_short/site-packages install_scripts = ~/bin I should say, this has been a problem for all of the python3.3 alpha and beta releases, on previous releases of OS X. I can't understand why it works on your setup, though, because I haven't done anything at all (that I can think of) that ought to affect it. I wonder if the logic that adds the directory to sys.path is being too clever for everyone's good? Best wishes, N. -- http://mail.python.org/mailman/listinfo/python-list
Re: Calling External (Perl)Script in Python
Pervez Mulla writes: > I am trying to call perl script in my python view.py and store that > data in logfile To run external programs and connect to their standard streams, use the ‘subprocess’ module <http://docs.python.org/library/subprocess.html> from the Python standard library. -- \ “If we don't believe in freedom of expression for people we | `\ despise, we don't believe in it at all.” —Noam Chomsky, | _o__) 1992-11-25 | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: sys.path in python3.3
On 26/08/12 20:47:34, Nicholas Cole wrote: > Dear List, > > In all previous versions of python, I've been able to install packages > into the path: > > ~/Library/Python/$py_version_short/site-packages > > but in the rc builds of python 3.3 this is no longer part of sys.path. It has been changed to ~/Library/Python/$py_version_short/lib/python/site-packages You can find the path it's looking for in site.USER_SITE > Before I go hacking the install, is there a reason that this path was > removed? Is there a recommended way to get it back, or is this a > gentle way of pushing us all to use virtualenv rather than installing > user-specific packages? I don't know why it was changed. It would be nice if there were some magic code you could use for "install_lib" in your .pydistutils.cfg file that worked in both 3.2 and 3.3. -- HansM -- http://mail.python.org/mailman/listinfo/python-list
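As a quick check (a minimal sketch; the exact path differs per platform, Python version, and build), you can ask the interpreter itself which per-user directory it searches:

```python
import site
import sys

# Show which Python this is and which per-user site directory it searches.
print(sys.version_info[:2])
print(site.USER_SITE)         # the per-user site-packages path site adds to sys.path
print(site.ENABLE_USER_SITE)  # False/None if PYTHONNOUSERSITE or -s disabled it
```

Comparing this output between interpreters makes it obvious when two versions disagree about where user packages should live.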
Re: sys.path in python3.3
In article , Nicholas Cole wrote: > The only user configuration I've done is to create the following > configuration file: > > NPSC:~ nicholas$ cat .pydistutils.cfg > [install] > install_lib = ~/Library/Python/$py_version_short/site-packages > install_scripts = ~/bin > > I should say, this has been a problem for all of the python3.3 alpha > and beta releases, on previous releases of OS X. > > I can't understand why it works on your setup, though, because I > haven't done anything at all (that I can think of) that ought of > affect it. I wonder if the logic that adds the directory to sys.path > is being too clever for everyone's good? Ah, now I know what the problem is. If you look carefully, you'll see that the path created in the Distribute example, effectively by using setup.py --install, is slightly different from the install_lib path in your .pydistutils.cfg. What happened is that Python 2.7 and Python 3.2 introduced a change in the user site path for OS X framework builds. The rationale was to make the ~/Library/Python user site directory look more like the existing framework structure. This change was somewhat controversial and has not yet been resolved to everyone's satisfaction. (http://bugs.python.org/issue8084) A contributing factor is that feature changes to Distutils have been on hold with the planned introduction of its replacement, packaging. However, late in the 3.3 release cycle, it was decided that packaging wasn't quite ready for release and we never got around to taking another look at other pending Distutils-related issues, like this one. OTOH, this has now been the behavior for the lifetime of 2.7 and 3.2 and now for 3.3 and it is documented if you know where to look: http://docs.python.org/py3k/library/site.html It is not likely that, at this point, it will be changed back to 2.6/3.1 behavior nor to the exact original PEP-370 proposal either. 
So, for the current Python releases (2.7, 3.2, 3.3), one solution is to change the install_lib definition in .pydistutils.cfg to: install_lib = ~/Library/Python/$py_version_short/lib/python/site-packages Or just use --user on setup.py install commands. I'm sorry that I don't have a better story for now. That said, I think that installation is likely to be a major focus for Python 3.4 and that there will be a push to rationalize things in this rather murky area of Python. -- Ned Deily, [email protected] -- http://mail.python.org/mailman/listinfo/python-list
Re: sys.path in python3.3
In article <[email protected]>, Hans Mulder wrote: > On 26/08/12 20:47:34, Nicholas Cole wrote: > It has been changed to > > ~/Library/Python/$py_version_short/lib/python/site-packages > > You can find the path it's looking for in site.USER_SITE That is correct. > It would be nice if there were some magic code you could use > for "install_lib" in your .pydistutils.cfg file that worked > in both 3.2 and 3.3. As I explained in my reply that overlapped with yours, I believe you will find that 3.2 and 3.3 behave the same. The difference is with 3.1 (which is no longer supported); likewise, the same change occurred in 2.7. So all of the current actively supported released behave the same way. That's not much help if you need to use 2.6 or 3.1. -- Ned Deily, [email protected] -- http://mail.python.org/mailman/listinfo/python-list
Re: sys.path in python3.3
On Mon, Aug 27, 2012 at 10:05 AM, Ned Deily wrote: > In article <[email protected]>, > Hans Mulder wrote: >> On 26/08/12 20:47:34, Nicholas Cole wrote: >> It has been changed to >> >> ~/Library/Python/$py_version_short/lib/python/site-packages >> >> You can find the path it's looking for in site.USER_SITE > > That is correct. > >> It would be nice if there were some magic code you could use >> for "install_lib" in your .pydistutils.cfg file that worked >> in both 3.2 and 3.3. > > As I explained in my reply that overlapped with yours, I believe you > will find that 3.2 and 3.3 behave the same. The difference is with 3.1 > (which is no longer supported); likewise, the same change occurred in > 2.7. So all of the currently actively supported releases behave the same > way. That's not much help if you need to use 2.6 or 3.1. Dear Hans and Ned, Thank you both for your answers, and Ned, thank you especially for answering at such length. I do love the things that are "documented if you know where to look." I'm by no means a stranger to python's documentation, but I am sure that I would never, ever have found it. Now that I come to think of it, I think I probably hit this when I first had a look at 2.7, put in a sym-link and forgot all about it. Now that 2.7 is the default on OS X, I suppose it is time to move to the correct directory properly. Very best wishes, Nicholas -- http://mail.python.org/mailman/listinfo/python-list
Extract Text Format Table Data
Hi, I am trying to extract some data from a log file that outputs tables in text format. I've been trying to accomplish this for some time but without any luck. Hence, I'd appreciate it if any of you could help out. Below is just one block of a table from the file. There will be many blocks like this in the log file. ROUTES TRAFFIC RESULTS, LSR TRG MP DATE TIME 37 17 120824 R TRAFF NBIDS CCONG NDV ANBLO MHTIME NBANSW AABBCCO 6.4 204 0.0 1151.0113.4 144 AABBCCI 3.0 293 1151.0 37.0 171 DDEEFFO 0.2 5 0.0590.0107.6 3 EEFFEEI 0.0 0590.0 0.0 0 HHGGFFO 0.0 0 0.0300.0 0.0 0 HHGGFFI 0.3 15300.0 62.2 4 END Thanks -- http://mail.python.org/mailman/listinfo/python-list
Extract Text Table From File
Hi, I am trying to extract some text table data from a log file. I am trying different methods, but I don't seem to get anything to work. I am kind of new to python as well. Hence, I'd appreciate it if someone could help me out. Below is just ONE block of the traffic I have in the log files. There will be more in them with different data. ROUTES TRAFFIC RESULTS, LSR TRG MP DATE TIME 37 17 120824 R TRAFF NBIDS CCONG NDV ANBLO MHTIME NBANSW AABBCCO 6.4 204 0.0 1151.0113.4 144 AABBCCI 3.0 293 1151.0 37.0 171 DDEEFFO 0.2 5 0.0590.0107.6 3 EEFFEEI 0.0 0590.0 0.0 0 HHGGFFO 0.0 0 0.0300.0 0.0 0 HHGGFFI 0.3 15300.0 62.2 4 END Thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: Extract Text Table From File
On 2012-08-27 11:53, Huso wrote:
Hi,
I am trying to extract some text table data from a log file. I am trying
different methods, but I don't seem to get anything to work. I am kind of new
to python as well. Hence, appreciate if someone could help me out.
#
# Write test data to test.txt
#
data = """
ROUTES TRAFFIC RESULTS, LSR
TRG MP DATE TIME
37 17 120824
R TRAFF NBIDS CCONG NDV ANBLO MHTIME NBANSW
AABBCCO 6.4 204 0.0 1151.0113.4 144
AABBCCI 3.0 293 1151.0 37.0 171
DDEEFFO 0.2 5 0.0590.0107.6 3
EEFFEEI 0.0 0590.0 0.0 0
HHGGFFO 0.0 0 0.0300.0 0.0 0
HHGGFFI 0.3 15300.0 62.2 4
END
"""
fout = open("test.txt","wb+")
fout.write(data)
fout.close()
#
#
# This is how you iterate over a file and process its lines
#
fin = open("test.txt","r")
for line in fin:
    # This is one possible way to extract values.
    values = line.strip().split()
    print values
This will print:
[]
['ROUTES', 'TRAFFIC', 'RESULTS,', 'LSR']
['TRG', 'MP', 'DATE', 'TIME']
['37', '17', '120824', '']
[]
['R', 'TRAFF', 'NBIDS', 'CCONG', 'NDV', 'ANBLO', 'MHTIME', 'NBANSW']
['AABBCCO', '6.4', '204', '0.0', '115', '1.0', '113.4', '144']
['AABBCCI', '3.0', '293', '115', '1.0', '37.0', '171']
['DDEEFFO', '0.2', '5', '0.0', '59', '0.0', '107.6', '3']
['EEFFEEI', '0.0', '0', '59', '0.0', '0.0', '0']
['HHGGFFO', '0.0', '0', '0.0', '30', '0.0', '0.0', '0']
['HHGGFFI', '0.3', '15', '30', '0.0', '62.2', '4']
['END']
The "values" list in the last line contains these values. This will work
only if you don't have spaces in your values. Otherwise you can use
regular expressions to parse a line. See here:
http://docs.python.org/library/re.html
Since you did not give any specification of your file format, it would
be hard to give a concrete program that parses your file(s).
Best,
Laszlo
--
http://mail.python.org/mailman/listinfo/python-list
Re: Extract Text Table From File
On Monday, August 27, 2012 3:12:14 PM UTC+5, Laszlo Nagy wrote:
> [snipped: Laszlo's test-data and line-splitting example, quoted in full]
Hi,
Thank you for the information.
The exact way I want to extract the data is like as below.
TRG, MP and DATE and TIME is common for that certain block of traffic.
So I am using those and dumping it with the rest of the data into sql.
Table will have all headers (TRG, MP, DATE, TIME, R, TRAFF, NBIDS, CCONG, NDV,
ANBLO, MHTIME, NBANSW).
So from this text, the first data will be 37, 17, 120824, , AABBCCO, 6.4,
204, 0.0, 115, 1.0, 113.4, 144.
Thanking,
Huso
--
http://mail.python.org/mailman/listinfo/python-list
What do I do to read html files on my pc?
Hallo, I have an html file on my pc and I want to read it to extract some text. Can you help on which libs I have to use and how can I do it? thank you so much. Michele -- http://mail.python.org/mailman/listinfo/python-list
Your favorite test tool and automation frameworks
Hello everybody, I would like to ask about your favorite python test frameworks. I never used it before (beginner in testing) and would like to start to learn Unit- and GUI-testing. I look now at PyUnit/unittest and dogtail. Maybe someone can recommend something better or just share experiences? Thank you, Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Extract Text Table From File
Hi,
Thank you for the information.
The exact way I want to extract the data is like as below.
TRG, MP and DATE and TIME is common for that certain block of traffic.
So I am using those and dumping it with the rest of the data into sql.
Table will have all headers (TRG, MP, DATE, TIME, R, TRAFF, NBIDS, CCONG, NDV,
ANBLO, MHTIME, NBANSW).
So from this text, the first data will be 37, 17, 120824, , AABBCCO, 6.4,
204, 0.0, 115, 1.0, 113.4, 144.
How many blocks do you have in a file? Do you want to create different
data sets for those blocks? How do you identify those blocks? (E.g. are
they all saved into the same database table the same way?)
Anyway here is something:
import re
# AABBCCO 6.4 204 0.0 1151.0113.4 144
pattern = re.compile(r"""([A-Z]{7})"""+7*r"""\s+([\d\.]+)""")
#
# This is how you iterate over a file and process its lines
#
fin = open("test.txt","r")
blocks = []
block = None
for line in fin:
    # This is one possible way to extract values.
    values = line.strip().split()
    if values==['R', 'TRAFF', 'NBIDS', 'CCONG', 'NDV', 'ANBLO',
                'MHTIME', 'NBANSW']:
        if block is not None:
            blocks.append(block)
        block = []
    elif block is not None:
        res = pattern.match(line.strip())
        if res:
            values = list(res.groups())
            values[1:] = map(float,values[1:])
            block.append(values)
if block is not None:
    blocks.append(block)
for idx,block in enumerate(blocks):
    print "BLOCK",idx
    for values in block:
        print values
This prints:
BLOCK 0
['AABBCCO', 6.4, 204.0, 0.0, 115.0, 1.0, 113.4, 144.0]
['DDEEFFO', 0.2, 5.0, 0.0, 59.0, 0.0, 107.6, 3.0]
['HHGGFFO', 0.0, 0.0, 0.0, 30.0, 0.0, 0.0, 0.0]
--
http://mail.python.org/mailman/listinfo/python-list
Re: Extract Text Table From File
Hi, There can be any number of blocks in the log file. I distinguish the blocks by the start header 'ROUTES TRAFFIC RESULTS, LSR' and the ending 'END'. Each block will have a unique [date + time] value. I tried the code you mentioned, it works for the data part. But I need to get the TRG, MP, DATE and TIME for the block with those data as well. This is the part that I'm really tangled in. Thanking, Huso -- http://mail.python.org/mailman/listinfo/python-list
Re: Your favorite test tool and automation frameworks
On 27/08/2012 12:04, Alex Naumov wrote: Hello everybody, I would like to ask about your favorite python test frameworks. I never used it before (beginner in testing) and would like to start to learn Unit- and GUI-testing. I look now at PyUnit/unittest and dogtail. Maybe someone can recommend something better or just share experiences? Thank you, Alex I never test my own code as it's always perfect first time :) For those who lack my talents here's a good starting point http://wiki.python.org/moin/PythonTestingToolsTaxonomy -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
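For the unittest side of the question, the smallest useful starting point looks something like this (a sketch; `add` is just a stand-in for your own code under test):

```python
import unittest

def add(a, b):
    """Stand-in for the code under test."""
    return a + b

class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)

    def test_add_negative(self):
        self.assertEqual(add(-1, 1), 0)

# Run the tests programmatically; "python -m unittest" or unittest.main()
# does the same thing from the command line.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.wasSuccessful())
```

GUI testing tools such as dogtail sit on top of this kind of test case; the assertion style stays the same.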
Re: Extract Text Table From File
On 2012-08-27 13:23, Huso wrote: Hi, There can be any number of blocks in the log file. I distinguish the block by the start header 'ROUTES TRAFFIC RESULTS, LSR' and ending in 'END'. Each block will have a unique [date + time] value. I tried the code you mentioned, it works for the data part. But I need to get the TRG, MP, DATE and TIME for the block with those data as well. This is the part that i'm really tangled in. Thanking, Huso Well, I suggest that you try to understand my code and make changes in it. It is not too hard. First you start reading documentation of the "re" module. It is worth learning Python. Especially for mining data out of text files. :-) Best, Laszlo -- http://mail.python.org/mailman/listinfo/python-list
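In that spirit, here is one sketch (the variable names are invented, not from the thread) that also remembers the TRG/MP/DATE/TIME line of the current block and attaches it to every route row found before the closing END. The row pattern only grabs the route name and keeps the jammed numeric columns raw, since splitting those is the separate problem discussed above:

```python
import re

# Sample input: one block in the format described in the thread.
log = """\
ROUTES TRAFFIC RESULTS, LSR
TRG MP DATE TIME
37 17 120824

R TRAFF NBIDS CCONG NDV ANBLO MHTIME NBANSW
AABBCCO 6.4 204 0.0 1151.0113.4 144
DDEEFFO 0.2 5 0.0590.0107.6 3
END
"""

route_re = re.compile(r"^([A-Z]{6,8}[IO])\s+(.*)$")  # route name + raw columns

records = []
header = None          # the ['37', '17', '120824'] values of the current block
expect_header = False  # True right after the "TRG MP DATE TIME" label line
for line in log.splitlines():
    line = line.strip()
    if line.startswith("ROUTES TRAFFIC RESULTS"):
        header = None
        expect_header = False
    elif line.startswith("TRG"):
        expect_header = True           # the next line holds the header values
    elif expect_header:
        header = line.split()
        expect_header = False
    elif line == "END":
        header = None                  # block finished
    elif header is not None:
        m = route_re.match(line)
        if m:
            # one record = block header + route name + raw data columns
            records.append(header + [m.group(1), m.group(2)])

for rec in records:
    print(rec)
```

Each record now carries the block's TRG, MP and DATE alongside the route data, which is the shape needed for inserting rows into SQL.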
Re: What do I do to read html files on my pc?
On Mon, Aug 27, 2012 at 8:59 PM, mikcec82 wrote: > Hallo, > > I have an html file on my pc and I want to read it to extract some text. > Can you help on which libs I have to use and how can I do it? > > thank you so much. Try BeautifulSoup. You can find it at the opposite end of a web search. Not trying to be unhelpful, but without more description of the problem, there's not a lot more to say :) ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: What do I do to read html files on my pc?
On 27/08/2012 11:59, mikcec82 wrote: Hallo, I have an html file on my pc and I want to read it to extract some text. Can you help on which libs I have to use and how can I do it? thank you so much. Michele Type something like "python html parsing" into the box of your favourite search engine, hit return and follow the links it comes back with. Write some code. If you have problems give us the smallest code snippet that reproduces the issue together with the complete traceback and we'll help. -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: Extract Text Table From File
On 08/27/12 04:53, Huso wrote:
> Below is just ONE block of the traffic i have in the log files. There will be
> more in them with different data.
>
> ROUTES TRAFFIC RESULTS, LSR
> TRG MP DATE TIME
> 37 17 120824
>
> R TRAFF NBIDS CCONG NDV ANBLO MHTIME NBANSW
> AABBCCO 6.4 204 0.0 1151.0113.4 144
> AABBCCI 3.0 293 1151.0 37.0 171
> DDEEFFO 0.2 5 0.0590.0107.6 3
> HHGGFFI 0.3 15300.0 62.2 4
> END
In the past I've used something like the following to find columnar
data based on some found headers:
import re
token_re = re.compile(r'\b(\w+)\s*')
f = file(FILENAME)
headers = f.next()  # in your case, you'd search forward until you got
                    # to a header line and use that TRAFF... line
header_map = dict(
    # build a map of field-name to slice
    (
        matchobj.group(1).upper(),
        slice(*matchobj.span())
    )
    for matchobj
    in token_re.finditer(headers)
)
You can then access your values as you iterate through the rest of
the rows:
for row in f:
    if row.startswith("END"): break
    traff = float(row[header_map["TRAFF"]])
    # ...
which makes the code pretty easy to read, effectively turning it
into a CSV file.
It has the advantage that, if for some reason data in the columns
have spaces in them, it won't throw off the row as a .split() would.
-tkc
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python list archives double-gzipped?
On 27.08.2012 03:40, Tim Chase wrote:
So it looks like some python-list@ archiving process is double
gzip'ing the archives. Can anybody else confirm this and get the
info the right people?
In January, "random joe" noticed the same problem[1].
I think Anssi Saari[2] was right in saying that there is something
wrong in the browser or server setup, because I notice the same
behaviour with Firefox, Chromium, wget and curl.
$ ll *July*
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 chromium_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 27 13:41 curl_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 firefox_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 2 03:27 wget_2012-July.txt.gz
The browsers get a double gzipped file (size 747850) whereas the
download utilities get a normal gzipped file (size 748041).
After looking at the HTTP request and response headers I've noticed that
the browsers accept compressed data ("Accept-Encoding: gzip, deflate")
whereas wget/curl by default don't. After adding that header to
wget/curl they get the same double gzipped file as the browsers do:
$ ll *July*
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 chromium_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 27 13:41 curl_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:40
curl_encoding_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 firefox_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 2 03:27 wget_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 2 03:27
wget_encoding_2012-July.txt.gz
I think the following is happening:
If you send the "Accept-Encoding: gzip, deflate"-header, the server will
gzip the file a second time (which is arguably unnecessary) and responds
with "Content-Encoding: gzip" and "Content-Type: application/x-gzip"
(which is IMHO correct according to RFC2616/14.11 and 14.17[3]).
But because many servers apparently don't set correct headers, the
default behaviour of most browsers nowadays is to ignore the
content-encoding for gzip files (application/x-gzip - see bug report for
firefox[4] and chromium[5]) and don't uncompress the outer layer,
leading to a double gzipped file in this case.
Bye, Andreas
[1] http://mail.python.org/pipermail/python-list/2012-January/617983.html
[2] http://mail.python.org/pipermail/python-list/2012-January/618211.html
[3] http://www.ietf.org/rfc/rfc2616
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=610679#c5
[5] http://code.google.com/p/chromium/issues/detail?id=47951#c9
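The double-wrapping is easy to reproduce, and easy to detect from the gzip magic number, using only the standard library (a sketch; the payload is made up):

```python
import gzip

# Simulate what the server does: gzip an already-gzipped archive.
inner = gzip.compress(b"July archive contents")
double = gzip.compress(inner)

# After one round of decompression the payload still starts with the
# gzip magic bytes 0x1f 0x8b, so a second round is needed.
once = gzip.decompress(double)
print(once[:2] == b"\x1f\x8b")
print(gzip.decompress(once))
```

The same magic-number check is how a downloader could cope with either a single- or double-gzipped file.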
--
http://mail.python.org/mailman/listinfo/python-list
Re: What do I do to read html files on my pc?
On Monday, August 27, 2012 12:59:02 PM UTC+2, mikcec82 wrote: > Hallo, I have an html file on my pc and I want to read it to extract some text. > Can you help on which libs I have to use and how can I do it? > thank you so much. > Michele Hi ChrisA, Hi Mark. Thanks a lot. I have this html data and I want to check whether a string "" and/or a string "NOT PASSED" is present: . . . CODE CHECK : NOT PASSED Depending on this check I have to fill a cell in an excel file with the answer: NOK (if NOT PASSED or is present), or OK (if NOT PASSED and are not present). Thanks again for your help (and sorry for my english) -- http://mail.python.org/mailman/listinfo/python-list
Re: What do I do to read html files on my pc?
On Mon, Aug 27, 2012 at 9:51 AM, mikcec82 wrote:
> [snipped: the html-check question quoted in full]
From your example it doesn't seem there is enough information to know where in the html your strings will be. If you just read the whole file into a string you can do this:
>>> s = "this is a string"
>>> if 'this' in s:
...     print 'yes'
...
yes
Of course you will be testing for '' or 'NOT PASSED' -- Joel Goldstick -- http://mail.python.org/mailman/listinfo/python-list
Re: What do I do to read html files on my pc?
On Mon, Aug 27, 2012 at 11:51 PM, mikcec82 wrote: > I have this html data and I want to check if it is present a string "" > or/and a string "NOT PASSED": Start by scribbling down some notes in your native language (that is, don't bother trying to write code yet), defining exactly what you're looking for. What constitutes a hit? What would be a false positive that you need to avoid? For instance: * The string must occur outside of any HTML tag. or: * The string must occur inside a but not inside . or: * The string must be in the first inside of a in the that immediately follows the text "abcdefg". Make sure it's clear enough that anybody could follow it, even without knowing everything you know about your files. Once you have that algorithmic description, it's simply a matter of translating it into a language the computer can handle; and that's fairly straight-forward. An hour or two with language/library documentation and you'll quite possibly have working code, or if you don't, you'll at least have something that you can show to the list and ask for help with. But until you have that, advice from this list is going to be fairly vague, and may turn out to be quite misleading. We can't solve your problem until we know what it is, and you can't tell us what the problem is until you know yourself. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
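If the rule turns out to be as simple as "the marker must occur in text content, not inside a tag", the standard library's parser is already enough. A sketch in Python 3 syntax (the sample markup here is invented, since the original snippet lost its tags):

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect only the text content, ignoring tags and attributes."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

sample = "<tr><td>CODE CHECK</td><td>: NOT PASSED</td></tr>"
parser = TextCollector()
parser.feed(sample)
text = " ".join(parser.chunks)

# NOK if the failure marker appears in the text, OK otherwise.
verdict = "NOK" if "NOT PASSED" in text else "OK"
print(verdict)
```

For a real file you would feed the whole document and then write `verdict` into the Excel cell.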
Re: Python 2.6 and Sqlite3 - Slow
On 27.08.2012 03:23, [email protected] wrote: My program uses Python 2.6 and Sqlite3 and connects to a network database 100 miles away. Wait, isn't SQLite completely file-based? In that case, SQLite accesses a file, which in turn is stored on a remote filesystem. This means that there are other components involved here, namely your OS, the network (bandwidth & latency), the network filesystem and the filesystem on the remote machine. It would help if you told us what you have there. My program reads approx 60 records (4000 bytes) from a Sqlite database in less than a second. Each time the user requests data, my program can continuously read 60 records in less than a second. However, if I access the network drive (e.g. DOS command DIR /S) while my program is running, my program takes 20 seconds to read the same 60 records. If I restart my program, my program once again takes less than a second to read 60 records. Questions here: 1. Is each record 4kB or are all 60 records together 4kB? 2. Does the time for reading double when you double the number of records? Typically you have B + C * N, but it would be interesting to know the bias B and the actual time (and size) of each record. 3. How does the timing change when running dir/s? 4. What if you run two instances of your program? 5. Is the duration only reset by restarting the program or does it also decrease when the dir/s call has finished? What if you close and reopen the database without terminating the program? My guess is that the concurrent access by another program causes the accesses to become synchronized, while before most of the data is cached. That would cause a complete roundtrip between the two machines for every access, which can easily blow up the timing via the latency. In any case, I would try Python 2.7 in case this is a bug that was already fixed. Good luck! Uli -- http://mail.python.org/mailman/listinfo/python-list
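A baseline for question 2 can be measured with a throwaway local database (a sketch; the interesting comparison is rerunning the same timing against a file on the network share):

```python
import os
import sqlite3
import tempfile
import time

# Build a throwaway database with 60 small records, then time a full read.
path = os.path.join(tempfile.mkdtemp(), "test.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
con.executemany("INSERT INTO t (payload) VALUES (?)",
                [("x" * 66,) for _ in range(60)])  # ~60 records, ~4000 bytes
con.commit()

t0 = time.perf_counter()
rows = con.execute("SELECT id, payload FROM t").fetchall()
elapsed = time.perf_counter() - t0
print("%d rows in %.6f s" % (len(rows), elapsed))
con.close()
```

Doubling the record count and comparing the two timings separates the fixed cost B from the per-record cost C.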
Re: Python list archives double-gzipped?
On 08/27/12 08:52, Andreas Perstinger wrote: > On 27.08.2012 03:40, Tim Chase wrote: >> So it looks like some python-list@ archiving process is double >> gzip'ing the archives. Can anybody else confirm this and get the >> info the right people? > > If you send the "Accept-Encoding: gzip, deflate"-header, the server will > gzip the file a second time (which is arguably unnecessary) and responds > with "Content-Encoding: gzip" and "Content-Type: application/x-gzip" > (which is IMHO correct according to RFC2616/14.11 and 14.17[3]). > But because many servers apparently don't set correct headers, the > default behaviour of most browsers nowadays is to ignore the > content-encoding for gzip files (application/x-gzip - see bug report for > firefox[4] and chromium[5]) and don't uncompress the outer layer, > leading to a double gzipped file in this case. That corresponds with what I see in various testing. To whomever controls the python.org web-server, is it possible to tweak Apache so that it doesn't try to gzip *.gz files? It may ameliorate the problem, as well as reduce server load (since it's actually taking the time to make the file larger) -tkc -- http://mail.python.org/mailman/listinfo/python-list
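One common way to do that in Apache, assuming mod_deflate is what compresses the responses (I don't know how the python.org server is actually configured, so this is a hypothetical fragment), is to mark already-compressed files as exempt:

```apache
# Hypothetical httpd.conf fragment: skip output compression for files
# that are already compressed.
<IfModule mod_deflate.c>
    SetEnvIfNoCase Request_URI \.(?:gz|tgz|zip|bz2)$ no-gzip
</IfModule>
```

With the `no-gzip` environment variable set, the *.gz archives would be served byte-for-byte regardless of the client's Accept-Encoding header.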
Re: What do I do to read html files on my pc?
mikcec82 wrote: [snip] CODE CHECK : NOT PASSED Depending on this check I have to fill a cell in an excel file with answer: NOK (if Not passed or is present), or OK (if Not passed and are not present). Thanks again for your help (and sorry for my english) Html is not a format you want to extract data from, mainly because it is the endpoint of content AND display, meaning that what is properly parsed today may not be parsed tomorrow because someone changed the background color. You should change your server so it can feed a client with data (xml for instance is quite close to the html syntax, it's based on tags and is suitable for data). JM -- http://mail.python.org/mailman/listinfo/python-list
Re: Python list archives double-gzipped?
In article <[email protected]>, Tim Chase wrote: > That corresponds with what I see in various testing. To whomever > controls the python.org web-server, is it possible to tweak Apache > so that it doesn't try to gzip *.gz files? It may ameliorate the > problem, as well as reduce server load (since it's actually taking > the time to make the file larger) http://mail.python.org/mailman/listinfo/pydotorg-www -- Ned Deily, [email protected] -- http://mail.python.org/mailman/listinfo/python-list
Re: set and dict iteration
On Thursday, August 23, 2012 1:11:14 PM UTC-5, Steven D'Aprano wrote: > On Thu, 23 Aug 2012 09:49:41 -0700, Aaron Brady wrote: > > > > [...] > > > The patch for the above is only 40-60 lines. However it introduces two > > > new concepts. > > > > > > The first is a "linked list", a classic dynamic data structure, first > > > developed in 1955, cf. http://en.wikipedia.org/wiki/Linked_list . > > > Linked lists are absent in Python > > > > They certainly are not. There's merely no named "linked list" class. > > > > Linked lists are used by collections.ChainMap, tracebacks, xml.dom, > > Abstract Syntax Trees, and probably many other places. (Well, technically > > some of these are trees rather than lists.) You can trivially create a > > linked list: > > > > x = [a, [b, [c, [d, [e, None]]]]] > > > > is equivalent to a singly-linked list with five nodes. Only less > > efficient. > > > > > > > The second is "uncounted references". The uncounted references are > > > references to "set iterators" exclusively, exist only internally to > > > "set" objects, and are invisible to the rest of the program. The reason > > > for the exception is that iterators are unique in the Python Data Model; > > > iterators consist of a single immutable reference, unlike both immutable > > > types such as strings and numbers, as well as container types. Counted > > > references could be used instead, but would be consistently wasted work > > > for the garbage collector, though the benefit to programmers' peace of > > > mind could be significant. > > > > The usual way to implement "uncounted references" is by using weakrefs. > > Why invent yet another form of weakref? > > > > > > > > -- > > Steven Hello S. D'Aprano. Thanks for your support as always. The semantics of the second collection are equivalent to a WeakSet. The space and time consumption of a WeakSet are higher in comparison to a linked list. 
However, so long as we iterated over it using the C API instead of creating an iterator, it would be consistent. If we dynamically create the WeakSet on demand and free it when empty, the space consumption would be lower. Typical use cases don't involve creating thousands of iterators, or rapidly creating and destroying them, so the performance impact might not be severe. Regarding the bare weakrefs, if the iterator's destructor hasn't been called, then the pointer is still valid. If it has been called, then it's not present in the list. Unlike Python classes, the destructors of C extension classes are guaranteed to be called. Therefore there are no points during execution at which a node needs to check whether a reference to its neighbor is valid. Are your concerns based on data integrity, future maintainability, or what? -- http://mail.python.org/mailman/listinfo/python-list
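Aaron's WeakSet variant can be prototyped in pure Python. This is only an illustrative sketch, not the C-level patch under discussion, and every name in it (`TrackedSet`, `_TrackedIter`) is invented: the container weakly tracks its live iterators and flags them stale on mutation, while garbage-collected iterators drop out of the tracking set on their own, with no manual reference counting.

```python
import weakref


class TrackedSet:
    """Toy sketch: a set that weakly tracks its live iterators and
    invalidates them when the set is mutated (all names invented)."""

    def __init__(self, items=()):
        self._data = set(items)
        self._iterators = weakref.WeakSet()

    def __iter__(self):
        it = _TrackedIter(self._data)
        self._iterators.add(it)
        return it

    def add(self, item):
        self._data.add(item)
        for it in self._iterators:   # tell live iterators they are stale
            it.invalidate()


class _TrackedIter:
    def __init__(self, data):
        self._it = iter(list(data))  # snapshot; staleness is flagged, not detected
        self._valid = True

    def invalidate(self):
        self._valid = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._valid:
            raise RuntimeError("set changed during iteration")
        return next(self._it)
```

A mutation after `iter()` makes the next `__next__` call raise, which is the diagnostic behavior the thread is after; dropped iterators simply disappear from the WeakSet.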
Re: Flexible string representation, unicode, typography, ...
Le dimanche 26 août 2012 22:45:09 UTC+2, Dan Sommers a écrit : > On 2012-08-26 at 20:13:21 +, > > Steven D'Aprano wrote: > > > > > I note that not all 32-bit ints are valid code points. I suppose I can > > > see sense in having rune be a 32-bit integer value limited to those > > > valid code points. (But, dammit, why not call it a code point?) But if > > > rune is merely an alias for int32, why not just call it int32? > > > > Having a "code point" type is a good idea. If nothing else, human code > > readers can tell that you're doing something with characters rather than > > something with integers. If your language provides any sort of type > > safety, then you get that, too. > > > > Calling your code points int32 is a bad idea for the same reason that it > > turned out to be a bad idea to call all my old ASCII characters int8. > > Or all my pointers int (or unsigned int), for n in 16, 20, 24, 32, > > 36, 48, or 64 (or I'm sure other values of n that I never had the pain > > or pleasure of using). > And this is precisely the concept of rune, a real int which is a name for Unicode code point. Go "has" the integers int32 and int64. A rune ensure the usage of int32. "Text libs" use runes. Go has only bytes and runes. If you do not like the word "perfection", this mechanism has at least an ideal simplicity (with probably a lot of positive consequences). rune -> int32 -> utf32 -> unicode code points. - Why int32 and not uint32? No idea, I tried to find an answer without asking. - I find the name "rune" elegant. "char" would have been too confusing. End. This is supposed to be a Python forum. jmf -- http://mail.python.org/mailman/listinfo/python-list
Re: set and dict iteration
On Thu, Aug 23, 2012 at 10:49 AM, Aaron Brady wrote: > The patch for the above is only 40-60 lines. However it introduces two new > concepts. Is there a link to the patch? > The first is a "linked list", a classic dynamic data structure, first > developed in 1955, cf. http://en.wikipedia.org/wiki/Linked_list . Linked > lists are absent in Python, including the standard library and CPython > implementation, beyond the weak reference mechanism and garbage collector. > The "collections.deque" structure shares some of the linked list interface > but uses arrays. > > The second is "uncounted references". The uncounted references are > references to "set iterators" exclusively, exist only internally to "set" > objects, and are invisible to the rest of the program. The reason for the > exception is that iterators are unique in the Python Data Model; iterators > consist of a single immutable reference, unlike both immutable types such as > strings and numbers, as well as container types. Counted references could be > used instead, but would be consistently wasted work for the garbage > collector, though the benefit to programmers' peace of mind could be > significant. > > Please share your opinion! Do you agree that the internal list resolves the > inconsistency? Do you agree with the strategy? Do you agree that uncounted > references are justified to introduce, or are counted references preferable? This feature is a hard sell as it is; I think that adding uncounted references into the mix is only going to make that worse. May I suggest an alternate approach? Internally tag each set or dict with a "version", which is just a C int. Every time the hash table is modified, increment the version. When an iterator is created, store the current version on the iterator. When the iterator is advanced, check that the iterator version matches the dict/set version. If they're not equal, raise an error. 
This should add less overhead than the linked list without any concerns about reference counting. It does introduce a small bug in that an error condition could be "missed", if the version is incremented a multiple of 2**32 or 2**64 times between iterations -- but how often is that really likely to occur? Bearing in mind that this error is meant for debugging and not production error handling, you could even make the version a single byte and I'd still be fine with that. Cheers, Ian -- http://mail.python.org/mailman/listinfo/python-list
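Ian's version-tag scheme is easy to model at the Python level. Below is a toy sketch with invented names, assuming only that every mutation bumps the counter and every iteration step compares against the snapshot taken when the iterator was created:

```python
class VersionedDict:
    """Toy model of the version-tag idea: a per-object counter,
    bumped on every mutation and checked by iterators on each step."""

    def __init__(self):
        self._data = {}
        self._version = 0  # incremented on every mutation

    def __setitem__(self, key, value):
        self._data[key] = value
        self._version += 1

    def __delitem__(self, key):
        del self._data[key]
        self._version += 1

    def __iter__(self):
        start = self._version        # snapshot at iterator creation
        for key in list(self._data):
            if self._version != start:
                raise RuntimeError("dict mutated during iteration")
            yield key
```

The check costs one integer comparison per step and no extra bookkeeping on the container beyond the counter itself, which is the overhead argument made above.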
Re: VPS For Python
On 26/08/12 09:41, coldfire wrote: I will really appreciate if someone type the address of any of the following for use with python If you can live just with PaaS (i.e., no shell account in the strict sense of the word, although you have ssh access) then my employer is introducing OpenShift (http://openshift.redhat.com) and I have had a very good experience playing with it. Use #openshift on Freenode for further support. Matěj -- http://mail.python.org/mailman/listinfo/python-list
Re: Flexible string representation, unicode, typography, ...
On Mon, Aug 27, 2012 at 1:16 PM, wrote: > - Why int32 and not uint32? No idea, I tried to find an > answer without asking. UCS-4 is technically only a 31-bit encoding. The sign bit is not used, so the choice of int32 vs. uint32 is inconsequential. (In fact, since they made the decision to limit Unicode to the range 0 - 0x0010FFFF, one might even point out that the *entire high-order byte* as well as 3 bits of the next byte are irrelevant. Truly, UTF-32 is not designed for memory efficiency.) -- http://mail.python.org/mailman/listinfo/python-list
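Those figures can be checked from a Python 3 prompt (assuming a 3.3+ build, where sys.maxunicode is always the full Unicode ceiling):

```python
import sys

# The Unicode ceiling mentioned above: U+10FFFF.
MAX_CP = 0x10FFFF
print(sys.maxunicode == MAX_CP)    # True on any Python 3.3+ build

# Only 21 of UTF-32's 32 bits are ever needed:
print(MAX_CP.bit_length())         # 21

# chr/ord round-trip at the very top of the range:
print(ord(chr(MAX_CP)) == MAX_CP)  # True
```

21 used bits out of 32 is exactly the "entire high-order byte plus 3 bits of the next byte" wasted that the post describes.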
Re: Flexible string representation, unicode, typography, ...
Le lundi 27 août 2012 22:14:07 UTC+2, Ian a écrit : > On Mon, Aug 27, 2012 at 1:16 PM, wrote: > > > - Why int32 and not uint32? No idea, I tried to find an > > > answer without asking. > > > > UCS-4 is technically only a 31-bit encoding. The sign bit is not used, > > so the choice of int32 vs. uint32 is inconsequential. > > > > (In fact, since they made the decision to limit Unicode to the range 0 > > - 0x0010FFFF, one might even point out that the *entire high-order > > byte* as well as 3 bits of the next byte are irrelevant. Truly, > > UTF-32 is not designed for memory efficiency.) I know all this. The question is more, why not a uint32 knowing there are only positive code points. It seems to me more "natural". -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
Uli, Answers to your questions: 1) There are approx 65 records and each record is 68 bytes in length. 2) Not applicable because number of records is fixed. 3) Takes less than a second to read all 65 records when all is well. Takes 17 seconds to read all 65 records when all is NOT WELL 4) Performance is also sluggish, at least 12 seconds. 5) Most likely, I misspoke. Restarting my program does not always help with performance. When using the database on my C Drive, Sqlite performance is great! (<1S) When using the database on a network, Sqlite performance is terrible! (17S) I like your idea of trying Python 2.7 Finally, the way my program is written is: loop for all database records: read a database record process data display data (via wxPython) Perhaps, this is a better approach: read all database records loop for all records: process data display data (via wxPython) Thanks, Bruce On Monday, August 27, 2012 11:50:15 AM UTC-4, Ulrich Eckhardt wrote: > Am 27.08.2012 03:23, schrieb [email protected]: > > > My program uses Python 2.6 and Sqlite3 and connects to a network > > > database 100 miles away. > > > > Wait, isn't SQLite completely file-based? In that case, SQLite accesses > > a file, which in turn is stored on a remote filesystem. This means that > > there are other components involved here, namely your OS, the network > > (bandwidth & latency), the network filesystem and the filesystem on the > > remote machine. It would help if you told us what you have there. > > > > > > > My program reads approx 60 records (4000 bytes) from a Sqlite > > > database in less than a second. Each time the user requests data, my > > > program can continuously read 60 records in less than a second. > > > However, if I access the network drive (e.g. DOS command DIR /S) > > > while my program is running, my program takes 20 seconds to read the > > > same 60 records. If I restart my program, my program once again takes > > > less than a second to read 60 records. 
> > > > Questions here: > > 1. Is each record 4kB or are all 60 records together 4kB? > > 2. Does the time for reading double when you double the number of > > records? Typically you have B + C * N, but it would be interesting to > > know the bias B and the actual time (and size) of each record. > > 3. How does the timing change when running dir/s? > > 4. What if you run two instances of your program? > > 5. Is the duration is only reset by restarting the program or does it > > also decrease when the dir/s call has finished? What if you close and > > reopen the database without terminating the program? > > > > My guess is that the concurrent access by another program causes the > > accesses to become synchronized, while before most of the data is > > cached. That would cause a complete roundtrip between the two machines > > for every access, which can easily blow up the timing via the latency. > > > > In any case, I would try Python 2.7 in case this is a bug that was > > already fixed. > > > > Good luck! > > > > Uli -- http://mail.python.org/mailman/listinfo/python-list
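The second structure Bruce describes — one query up front, then a purely local loop — might look like the following sketch. The schema and row count are made up to match the thread, and an in-memory database stands in for the file on the network drive:

```python
import sqlite3

# Build a throwaway database shaped like the one in the thread
# (65 small records; the schema here is invented).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
cur.executemany("INSERT INTO records (payload) VALUES (?)",
                [("row %d" % i,) for i in range(65)])
conn.commit()

# One round trip instead of 65: fetch everything, then loop locally.
rows = cur.execute("SELECT id, payload FROM records ORDER BY id").fetchall()
processed = [payload.upper() for _id, payload in rows]
conn.close()
```

fetchall() collapses the per-row round trips into a single read, which matters far more on a network filesystem than on a local disk.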
Re: Python 2.6 and Sqlite3 - Slow
Is there a reason that you're using SQLite in a network environment rather than a database server? -- http://mail.python.org/mailman/listinfo/python-list
Re: Python list archives double-gzipped?
On 08/27/12 12:21, Ned Deily wrote: > In article <[email protected]>, Tim Chase > wrote: >> To whomever controls the python.org web-server, is it possible >> to tweak Apache so that it doesn't try to gzip *.gz files? It >> may ameliorate the problem, as well as reduce server load >> (since it's actually taking the time to make the file larger) > > http://mail.python.org/mailman/listinfo/pydotorg-www At Ned's suggestion, I took it there and Ralf Hildebrandt kindly resolved the matter. Thanks to all involved. -tkc -- http://mail.python.org/mailman/listinfo/python-list
popen4 - get exit status
In bash I do the following:
linus:journal tim$ /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup
-bash: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
directory
linus:journal tim$ echo $?
127
In python, use os.popen4 I do the following:
>>> fin,fout = os.popen4('/home/AKMLS/cgi-bin/perl/processJournal-Photo.pl
>>> hiccup;echo $?')
>>> results = fout.readlines()
>>> results
['/bin/sh: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
directory\n', '127\n']
Well, I got the exit code as the last item in the results, but I'm wondering if
there is a better way. From help(os) - I don't find any variables dedicated to
holding exit status.
Any ideas?
thanks
--
Tim
tim at tee jay forty nine dot com or akwebsoft dot com
http://www.akwebsoft.com
--
http://mail.python.org/mailman/listinfo/python-list
Re: popen4 - get exit status
On 08/27/2012 06:39 PM, Tim Johnson wrote:
> In bash I do the following:
> linus:journal tim$ /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup
> -bash: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
> directory
> linus:journal tim$ echo $?
> 127
>
> In python, use os.popen4 I do the following:
fin,fout = os.popen4('/home/AKMLS/cgi-bin/perl/processJournal-Photo.pl
hiccup;echo $?')
results = fout.readlines()
results
> ['/bin/sh: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
> directory\n', '127\n']
>
> Well, I got the exit code as the last item in the results, but I'm wondering
> if
> there is a better way. From help(os) - I don't find any variables dedicated to
> holding exit status.
According to:
http://docs.python.org/library/popen2.html
" The only way to retrieve the return codes for the child processes is
by using the poll() or wait() methods on the Popen3
and Popen4
classes;
these are only available on Unix. This information is not available when
using the popen2()
, popen3()
, and popen4()
functions,"
However, unless you're using an old version of Python (2.5 or below),
you should be using the subprocess module.
--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
Re: popen4 - get exit status
On Aug 27, 2012 3:47 PM, "Tim Johnson" wrote:
>
> In bash I do the following:
> linus:journal tim$ /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup
> -bash: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
directory
> linus:journal tim$ echo $?
> 127
>
> In python, use os.popen4 I do the following:
> >>> fin,fout =
os.popen4('/home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup;echo $?')
> >>> results = fout.readlines()
> >>> results
> ['/bin/sh: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file
or directory\n', '127\n']
>
> Well, I got the exit code as the last item in the results, but I'm
wondering if
> there is a better way. From help(os) - I don't find any variables
dedicated to
> holding exit status.
>
> Any ideas?
> thanks
> --
> Tim
> tim at tee jay forty nine dot com or akwebsoft dot com
> http://www.akwebsoft.com
The popen* functions are deprecated. You should use the subprocess module
instead.
--
http://mail.python.org/mailman/listinfo/python-list
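For reference, the equivalent of the os.popen4 snippet with subprocess might look like this. The script path is the (nonexistent) one from the original post; any missing command behaves the same, with the shell itself reporting exit status 127, so no `echo $?` trick is needed:

```python
import subprocess

# Let the shell run the (nonexistent) script, as os.popen4 would have;
# stderr is merged into stdout to match popen4's behavior.
proc = subprocess.Popen(
    "/home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
output, _ = proc.communicate()
print(proc.returncode)   # 127: the shell's "command not found" status
```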
Re: popen4 - get exit status
* Benjamin Kaplan [120827 15:20]: > The popen* functions are deprecated. You should use the subprocess module > instead. No, I'm stuck with py 2.4 on one of the servers I'm using and there will not be an upgrade for a few months. I'm really trying to set up something portable between linux->python 2.4 and darwin->python 2.7 thanks -- Tim tim at tee jay forty nine dot com or akwebsoft dot com http://www.akwebsoft.com -- http://mail.python.org/mailman/listinfo/python-list
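For code that has to span 2.4 and 2.7, plain os.popen may be enough: its close() method returns the wait()-encoded exit status (or None on success). A sketch, using a shell builtin as a stand-in for the real command:

```python
import os

# close() returns None for exit status 0, otherwise the status
# encoded as for os.wait() (Unix), so decode it with WEXITSTATUS.
pipe = os.popen("exit 3")      # stand-in for the real command
output = pipe.read()
status = pipe.close()

code = 0
if status is not None and os.WIFEXITED(status):
    code = os.WEXITSTATUS(status)
print(code)   # 3
```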
Re: popen4 - get exit status
* Dave Angel [120827 15:20]:
> On 08/27/2012 06:39 PM, Tim Johnson wrote:
> > In bash I do the following:
> > linus:journal tim$ /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl hiccup
> > -bash: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file or
> > directory
> > linus:journal tim$ echo $?
> > 127
> >
> > In python, use os.popen4 I do the following:
> fin,fout = os.popen4('/home/AKMLS/cgi-bin/perl/processJournal-Photo.pl
> hiccup;echo $?')
> results = fout.readlines()
> results
> > ['/bin/sh: /home/AKMLS/cgi-bin/perl/processJournal-Photo.pl: No such file
> > or directory\n', '127\n']
> >
> > Well, I got the exit code as the last item in the results, but I'm
> > wondering if
> > there is a better way. From help(os) - I don't find any variables dedicated
> > to
> > holding exit status.
>
> According to:
> http://docs.python.org/library/popen2.html
>
>
> " The only way to retrieve the return codes for the child processes is
> by using the poll() or wait() methods on the Popen3
> and Popen4
> classes;
> these are only available on Unix. This information is not available when
> using the popen2()
> , popen3()
> , and popen4()
> functions,"
>
> However, unless you're using an old version of Python (2.5 or below),
> you should be using the subprocess module.
Thanks DaveA and see my reply to Benjamin
that will do it.
--
Tim
tim at tee jay forty nine dot com or akwebsoft dot com
http://www.akwebsoft.com
--
http://mail.python.org/mailman/listinfo/python-list
Re: Flexible string representation, unicode, typography, ...
[email protected]: Go "has" the integers int32 and int64. A rune ensure the usage of int32. "Text libs" use runes. Go has only bytes and runes. Go's text libraries use UTF-8 encoded byte strings. Not arrays of runes. See, for example, http://golang.org/pkg/regexp/ Are you claiming that UTF-8 is the optimum string representation and therefore should be used by Python? Neil -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
Demian, I am not a database expert! I selected sqlite for the following reasons: 1) Ships with Python. 2) Familiar with Python. 3) The Sqlite description at http://www.sqlite.org/whentouse.html appears to meet my requirements: Very low volume and concurrency, small datasets, simple to use. Bruce On Monday, August 27, 2012 4:54:07 PM UTC-4, Demian Brecht wrote: > Is there a reason that you're using SQLite in a network environment rather > than a database server? -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
bruceg113 wrote: > I selected sqlite for the following reasons: > > 1) Ships with Python. > 2) Familiar with Python. > 3) The Sqlite description at http://www.sqlite.org/whentouse.html appears to > meet my requirements: > Very low volume and concurrency, small datasets, simple to use. All good reasons, but a database file on a network drive is contraindication for SQLite. A Google site-specific search for "network" on www.sqlite.org, finds such warnings as: "We have received reports of implementations of both Windows network filesystems and NFS in which locking was subtly broken. We can not verify these reports, but as locking is difficult to get right on a network filesystem we have no reason to doubt them. You are advised to avoid using SQLite on a network filesystem in the first place, since performance will be slow." That said, I don't know where your 17 seconds is going. -Bryan -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
On Monday, August 27, 2012 10:32:47 PM UTC-4, Bryan wrote: > bruceg113 wrote: > > > I selected sqlite for the following reasons: > > > > > > 1) Ships with Python. > > > 2) Familiar with Python. > > > 3) The Sqlite description at http://www.sqlite.org/whentouse.html appears to > > meet my requirements: > > > Very low volume and concurrency, small datasets, simple to use. > > > > All good reasons, but a database file on a network drive is > > contraindication for SQLite. A Google site-specific search > > for "network" on www.sqlite.org, finds such warnings as: > > > > "We have received reports of implementations of both Windows network > > filesystems and NFS in which locking was subtly broken. We can not > > verify these reports, but as locking is difficult to get right on a > > network filesystem we have no reason to doubt them. You are advised to > > avoid using SQLite on a network filesystem in the first place, since > > performance will be slow." > > > > That said, I don't know where your 17 seconds is going. > > > > -Bryan Bryan, Thank you for your reply. Are you saying having a sqlite database file on a shared LOCAL network drive is problematic? Bruce -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
From the sqlite documentation he quoted, it appears that ANY network filesystem, local or otherwise, should be avoided. On Aug 27, 2012 8:13 PM, wrote: > On Monday, August 27, 2012 10:32:47 PM UTC-4, Bryan wrote: > > bruceg113 wrote: > > > > > I selected sqlite for the following reasons: > > > > > > > > > > 1) Ships with Python. > > > > > 2) Familiar with Python. > > > > > 3) The Sqlite description at http://www.sqlite.org/whentouse.html appears to meet my requirements: > > > > > Very low volume and concurrency, small datasets, simple to use. > > > > > > > > All good reasons, but a database file on a network drive is > > > > contraindication for SQLite. A Google site-specific search > > > > for "network" on www.sqlite.org, finds such warnings as: > > > > > > > > "We have received reports of implementations of both Windows network > > > > filesystems and NFS in which locking was subtly broken. We can not > > > > verify these reports, but as locking is difficult to get right on a > > > > network filesystem we have no reason to doubt them. You are advised to > > > > avoid using SQLite on a network filesystem in the first place, since > > > > performance will be slow." > > > > > > > > That said, I don't know where your 17 seconds is going. > > > > > > > > -Bryan > > Bryan, > > Thank you for your reply. > Are you saying having a sqlite database file on a shared LOCAL network > drive is problematic? > > Bruce > -- > http://mail.python.org/mailman/listinfo/python-list > -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 2.6 and Sqlite3 - Slow
bruceg113 wrote: > Thank you for your reply. > Are you saying having a sqlite database file on a > shared LOCAL network drive is problematic? Yes, mostly, I think I am saying that. A "LOCAL network drive" is network drive, and is not a local drive, local as the network may be. We read and write such a drive over a network protocol, in this case a Microsoft protocol and implementation in the SMB/CIFS family. Where are your 17 seconds going? Hard to tell. Is your experience of astonishing filesystem slothfulness rare? Not so much. We could probably diagnose the problem in a few weeks. We'd use some open-source tools, WireShark among them, plus some Microsoft tools for which we might have to pay, plus the SQLite3 project's C library. With that investment I'd bet we could diagnose, but not cure. -Bryan -- http://mail.python.org/mailman/listinfo/python-list
