Re: Obtain the query interface url of BCS server.

2022-09-12 Thread DFS

On 9/12/2022 5:00 AM, [email protected] wrote:

I want to do the query from within a script based on the interface here [1]. For this 
purpose, the underlying posting URL must be obtained, say, the URL corresponding to 
"ITA Settings" button, so that I can make the corresponding query URL and issue 
the query from the script.

However, I did not find the conversion rules from these buttons to the 
corresponding URL. Any hints for achieving this aim?

[1] 
https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen?list=new&what=gen&gnum=10

Regards,
Zhao



You didn't say what you want to query.  Are you trying to download 
entire sections of the Bilbao Crystallographic Server?  Maybe the admins 
will give you access to the data.



* this link: https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen
  brings up the table of space group symbols.

* choose say #7: Pc

* now click ITA Settings, then choose the last entry "P c 1 1" and it
  loads:

https://www.cryst.ehu.es/cgi-bin/cryst/programs//nph-trgen?gnum=007&what=gp&trmat=b,-a-c,c&unconv=P%20c%201%201&from=ita

You might be able to fool around with that URL and substitute values and 
get back the data you want (in HTML) via Python.  Do you really want 
HTML results?



Hit Ctrl+U to see the source HTML of a webpage

Right-click or hit Ctrl + Shift + C to inspect the individual elements 
of the page



--
https://mail.python.org/mailman/listinfo/python-list


Re: Obtain the query interface url of BCS server.

2022-09-13 Thread DFS

On 9/13/2022 3:46 AM, [email protected] wrote:

On Tuesday, September 13, 2022 at 4:20:12 AM UTC+8, DFS wrote:

On 9/12/2022 5:00 AM, [email protected] wrote:

I want to do the query from within a script based on the interface here [1]. For this 
purpose, the underlying posting URL must be obtained, say, the URL corresponding to 
"ITA Settings" button, so that I can make the corresponding query URL and issue 
the query from the script.

However, I did not find the conversion rules from these buttons to the 
corresponding URL. Any hints for achieving this aim?

[1] 
https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen?list=new&what=gen&gnum=10

Regards,
Zhao

You didn't say what you want to query. Are you trying to download
entire sections of the Bilbao Crystallographic Server?


I am engaged in some related research and need some specific data used by BCS 
server.


What specific data?

Is it available elsewhere?



Maybe the admins will give you access to the data.


I don't think they will provide such convenience to researchers who have no 
cooperative relationship with them.


You can try.  Tell the admins what data you want, and ask them for the 
easiest way to get it.




* this link: https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen
brings up the table of space group symbols.

* choose say #7: Pc

* now click ITA Settings, then choose the last entry "P c 1 1" and it
loads:

https://www.cryst.ehu.es/cgi-bin/cryst/programs//nph-trgen?gnum=007&what=gp&trmat=b,-a-c,c&unconv=P%20c%201%201&from=ita


Not only that, but I want to obtain all such URLs programmatically!
  

You might be able to fool around with that URL and substitute values and
get back the data you want (in HTML) via Python. Do you really want
HTML results?

Hit Ctrl+U to see the source HTML of a webpage

Right-click or hit Ctrl + Shift + C to inspect the individual elements
of the page


For batch operations, all these manual methods are inefficient.


Yes, but I don't think you'll be able to retrieve the URLs 
programmatically.  The JavaScript code doesn't put them in the HTML 
result, except for that one I showed you, which seems like a mistake on 
their part.


So you'll have to figure out the search fields, and your python program 
will have to cycle through the search values:


Sample from above
gnum   = 007
what   = gp
trmat  = b,-a-c,c
unconv = P c 1 1
from   = ita

wBase   = "https://www.cryst.ehu.es/cgi-bin/cryst/programs//nph-trgen"
wGnum   = "?gnum="   + findGnum
wWhat   = "&what="   + findWhat
wTrmat  = "&trmat="  + findTrmat
wUnconv = "&unconv=" + findUnconv
wFrom   = "&from="   + findFrom
webpage = wBase + wGnum + wWhat + wTrmat + wUnconv + wFrom

Then if that returns a hit, you'll have to parse the resulting HTML and 
extract the exact data you want.
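A minimal sketch of the URL-building step, assuming the five fields in the sample above are all that nph-trgen needs. build_trgen_url is a hypothetical helper name, not anything the server documents. urllib.parse.urlencode does the escaping, so the spaces in "P c 1 1" come out as %20 the way the browser showed them.

```python
from urllib.parse import urlencode, quote

BASE = "https://www.cryst.ehu.es/cgi-bin/cryst/programs//nph-trgen"

def build_trgen_url(gnum, trmat, unconv, what="gp", source="ita"):
    """Build the query URL for one space-group setting (hypothetical helper)."""
    params = {
        "gnum": f"{gnum:03d}",   # the site zero-pads the group number: 7 -> 007
        "what": what,
        "trmat": trmat,
        "unconv": unconv,
        "from": source,
    }
    # safe="," keeps the commas in trmat readable; quote() turns spaces into %20
    return BASE + "?" + urlencode(params, quote_via=quote, safe=",")

print(build_trgen_url(7, "b,-a-c,c", "P c 1 1"))
```

The resulting string can be handed to requests.get() once the list of valid field combinations is known.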




I did something similar a while back using the requests and lxml libraries

import requests
from lxml import html

#build url (keyw, cityzip, state, miles and passNbr are set elsewhere in the script)
wBase    = "http://www.usdirectory.com"
wForm    = "/ypr.aspx?fromform=qsearch"
wKeyw    = "&qhqn=" + keyw
wCityZip = "&qc="   + cityzip
wState   = "&qs="   + state
wDist    = "&rg="   + str(miles)
wSort    = "&sb=a2z"  #sort alpha
wPage    = "&ap="     #used with the results page number
webpage  = wBase + wForm + wKeyw + wCityZip + wState + wDist

#open URL
page = requests.get(webpage)
tree = html.fromstring(page.content)

#no matches
matches = tree.xpath('//strong/text()')
if passNbr == 1 and ("No results were found" in str(matches)):
    print("No results found for that search")
    exit(0)




2.x code file: https://file.io/VdptORSKh5CN




Best Regards,
Zhao




Re: Obtain the query interface url of BCS server.

2022-09-14 Thread DFS

On 9/13/2022 7:29 PM, [email protected] wrote:

On Tuesday, September 13, 2022 at 9:33:20 PM UTC+8, DFS wrote:

On 9/13/2022 3:46 AM, [email protected] wrote:

On Tuesday, September 13, 2022 at 4:20:12 AM UTC+8, DFS wrote:

On 9/12/2022 5:00 AM, [email protected] wrote:

I want to do the query from within a script based on the interface here [1]. For this 
purpose, the underlying posting URL must be obtained, say, the URL corresponding to 
"ITA Settings" button, so that I can make the corresponding query URL and issue 
the query from the script.

However, I did not find the conversion rules from these buttons to the 
corresponding URL. Any hints for achieving this aim?

[1] 
https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen?list=new&what=gen&gnum=10

Regards,
Zhao

You didn't say what you want to query. Are you trying to download
entire sections of the Bilbao Crystallographic Server?


I am engaged in some related research and need some specific data used by BCS 
server.

What specific data?


All the data corresponding to the total catalog here:
https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen
  

Is it available elsewhere?


This is an internationally recognized authoritative data source in this field. 
Data from other places, even if there are readily available electronic 
versions, are basically taken from here and are not comprehensive.


Maybe the admins will give you access to the data.


I don't think they will provide such convenience to researchers who have no 
cooperative relationship with them.

You can try. Tell the admins what data you want, and ask them for the
easiest way to get it.

* this link: https://www.cryst.ehu.es/cgi-bin/cryst/programs/nph-getgen
brings up the table of space group symbols.

* choose say #7: Pc

* now click ITA Settings, then choose the last entry "P c 1 1" and it
loads:

https://www.cryst.ehu.es/cgi-bin/cryst/programs//nph-trgen?gnum=007&what=gp&trmat=b,-a-c,c&unconv=P%20c%201%201&from=ita


Not only that, but I want to obtain all such URLs programmatically!


You might be able to fool around with that URL and substitute values and
get back the data you want (in HTML) via Python. Do you really want
HTML results?

Hit Ctrl+U to see the source HTML of a webpage

Right-click or hit Ctrl + Shift + C to inspect the individual elements
of the page


For batch operations, all these manual methods are inefficient.

Yes, but I don't think you'll be able to retrieve the URLs
programmatically. The JavaScript code doesn't put them in the HTML
result, except for that one I showed you, which seems like a mistake on
their part.

So you'll have to figure out the search fields, and your python program
will have to cycle through the search values:

Sample from above
gnum = 007
what = gp
trmat = b,-a-c,c
unconv = P c 1 1
from = ita


The problem is that I must first get all possible combinations of these 
variables.



Shouldn't be too hard, but I've never done some of these things and have 
no code for you:


space group number = gnum = 1 to 230

* use python to put each of those values, one at a time, into the group 
number field on the webpage


* use python to simulate a button click of the ITA Settings button

* it should load the HTML of the list of ITA settings for that space group

* use python to parse the HTML and extract each of the ITA settings. 
The line of HTML has 'ITA number' in it.  Find each of the 'href' values 
in the line(s).


Real HTML from ITA Settings for space group 10:

ITA number bgcolor="#bb">Setting
bgcolor="#f0f0f0">10 href="/cgi-bin/cryst/programs//nph-getgen?gnum=010&what=gp">P 1 2/m 1
bgcolor="#f0f0f0">10 href="/cgi-bin/cryst/programs//nph-trgen?gnum=010&what=gp&trmat=c,a,b&unconv=P 1 1 2/m&from=ita">P 1 1 2/m
align="center" bgcolor="#f0f0f0">10 href="/cgi-bin/cryst/programs//nph-trgen?gnum=010&what=gp&trmat=b,c,a&unconv=P 2/m 1 1&from=ita">P 2/m 1 1



If you parse it right you'll have these addresses:

"/cgi-bin/cryst/programs//nph-getgen?gnum=010&what=gp"

"/cgi-bin/cryst/programs//nph-trgen?gnum=010&what=gp&trmat=c,a,b&unconv=P 1 1 2/m&from=ita"

"/cgi-bin/cryst/programs//nph-trgen?gnum=010&what=gp&trmat=b,c,a&unconv=P 2/m 1 1&from=ita"



Then you can parse each of these addresses and build a master list of 
the valid combinations of:


gnum, what, trmat, unconv, from


Check into the lxml library, and the 'etree' class.  https://lxml.de
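The parsing step could be sketched roughly like this. It uses the standard library's html.parser instead of lxml only to stay dependency-free; with lxml, an XPath such as '//a/@href' should do the same job. SAMPLE is a hypothetical fragment shaped like the ITA Settings rows, so the real markup may differ, and HrefCollector and extract_settings are made-up names.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs

class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

# Hypothetical markup; only the href values are taken from the real page.
SAMPLE = '''
<tr><td>10</td><td><a href="/cgi-bin/cryst/programs//nph-getgen?gnum=010&what=gp">P 1 2/m 1</a></td></tr>
<tr><td>10</td><td><a href="/cgi-bin/cryst/programs//nph-trgen?gnum=010&what=gp&trmat=c,a,b&unconv=P 1 1 2/m&from=ita">P 1 1 2/m</a></td></tr>
'''

def extract_settings(page_html):
    """Return one dict of query fields per link that carries a gnum field."""
    collector = HrefCollector()
    collector.feed(page_html)
    settings = []
    for href in collector.hrefs:
        fields = {k: v[0] for k, v in parse_qs(urlsplit(href).query).items()}
        if "gnum" in fields:
            settings.append(fields)
    return settings

for row in extract_settings(SAMPLE):
    print(row)
```

Running this over the ITA Settings page of each space group would build the master list of gnum, what, trmat, unconv and from values.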



You can also search gen.lib.rus.ec for the crystallography volumes, and 
maybe cut and paste data from them.








Re: Uninstall tool not working.

2022-09-14 Thread DFS

On 9/13/2022 3:54 PM, Salvatore Bruzzese wrote:

Hi,
I was trying to uninstall version 3.10.7 of python but I've
encountered problems with the uninstall tool.
I open the python setup program, click on the uninstall button but it
doesn't even start deleting python even though it says that the
process has finished.
Feel free to ask for more details in case I didn't explain it correctly.

Thanks in advance for your help.




https://stackoverflow.com/questions/3515673/how-to-completely-remove-python-from-a-windows-machine




Quick question about CPython interpreter

2022-10-17 Thread DFS

-
this does a str() conversion in the loop
-
for i in range(cells.count()):
    if text == str(ID):
        break


-
this does one str() conversion before the loop
-
strID = str(ID)
for i in range(cells.count()):
    if text == strID:
        break


But does CPython interpret the str() conversion away and essentially do 
it for me in the first example?
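As far as I know, CPython does not hoist the call: the name str could be rebound and __str__ could have side effects, so the bytecode calls it on every pass. A small experiment makes that visible. CountingID is a made-up stand-in for ID that counts how often str() is applied to it.

```python
class CountingID:
    """Stand-in for ID that counts how many times str() is called on it."""
    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return str(self.value)

ID = CountingID(42)
text = "no match"          # never equal, so the loop runs to the end

for i in range(1000):
    if text == str(ID):    # conversion inside the loop, as in the first example
        break

print(ID.calls)            # one call per iteration
```

So converting once before the loop, as in the second example, is the version that actually saves work.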





Any PyQt developers here?

2022-10-25 Thread DFS

Having problems with removeRow() on a QTableView object.

After calling removeRow(), the screen isn't updating.  It's as if the 
model is read-only, but it's a QSqlTableModel() model, which is not 
read-only.


The underlying SQL is straightforward (one table) and all columns are 
editable.


None of the editStrategies are working either.

I tried everything I can think of, including changes to the 
EditTriggers, but no luck.  HELP!


FWIW, the same removeRow() code works fine with a QTableWidget.

---
object creation and data loading all works fine
---
#open db connection
qdb = QSqlDatabase.addDatabase("QSQLITE")
qdb.setDatabaseName(dbname)
qdb.open()

#prepare query and execute to return data
query = QSqlQuery()
query.prepare(cSQL)
query.exec_()

#set model type and query
model = QSqlTableModel()
model.setQuery(query)

#assign model to QTableView object
view = frm.tblPostsView
view.setModel(model)

#get all data
while(model.canFetchMore()): model.fetchMore()
datarows = model.rowCount()



---
iterate selected rows also works fine
SelectionMode is Extended.
identical code works for a QTableWidget
---
selected = tbl.selectionModel().selectedRows()
#reverse sort the selected items to delete from bottom up
selected = sorted(selected,reverse=True)
for index in selected:
    tbl.model().removeRow(index.row())



Re: Any PyQt developers here?

2022-10-26 Thread DFS

On 10/25/2022 1:45 PM, Thomas Passin wrote:

On 10/25/2022 1:03 PM, DFS wrote:

Having problems with removeRow() on a QTableView object.


removeRow() isn't listed as being a method of a QTableView, not even an 
inherited method, so how are you calling removeRow() on it? (See 
https://doc.qt.io/qt-6/qtableview-members.html)



* I thought I was calling it the same way it's called with
  QTableWidgets:  tbl.removeRow()

  But looking at my code again I was using tbl.model().removeRow()


* Plus I found several others online with similar removeRow() issues
  with QTableViews.


* Plus the code didn't throw an error:

selected = tbl.selectionModel().selectedRows()
#reverse sort the selected items to delete from bottom up
selected = sorted(selected,reverse=True)
for i,val in enumerate(selected):
 tbl.model().removeRow(selected[i].row())


But... as you say, when looking at the docs, removeRow() isn't even one 
of the slots for QTableViews.  So duh!


I see the QTableView.hideRow(row) method, which does exactly what I need.

Thanks man!





After calling removeRow(), the screen isn't updating.  It's as if the 
model is read-only, but it's a QSqlTableModel() model, which is not 
read-only.


The underlying SQL is straightforward (one table) and all columns are 
editable.


None of the editStrategies are working either.

I tried everything I can think of, including changes to the 
EditTriggers, but no luck.  HELP!


FWIW, the same removeRow() code works fine with a QTableWidget.

---
object creation and data loading all works fine
---
#open db connection
qdb = QSqlDatabase.addDatabase("QSQLITE")
qdb.setDatabaseName(dbname)
qdb.open()

#prepare query and execute to return data
query = QSqlQuery()
query.prepare(cSQL)
query.exec_()

#set model type and query
model = QSqlTableModel()
model.setQuery(query)

#assign model to QTableView object
view = frm.tblPostsView
view.setModel(model)

#get all data
while(model.canFetchMore()): model.fetchMore()
datarows = model.rowCount()



---
iterate selected rows also works fine
SelectionMode is Extended.
identical code works for a QTableWidget
---
selected = tbl.selectionModel().selectedRows()
#reverse sort the selected items to delete from bottom up
selected = sorted(selected,reverse=True)
for i,val in enumerate(selected):
 tbl.model().removeRow(selected[i].row())







Re: Any PyQt developers here?

2022-10-26 Thread DFS

On 10/25/2022 2:03 PM, Barry Scott wrote:

There is an active PyQt mailing list that has lots of helpful and knowledgeable 
people on it.

https://www.riverbankcomputing.com/mailman/listinfo/pyqt

Barry



Thanks.  I'll send some questions their way, I'm sure.


A little source file analyzer

2022-10-26 Thread DFS

Nothing special, but kind of fun to use

$python progname.py sourcefile.py

-
#count blank lines, comments, source code

import sys

#counters
imports, blanks,comments, source  = 0,0,0,0
functions, dbexec, total  = 0,0,0

#python builtins
builtins = 0
bins = ['abs','aiter','all','any','anext','ascii','bin','bool','breakpoint','bytearray','bytes','callable','chr','classmethod','compile','complex','delattr','dict','dir','divmod','enumerate','eval','exec','filter','float','format','frozenset','getattr','globals','hasattr','hash','help','hex','id','input','int','isinstance','issubclass','iter','len','list','locals','map','max','memoryview','min','next','object','oct','open','ord','pow','property','range','repr','reversed','round','set','setattr','slice','sorted','staticmethod','str','sum','super','tuple','type','vars','zip']

bins2,bins3 = [],[]
#look for leading space then builtin then open paren
for bi in bins: bins2.append(' ' + bi + '(')


#todo use for source files other than .py
ccomments = 0
py_comment = ['#','~','@']
c_comment  = ['/*','//']

#read file
f = open(sys.argv[1], encoding='utf-8')
lines = f.read().splitlines()
f.close()

#print builtin usage count
#def binusage():

#iterate file
linenbr = 0
for line in lines:
    line = line.strip()
    linenbr += 1

    if line == '': blanks += 1
    if line != '':
        if line[0:1] == '#'    : comments += 1
        if line[0:3] == '"""'  : comments += 1
        elif line[-3:] == '"""': comments += 1   #line ends a docstring

        if line[0:1] not in ['#','"']:
            source += 1
            if line[0:3] == 'def' and line[-2:] == '):': functions += 1
            if '.execute' in line: dbexec += 1
            if 'commit()' in line: dbexec += 1
            if 'import' in line  : imports += 1
            if 'print(' in line  : bins3.append('print')
            for bi in bins2:     #usage of a python builtin function
                if bi in line:
                    bins3.append(bi[1:-1])
    total += 1   #every line read counts toward the total

#output
print('imports   : ' + str(imports))
print('source    : ' + str(source))
print('-functions: ' + str(functions))
print('-db exec  : ' + str(dbexec) + 'x')
ctxt = ''
x = [(i,bins3.count(i)) for i in sorted(set(bins3))]
for bi,cnt in x: ctxt += bi + '('+ str(cnt) + '), '
print('-builtins : ' + str(len(bins3)) + 'x [' + ctxt[:-2] + ']')
print('comments  : ' + str(comments))
print('blanks    : ' + str(blanks))
print('Total     : ' + str(total))
-


Re: Any PyQt developers here?

2022-10-27 Thread DFS

On 10/25/2022 1:45 PM, Thomas Passin wrote:

On 10/25/2022 1:03 PM, DFS wrote:

Having problems with removeRow() on a QTableView object.


removeRow() isn't listed as being a method of a QTableView, not even an 
inherited method, so how are you calling removeRow() on it? (See 
https://doc.qt.io/qt-6/qtableview-members.html)


Since you helped me on the last one, maybe you could try to answer a 
couple more [probably simple] roadblocks I'm hitting.



I just wanna set the font to bold/not-bold when clicking on a row in 
QTableView.




With a QTableWidget I do it like this:

font = QFont()
font.setBold(True)   #or False
QTableWidget.item(row,col).setFont(font)



But the QTableView has data/view 'models' attached to it and that syntax 
doesn't work:



Tried:
font = QFont()
font.setBold(True)   #or False
model = QTableView.model()
model.setFont(model.index(row,col), font)

Throws AttributeError: 'QSqlTableModel' object has no attribute 'setFont'


This doesn't throw an error, but doesn't show bold:
model.setData(model.index(tblRow, col), font, Qt.FontRole)


Any ideas?

Thanks


Re: Any PyQt developers here?

2022-10-28 Thread DFS

On 10/27/2022 3:47 PM, Thomas Passin wrote:

On 10/27/2022 11:15 AM, DFS wrote:

On 10/25/2022 1:45 PM, Thomas Passin wrote:

On 10/25/2022 1:03 PM, DFS wrote:

Having problems with removeRow() on a QTableView object.


removeRow() isn't listed as being a method of a QTableView, not even 
an inherited method, so how are you calling removeRow() on it? (See 
https://doc.qt.io/qt-6/qtableview-members.html)


Since you helped me on the last one, maybe you could try to answer a 
couple more [probably simple] roadblocks I'm hitting.



I just wanna set the font to bold/not-bold when clicking on a row in 
QTableView.




With a QTableWidget I do it like this:

font = QFont()
font.setBold(True)   #or False
QTableWidget.item(row,col).setFont(font)



But the QTableView has data/view 'models' attached to it and that 
syntax doesn't work:



Tried:
font = QFont()
font.setBold(True)   #or False
model = QTableView.model()
model.setFont(model.index(row,col), font)

Throws AttributeError: 'QSqlTableModel' object has no attribute 'setFont'


This doesn't throw an error, but doesn't show bold:
model.setData(model.index(tblRow, col), font, Qt.FontRole)


Any ideas?


You definitely need to be setting the font in an item.  I'm not sure but 
I think that your QFont() doesn't have any properties, so it doesn't do 
anything.  I found this bit in a page - it's in C++ instead of Python 
but that doesn't really make a difference except for the exact syntax to 
use -



https://forum.qt.io/topic/70016/qlistview-item-font-stylesheet-not-working/4 



   QVariant v = ModelBaseClass::data(index,role);
   if( condition && role == Qt::FontRole )
   {
       QFont font = v.value<QFont>();
       font.setBold( true );
       v = QVariant::fromValue( font );
   }

IOW, you have to get the font from the item, then set it to bold, which 
you would do with setFont().  Then you set that new font on the item. Of 
course you would have to unset bold on it later. See


https://doc.qt.io/qt-6/qtablewidgetitem.html#font

Instead of "item", you might need to operate on "row".  I didn't look 
into that.  Since a row probably doesn't have just one font (since it 
can have more than one item), you'd still have to get the font from some 
item in the row.


You might also be able to make the item bold using CSS, but I'm not sure.


Thanks


Internet searches are your friend for questions like this.  


Before I posted I spent a couple hours looking online, reading the docs, 
and trying different ways.


I found one person that said they did it but their syntax didn't work. 
But it doesn't throw an error either.


model.setData(model.index(tblRow, col), font, Qt.FontRole)

When I'm done with my app (nearly 2K LOC) I'm going to put a summary out 
there somewhere with a bunch of examples of easy ways to do things.  For 
one thing I wrote zero classes.  Not one.




I've never 
worked with a QTableView, so I had to start with some knowledge about 
some other parts of QT.  I found the first page searching for "qt set 
qtableview row font", and the second searching for "qtablewidgetitem".



I used TableWidgets in 2 apps and no problems.  In this app there's more 
data and more sorting, and one of the TableWidgets took a while to load 
35K rows (7 items per row).  So I tried a TableView.  Incredibly fast - 
4x the speed - but it doesn't have the bolding in place yet.  That could 
slow it down.


As you know, a TableView is tied to the underlying datasource (in my 
case via a QSqlTableModel), but it's much faster to show data than a 
TableWidget, because with the widget you have to populate each cell with 
setItem().


The Widget is slower but easier to work with.  So it's a tradeoff.


And I think I found some bugs in the TableViews.  The Views have 
editStrategies() that control how data is updated (if the model supports 
editing), but they don't work the way the docs say they do.


In my app, when I click on a row a flag field is changed from N to Y 
onscreen (well, it's hidden but it's in the row).


model.setData(model.index(row,7), 'Y')


OnFieldChange  : all changes to the model will be applied immediately to 
the database.

model.setEditStrategy(QSqlTableModel.OnFieldChange)

Doesn't work right.  The screen is updated the first row you click on, 
but the db isn't updated until you reload the view.



OnRowChange: changes to a row will be applied when the user selects 
a different row.

model.setEditStrategy(QSqlTableModel.OnRowChange)

Doesn't work right.  The screen is updated the first row you click on, 
but the db isn't updated until you reload the view.



OnManualSubmit : all changes will be cached in the model until either 
submitAll() or revertAll() is called.

model.setEdi

What's tkinter doing in \Lib\site-packages\future\moves ?

2022-11-07 Thread DFS

3.9.13



Re: What's tkinter doing in \Lib\site-packages\future\moves ?

2022-11-08 Thread DFS

On 11/7/2022 10:48 PM, DFS wrote:

3.9.13



Never mind.  User error - I didn't install it in the first place.







Re: Need max values in list of tuples, based on position

2022-11-11 Thread DFS

On 11/11/2022 12:49 PM, Dennis Lee Bieber wrote:

On Fri, 11 Nov 2022 02:22:34 -0500, DFS  declaimed the
following:



[(0,11), (1,1),  (2,1),
  (0,1) , (1,41), (2,2),
  (0,9) , (1,3),  (2,12)]

The set of values in elements[0] is {0,1,2}

I want the set of max values in elements[1]: {11,41,12} 


Do they have to be IN THAT ORDER?


Yes.


data = [(0,11), (1,1),  (2,1), (0,1) , (1,41), (2,2), (0,9) , (1,3),  (2,12)]
reshape = list(zip(*data))
result = sorted(reshape[1])[-3:]
result

[11, 12, 41]







Need max values in list of tuples, based on position

2022-11-11 Thread DFS



[(0,11), (1,1),  (2,1),
 (0,1) , (1,41), (2,2),
 (0,9) , (1,3),  (2,12)]

The set of values in elements[0] is {0,1,2}

I want the set of max values in elements[1]: {11,41,12} 





Re: Need max values in list of tuples, based on position

2022-11-11 Thread DFS

On 11/11/2022 7:50 AM, Stefan Ram wrote:

Pancho  writes:

def build_max_dict( tups):
    dict = {}
    for (a,b) in tups:
        if (a in dict):
            if (b>dict[a]):
                dict[a]=b
        else:
            dict[a]=b
    return(sorted(dict.values()))


   Or,

import itertools
import operator

def build_max_dict( tups ):
 key = operator.itemgetter( 0 )
 groups = itertools.groupby( sorted( tups, key=key ), key )
 return set( map( lambda x: max( x[ 1 ])[ 1 ], groups ))


FYI, neither of those solutions works:

Pancho: 11, 12, 41
You   : 41, 11, 12

The answer I'm looking for is 11,41,12


Maybe a list of tuples with the same info presented differently would be 
easier to tackle:


orig:
[(0, 11), (1, 1),  (2, 1),
 (0, 1),  (1, 41), (2, 2),
 (0, 9),  (1, 3),  (2, 12)]

new: [(11,1,1),
  (1,41,2),
  (9,3,12)]

I'm still looking for the max value in each position across all the 
tuples in the list, so the answer is still 11,41,12.



Edit: found a solution online:
-
x = [(11,1,1),(1,41,2),(9,3,12)]
maxvals = [0]*len(x[0])
for e in x:
    maxvals = [max(w,int(c)) for w,c in zip(maxvals,e)]
print(maxvals)
[11, 41, 12]
-

So now the challenge is making it a one-liner!
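One possible one-liner for the transposed form, assuming every tuple has the same length: zip(*x) turns the rows into columns, and max() then runs once per column.

```python
x = [(11, 1, 1), (1, 41, 2), (9, 3, 12)]

# zip(*x) yields (11, 1, 9), (1, 41, 3), (1, 2, 12): one tuple per position
maxvals = [max(col) for col in zip(*x)]
print(maxvals)   # [11, 41, 12]
```

Unlike the loop, this needs no starting list of zeros, so it also behaves for negative values.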


Thanks


Re: Need max values in list of tuples, based on position

2022-11-11 Thread DFS

On 11/11/2022 2:22 PM, Pancho wrote:

On 11/11/2022 18:53, DFS wrote:

On 11/11/2022 12:49 PM, Dennis Lee Bieber wrote:

On Fri, 11 Nov 2022 02:22:34 -0500, DFS  declaimed the
following:



[(0,11), (1,1),  (2,1),
  (0,1) , (1,41), (2,2),
  (0,9) , (1,3),  (2,12)]

The set of values in elements[0] is {0,1,2}

I want the set of max values in elements[1]: {11,41,12}


Do they have to be IN THAT ORDER?


Yes.

Sets aren't ordered, which is why I gave my answer as a list. A wrongly 
ordered list, but I thought it rude to point out my own error, as no one 
else had. :-)


Assuming you want numeric order of element[0], rather than first 
occurrence order of the element[0] in the original tuple list. In this 
example, they are both the same.


Here is a corrected version

from collections import OrderedDict
def build_max_dict( tups):
    dict =  OrderedDict()
    for (a,b) in tups:
        if (a in dict):
            if (b>dict[a]):
                dict[a]=b
        else:
            dict[a]=b
    return(dict.values())

This solution gives the answer as type odict_values. I'm not quite sure 
what this type is, but it seems to be a sequence/iterable/enumerable 
type, whatever the word is in Python.
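For what it's worth, odict_values is a view onto the dictionary's values; wrapping it in list() gives an ordinary list. Since Python 3.7 a plain dict also keeps insertion order, so a sketch of the same idea without OrderedDict might look like this (max_by_key is a made-up name):

```python
def max_by_key(pairs):
    """Largest second element per first element, in first-occurrence order of the keys."""
    best = {}
    for key, value in pairs:
        if key not in best or value > best[key]:
            best[key] = value
    return list(best.values())   # list() turns the values view into a real list

data = [(0, 11), (1, 1), (2, 1),
        (0, 1), (1, 41), (2, 2),
        (0, 9), (1, 3), (2, 12)]
print(max_by_key(data))   # [11, 41, 12]
```

Note the order is that of first occurrence of each key, which only matches numeric order when the keys first appear sorted, as they do here.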


Caveat: I know very little about Python.



Thanks for looking at it.  I'm trying to determine the maximum length of 
each column result in a SQL query.  Normally you can use the 3rd value 
of the cursor.description object (see the DB-API spec), but apparently 
not with my dbms (SQLite).  The 'display_size' column is None with 
SQLite.  So I had to resort to another way.






Re: Need max values in list of tuples, based on position

2022-11-11 Thread DFS

On 11/11/2022 7:04 PM, Dennis Lee Bieber wrote:

On Fri, 11 Nov 2022 15:03:49 -0500, DFS  declaimed the
following:



Thanks for looking at it.  I'm trying to determine the maximum length of
each column result in a SQL query.  Normally you can use the 3rd value
of the cursor.description object (see the DB-API spec), but apparently
not with my dbms (SQLite).  The 'display_size' column is None with
SQLite.  So I had to resort to another way.


Not really a surprise. SQLite doesn't really have column widths --


As I understand it, the cursor.description doesn't look at the column 
type - it goes by the data in the cursor.




since any column can store data of any type; affinities just drive it into
what may be the optimal storage for the column... That is, if a column is
"INT", SQLite will attempt to convert whatever the data is into an integer
-- but if the data is not representable as an integer, it will be stored as
the next best form.


Yeah, I don't know why cursor.description doesn't work with SQLite; all 
their columns are basically varchars.
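A quick way to see what SQLite actually stores is its typeof() function. This is only a sketch against an in-memory database; the table t and column v are made up for the test, and the five sample values are the ones discussed in this thread.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (v INT)")   # "INT" gives the column INTEGER affinity
samples = [123, "123", 123.0, 123.5, "one two three"]
con.executemany("insert into t values (?)", [(s,) for s in samples])

# typeof() reports the storage class SQLite chose for each stored value
stored = con.execute("select v, typeof(v) from t order by rowid").fetchall()
for value, kind in stored:
    print(repr(value), kind)
con.close()
```

So the columns are not quite all varchars: each value carries its own storage class, and the declared type only nudges the conversion.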




123 => stored as integer
"123" => converted and stored as integer
123.0   => probably converted to integer
123.5   => likely stored as numeric/double
"one two three"   => can't convert, store it as a string

	We've not seen the SQL query in question, 



The query is literally any SELECT, any type of data, including SELECT *. 
 The reason it works with SELECT * is the cursor.description against 
SQLite DOES give the column names:


select * from timezone;
print(cur.description)
(
('TIMEZONE', None, None, None, None, None, None),
('TIMEZONEDESC', None, None, None, None, None, None),
('UTC_OFFSET',   None, None, None, None, None, None)
)

(I lined up the data)
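The same thing can be reproduced with an in-memory database. This sketch assumes a timezone table with three text columns (the real column types aren't shown in this thread) and checks that the sqlite3 module fills in only the name of each 7-tuple.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# column types are an assumption; only the names come from the output above
con.execute("create table timezone (TIMEZONE text, TIMEZONEDESC text, UTC_OFFSET text)")
cur = con.execute("select * from timezone")

names = [col[0] for col in cur.description]
only_names = all(col[1:] == (None,) * 6 for col in cur.description)
print(names)
print(only_names)
con.close()
```

The description is available right after execute(), before any row is fetched, which is why it works for any SELECT.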


Anyway, I got it working nicely, with the help of the solution I found 
online and posted here earlier:


-
x = [(11,1,1),(1,41,2),(9,3,12)]
maxvals = [0]*len(x[0])
for e in x:
    #clp example using only ints
    maxvals = [max(w,int(c)) for w,c in zip(maxvals,e)]

    #real world - get the length of the data string, even if all numeric
    #maxvals = [max(w,len(str(c))) for w,c in zip(maxvals,e)]

print(maxvals)
[11, 41, 12]
-

Applied to real data, the iterations might look like this:

[4, 40, 9]
[4, 40, 9]
[4, 40, 9]
[4, 40, 18]
[4, 40, 18]
[4, 40, 18]
[5, 40, 18]
[5, 40, 18]
[5, 40, 18]
[5, 69, 18]
[5, 69, 18]
[5, 69, 18]

The last row contains the max width of the data in each column.

Then I compare those data widths to the column name widths, and take the 
wider of the two, so [5,69,18] might change to [8,69,18] if the column 
label is wider than the widest bit of data in the column.


Finally I convert those final widths into a print format string, and 
everything fits well: each column is perfectly sized and it all looks 
pleasing to the eye (and no external libs like tabulate used either).


https://imgur.com/UzO3Yhp
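A minimal sketch of that two-pass approach, using a throwaway in-memory table; the table t and its rows are invented for illustration and are not my real data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (id integer, name text)")
con.executemany("insert into t values (?, ?)",
                [(1, "alpha"), (22, "be"), (333, None)])

cur = con.execute("select id, name from t")
headers = [col[0] for col in cur.description]
rows = cur.fetchall()

# pass 1: start from the header widths, then widen to the longest value per column
widths = [len(h) for h in headers]
for row in rows:
    widths = [max(w, len(str(c))) for w, c in zip(widths, row)]

# pass 2: turn the widths into one format string and print everything
fmt = "  ".join("{:<" + str(w) + "}" for w in widths)
print(fmt.format(*headers))
for row in rows:
    print(fmt.format(*(str(c) for c in row)))
con.close()
```

Seeding widths with the header lengths folds the "take the wider of the two" comparison into the first pass.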


The 'downside' is you have to fully iterate the data twice: once to get 
the widths, then again to print it.


If I get a wild hair I might create a PostgreSQL clone of my db and see 
if the cursor.description works with it.  It would also have to iterate 
the data to determine that 'display_size' value.


https://peps.python.org/pep-0249/#cursor-attributes




> but it might suffice to use a
> second (first?) SQL query with aggregate (untested)
>
>max(length(colname))
>
> for each column in the main SQL query.


Might be a pain to code dynamically.
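It may be less painful than it sounds. Here is a rough sketch, assuming the main query can be wrapped as a subquery: read the column names from cursor.description, then generate one max(length(...)) expression per column. The table t and the query string are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (id integer, name text)")
con.executemany("insert into t values (?, ?)",
                [(1, "alpha"), (22, "be"), (333, "c")])

query = "select id, name from t"   # stands in for any SELECT

# LIMIT 0 returns no rows but still fills in cursor.description
cols = [c[0] for c in con.execute(f"select * from ({query}) limit 0").description]

# double-quote each name (escaping embedded quotes) so odd column names still work
exprs = ", ".join('max(length("{}"))'.format(c.replace('"', '""')) for c in cols)
widths = con.execute(f"select {exprs} from ({query})").fetchone()
print(dict(zip(cols, widths)))
con.close()
```

This still runs the query twice, so it trades the second Python loop for a second trip to the database.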





"""
length(X)

 For a string value X, the length(X) function returns the number of
characters (not bytes) in X prior to the first NUL character. Since SQLite
strings do not normally contain NUL characters, the length(X) function will
usually return the total number of characters in the string X. For a blob
value X, length(X) returns the number of bytes in the blob. If X is NULL
then length(X) is NULL. If X is numeric then length(X) returns the length
of a string representation of X.
"""

Note the last sentence for numerics.




Thanks for looking at it.



Re: Need max values in list of tuples, based on position

2022-11-13 Thread DFS

On 11/13/2022 7:37 AM, Pancho wrote:

On 11/11/2022 19:56, DFS wrote:


Edit: found a solution online:
-
x = [(11,1,1),(1,41,2),(9,3,12)]
maxvals = [0]*len(x[0])
for e in x:
 maxvals = [max(w,int(c)) for w,c in zip(maxvals,e)]
print(maxvals)
[11,41,12]
-

So now the challenge is making it a one-liner!



  import functools
  x = [(11,1,1),(1,41,2),(9,3,12)]
  print(functools.reduce( lambda a,b : [max(w,c) for w,c in zip(a,b)],
     x, [0]*len(x[0])))


noice!
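
Another way to get the same result, offered as a sketch rather than as what anyone in the thread used: zip(*x) transposes the rows into columns, so max() can be applied to each column directly.

```python
x = [(11, 1, 1), (1, 41, 2), (9, 3, 12)]

# zip(*x) yields one tuple per position: (11, 1, 9), (1, 41, 3), (1, 2, 12)
maxvals = [max(col) for col in zip(*x)]
print(maxvals)
```

Unlike the reduce version, this needs no starting list of zeros, so it also behaves sensibly for negative values.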

--
https://mail.python.org/mailman/listinfo/python-list


In code, list.clear doesn't throw error - it's just ignored

2022-11-13 Thread DFS

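In code, list.clear is just ignored.
At the terminal, list.clear shows the method object:
<built-in method clear of list object at 0x...>


in code:
x = [1,2,3]
x.clear
print(len(x))
3

at terminal:
x = [1,2,3]
x.clear
<built-in method clear of list object at 0x...>
print(len(x))
3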


Caused me an hour of frustration before I noticed list.clear() was what 
I needed.


x = [1,2,3]
x.clear()
print(len(x))
0

--
https://mail.python.org/mailman/listinfo/python-list


Re: In code, list.clear doesn't throw error - it's just ignored

2022-11-13 Thread DFS

On 11/13/2022 5:20 PM, Jon Ribbens wrote:

On 2022-11-13, DFS  wrote:

In code, list.clear is just ignored.
At the terminal, list.clear shows



in code:
x = [1,2,3]
x.clear
print(len(x))
3

at terminal:
x = [1,2,3]
x.clear

print(len(x))
3


Caused me an hour of frustration before I noticed list.clear() was what
I needed.

x = [1,2,3]
x.clear()
print(len(x))
0


If you want to catch this sort of mistake automatically then you need
a linter such as pylint:

   $ cat test.py
   """Create an array and print its length"""

   array = [1, 2, 3]
   array.clear
   print(len(array))
   $ pylint -s n test.py
   * Module test
   test.py:4:0: W0104: Statement seems to have no effect (pointless-statement)



Thanks, I should use linters more often.

But why is it allowed in the first place?

I stared at list.clear and surrounding code a dozen times and said 
"Looks right!  Why isn't it clearing the list?!?!"


2 parens later and I'm golden!






--
https://mail.python.org/mailman/listinfo/python-list


Re: In code, list.clear doesn't throw error - it's just ignored

2022-11-13 Thread DFS

On 11/13/2022 9:11 PM, Chris Angelico wrote:

On Mon, 14 Nov 2022 at 11:53, DFS  wrote:


On 11/13/2022 5:20 PM, Jon Ribbens wrote:

On 2022-11-13, DFS  wrote:

In code, list.clear is just ignored.
At the terminal, list.clear shows



in code:
x = [1,2,3]
x.clear
print(len(x))
3

at terminal:
x = [1,2,3]
x.clear

print(len(x))
3


Caused me an hour of frustration before I noticed list.clear() was what
I needed.

x = [1,2,3]
x.clear()
print(len(x))
0


If you want to catch this sort of mistake automatically then you need
a linter such as pylint:

$ cat test.py
"""Create an array and print its length"""

array = [1, 2, 3]
array.clear
print(len(array))
$ pylint -s n test.py
* Module test
test.py:4:0: W0104: Statement seems to have no effect (pointless-statement)



Thanks, I should use linters more often.

But why is it allowed in the first place?

I stared at list.clear and surrounding code a dozen times and said
"Looks right!  Why isn't it clearing the list?!?!"

2 parens later and I'm golden!



No part of it is invalid, so nothing causes a problem. For instance,
you can write this:



If it wastes time like that it's invalid.

This is an easy check for the interpreter to make.

If I submit a suggestion to [email protected] will it just show up 
here?  Or do the actual Python devs intercept it?








1


And you can write this:


1 + 2


And you can write this:


print(1 + 2)


But only one of those is useful in a script. Should the other two be
errors? No. But linters WILL usually catch them, so if you have a good
linter (especially built into your editor), you can notice these
things.
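
To illustrate the kind of check a linter makes, here is a small sketch using the stdlib ast module. It is not how pylint is implemented; the function name pointless_statements is made up, and it only flags expression statements that are a bare name or attribute reference.

```python
import ast

def pointless_statements(source):
    """Return line numbers of statements that are just a name or attribute."""
    tree = ast.parse(source)
    return [node.lineno
            for node in ast.walk(tree)
            if isinstance(node, ast.Expr)
            and isinstance(node.value, (ast.Name, ast.Attribute))]

sample = "x = [1, 2, 3]\nx.clear\nprint(len(x))\n"
flagged = pointless_statements(sample)
print(flagged)
```

The call print(len(x)) is also an expression statement, but its value is a Call node, so it is not flagged.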



ran pylint against it and got 0.0/10.


--disable=
invalid-name
multiple-statements
bad-indentation
line-too-long
trailing-whitespace
missing-module-docstring
missing-function-docstring
too-many-lines
fixme


and got 8.9/10.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.11.0 installation and Tkinter does not work

2022-11-21 Thread DFS

On 11/21/2022 12:59 PM, [email protected] wrote:






Dear list,

I want learn python for 4 weeks and have problems, installing Tkinter. If I 
installed 3.11.0 for my windows 8.1 from python.org and type

   >>> import _tkinter
   > Traceback (most recent call last):
   >    File "", line 1, in 
   > ImportError: DLL load failed while importing _tkinter: Das angegebene
   > Modul wurde nicht gefunden.

   > So I it is a tkinter Problem and I tried this:

  >  >>> import _tkinter
  > Traceback (most recent call last):
   >    File "", line 1, in 
   > ImportError: DLL load failed while importing _tkinter: Das angegebene
   > Modul wurde nicht gefunden.

How can I fix this and make it work?



When installing Python 3.11.0 did you check the box "tcl/tk and IDLE"? 
(it's an option on the Python Windows installer).



I made sure to do that, and then this worked:

import time
import tkinter
from tkinter import filedialog as fd
from tkinter.filedialog import askopenfilename
filename = fd.askopenfilename()
print(filename)

foldername = fd.askdirectory()
print(foldername)
time.sleep(3)


--
https://mail.python.org/mailman/listinfo/python-list


Re: Vb6 type to python

2022-11-30 Thread DFS

On 11/30/2022 6:56 AM, [email protected] wrote:


Hello i have a byte file, that fill a vb6 type like:
Type prog_real
 codice As String * 12'hsg
 denom  As String * 24'oo
 codprof As String * 12   'ljio
 note As String * 100
 programmer As String * 11
 Out As Integer
 b_out As Byte'TRUE = Sec   FALSE= mm
 asse_w As Byte   '3.zo Asse --> 0=Z  1=W
 numpassi  As Integer 'put
 len As Long  'leng
 p(250) As passo_pg
 vd(9) As Byte'vel.
 qUscita(9) As Integer'quote
 l_arco As Long   'reserved
 AxDin As Byte'dime
End Type

How i can convert to python



You don't need to declare variable types in Python.

I don't do Python OO so someone else can answer better, but a simple 
port of your VB type would be a python class definition:


class prog_real:
     codice, denom, codprof, note, programmer
     AxDin, b_out, asse_w, vd, Out, numpassi, qUscita
     len, l_arco, p

important: at some point you'll have trouble with a variable named 
'len', which is a Python built-in function.


For a visual aid you could label the variables by type and assign an 
initial value, if that helps you keep track in your mind.


class prog_real:
    # strings
    codice, denom, codprof, note, programmer = '', '', '', '', ''

    # bytes
    AxDin, b_out, asse_w, vd = 0, 0, 0, 0

    # ints
    Out, numpassi, qUscita = 0, 0, 0

    # longs
    len, l_arco = 0, 0

    # misc
    p = ''

But it's not necessary.

To restrict the range of values in the variables you would have to 
manually check them each time before or after they change, or otherwise 
force some kind of error/exception that occurs when the variable 
contains data you don't want.
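
One way to do that kind of check is a property with a setter. This is only a sketch: the class name ProgReal is made up, and the 0..255 range is an assumption based on b_out being a VB6 Byte.

```python
class ProgReal:
    def __init__(self):
        self._b_out = 0

    @property
    def b_out(self):
        return self._b_out

    @b_out.setter
    def b_out(self, value):
        # Reject anything a VB6 Byte could not hold (assumed range 0..255).
        if not isinstance(value, int) or not 0 <= value <= 255:
            raise ValueError("b_out must be an integer in 0..255")
        self._b_out = value

rec = ProgReal()
rec.b_out = 200        # accepted
try:
    rec.b_out = 256    # rejected
except ValueError as exc:
    print(exc)
```

The check runs on every assignment, so callers do not have to remember to validate.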



# assign values
prog_real.codice = 'ABC'
print('codice: ' + prog_real.codice)
prog_real.codice = 'DEF'
print('codice: ' + prog_real.codice)
prog_real.codice = 123
print('codice: ' + str(prog_real.codice))


And as shown in the last 2 lines, a variable can accept any type of 
data, even after it's been initialized with a different type.


b = 1
print(type(b))

b = 'ABC'
print(type(b))



Python data types:
https://www.digitalocean.com/community/tutorials/python-data-types

A VB to python program:
https://vb2py.sourceforge.net
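
If the goal is to read the existing byte file, the stdlib struct module may be closer to what is needed. The sketch below rests on several assumptions that should be checked against a real file: the record was written packed and little-endian, fixed-length strings are single-byte text (cp1252 is a guess), Integer is 16 bits and Long is 32 bits. Only the fields up to len are parsed, because the layout of passo_pg is not given. HEADER, FIELDS and parse_header are names invented for this example.

```python
import struct

# Assumed layout of the leading fields of prog_real (no padding, little-endian).
HEADER = struct.Struct("<12s24s12s100s11shBBhi")
FIELDS = ("codice", "denom", "codprof", "note", "programmer",
          "Out", "b_out", "asse_w", "numpassi", "len")

def parse_header(data):
    values = HEADER.unpack_from(data, 0)
    record = dict(zip(FIELDS, values))
    for key, value in record.items():
        if isinstance(value, bytes):
            # Fixed-length strings are padded; strip spaces and NULs.
            record[key] = value.decode("cp1252").rstrip(" \x00")
    return record

# Fake record built here only to exercise the parser.
sample = HEADER.pack(b"ABC".ljust(12), b"demo".ljust(24), b"P1".ljust(12),
                     b"".ljust(100), b"me".ljust(11), 7, 1, 0, 3, 1234)
record = parse_header(sample)
print(record["codice"], record["numpassi"], record["len"])
```

With a real file, data would come from open(path, "rb").read(), and the remaining fields would need their own format once passo_pg is known.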

--
https://mail.python.org/mailman/listinfo/python-list


Re: Vb6 type to python

2022-11-30 Thread DFS

On 11/30/2022 1:07 PM, DFS wrote:

On 11/30/2022 6:56 AM, [email protected] wrote:




I don't do Python OO so someone else can answer better, but a simple 
port of your VB type would be a python class definition:


class prog_real:
     codice, denom, codprof, note, programmer
     AxDin, b_out, asse_w, vd, Out, numpassi, qUscita
     len, l_arco, p



Sorry for bad advice - that won't work.  The other class definition that 
initializes the variables does work:


class prog_real:
    # strings
    codice, denom, codprof, note, programmer = '', '', '', '', ''

    # bytes
    AxDin, b_out, asse_w, vd = 0, 0, 0, 0

    # ints
    Out, numpassi, qUscita = 0, 0, 0

    # longs
    len, l_arco = 0, 0

    # misc
    p = ''
--
https://mail.python.org/mailman/listinfo/python-list


Python is maybe the most widely used language, but clp gets 0 posts some days?

2022-12-02 Thread DFS

Usenet is dead.  Long live Usenet.
--
https://mail.python.org/mailman/listinfo/python-list


Re: New computer, new Python

2022-12-09 Thread DFS

On 12/9/2022 12:13 PM, [email protected] wrote:


  
Hello.  I've downloaded the new Python to my new Computer,  and the new Python mystifies me.
  
Instead of an editor, it looks like a Dos executable program.


python.exe is a Windows executable.




How can I write my own Python Functions and subroutines in the new Python?


Open a text editor and write your own functions and subs.  Save the file 
as prog.py.


From the command line (not from inside the Python shell), type:

$ python prog.py
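
For example, prog.py could look something like this. The function add is just a placeholder to show the shape of a file with your own functions in it.

```python
# prog.py
def add(a, b):
    """Return the sum of two numbers."""
    return a + b

def main():
    print(add(2, 3))

# Run main() only when the file is executed, not when it is imported.
if __name__ == "__main__":
    main()
```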




It is version 3.11 (64 bit).


The latest and greatest.  Significantly sped up vs 3.10.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Does one have to use curses to read single characters from keyboard?

2022-12-11 Thread DFS

On 12/11/2022 5:09 AM, Chris Green wrote:

Is the only way to read single characters from the keyboard to use
curses.cbreak() or curses.raw()?  If so how do I then read characters,
it's not at all obvious from the curses documentation as that seems to
think I'm using a GUI in some shape or form.

All I actually want to do is get 'Y' or 'N' answers to questions on
the command line.

Searching for ways to do this produces what seem to me rather clumsy
ways of doing it.



resp = ''
# compare against a tuple: '' and 'yn' are both "in" the string 'yn'
while resp.lower() not in ('y', 'n'):
    resp = input("Did you say Y or did you say N?: ")
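
If a true single keypress (no Enter) is wanted without curses, one possible sketch uses msvcrt on Windows and termios/tty elsewhere. It assumes an interactive terminal; read_key and ask_yes_no are names invented here, and the reader argument exists only so the loop can be exercised without a keyboard.

```python
import sys

def read_key():
    """Read one keypress without waiting for Enter."""
    try:
        import msvcrt              # Windows only
        return msvcrt.getwch()
    except ImportError:
        import termios             # POSIX only
        import tty
        fd = sys.stdin.fileno()
        old = termios.tcgetattr(fd)
        try:
            tty.setcbreak(fd)      # deliver characters immediately, no echo
            return sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old)

def ask_yes_no(prompt, reader=read_key):
    print(prompt, end=" ", flush=True)
    while True:
        key = reader().lower()
        if key in ("y", "n"):
            print(key)
            return key == "y"
```

Calling ask_yes_no("Continue? [y/n]") returns True for Y and False for N, and ignores any other key.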




--
https://mail.python.org/mailman/listinfo/python-list


Re: How to get the needed version of a dependency

2022-12-14 Thread DFS

On 12/14/2022 3:55 AM, Cecil Westerhof wrote:

If I want to know the dependencies for requests I use:
 pip show requests

And one of the lines I get is:
 Requires: certifi, charset-normalizer, idna, urllib3

But I want (in this case) to know with version of charset-normalizer
requests needs.
How do I get that?


Check the METADATA file in the package's *.dist-info directory, usually 
found in Lib\site-packages.


e.g.  \Python\3.11.0\Lib\site-packages\pandas-1.5.2.dist-info

Look for lines beginning with 'Requires':

Requires-Python: >=3.8
Requires-Dist: python-dateutil (>=2.8.1)

$ pip list will show you which version of the package you have 
installed, so you can search for the matching .dist-info file
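
The same METADATA can also be read from Python with importlib.metadata (stdlib since 3.8). A small sketch, with requirement_for as a made-up helper name; the reqs argument is only there so the filter can be tried on a hand-written list.

```python
import re
from importlib.metadata import PackageNotFoundError, requires

def _normalize(name):
    # Treat '-', '_' and '.' as equivalent, as package indexes do.
    return re.sub(r"[-_.]+", "-", name).lower()

def requirement_for(package, dependency, reqs=None):
    """Return the requirement strings of `package` that name `dependency`."""
    if reqs is None:
        try:
            reqs = requires(package) or []
        except PackageNotFoundError:
            reqs = []
    found = []
    for req in reqs:
        match = re.match(r"[A-Za-z0-9._-]+", req)
        if match and _normalize(match.group(0)) == _normalize(dependency):
            found.append(req)
    return found

print(requirement_for("requests", "charset-normalizer"))
```

If requests is installed, this prints the Requires-Dist entry for charset-normalizer, version specifier included; otherwise it prints an empty list.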

--
https://mail.python.org/mailman/listinfo/python-list


Re: Fwd: Installation hell

2022-12-19 Thread DFS

On 12/18/2022 6:50 AM, Jim Lewis wrote:

I'm an occasional user of Python and have a degree in computer science.
Almost every freaking time I use Python, I go through PSH (Python Setup
Hell). Sometimes a wrong version is installed. Sometimes it's a path issue.
Or exe naming confusion: python, python3, phthon311, etc. Or library
compatibility issues - took an hour to find out that pygame does not work
with the current version of python. Then the kludgy PIP app and using a DOS
box under Windows with command prompts which is ridiculous. God only knows
how many novice users of the language (or even intermediate users) were
lost in the setup process. Why not clean the infrastructure up and make a
modern environment or IDE or something better than it is now. Or at least
good error messages that explain exactly what to do. Even getting this
email to the list took numerous steps.

-- A frustrated user



Issues installing python and sending an email?

Ask for a refund on your compsci degree.
--
https://mail.python.org/mailman/listinfo/python-list


Connecting python to DB2 database

2021-09-02 Thread DFS

Having a problem with the DB2 connector

test.py

import ibm_db_dbi
connectstring = 
'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'

conn = ibm_db_dbi.connect(connectstring,'','')

curr  = conn.cursor
print(curr)

cSQL = "SELECT * FROM TEST"
curr.execute(cSQL)
rows = curr.fetchall()
print(len(rows))


$python test.py

Traceback (most recent call last):
  File "temp.py", line 9, in 
curr.execute(cSQL)
AttributeError: 'function' object has no attribute 'execute'


The ibm_db_dbi library supposedly adheres to PEP 249 (DB-API Spec 2.0), 
but it ain't happening here.



Googling got me nowhere.  Any ideas?

python 3.8.2 on Windows 10
pip install ibm_db
--
https://mail.python.org/mailman/listinfo/python-list


Re: Connecting python to DB2 database

2021-09-03 Thread DFS

On 9/3/2021 1:47 AM, Chris Angelico wrote:

On Fri, Sep 3, 2021 at 3:42 PM DFS  wrote:


Having a problem with the DB2 connector

test.py

import ibm_db_dbi
connectstring =
'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'
conn = ibm_db_dbi.connect(connectstring,'','')

curr  = conn.cursor
print(curr)


According to PEP 249, what you want is conn.cursor() not conn.cursor.

I'm a bit surprised as to the repr of that function though, which
seems to be this line from your output:



I'd have expected it to say something like "method cursor of
Connection object", which would have been an immediate clue as to what
needs to be done. Not sure why the repr is so confusing, and that
might be something to report upstream.

ChrisA



Thanks.  I must've done it right, using conn.cursor(), 500x. 
Bleary-eyed from staring at code too long I guess.
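
For anyone who hits the same error, a short illustration of the difference. It uses the stdlib sqlite3 module as a stand-in because it follows the same DB-API pattern; nothing here is specific to ibm_db_dbi.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

method = conn.cursor      # no parentheses: just a reference to the method
cursor = conn.cursor()    # parentheses: an actual cursor object

has_execute_method = hasattr(method, "execute")   # the method has no .execute
has_execute_cursor = hasattr(cursor, "execute")   # the cursor does

cursor.execute("SELECT 1")
row = cursor.fetchone()
print(has_execute_method, has_execute_cursor, row)
```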


Now can you get DB2 to accept ; as a SQL statement terminator like the 
rest of the world?   They call it "An unexpected token"...


--
https://mail.python.org/mailman/listinfo/python-list


Help me split a string into elements

2021-09-04 Thread DFS

Typical cases:
 lines = [('one\ntwo\nthree\n')]
 print(str(lines[0]).splitlines())
 ['one', 'two', 'three']

 lines = [('one two three\n')]
 print(str(lines[0]).split())
 ['one', 'two', 'three']


That's the result I'm wanting, but I get data in a slightly different 
format:


lines = [('one\ntwo\nthree\n',)]

Note the comma after the string data, but inside the paren. 
splitlines() doesn't work on it:


print(str(lines[0]).splitlines())
["('one\\ntwo\\nthree\\n',)"]


I've banged my head enough - can someone spot an easy fix?

Thanks
--
https://mail.python.org/mailman/listinfo/python-list


Re: Help me split a string into elements

2021-09-04 Thread DFS

On 9/4/2021 5:55 PM, DFS wrote:

Typical cases:
  lines = [('one\ntwo\nthree\n')]
  print(str(lines[0]).splitlines())
  ['one', 'two', 'three']

  lines = [('one two three\n')]
  print(str(lines[0]).split())
  ['one', 'two', 'three']


That's the result I'm wanting, but I get data in a slightly different 
format:


lines = [('one\ntwo\nthree\n',)]

Note the comma after the string data, but inside the paren. splitlines() 
doesn't work on it:


print(str(lines[0]).splitlines())
["('one\\ntwo\\nthree\\n',)"]


I've banged my head enough - can someone spot an easy fix?

Thanks



I got it:

lines = [('one\ntwo\nthree\n',)]
print(str(lines[0][0]).splitlines())
['one', 'two', 'three']
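
The trailing comma makes ('one\ntwo\nthree\n',) a one-element tuple, and str() of a tuple gives its repr, which is why splitlines() saw a single line. A sketch of the same fix using tuple unpacking instead of [0]:

```python
lines = [('one\ntwo\nthree\n',)]

# Unpack the single element of the tuple; fails loudly if there is more than one.
(text,) = lines[0]
parts = text.splitlines()
print(parts)
```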

--
https://mail.python.org/mailman/listinfo/python-list


Re: Connecting python to DB2 database

2021-09-04 Thread DFS

On 9/3/2021 9:50 AM, Chris Angelico wrote:

On Fri, Sep 3, 2021 at 11:37 PM DFS  wrote:


On 9/3/2021 1:47 AM, Chris Angelico wrote:

On Fri, Sep 3, 2021 at 3:42 PM DFS  wrote:


Having a problem with the DB2 connector

test.py

import ibm_db_dbi
connectstring =
'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'
conn = ibm_db_dbi.connect(connectstring,'','')

curr  = conn.cursor
print(curr)


According to PEP 249, what you want is conn.cursor() not conn.cursor.

I'm a bit surprised as to the repr of that function though, which
seems to be this line from your output:



I'd have expected it to say something like "method cursor of
Connection object", which would have been an immediate clue as to what
needs to be done. Not sure why the repr is so confusing, and that
might be something to report upstream.

ChrisA



Thanks.  I must've done it right, using conn.cursor(), 500x.
Bleary-eyed from staring at code too long I guess.


Cool cool! Glad that's working.


Now can you get DB2 to accept ; as a SQL statement terminator like the
rest of the world?   They call it "An unexpected token"...



Hmm, I don't know that the execute() method guarantees to allow
semicolons. Some implementations will strip a trailing semi, but they
usually won't allow interior ones, because that's a good way to worsen
SQL injection vulnerabilities. It's entirely possible - and within the
PEP 249 spec, I believe - for semicolons to be simply rejected.



The default in the DB2 'Command Line Plus' tool is semicolons aren't 
"allowed".



db2 => connect to SAMPLE

db2 => SELECT COUNT(*) FROM STAFF;
SQL0104N  An unexpected token ";" was found following "COUNT(*) FROM STAFF".
Expected tokens may include:  "END-OF-STATEMENT".  SQLSTATE=42601

db2 => SELECT COUNT(*) FROM STAFF
1
---
 35
  1 record(s) selected.



But I should've known you can set the terminator value:

https://www.ibm.com/docs/en/db2/11.1?topic=clp-options

Option :  -t
Description:  This option tells the command line processor to use a
  semicolon (;) as the statement termination character. 
Default:  OFF


$ db2 -t

turns it on in CommandLinePlus - and the setting applies to the DB-API 
code too.

--
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-15 Thread DFS

On 9/15/2021 12:23 PM, Mostowski Collapse wrote:

I really wonder why my Python implementation
is a factor 40 slower than my JavaScript implementation.
Structurally its the same code.

You can check yourself:

Python Version:
https://github.com/jburse/dogelog-moon/blob/main/devel/runtimepy/machine.py

JavaScript Version:
https://github.com/jburse/dogelog-moon/blob/main/devel/runtime/machine.js

Its the same while, if-then-else, etc.. its the same
classes Variable, Compound etc.. Maybe I could speed
it up by some details. For example to create an array
of length n, I use in Python:

   temp = [NotImplemented] * code[pos]
   pos += 1

Whereas in JavaScript I use, also
in exec_build2():

   temp = new Array(code[pos++]);

So I hear Guido doesn't like ++. So in Python I use +=
and a separate statement as a workaround. But otherwise,
what about the creation of an array,

is the the idiom [_] * _ slow? I am assuming its
compiled away. Or does it really first create an
array of size 1 and then enlarge it?




I'm sure you know you can put in timing statements to find bottlenecks.

import time
startTime = time.perf_counter()
[code block]
print("%.2f" % (time.perf_counter() - startTime))
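
For a small idiom like [x] * n, the stdlib timeit module is usually more convenient than hand-placed timers. A sketch that compares it with a list comprehension; the size and repeat count are arbitrary, and no particular result is claimed here.

```python
import timeit

n = 1000
repeats = 2000

# Time list multiplication against an equivalent comprehension.
t_mul = timeit.timeit(lambda: [NotImplemented] * n, number=repeats)
t_comp = timeit.timeit(lambda: [NotImplemented for _ in range(n)], number=repeats)

print("[x] * n       : %.4f s" % t_mul)
print("comprehension : %.4f s" % t_comp)
```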



--
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-15 Thread DFS

On 9/15/2021 5:10 PM, Mostowski Collapse wrote:

And how do you only iterate over n-1 elements?
I don't need a loop over all elements.

With array slicing?

Someting like:

for item in items[0:len(items)-2]:
___print(item)

Or with negative slicing indexes? Problem
is my length can be equal to one.

And when I have length equal to one, the
slice might not do the right thing?

LoL



From the python command prompt:

items = [1,2,3,4]

for itm in items:
    print(itm)
1
2
3
4

for itm in items[:-2]:
    print(itm)
1
2


for itm in items[:-3]:
    print(itm)
1


for itm in items[:-4]:
    print(itm)
(no result, no error thrown)


for itm in items[:-5]:
    print(itm)
(no result, no error thrown)
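
For the n-1 case specifically, a sketch: items[:-1] drops only the last element, and a list of length one simply gives an empty list. The helper name all_but_last is made up.

```python
def all_but_last(items):
    # Slices never raise for out-of-range bounds; they just come back shorter.
    return items[:-1]

print(all_but_last([1, 2, 3, 4]))
print(all_but_last([1]))
print(all_but_last([]))
```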
--
https://mail.python.org/mailman/listinfo/python-list


Re: Question again

2021-09-16 Thread DFS

On 9/16/2021 1:50 AM, af kh wrote:

Hello,
I was doing some coding on a website called replit then I extracted the file, 
and opened it in Python. For some reason, after answering 'no' or 'yes' after 
the last sentence I wrote, the Python window shut off, in replit I added one 
more sentence, but it isn't shown on Python, it just shuts off. Why is that? 
please reply to me soon since I need to submit it as an assignment for my class.

Code on replit:
#Title: Week 2: Chatbot with personality
#Author: Afnan Khan
#Date:9/15/21
#Description: Ask at least 3 questions to the user
#and create a creative topic

#Gives greetings to the user
import random

greetings = ["Hello, I'm Mr. ChatBot!", "Hi, I'm Mr. ChatBot!", "Hey~, I'm Mr. ChatBot!"]
comment = random.choice(greetings)
print(comment)
#Learn about the user's Name: First question
name = input("What is your name? ")

#Greet the User
print("Nice to meet you, " + name)

#Ask the user about their day: Second question
print("How is your day going? ")

#The user replies
reply = input()

#If user says 'amazing', reply with 'I am glad!'
if reply == "amazing" :
   print("I am glad!")

#If user says 'Alright', reply with 'that's good'
elif reply == "alright" :
   print("that's good")

#If user says 'bad', reply with 'Do not worry things will get better'
elif reply == "bad" :
   print("Do not worry things will get better")

#Else than that type 'I see'
else :
   print("I see!")

#Ask to pick between numbers 1~10 to see if you will get lucky today: Third question
number = input("Please pick between numbers 1~10 to see your luck for today: ")

#From number 1~3 and an answer
if number == "1" or number == "2" or number == "3" :
   print("You're in great luck today!")

#From number 4~6 and an answer
elif number == "4" or number == "5" or number == "6" :
   print("damn, bad luck is coming your way")

#From number 7~10 and an answer
elif number == "7" or number == "8" or number == "9" or number == "10" :
   print("I cannot sense any luck today, try again next time")

#Add a statement and question: Fourth question
print("That will be all for today's chitchat, woohooo! would you like to exit the chat?")

#User says 'yes'
reply = input()

#If user says 'yes' reply 'wait hold on! are you really leaving??': Fifth question
if reply == "yes" :
   print("Wait hold on! are you really leaving??")

#User answers
answer = input()

#If user says 'yes' again, reply 'fine! bye then!'
if answer == "yes" :
   print("Fine! bye then!")

#Other than that if user says 'no', reply 'just kidding we're done here haha'
elif answer == "no" :
   print("just kidding we're done here haha")


Regards,
Aya



I don't understand your issue, but this code runs fine for me.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Free OCR package in Python and selecting appropriate widget for the GUI

2021-09-21 Thread DFS

On 9/21/2021 4:36 AM, Mohsen Owzar wrote:

Hi Guys
A long time ago I wrote a GUI program in MATLAB for solving Sudoku puzzles, 
which worked reasonably well.
Now I am trying to write this GUI in Python with PyQt5 or Tkinter.
First question is:
Is there any free OCR software, packages or code in Python, which I can use to 
recognize the given digits and their positions in the puzzle square.
Second:
Because, I can not attach a picture to this post, I try to describe my picture 
of my GUI.


Draw your GUI in PyQt designer or other graphics tool, then upload a 
screenshot of it to imgur, then post the link to the picture.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Free OCR package in Python and selecting appropriate widget for the GUI

2021-09-22 Thread DFS

On 9/21/2021 10:38 PM, Mohsen Owzar wrote:

DFS schrieb am Dienstag, 21. September 2021 um 15:45:38 UTC+2:

On 9/21/2021 4:36 AM, Mohsen Owzar wrote:

Hi Guys
A long time ago I wrote a GUI program in MATLAB for solving Sudoku puzzles, 
which worked reasonably well.
Now I am trying to write this GUI in Python with PyQt5 or Tkinter.
First question is:
Is there any free OCR software, packages or code in Python, which I can use to 
recognize the given digits and their positions in the puzzle square.
Second:
Because, I can not attach a picture to this post, I try to describe my picture 
of my GUI.

Draw your GUI in PyQt designer or other graphics tool, then upload a
screenshot of it to imgur, then post the link to the picture.

Thanks, for your answer.
But, what is "imgur"?
I'm not so familiar with handling of pictures in this group.
How can I call "imgur" or how can I get there?

Regards
Mohsen



www.imgur.com

It's a website you can upload image files or screenshots to.  Then you 
can copy a link to your picture and post the link here.



--
https://mail.python.org/mailman/listinfo/python-list


Re: Free OCR package in Python and selecting appropriate widget for the GUI

2021-09-22 Thread DFS

On 9/22/2021 1:54 AM, Mohsen Owzar wrote:

DFS schrieb am Mittwoch, 22. September 2021 um 05:10:30 UTC+2:

On 9/21/2021 10:38 PM, Mohsen Owzar wrote:

DFS schrieb am Dienstag, 21. September 2021 um 15:45:38 UTC+2:

On 9/21/2021 4:36 AM, Mohsen Owzar wrote:

Hi Guys
A long time ago I wrote a GUI program in MATLAB for solving Sudoku puzzles, 
which worked reasonably well.
Now I am trying to write this GUI in Python with PyQt5 or Tkinter.
First question is:
Is there any free OCR software, packages or code in Python, which I can use to 
recognize the given digits and their positions in the puzzle square.
Second:
Because, I can not attach a picture to this post, I try to describe my picture 
of my GUI.

Draw your GUI in PyQt designer or other graphics tool, then upload a
screenshot of it to imgur, then post the link to the picture.

Thanks, for your answer.
But, what is "imgur"?
I'm not so familiar with handling of pictures in this group.
How can I call "imgur" or how can I get there?

Regards
Mohsen

www.imgur.com

It's a website you can upload image files or screenshots to. Then you
can copy a link to your picture and post the link here.

I have already posted the link, but I can not see it anywhere.
Now, I post it again:
https://imgur.com/a/Vh8P2TE
I hope that you can see my two images.
Regards
Mohsen



Got it.

I haven't used tkinter.  In PyQt5 designer I think you should use one 
QTextEdit control for each square.



Each square with the small black font can be initially populated with

1  2  3
4  5  6
7  8  9



https://imgur.com/lTcEiML



some starter python code  (maybe save as sudoku.py)

=
from PyQt5 import QtWidgets, uic

#objects
app = QtWidgets.QApplication([])
frm = uic.loadUi("sudoku.ui")


#grid = a collection of squares
grids = 1

#squares = number of squares per grid
squares = 9

#fill the squares with 1-9
def populateSquares():
    for i in range(grids, grids+1):
        for j in range(1, squares+1):
            widget = frm.findChild(QtWidgets.QTextEdit, "txt{}_{}".format(i,j))
            widget.setText("1  2  3  4  5  6  7  8  9")

#read data from squares
def readSquares():
    for i in range(grids, grids+1):
        for j in range(1, squares+1):
            widget = frm.findChild(QtWidgets.QTextEdit, "txt{}_{}".format(i,j))
            print("txt%d_%d contains: %s" % (i, j, widget.toPlainText()))


#connect pushbuttons to code
frm.btnPopulate.clicked.connect(populateSquares)
frm.btnReadContents.clicked.connect(readSquares)

#show main form
frm.show()

#initiate application
app.exec()
=





.ui file (ie save as sudoku.ui)
=
[The XML markup of sudoku.ui did not survive in this archive and the
listing is cut off.  The values that remain describe a main window titled
"Sudoku" (325 x 288) holding text squares of 83 x 65 at x = 32, 114, 196
and y = 22, 86.  Each square uses Courier 12, the style sheet
"color: rgb(0, 0, 127); background-color: rgb(255, 255, 127);",
QFrame::StyledPanel with QFrame::Sunken, and Qt::ScrollBarAlwaysOff.]
=


Re: Flush / update GUIs in PyQt5 during debugging in PyCharm

2021-09-24 Thread DFS

On 9/24/2021 12:46 AM, Mohsen Owzar wrote:

Hi Guys
I've written a GUI using PyQt5 and in there I use StyleSheets (css) for the 
buttons and labels to change their background- and foreground-colors and their 
states as well.
Because my program doesn't function correctly, I try to debug it in my IDE 
(PyCharm).
The problem is that during debugging, when I change some attributes of a button 
or label, let say its background-color, I can not see this modification of the 
color until the whole method or function is completed.
I believe that I have seen somewhere during my searches and googling that one 
can flush or update the GUI after each step/action is done.
But until now I couldn't manage it and I don't know where I have to invoke 
flush/update command in PyCharm.
If anyone has done this before and knows about it, I would very appreciate 
seeing his solution.

Regards
Mohsen



screen:
form.repaint()

individual widgets:
form.widget.repaint()


--
https://mail.python.org/mailman/listinfo/python-list


Use pyodbc to count and list tables, columns, indexes, etc

2016-03-31 Thread DFS


import pyodbc

dbName = r"D:\test_data.mdb"   #raw string so \t is not read as a tab
conn = pyodbc.connect('DRIVER={Microsoft Access Driver (*.mdb)};DBQ='+dbName)

cursor = conn.cursor()

#COUNT TABLES, LIST COLUMNS
tblCount = 0
for rows in cursor.tables():
    if rows.table_type == "TABLE":  #LOCAL TABLES ONLY
        tblCount += 1
        print rows.table_name
        for fld in cursor.columns(rows.table_name):
            print(fld.table_name, fld.column_name)

print tblCount,"tables"


Problem is, the 'for rows' loop executes only once if the 'for fld' loop 
is in place.  So even if I have 50 tables, the output is like:


DATA_TYPES
(u'DATA_TYPES', u'FLD_TEXT', -9, u'VARCHAR')
(u'DATA_TYPES', u'FLD_MEMO', -10, u'LONGCHAR')
(u'DATA_TYPES', u'FLD_NBR_BYTE', -6, u'BYTE')
1 tables

And no errors are thrown.

If I comment out the 2 'for fld' lines, it counts and lists all 50 
tables correctly.


Any ideas?

Thanks!

--
https://mail.python.org/mailman/listinfo/python-list


Re: Use pyodbc to count and list tables, columns, indexes, etc

2016-03-31 Thread DFS

On 3/31/2016 11:44 PM, DFS wrote:


import pyodbc

dbName = r"D:\test_data.mdb"   #raw string so \t is not read as a tab
conn = pyodbc.connect('DRIVER={Microsoft Access Driver (*.mdb)};DBQ='+dbName)
cursor = conn.cursor()

#COUNT TABLES, LIST COLUMNS
tblCount = 0
for rows in cursor.tables():
    if rows.table_type == "TABLE":  #LOCAL TABLES ONLY
        tblCount += 1
        print rows.table_name
        for fld in cursor.columns(rows.table_name):
            print(fld.table_name, fld.column_name)

print tblCount,"tables"


Problem is, the 'for rows' loop executes only once if the 'for fld' loop
is in place.  So even if I have 50 tables, the output is like:

DATA_TYPES
(u'DATA_TYPES', u'FLD_TEXT', -9, u'VARCHAR')
(u'DATA_TYPES', u'FLD_MEMO', -10, u'LONGCHAR')
(u'DATA_TYPES', u'FLD_NBR_BYTE', -6, u'BYTE')
1 tables

And no errors are thrown.

If I comment out the 2 'for fld' lines, it counts and lists all 50
tables correctly.

Any ideas?

Thanks!



Never mind!  I discovered I just needed a 2nd cursor object for the columns.

---
 cursor1 = conn.cursor()
 cursor2 = conn.cursor()

 tblCount = 0
 for rows in cursor1.tables():
     if rows.table_type == "TABLE":
         tblCount += 1
         print rows.table_name
         for fld in cursor2.columns(rows.table_name):
             print(fld.table_name, fld.column_name)
---

Works splendiferously.
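
An alternative to a second cursor is to read the whole outer result into a list before reusing the cursor, since a new query on a cursor replaces its pending results. The sketch below shows the pattern with the stdlib sqlite3 module as a stand-in (pyodbc needs a driver and a database to run); the tables a, b, c are made up, and it is written for Python 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("CREATE TABLE a(x); CREATE TABLE b(y); CREATE TABLE c(z);")
cur = conn.cursor()

# Materialize the table names first, so the cursor is free to be reused.
cur.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
tables = [row[0] for row in cur.fetchall()]

# Now the same cursor can run one query per table without losing the outer loop.
columns = {}
for table in tables:
    cur.execute('PRAGMA table_info("%s")' % table)
    columns[table] = [row[1] for row in cur.fetchall()]   # row[1] is the column name

print(len(tables), "tables")
print(columns)
```

With pyodbc the equivalent would presumably be to collect the rows from cursor.tables() into a list before calling cursor.columns(), but that is untested here.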


--
https://mail.python.org/mailman/listinfo/python-list


Re: extract rar

2016-04-02 Thread DFS

On 4/1/2016 5:01 PM, Jianling Fan wrote:

Thanks, but the problem is that I am not allowed to install any
software in my office PC, even free software.
Normally, I use zip files but this time I need to extract a rar file.
I don't like to go to IT guys because it takes time.
That's why I am looking for an alternative way without installing
other software.

  Thanks,

On 1 April 2016 at 13:37, Albert-Jan Roskam  wrote:





Date: Fri, 1 Apr 2016 13:22:12 -0600
Subject: extract rar
From: [email protected]
To: [email protected]

Hello everyone,

I am wondering is there any way to extract rar files by python without
WinRAR software?

I tried Archive() and patool, but seems they required the WinRAR software.


Perhaps 7-zip in a Python subprocess:
http://superuser.com/questions/458643/unzip-rar-from-command-line-with-7-zip/464128



I'm not experienced with Python, but I found this:

"pip install patool
import patoolib
patoolib.extract_archive("foo_bar.rar", outdir=".")
Works on Windows and linux without any other libraries needed."

http://stackoverflow.com/questions/17614467/how-can-unrar-a-file-with-python
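
A note on the two suggestions above: as far as I know, patool hands RAR files to an external program, so the quoted "without any other libraries needed" may not hold for this format. The 7-Zip route mentioned earlier could be sketched like this (Python 3). It assumes a 7z executable is reachable on the PATH, or that a portable copy is passed as exe; the function names are only for illustration.

```python
import subprocess


def build_extract_command(archive, outdir, exe="7z"):
    """Build the 7-Zip command line that extracts archive into outdir."""
    # x = extract with full paths, -o<dir> = output directory (no space
    # after -o), -y = answer "yes" to any prompt.
    return [exe, "x", archive, "-o" + outdir, "-y"]


def extract_archive(archive, outdir, exe="7z"):
    """Run 7-Zip; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_extract_command(archive, outdir, exe), check=True)


command = build_extract_command("foo_bar.rar", "extracted")
print(" ".join(command))
```

Calling extract_archive("foo_bar.rar", "extracted") would then run that command.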


--
https://mail.python.org/mailman/listinfo/python-list


Re: Sorting a list

2016-04-03 Thread DFS

On 4/3/2016 2:30 PM, DFS wrote:

cntText = 60
cntBool = 20
cntNbrs = 30
cntDate = 20
cntBins = 20

strText = "  text:     "
strBool = "  boolean:  "
strNbrs = "  numeric:  "
strDate = "  date-time:"
strBins = "  binary:   "

colCounts = [(cntText,strText) , (cntBool,strBool), (cntNbrs,strNbrs) ,
(cntDate,strDate) , (cntBins,strBins)]

# sort by alpha, then by column type count descending
colCounts.sort(key=lambda x: x[1])
colCounts.sort(key=lambda x: x[0], reverse=True)
for key in colCounts: print key[1], key[0]

-

Output (which is exactly what I want):

  text:      60
  numeric:   30
  binary:    20
  boolean:   20
  date-time: 20

-


But, is there a 1-line way to sort and print?


Meant to include this example:

print {i:os.strerror(i) for i in sorted(errno.errorcode)}




Thanks!





--
https://mail.python.org/mailman/listinfo/python-list


Sorting a list

2016-04-03 Thread DFS

cntText = 60
cntBool = 20
cntNbrs = 30
cntDate = 20
cntBins = 20

strText = "  text:     "
strBool = "  boolean:  "
strNbrs = "  numeric:  "
strDate = "  date-time:"
strBins = "  binary:   "

colCounts = [(cntText,strText) , (cntBool,strBool), (cntNbrs,strNbrs) ,
(cntDate,strDate) , (cntBins,strBins)]


# sort by alpha, then by column type count descending
colCounts.sort(key=lambda x: x[1])
colCounts.sort(key=lambda x: x[0], reverse=True)
for key in colCounts: print key[1], key[0]

-

Output (which is exactly what I want):

  text:      60
  numeric:   30
  binary:    20
  boolean:   20
  date-time: 20

-


But, is there a 1-line way to sort and print?


Thanks!



--
https://mail.python.org/mailman/listinfo/python-list


Re: Sorting a list

2016-04-03 Thread DFS

On 4/3/2016 3:31 PM, Peter Otten wrote:

DFS wrote:


cntText = 60
cntBool = 20
cntNbrs = 30
cntDate = 20
cntBins = 20

strText = "  text:     "
strBool = "  boolean:  "
strNbrs = "  numeric:  "
strDate = "  date-time:"
strBins = "  binary:   "

colCounts = [(cntText,strText) , (cntBool,strBool), (cntNbrs,strNbrs) ,
(cntDate,strDate) , (cntBins,strBins)]

# sort by alpha, then by column type count descending
colCounts.sort(key=lambda x: x[1])
colCounts.sort(key=lambda x: x[0], reverse=True)
for key in colCounts: print key[1], key[0]

-

Output (which is exactly what I want):

text:      60
numeric:   30
binary:    20
boolean:   20
date-time: 20

-


But, is there a 1-line way to sort and print?


Yes, but I would not recommend it. You can replace the sort() method
invocations with nested calls of sorted() and instead of

for item in items:
 print convert_to_str(item)

use

print "\n".join(convert_to_str(item) for item in items)

Putting it together:


from operator import itemgetter as get
print "\n".join("{1} {0}".format(*p) for p in sorted(
    sorted(colCounts, key=get(1)), key=get(0), reverse=True))


Kind of clunky looking.  Is that why you don't recommend it?





   text:      60
   numeric:   30
   binary:    20
   boolean:   20
   date-time: 20

You could also cheat and use

lambda v: (-v[0], v[1])

and a single sorted().


That works well.  Why is it 'cheating'?


Thanks for the reply.
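
For reference, the single-sorted() variant discussed above could look like the sketch below (Python 3 print syntax; the labels are stripped of their padding and re-aligned with a format width, which is my choice, not part of the thread). Negating the count is presumably what makes it "cheating": the trick only works because the first key is a number, whereas the two-pass version relies on sort stability and works for any key type.

```python
col_counts = [
    (60, "text:"),
    (20, "boolean:"),
    (30, "numeric:"),
    (20, "date-time:"),
    (20, "binary:"),
]

# One pass: count descending (via negation), then label ascending.
ordered = sorted(col_counts, key=lambda v: (-v[0], v[1]))

print("\n".join("{:<11}{}".format(label, count) for count, label in ordered))
```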


--
https://mail.python.org/mailman/listinfo/python-list


OT: Anyone here use the ConEmu console app?

2016-04-11 Thread DFS
I turned on the Quake-style option (and auto-hide when it loses focus) 
and it disappeared and I can't figure out how to get it back onscreen. I 
think there's a keystroke combo (like Win+key) but I don't know what it is.


It shows in the Task Manager Processes, but not in the Alt+Tab list.

Uninstalled and reinstalled and now it launches Quake-style and hidden. 
 Looked everywhere (\Users\AppData\Local, Registry) for leftover 
settings file but couldn't find it.


Here's the screen where you make the Quake-style setting.
https://conemu.github.io/en/SettingsAppearance.html



Thanks
--
https://mail.python.org/mailman/listinfo/python-list


Re: OT: Anyone here use the ConEmu console app?

2016-04-11 Thread DFS

On 4/11/2016 6:04 PM, 20/20 Lab wrote:

win+alt+space does not work?  ctrl+alt+win+space?

http://conemu.github.io/en/KeyboardShortcuts.html

Says those are not configurable, so they should work.



Neither of those worked, but Ctrl+~ did.

Thankyouthankyouthankyou




On 04/11/2016 02:49 PM, DFS wrote:

I turned on the Quake-style option (and auto-hide when it loses focus)
and it disappeared and I can't figure out how to get it back onscreen.
I think there's a keystroke combo (like Win+key) but I don't know what
it is.

It shows in the Task Manager Processes, but not in the Alt+Tab list.

Uninstalled and reinstalled and now it launches Quake-style and
hidden.  Looked everywhere (\Users\AppData\Local, Registry) for
leftover settings file but couldn't find it.

Here's the screen where you make the Quake-style setting.
https://conemu.github.io/en/SettingsAppearance.html



Thanks




--
https://mail.python.org/mailman/listinfo/python-list


You gotta love a 2-line python solution

2016-05-01 Thread DFS

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-

That's it!

Coming from VB/A background, some of the stuff you can do with python - 
with ease - is amazing.



VBScript version
--
1. Option Explicit
2. Dim xmlHTTP, fso, fOut
3. Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
4. xmlHTTP.Open "GET", "http://econpy.pythonanywhere.com/ex/001.html"
5. xmlHTTP.Send
6. Set fso = CreateObject("Scripting.FileSystemObject")
7. Set fOut = fso.CreateTextFile("D:\file.html", True)
8.  fOut.WriteLine xmlHTTP.ResponseText
9. fOut.Close
10. Set fOut = Nothing
11. Set fso  = Nothing
12. Set xmlHTTP = Nothing
--

Technically, that VBS will run with just lines 3-9, but that's still 7 
lines of code vs 2 for python.




--
https://mail.python.org/mailman/listinfo/python-list


Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

I posted a little while ago about how short the python code was:

-
1. import urllib
2. urllib.urlretrieve(webpage, filename)
-

Which is very sweet compared to the VBScript version:

--
1. Option Explicit
2. Dim xmlHTTP, fso, fOut
3. Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
4. xmlHTTP.Open "GET", webpage
5. xmlHTTP.Send
6. Set fso = CreateObject("Scripting.FileSystemObject")
7. Set fOut = fso.CreateTextFile(filename, True)
8.  fOut.WriteLine xmlHTTP.ResponseText
9. fOut.Close
10. Set fOut = Nothing
11. Set fso  = Nothing
12. Set xmlHTTP = Nothing
--

Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 
iterations, vs 0.88 for python.


webpage = 'http://econpy.pythonanywhere.com/ex/001.html'


So I tried:
---
import urllib2
r = urllib2.urlopen(webpage)
f = open(filename,"w")
f.write(r.read())
f.close()
---
and
---
import requests
r = requests.get(webpage)
f = open(filename,"w")
f.write(r.text)
f.close()
---
and
-
import pycurl
with open(filename, 'wb') as f:
    c = pycurl.Curl()
    c.setopt(c.URL, webpage)
    c.setopt(c.WRITEDATA, f)
    c.perform()
    c.close()
-

urllib2 and requests were about the same speed as urllib.urlretrieve, 
while pycurl was significantly slower (1.2 seconds).


I'm running Win 8.1.  python 2.7.11 32-bit.

I know it's asking a lot, but is there a really fast AND really short 
python solution for this simple thing?



Thanks!


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

On 5/2/2016 12:40 AM, Chris Angelico wrote:

On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:

On Sun, May 1, 2016, at 09:06 PM, DFS wrote:

Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
iterations, vs 0.88 for python.

...

I know it's asking a lot, but is there a really fast AND really short
python solution for this simple thing?


0.88 is not fast enough for you? That's less than a second.


Also, this is timings of network and disk operations. Unless something
pathological is happening, the language used won't make any
difference.

ChrisA



Unfortunately, the VBScript is twice as fast as any python method.




--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS

On 5/2/2016 12:31 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 08:39 PM, DFS wrote:

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-


Note, for paths on windows you really want to use a rawstring. Ie,
r"D:\file.html".



Thanks.

I actually use "D:\\file.html" in my code.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

On 5/2/2016 12:49 AM, Ben Finney wrote:

DFS  writes:


Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
iterations, vs 0.88 for python.

[…]

urllib2 and requests were about the same speed as urllib.urlretrieve,
while pycurl was significantly slower (1.2 seconds).


Network access is notoriously erratic in its timing. The program, and
the machine on which it runs, is subject to a great many external
effects once the request is sent — effects which will significantly
alter the delay before a response is completed.

How have you controlled for the wide variability in the duration, for
even a given request by the *same code on the same machine*, at
different points in time?

One simple way to do that: Run the exact same test many times (say,
10 000 or so) on the same machine, and then compute the average of all
the durations.

Do the same for each different program, and then you may have more
meaningfully comparable measurements.



I tried the 10-loop test several times with all versions.

The results were 100% consistent: VBScript xmlHTTP was always 2x faster 
than any python method.
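
The repeated-measurement idea suggested above can be sketched as follows (Python 3, standard library only). To keep the sketch runnable offline it downloads from a file:// URL pointing at a temporary file; that stand-in, the helper names and the loop count are all assumptions of mine. Pointing url at a real page would measure the network instead.

```python
import shutil
import statistics
import tempfile
import time
import urllib.request
from pathlib import Path


def fetch_to_file(url, path):
    """Stream the response body to path; both handles are closed on exit."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


def time_fetches(url, path, loops=10):
    """Return one wall-clock duration per download, in seconds."""
    durations = []
    for _ in range(loops):
        start = time.perf_counter()
        fetch_to_file(url, path)
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in source so the sketch runs without network access.
workdir = Path(tempfile.mkdtemp())
source = workdir / "source.html"
source.write_bytes(b"<html><body>hello</body></html>")
target = workdir / "copy.html"
url = source.as_uri()

durations = time_fetches(url, target, loops=10)
print("min    %.4f s" % min(durations))
print("median %.4f s" % statistics.median(durations))
print("mean   %.4f s" % statistics.mean(durations))
```

Looking at the minimum and median per request, rather than one total for ten loops, makes it easier to see whether a single slow response is skewing the comparison.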




--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

On 5/2/2016 1:00 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 09:50 PM, DFS wrote:

On 5/2/2016 12:40 AM, Chris Angelico wrote:

On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:

On Sun, May 1, 2016, at 09:06 PM, DFS wrote:

Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
iterations, vs 0.88 for python.

...

I know it's asking a lot, but is there a really fast AND really short
python solution for this simple thing?


0.88 is not fast enough for you? That's less than a second.


Also, this is timings of network and disk operations. Unless something
pathological is happening, the language used won't make any
difference.

ChrisA



Unfortunately, the VBScript is twice as fast as any python method.


And 0.2 is twice as fast as 0.1. When you have two small numbers, 'twice
as fast' isn't particularly meaningful as a metric.


0.2 is half as fast as 0.1, here.

And two small numbers turn into bigger numbers when the webpage is big, 
and soon the download time differences are measured in minutes, not half 
a second.


So, any ideas?
--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS

On 5/2/2016 1:02 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 09:51 PM, DFS wrote:

On 5/2/2016 12:31 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 08:39 PM, DFS wrote:

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-


Note, for paths on windows you really want to use a rawstring. Ie,
r"D:\file.html".


Thanks.

I actually use "D:\\file.html" in my code.


Or you can do that. But the whole point of raw strings is not having to
escape slashes :)



Nice.  Where/how else is 'r' used?


I'm new to python, but I learned that one the hard way.

I was using "D\testfile.txt" for something, and my code kept failing. 
Took me a while to figure it out.  I tried various letters after the 
slash.  I finally stumbled across the escape slashes in the docs somewhere.



--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS

On 5/2/2016 1:02 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 09:51 PM, DFS wrote:

On 5/2/2016 12:31 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 08:39 PM, DFS wrote:

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-


Note, for paths on windows you really want to use a rawstring. Ie,
r"D:\file.html".


Thanks.

I actually use "D:\\file.html" in my code.


Or you can do that. But the whole point of raw strings is not having to
escape slashes :)



Trying the rawstring thing (say it fast 3x):

webpage = "http://econpy.pythonanywhere.com/ex/001.html"


webfile = "D:\\econpy001.html"
urllib.urlretrieve(webpage,webfile) WORKS

webfile = "rD:\econpy001.html"
urllib.urlretrieve(webpage,webfile) FAILS

webfile = "D:\econpy001.html"
urllib.urlretrieve(webpage,"r" + webfile) FAILS

webfile  = "D:\econpy001.html"
urllib.urlretrieve(webpage,"r" + "" + webfile + "")  FAILS


The FAILs throw:

Traceback (most recent call last):
  File "webscraper.py", line 54, in <module>
    urllib.urlretrieve(webpage,webfile)
  File "D:\development\python\python_2.7.11\lib\urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "D:\development\python\python_2.7.11\lib\urllib.py", line 249, in retrieve
    tfp = open(filename, 'wb')
IOError: [Errno 22] invalid mode ('wb') or filename: 'rD:\\econpy001.html'


What am I doing wrong?
--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

On 5/2/2016 1:15 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 10:00 PM, DFS wrote:

I tried the 10-loop test several times with all versions.


Also how, _exactly_, are you testing this?

C:\Python27>python -m timeit "filename='C:\\test.txt';
webpage='http://econpy.pythonanywhere.com/ex/001.html'; import urllib2;
r = urllib2.urlopen(webpage); f = open(filename, 'w');
f.write(r.read()); f.close();"
10 loops, best of 3: 175 msec per loop

That's a whole lot less than 0.88 secs.


Indeed.


-
import requests, urllib, urllib2, pycurl
import time

webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile = "D:\\econpy001.html"
loops   = 10

startTime = time.clock()
for i in range(loops):
    urllib.urlretrieve(webpage,webfile)
endTime = time.clock()
print "Finished urllib in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    r = urllib2.urlopen(webpage)
    f = open(webfile,"w")
    f.write(r.read())
    f.close()
endTime = time.clock()
print "Finished urllib2 in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    r = requests.get(webpage)
    f = open(webfile,"w")
    f.write(r.text)
    f.close()
endTime = time.clock()
print "Finished requests in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    with open(webfile + str(i) + ".txt", 'wb') as f:
        c = pycurl.Curl()
        c.setopt(c.URL, webpage)
        c.setopt(c.WRITEDATA, f)
        c.perform()
        c.close()
endTime = time.clock()
print "Finished pycurl in %.2g seconds" %(endTime-startTime)
-

$ python getHTML.py
Finished urllib in 0.88 seconds
Finished urllib2 in 0.83 seconds
Finished requests in 0.89 seconds
Finished pycurl in 1.1 seconds

Those results are consistent.  They go up or down a little, but never 
below 0.82 seconds (for urllib2), or above 1.2 seconds (for pycurl)


VBScript is consistently 0.44 to 0.48

--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS

On 5/2/2016 1:37 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 10:23 PM, DFS wrote:

Trying the rawstring thing (say it fast 3x):

webpage = "http://econpy.pythonanywhere.com/ex/001.html"


webfile = "D:\\econpy001.html"
urllib.urlretrieve(webpage,webfile) WORKS

webfile = "rD:\econpy001.html"


The r is *outside* the string.

It's: r"D:\econpy001.html"



Got it.  Thanks.
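
To spell out the exchange above: the r is part of the literal's syntax, not a character in the string, so it cannot be glued on later with "r" + webfile. The sketch below (Python 3 print syntax; the file names are examples only) also touches on the earlier question of where else r is used, with regular expressions being the most common case.

```python
import re

plain = "D:\testfile.txt"      # \t is read as a TAB character
escaped = "D:\\testfile.txt"   # doubled backslash: one literal backslash
raw = r"D:\testfile.txt"       # r prefix: backslashes are left alone

print(repr(plain))
print(raw == escaped)

# Prepending the letter "r" at run time only adds a character to the string.
glued = "r" + escaped
print(glued)

# Raw strings are also the usual way to write regular expressions.
numbers = re.findall(r"\d+", "econpy001.html, page 54")
print(numbers)
```

The same reasoning explains the original "D:\file.html": \f is the form-feed escape, so the path silently contained a control character.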




--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS

On 5/2/2016 2:05 AM, Steven D'Aprano wrote:

On Monday 02 May 2016 15:00, DFS wrote:


I tried the 10-loop test several times with all versions.

The results were 100% consistent: VBScript xmlHTTP was always 2x faster
than any python method.



Are you absolutely sure you're comparing the same job in two languages?


As near as I can tell.  In VBScript I'm actually dereferencing various 
objects (that adds to the time), but I don't do that in python.  I don't 
know enough to even know if it's necessary, or good practice, or what.






Is VB using a local web cache, and Python not?


I'm not specifying a local web cache with either (wouldn't know how or 
where to look).  If you have Windows, you can try it.

---
Option Explicit
Dim xmlHTTP, fso, fOut, startTime, endTime, webpage, webfile,i
webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile  = "D:\econpy001.html"
startTime = Timer
For i = 1 to 10
 Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
 xmlHTTP.Open "GET", webpage
 xmlHTTP.Send
 Set fso = CreateObject("Scripting.FileSystemObject")
 Set fOut = fso.CreateTextFile(webfile, True)
 fOut.WriteLine xmlHTTP.ResponseText
 fOut.Close
 Set fOut= Nothing
 Set fso = Nothing
 Set xmlHTTP = Nothing
Next
endTime = Timer
wscript.echo "Finished VBScript in " & FormatNumber(endTime - startTime,3) & " seconds"

---
save it to a .vbs file and run it like this:
$cscript /nologo filename.vbs



Are you saving files with both
tests? To the same local drive? (To ensure you aren't measuring the
difference between "write this file to a slow IDE hard disk, write that file
to a fast SSD".)


Identical functionality (retrieve webpage, write html to file).  Same 
webpage, written to the same folder on the same hard drive (not SSD).


The 10 file writes (open/write/close) don't make a meaningful difference 
at all:

VBScript 0.0156 seconds
urllib2  0.0034 seconds

This file is 3.55K.



Once you are sure that you are comparing the same task in two languages,
then make sure the measurement is meaningful. If you change from a (let's
say) 1 KB file to a 100 KB file, do you see the same 2 x difference? What if
you increase it to a much larger file?


Do you know a webpage I can hit 10x repeatedly to download a good size 
file?  I'm always paranoid they'll block me thinking I'm a 
"professional" web scraper or something.


Thanks


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS

On 5/2/2016 2:27 AM, Stephen Hansen wrote:

On Sun, May 1, 2016, at 10:59 PM, DFS wrote:

startTime = time.clock()
for i in range(loops):
    r = urllib2.urlopen(webpage)
    f = open(webfile,"w")
    f.write(r.read())
    f.close()
endTime = time.clock()
print "Finished urllib2 in %.2g seconds" %(endTime-startTime)


Yeah on my system I get 1.8 out of this, amounting to 0.18s.


You get 1.8 seconds total for the 10 loops?  That's less than half as 
fast as my results.  Surprising.




I'm again going back to the point of: it's fast enough. When comparing
two small numbers, "twice as slow" is meaningless.


Speed is always meaningful.

I know python is relatively slow, but it's a cool, concise, powerful 
language.  I'm extremely impressed by how tight the code can get.




You have an assumption you haven't answered, that downloading a 10 meg
file will be twice as slow as downloading this tiny file. You haven't
proven that at all.


True.  And it has been my assumption - tho not with 10MB file.



I suspect you have a constant overhead of X, and in this toy example,
that makes it seem twice as slow. But when downloading a file of size,
you'll have the same constant factor, at which point the difference is
irrelevant.


Good point.  Test below.



If you believe otherwise, demonstrate it.


http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=ga&wqhqn=2&qc=Atlanta&rg=30&qhqn=restaurant&sb=zipdisc&ap=2

It's a 58854 byte file when saved to disk (smaller file was 3546 bytes), 
so this is 16.6x larger.  So I would expect python to linearly run in 
16.6 * 0.88 = 14.6 seconds.


10 loops per run

1st run
$ python timeGetHTML.py
Finished urllib in 8.5 seconds
Finished urllib2 in 5.6 seconds
Finished requests in 7.8 seconds
Finished pycurl in 6.5 seconds

wait a couple minutes, then 2nd run
$ python timeGetHTML.py
Finished urllib in 5.6 seconds
Finished urllib2 in 5.7 seconds
Finished requests in 5.2 seconds
Finished pycurl in 6.4 seconds

It's a little more than 1/3 of my estimate - so good news.

(when I was doing these tests, some of the python results were 0.75 
seconds - way too fast, so I checked and no data was written to file, 
and I couldn't even open the webpage with a browser.  Looks like I had 
been temporarily blocked from the site.  After a couple minutes, I was 
able to access it again).


I noticed urllib and curl returned the html as is, but urllib2 and 
requests added enhancements that should make the data easier to parse. 
Based on speed and functionality and documentation, I believe I'll be 
using the requests HTTP library (I will actually be doing a small amount 
of web scraping).



VBScript
1st run: 7.70 seconds
2nd run: 5.38
3rd run: 7.71

So python matches or beats VBScript at this much larger file.  Kewl.
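
The estimate above can be re-checked with a few lines of arithmetic (Python 3). All figures are the ones quoted in this message; 5.6 seconds is taken as a representative measured value for ten loops, which is my choice among the reported runs.

```python
small_bytes = 3546      # size of the first test page
large_bytes = 58854     # size of the larger page
small_secs = 0.88       # ten loops over the small page
measured_secs = 5.6     # ten loops over the large page (typical run)

size_ratio = large_bytes / small_bytes
linear_estimate = size_ratio * small_secs
fraction = measured_secs / linear_estimate

print("size ratio:       %.1fx" % size_ratio)
print("linear estimate:  %.1f s" % linear_estimate)
print("measured/linear:  %.2f" % fraction)
```

A measured time well below the linear estimate fits the earlier suggestion that a fixed per-request overhead dominates for small pages, though the two pages come from different servers, so the comparison is only indicative.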


--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-02 Thread DFS

On 5/2/2016 5:26 AM, BartC wrote:

On 02/05/2016 04:39, DFS wrote:

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-

That's it!

Coming from VB/A background, some of the stuff you can do with python -
with ease - is amazing.


VBScript version
--
1. Option Explicit
2. Dim xmlHTTP, fso, fOut
3. Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
4. xmlHTTP.Open "GET", "http://econpy.pythonanywhere.com/ex/001.html"
5. xmlHTTP.Send
6. Set fso = CreateObject("Scripting.FileSystemObject")
7. Set fOut = fso.CreateTextFile("D:\file.html", True)
8.  fOut.WriteLine xmlHTTP.ResponseText
9. fOut.Close
10. Set fOut = Nothing
11. Set fso  = Nothing
12. Set xmlHTTP = Nothing
--

Technically, that VBS will run with just lines 3-9, but that's still 7
lines of code vs 2 for python.


It seems Python provides a higher level solution compared with VBS.
Python presumably also has to do those Opens and Sends, but they are
hidden away inside urllib.urlretrieve.

You can do the same with VB just by wrapping up these lines in a
subroutine. As you would if this had to be executed in a dozen different
places for example. Then you could just write:

getfile("http://econpy.pythonanywhere.com/ex/001.html", "D:/file.html")

in VBS too. (The forward slash in the file name ought to work.)



Of course.  Taken to its extreme, I could eventually replace you with 
one line of code :)


But python does it for me.  That would save me 8 lines...




(I don't know VBS; I assume it does /have/ subroutines? What I haven't
factored in here is error handling which might yet require more coding
in VBS compared with Python)


Yeah, VBS has subs and functions.  And strange, limited error handling. 
And a single data type, called Variant.  But it's installed with Windows 
so it's easy to get going with.



--
https://mail.python.org/mailman/listinfo/python-list


Best way to clean up list items?

2016-05-02 Thread DFS

Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
Want: list1 = ['Item 1','Item 2']


I wrote this, which works fine, but maybe it can be tidier?

1. list2 = [t.replace("\r\n", "") for t in list1]   #remove \r\n
2. list3 = [t.strip(' ') for t in list2]#trim whitespace
3. list1  = filter(None, list3) #remove empty items


After each step:

1. list2 = ['   Item 1  ','  Item 2  ','  ']   #remove \r\n
2. list3 = ['Item 1','Item 2','']  #trim whitespace
3. list1 = ['Item 1','Item 2'] #remove empty items


Thanks!
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python3 html scraper that supports javascript

2016-05-02 Thread DFS

On 5/2/2016 11:33 AM, [email protected] wrote:



I tried to use the following code:

from bs4 import BeautifulSoup
from selenium import webdriver

PHANTOMJS_PATH = 'C:\\Users\\Zoran\\Downloads\\Obrisi\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe'

url = 'https://hrti.hrt.hr/#/video/show/2203605/trebizat-prica-o-jednoj-vodi-i-jednom-narodu-dokumentarni-film'

browser = webdriver.PhantomJS(PHANTOMJS_PATH)
browser.get(url)

soup = BeautifulSoup(browser.page_source, "html.parser")

x = soup.prettify()

print(x)


When I print x variable, I would expect to see something like this:
https://hrti.hrt.hr/2e9e9c45-aa23-4d08-9055-cd2d7f2c4d58" id="vjs_video_3_html5_api" 
class="vjs-tech" preload="none">https://prd-hrt.spectar.tv/player/get_smil/id/2203605/video_id/2203605/token/Cny6ga5VEQSJ2uZaD2G8pg/token_expiration/1462043309/asset_type/Movie/playlist_template/nginx/channel_name/trebiat__pria_o_jednoj_vodi_i_jednom_narodu_dokumentarni_film/playlist.m3u8?foo=bar">


but I can't come to that point.

Regards.



I was doing something similar recently.  Try this:

f = open(somefilename)
soup = BeautifulSoup.BeautifulSoup(f)
f.close()
print soup.prettify()


--
https://mail.python.org/mailman/listinfo/python-list


Re: Best way to clean up list items?

2016-05-02 Thread DFS

On 5/2/2016 1:25 PM, Stephen Hansen wrote:

On Mon, May 2, 2016, at 09:33 AM, DFS wrote:

Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']


I'm curious how you got to this point, it seems like you can solve the
problem in how this is generated.



from lxml import html
import requests

webpage = "http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2"


page  = requests.get(webpage)
tree  = html.fromstring(page.content)
addr1 = tree.xpath('//span[@class="text3"]/text()')
print 'Addresses: ', addr1


I'd prefer to get clean data in the first place, but I don't know a 
better way to extract it from the HTML.




--
https://mail.python.org/mailman/listinfo/python-list


Re: Best way to clean up list items?

2016-05-02 Thread DFS

On 5/2/2016 12:57 PM, Jussi Piitulainen wrote:

DFS writes:


Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
Want: list1 = ['Item 1','Item 2']


I wrote this, which works fine, but maybe it can be tidier?

1. list2 = [t.replace("\r\n", "") for t in list1]   #remove \r\n
2. list3 = [t.strip(' ') for t in list2]#trim whitespace
3. list1  = filter(None, list3) #remove empty items

After each step:

1. list2 = ['   Item 1  ','  Item 2  ','  ']   #remove \r\n
2. list3 = ['Item 1','Item 2','']  #trim whitespace
3. list1 = ['Item 1','Item 2'] #remove empty items


Try filter(None, (t.strip() for t in list1)). The default.


Works and drops a line of code.  Thx.
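
For anyone running this on Python 3, a small sketch of the same clean-up, using the sample list from the original post. The only assumption is the Python version: there filter() returns an iterator rather than a list.

```python
list1 = ['\r\n   Item 1  ', '  Item 2  ', '\r\n  ']

# str.strip() with no argument removes spaces, \r and \n from both ends,
# so the separate replace("\r\n", "") pass is not needed.
cleaned = [s for s in (t.strip() for t in list1) if s]
print(cleaned)

# filter() gives the same items; on Python 3 it returns an iterator,
# so list() is needed to get a list back.
same = list(filter(None, (t.strip() for t in list1)))
```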




Funny-looking data you have.


I know - sadly, it's actual data:


from lxml import html
import requests

webpage = "http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2"


page  = requests.get(webpage)
tree  = html.fromstring(page.content)
addr1 = tree.xpath('//span[@class="text3"]/text()')
print 'Addresses: ', addr1


I couldn't figure out a better way to extract it from the HTML (maybe 
XML and DOM?)

--
https://mail.python.org/mailman/listinfo/python-list


Re: Best way to clean up list items?

2016-05-02 Thread DFS

On 5/2/2016 2:27 PM, Jussi Piitulainen wrote:

DFS writes:


On 5/2/2016 12:57 PM, Jussi Piitulainen wrote:

DFS writes:


Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
Want: list1 = ['Item 1','Item 2']


. .


Funny-looking data you have.


I know - sadly, it's actual data:


from lxml import html
import requests

webpage = "http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2"

page  = requests.get(webpage)
tree  = html.fromstring(page.content)
addr1 = tree.xpath('//span[@class="text3"]/text()')
print 'Addresses: ', addr1


I couldn't figure out a better way to extract it from the HTML (maybe
XML and DOM?)


I should have guessed :) But now I'm a bit worried about those spaces
inside your items. Can it happen that item text is split into strings in
the middle?


Meaning split by me, or comes 'malformed' from the data source?



Then the above sanitation does the wrong thing.

If someone has the right solution, I'm watching, too.



Here's the raw data as stored in the tree:

---
1st page

['\r\n', '\r\n1918 W End 
Ave, Nashville, TN 37203', '\r\n
  ', '\r\n1806 Hayes St, Nashville, 
TN 37203', '\r\n', '\r\n 
1701 Broadway, Nashville, TN 37203', '\r\n', '\r\n
209 10th Ave S, Nashville, TN 37203', '\r\n 
   ', '\r\n907 20th Ave S, Nashville, TN 
37212', '\r\n', '\r\n911 
20th Ave S, Nashville, TN 37212', '\r\n', '\r\n 
  1722 W End Ave, Nashville, TN 37203', '\r\n 
 ', '\r\n1905 Hayes St, 
Nashville, TN 37203', '\r\n
  ', '\r\n2000 W End Ave, 
Nashville, TN 37203']


---

Next page

['\r\n', '\r\n120 19th 
Ave N, Nashville, TN 37203', '\r\n
  ', '\r\n1719 W End Ave Ste 101, 
Nashville, TN 37203', '\r\n
  ', '\r\n1922 W End Ave, Nashville, TN 
37203', '\r\n', '\r\n
  909 20th Ave S, Nashville, TN 37212', '\r\n 
 ', '\r\n
  1807 Church St, Nashville, TN 37203', '\r\n 
 ', '\r\n1721 Church St, Nashville, TN 37203', 
'\r\n', '\r\n718 
Division St, Nashville, TN 37203', '\r\n', '\r\n 
   907 12th Ave S, Nashville, TN 37203', '\r\n 
  ', '\r\n204 21st Ave S, 
Nashville, TN 37203', '\r\n
  ', '\r\n1811 Division St, Nashville, 
TN 37203', '\r\n', '\r\n 
903 Gleaves St, Nashville, TN 37203', '\r\n', '\r\n
1720 W End Ave Ste 530, Nashville, TN 37203', '\r\n 
   ', '\r\n
1200 Division St Ste 100-A, Nashville, TN 37203', '\r\n 
   ', '\r\n
422 7th Ave S, Nashville, TN 37203', '\r\n', 
'\r\n605 8th Ave S, Nashville, TN 37203']


and so on
---

I've checked a couple hundred addresses visually, and so far I've only 
seen 2 formats:


1. '\r\n'
2. '\r\n   address  '


--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-02 Thread DFS

On 5/2/2016 8:45 PM, [email protected] wrote:

DFS at 2016/5/2 UTC+8 11:39:33AM wrote:

To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-

That's it!


Why my system can't do it?

Python 3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 19:28:18) [MSC v.1600 32 bit (In
tel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

from urllib import urlretrieve

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'urlretrieve'



try

from urllib.request import urlretrieve

http://stackoverflow.com/questions/21171718/urllib-urlretrieve-file-python-3-3


I'm running python 2.7.11 (32-bit)
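
The mismatch above comes down to where urlretrieve lives: Python 2 has it in urllib itself, Python 3 moved it to urllib.request. A minimal sketch of an import that should work on both, assuming only the standard library (the target filename in the usage comment is just an example):

```python
# urlretrieve moved between Python 2 and Python 3; try the new home first.
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

# Usage (URL from this thread; the local filename is only an example):
# urlretrieve("http://econpy.pythonanywhere.com/ex/001.html", "001.html")
```
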
--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS

On 5/2/2016 4:42 AM, Peter Otten wrote:

DFS wrote:


Is VB using a local web cache, and Python not?


I'm not specifying a local web cache with either (wouldn't know how or
where to look).  If you have Windows, you can try it.


I don't have Windows, but if I'm to believe

http://stackoverflow.com/questions/5235464/how-to-make-microsoft-xmlhttprequest-honor-cache-control-directive

the page is indeed cached and you can disable caching with


Option Explicit
Dim xmlHTTP, fso, fOut, startTime, endTime, webpage, webfile,i
webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile  = "D:\econpy001.html"
startTime = Timer
For i = 1 to 10
Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
xmlHTTP.Open "GET", webpage


  xmlHTTP.setRequestHeader "Cache-Control", "max-age=0"



Tried that, and from later on that stackoverflow page:

xmlHTTP.setRequestHeader "Cache-Control", "private"

Neither made a difference.  In fact, I saw faster times than ever - as 
low as 0.41 for 10 loops.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS

On 5/2/2016 3:19 AM, Chris Angelico wrote:


There's an easier way to test if there's caching happening. Just crank
the iterations up from 10 to 100 and see what happens to the times. If
your numbers are perfectly fair, they should be perfectly linear in
the iteration count; eg a 1.8 second ten-iteration loop should become
an 18 second hundred-iteration loop. Obviously they won't be exactly
that, but I would expect them to be reasonably close (eg 17-19
seconds, but not 2 seconds).


100 loops
Finished VBScript in 3.953 seconds
Finished VBScript in 3.608 seconds
Finished VBScript in 3.610 seconds

Bit of a per-loop speedup going from 10 to 100.
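
The linearity check suggested above can be sketched in Python as a small timing helper. The name time_loops is made up for illustration, and func stands for whatever download routine is being measured:

```python
import time

def time_loops(func, iterations):
    """Return (total_seconds, seconds_per_iteration) for calling func repeatedly."""
    start = time.time()
    for _ in range(iterations):
        func()
    total = time.time() - start
    return total, total / iterations

# Example: compare per-iteration cost at two loop counts.
for n in (10, 100):
    total, per_loop = time_loops(lambda: sum(range(1000)), n)
    print("%d loops: %.4f s total, %.6f s per loop" % (n, total, per_loop))
```

If the per-loop figure stays roughly constant as the count grows, the timings scale linearly, which is what a fair test should show.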



Then the next thing to test would be to create a deliberately-slow web
server, and connect to that. Put a two-second delay into it, to
simulate a distant or overloaded server, and see if your logs show the
correct result. Something like this:



import time
try:
    import http.server as BaseHTTPServer # Python 3
except ImportError:
    import BaseHTTPServer # Python 2

class SlowHTTP(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type","text/html")
        self.end_headers()
        self.wfile.write(b"Hello, ")
        time.sleep(2)
        self.wfile.write(b"world!")

server = BaseHTTPServer.HTTPServer(("", 1234), SlowHTTP)
server.serve_forever()

---

Test that with a web browser or command-line downloader (go to
http://127.0.0.1:1234/), and make sure that (a) it produces "Hello,
world!", and (b) it takes two seconds. Then set your test scripts to
downloading that URL. (Be sure to set them back to low iteration
counts first!) If the times are true and fair, they should all come
out pretty much the same - ten iterations, twenty seconds. And since
all that's changed is the server, this will be an accurate
demonstration of what happens in the real world: network requests
aren't always fast. Incidentally, you can also watch the server's log
to see if it's getting the appropriate number of requests.

It may turn out that changing the web server actually materially
changes your numbers. Comment out the sleep call and try it again -
you might find that your numbers come closer together, because this
naive server doesn't send back 304 NOT MODIFIED responses or anything.
Again, though, this would prove that you're not actually measuring
language performance, because the tests are more dependent on the
server than the client.

Even if the files themselves aren't being cached, you might find that
DNS is. So if you truly want to eliminate variables, replace the name
in your URL with an IP address. It's another thing that might mess
with your timings, without actually being a language feature.

Networking has about four billion variables in it. You're messing with
one of the least significant: the programming language :)

ChrisA



Thanks for the good feedback.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS

On 5/2/2016 10:00 PM, Chris Angelico wrote:

On Tue, May 3, 2016 at 11:51 AM, DFS  wrote:

On 5/2/2016 3:19 AM, Chris Angelico wrote:


There's an easier way to test if there's caching happening. Just crank
the iterations up from 10 to 100 and see what happens to the times. If
your numbers are perfectly fair, they should be perfectly linear in
the iteration count; eg a 1.8 second ten-iteration loop should become
an 18 second hundred-iteration loop. Obviously they won't be exactly
that, but I would expect them to be reasonably close (eg 17-19
seconds, but not 2 seconds).



100 loops
Finished VBScript in 3.953 seconds
Finished VBScript in 3.608 seconds
Finished VBScript in 3.610 seconds

Bit of a per-loop speedup going from 10 to 100.


How many seconds was it for 10 loops?

ChrisA


~0.44


--
https://mail.python.org/mailman/listinfo/python-list


Re: You gotta love a 2-line python solution

2016-05-02 Thread DFS

On 5/2/2016 11:27 PM, [email protected] wrote:

DFS at 2016/5/3 9:12:24AM wrote:

try

from urllib.request import urlretrieve

http://stackoverflow.com/questions/21171718/urllib-urlretrieve-file-python-3-3


I'm running python 2.7.11 (32-bit)


Alright, it works...someway.

I try to get a zip file. It works, the file can be unzipped correctly.


from urllib.request import urlretrieve
urlretrieve("http://www.caprilion.com.tw/fed.zip", "d:\\temp\\temp.zip")

('d:\\temp\\temp.zip', )




But when I try to get this forum page, it does get an HTML file, but the file 
can't be viewed normally.


urlretrieve("https://groups.google.com/forum/#!topic/comp.lang.python/jFl3GJbmR7A", "d:\\temp\\temp.html")
('d:\\temp\\temp.html', )




I suppose HTML is a much more complex case, where more processing needs to be 
done before a web browser can open it :-)



Who knows what Google has done... it won't open in Opera.  The tab title 
shows up, but after 20-30 seconds the screen just stays blank and the 
cursor quits loading.


It's a mess - try running it thru BeautifulSoup.prettify() and it looks 
better.



import urllib          # Python 2: urlretrieve lives in urllib itself
import BeautifulSoup   # BeautifulSoup 3
webfile = "D:\\afile.html"
urllib.urlretrieve("https://groups.google.com/forum/#!topic/comp.lang.python/jFl3GJbmR7A", webfile)
f = open(webfile)
soup = BeautifulSoup.BeautifulSoup(f)
f.close()
print soup.prettify()




--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-02 Thread DFS

On 5/3/2016 12:06 AM, Michael Torrie wrote:


Now if you want to talk about processing the data once you have it,
there we can talk about speeds and optimization.


Be glad to.  Helps me learn python, so bring whatever challenge you want 
and I'll try to keep up.


One small comparison I was able to make was VBA vs python/pyodbc to 
summarize an Access database.  Not quite a fair test, but interesting 
nonetheless.


---

Access 2003 file
Access 2003 VBA code

2,099,101 rows
114 tables  (max row = 600288)
971 columns
  text:  503
  boolean:   4
  numeric:   351
  date-time: 108
  binary:5
309 indexes (25 foreign keys)
333,549,568 bytes on disk
Time: 0.18 seconds

---

same Access 2003 file
32-bit python 2.7.11 + 32-bit pyodbc 3.0.6

2,099,101 rows
114 tables (max row = 600288)
971  columns
  text:  503
  numeric:   351
  date-time: 108
  binary:5
  boolean:   4
309 indexes (foreign keys na via ODBC*)
333,549,568 bytes on disk
Time: 0.49 seconds

* the Access ODBC driver doesn't support
  the SQLForeignKeys function

---

--
https://mail.python.org/mailman/listinfo/python-list


Re: Not x.islower() has different output than x.isupper() in list output...

2016-05-03 Thread DFS

On 5/3/2016 8:00 AM, Chris Angelico wrote:

On Tue, May 3, 2016 at 9:25 PM, Jussi Piitulainen
 wrote:

Chris Angelico writes:


This assumes, of course, that there is a function swapcase which can
return a string with case inverted. I'm not sure such a function
exists.


   str.swapcase("foO")
   'FOo'


I suppose for this discussion it doesn't matter if it's imperfect.



What was imperfect?



--
https://mail.python.org/mailman/listinfo/python-list


Re: Not x.islower() has different output than x.isupper() in list output...

2016-05-03 Thread DFS

On 5/3/2016 9:13 AM, Chris Angelico wrote:

On Tue, May 3, 2016 at 11:01 PM, DFS  wrote:

On 5/3/2016 8:00 AM, Chris Angelico wrote:


On Tue, May 3, 2016 at 9:25 PM, Jussi Piitulainen
 wrote:


Chris Angelico writes:


This assumes, of course, that there is a function swapcase which can
return a string with case inverted. I'm not sure such a function
exists.



   str.swapcase("foO")
   'FOo'



I suppose for this discussion it doesn't matter if it's imperfect.




What was imperfect?


It doesn't invert, the way numeric negation does.



What do you mean by 'case inverted'?

It looks like it swaps the case correctly between upper and lower.





And if you try to
define exactly what it does, you'll come right back to
isupper()/islower(), so it's not much help in defining those.

ChrisA






--
https://mail.python.org/mailman/listinfo/python-list


Re: Saving Consol outputs in a python script

2016-05-03 Thread DFS

On 5/3/2016 8:14 AM, [email protected] wrote:

Hello, I'm new to python and have a Question.

I'm running a c++ file with a python script like:

import os
import subprocess

subprocess.call(["~/caffe/build/examples/cpp_classification/classification", "deploy.prototxt", 
"this.caffemodel", "mean.binaryproto", "labels.txt", "Bild2.jpg"])

and it runs fine. On the console it gives me the output:

~/Desktop/Downloader/Sym+$ python Run_C.py
-- Prediction for Bild2.jpg --
0.9753 - "Class 1"
0.0247 - "Class 2"


What I need are the 2 values for the 2 classes saved in a variable in the .py 
script, so that I can write them into a text file.

Would be super nice if someone could help me!



This looks like the ticket:

http://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/
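
Besides the stdout redirection described in that article, a simpler route for this case is to let subprocess hand the output back directly. A rough sketch, assuming the program prints lines shaped like the ones shown above; the helper names run_and_capture and parse_predictions are invented for illustration:

```python
import re
import subprocess

def run_and_capture(cmd):
    """Run cmd (a list of strings) and return everything it wrote to stdout as text."""
    return subprocess.check_output(cmd, universal_newlines=True)

def parse_predictions(text):
    """Extract (score, label) pairs from lines like: 0.9753 - "Class 1"."""
    pattern = re.compile(r'^\s*(\d+(?:\.\d+)?)\s+-\s+"([^"]*)"', re.MULTILINE)
    return [(float(score), label) for score, label in pattern.findall(text)]

sample = '-- Prediction for Bild2.jpg --\n0.9753 - "Class 1"\n0.0247 - "Class 2"\n'
print(parse_predictions(sample))
```

In the real script, the text would come from run_and_capture([...]) with the same argument list passed to subprocess.call above, and the resulting pairs can then be written to a text file. Note that subprocess does not expand ~ when no shell is involved; os.path.expanduser does that.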




have a nice day!

Steffen



--
https://mail.python.org/mailman/listinfo/python-list


Re: Not x.islower() has different output than x.isupper() in list output...

2016-05-03 Thread DFS

On 5/3/2016 10:49 AM, Jussi Piitulainen wrote:

DFS writes:


On 5/3/2016 9:13 AM, Chris Angelico wrote:



It doesn't invert, the way numeric negation does.


What do you mean by 'case inverted'?

It looks like it swaps the case correctly between upper and lower.


There's letters that do not come in exact pairs of upper and lower case,
so _some_ swaps are not invertible: you swap twice and end up somewhere
else than your starting point.

The "\N{ANGSTROM SIGN}" looks like the Swedish upper-case
a-with-ring-above but isn't the same character, yet Python swaps its
case to the actual lower-case a-with-ring above. It can't go back to
_both_ the Angstrom sign and the actual upper case letter.

(Not sure why the sign is considered a cased letter at all.)



Thanks for the explanation.

Does that mean:

lower(Å) != å ?

and

upper(å) != Å ?
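
A quick way to check this, assuming Python 3 (where str is Unicode): the ordinary letter pair round-trips fine, and it is only the Angstrom sign that cannot get back to itself.

```python
angstrom = "\N{ANGSTROM SIGN}"                                # U+212B
upper_a_ring = "\N{LATIN CAPITAL LETTER A WITH RING ABOVE}"   # U+00C5
lower_a_ring = "\N{LATIN SMALL LETTER A WITH RING ABOVE}"     # U+00E5

# Both upper-case-looking characters lower to the same letter...
print(angstrom.lower() == lower_a_ring)            # True
print(upper_a_ring.lower() == lower_a_ring)        # True
# ...so going back up can only land on one of them.
print(lower_a_ring.upper() == upper_a_ring)        # True
print(angstrom.swapcase().swapcase() == angstrom)  # False
```

So lower(Å) == å and upper(å) == Å both hold for the real letter; the swap that is not invertible is the one starting from the sign.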


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-03 Thread DFS

On 5/3/2016 11:28 AM, Tim Chase wrote:

On 2016-05-03 00:24, DFS wrote:

One small comparison I was able to make was VBA vs python/pyodbc to
summarize an Access database.  Not quite a fair test, but
interesting nonetheless.

Access 2003 file
Access 2003 VBA code
Time: 0.18 seconds

same Access 2003 file
32-bit python 2.7.11 + 32-bit pyodbc 3.0.6
Time: 0.49 seconds


Curious whether you're forcing Access VBA to talk over ODBC or
whether Access is using native access/file-handling (and thus
bypassing the ODBC overhead)?



The latter, which is why I said "not quite a fair test".


--
https://mail.python.org/mailman/listinfo/python-list


Re: How to become more motivated to learn Python

2016-05-03 Thread DFS

On 5/3/2016 10:12 PM, Christopher Reimer wrote:



When I realized that I wasn't learning enough about the Python language
from translating BASIC games, I started coding a chess engine. If you
ever look at the academic literature for chess programming from the last
50+ years, you can spend a lifetime solving the programming challenges
from implementing the game of kings.



We can have a good thread on python chess engines some time.  I'm also 
going to write a chess engine in python - follow the UCI protocol and 
all.  You're way ahead of me, I'm sure, but I did already look into 
algebraic notation, game recording, FEN and all that.


pyChess is a nice little game: www.pychess.org

The one thing I'm not going to do is review anyone else's code until I 
put out v1.0 of my own.  My goal with v1.0 is for the pieces to make 
valid moves.  That's it.  Following that, I'll work on getting the game 
recording right.  No 'strategy' at first.  Maybe later I can load a 
library of well-known openings and try to utilize them.


How far along are you in your engine development?

Getting the code for en passant and castling right looks to be a bit of 
an obstacle.


What's nice is the strongest engine (Stockfish) is totally open source.


--
https://mail.python.org/mailman/listinfo/python-list


python chess engines

2016-05-03 Thread DFS

On 5/3/2016 8:00 PM, DFS wrote:

How far along are you in your engine development?



I can display a text-based chess board on the console (looks better
with a mono font).

8   BR BN BB BQ BK BB BN BR

7   BP BP BP BP BP BP BP BP

6   __ __ __ __ __ __ __ __

5   __ __ __ __ __ __ __ __

4   __ __ __ __ __ __ __ __

3   __ __ __ __ __ __ __ __

2   WP WP WP WP WP WP WP WP

1   WR WN WB WQ WK WB WN WR

    A  B  C  D  E  F  G  H


With feedback from this list, I had to break a lot of bad Java habits
to make the code more Pythonic. Right now I'm going back and forth
between writing documentation and unit tests. Once I've finalized the
code in its current state, I'll post it up on GitHub under the MIT
license. Future updates will have a fuller console interface and
moves for individual pieces implemented.




Thank you,

Chris R.



Wanted to start a new thread, rather than use the 'motivated' thread.

Can you play your game at the console?

The way I think about a chess engine is it doesn't even display a board. 
 It accepts a move as input, records the move, analyzes the positions 
after the move, and returns the next move.


Here's the UCI protocol.
http://download.shredderchess.com/div/uci.zip
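
The "accept a move, record it, return the next move" loop can be sketched as a tiny UCI command handler. This is only an outline under assumptions: the engine name is invented, only four commands are handled, and the search is replaced by the UCI null move "0000".

```python
def handle_uci_line(line, state):
    """Return the list of reply lines for one UCI command; state is a plain dict."""
    words = line.split()
    if not words:
        return []
    command = words[0]
    if command == "uci":
        # Identify the engine, then signal that UCI mode is ready.
        return ["id name SketchEngine", "id author unknown", "uciok"]
    if command == "isready":
        return ["readyok"]
    if command == "position":
        # Record the moves played so far, e.g. "position startpos moves e2e4 e7e5".
        if "moves" in words:
            state["moves"] = words[words.index("moves") + 1:]
        else:
            state["moves"] = []
        return []
    if command == "go":
        # A real engine would search here; "0000" is only a placeholder null move.
        return ["bestmove 0000"]
    return []

state = {}
for cmd in ["uci", "isready", "position startpos moves e2e4 e7e5", "go"]:
    for reply in handle_uci_line(cmd, state):
        print(reply)
```

No board is drawn anywhere, which matches the idea that the engine only talks in moves and leaves display to a separate front end.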


--
https://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to retrieve and write html contents to file

2016-05-03 Thread DFS

On 5/3/2016 2:41 PM, Tim Chase wrote:

On 2016-05-03 13:00, DFS wrote:

On 5/3/2016 11:28 AM, Tim Chase wrote:

On 2016-05-03 00:24, DFS wrote:

One small comparison I was able to make was VBA vs python/pyodbc
to summarize an Access database.  Not quite a fair test, but
interesting nonetheless.

Access 2003 file
Access 2003 VBA code
Time: 0.18 seconds

same Access 2003 file
32-bit python 2.7.11 + 32-bit pyodbc 3.0.6
Time: 0.49 seconds


Curious whether you're forcing Access VBA to talk over ODBC or
whether Access is using native access/file-handling (and thus
bypassing the ODBC overhead)?


The latter, which is why I said "not quite a fair test".


Can you try the same tests, getting Access/VBA to use ODBC instead to
see how much overhead ODBC entails?

-tkc



Done.

I dropped a few extraneous tables from the database (was 114 tables):

Access 2003 .mdb file
2,009,164 rows
97 tables  (max row = 600288)
725 columns
  text:  389
  boolean:   4
  numeric:   261
  date-time: 69
  binary:2
264 indexes (25 foreign keys)*
299,167,744 bytes on disk


1. DAO
   Time: 0.15 seconds

2. ADODB, Access ODBC driver, OpenSchema method**
   Time: 0.26 seconds

3. python, pyodbc, Access ODBC driver
   Time: 0.42 seconds




* despite being written by Microsoft, the Access ODBC driver doesn't
  support the ODBC SQLForeignKeys function, so the python code doesn't
  show a count of foreign keys

** the Access ODBC driver doesn't support the adSchemaIndexes or
   adSchemaForeignKeys query types, so I used DAO code to count
   indexes and foreign keys.






--
https://mail.python.org/mailman/listinfo/python-list


Re: Not x.islower() has different output than x.isupper() in list output...

2016-05-04 Thread DFS

On 5/3/2016 11:28 PM, Steven D'Aprano wrote:

On Wed, 4 May 2016 12:49 am, Jussi Piitulainen wrote:


DFS writes:


On 5/3/2016 9:13 AM, Chris Angelico wrote:



It doesn't invert, the way numeric negation does.


What do you mean by 'case inverted'?

It looks like it swaps the case correctly between upper and lower.


There's letters that do not come in exact pairs of upper and lower case,


Languages with two distinct lettercases, like English, are called bicameral.
The two cases are technically called majuscule and minuscule, but
colloquially known as uppercase and lowercase since movable type printers
traditionally used to keep the majuscule letters in a drawer above the
minuscule letters.

Many alphabets are unicameral, that is, they only have a single lettercase.
Examples include Hebrew, Arabic, Hangul, and many others. Georgian is an
interesting example, as it is the only known written alphabet that started
as a bicameral script and then became unicameral.

Consequently, many letters are neither upper nor lower case, and have
Unicode category "Letter other":

py> import unicodedata
py> c = u'\N{ARABIC LETTER FEH}'
py> unicodedata.category(c)
'Lo'
py> c.isalpha()
True
py> c.isupper()
False
py> c.islower()
False


Even among bicameral alphabets, there are a few anomalies. The three most
obvious ones are Greek sigma, German Eszett (or "sharp S") and Turkish I.

(1) The Greek sigma is usually written as Σ or σ in uppercase and lowercase
respectively, but at the end of a word, lowercase sigma is written as ς.

(This final sigma is sometimes called "stigma", but should not be confused
with the archaic Greek letter stigma, which has two cases Ϛ ϛ, at least
when it is not being written as digamma Ϝϝ -- and if you're confused, so
are the Greeks :-)

Python 3.3 correctly handles the sigma/final sigma when upper- and
lowercasing:

py> 'ΘΠΣΤΣ'.lower()
'θπστς'

py> 'ΘΠΣΤΣ'.lower().upper()
'ΘΠΣΤΣ'



(2) The German Eszett ß traditionally existed in only lowercase forms, but
despite the existence of an uppercase form since at least the 19th century,
when the Germans moved away from blackletter to Roman-style letters, the
uppercase form was left out. In recent years, printers in Germany have
started to reintroduce an uppercase version, and the German government have
standardized on its use for placenames, but not other words.

(Aside: in Germany, ß is not considered a distinct letter of the alphabet,
but a ligature of ss; historically it derived from a ligature of ſs, ſz or
ſʒ. The funny characters you may or may not be able to see are the long-S
and round-Z.)

Python follows common, but not universal, German practice for eszett:

py> 'ẞ'.lower()
'ß'
py> 'ß'.upper()
'SS'

Note that this is lossy: given a name like "STRASSER", it is impossible to
tell whether it should be title-cased to "Strasser" or "Straßer". It also
means that uppercasing a string can make it longer.
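
The lossy round trip and the change in length are easy to see in a short session, assuming Python 3.3 or later (for str.casefold):

```python
name = "Straßer"
shouted = name.upper()

print(shouted)                  # STRASSER
print(len(name), len(shouted))  # 7 8
print(shouted.title())          # Strasser (the ß is gone for good)

# For caseless comparison, casefold() maps both spellings to the same text.
print(name.casefold() == "STRASSER".casefold())  # True
```
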


For more on the uppercase eszett, see:

https://typography.guru/journal/germanys-new-character/
https://typography.guru/journal/how-to-draw-a-capital-sharp-s-r18/


(3) In most Latin alphabets, the lowercase i and j have a "tittle" diacritic
on them, but not the uppercase forms I and J. Turkish and a few other
languages have both I-with-tittle and I-without-tittle.

(As far as I know, there is no language with a dotless J.)

So in Turkish, the correct uppercase to lowercase and back again should go:

Dotless I: I -> ı -> I

Dotted I: İ -> i -> İ

Python does not quite manage to handle this correctly for Turkish
applications, since it loses the dotted/dotless distinction:

py> 'ı'.upper()
'I'
py> 'İ'.lower()
'i'

and further case conversions follow the non-Turkish rules.

Note that sometimes getting this wrong can have serious consequences:

http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail



Linguist much?


--
https://mail.python.org/mailman/listinfo/python-list


Re: Not x.islower() has different output than x.isupper() in list output...

2016-05-04 Thread DFS

On 5/4/2016 11:37 AM, Steven D'Aprano wrote:

On Thu, 5 May 2016 12:09 am, DFS wrote:


On 5/3/2016 11:28 PM, Steven D'Aprano wrote:



Languages with two distinct lettercases, like English, are called
bicameral.

[...]


Linguist much?



Possibly even a cunning one.



I see you as more of a Colonel Angus.




--
https://mail.python.org/mailman/listinfo/python-list


No SQLite newsgroup, so I'll ask here about SQLite, python and MS Access

2016-05-04 Thread DFS
Both of the following python commands successfully create a SQLite3 
datafile which crashes Access 2003 immediately upon trying to open it 
(via an ODBC linked table).


import sqlite3
conn = sqlite3.connect("dfile.db")

import pyodbc   
conn = pyodbc.connect('Driver={SQLite3 ODBC Driver};Database=dfile.db')


The file is created, a table is added, I add rows to the table in code, 
etc., and it can be read by 'DB Browser for SQLite' so it's a valid 
SQLite3 database, but Access won't read it.  I can create and store a 
link to the table - using that ODBC driver - but as soon as I try to 
open it: "Microsoft Access has stopped working"
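
One check that does not depend on Access or any ODBC driver is to look at the file header: a SQLite 3 database starts with the 16 bytes "SQLite format 3" plus a NUL. The sketch below does not explain the crash; it only confirms that the Python-created file is a real SQLite 3 file. The table name t and the temporary path are made up for illustration.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"

def looks_like_sqlite3(path):
    """True if the file begins with the 16-byte SQLite 3 header string."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC

# Build a small database in a temporary directory (hypothetical table "t").
path = os.path.join(tempfile.mkdtemp(), "dfile.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t (name) VALUES (?)", ("row 1",))
conn.commit()
conn.close()

print(looks_like_sqlite3(path))
```

Note that sqlite3.connect() alone leaves a zero-byte file until something is actually written, so the check is only meaningful after a table exists.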



On the other hand, a SQLite3 file created in VBScript, using the same 
ODBC driver, /is/ readable with Access 2003:


Set conn = CreateObject("ADODB.Connection")
conn.Open "Driver={SQLite3 ODBC Driver};Database=dfile.db;"


python 2.7.11, pyodbc 3.0.6, ODBC driver, and Access 2003: all 32-bit
OS is Win8.1Pro 64-bit.


I can't find anything on the web.

Any ideas?

Thanks

--
https://mail.python.org/mailman/listinfo/python-list


Whittle it on down

2016-05-04 Thread DFS

Want to whittle a list like this:

[u'Espa\xf1ol', 'Health & Fitness Clubs (36)', 'Health Clubs & 
Gymnasiums (42)', 'Health Fitness Clubs', 'Name', 'Atlanta city guide', 
'edit address', 'Tweet', 'PHYSICAL FITNESS CONSULTANTS & TRAINERS', 
'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS', 
'www.custombuiltpt.com/', 'RACQUETBALL COURTS PRIVATE', 
'www.lafitness.com', 'GYMNASIUMS', 'HEALTH & FITNESS CLUBS', 
'www.lafitness.com', 'HEALTH & FITNESS CLUBS', 'www.lafitness.com', 
'PERSONAL FITNESS TRAINERS', 'HEALTH CLUBS & GYMNASIUMS', 'EXERCISE & 
PHYSICAL FITNESS PROGRAMS', 'FITNESS CENTERS', 'HEALTH CLUBS & 
GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS', 'PERSONAL FITNESS TRAINERS', 
'5', '4', '3', '2', '1', 'Yellow Pages', 'About Us', 'Contact Us', 
'Support', 'Terms of Use', 'Privacy Policy', 'Advertise With Us', 
'Add/Update Listing', 'Business Profile Login', 'F.A.Q.']


down to

['PHYSICAL FITNESS CONSULTANTS & TRAINERS', 'HEALTH CLUBS & GYMNASIUMS', 
'HEALTH CLUBS & GYMNASIUMS', 'RACQUETBALL COURTS PRIVATE', 'GYMNASIUMS', 
'HEALTH & FITNESS CLUBS', 'HEALTH & FITNESS CLUBS',  'PERSONAL FITNESS 
TRAINERS', 'HEALTH CLUBS & GYMNASIUMS', 'EXERCISE & PHYSICAL FITNESS 
PROGRAMS', 'FITNESS CENTERS', 'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS 
& GYMNASIUMS', 'PERSONAL FITNESS TRAINERS']




Want to keep all elements containing only upper case letters or upper 
case letters and ampersand (where ampersand is surrounded by spaces)


Is it easier to extract elements meeting those conditions, or remove 
elements meeting the following conditions:


* elements with a lower-case letter in them
* elements with a number in them
* elements with a period in them

?


So far all I figured out is remove items with a period:
newlist = [ x for x in oldlist if "." not in x ]


Thanks for help, python gurus.
--
https://mail.python.org/mailman/listinfo/python-list


Re: No SQLite newsgroup, so I'll ask here about SQLite, python and MS Access

2016-05-04 Thread DFS

On 5/4/2016 10:02 PM, Stephen Hansen wrote:

On Wed, May 4, 2016, at 03:46 PM, DFS wrote:

I can't find anything on the web.


Have you tried:
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
If you really must access it over a newsgroup, you can use the Gmane
mirror:
http://gmane.org/info.php?group=gmane.comp.db.sqlite.general



Thanks



Any ideas?


Sorry, I don't use Access.



--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 2:04 AM, Steven D'Aprano wrote:

On Thursday 05 May 2016 14:58, DFS wrote:


Want to whittle a list like this:

[...]

Want to keep all elements containing only upper case letters or upper
case letters and ampersand (where ampersand is surrounded by spaces)



Start by writing a function or a regex that will distinguish strings that
match your conditions from those that don't. A regex might be faster, but
here's a function version.

def isupperalpha(string):
    return string.isalpha() and string.isupper()

def check(string):
    if isupperalpha(string):
        return True
    parts = string.split("&")
    if len(parts) < 2:
        return False
    # Don't strip leading spaces from the start of the string.
    parts[0] = parts[0].rstrip(" ")
    # Or trailing spaces from the end of the string.
    parts[-1] = parts[-1].lstrip(" ")
    # But strip leading and trailing spaces from the middle parts
    # (if any).
    for i in range(1, len(parts)-1):
        parts[i] = parts[i].strip(" ")
    return all(isupperalpha(part) for part in parts)


Now you have two ways of filtering this. The obvious way is to extract
elements which meet the condition. Here are two ways:

# List comprehension.
newlist = [item for item in oldlist if check(item)]

# Filter, Python 2 version
newlist = filter(check, oldlist)

# Filter, Python 3 version
newlist = list(filter(check, oldlist))


In practice, this is the best (fastest, simplest) way. But if you fear that
you will run out of memory dealing with absolutely humongous lists with
hundreds of millions or billions of strings, you can remove items in place:


def remove(func, alist):
    for i in range(len(alist)-1, -1, -1):
        if not func(alist[i]):
            del alist[i]


Note the magic incantation to iterate from the end of the list towards the
front. If you do it the other way, Bad Things happen. Note that this will
use less memory than extracting the items, but it will be much slower.
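
To make the "Bad Things" concrete, here is a small sketch comparing the reverse loop with a deliberately wrong forward loop. The function names and the sample data are invented for the demonstration:

```python
def remove_reversed(keep, alist):
    """Delete items failing keep(), walking from the end so indexes stay valid."""
    for i in range(len(alist) - 1, -1, -1):
        if not keep(alist[i]):
            del alist[i]

def remove_forward_buggy(keep, alist):
    """Deliberately wrong: after each deletion the next item slides into a skipped slot."""
    for i, item in enumerate(alist):
        if not keep(item):
            del alist[i]

def is_two(x):
    return x == 2

good = [1, 1, 2, 1, 1]
bad = [1, 1, 2, 1, 1]
remove_reversed(is_two, good)
remove_forward_buggy(is_two, bad)
print(good)  # [2]
print(bad)   # [1, 2, 1]
```

The forward version leaves unwanted items behind because every deletion shifts the remaining elements left while the loop index keeps moving right.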

You can combine the best of both worlds. Here is a version that uses a
temporary list to modify the original in place:

# works in both Python 2 and 3
def remove(func, alist):
    # Modify list in place, the fast way.
    alist[:] = filter(func, alist)



You are out of your mind.





--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:39 AM, Stephen Hansen wrote:




pattern = re.compile(r"^[A-Z\s&]+$")



output = [x for x in list if pattern.match(x)]




Holy Shr"^[A-Z\s&]+$"  One line of parsing!

I was figuring a few list comprehensions would do it - this is better.

(note: the reason I specified 'spaces around ampersand' is so it would
remove 'Q&A' if that ever came up - but some people write 'Q & A', so
I'll live with that exception, or try to tweak it myself.)
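
For the 'spaces around ampersand' case, one possible tweak is a stricter pattern that only accepts an ampersand with a single space on each side. This is a sketch, not the pattern from the reply above: the names loose and strict and the sample list are made up for illustration.

```python
import re

# The one-liner from the reply: upper-case letters, whitespace and & in any arrangement.
loose = re.compile(r"^[A-Z\s&]+$")
# Stricter sketch: upper-case words separated by single spaces,
# with "&" allowed only as " & " between word groups.
strict = re.compile(r"^[A-Z]+(?: [A-Z]+)*(?: & [A-Z]+(?: [A-Z]+)*)*$")

samples = ["HEALTH CLUBS & GYMNASIUMS", "GYMNASIUMS", "Q&A", "Q & A",
           "www.lafitness.com", "Name", "5"]

print([s for s in samples if loose.match(s)])
print([s for s in samples if strict.match(s)])
```

The loose pattern keeps both 'Q&A' and 'Q & A'; the strict one drops 'Q&A' and keeps 'Q & A'.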

You're the man, man.

Thank you!




--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:53 AM, Jussi Piitulainen wrote:



Either way is easy to approximate with a regex:

import re
upper = re.compile(r'[A-Z &]+')
lower = re.compile(r'[^A-Z &]')
print([datum for datum in data if upper.fullmatch(datum)])
print([datum for datum in data if not lower.search(datum)])


This is similar to Hansen's solution.




I've skipped testing that the ampersand is between spaces, and I've
skipped the period. Adjust.


Will do.



This considers only ASCII upper case letters. You can add individual
letters that matter to you, or you can reach for the documentation to
find if there is some generic notation for all upper case letters.

The newer regex package on PyPI supports POSIX character classes like
[:upper:], I think, and there may or may not be notation for Unicode
character categories in re or regex - LU would be Letter, Uppercase.
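
If non-ASCII capitals matter, one option that avoids regex notation altogether is to lean on str.isupper(), which is Unicode-aware in Python 3. A small sketch; the function name is_upper_title is made up:

```python
def is_upper_title(text):
    """True if text has at least one upper-case letter and otherwise only spaces and &."""
    has_letter = any(ch.isupper() for ch in text)
    only_allowed = all(ch.isupper() or ch in " &" for ch in text)
    return has_letter and only_allowed

data = ["ESPA\u00d1OL & M\u00c1S", "Espa\u00f1ol", "HEALTH CLUBS & GYMNASIUMS",
        "www.lafitness.com", "5", " & "]
print([item for item in data if is_upper_title(item)])
```

Like the regexes above, this does not check that the ampersand sits between spaces.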


Thanks.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 9:32 AM, Stephen Hansen wrote:

On Thu, May 5, 2016, at 12:36 AM, Steven D'Aprano wrote:

Oh, a further thought...

On Thursday 05 May 2016 16:46, Stephen Hansen wrote:

I don't even care about faster: It's overly complicated. Sometimes a
regular expression really is the clearest way to solve a problem.


Putting non-ASCII letters aside for the moment, how would you match these
specs as a regular expression?


I don't know, but mostly because I wouldn't even try. The requirements
are over-specified. If you look at the OP's data (and based on previous
conversation), he's doing web scraping and trying to pull out good data.
There's no absolutely perfect way to do that because the system he's
scraping isn't meant for data processing. The data isn't cleanly
articulated.

Instead, he wants a heuristic to pull out what look like section titles.



Assigned by a company named localeze, apparently.

http://www.usdirectory.com/cat/g0

https://www.neustarlocaleze.biz/welcome/




The OP looked at the data and came up with a simple set of rules that
identify these section titles:


Want to keep all elements containing only upper case letters or upper
case letters and ampersand (where ampersand is surrounded by spaces)

This translates naturally into a simple regular expression: an uppercase
string with spaces and &'s. Now, that expression doesn't 100% encode
every detail of that rule-- it allows both Q&A and Q & A-- but on my own
looking at the data, I suspect it's good enough. The titles are clearly
separate from the other data scraped by their being upper cased. We just
need to expand our allowed character range into spaces and &'s.

Nothing in the OP's request demands the kind of rigorous matching that
your scenario does. It's a practical problem with a simple, practical
answer.



Yes.  And simplicity + practicality = successfulality.

And I do a sanity check before using the data anyway: after parse and 
cleanup and regex matching, I make sure all lists have the same number 
of elements:


lenData = [len(title),len(names),len(addr),len(street),len(city),len(state),len(zip)]


if len(set(lenData)) != 1:  alert the media
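
A runnable version of that sanity check might look like the sketch below. The helper name same_length and the toy lists are invented for illustration; the real lists come from the scraping code.

```python
def same_length(*columns):
    """True when every list passed in has the same number of elements."""
    return len(set(len(col) for col in columns)) == 1

# Toy data standing in for the scraped columns.
title = ["GYMNASIUMS", "FITNESS CENTERS"]
names = ["Gym A", "Gym B"]
city = ["Atlanta"]

print(same_length(title, names))        # True
print(same_length(title, names, city))  # False
```
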


--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:54 PM, Steven D'Aprano wrote:

On Thu, 5 May 2016 10:31 pm, DFS wrote:


You are out of your mind.


That's twice you've tried to put me down, first by dismissing my comments
about text processing with "Linguist much", and now an outright insult. The
first time I laughed it off and made a joke about it. I won't do that
again.

>

You asked whether it was better to extract the matching strings into a new
list, or remove them in place in the existing list. I not only showed you
how to do both, but I tried to give you the mental tools to understand when
you should pick one answer over the other. And your response is to insult
me and question my sanity.

Well, DFS, I might be crazy, but I'm not stupid. If that's really how you
feel about my answers, I won't make the mistake of wasting my time
answering your questions in the future.

Over to you now.



heh!  Relax, pal.

I was just trying to be funny - no insult intended either time, of 
course.  Look for similar responses from me in the future.  Usenet 
brings out the smart-aleck in me.


Actually, you should've accepted the 'Linguist much?' as a compliment, 
because I seriously thought you were.


But you ARE out of your mind if you prefer that convoluted "function" 
method over a simple 1-line regex method (as per S. Hansen).


def isupperalpha(string):
    return string.isalpha() and string.isupper()

def check(string):
    if isupperalpha(string):
        return True
    parts = string.split("&")
    if len(parts) < 2:
        return False
    parts[0] = parts[0].rstrip(" ")
    parts[-1] = parts[-1].lstrip(" ")
    for i in range(1, len(parts)-1):
        parts[i] = parts[i].strip(" ")
    return all(isupperalpha(part) for part in parts)


I'm sure it does the job well, but that style brings back [bad] memories 
of the VBA I used to write.  I expected something very concise and 
'pythonic' (which I'm learning is everyone's favorite mantra here in 
python-land).


Anyway, I appreciate ALL replies to my queries.  So thank you for taking 
the time.


Whenever I'm able, I'll try to contribute to clp as well.






Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 2:56 PM, Stephen Hansen wrote:

On Thu, May 5, 2016, at 05:31 AM, DFS wrote:

You are out of your mind.


Whoa, now. I might disagree with Steven D'Aprano about how to approach
this problem, but there's no need to be rude.


Seriously not trying to be rude - more smart-alecky than anything.

Hope D'Aprano doesn't stay butthurt...




Everyone's trying to help you, after all.


Yes, and I do appreciate it.

I've only been working with python for about a month, but I feel like 
I'm making good progress.  clp is a great resource, and I'll be hanging 
around for a long time, and will contribute when possible.


Thanks for your help.


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:39 AM, Stephen Hansen wrote:


Given:


input = [u'Espa\xf1ol', 'Health & Fitness Clubs (36)', 'Health Clubs & Gymnasiums (42)',
'Health Fitness Clubs', 'Name', 'Atlanta city guide', 'edit address', 'Tweet',
'PHYSICAL FITNESS CONSULTANTS & TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
'HEALTH CLUBS & GYMNASIUMS', 'www.custombuiltpt.com/', 'RACQUETBALL COURTS PRIVATE',
'www.lafitness.com', 'GYMNASIUMS', 'HEALTH & FITNESS CLUBS', 'www.lafitness.com',
'HEALTH & FITNESS CLUBS', 'www.lafitness.com', 'PERSONAL FITNESS TRAINERS',
'HEALTH CLUBS & GYMNASIUMS', 'EXERCISE & PHYSICAL FITNESS PROGRAMS', 'FITNESS CENTERS',
'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS', 'PERSONAL FITNESS TRAINERS',
'5', '4', '3', '2', '1', 'Yellow Pages', 'About Us', 'Contact Us', 'Support',
'Terms of Use', 'Privacy Policy', 'Advertise With Us', 'Add/Update Listing',
'Business Profile Login', 'F.A.Q.']


Then:


pattern = re.compile(r"^[A-Z\s&]+$")
output = [x for x in input if pattern.match(x)]
output



['PHYSICAL FITNESS CONSULTANTS & TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
'HEALTH CLUBS & GYMNASIUMS', 'RACQUETBALL COURTS PRIVATE', 'GYMNASIUMS',
'HEALTH & FITNESS CLUBS', 'HEALTH & FITNESS CLUBS',
'PERSONAL FITNESS TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
'EXERCISE & PHYSICAL FITNESS PROGRAMS', 'FITNESS CENTERS',
'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS',
'PERSONAL FITNESS TRAINERS']



Should've looked earlier.  Their master list of categories 
http://www.usdirectory.com/cat/g0 shows a few commas, a bunch of dashes, 
and the ampersands we talked about.


"OFFICE SERVICES, SUPPLIES & EQUIPMENT" gets removed because of the comma.

"AUTOMOBILE - DEALERS" gets removed because of the dash.

I updated your regex and it seems to have fixed it.

orig: (r"^[A-Z\s&]+$")
new : (r"^[A-Z\s&,-]+$")
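
A quick check of the two patterns against a few sample strings. The sample list is assembled here for illustration; the two category names come from the examples above. The hyphen sits last in the character class so that it is taken literally rather than as a range:

```python
import re

old = re.compile(r"^[A-Z\s&]+$")
new = re.compile(r"^[A-Z\s&,-]+$")

samples = [
    "HEALTH CLUBS & GYMNASIUMS",
    "OFFICE SERVICES, SUPPLIES & EQUIPMENT",
    "AUTOMOBILE - DEALERS",
    "www.lafitness.com",
]

kept_old = [s for s in samples if old.match(s)]
kept_new = [s for s in samples if new.match(s)]

print(kept_old)  # only the first sample
print(kept_new)  # the first three samples
```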


Thanks again.




Re: Whittle it on down

2016-05-06 Thread DFS

On 5/6/2016 3:45 AM, Peter Otten wrote:

DFS wrote:



Should've looked earlier.  Their master list of categories
http://www.usdirectory.com/cat/g0 shows a few commas, a bunch of dashes,
and the ampersands we talked about.

"OFFICE SERVICES, SUPPLIES & EQUIPMENT" gets removed because of the comma.

"AUTOMOBILE - DEALERS" gets removed because of the dash.

I updated your regex and it seems to have fixed it.

orig: (r"^[A-Z\s&]+$")
new : (r"^[A-Z\s&,-]+$")


Thanks again.


If there is a "master list" compare your candidates against it instead of
using a heuristic, i. e.

categories = set(master_list)
output = [category for category in input if category in categories]

You can find the categories with


import urllib.request
import bs4
soup = bs4.BeautifulSoup(urllib.request.urlopen("http://www.usdirectory.com/cat/g0").read())

categories = set()
for li in soup.find_all("li"):
    assert li.parent.parent["class"][0].startswith("category_items")
    categories.add(li.text)

print("\n".join(sorted(categories)[:10]))




"import urllib.request
ImportError: No module named request"


I'm on python 2.7.11
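
The urllib.request module only exists in Python 3; in Python 2.7 the same urlopen function lives in urllib2. A minimal sketch of an import that should work on either version:

```python
try:
    # Python 3
    from urllib.request import urlopen
except ImportError:
    # Python 2: urlopen lives in urllib2
    from urllib2 import urlopen

# Usage (not executed here, since it needs network access):
# html = urlopen("http://www.usdirectory.com/cat/g0").read()
```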






Accounting & Bookkeeping Services
Adoption Services
Adult Entertainment
Advertising
Agricultural Equipment & Supplies
Agricultural Production
Agricultural Services
Aids Resources
Aircraft Charters & Rentals
Aircraft Dealers & Services





Yeah, I actually did something like that last night.  Was trying to get
their full tree structure, which goes 4 levels deep: ie

Arts & Entertainment
  Newspapers
    News Dealers
      Prepress Services


What I referred to as their 'master list' is actually just 2 levels 
deep.  My bad.


So far I haven't come across one that had anything in it but letters, 
dashes, commas or ampersands.


Thanks


Re: Whittle it on down

2016-05-06 Thread DFS

On 5/6/2016 9:58 AM, DFS wrote:

On 5/6/2016 3:45 AM, Peter Otten wrote:

DFS wrote:



Should've looked earlier.  Their master list of categories
http://www.usdirectory.com/cat/g0 shows a few commas, a bunch of dashes,
and the ampersands we talked about.

"OFFICE SERVICES, SUPPLIES & EQUIPMENT" gets removed because of the
comma.

"AUTOMOBILE - DEALERS" gets removed because of the dash.

I updated your regex and it seems to have fixed it.

orig: (r"^[A-Z\s&]+$")
new : (r"^[A-Z\s&,-]+$")


Thanks again.


If there is a "master list" compare your candidates against it instead of
using a heuristic, i. e.

categories = set(master_list)
output = [category for category in input if category in categories]

You can find the categories with


import urllib.request
import bs4
soup = bs4.BeautifulSoup(urllib.request.urlopen("http://www.usdirectory.com/cat/g0").read())

categories = set()
for li in soup.find_all("li"):
    assert li.parent.parent["class"][0].startswith("category_items")
    categories.add(li.text)

print("\n".join(sorted(categories)[:10]))




"import urllib.request
ImportError: No module named request"



Figured it out using urllib2.  Your code returns 411 categories from 
that first page.


There are up to 4 levels of categorization:


Level 1: Arts & Entertainment
Level 2:   Newspapers

Level 3: Newspaper Brokers
Level 3: Newspaper Dealers Back Number
Level 3: Newspaper Delivery
Level 3: Newspaper Distributors
Level 3: Newsracks
Level 3: Printers Newspapers
Level 3: Newspaper Dealers

Level 3: News Dealers
Level 4:   News Dealers Wholesale
Level 4:   Shoppers News Publications

Level 3: News Service
Level 4:   Newspaper Feature Syndicates
Level 4:   Prepress Services




http://www.usdirectory.com/cat/g0 shows 21 Level 1 categories, and 390 
Level 2.  To get the Level 3 and 4 you have to drill-down using the 
hyperlinks.


How to do it in python code is beyond my skills at this point.  Get the 
hrefs and load them and parse, then get the next level and load them and 
parse, etc.?
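
One way to sketch that drill-down is a small recursive function: fetch a page, collect its links, then repeat for each link until the depth limit is reached. This is only an outline under loud assumptions: it uses the Python 3 standard library, it treats every link on a page as a subcategory (a real page would need filtering, for example by CSS class), and the example.test URLs and the fetch argument are stand-ins for real HTTP requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl(url, fetch, max_depth, depth=1, seen=None):
    """Return {category name: nested dict of its subcategories}."""
    seen = set() if seen is None else seen
    if depth > max_depth or url in seen:
        return {}
    seen.add(url)
    parser = LinkCollector()
    parser.feed(fetch(url))
    tree = {}
    for href, text in parser.links:
        child = urljoin(url, href)  # resolve relative links
        tree[text] = crawl(child, fetch, max_depth, depth + 1, seen)
    return tree


# Hypothetical pages standing in for the real site.
pages = {
    "http://example.test/cat/g0": '<a href="/cat/arts">Arts &amp; Entertainment</a>',
    "http://example.test/cat/arts": '<a href="/cat/news">Newspapers</a>',
    "http://example.test/cat/news": '<a href="/cat/dealers">News Dealers</a>',
    "http://example.test/cat/dealers": "",
}

tree = crawl("http://example.test/cat/g0", pages.__getitem__, max_depth=4)
print(tree)
```

Passing the fetch function in as an argument keeps the crawl logic testable without touching the network; for the live site it could be replaced by something that calls urlopen and decodes the response.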







A fun python CLI program for all to enjoy!

2016-05-06 Thread DFS

getAddresses.py

Scrapes addresses from www.usdirectory.com and stores them in a SQLite 
database, or writes them to text files for mailing labels, etc


Now, just by typing 'fast food Taco Bell  10 db all' you can find 
out how many Taco Bells are within 10 miles of you, and store all the 
addresses in your own address database.


No more convoluted Googling, or hitting the 'Next Page' button, or 
fumbling with the Yellow Pages...


Note: the db structure is flat on purpose, and the .csv files aren't 
quote delimited.


Put the program in its own directory.  It creates the SQLite database 
there, and writes files there, too.


Reviews of code, bug reports, criticisms, suggestions for improvement, 
etc are all welcome.


Enjoy!




#getAddresses.py

import os, sys, requests, time, datetime
from lxml import html
import pyodbc, sqlite3, re


#show values of variables, HTML content, etc
#set it to False for short/concise program output
verbose = False
if verbose == True:
    print "The verbose setting is turned On."
    print ""


#check if address is unique
addrCheck = []
def addrUnique(addr):
    if addr not in addrCheck:
        x = True
        addrCheck.append(addr)
    else: x = False
    return x


#validate and parse command line
def showHelp():
    print ""
    print " Enter search word(s), city or zip, state, miles to search, txt or csv or db, # addresses to save (no commas)"
    print ""
    print " eg: restaurant Knoxville TN 10 txt 50"
    print " search for restaurants within 10 miles of Knoxville TN, and write"
    print " the first 50 address to a txt file"
    print ""
    print " eg: furniture 30303 GA 20 csv all"
    print " search for furniture within 20 miles of zip 30303 GA,"
    print " and write all results to a csv file"
    print ""
    print " eg: boxing gyms Detroit MI 10 db 5"
    print " search for boxing gyms within 10 miles of Detroit MI, and store"
    print " the first 5 results in a database"
    print ""
    print " All entries are case-insensitive (ie TX or tx are acceptable)"
    exit(0)

argCnt = len(sys.argv)
if argCnt < 7: showHelp()
if verbose == True:
    print ""
    print str(argCnt) + " arguments"

keyw = ""                           #eg restaurant, boxing gym
if argCnt == 7: keyw = sys.argv[1]  #one search word
if argCnt >  7:                     #multiple search words
    for i in range(1,argCnt-5):
        keyw = keyw + sys.argv[i] + "+"
    keyw = keyw[:-1]                #drop trailing + sign
cityzip  = sys.argv[argCnt-5]   #eg Atlanta or 30339
state    = sys.argv[argCnt-4]   #eg GA
miles    = sys.argv[argCnt-3]   #eg 5,10,20,30,50 (website allows max 30)
store    = sys.argv[argCnt-2]   #write address to file or database
addrWant = sys.argv[argCnt-1]   #eg save All or number >0

if addrWant.lower() != "all": #how many addresses to save
    if addrWant.isdigit() == False: showHelp()
    if addrWant == "0": showHelp()
    addrWant = int(addrWant)
elif addrWant.lower() == "all": addrWant = addrWant.lower()
else: addrWant = int(addrWant)

if store != "csv" and store != "txt" and store != "db": showHelp()


#begin timing the code
startTime = time.clock()


#website, SQLite db, search string, current date/time for use with db
datasrc = "www.usdirectory.com"
dbName  = "addresses.sqlite"
search  = keyw + " " + str(cityzip) + " " + state + " " + str(miles) + " " + str(addrWant)

loaddt = datetime.datetime.now()


#write addresses to file
#each time the same search is done, the file is deleted and recreated
if store == "csv" or store == "txt":
    #csv will write in .csv format - header and 1 line per address
    #txt will write out 3 lines per address, then blank before next address
    webfile  = "usdirectory.com_"+keyw+"_"+cityzip+"_"+state+"."+store
    f = open(webfile,"w")
    if store == "csv": f.write("Name,Address,CityStateZip\n")
    f.close


#store addresses in database
cSQL = ""
if store == "db":
    #creates a SQLite database that Access 2003 can't read
    #conn = sqlite3.connect(dbName)

    #also creates a SQLite database that Access 2003 can't read
    conn = pyodbc.connect('Driver={SQLite3 ODBC Driver};Database=' + dbName)
    db   = conn.cursor()

    cSQL =  "CREATE TABLE If Not Exists ADDRESSES "
    cSQL += "(datasrc, search, category, name, street, city, state, zip, loaddt, "
    cSQL += "PRIMARY KEY (datasrc, search, name, street));"
    db.execute(cSQL)

    # cSQL =  "CREATE TABLE If Not Exists CATEGORIES "
    # cSQL += "(catID INTEGER PRIMARY KEY, catDesc);"
    # db.execute(cSQL)
    # db.execute("CREATE UNIQUE INDEX If Not Exists UIDX_CATDESC ON CA

Re: Whittle it on down

2016-05-06 Thread DFS

On 5/6/2016 11:44 AM, Peter Otten wrote:

DFS wrote:


There are up to 4 levels of categorization:



http://www.usdirectory.com/cat/g0 shows 21 Level 1 categories, and 390
Level 2.  To get the Level 3 and 4 you have to drill-down using the
hyperlinks.

How to do it in python code is beyond my skills at this point.  Get the
hrefs and load them and parse, then get the next level and load them and
parse, etc.?


Yes, that should work ;)



How about you do it, and I'll tell you if you did it right?

ha!






Re: A fun python CLI program for all to enjoy!

2016-05-06 Thread DFS

On 5/6/2016 4:30 PM, MRAB wrote:

On 2016-05-06 20:10, DFS wrote:

getAddresses.py

Scrapes addresses from www.usdirectory.com and stores them in a SQLite
database, or writes them to text files for mailing labels, etc

Now, just by typing 'fast food Taco Bell  10 db all' you can find
out how many Taco Bells are within 10 miles of you, and store all the
addresses in your own address database.

No more convoluted Googling, or hitting the 'Next Page' button, or
fumbling with the Yellow Pages...

Note: the db structure is flat on purpose, and the .csv files aren't
quote delimited.

Put the program in its own directory.  It creates the SQLite database
there, and writes files there, too.

Reviews of code, bug reports, criticisms, suggestions for improvement,
etc are all welcome.


OK, you asked for it... :-)

1. It's shorter and clearer not to compare with True or False:

   if verbose:

   and:

   if not dupeRow:



Done.  It will take some getting used to, though.  I like that it's 
shorter, but I could do the same in VBA and almost always chose not to.





2. You can print a blank line with an empty print statement:

   print


Done.  I actually like the way print  looks better than print ""




3. When looking for unique items, a set is a better choice than a list:

   addrCheck = set()

   def addrUnique(addr):
       if addr not in addrCheck:
           x = True
           addrCheck.add(addr)
       else:
           x = False
       return x


Done.

I researched this just now on StackOverflow:

"Sets are significantly faster when it comes to determining if an object 
is present in the set"

and
"lists are very nice to sort and have order while sets are nice to use 
when you don't want duplicates and don't care about order."


The speed difference won't matter here in my little app, but it's better 
to use the right construct for the job.
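
A compact sketch of the set-based check, written with early returns instead of the temporary x. The sample addresses are invented:

```python
addrCheck = set()


def addrUnique(addr):
    """Return True the first time addr is seen, False afterwards."""
    if addr in addrCheck:
        return False
    addrCheck.add(addr)
    return True


results = [addrUnique(a) for a in ["1 Main St", "2 Oak Ave", "1 Main St"]]
print(results)  # [True, True, False]
```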





4. Try string formatting instead of multiple concatenation:

   print "%s arguments" % argCnt



You're referring to this line:
print str(argCnt) + " arguments"

Is there a real benefit of using string formatting here?  (other than 
the required str() conversion)





5. Strings have a .join method, and when you combine it with string
slicing:

   keyw = "+".join(sys.argv[1 : argCnt - 5])



Slick.  Works a treat, and saved 2 lines of code.  String handling is 
another area in which python shines compared to VB.
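
For comparison, here is the loop version next to the join version, run on a stand-in argument list taken from the "boxing gyms" example in the help text (the list argv is hard-coded here rather than read from sys.argv):

```python
# Stand-in for: getAddresses.py boxing gyms Detroit MI 10 db 5
argv = ["getAddresses.py", "boxing", "gyms", "Detroit", "MI", "10", "db", "5"]
argCnt = len(argv)

# Loop version: build up the string, then drop the trailing "+"
keyw_loop = ""
for i in range(1, argCnt - 5):
    keyw_loop = keyw_loop + argv[i] + "+"
keyw_loop = keyw_loop[:-1]

# join version: the slice picks the same items the loop visits
keyw_join = "+".join(argv[1 : argCnt - 5])

print(keyw_loop)  # boxing+gyms
print(keyw_join)  # boxing+gyms
```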




6. Another example of string formatting:

   search = "%s %s %s %s %s" % (keyw, cityzip, state, miles, addrWant)


Done.  It's shorter, and doesn't require the str() conversion I had to 
do on several of the items.
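
A small sketch of the difference, assuming for illustration that miles and addrWant are already integers (the values are made up):

```python
keyw, cityzip, state, miles, addrWant = "restaurant", "Knoxville", "TN", 10, 50

# Concatenating str and int objects raises TypeError.
try:
    search = keyw + " " + cityzip + " " + state + " " + miles + " " + addrWant
    raised = False
except TypeError:
    raised = True

# %s converts each value to a string for you.
search = "%s %s %s %s %s" % (keyw, cityzip, state, miles, addrWant)
print(search)  # restaurant Knoxville TN 10 50
```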


If I can remember to use it, it should eliminate these:
"TypeError: cannot concatenate 'str' and 'int' objects"




7. It's recommended to use the 'with' statement when handling files:

   with open(webfile, "w") as f:
       if store == "csv":
           f.write("Name,Address,CityStateZip\n")



Done.  I read that using 'with' means Python closes the file even if an 
exception occurs.  So a definite benefit.
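
A short sketch of that behaviour, using a temporary directory and a deliberately raised exception (both are just for the demonstration):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "addresses.csv")

try:
    with open(path, "w") as f:
        f.write("Name,Address,CityStateZip\n")
        raise RuntimeError("simulated failure while writing")
except RuntimeError:
    pass

# The with block closed the file on the way out, despite the exception.
print(f.closed)  # True
```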





   If you don't want to use the 'with' statement, note that closing the
file is:

   f.close()

   It needs the "()"!


I used close() in 1 place, but close without parens in 2 other places. 
So it works either way.  Good catch.


(it's moot now: all 'f.open()/f.close()' replaced by 'with open()')




8. When using SQL, you shouldn't try to insert the values yourself; you
should use parametrised queries:

   cSQL = "DELETE FROM addresses WHERE datasrc = ? AND search = ?;"
   if verbose:
       print cSQL
   db.execute(cSQL, (datasrc, search))
   conn.commit()

It'll insert the values where the "?" are and will do any necessary
quoting itself. (Actually, some drivers use "?", others use "%s", so if
it doesn't work with one, try the other.)

The way you wrote it, it would fail if a value contained a "'". It's
that kind of thing that leads to SQL injection attacks.


Fixed.

You'll notice later on in the code I used the parameterized method for 
INSERTS.  I hate the look of that method, but it does make dealing with 
apostrophes easier, and makes it safer as you say.
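
A self-contained sketch of the parameterized style, using the standard sqlite3 module and an in-memory database rather than the pyodbc connection from the program. The table is cut down to three columns and the row values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
db = conn.cursor()
db.execute("CREATE TABLE addresses (datasrc, search, name)")

# A name with an apostrophe would break a hand-built SQL string.
row = ("www.usdirectory.com", "pizza Atlanta GA 10 all", "Papa John's")
db.execute("INSERT INTO addresses VALUES (?, ?, ?)", row)
conn.commit()

# The driver quotes the value itself; no manual escaping needed.
db.execute("SELECT name FROM addresses WHERE name = ?", ("Papa John's",))
found = db.fetchall()
print(found)  # [("Papa John's",)]
```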





Thanks for the code review, MRAB.  Good improvements.




Re: A fun python CLI program for all to enjoy!

2016-05-06 Thread DFS

On 5/6/2016 7:29 PM, Ethan Furman wrote:

On 05/06/2016 04:12 PM, DFS wrote:

On 5/6/2016 4:30 PM, MRAB wrote:



   If you don't want to use the 'with' statement, note that closing the
file is:

   f.close()

   It needs the "()"!


I used close() in 1 place, but close without parens in 2 other places.
So it works either way.  Good catch.


No, it doesn't.  `f.close` simply returns the close function, it doesn't
call it.  The "it works" was simply because Python closed the files for
you later.

Not a big deal in a small program like this, but still a mistake.



Yes.

Check out the answer by 'unutbu' here:

http://stackoverflow.com/questions/1832528/is-close-necessary-when-using-iterator-on-a-python-file-object

He says "I...checked /proc/PID/fd for when the file descriptor was 
closed. It appears that when you break out of the for loop, the file is 
closed for you."


Improper f.close didn't seem to affect any of the files my program wrote 
- and I checked a lot of them when I was writing the code.


Maybe it worked because the last time the file was written to was in a 
for loop, so I got lucky and the files weren't truncated?  Don't know.
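
The difference between f.close and f.close() can be seen directly by checking the file object's closed attribute. A minimal sketch, using a temporary file just for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")
f.write("hello\n")

f.close                    # only looks up the method; nothing is called
still_open = not f.closed  # the file is still open here

f.close()                  # actually flushes and closes the file
now_closed = f.closed

print(still_open, now_closed)  # True True
```

Without the parentheses, the data only reaches disk when the file object is garbage-collected or the interpreter exits, which is why the output files looked fine anyway.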


Did you notice any other gotchas in the program?



