[Tutor] Finding unique strings.
All,

I have a list of strings which has been downloaded from my bank. I am trying to build a program to find the unique string patterns, which I want to use with a dictionary so I can group the different transactions together. Below are example unique strings which I have manually extracted from the data. Everything after the example text is different. I cannot show the full data due to privacy.

WITHDRAWAL AT HANDYBANK
PAYMENT BY AUTHORITY
WITHDRAWAL BY EFTPOS
WITHDRAWAL MOBILE
DEPOSIT ACCESSPAY

Note: Some of the entries have a store name contained in the string towards the end. For example:

WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09

Thus I want to extract the KMART as part of the unique key. As the example transaction shown always has a number, I was going to use a test condition to test for the number. Then the next word would be added to the string for the key.

I tried to use dictionaries and managed to get unique first words, but got stuck at this point and could not work out how to build a unique keyword with multiple words. I hope someone can help.

Sean

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] pip issue
On Fri, 2019-05-03 at 10:14 +1000, Cameron Simpson wrote:
> On 02May2019 17:24, Anil Duggirala wrote:
> > I executed the pip3 install --user -r
> > contrib/requirements/requirements.txt (I actually did sudo before
> > that).
>
> Please don't use sudo for this. The notion "install" does not imply
> being root.

That is actually why I interrupted the process, because I remembered that you're not supposed to do pip as sudo or 'install' anything python as sudo.

> Try this (as _yourself_, not as root):
>
>     pip3 install --verbose --user 'aiorpcX<0.18,>=0.17.0'

I tried this and got a lot of messages like:

The package https://files.pythonhosted.org/packages/60/1c/dd77ef44387e9c51d845140cc46a27049effc04895e02f53a1006754d510/aiorpcX-0.1-py3-none-any.whl#sha256=c6fcb4bce3eb82b9bba2d80b1c57cf3e2498462b2bc8c646a1b942639a0d86eb (from https://pypi.org/simple/aiorpcx/) (requires-python:>=3.6) is incompatible with the pythonversion in use. Acceptable python versions are:>=3.6

And then:

The package https://files.pythonhosted.org/packages/fd/2e/7d9f0dd1a8c308bdc7cbda32859e9b1171768b8f68c124527da83cd4f978/aiorpcX-0.17.0.tar.gz#sha256=13ccc8361bc3049d649094b69aead6118f6deb5f1b88ad77211be85c4e2ed792 (from https://pypi.org/simple/aiorpcx/) (requires-python:>=3.6) is incompatible with the pythonversion in use. Acceptable python versions are:>=3.6
Could not find a version that satisfies the requirement aiorpcX<0.18,>=0.17.0 (from versions: )
Cleaning up...
No matching distribution found for aiorpcX<0.18,>=0.17.0
Exception information:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
    wb.build(autobuilding=True)
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
    self.requirement_set.prepare_files(self.finder)
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 554, in _prepare_file
    require_hashes
  File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 278, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/usr/lib/python3/dist-packages/pip/index.py", line 514, in find_requirement
    'No matching distribution found for %s' % req
pip.exceptions.DistributionNotFound: No matching distribution found for aiorpcX<0.18,>=0.17.0

> Also, try the --ignore-installed option and/or the --force-reinstall,
> which may cause pip3 to ignore any partial/damaged install and just do
> it all from scratch.

I also tried these, and got the same (or very similar) outputs, to no avail. Please tell me where I screwed up. I think I could learn to program in Python, but learning about the packaging and modules and using them requires a lot more time, I think.

thank you Cameron,
Re: [Tutor] Finding unique strings.
It would probably make things easier if you specified your operating system, Python version, data file type and location.

Typically, the bank info would be downloadable as a CSV (comma separated value) file. Assuming that to be the case, and assuming you are using Windows, and assuming Python 3, and assuming you are downloading your data to a file stored on your computer and running your reader program on the data file, you might start with something like:

    #! python3
    import csv

    # Initialize one empty list per column of interest.
    deposits, atm, temp = [], [], []

    filename = input('Enter file name: ')
    with open("C:\\PyScripts\\" + filename, 'r', newline='') as f:
        contents = csv.reader(f)
        for row in contents:
            ...

You would substitute your own column names in place of 'deposits, atm' and substitute your file path information in place of "C:\\PyScripts\\".

The first row of the csv file will contain your column headers. Therefore, as you read through the first row in the file, row[0] would be the first header, row[1] would be the second header and so forth. As you read the second row, row[0] would be the first value (data element) under the first column header, row[1] would be the first value under the second column header and so forth.

For example, if you were searching for deposits, assuming 'Deposits' is the 3rd column header, you could use code such as

    deposits.append(row[2])

to add deposit values to your list of deposits. Such code would be indented under 'for row in contents' to make it part of the 'for' loop.

Hopefully this will give you some ideas to get you started.

On Fri, May 3, 2019 at 5:40 AM wrote:
> All,
>
> I have a list of strings which has been downloaded from my bank. I am trying
> to build a program to find the unique string patterns which I want to use
> with a dictionary. So I can group the different transactions together. Below
> are example unique strings which I have manually extracted from the data.
> Everything after the example text is different. I cannot show the full data
> due to privacy.
>
> WITHDRAWAL AT HANDYBANK
> PAYMENT BY AUTHORITY
> WITHDRAWAL BY EFTPOS
> WITHDRAWAL MOBILE
> DEPOSIT ACCESSPAY
>
> Note: Some of the entries, have an store name contained in the string
> towards the end. For example:
>
> WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09
>
> Thus I want to extract the KMART as part of the unique key. As the shown
> example transaction always has a number. I was going to use a test condition
> for the above to test for the number. Then the next word would be added to
> the string for the key.
>
> I tried to use dictionaries and managed to get unique first words. But got
> stuck at this point and could not work out how to build a unique keyword
> with multiple words. I hope someone can help.
>
> Sean
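The CSV-reading approach described in the reply above could be sketched roughly as follows. This is only an illustration: the sample data, the column layout and the helper name read_column are assumptions made up for the example, not the real bank export.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded bank file; the real
# column names and layout will almost certainly differ.
SAMPLE = """Date,Description,Deposits
24/09,WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09,
25/09,DEPOSIT ACCESSPAY,100.00
"""


def read_column(csv_file, column_index):
    """Return the header and the non-empty values of one column."""
    reader = csv.reader(csv_file)
    header = next(reader)  # the first row holds the column headers
    values = []
    for row in reader:
        if row[column_index]:  # skip blank cells
            # append() adds the whole string as one item; "values += text"
            # would add each character of the string separately.
            values.append(row[column_index])
    return header[column_index], values


name, deposits = read_column(io.StringIO(SAMPLE), 2)
print(name, deposits)
```

With a real file, the io.StringIO object would be replaced by a file opened with open(path, newline=''), as the csv module documentation recommends.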
Re: [Tutor] Finding unique strings.
On 03/05/2019 13:07, mhysnm1...@gmail.com wrote:
> All,
>
> I have a list of strings which has been downloaded from my bank. I am trying
> to build a program to find the unique string patterns which I want to use
> with a dictionary. So I can group the different transactions together. Below
> are example unique strings which I have manually extracted from the data.
> Everything after the example text is different. I cannot show the full data
> due to privacy.
>
> WITHDRAWAL AT HANDYBANK
> PAYMENT BY AUTHORITY
> WITHDRAWAL BY EFTPOS
> WITHDRAWAL MOBILE
> DEPOSIT ACCESSPAY
>
> Note: Some of the entries, have an store name contained in the string
> towards the end. For example:
>
> WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09
>
> Thus I want to extract the KMART as part of the unique key. As the shown
> example transaction always has a number. I was going to use a test condition
> for the above to test for the number. Then the next word would be added to
> the string for the key.
>
> I tried to use dictionaries and managed to get unique first words. But got
> stuck at this point and could not work out how to build a unique keyword
> with multiple words. I hope someone can help.
>
> Sean

Please check out https://docs.python.org/3/library/collections.html#collections.defaultdict as I think it's right up your street. Examples are given at the link :)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
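As a rough idea of how collections.defaultdict might apply to the question, the sketch below groups descriptions by their first two words. The sample descriptions and the helper first_words_key are hypothetical, and two words is an arbitrary choice for illustration.

```python
from collections import defaultdict

# Hypothetical descriptions modelled on the examples in the thread.
descriptions = [
    "WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09",
    "WITHDRAWAL BY EFTPOS 0304480 KMART 1075 CASTLE HILL 25/09",
    "DEPOSIT ACCESSPAY",
    "WITHDRAWAL MOBILE",
]


def first_words_key(description, count=2):
    """Use the first `count` words of a description as the grouping key."""
    return " ".join(description.split()[:count])


# defaultdict(list) creates an empty list the first time a key is seen,
# so there is no need to check whether the key already exists.
groups = defaultdict(list)
for description in descriptions:
    groups[first_words_key(description)].append(description)

for key, members in groups.items():
    print(key, len(members))
```

A plain dict would need an explicit "if key not in groups" test (or setdefault) before each append; defaultdict removes that step.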
Re: [Tutor] pip issue
On 03/05/2019 17:11, Anil Duggirala wrote:

>> Try this (as _yourself_, not as root):
>>
>>     pip3 install --verbose --user 'aiorpcX<0.18,>=0.17.0'
>
> I tried this and got a lot of messages like:
...
> 9a0d86eb (from https://pypi.org/simple/aiorpcx/)
> (requires-python:>=3.6) is incompatible with the pythonversion in use.
> Acceptable python versions are:>=3.6

Ok, so it clearly says you need a Python version greater than or equal to 3.6. Which version of Python are you using?

> Please tell me where I screwed up. I think I could learn to program in
> Python, but learning about the packaging and modules and using them,
> requires a lot more time, I think.

Can you clarify your current status, since that will help us provide suitable solutions? Can you already program in any other language? (If so, which?) Or are you a complete beginner programmer?

Normally when learning a language it's best to start with the basics, which don't require installing third party libraries. Is there some specific task you need this library for? Is there not an existing Python distribution that includes the library already, like Enthought or Anaconda say?

Before trying to solve the problem the hard way it might be worth first checking if a simpler alternative exists.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
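The question of which Python version is in use can be answered from within Python itself. The snippet below is a small sketch; the (3, 6) minimum is taken from the "requires-python:>=3.6" text in the quoted pip output, and the helper name meets_minimum is made up for this example.

```python
import sys

# Assumed minimum, read off the "requires-python:>=3.6" text in the pip output.
MINIMUM = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


print("Running Python %d.%d" % sys.version_info[:2])
print("Meets requirement:", meets_minimum())
```

It needs to be run with the same interpreter that pip3 uses (normally python3) for the answer to be relevant. Running pip3 --version also reports which Python the pip installation belongs to.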
Re: [Tutor] Finding unique strings.
On 03May2019 22:07, Sean Murphy wrote:
> I have a list of strings which has been downloaded from my bank. I am trying
> to build a program to find the unique string patterns which I want to use
> with a dictionary. So I can group the different transactions together. Below
> are example unique strings which I have manually extracted from the data.
> Everything after the example text is different. I cannot show the full data
> due to privacy.
>
> WITHDRAWAL AT HANDYBANK
> PAYMENT BY AUTHORITY
> WITHDRAWAL BY EFTPOS
> WITHDRAWAL MOBILE
> DEPOSIT ACCESSPAY
>
> Note: Some of the entries, have an store name contained in the string
> towards the end. For example:
>
> WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09
>
> Thus I want to extract the KMART as part of the unique key. As the shown
> example transaction always has a number. I was going to use a test condition
> for the above to test for the number. Then the next word would be added to
> the string for the key.
> [...]

I'm assuming you're handed the text as one string, for example this:

    WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09

I'm assuming is a single column from a CSV of transactions.

I've got 2 observations:

1: For your unique key, if it is a string (it needn't be), you just need to put the relevant parts into your key. For the above, perhaps that might be:

    WITHDRAWAL 0304479 KMART

or:

    WITHDRAWAL KMART 1075

etc depending on what the relevant parts are.

2: To pull out the relevant words from the description I would be inclined to do a more structured parse. Consider something like the following (untested):

    # example data
    desc = 'WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09'

    # various things which may be recognised
    method = None
    terminal = None
    vendor = None
    vendor_site = None

    # inspect the description
    words = desc.split()
    flavour = words.pop(0)  # "WITHDRAWAL" etc
    word0 = words.pop(0)
    if word0 in ('BY', 'AT'):
        method = word0 + ' ' + words.pop(0)  # "BY EFTPOS"
    elif word0 in ('MOBILE', 'ACCESSPAY'):
        method = word0
    word0 = words.pop(0)
    if word0.isdigit():
        # probably really part of the "BY EFTPOS" description
        terminal = word0
        word0 = words.pop(0)
    vendor = word0
    word0 = words.pop(0)
    if word0.isdigit():
        vendor_site = word0
        word0 = words.pop(0)
    # ... more dissection ...

    # assemble the key - include only the items that matter
    # eg maybe leave out terminal and vendor_site, etc
    key = (flavour, method, terminal, vendor, vendor_site)

This is all rather open ended, and totally dependent on your bank's reporting habits. Also, it needs some length checks: words.pop(0) will raise an exception when "words" is empty, as it will be for the shorter descriptions at some point.

The important point is to get a structured key containing just the relevant field values: being assembled as a tuple from strings (immutable hashable Python values) it is usable as a dictionary key.

For more ease of use you can make the key a namedtuple:

    from collections import defaultdict, namedtuple

    KeyType = namedtuple('KeyType', 'flavour method vendor')

    transactions = defaultdict(list)

    loop over the CSV data ...
        key = KeyType(flavour, method, vendor)
        transactions[key].append(transaction info here...)

which gets you a dictionary "transactions" containing lists of transaction records (in whatever form you make them, which might be simply the row from the CSV data as a first cut).

The nice thing about a namedtuple is that the values are available as attributes: you can use "key.flavour" etc to inspect the tuple.

Cheers,
Cameron Simpson
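The structured parse and namedtuple key described above, with the length checks it says are needed, might look something like the sketch below. The word layout is guessed from the few example descriptions in the thread, and parse_key is a hypothetical helper name; a real bank export will need its own rules.

```python
from collections import defaultdict, namedtuple

KeyType = namedtuple("KeyType", "flavour method vendor")


def parse_key(desc):
    """Build a grouping key from one transaction description.

    The layout assumed here (flavour, optional method, optional terminal
    number, optional vendor) is a guess based on the thread's examples.
    """
    words = desc.split()
    flavour = words.pop(0) if words else None  # "WITHDRAWAL" etc
    method = None
    vendor = None
    if words and words[0] in ("BY", "AT"):
        method = " ".join(words[:2])  # two-word method such as "BY EFTPOS"
        del words[:2]
    elif words and words[0] in ("MOBILE", "ACCESSPAY"):
        method = words.pop(0)
    if words and words[0].isdigit():
        words.pop(0)  # terminal number, left out of the key
    if words:
        vendor = words.pop(0)  # "KMART"
    return KeyType(flavour, method, vendor)


# Hypothetical data; in practice each item would come from a CSV row.
transactions = defaultdict(list)
for desc in [
    "WITHDRAWAL BY EFTPOS 0304479 KMART 1075 CASTLE HILL 24/09",
    "WITHDRAWAL BY EFTPOS 0304480 KMART 1075 CASTLE HILL 25/09",
    "DEPOSIT ACCESSPAY",
]:
    transactions[parse_key(desc)].append(desc)

for key, rows in transactions.items():
    print(key.flavour, key.method, key.vendor, len(rows))
```

Every pop is guarded by a check that "words" is non-empty, so short descriptions such as "DEPOSIT ACCESSPAY" simply leave the later fields as None instead of raising IndexError.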
[Tutor] don't steal my code Mister user
Hi there,

Hope you like the subject, I was feeling inventive. My question today concerns compilation. I hear this a lot from the communities I hang around in, and it is something I wonder about often. There are tools like py2exe and pyinstaller that are able to compile your Python code into .exe format. But why bother?

Let's use a metaphorical example. Let's say I create a program called awesomesauce. When you boot awesomesauce up, it checks your user key and sends that up to a server, which says whether you've paid for the product. Now then, mister user comes along. He decompiles it with pyinstaller (which I'm told is easy), removes the check, and has himself a free product.

So how are people doing it? I'm told that many commercial, business, and game-world programs are made in Python and are pay to use, or have yet to have their source code stolen, but how?

I'd appreciate any tips you may have.

Nate