How to compare words from .txt file against words in .xlsx file via Python? I will then extract these words by writing it to a new .xls file

2019-08-04 Thread aishan0403
I want to compare the common words from multiple .txt files based on the words 
in multiple .xlsx files. 

Could anyone kindly help with my code? I have been stuck for weeks and really 
need help..

Please refer to this link: 
https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
 

Any help is greatly appreciated really!! 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to compare words from .txt file against words in .xlsx file via Python? I will then extract these words by writing it to a new .xls file

2019-08-04 Thread aishan0403
On Monday, 5 August 2019 07:21:52 UTC+8, MRAB wrote:
> On 2019-08-05 00:10, A S wrote:
> > Oh... By set did you mean by using python function set(variable) as 
> > something?
> >
> > So sorry for bothering you..
> >
> Make it a set (outside the loop):
> 
>      dictionary = set()
> 
> and then add the words to it (inside the loop):
> 
>      dictionary.add(cell_range.value)
> 
> (Maybe also rename the variable to, say, "words_wanted", because calling 
> it "dictionary" when it's not a dictionary (dict) could be confusing...)
> 
> > On Mon, 5 Aug 2019, 6:52 am A S wrote:
> >
> > Previously I had tried many methods and using set was one of them
> > but it didn't work out either.. I even tried to append it to a
> > list but it's not working out..
> >
> > On Mon, 5 Aug 2019, 2:29 am MRAB wrote:
> >
> > On 2019-08-04 18:53, A S wrote:
> > > Hi Mrab,
> > >
> > > Thank you so much for your detailed response, I really really
> > > appreciate it as I have been constantly trying to seek help
> > > regarding this issue.
> > >
> > > Yes, I figured that the dictionary is only capturing the last
> > > value :( I've been trying to get it to capture and store all the
> > > values to memory in python but it's not working..
> > >
> > > Are there any improvements that I could make to allow my code
> > > to work?
> > >
> > > I would be truly grateful if you could provide further insights
> > > on this..
> > >
> > > Thank you so much.
> > >
> > Make it a set and then add the words to it.
> >
> > > On Mon, 5 Aug 2019, 1:45 am MRAB wrote:
> > >
> > >     On 2019-08-04 09:29, [email protected] wrote:
> > >     > I want to compare the common words from multiple .txt files
> > >     > based on the words in multiple .xlsx files.
> > >     >
> > >     > Could anyone kindly help with my code? I have been stuck for
> > >     > weeks and really need help..
> > >     >
> > >     > Please refer to this link:
> > >     > https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
> > >     >
> > >     > Any help is greatly appreciated really!!
> > >     >
> > >     First of all, in this line:
> > >
> > >          folder_path1 = os.chdir("C:/Users/xxx/Documents//Test python dict")
> > >
> > >     it changes the current working directory (not a problem), but
> > >     'chdir' returns None, so from that point 'folder_path1' has the
> > >     value None.
> > >
> > >     Then in this line:
> > >
> > >          for file in os.listdir(folder_path1):
> > >
> > >     it's actually doing:
> > >
> > >          for file in os.listdir(None):
> > >
> > >     which happens to work because passing it None means to return
> > >     the names in the current directory.
> > >
> > >     Now to your problem.
> > >
> > >     This line:
> > >
> > >          dictionary = cell_range.value
> > >
> > >     sets 'dictionary' to the value in the spreadsheet cell, and
> > >     you're doing it each time around the loop. At the end of the
> > >     loop, 'dictionary' will be set to the _last_ such value. You're
> > >     not collecting the value, but merely remembering the last value.
> > >
> > >     Looking further on, there's this line:
> > >
> > >          if txtwords in dictionary:
> > >
> > >     Remember, 'dictionary' is the last value (a string), so that'll
> > >     be True only if 'txtwords' is a substring of the string in
> > >     'dictionary'.
> > >
> > >     That's why you're seeing only one match.
> > >
> >
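
For anyone following along, the set-based approach MRAB describes in the
quoted text can be sketched roughly as below. This is only a minimal
illustration, not the code from the Stack Overflow question: the function
names and the sample data are made up, and cell_values stands in for
whatever the spreadsheet loop yields (for example each cell's .value).
Lower-casing the words is also an assumption about how matching should
work.

```python
import re


def collect_wanted_words(cell_values):
    """Gather every spreadsheet value into one set of lower-cased words."""
    words_wanted = set()              # created once, outside the loop
    for value in cell_values:
        if value is None:             # empty cells are commonly None
            continue
        # add() accumulates; plain assignment would keep only the last value
        words_wanted.add(str(value).strip().lower())
    return words_wanted


def find_common_words(text, words_wanted):
    """Return the words of `text` that also appear in `words_wanted`."""
    txt_words = set(re.findall(r"[a-z']+", text.lower()))
    return txt_words & words_wanted   # set intersection: whole-word matches


# Hypothetical sample data standing in for the .xlsx cells and a .txt file.
cell_values = ["Apple", "banana", None, "Cherry"]
words_wanted = collect_wanted_words(cell_values)
common = find_common_words("I ate an apple and a cherry pie.", words_wanted)
print(sorted(common))
```

Because words_wanted is a set of whole words, "err" would not count as a
match, whereas "err" in "cherry" is True when the right-hand side is a
single string (the substring behaviour described in the quoted text).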

My latest reply to Mrab in case anybody needs it (and p.s. I'm so sorry for 
spamming you Mrab):

Mrab! Thank you so much for your constant replies! I'm able to print out the 
words now, using this code:

import os, sys
import xlrd
from xlrd import open_workbook
import openpyxl
