What do I do to read html files on my pc?
Hello,

I have an HTML file on my PC and I want to read it to extract some text.
Can you tell me which libraries I should use and how to do it?

Thank you so much.
Michele
--
http://mail.python.org/mailman/listinfo/python-list
Re: What do I do to read html files on my pc?
On Monday, 27 August 2012 12:59:02 UTC+2, mikcec82 wrote:
> Hello,
>
> I have an html file on my pc and I want to read it to extract some text.
> Can you help on which libs I have to use and how can I do it?
>
> thank you so much.
>
> Michele

Hi ChrisA, hi Mark,

Thanks a lot.

I have this HTML data and I want to check whether it contains the string "XXXX" and/or the string "NOT PASSED":

. . . CODE CHECK : NOT PASSED

Depending on this check, I have to fill a cell in an Excel file with the answer: NOK (if "NOT PASSED" or "XXXX" is present), or OK (if neither "NOT PASSED" nor "XXXX" is present).

Thanks again for your help (and sorry for my English).
Re: What do I do to read html files on my pc?
Thank you all.
Hi Chris, thank you for your hint. I'll try to do as you said. To be clear:
I have to work on an HTML file. This file is not part of a website, nor does it
come from the internet.
It is a file created by local software (where "local" means "on my PC").
On this file, I need to do the following:
1) Open the file
2) Check for occurrences of these strings:
2a) XXXX, in which case I have this code:
DTC CODE Read:
2b) NOT PASSED, in which case I have this code:
CODE CHECK
: NOT PASSED
Note: color in ""
can be "red" or "orange"
2c) OK or PASSED
3) Then, I need to fill in an Excel file following these rules:
3a) If 2a or 2b occurs in the HTML file, I'll write NOK in the Excel file
3b) If 2c occurs in the HTML file, I'll write OK in the Excel file
Note:
1) In this example, in case 2b, I have "CODE CHECK" in the code, but I could
also have "TEXT CHECK" or "CHAR CHECK".
2) The search for occurrences can be done either by tag ("") or by text
(NOT PASSED, PASSED). I would prefer to use the first method.
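The tag-based search mentioned in note 2 could be sketched with the standard library's `html.parser`. The actual tag is not shown in this post, so the sketch assumes the colored text sits inside a `<font color="...">` element; that tag, the class name `FontColorParser`, and the helper `flagged_text` are all hypothetical. It is written for Python 3, whereas the script in this thread uses Python 2.7.

```python
from html.parser import HTMLParser

ALERT_COLORS = {"red", "orange"}  # the two colors named in the post


class FontColorParser(HTMLParser):
    """Collect text found inside <font> elements colored red or orange.

    Assumption: the report marks failures with <font color="...">.
    """

    def __init__(self):
        super().__init__()
        self._open_fonts = []  # one bool per open <font>: alert-colored or not
        self.flagged = []      # text pieces found inside alert-colored fonts

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = (dict(attrs).get("color") or "").lower()
            self._open_fonts.append(color in ALERT_COLORS)

    def handle_endtag(self, tag):
        if tag == "font" and self._open_fonts:
            self._open_fonts.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and any(self._open_fonts):
            self.flagged.append(text)


def flagged_text(html_text):
    """Return the list of text pieces shown in red or orange."""
    parser = FontColorParser()
    parser.feed(html_text)
    parser.close()
    return parser.flagged


sample = ('<td>CODE CHECK</td><td>: <font color="red">NOT PASSED</font></td>'
          '<td><font color="green">OK</font></td>')
print(flagged_text(sample))
```

If the file uses a different element or a `style` attribute for the color, only `handle_starttag` and `handle_endtag` would need to change.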
==
In my script I have used the second way of searching, i.e.:
**
fileorig = r"C:\Users\Mike\Desktop\2012_05_16_1___p0201_13.html"
f = open(fileorig, 'r')
nomefile = f.read()
for x in nomefile:
    if '' in nomefile:
        print 'NOK'
    else:
        print 'OK'
**
But this works on characters and not on strings (i.e., in this way I have
searched not string by string, but character by character).
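Iterating over a string with `for x in nomefile` visits one character at a time, while the `in` operator already performs a substring search over the whole text, so no loop is needed. A minimal sketch of the OK/NOK decision follows, in Python 3. It assumes "XXXX" is the failed-read marker, and the function names `classify` and `classify_file` are hypothetical.

```python
def classify(text):
    """Return 'NOK', 'OK', or None following rules 3a/3b of the post."""
    # Test the failure markers first: "NOT PASSED" contains "PASSED",
    # so checking "PASSED" first would report a failure as OK.
    if "XXXX" in text or "NOT PASSED" in text:
        return "NOK"
    if "PASSED" in text or "OK" in text:
        return "OK"
    return None  # neither kind of marker is present


def classify_file(path):
    """Read the whole file once and classify its contents."""
    with open(path, "r") as f:
        return classify(f.read())


print(classify("CODE CHECK : NOT PASSED"))
print(classify("CODE CHECK : PASSED"))
```

A plain `"OK" in text` test also matches longer words that happen to contain "OK", so a real report might need a stricter check.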
===
I hope I was clear.
Thanks for your help
Michele
Re: What do I do to read html files on my pc?
Hi Oscar,
I tried what you suggested and developed the code shown below.
But I have a situation like this in an HTML file, in which a string
is repeated (XX in this case):
CODE Target:0201
CODE Read:XXXX
CODE CHECK : NOT PASSED
TEXT Target: 13
TEXT Read:XX
TEXT CHECK : NOT PASSED
CHAR Target: AA
CHAR Read:XX
CHAR CHECK : NOT PASSED
With this code (based on yours):
    index = nomefile.find('XXXX')
    print 'XXXX_ found at location', index
    index2 = nomefile.find('XX')
    print 'XX_ found at location', index2
    found = nomefile.find('XX')
    while found > -1:
        print "XX found at location", found
        found = nomefile.find('XX', found+1)
I get output like this:
    XXXX_ found at location 51315
    XX_ found at location 51315
    XX found at location 51315
    XX found at location 51316
    XX found at location 51317
    XX found at location 52321
    XX found at location 53328
I did this to find all occurrences of the 'XXXX' and 'XX' strings. But, as you
can see, the script also finds occurrences of XX at locations 51315, 51316 and
51317, which correspond to the string XXXX.
Is there a way to search for all occurrences of XX while skipping the XXXX location?
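One possible approach, sketched here in Python 3, is a regular expression with lookarounds, so that "XX" matches only when it is neither preceded nor followed by another "X". The names `ISOLATED_XX` and `find_isolated_xx` and the sample string are made up for illustration.

```python
import re

# "XX" that is not part of a longer run of X characters
ISOLATED_XX = re.compile(r"(?<!X)XX(?!X)")


def find_isolated_xx(text):
    """Return the start index of every stand-alone 'XX' in text."""
    return [m.start() for m in ISOLATED_XX.finditer(text)]


sample = "CODE Read:XXXX TEXT Read:XX CHAR Read:XX"
print(find_isolated_xx(sample))
```

With this pattern the three overlapping hits inside "XXXX" disappear, and only the two stand-alone occurrences are reported.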
Thank you.
Michele
Re: What do I do to read html files on my pc?
Hi Peter, and thanks for your precious help.
Fortunately, there aren't runs of "X" with repeats other than 2 or 4.
Starting from your code, I wrote this code (I'm posting it so it can be helpful
to other people):
    f = open(fileorig, 'r')
    nomefile = f.read()
    f.close()
    start = nomefile.find("XX")
    start2 = nomefile.find("NOT PASSED")
    c0 = 0  # number of "XXXX" found
    c1 = 0  # number of "XX" found
    c2 = 0  # number of "NOT PASSED" found
    while (start != -1) or (start2 != -1):
        # Test the longer run first so "XXXX" is not counted as "XX"
        if nomefile[start:start+4] == "XXXX":
            print "XXXX found at location", start
            start += 4
            c0 += 1
        elif nomefile[start:start+2] == "XX":
            print "XX found at location", start
            start += 2
            c1 += 1
        if nomefile[start2:start2+10] == "NOT PASSED":
            print "NOT PASSED found at location", start2
            start2 += 10
            c2 += 1
        start = nomefile.find("XX", start)
        start2 = nomefile.find("NOT PASSED", start2)
    print "XXXX %s found" % c0, "\nXX %s found" % c1, "\nNOT PASSED %s found" % c2
Now I'm able to find all occurrences of the strings "XXXX", "XX" and "NOT PASSED".
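For comparison, the same three counts could be obtained more compactly by matching each maximal run of "X" once and grouping the runs by length. This is only a sketch in Python 3, not the code from this thread. The function name `count_markers` is hypothetical, and the sample text is the stripped-down excerpt quoted earlier.

```python
import re
from collections import Counter


def count_markers(text):
    """Count 'XXXX' runs, 'XX' runs, and 'NOT PASSED' occurrences."""
    # Each maximal run of X is matched exactly once, so a run of four
    # is never also counted as two runs of two.
    run_lengths = Counter(len(m.group()) for m in re.finditer(r"X+", text))
    return {
        "XXXX": run_lengths[4],
        "XX": run_lengths[2],
        "NOT PASSED": text.count("NOT PASSED"),
    }


sample = ("CODE Read:XXXX CODE CHECK : NOT PASSED "
          "TEXT Read:XX TEXT CHECK : NOT PASSED "
          "CHAR Read:XX CHAR CHECK : NOT PASSED")
print(count_markers(sample))
```

Runs of any other length, such as the single "X" in "TEXT", are simply ignored, which is consistent with the statement that only runs of 2 or 4 occur.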
Thank you so much.
Blue Screen Python
Hello to all,

I'm using Python 2.7.3 on 64-bit Windows 7 with an Intel Core i3-2350M CPU @ 2.30GHz.

Sometimes, when I'm programming in Python, this blue screen appears on my screen:
http://imageshack.us/a/img228/8352/48579647436249494527021.jpg

Can you help me understand what the issue is and how I can solve it?

If you need more info, I'm available.

Thank you so much,
Michele
Re: Blue Screen Python
On Friday, 21 September 2012 16:04:48 UTC+2, mikcec82 wrote:
> Hello to all,
>
> I'm using Python 2.7.3 with Windows 7 @ 64 bit
> and an Intel Core i3 -2350M CPU @2.30GHz 2.3GHz.
>
> Sometimes, when I'm programming in Python on my screen compare this blue screen:
> http://imageshack.us/a/img228/8352/48579647436249494527021.jpg
>
> Can you help on what is the issue, and how I can solve it?
>
> If you need more info I'm available.
>
> Thank you so much,
> Michele

Hi to all, and thanks for your answers.

I'm not using a buggy library. Yesterday I had another BSOD, but I was using only the "os" library.
I have also tested the memory using memtest, but there were no errors.

In my script I open and close an HTML file (in a for loop); could this be the problem?
Or is it possible that Python 2.7 is not compatible with Win7?

Thank you very much.
Have a good day,
Michele
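For the pattern of opening and closing files inside a loop, a `with` block closes each file as soon as its body finishes, even if an exception is raised. A minimal sketch in Python 3 follows. The function name `read_html_files` and the one-folder layout are assumptions, since the actual script is not shown in this thread.

```python
import os


def read_html_files(folder):
    """Return {filename: text} for every .html file directly in folder."""
    contents = {}
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".html"):
            # The file is closed automatically when the with block ends.
            with open(os.path.join(folder, name), "r") as f:
                contents[name] = f.read()
    return contents
```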
Re: Blue Screen Python
Thank you so much, Philipp.

I am at work now and can't insert the Windows DVD, but as soon as possible I'll do as you said and let you know whether the problem is solved.

Best regards,
Michele
Re: Blue Screen Python
Hi to all.

I solved the problem by creating a Windows XP virtual machine (by installing Windows Remote Pc).
This way I have no more problems.

I hope this can be helpful to other people.

Have a nice day,
Michele
