[Tutor] Newbie Trouble Processing SRT Strings In Text File

2014-10-31 Thread Matt Varner
TL;DR - Skip to 'My Script: "subtrans.py"'



Optional Links to (perhaps) Helpful Images:
1. The SRT download button:
http://i70.photobucket.com/albums/i82/RavingNoah/Python%20Help/tutor1_zps080f20f7.png

2. A visual comparison of my current problem (see 'Desire Versus
Reality' below):
http://i70.photobucket.com/albums/i82/RavingNoah/Python%20Help/newline_problem_zps307f8cab.jpg


The SRT File


The SRT file that you can download for every lesson that has a video
contains the caption transcript data and is organized according to
text snippets with some connecting time data.


Reading the SRT File and Outputting Something Useful


There may be a hundred different ways to read one of these file types.
The reliable method I chose was to use a handy editor for the purpose
called Aegisub.  It will open the SRT file and let me immediately
export a version of it, without the time data (which I don't
need...yet).  The result of the export is a plain-text file containing
each string snippet and a newline character.

==
Dealing with the Text File
==

One of these text files can be anywhere between 130 to 500 lines or
longer, depending (obviously) on the length of its attendant video.
For my purposes, as a springboard for extending my own notes for each
module, I need to concatenate each string with an acceptable format.
My desire for this is to interject spaces where I need them and kill
all the newline characters so that I get just one big lump of properly
spaced paragraph text.  From here, I can divide up the paragraphs how
I see fit and I'm golden...

==
My first Python script: Issues
==

I did my due diligence.  I have read the tutorial at www.python.org.
I went to my local library and have a copy of "Python Programming for
the Absolute Beginner, 3rd Edition by Michael Dawson."  I started
collecting what seemed like logical little bits here and there from
examples found using Uncle Google, but none of the examples anywhere
were close enough, contextually, to be automatically picked up by my
dense 'noobiosity.'  For instance, when discussing string
methods...almost all operations taught to beginners are done on
strings generated "on the fly," directly inputted into IDLE, but not
on strings that are contained in an external file.  There are other
examples for file operations, but none of them involved doing string
operations afterward.  After many errors about not being able to
directly edit strings in a file object, I finally figured out that
lists are used to read and store strings kept in a file like the one
I'm sourcing from...so I tried using that.  Then I spent hours
unsuccessfully trying to call strings using index numbers from the
list object (I guess I'm dense).  Anyhow, I put together my little
snippets and have been banging my head against the wall for a couple
of days now.

After many frustrating attempts, I have NEARLY produced what I'm
looking to achieve in my test file.


Example - Source


My Test file contains just twelve lines of a much larger (but no more
complex) file that is typical for the SRT subtitle caption file, of
which I expect to have to process a hundred...or hundreds, depending
on how many there are in all of the courses I plan to take
(coincidentally, there is one on Python)

Line 01: # Exported by Aegisub 3.2.1
Line 02: [Deep Dive]
Line 03: [CSS Values & Units Numeric and Textual Data Types with
Guil Hernandez]
Line 04: In this video, we'll go over the
Line 05: common numeric and textual values
Line 06: that CSS properties can accept.
Line 07: Let's get started.
Line 08: So, here we have a simple HTML page
Line 09: containing a div and a paragraph
Line 10: element nested inside.
Line 11: It's linked to a style sheet named style.css
Line 12: and this is where we'll be creating our new CSS rules.


My Script: "subtrans.py"


# Open the target file, create file object
f = open('tmp.txt', 'r')

# Create an output file to write the changed strings to
o = open('result.txt', 'w')

# Create a list object that holds all the strings in the file object
lns = f.readlines()

# Close the source file you no longer
# need now that you have
# your strings
f.close()

# Import sys to get at stdout (standard output) - "print" results will
# be written to file
import sys

# Associate stdout with the output file
sys.stdout = o

# Try to print strings to output file using loopback variable (line)
# and the list object
for line in lns:
    if ".\n" in line:
        a = line.replace('.\n','.  ')
        print(a.strip('\n'))
    else:
        b = line.strip('\n')
        print(b + " ")

# Close your output file
o.close()

=
Desire Versus Reality
=

The source file contains a series of strings with n

Re: [Tutor] passing named arguments through command line

2014-10-31 Thread Robert Sokolewicz
cool, thanks guys :)

-Robert

On Thu, Oct 30, 2014 at 7:24 PM, Danny Yoo  wrote:

>
>
> On Thu Oct 30 2014 at 7:58:32 AM Lukas Nemec  wrote:
>
>>  Hello,
>>
>> take a look at argparse library.
>>
>
>
> Hi Robert,
>
> As Lukas mentions, it sounds like you're looking for a "flag parsing"
> library.  A flag parsing library reads a set of key/value pairs that are
> encoded in sys.argv, so they let command-line programs provide variable
> values through the use of these flags.
>
> There are a few of these flag libraries in Python due to Python's long
> history.  The one that Lukas recommends, 'argparse', is probably the one
> you want to use.
>
> You can find documentation for argparse at:
>
> https://docs.python.org/2/howto/argparse.html#id1
> https://docs.python.org/2/library/argparse.html
>
> Good luck!
>
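
A minimal sketch of what flag parsing with argparse can look like, for
readers following along (Python 3). The flag names --name and --count and
the helper names build_parser and greet are made up for illustration; only
the argparse calls themselves come from the standard library.

```python
import argparse


def build_parser():
    # Each add_argument() call declares one named flag that may appear in sys.argv.
    parser = argparse.ArgumentParser(description="Demo of named command-line arguments.")
    parser.add_argument("--name", default="world", help="who to greet")
    parser.add_argument("--count", type=int, default=1, help="number of greetings")
    return parser


def greet(argv=None):
    # argv=None makes argparse read sys.argv[1:]; passing a list helps when testing.
    args = build_parser().parse_args(argv)
    return ["Hello, %s!" % args.name] * args.count


if __name__ == "__main__":
    for message in greet():
        print(message)
```

Saved under a hypothetical name such as greet.py, it could be run as
"python greet.py --name Robert --count 2".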
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Newbie Trouble Processing SRT Strings In Text File

2014-10-31 Thread Alan Gauld

On 31/10/14 11:07, Matt Varner wrote:


> # Import sys to get at stdout (standard output) - "print" results will
> # be written to file
> import sys


This is a bad idea.
Instead, write your strings directly to o

o.write(s)

Print adds newlines automatically (unless you explicitly suppress them).
But printing to a file is messy compared to writing directly to the file.

(It also means you can't print debug messages while developing
your code!)
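
A minimal sketch of that suggestion, assuming Python 3 and the file names
from the original script (tmp.txt, result.txt). The helper names
write_captions and convert are made up for illustration. Each caption is
written straight to the output file object and sys.stdout is left alone, so
print() stays available for debugging.

```python
def write_captions(lines, out):
    """Write caption lines to the file-like object `out` as one run of text."""
    for line in lines:
        text = line.rstrip("\n")
        # Two spaces after a sentence-ending period, one space otherwise,
        # mirroring the spacing the original script was aiming for.
        out.write(text + ("  " if text.endswith(".") else " "))


def convert(src="tmp.txt", dst="result.txt"):
    # File names are the ones used in the original script.
    with open(src) as f, open(dst, "w") as o:
        write_captions(f.readlines(), o)
    # stdout was never redirected, so this message appears on screen.
    print("finished writing", dst)
```

Because write() never appends a newline of its own, the output ends up as
one continuous block of text.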

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



[Tutor] Practicing with sockets

2014-10-31 Thread Bo Morris
Hello all, hope everyone is doing well.

I have been practicing with sockets and I am trying to send a small png
from the client to the server.

the client code is...

import socket

f = open('/Users/Bo/Desktop/logo_ONEConnxt.png', 'rb')
strf = f.read()
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(("ip.ip.ip.ip", 8999))
client_socket.sendall(strf)
f.close()
exit()

and the server code is...

import socket

f = open('img.png', 'wb')
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
port = 8999
s.bind(('', port))
s.listen(5)

client_socket, address = s.accept()
data = client_socket.recv(4029)
f.write(data)
client_socket.close()

Both the above client and server code runs without error, however the
"img.png" file that is placed on the server shows zero bytes? Will someone
please show me what I am doing wrong?

Thank you,

Bo


Re: [Tutor] Newbie Trouble Processing SRT Strings In Text File

2014-10-31 Thread Danny Yoo
[code cut]


Hi Matt,

It looks like you're trying to write your own srt parser as part of this
problem.  If you're in a hurry, you may want to look at existing parsers
that people have written.  For example:

https://github.com/byroot/pysrt


> But, even though it successfully kills these additional newlines that
> seem to form in the list-making process...I end up with basically a
> non-concatenated file of strings...with the right spaces I need, but
> not one big chunk of text, like I expect using the s.strip('\n')
> functionality.
>

Rather than immediately print the string, you may want to accumulate your
results in a list.  You can then do some processing on your list of strings.
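
One way to read that suggestion, as a sketch (Python 3). The function name
join_captions and the choice to skip blank lines are assumptions, not
something stated in the thread: collect the stripped lines in a list, then
join the list once at the end.

```python
def join_captions(lines):
    """Return the caption lines as a single space-separated paragraph."""
    pieces = []
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines (an assumption about the input)
            pieces.append(text)
    return " ".join(pieces)


if __name__ == "__main__":
    sample = ["In this video, we'll go over the\n",
              "common numeric and textual values\n",
              "that CSS properties can accept.\n"]
    print(join_captions(sample))
```

Joining at the end means the separator is decided in one place, and the
list can still be filtered or regrouped into paragraphs before it is
written out.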


Re: [Tutor] Practicing with sockets

2014-10-31 Thread Danny Yoo
On Fri Oct 31 2014 at 10:31:20 AM Bo Morris  wrote:

> Hello all, hope everyone is doing well.
>
> I have been practicing with sockets and I am trying to send a small png
> from the client to the server.
>


Hey Bo,

Very cool!  Socket programming is fun, because it lets your programs start
talking to other programs.  But it can get frustrating at times too, since
it's all about communication, and we know communication can fail for so many
different reasons.  :P  We'll try to help where we can.

Just to make sure, you are probably following the Socket HOWTO:

https://docs.python.org/2/howto/sockets.html

Reading code...


> import socket
>
> f = open('/Users/Bo/Desktop/logo_ONEConnxt.png', 'rb')
> strf = f.read()
> client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> client_socket.connect(("ip.ip.ip.ip", 8999))
> client_socket.sendall(strf)
> f.close()
> exit()
>
>
This is problematic: the server code won't know up front how many bytes it
should expect to read from the socket.  That is, the code here is sending
a "variable-length" message, and variable lengths are difficult to work with.

One common solution is to prefix the payload with a fixed-size byte
length.  That way, the server can read the fixed-size length first, and
then run a loop that reads the rest of the bytes.  This looks something
like:

import struct
# ...
# Send the length...
client_socket.sendall(struct.pack("!I", len(strf)))
# followed by the content
client_socket.sendall(strf)

Your server code will symmetrically read the first four bytes, use
struct.unpack() to find out how large the rest of the message is going to
be, and then do a loop until it reads the exact number of bytes.
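
The receiving side described above can be sketched roughly as follows,
assuming Python 3. The helper names recv_exact, recv_message and
send_message are made up for illustration; the 4-byte "!I" prefix is the
one from the snippet above.

```python
import socket
import struct


def recv_exact(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise ConnectionError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read a 4-byte big-endian length, then that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def send_message(sock, payload):
    """Send the length prefix and the payload with sendall()."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)
```

Note that the length prefix is consumed by recv_message and is not part of
what it returns, so it should not end up in the saved file.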


Ok, I'm reading through the server code a bit more...

> data = client_socket.recv(4029)
> f.write(data)
> client_socket.close()
>

You probably want to open the output file _after_ the socket has accepted.
Otherwise, it seems a bit premature to open that "f" file.  Also, don't
forget to close the "f" file once you've finished reading the bytes.  Also
note here that since recv() doesn't guarantee how many bytes you'll read at
a time, the byte-reading code needs to be in a loop.

Also, I strongly suggest that you place some logging messages in both your
client and server to trace where your programs are.  One distinguishing
feature of network programs is that they are typically long-running, and so
logs help to expose what the heck they're doing at a given time.

See:

https://docs.python.org/2/howto/logging.html#logging-basic-tutorial
https://docs.python.org/2/library/logging.html

As it stands, your server might not have ever accepted a message from your
client, and you'll still see an empty file, since the code is opening the
file for writing before listening for a request.  That's the other reason
why you want to move the file opening to _after_ the socket is accepted.
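
A rough sketch of a server arranged that way, assuming Python 3. The
function names receive_file and serve_once and the logger name are made up;
the port 8999 and the file name img.png are the ones from the original
code. For simplicity this version reads until the client closes the
connection instead of using a length prefix.

```python
import logging
import socket

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("png_server")  # hypothetical logger name


def receive_file(conn, path, bufsize=4096):
    """Copy everything the peer sends into `path` until the peer closes."""
    total = 0
    # The file is opened only once a connection exists, and in binary mode.
    with open(path, "wb") as out:
        while True:
            chunk = conn.recv(bufsize)
            if not chunk:  # empty bytes: the client has closed its side
                break
            out.write(chunk)
            total += len(chunk)
    log.info("wrote %d bytes to %s", total, path)
    return total


def serve_once(port=8999, path="img.png"):
    """Accept a single connection and save what it sends."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("", port))
        listener.listen(5)
        log.info("listening on port %d", port)
        conn, address = listener.accept()
        log.info("accepted connection from %s", address)
        with conn:
            return receive_file(conn, path)
```

With this layout, an empty img.png can only appear if a client actually
connected and sent nothing, and the log lines show how far the server got.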


Re: [Tutor] Practicing with sockets

2014-10-31 Thread Bo Morris
Hey Danny, yes I have been having quite a bit of fun learning to work with
sockets. Thank you for your response. I have applied what you suggested
with the exception of the "logging." I read through the logging docs and
figured logging would be learning for another day. I have a hard enough
time staying focused on one task at a time, haha. I did however insert some
print statements into the code so I could keep track of where it was at,
but to keep my email short, I omitted them here.

After implementing what you suggested, the image file that is saved on the
server is now 4 bytes, but I assume that is due to...

"Your client code will symmetrically read the first four bytes, use
struct.unpack() to find how how large the rest of the message is going to
be, and then do a loop until it reads the exact number of bytes"

and I have not quite got the correct loop to read all the bytes?

I also reread the docs at https://docs.python.org/2/howto/sockets.html and
decided to remove the "b" from open('myfile.png', 'wb') and
open('myfile.png', 'rb'), seeing how binary could be different depending on
the machine, and I have not yet learned how to deal with this. Would I be
better off converting the image to base64 prior to sending it to the server,
then decoding it on the server?

Here is my updated code...for brevity's sake, I have omitted the "import"
statements...

Client:

f = open('/Users/Bo/Desktop/SIG.png', 'r')
strf = f.read()
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(("ip,ip,ip,ip", 8999))
payload = client_socket.send(struct.pack("!I", len(strf)))
for data in payload:
    client_socket.sendall(strf)
f.close()
exit()

Server:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
port = 8999
s.bind(('', port))
s.listen(5)
client_socket, address = s.accept()
data = client_socket.recv(4029)
f = open('img.png', 'w')
for item in data:
    f.write(item)
f.flush()
f.close()
client_socket.close()

At least I am getting 4 bytes as opposed to 0 like I was getting before.


Re: [Tutor] Practicing with sockets

2014-10-31 Thread Bo Morris
OK, so I finally got all the bytes to be transferred to the server; however,
I am unable to open the image on the server. Although the file is saved as a
png file on the server, the server does not recognize the file as png
format?

I changed the loops to the following...

Client:

f = open('/Users/Bo/Desktop/SIG.png', 'r')
strf = f.read()
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(("25.78.28.110", 8999))
while True:
    client_socket.send(struct.pack("!I", len(strf)))
    data = client_socket.sendall(strf)
    if not data:
        break
f.close()
print "Data Received successfully"
exit()

Server:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
port = 8999
s.bind(('', port))
s.listen(5)
client_socket, address = s.accept()
f = open('img.png', 'w')
while True:
    data = client_socket.recv(4029)
    f.write(data)
    if not data:
        break
#f.flush()
f.close()
client_socket.close()



Re: [Tutor] Practicing with sockets

2014-10-31 Thread Danny Yoo
>
>
> I also reread the docs at https://docs.python.org/2/howto/sockets.html and
> decided to remove the "b" from "open('myfile.png', 'wb') open('myfile.png',
> 'rb')  seeing how binary could be different depending on the machine and I
> have not yet learned how to deal with this.
>

Whoa, wait.  I think you're misunderstanding the point of binary mode.  You
_definitely_ need binary mode on when working with binary file formats like
PNG.  Otherwise, your operating system environment may do funny things to
the file content like treat the 0-character (NULL) as a terminator, or try
to transparently translate line ending sequences.
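
A small sketch of the difference this makes in practice, assuming Python 3.
The helper names copy_binary and looks_like_png are made up. The eight
signature bytes are the ones every PNG file starts with, and they include a
carriage return and a line feed, which is exactly what line-ending
translation can disturb.

```python
# Every PNG file begins with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def copy_binary(src, dst):
    """Copy a file byte for byte; the 'b' in both modes prevents any translation."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


def looks_like_png(path):
    """Return True if the file starts with the PNG signature."""
    with open(path, "rb") as f:
        return f.read(8) == PNG_SIGNATURE
```

A check like looks_like_png on the received file is a quick way to tell
whether the bytes arrived intact.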


> Would I be better off converting the image to base64 prior to sending it to
> the server, then decoding it on the server?
>

The socket approach is low-level: all you've got is a pipe that can send
and receive bytes.  It's _all_ binary from the perspective of the network
layer.  base64-encoding and decoding these bytes won't harm anything, of
course, but I don't see it helping either.
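
To illustrate, a tiny sketch (Python 3; the function name roundtrip is made
up): encoding and decoding gives back the same bytes, and the encoded form
is simply larger, by roughly a third.

```python
import base64


def roundtrip(data):
    """Return (encoded, decoded) for the given bytes."""
    encoded = base64.b64encode(data)
    return encoded, base64.b64decode(encoded)


if __name__ == "__main__":
    original = bytes(range(256))
    encoded, decoded = roundtrip(original)
    print(len(original), len(encoded), decoded == original)
```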