Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> depending on how the input file is created data packet a can be in an
> other position for every line.
> figured out how to do it though
> 
> order=[a,b,e,d...]
> for i in lines:
>     i = i.split(" ")
>     for j in order:
>         if j == a:
>             use function for processing data chunk a
>         elif j == b:
>             use proper function for processing data type b
>         ...

Where will you get the order from? If you plan to specify it manually, e. g.

lookup_steps = {
    "foo": [a, b, c, ...],
    "bar": [a, a, f, ...],
}
fileformat = sys.argv[1]
steps = lookup_steps[fileformat]
...
for line in lines:
    for step in steps:
        if step == a:
            ...
        elif step == b:
            ...

then I recommend storing one function per file format instead:

def process_foo(line):
    ...  # process one line in foo format

def process_bar(line):
    ...

lineprocessors = {
    "foo": process_foo,
    "bar": process_bar,
}
fileformat = sys.argv[1]
process = lineprocessors[fileformat]
...
for line in lines:
    process(line)

That way you deal with Python functions instead of a self-invented 
minilanguage.

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 09:29, Peter Otten wrote:

> Where will you get the order from?


Peter,

the order comes from the command line. I intend to call the python 
program with the same command line options as the Voro++ program. Make 
the python program call the Voro++ and process its output.


one command line option contains a string for formatting the output. 
That is what I use for order.


#all output formatting options
order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
for i, line in enumerate(open("vorodat.vol",'r')):
    points = i
    line = line.strip()
    line = line.split(" ")
    for action in order:
        if action == "%i":
            try:
                lbl = f_label(label)
            except NameError as e:
                lbl = f_number(label)
                label = [lbl]
        elif action == "%q":
            try:
                f_vector(point)
            except NameError as e:
                point = [f_vector(point)]
        elif action == "%r":
            try:
                f_value(radius)
            except NameError as e:
                radius = [f_value(radius)]
        etc.

order is important as %w tells me how long %p, %P and %o will be. This 
varies per line.
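
A minimal sketch of why the order matters, assuming a toy line layout of "%i %w %o" (an id, a count, then that many values). The helper names take() and parse_line() are made up for illustration and are not part of Voro++ or of the code in this thread:

```python
def take(tokens, n):
    """Split off the first n tokens and return (chunk, rest)."""
    return tokens[:n], tokens[n:]

def parse_line(line):
    # Assumed toy layout: id, count, then `count` integer values.
    tokens = line.split()
    (ident,), tokens = take(tokens, 1)
    (count,), tokens = take(tokens, 1)
    width = int(count)            # must be known before the next field
    orders, tokens = take(tokens, width)
    return int(ident), width, [int(o) for o in orders]

print(parse_line("7 3 4 5 6"))
```

The count has to be read before the variable-length field can be cut out, which is why the fields cannot be processed in an arbitrary order.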


I'll look into what you wrote,

thanks,

ingo


Re: [Tutor] learning python from scratch

2019-02-07 Thread Alan Gauld via Tutor
On 06/02/2019 21:22, Michael Munn wrote:
> Dear fellow programmers, this is Michael. I have a question about Python.
> I'm a beginner Pythonist. I have been learning its history and its use
> for the past years. My main focus this year is to learn its code and begin
> coding.

Can you program in any other language?
If so the official tutorial on python.org is a
good starting point.

If you have never programmed before there are
several tutorials (including mine, see below).
There is a list of them on the Python web site,
here:

https://wiki.python.org/moin/BeginnersGuide/NonProgrammers


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 09:58, ingo janssen wrote:


> On 07/02/2019 09:29, Peter Otten wrote:
>> Where will you get the order from?




Ahrg, that should have been:


#all output formatting options
order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
for i, line in enumerate(open("vorodat.vol",'r')):
    points = i
    line = line.strip()
    line = line.split(" ")
    for action in order:
        if action == "%i":
            try:
                lbl = f_label(label)
            except NameError as e:
                label = []
                lbl = f_number(label)
        elif action == "%q":
            try:
                f_vector(point)
            except NameError as e:
                point = []
                f_vector(point)
        elif action == "%r":
            try:
                f_value(radius)
            except NameError as e:
                radius = []
                f_value(radius)



Re: [Tutor] text processing lines variable content

2019-02-07 Thread Alan Gauld via Tutor
On 07/02/2019 08:58, ingo janssen wrote:

> try:
>     lbl = f_label(label)
> except NameError as e:
>     lbl = f_number(label)
>     label = [lbl]

Just a minor point, but since you aren't doing
anything with the error you don't need the
'as e' bit at the end of each except line...

Just saves a little typing is all.
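
To illustrate the difference, here is a small sketch. The functions to_int() and describe() are invented for this example and use ValueError rather than the NameError from the original code:

```python
def to_int(text, default=0):
    try:
        return int(text)
    except ValueError:            # no "as e": the exception object is unused
        return default

def describe(text):
    try:
        return int(text)
    except ValueError as e:       # "as e" is needed here, the message is used
        return f"bad input: {e}"

print(to_int("12"), to_int("abc"))
print(describe("x"))
```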

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> 
> On 07/02/2019 09:29, Peter Otten wrote:
>> Where will you get the order from?
> 
> Peter,
> 
> the order comes from the command line. 

Then my one-function-per-format approach won't work.

> I intend to call the python
> program with the same command line options as the Voro++ program. Make
> the python program call the Voro++ and process its output.
> 
> one command line option contains a string for formatting the output.
> That is what I use for order.
> 
> #all output formatting options
> order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
> order = re.findall("%[a-z]",order,re.M|re.I)
> for i, line in enumerate(open("vorodat.vol",'r')):
>     points = i
>     line = line.strip()
>     line = line.split(" ")
>     for action in order:
>         if action == "%i":
>             try:
>                 lbl = f_label(label)
>             except NameError as e:
>                 lbl = f_number(label)
>                 label = [lbl]

Personally I would avoid the NameError and start with empty lists. If you 
manage to wrap all branches into functions with the same signature you can 
replace the sequence of tests with dictionary lookups. Here's a sketch:


# the f_...() functions take a parsed line and return a value and the
# as yet unused rest of the parsed line

labels = []
points = []
...
def add_label(parts):
    label, rest = f_label(parts)
    labels.append(label)
    return rest

def add_point(parts):
    point, rest = f_vector(parts)
    points.append(point)
    return rest

def set_width(parts):
    global width
    width, rest = f_width(parts)
    return rest

lookup_actions = {
    "%i": add_label,
    "%q": add_point,
    "%w": set_width,
    ...
}

actions = [lookup_actions[action] for action in order]

with open("vorodat.vol") as instream:
    # as per Mark's advice; the counter gets its own name so that it
    # does not shadow the points list
    for num_points, line in enumerate(instream, 1):
        width = None  # dummy value to provoke error when width
                      # is not explicitly set
        parts = line.split()
        for action in actions:
            parts = action(parts)
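
A self-contained variant of this sketch that can be run as-is. The bodies of f_label(), f_vector() and f_width() are toy stand-ins (the real ones are not shown in this thread), the input comes from a list of strings instead of a file, and process() is a made-up wrapper:

```python
labels = []
points = []
width = None

# Toy stand-ins: each returns (value, unused rest of the parsed line).
def f_label(parts):
    return int(parts[0]), parts[1:]

def f_vector(parts):
    return tuple(float(p) for p in parts[:3]), parts[3:]

def f_width(parts):
    return int(parts[0]), parts[1:]

def add_label(parts):
    label, rest = f_label(parts)
    labels.append(label)
    return rest

def add_point(parts):
    point, rest = f_vector(parts)
    points.append(point)
    return rest

def set_width(parts):
    global width
    width, rest = f_width(parts)
    return rest

lookup_actions = {"%i": add_label, "%q": add_point, "%w": set_width}

def process(lines, order):
    # Resolve the format codes to functions once, then apply them per line.
    actions = [lookup_actions[code] for code in order]
    for line in lines:
        parts = line.split()
        for action in actions:
            parts = action(parts)

process(["1 0.0 0.5 1.0 4", "2 1.0 1.5 2.0 6"], ["%i", "%q", "%w"])
print(labels, points, width)
```

Each action consumes its tokens from the front and hands the remainder to the next one, so the order list alone decides how a line is cut up.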


>         elif action == "%q":
>             try:
>                 f_vector(point)
>             except NameError as e:
>                 point = [f_vector(point)]
>         elif action == "%r":
>             try:
>                 f_value(radius)
>             except NameError as e:
>                 radius = [f_value(radius)]
>         etc.
> 
> order is important as %w tells me how long %p, %P and %o will be. This
> varies per line.
> 
> I'll look into what you wrote,



Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 10:40, Alan Gauld via Tutor wrote:

> Just saves a little typing is all.


Sensei,

be lazy, I will study


current state of code is at
https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 11:08, Peter Otten wrote:

> Personally I would avoid the NameError and start with empty lists. If you
> manage to wrap all branches into functions with the same signature you can
> replace the sequence of tests with dictionary lookups.


Just before I saw your post I put my current code up here:

https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

I will study yours,

ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen




On 07/02/2019 11:08, Peter Otten wrote:

> replace the sequence of tests with dictionary lookups


updated the gist a few times. Now I could precalculate the slices to be
taken per line, but will there be much gain compared to chopping from
the left side of the list?


ingo


Re: [Tutor] text processing lines variable content

2019-02-07 Thread Peter Otten
ingo janssen wrote:

> 
> 
> On 07/02/2019 11:08, Peter Otten wrote:
>> replace the sequence of tests with dictionary lookups
> 
> updated the gist a few times. Now I could precalculate the slices to be
> taken per line, but will there be much gain compared to chopping from
> the left side of the list?

Sorry, I don't understand the question.


Looking at your code

> if action == "%i":
>     lbl = function[action](content[action])

you really do not need the function[action] lookup here because you know 
that the result will always be f_number. Likewise you could bind 
content["%i"] to a name, labels, say, and then write

if action == "%i":
    lbl = f_number(labels)

which I find much more readable. 

A lookup table only makes sense if it provides all necessary information. I
tried to apply the idea to one of your gist versions:

import re
from functools import partial

def set_lbl(items):
    global lbl
    lbl = f_number(items)

def set_w(items):
    global v
    v = f_number(items)

def set_f(items):
    global f
    f = f_number(items)

def set_mx(items):
    global mx
    mx = mx_value_array(items, f)

function = {
    "%i" : set_lbl,
    "%w" : set_w,
    "%s" : set_f,
    "%a" : set_mx,
    "%q" : f_vector,
    "%r" : f_value,
    "%p" : lambda items: f_vector_array(items, v),
    "%P" : lambda items: f_vector_array(items, v),
    "%o" : lambda items: f_value_array(items, v),
    "%m" : f_value,
    "%g" : f_number,
    "%E" : f_value,
    "%e" : lambda items: f_value_array(items, f),
    "%F" : f_value,
    "%A" : lambda items: f_value_array(items, mx + 1),
    "%f" : lambda items: f_value_array(items, f),
    "%t" : lambda items: f_nested_value_array(items, f),
    "%l" : lambda items: f_vector_array(items, f),
    "%n" : lambda items: f_value_array(items, f),
    "%v" : f_value,
    "%c" : f_vector,
    "%C" : f_vector
}

order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
order = re.findall("%[a-z]",order,re.M|re.I)
content = {}

actions = []

# bind each function to the list that collects its results
for i in order:
    items = content[i] = []
    actions.append(partial(function[i], items))

for points, line in enumerate(open("vorodat.txt.vol",'r'), 1):
    line = line.strip()
    line = line.split(" ")
    for action in actions:
        action()

However, while the loop is rather clean now the rest of the code is 
sprinkled with implicit arguments and thus much worse than what you have.
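
One possible way to keep the clean loop without globals is to pass the per-line state explicitly. This is only a sketch under assumptions: the handler factories take_number() and take_values(), the record keys and the three-code subset are invented for illustration and do not come from the gist:

```python
def take_number(key):
    """Build a handler that stores one integer under `key`."""
    def handler(tokens, record):
        record[key] = int(tokens[0])
        return tokens[1:]
    return handler

def take_values(key, count_key):
    """Build a handler that stores record[count_key] floats under `key`."""
    def handler(tokens, record):
        n = record[count_key]     # count read earlier on the same line
        record[key] = [float(t) for t in tokens[:n]]
        return tokens[n:]
    return handler

handlers = {
    "%i": take_number("id"),
    "%w": take_number("vertex_count"),
    "%o": take_values("orders", "vertex_count"),
}

def parse(line, order):
    record = {}                   # explicit state instead of globals
    tokens = line.split()
    for code in order:
        tokens = handlers[code](tokens, record)
    return record

print(parse("5 2 3.0 4.0", ["%i", "%w", "%o"]))
```

Every handler has the same signature, and the dependency of a field on an earlier count is visible in the table itself rather than hidden in a global.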



Re: [Tutor] text processing lines variable content

2019-02-07 Thread ingo janssen



On 07/02/2019 18:06, Peter Otten wrote:

> Sorry, I don't understand the question.


after a quick look not unlike what you propose but I have to investigate 
further,


lengths of chunks are known or can be found (sketchy):

order= [%i,%q,%r,%w,%p,%P,%o,%m,%g,%E,%s,%e,%F,%a,%A,%f,%t,%l,%n,%v,%c,%C]
length=[ 1, 3, 1, 1,%w,%w,%w, 1, 1, 1, 1,%s, 1,%s,%a,%s,%s,%s,%s, 1, 3, 3]

from there calculate the slices per line
slices={"%i":(0,1), "%q":(1,4), "%r":(4,5), etc.

modify all functions to accept and deal with the slice tuple, then the 
action loop gets very simple:


for points, line in enumerate(open("vorodat.txt.vol",'r'), 1):
    line = line.strip()
    line = line.split(" ")
    slices = calculate_slices(line)
    for action in order:
        function[action](content[action], slices[action])
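
A rough sketch of what calculate_slices() might look like. The LENGTHS table is only a subset of the one above, the extra order parameter is an addition for this example, and the sample line is made up:

```python
# Token counts per format code: an int is a fixed width, a string names the
# code whose value on the current line gives the width (assumed subset).
LENGTHS = {"%i": 1, "%q": 3, "%r": 1, "%w": 1, "%p": "%w", "%o": "%w"}

def calculate_slices(tokens, order, lengths=LENGTHS):
    """Return {code: (start, stop)} for one already-split line."""
    count_codes = {v for v in lengths.values() if isinstance(v, str)}
    counts = {}
    slices = {}
    start = 0
    for code in order:
        length = lengths[code]
        if isinstance(length, str):
            length = counts[length]   # KeyError if the count was not read yet
        slices[code] = (start, start + length)
        if code in count_codes:
            counts[code] = int(tokens[start])   # remember the count for later
        start += length
    return slices

order = ["%i", "%q", "%r", "%w", "%p", "%o"]
tokens = "7 0.1 0.2 0.3 1.5 2 a b c d".split()
slices = calculate_slices(tokens, order)
print(slices["%p"], slices["%o"])
```

The slices still have to be worked out left to right for every line, because the variable widths are only known once the counts have been read.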

thanks for your time and insight, I'll try a few different ways

ingo