Python refactoring question and create dynamic attributes

2019-06-23 Thread Arup Rakshit
In the below code:

@classmethod
def find(self, id):
    if isinstance(id, list):
        rows = self.__table__().get_all(*id).run(self.__db__().conn)
        result = []
        for row in rows:
            acategory = Category()
            acategory.__dict__.update(row)
            result.append(acategory)
        return result
    else:
        adict = self.__table__().get(id).run(self.__db__().conn)
        acategory = Category()
        acategory.__dict__.update(adict)
        return acategory

I have 2 questions:

1. Is there any better way to create attributes in an object without using 
__dict__.update(), or is this a correct approach?
2. Can we get the same result that the `for row in rows:` block produces without 
killing the readability?


To see the context, here is my source code 
https://gitlab.com/aruprakshit/flask_awesome_recipes/blob/master/app/models/category.py
 


Thanks,

Arup Rakshit
[email protected]



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python refactoring question and create dynamic attributes

2019-06-23 Thread Cameron Simpson

On 23Jun2019 13:26, Arup Rakshit  wrote:

In the below code:

   @classmethod
   def find(self, id):
       if isinstance(id, list):
           rows = self.__table__().get_all(*id).run(self.__db__().conn)
           result = []
           for row in rows:
               acategory = Category()
               acategory.__dict__.update(row)
               result.append(acategory)
           return result
       else:
           adict = self.__table__().get(id).run(self.__db__().conn)
           acategory = Category()
           acategory.__dict__.update(adict)
           return acategory

I have 2 questions:

1. Is there any better way to create attributes in an object without using 
__dict__.update(), or is this a correct approach?


setattr() is the usual approach, but that sets a single attribute at a 
time. If you have many then __dict__.update may be reasonable.
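
For instance (a tiny sketch; the Category class and row data are illustrative):

```python
class Category:
    pass

acategory = Category()
row = {"id": 1, "name": "Dessert"}

# One attribute at a time with setattr()...
for key, value in row.items():
    setattr(acategory, key, value)

# ...or all at once via the instance __dict__.
assert acategory.id == 1 and acategory.name == "Dessert"
```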


You should bear in mind that not all objects have a __dict__. It is 
uncommon, but if a class is defined with a __slots__ attribute then its 
instances have fixed attribute names and there is no __dict__. Also, some 
builtin types have no __dict__. However, you likely know that the 
objects you are using have a __dict__, so you're probably good.


Also, __dict__ bypasses properties and descriptors. That might be 
important.



2. Can we get the same result that the `for row in rows:` block produces without 
killing the readability?


Not obviously. It looks pretty direct to me.

Unless the Category class can be made to accept an attribute map in its 
__init__ method - then you might use some variant on:


 result = [ Category(row) for row in rows ]

which is pretty readable.

BTW, @classmethods receive the class as the first argument, not an 
instance. So you'd normally write:


 @classmethod
 def find(cls, id):
   ...

and you would not have a self to use. Is __table__ a class attribute or 
an instance attribute?


To see the context, here is my source code 
https://gitlab.com/aruprakshit/flask_awesome_recipes/blob/master/app/models/category.py


Ah, so Category inherits from BaseModel. That means you can write your 
own __init__. If it accepted an optional mapping (i.e. a dict or row) 
you could put the .update inside __init__, supporting the list 
comprehension I suggest above.
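
A minimal sketch of that idea (assuming Category is free to define its own 
__init__, as in the linked source; the row data here is illustrative):

```python
class Category:
    def __init__(self, attrs=None):
        # Optionally absorb a mapping of attributes (e.g. a database row).
        if attrs is not None:
            self.__dict__.update(attrs)

rows = [{"id": 1, "name": "Dessert"}, {"id": 2, "name": "Salad"}]
result = [Category(row) for row in rows]
assert result[1].name == "Salad"
```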


It looks like you've marked almost all the methods as @classmethods. Are 
you sure about that? They all seem written to use self, and thus to be 
instance methods.


Cheers,
Cameron Simpson 


Re: Python refactoring question and create dynamic attributes

2019-06-23 Thread Arup Rakshit

> On 23-Jun-2019, at 2:31 PM, Cameron Simpson  wrote:
> 
> On 23Jun2019 13:26, Arup Rakshit  wrote:
>> In the below code:
>> 
>>   @classmethod
>>   def find(self, id):
>>       if isinstance(id, list):
>>           rows = self.__table__().get_all(*id).run(self.__db__().conn)
>>           result = []
>>           for row in rows:
>>               acategory = Category()
>>               acategory.__dict__.update(row)
>>               result.append(acategory)
>>           return result
>>       else:
>>           adict = self.__table__().get(id).run(self.__db__().conn)
>>           acategory = Category()
>>           acategory.__dict__.update(adict)
>>           return acategory
>> 
>> I have 2 questions:
>> 
>> 1. Is there any better way to create attributes in an object without using 
>> __dict__.update(), or is this a correct approach?
> 
> setattr() is the usual approach, but that sets a single attribute at a time. 
> If you have many then __dict__.update may be reasonable.
> 
> You should bear in mind that not all objects have a __dict__. It is uncommon, 
> but if a class is defined with a __slots__ attribute then its instances have 
> fixed attribute names and there is no __dict__. Also, some builtin types have 
> no __dict__. However, you likely know that the objects you are using have a 
> __dict__, so you're probably good.
> 
> Also, __dict__ bypasses properties and descriptors. That might be important.
> 
>> 2. Can we get the same result that the `for row in rows:` block produces 
>> without killing the readability?
> 
> Not obviously. It looks pretty direct to me.
> 
> Unless the Category class can be made to accept an attribute map in its 
> __init__ method - then you might use some variant on:
> 
> result = [ Category(row) for row in rows ]
> 
> which is pretty readable.
> 
> BTW, @classmethods receive the class as the first argument, not an instance. 
> So you'd normally write:
> 
> @classmethod
> def find(cls, id):
>   …
> 

What I know is that the first argument is reserved for the instance upon which 
the method is called. It can be any name, so I continued to use self. Yes, these 
methods are class methods intentionally. So far I was not aware of the naming 
conventions for the first argument of a class or instance method.

> and you would not have a self to use. Is __table__ a class attribute or an 
> instance attribute?

Yes, __table__ is a class attribute defined in BaseModel.

> 
>> To see the context, here is my source code 
>> https://gitlab.com/aruprakshit/flask_awesome_recipes/blob/master/app/models/category.py
> 
> Ah, so Category inherits from BaseModel. That means you can write your own 
> __init__. If it accepted an optional mapping (i.e. a dict or row) you could 
> put the .update inside __init__, supporting the list comprehension I suggest 
> above.

Nice idea. I’ll try this.

> 
> It looks like you've marked almost all the methods as @classmethods. Are you 
> sure about that? They all seem written to use self, and thus to be instance 
> methods.
> 
> Cheers,
> Cameron Simpson 



Thanks,

Arup Rakshit
[email protected]


Re: finding a component in a list of pairs

2019-06-23 Thread Peter Pearson
On 22 Jun 2019 13:24:38 GMT, Stefan Ram  wrote:
[snip]
>
> print( next( ( pair for pair in pairs if pair[ 0 ]== 'sun' ), 
>  ( 0, '(unbekannt)' ))[ 1 ])
> print( next( itertools.dropwhile( lambda pair: pair[ 0 ]!= 'sun', pairs ))
>  [ 1 ])
[snip]
>
>   The last two lines of the program show two different
>   approaches to search for the translation of »sun«.
>
>   Which approach is better? Or, do you have yet a better idea
>   about how to find the translation of »sun« in »pairs«?

Are you allowed to use a dict?
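
For context, a dict reduces the lookup to a single method call (a sketch; 
`pairs` is assumed to be the (word, translation) list from the original post):

```python
pairs = [("sun", "Sonne"), ("moon", "Mond")]
translations = dict(pairs)

# dict.get takes a default for the not-found case, like the original fallback.
assert translations.get("sun", "(unbekannt)") == "Sonne"
assert translations.get("star", "(unbekannt)") == "(unbekannt)"
```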

-- 
To email me, substitute nowhere->runbox, invalid->com.


Re: Python refactoring question and create dynamic attributes

2019-06-23 Thread MRAB

On 2019-06-23 10:44, Arup Rakshit wrote:



On 23-Jun-2019, at 2:31 PM, Cameron Simpson  wrote:

On 23Jun2019 13:26, Arup Rakshit  wrote:

In the below code:

  @classmethod
  def find(self, id):
      if isinstance(id, list):
          rows = self.__table__().get_all(*id).run(self.__db__().conn)
          result = []
          for row in rows:
              acategory = Category()
              acategory.__dict__.update(row)
              result.append(acategory)
          return result
      else:
          adict = self.__table__().get(id).run(self.__db__().conn)
          acategory = Category()
          acategory.__dict__.update(adict)
          return acategory

I have 2 questions:

1. Is there any better way to create attributes in an object without using 
__dict__.update(), or is this a correct approach?


setattr() is the usual approach, but that sets a single attribute at a time. If 
you have many then __dict__.update may be reasonable.

You should bear in mind that not all objects have a __dict__. It is uncommon, 
but if a class is defined with a __slots__ attribute then its instances have 
fixed attribute names and there is no __dict__. Also, some builtin types have 
no __dict__. However, you likely know that the objects you are using have a 
__dict__, so you're probably good.

Also, __dict__ bypasses properties and descriptors. That might be important.


2. Can we get the same result that the `for row in rows:` block produces without 
killing the readability?


Not obviously. It looks pretty direct to me.

Unless the Category class can be made to accept an attribute map in its __init__ 
method - then you might use some variant on:

result = [ Category(row) for row in rows ]

which is pretty readable.

BTW, @classmethods receive the class as the first argument, not an instance. So 
you'd normally write:

@classmethod
def find(cls, id):
  …



What I know is that the first argument is reserved for the instance upon which 
the method is called. It can be any name, so I continued to use self. Yes, these 
methods are class methods intentionally. So far I was not aware of the naming 
conventions for the first argument of a class or instance method.

As Cameron wrote, the convention is that if it's an instance method you 
call its first parameter "self", whereas if it's a class method you call 
its first parameter "cls".
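
In other words (a minimal sketch of the convention):

```python
class Example:
    def instance_method(self):   # conventionally named self: the instance
        return self

    @classmethod
    def class_method(cls):       # conventionally named cls: the class itself
        return cls

e = Example()
assert e.instance_method() is e
assert Example.class_method() is Example
```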


[snip]


Re: Python refactoring question and create dynamic attributes

2019-06-23 Thread DL Neil

On 23/06/19 7:56 PM, Arup Rakshit wrote:

In the below code:

 @classmethod
 def find(self, id):
     if isinstance(id, list):
         rows = self.__table__().get_all(*id).run(self.__db__().conn)
         result = []
         for row in rows:
             acategory = Category()
             acategory.__dict__.update(row)
             result.append(acategory)
         return result
     else:
         adict = self.__table__().get(id).run(self.__db__().conn)
         acategory = Category()
         acategory.__dict__.update(adict)
         return acategory

I have 2 questions:

1. Is there any better way to create attributes in an object without using 
__dict__.update(), or is this a correct approach?
2. Can we get the same result that the `for row in rows:` block produces without 
killing the readability?



Examining the readability concern(s):-

1
If it seems complex, first write a comment (in plain English).

2
The most basic refactoring improvement: is any improvement possible in the 
attribute/variable names?



A mid-point between these two: if you find the intricate coding of 
specific concepts awkward to read/explain, eg


__table__().get_all(*id).run(self.__db__().conn)
or __dict__.update(row)

then might you consider a short 'helper function' - whose name will 
explain-all and whose body will 'hide' or abstract the complex detail?


eg absorb_attributes( acategory, row )

(and yes, someone, somewhere, is absolutely itching to throw-in a Star 
Trek reference!)
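
A sketch of such a helper, using the illustrative absorb_attributes name from 
above:

```python
def absorb_attributes(obj, mapping):
    """Copy every key/value pair in `mapping` onto `obj` as attributes."""
    obj.__dict__.update(mapping)

class Category:
    pass

acategory = Category()
absorb_attributes(acategory, {"id": 1, "name": "Dessert"})
assert acategory.name == "Dessert"
```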

--
Regards =dn


Fwd: Understand workflow about reading and writing files in Python

2019-06-23 Thread Windson Yang
Thank you so much for your review, DL Neil, it really helps :D. However,
there are some parts that still confuse me; I replied below.

DL Neil  于2019年6月19日周三 下午2:03写道:

> I've not gone 'back' to refer to any ComSc theory on buffer-management.
> Perhaps you might benefit from such?
>
> I just took a crash course on it, so I want to know if I understand the
details correctly :D


> I like your use of the word "shift", so I'll continue to use it.
>
> There are three separate units of data to consider - each of which could
> be called a "buffer". To avoid confusing (myself) I'll only call the
> 'middle one' that:
> 1 the unit of data 'coming' from the data-source
> 2 the "buffer" you are implementing
> 3 the unit of data 'going' out to a data-destination.
>
> Just to make it clear, when we use `f.write('abc')` in python, (1) means
'abc', (2) means the buffer handled by Python (by default 8kb), (2) means
the file *f* we are writing to, right?
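
As an aside, Python's own user-space buffering can be observed directly; a 
minimal sketch (io.BytesIO stands in for a file on disk; the 8 KiB figure is 
CPython's io.DEFAULT_BUFFER_SIZE):

```python
import io

raw = io.BytesIO(b"x" * 20000)                  # stands in for the data source (1)
buf = io.BufferedReader(raw, buffer_size=8192)  # (2): Python's default buffer size

chunk = buf.read(100)      # a small read...
assert len(chunk) == 100
assert raw.tell() > 100    # ...but more was fetched from the source into the buffer
```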

1 and 3 may be dictated to you, eg hardware or file specifications, code
> requirements, etc.
>
> So, data is shifted into the (2) buffer in a unit-size decided by (1) -
> in most use-cases each incoming unit will be the same size, but remember
> that the last 'unit' may/not be full-size. Similarly, data shifted out
> from the (2) buffer to (3).
>
> The size of (1) is likely not that of (3) - otherwise why use a
> "buffer"? The size of (2) must be larger than (1) and larger than (2) -
> for reasons already illustrated.
>

Is this a typo? (2) larger than (1) larger than (2)?

>
> I recall learning how to use buffers with a series of hand-drawn block
> diagrams. Recommend you try similarly!
>
>
> Now, let's add a few critiques, as requested (interposed below):-
>
>
> On 19/06/19 3:53 PM, Windson Yang wrote:
> > I'm trying to understand the workflow of how Python read/writes data with
> > buffer. I will be appreciated if someone can review it.
> >
> > ### Read n data
>
> - may need more than one read operation if the size of (3) "demands"
> more data than the size of (1)/one "read".
>

Looks like the size of one read() depends on
https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1655 ?

>
> > 1. If the data already in the buffer, return data
>
> - this a data-transfer of size (3)
>
> For extra credit/an unnecessary complication (but probable speed-up!):
> * if the data-remaining is less than size (3) consider a read-ahead
> mechanism
>
> > 2. If the data not in the buffer:
>
> - if buffer's data-len < size (3)
>
> >  1. copy all the current data from the buffer
>
> * if "buffer" is my (2), then no-op
>

I don't understand your point here: when we read data, we would copy some
data from the current buffer in Python, right? (
https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1638);
we use `out` (which points to res) to store the data here.

>
> >  2. create a new buffer object, fill the new buffer with raw read
> which
> > read data from disk.
>
> * this becomes: perform read operation and append incoming data (size
> (1)) to "buffer" - hence why "buffer" is larger than (1), by definition.
> NB if size (1) is smaller than size (3), multiple read operations may be
> necessary. Thus a read-loop!?
>
> Yes, you are right, here is a while loop (
https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1652
)

>
> >  3. concat the data in the old buffer and new buffer.
>
> = now no-op. Hopefully the description of 'three buffers' removes this
> confusion of/between buffers.
>
>  I don't get it. When we call the function like seek(0) and then read(1000),
we can still use the data from the buffer in Python, right?

>
> >  4. return the data
>
> * make the above steps into a while-loop and there won't be a separate
> step here (it is the existing step 1!)
>
>
> * build all of the above into a function/method, so that the 'mainline'
> only has to say 'give me data'!
>
>
> > ### Write n data
> > 1. If data small enough to fill into the buffer, write data to the buffer
>
> =yes, the data coming from source (1), which in this case is 'your' code
> may/not be sufficient to fill the output size (3). So, load it into the
> "buffer" (2).
>
> > 2. If data can't fill into the buffer
> >  1. flush the data in the buffer
>
> =This statement seems to suggest that if there is already some data in
> the buffer, it will be wiped. Not recommended!
>
We check if there is any data in the buffer; if there is, we flush it to disk (
https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1948
)

> =Have replaced the next steps, see below for your consideration:-
>
> >  1. If succeed:
> >  1. create a new buffer object.
> >  2. fill the new buffer with data return from raw write
> >  2. If failed:
> >  1. Shifting the buffer to make room for writing data to the
> > buffer
> >  2. Buffer as much writing data as possible (may raise
> > BlockingIOError)
> >   

Re: Understand workflow about reading and writing files in Python

2019-06-23 Thread DL Neil

Yes, better to reply to the list - others may 'jump in'...


On 20/06/19 5:37 PM, Windson Yang wrote:
Thank you so much for your review, DL Neil, it really helps :D. However, 
there are some parts that still confuse me; I replied below.


It's not a particularly easy topic...


DL Neil > 于2019年6月19日周三 下午2:03写道:


I've not gone 'back' to refer to any ComSc theory on buffer-management.
Perhaps you might benefit from such?

I just took a crash course on it, so I want to know if I understand the 
details correctly :D


...there are so many ways one can mess-up!



I like your use of the word "shift", so I'll continue to use it.

There are three separate units of data to consider - each of which
could
be called a "buffer". To avoid confusing (myself) I'll only call the
'middle one' that:
1 the unit of data 'coming' from the data-source
2 the "buffer" you are implementing
3 the unit of data 'going' out to a data-destination.

Just to make it clear, when we use `f.write('abc')` in python, (1) means 
'abc', (2) means the buffer handled by Python (by default 8kb), (2) means 
the file *f* we are writing to, right?


No! (sorry) f.write() is an output operation, thus nr3.

"f" is not a "buffer handle" but a "file handle" or more accurately a 
"file object".


When we:

one_input = f.read( NRbytes )

(ignoring EOF/short file and other exceptions) that many bytes will 
'appear' in our program labelled as "one_input".


However, the OpSys may have read considerably more data, depending upon 
the device(s) involved, the application, etc; eg if we ask for 2 bytes 
the operating system will read a much larger block (or applicable unit) 
of data from a disk drive.


The same applies in reverse, with f.write( NRbytes/byte-object ), until 
we flush or close the file.


Those situations account for nr1 and nr3. In the usual case, we have no 
control over the size of these buffers - and it is best not to meddle!


Hence:-


1 and 3 may be dictated to you, eg hardware or file specifications,
code
requirements, etc.

So, data is shifted into the (2) buffer in a unit-size decided by (1) -
in most use-cases each incoming unit will be the same size, but
remember
that the last 'unit' may/not be full-size. Similarly, data shifted out
from the (2) buffer to (3).

The size of (1) is likely not that of (3) - otherwise why use a
"buffer"? The size of (2) must be larger than (1) and larger than (2) -
for reasons already illustrated.

Is this a typo? (2) larger than (1) larger than (2)?


Correct - well spotted! nr2 > nr1 and nr2 > nr3



I recall learning how to use buffers with a series of hand-drawn block
diagrams. Recommend you try similarly!


Try this!



Now, let's add a few critiques, as requested (interposed below):-


On 19/06/19 3:53 PM, Windson Yang wrote:
 > I'm trying to understand the workflow of how Python read/writes
data with
 > buffer. I will be appreciated if someone can review it.
 >
 > ### Read n data

- may need more than one read operation if the size of (3) "demands"
more data than the size of (1)/one "read".


Looks like the size of one read() depends on 
https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1655 ?



You decide how many bytes should be read. That's how much will be 
transferred from the OpSys' I/O into the Python program's space. With 
the major exception that if there is no (more) data available, it is 
defined as an exception (EOF = end of file), or if there are fewer bytes 
of data than requested (in which case you will be given only the number 
of bytes available).
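
For Python file objects specifically, both the short read and end-of-file are 
signalled through the return value; a tiny sketch (io.BytesIO standing in for a 
real file):

```python
import io

f = io.BytesIO(b"abc")
assert f.read(100) == b"abc"  # short read: fewer bytes than requested
assert f.read(10) == b""      # at EOF a plain read() returns an empty result
```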




 > 1. If the data already in the buffer, return data

- this a data-transfer of size (3)

For extra credit/an unnecessary complication (but probable speed-up!):
* if the data-remaining is less than size (3) consider a read-ahead
mechanism

 > 2. If the data not in the buffer:

- if buffer's data-len < size (3)

 >      1. copy all the current data from the buffer

* if "buffer" is my (2), then no-op

I don't understand your point here: when we read data, we would copy some 
data from the current buffer in Python, right? 
(https://github.com/python/cpython/blob/master/Modules/_io/bufferedio.c#L1638); 
we use `out` (which points to res) to store the data here.


We're becoming confused: the original heading 'here' was "### Read n 
data" which is inconsistent with "out" and "from python".



If the read operation is set to transfer (say) 2KB into the program at a 
time, but the code processes it in 100B units, then it would seem that 
after the first read, twenty process loops will run before it is 
necessary to issue another input request.


In that example, the buffer (nr2) is twenty-times the length of the 
input 'buffer' (nr1).


So, from the second to the twentieth iteration of the process, your

generator to write N lines to file

2019-06-23 Thread Sayth Renshaw
Afternoon

Trying to create a generator to write the first N lines of text to a file.
However, I keep receiving the islice object, not the text - why?

from itertools import islice

fileName = dir / "7oldsamr.txt"
dumpName = dir / "dump.txt"

def getWord(infile, lineNum):
    with open(infile, 'r+') as f:
        lines_gen = islice(f, lineNum)
        yield lines_gen

for line in getWord(fileName, 5):
    with open(dumpName, 'a') as f:
        f.write(line)

Thanks,

Sayth


Re: generator to write N lines to file

2019-06-23 Thread Peter Otten
Sayth Renshaw wrote:

> Afternoon
> 
> Trying to create a generator to write the first N lines of text to a file.
> However, I keep receiving the islice object, not the text - why?
> 
> from itertools import islice
> 
> fileName = dir / "7oldsamr.txt"
> dumpName = dir / "dump.txt"
> 
> def getWord(infile, lineNum):
>     with open(infile, 'r+') as f:
>         lines_gen = islice(f, lineNum)
>         yield lines_gen

To get the individual lines you have to yield them

  for line in lines_gen:
      yield line

This can be rewritten with some syntactic sugar as

  yield from lines_gen
 
> for line in getWord(fileName, 5):
>     with open(dumpName, 'a') as f:
>         f.write(line)

Note that getWord as written does not close the infile when you break out of 
the loop, e. g.

for line in getWord(filename, 5):
break

To avoid that you can use a context manager:

from itertools import islice
from contextlib import contextmanager

infile = ...
outfile = ...

@contextmanager
def head(infile, numlines):
    with open(infile) as f:
        yield islice(f, numlines)

with open(outfile, "w") as f:
    with head(infile, 5) as lines:
        f.writelines(lines)
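
Putting the `yield from` fix together as a self-contained sketch (io.StringIO 
stands in for the real files so it runs anywhere; the names are illustrative):

```python
from itertools import islice
import io

def get_lines(f, num_lines):
    # Yield the first `num_lines` lines of an open file, one at a time.
    yield from islice(f, num_lines)

src = io.StringIO("one\ntwo\nthree\nfour\n")
dst = io.StringIO()
for line in get_lines(src, 2):
    dst.write(line)

assert dst.getvalue() == "one\ntwo\n"
```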

