Suggested datatype for getting latest information from log files

2016-02-11 Thread ltomassmail
I have timestamped log files that I need to read through while keeping track of 
the most up-to-date information.

For example, let's say we had a log file with these columns:

timeStamp,name,marblesHeld,timeNow,timeSinceLastEaten

I need to keep track of every 'name' in this table. I don't want duplicate 
entries, so if a row with a later timestamp carries different values for a 
name, that name's entry needs to be updated. For example, if a later timestamp 
showed 'dave' with fewer marbles, his entry should be updated.

I thought a dictionary would be a good idea because keys are unique, so there 
are no duplicates and the data would always update. However, because 
dictionaries are unordered and I need to do some more processing on the data 
afterwards, I'm having trouble.

For example, let's assume that once I have the most up-to-date values for dave, 
steve and jenny, I want to compute timeNow - timeSinceLastEaten to get an 
interval, then write all the info together to some other database. Crucially, 
order is important here.

I don't know if a particular name will appear in the records or not, so its 
entry needs to be created on the first instance and updated from then on.

Could anyone suggest some good approaches or data structures for this?

I thought about creating an object for each 'name', then checking whether that 
object exists and updating values within it. However, that seemed like
a. overkill
b. beyond my Python skills for the timeframe I have
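
A minimal sketch of the plain-dictionary approach described above, not a 
definitive implementation. It assumes the log is comma-separated with the 
header shown earlier, that timeStamp, timeNow and timeSinceLastEaten are 
numeric, and that sorting by name is an acceptable output order. The sample 
rows and the function names latest_by_name and intervals are made up for 
illustration.

```python
import csv

# Hypothetical sample in the column layout from the post; values are made up.
LOG = """\
timeStamp,name,marblesHeld,timeNow,timeSinceLastEaten
1,dave,10,100,40
2,steve,3,110,15
3,dave,7,120,5
"""

def latest_by_name(lines):
    """Return {name: row}, keeping the row with the greatest timeStamp per name."""
    latest = {}
    for row in csv.DictReader(lines):
        name = row["name"]
        seen = latest.get(name)
        # Create the entry on first sight; replace it only if this row is newer.
        if seen is None or float(row["timeStamp"]) >= float(seen["timeStamp"]):
            latest[name] = row
    return latest

def intervals(latest):
    """Return (name, timeNow - timeSinceLastEaten) pairs, sorted by name."""
    return [
        (name, float(row["timeNow"]) - float(row["timeSinceLastEaten"]))
        for name, row in sorted(latest.items())
    ]

latest = latest_by_name(LOG.splitlines())
for name, interval in intervals(latest):
    print(name, interval)
```

Because the ordering is applied only when the results are read out, the 
dictionary itself does not need to be ordered; a different sort key could be 
passed to sorted() if another order is required.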
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Suggested datatype for getting latest information from log files

2016-02-11 Thread ltomassmail
On Thursday, February 11, 2016 at 6:16:35 PM UTC, jmp wrote:
> On 02/11/2016 07:07 PM, [email protected] wrote:
> > I thought a dictionary would be a good idea because of the key restrictions 
> > ensuring no duplicates, so the data would always update - However because 
> > they are unordered and I need to do some more processing on the data 
> > afterwards I'm having trouble.
> 
> If it's your only concern about using dictionaries, then you may have a 
> look  at 
> https://docs.python.org/2/library/collections.html#collections.OrderedDict
> 
> JM

I did look into that, but I'm trying to do something like this, which doesn't 
work. I guess I'm struggling a little with the implementation.
fillInfo = {}
fillInfo['name'] = OrderedDict('info1','info2','info3','info4','info5',)
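
The call above raises a TypeError because OrderedDict takes a single iterable 
of key/value pairs (or a mapping), not a series of positional values. A small 
sketch of one way to build it, assuming the 'infoN' strings are meant as field 
labels; the values and the name fill_info are placeholders:

```python
from collections import OrderedDict

# Field labels taken from the snippet above; the values are made up.
fields = ["info1", "info2", "info3", "info4", "info5"]
values = ["a", "b", "c", "d", "e"]

fill_info = {}
# zip() pairs each label with its value, which is the shape OrderedDict expects.
fill_info["name"] = OrderedDict(zip(fields, values))

# Updating an existing field later keeps its original position.
fill_info["name"]["info2"] = "B"
print(list(fill_info["name"].items()))
```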


Re: Suggested datatype for getting latest information from log files

2016-02-11 Thread ltomassmail
On Thursday, February 11, 2016 at 6:16:35 PM UTC, jmp wrote:
> On 02/11/2016 07:07 PM, [email protected] wrote:
> > I thought a dictionary would be a good idea because of the key restrictions 
> > ensuring no duplicates, so the data would always update - However because 
> > they are unordered and I need to do some more processing on the data 
> > afterwards I'm having trouble.
> 
> If it's your only concern about using dictionaries, then you may have a 
> look  at 
> https://docs.python.org/2/library/collections.html#collections.OrderedDict
> 
> JM

I did look into this but I'm struggling a little with the implementation. I'm 
currently trying to do something like this, which doesn't work:

fillInfo = {}
p = re.compile('PATTERN')
with open(path, 'r') as f:
    for row in f:
        m = p.search(row)
        if m is None:
            continue
        else:
            fillInfo[m.group(5)] = OrderedDict(
                m.group(1), m.group(2), m.group(3), m.group(4), m.group(6))
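
One possible working variant of the loop above, offered as a sketch under 
assumptions. The real 'PATTERN' is not given, so the regex here is a 
hypothetical stand-in with six comma-separated fields and the name in group 5. 
The labels in LABELS and the function name read_latest are also made up. It 
assumes lines arrive in chronological order, so a later line for the same name 
simply overwrites the earlier one.

```python
import re
from collections import OrderedDict

# Hypothetical pattern: six comma-separated fields, name in group 5.
PATTERN = re.compile(r"^(\w+),(\w+),(\w+),(\w+),(\w+),(\w+)$")
# Hypothetical labels for the five non-name groups.
LABELS = ("field1", "field2", "field3", "field4", "field6")

def read_latest(lines):
    """Map each name (group 5) to an OrderedDict of its most recent fields."""
    fill_info = OrderedDict()  # remembers the order in which names first appear
    for row in lines:
        m = PATTERN.search(row.strip())
        if m is None:
            continue  # skip lines that do not match
        # m.group(1, 2, 3, 4, 6) returns a tuple of those groups.
        fill_info[m.group(5)] = OrderedDict(zip(LABELS, m.group(1, 2, 3, 4, 6)))
    return fill_info

sample = ["1,a,b,c,dave,x", "garbage", "2,d,e,f,steve,y", "3,g,h,i,dave,z"]
info = read_latest(sample)
print(list(info))
print(info["dave"]["field6"])
```

An open file object can be passed in place of the sample list, since both 
yield one line per iteration.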