Suggested datatype for getting latest information from log files
I have timestamped log files that I need to read through while keeping track of the most up-to-date information. For example, let's say we had a log file with these fields:

timeStamp,name,marblesHeld,timeNow,timeSinceLastEaten

I need to keep track of every 'name' in this table. I don't want duplicate values, so if values come in from a later timestamp that are different, the entry needs to be updated. For example, if a later timestamp showed 'dave' with fewer marbles, that should get updated.

I thought a dictionary would be a good idea because of the key restrictions ensuring no duplicates, so the data would always update. However, because dictionaries are unordered and I need to do some more processing on the data afterwards, I'm having trouble.

For example, let's assume that once I have the most up-to-date values from dave, steve, jenny I wanted to do timeNow - timeSinceLastEaten to get an interval, then write all the info together to some other database. Crucially, order is important here.

I don't know if a particular name will appear in the records or not, so it needs to be created on the first instance and updated from then on.

Could anyone suggest some good approaches or data structures for this?

I thought about trying to create an object for each 'name', then checking whether that object exists and updating values within it. However, that seemed like a) overkill and b) beyond my Python skills for the timeframe I have.

--
https://mail.python.org/mailman/listinfo/python-list
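
One possible way to sketch the approach described above is an OrderedDict keyed by name, where each new line replaces the stored record only if its timestamp is at least as recent. This is a rough sketch, not a definitive answer: it assumes the log is comma-separated with exactly the five fields listed, that every field except name is numeric, and the helper names latest_by_name and intervals are made up for illustration.

```python
import csv
from collections import OrderedDict

# Field order taken from the example log line in the question.
FIELDS = ["timeStamp", "name", "marblesHeld", "timeNow", "timeSinceLastEaten"]
NUMERIC = ("timeStamp", "marblesHeld", "timeNow", "timeSinceLastEaten")


def latest_by_name(lines):
    """Map each name to its most recent record, in order of first appearance."""
    latest = OrderedDict()
    for row in csv.reader(lines):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(FIELDS, row))
        try:
            # Assumption: all non-name fields are plain numbers.
            for key in NUMERIC:
                record[key] = float(record[key])
        except ValueError:
            continue  # e.g. a header line
        current = latest.get(record["name"])
        # Create the entry on first sight, otherwise update if not older.
        if current is None or record["timeStamp"] >= current["timeStamp"]:
            latest[record["name"]] = record
    return latest


def intervals(latest):
    """Compute timeNow - timeSinceLastEaten for every stored name."""
    return [(name, rec["timeNow"] - rec["timeSinceLastEaten"])
            for name, rec in latest.items()]


sample = [
    "1,dave,10,100,30",
    "2,steve,4,105,20",
    "3,dave,7,110,5",
]
latest = latest_by_name(sample)
print(list(latest))
print(intervals(latest))
```

Assigning to an existing key in an OrderedDict keeps that key's original position, so names stay in the order they were first seen even after later updates. An open file object can be passed in place of the sample list.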
Re: Suggested datatype for getting latest information from log files
On Thursday, February 11, 2016 at 6:16:35 PM UTC, jmp wrote:
> On 02/11/2016 07:07 PM, [email protected] wrote:
> > I thought a dictionary would be a good idea because of the key restrictions
> > ensuring no duplicates, so the data would always update - However because
> > they are unordered and I need to do some more processing on the data
> > afterwards I'm having trouble.
>
> If it's your only concern about using dictionaries, then you may have a
> look at
> https://docs.python.org/2/library/collections.html#collections.OrderedDict
>
> JM

I did look into that but I'm trying to do something like this which doesn't work - I guess I'm struggling a little with the implementation.

    fillinfo = {}
    fillInfo['name'] = OrderedDict('info1','info2','info3','info4','info5',)
Re: Suggested datatype for getting latest information from log files
On Thursday, February 11, 2016 at 6:16:35 PM UTC, jmp wrote:
> On 02/11/2016 07:07 PM, [email protected] wrote:
> > I thought a dictionary would be a good idea because of the key restrictions
> > ensuring no duplicates, so the data would always update - However because
> > they are unordered and I need to do some more processing on the data
> > afterwards I'm having trouble.
>
> If it's your only concern about using dictionaries, then you may have a
> look at
> https://docs.python.org/2/library/collections.html#collections.OrderedDict
>
> JM

I did look into this but am struggling a little with the implementation. I'm currently trying to do something like this, which doesn't work:

    fillInfo = {}
    p = re.compile('PATTERN')

    with (open(path,'r')) as f:
        for row in f:
            m = p.search(row)
            if m == None:
                continue
            else:
                fillInfo[m.group(5)] = OrderedDict(m.group(1),m.group(2),m.group(3),m.group(4),m.group(6))
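
A likely reason the snippet above fails: OrderedDict, like dict, accepts at most one positional argument (a mapping or an iterable of key/value pairs), so passing five separate strings raises a TypeError. A minimal sketch of one way around that is to pair each captured group with a field name using zip. The real 'PATTERN' is not shown in the thread, so the six-field regex below is purely hypothetical, as are the value names info1 to info5 and the helper name parse_lines.

```python
import re
from collections import OrderedDict

# Hypothetical stand-in for the unseen 'PATTERN': six comma-separated fields.
PATTERN = re.compile(r'^([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)$')

# Hypothetical names for the five values stored under each key.
VALUE_NAMES = ('info1', 'info2', 'info3', 'info4', 'info5')


def parse_lines(lines):
    """Map group 5 of each matching line to an OrderedDict of the other groups."""
    fill_info = OrderedDict()
    for row in lines:
        m = PATTERN.search(row.strip())
        if m is None:
            continue  # line does not match, ignore it
        # group() with several indices returns a tuple of those groups.
        values = m.group(1, 2, 3, 4, 6)
        # OrderedDict needs (key, value) pairs, so zip names with values.
        fill_info[m.group(5)] = OrderedDict(zip(VALUE_NAMES, values))
    return fill_info


sample = [
    "a,b,c,d,dave,f",
    "bad line",
    "g,h,i,j,dave,l",
    "m,n,o,p,steve,r",
]
fill_info = parse_lines(sample)
print(list(fill_info))
print(list(fill_info["dave"].values()))
```

Because the outer container is also an OrderedDict, keys keep the order in which they first appeared, and a later line for the same key simply overwrites the earlier values. With a real file, `parse_lines(f)` inside a `with open(path) as f:` block should behave the same way.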
