question about loading variables from a file...
Can someone offer some advice on the best way to approach this? I am
trying to write a generic Python script to build out some applications.
The script will be generic enough to work for all of them, but it needs
to 'source' a file that contains app- and environment-specific
variables. I have used WLST, which provides the loadProperties() method
for reading values for WebLogic domains, and I'm trying to find the best
way to do the same sort of thing in plain Python. Thanks for any advice.
--
http://mail.python.org/mailman/listinfo/python-list
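One way to get loadProperties()-style behaviour in plain Python is to
read a simple key=value file into a dictionary. The following is only a
minimal sketch using the standard library. The function name
load_properties, the file name app.properties, and the sample keys are
made up for illustration, and it assumes a flat file with one key=value
pair per line and '#' comments, without the line continuations or
escapes that real Java properties files may contain.

```python
import os
import tempfile


def load_properties(path):
    """Read key=value lines from path into a dict of strings.

    Hypothetical helper: handles blank lines and '#' comments only,
    not the full Java .properties syntax.
    """
    props = {}
    with open(path) as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#"):
                continue
            # Split on the first '=' only, so values may contain '='.
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


# Write a small sample file (made-up names and values) and load it back.
sample = (
    "# app settings\n"
    "app.name=myapp\n"
    "env=test\n"
    "admin.url=t3://localhost:7001\n"
    "empty.value=\n"
)
path = os.path.join(tempfile.mkdtemp(), "app.properties")
with open(path, "w") as handle:
    handle.write(sample)

settings = load_properties(path)
print(settings["app.name"], settings["env"])
```

The generic script can then look up settings["env"] and similar keys
instead of hard-coding them. For INI-style files with [sections], the
standard library's configparser module (ConfigParser in Python 2) is
another option.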
Help with pyparsing and dealing with null values
I am trying to parse a log file (web.out) similar to this:
---
MBeanName: "mtg-model:Name=mtg-model_managed2,Type=Server"
AcceptBacklog: 50
AdministrationPort: 0
AutoKillIfFailed: false
AutoRestart: true
COM: mtg-model_managed2
COMEnabled: false
CachingDisabled: true
ClasspathServletDisabled: false
ClientCertProxyEnabled: false
Cluster: mtg-model-cluster
ClusterRuntime: mtg-model-cluster
ClusterWeight: 100
CompleteCOMMessageTimeout: -1
CompleteHTTPMessageTimeout: -1
CompleteIIOPMessageTimeout: -1
CompleteMessageTimeout: 60
CompleteT3MessageTimeout: -1
CustomIdentityKeyStoreFileName:
CustomIdentityKeyStorePassPhrase:
CustomIdentityKeyStorePassPhraseEncrypted:
CustomIdentityKeyStoreType:
CustomTrustKeyStoreFileName:
CustomTrustKeyStorePassPhrase:
CustomTrustKeyStorePassPhraseEncrypted:
CustomTrustKeyStoreType:
DefaultIIOPPassword:
DefaultIIOPPasswordEncrypted:
DefaultIIOPUser:
DefaultInternalServletsDisabled: false
DefaultProtocol: t3
DefaultSecureProtocol: t3s
DefaultTGIOPPassword:
DefaultTGIOPPasswordEncrypted: **
DefaultTGIOPUser: guest
DomainLogFilter:
EnabledForDomainLog: true
ExecuteQueues: weblogic.kernel.Default,foglight
ExpectedToRun: false
ExternalDNSName:
ExtraEjbcOptions:
ExtraRmicOptions:
GracefulShutdownTimeout: 0
---
and I need the indented values (eventually) in a dictionary. As you
can see, some of the fields have a value, and some do not. It appears
that the code I have so far is not dealing with the null values and
colons as I had planned. Here is the code:
---
from pyparsing import *
input = open("web.out", 'r')
data = input.read()
end = Literal("\n").suppress()
all = SkipTo(end)
colon = Literal(":").suppress()
MBeanName = Literal("MBeanName:")
ServerName = dblQuotedString
identity = Word(alphas, alphanums+"._*/,-")
pairs = Group(identity + colon + Optional(identity) +all)
logEntry = (MBeanName + ServerName.setResultsName("servername") +
            OneOrMore(pairs))
for tokens in logEntry.searchString(data):
    print
    print "ServerName =\t "+ tokens.servername
    for t in tokens:
        print t
    print
    print 50*"-"
---
which is giving me this:
---
ServerName = "mtg-model:Name=mtg-modelserver_map501,Type=Server"
MBeanName:
"mtg-model:Name=mtg-modelserver_map501,Type=Server"
['AcceptBacklog', '50']
['AdministrationPort', '0']
['AutoKillIfFailed', 'false', 'AutoRestart: true']
['COM', 'mtg-modelserver_map501', 'COMEnabled: false']
['CachingDisabled', 'true', 'ClasspathServletDisabled: false']
['ClientCertProxyEnabled', 'false', 'Cluster:']
['ClusterRuntime', 'ClusterWeight', ': 100']
['CompleteCOMMessageTimeout', '-1']
['CompleteHTTPMessageTimeout', '-1']
['CompleteIIOPMessageTimeout', '-1']
['CompleteMessageTimeout', '60']
['CompleteT3MessageTimeout', '-1']
['CustomIdentityKeyStoreFileName', 'CustomIdentityKeyStorePassPhrase',
':']
['CustomIdentityKeyStorePassPhraseEncrypted',
'CustomIdentityKeyStoreType', ':']
['CustomTrustKeyStoreFileName', 'CustomTrustKeyStorePassPhrase', ':']
['CustomTrustKeyStorePassPhraseEncrypted', 'CustomTrustKeyStoreType',
':']
['DefaultIIOPPassword', 'DefaultIIOPPasswordEncrypted', ':']
['DefaultIIOPUser', 'DefaultInternalServletsDisabled', ': false']
['DefaultProtocol', 't3', 'DefaultSecureProtocol: t3s']
['DefaultTGIOPPassword', 'DefaultTGIOPPasswordEncrypted', ': **']
['DefaultTGIOPUser', 'guest', 'DomainLogFilter:']
['EnabledForDomainLog', 'true', 'ExecuteQueues:
weblogic.kernel.Default,foglight']
['ExpectedToRun', 'false', 'ExternalDNSName:']
['ExtraEjbcOptions', 'ExtraRmicOptions', ':']
['GracefulShutdownTimeout', '0']
instead of this (one to one):
ServerName = "mtg-model:Name=mtg-modelserver_map501,Type=Server"
MBeanName:
"mtg-model:Name=mtg-modelserver_map501,Type=Server"
['AcceptBacklog', '50']
['AdministrationPort', '0']
['AutoKillIfFailed', 'false']
['AutoRestart', 'true']
['COM', 'mtg-modelserver_map501']
['COMEnabled', 'false']
['CachingDisabled', 'true']
['ClasspathServletDisabled', 'false']
['ClientCertProxyEnabled', 'false']
['Cluster', 'mtg-model-cluster']
['ClusterRuntime', 'mtg-model-cluster']
['ClusterWeight', '100']
['CompleteCOMMessageTimeout', '-1']
['CompleteHTTPMessageTimeout', '-1']
['CompleteIIOPMessageT
Re: Help with pyparsing and dealing with null values
On Mon, 29 Oct 2007 05:45:26 -0700, Paul McGuire <[EMAIL PROTECTED]>
wrote:
>On Oct 29, 1:11 am, avidfan <[EMAIL PROTECTED]> wrote:
>> Help with pyparsing and dealing with null values
>>
>> I am trying to parse a log file (web.out) similar to this:
>>
>> ---
>>
>> MBeanName: "mtg-model:Name=mtg-model_managed2,Type=Server"
>> AcceptBacklog: 50
>
>> ExpectedToRun: false
>> ExternalDNSName:
>> ExtraEjbcOptions:
>> ExtraRmicOptions:
>> GracefulShutdownTimeout: 0
>>
>> ---
>>
>> and I need the indented values (eventually) in a dictionary. As you
>> can see, some of the fields have a value, and some do not. It appears
>> that the code I have so far is not dealing with the null values and
>> colons as I had planned.
>>
>
>This is a very good first cut at the problem. Here are some tips to
>get you going again:
>
>1. Literal("\n") won't work; use LineEnd() instead. Literals are for
>non-whitespace literal strings.
>
>
>2. "all = SkipTo(end)" can be removed, use restOfLine instead of all.
>("all" as a variable name masks Python 2.5's "all" builtin function.)
>
>
>3. In addition to identity, you might consider defining some other
>known value types:
>
>boolean = oneOf("true false")
>boolean.setParseAction(lambda toks: toks[0]=="true")
>
>integer = Combine(Optional("-") + Word(nums))
>integer.setParseAction(lambda toks: int(toks[0]))
>
>These will do data conversion for you at parse time, so that the
>values are already in int or bool form when you access them later.
>
>
>4. The significant change is to this line (I've replaced all with
>restOfLine):
>
>pairs = Group(identity + colon + Optional(identity) + restOfLine)
>
>What gives us a problem is that pyparsing's whitespace-skipping will
>read an identity, even if it's not on the same line. So for keys that
>have no value given, you end up reading past the end-of-line and read
>the next key name as the value for the previous key. To work around
>this, define the value as something which must be on the same line,
>using the NotAny lookahead, which you can abbreviate using the ~
>operator.
>
>pairs = Group(identity + colon + Optional(~end + (identity |
>restOfLine) ) + end )
>
>If we add in the other known value types, this gets a bit unwieldy, so
>I recommend you define value separately:
>
>value = boolean | integer | identity | restOfLine
>pairs = Group(identity + colon + Optional(~end + value) + end )
>
>At this point, I think you have a working parser for your log data.
>
>
>5. (Extra Credit) Lastly, to create a dictionary, you are all set to
>just add pyparsing's Dict class. Change:
>
>logEntry = MBeanName + ServerName("servername") + OneOrMore(pairs)
>
>to:
>
>logEntry = MBeanName + ServerName("servername") + Dict(OneOrMore(pairs))
>
>(I've also removed ".setResultsName", using the new shortened form for
>setting results names.)
>
>Dict will return the parsed tokens as-is, but it will also define
>results names using the tokens[0] element of each list of tokens
>returned by pairs - the values will be the tokens[1:], so that if a
>value expression contains multiple tokens, they all will be associated
>with the results name key.
>
>Now you can replace the results listing code:
>
>for t in tokens:
> print t
>
>with
>
>print tokens.dump()
>
>And you can access the tokens as if they are a dict, using:
>
>print tokens.keys()
>print tokens.values()
>print tokens["ClasspathServletDisabled"]
>
>If you prefer, for keys that are valid Python identifiers (all of
>yours appear to be), you can just use object.attribute notation:
>
>print tokens.ClasspathServletDisabled
>
>Here is some sample output, using dump(), keys(), and attribute
>lookup:
>
>tokens.dump() -> ['MBeanName:', '"mtg-model:Name=mtg-
>model_managed2,Type=Server"', ['AcceptBacklog', 50],
>['AdministrationPort', 0], ['AutoKillIfFailed', False],
>['AutoRestart', True], ['COM', 'mtg-model_managed2'], ['COMEnabled',
>False], ['CachingDisabled', True], ['ClasspathServletDisabled',
>False], ['ClientCertProxyEnabled', False], ['Cluster', 'mtg-model-
>cluster'], ['C
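For comparison, the same key/value extraction can be sketched without
pyparsing by handling the log one line at a time, so that an empty
value can never swallow the key on the next line. This is a
plain-Python illustration of tips 3 and 4 above, not the approach from
the thread. The names convert and parse_pairs are hypothetical, the
sample is a few lines taken from the log in the question, and mapping
an empty value to None is an assumption.

```python
def convert(text):
    """Turn 'true'/'false' into bool and integers into int; else keep str."""
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text


def parse_pairs(lines):
    """Map each 'Key: value' line to key -> converted value (None if empty)."""
    result = {}
    for line in lines:
        # Split on the first colon only; each line is handled on its own,
        # so a missing value cannot pick up the next line's key.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        value = value.strip()
        result[key.strip()] = convert(value) if value else None
    return result


sample = """\
AcceptBacklog: 50
AutoRestart: true
Cluster: mtg-model-cluster
CompleteCOMMessageTimeout: -1
CustomTrustKeyStoreType:
DefaultTGIOPPasswordEncrypted: **
ExecuteQueues: weblogic.kernel.Default,foglight
"""

pairs = parse_pairs(sample.splitlines())
print(pairs["AcceptBacklog"], pairs["AutoRestart"], pairs["CustomTrustKeyStoreType"])
```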
log parser design question
I need to parse a log file using python and I need some advice/wisdom
on the best way to go about it:
The log file entries will consist of something like this:
ID=8688 IID=98889998 execute begin - 01.21.2007 status enabled
locked working.lock
status running
status complete
ID=9009 IID=87234785 execute wait - 01.21.2007 status wait
waiting to lock
status wait
waiting on ID=8688
and so on...
I need to be able to group these entries together, index them by ID
and IID, and search the context of each entry. If a certain status
is found (such as wait), I need to be able to return the ID or IID
(depending...) of that entry.
So I was considering parsing them to this effect:
in a dictionary, where the key is a tuple, and the value is a list:
{('ID=8688', 'IID=98889998'): ['ID=8688 IID=98889998 execute begin -
01.21.2007 status enabled', 'locked working.lock', 'status running',
'status complete']}
I am keeping the full text of each entry in the list so that I can
recreate them for display if need be.
I am fairly new to python, so could anyone offer any advice here
before I get too far and discover a fatal flaw that you might see
coming a mile away?
Would I, with this design, be able to, for example, search each list
for "waiting on ID=8688", and when found, associate that value with
one of the elements of its key, "ID=9009"? Or is this approach
flawed? I'm assuming there is a better way, but I need some advice...
I appreciate any thoughts.
Thanks.
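The tuple-keyed dictionary described above can be sketched as follows.
This is one possible plain-Python reading of the design, not a
recommendation from the thread. The names group_entries and
find_waiters are hypothetical, and the sketch assumes that every entry
starts with a line beginning with "ID=" followed by an "IID=" field.

```python
def group_entries(text):
    """Group log lines into {(ID, IID): [lines]} keyed by the header fields."""
    entries = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("ID="):
            # Header line: the first two fields are 'ID=...' and 'IID=...'.
            fields = line.split()
            current = (fields[0], fields[1])
            entries[current] = []
        if current is not None:
            entries[current].append(line)
    return entries


def find_waiters(entries, target_id):
    """Return the keys of entries that contain the line 'waiting on <target_id>'."""
    needle = "waiting on " + target_id
    return [key for key, lines in entries.items() if needle in lines]


log = """\
ID=8688 IID=98889998 execute begin - 01.21.2007 status enabled
locked working.lock
status running
status complete

ID=9009 IID=87234785 execute wait - 01.21.2007 status wait
waiting to lock
status wait
waiting on ID=8688
"""

entries = group_entries(log)
waiters = find_waiters(entries, "ID=8688")
print(waiters)
```

Each key in waiters is the (ID, IID) tuple of an entry that is waiting
on the given ID, so either element can be returned as needed, and the
stored lines still allow the original entry to be re-displayed.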
Re: log parser design question
On 28 Jan 2007 21:20:47 -0800, "Paul McGuire" <[EMAIL PROTECTED]>
wrote:
>On Jan 27, 10:43 pm, avidfan <[EMAIL PROTECTED]> wrote:
>> I need to parse a log file using python and I need some advice/wisdom
>> on the best way to go about it:
>>
>> The log file entries will consist of something like this:
>>
>> ID=8688 IID=98889998 execute begin - 01.21.2007 status enabled
>> locked working.lock
>> status running
>> status complete
>>
>> ID=9009 IID=87234785 execute wait - 01.21.2007 status wait
>> waiting to lock
>> status wait
>> waiting on ID=8688
>>
>> and so on...
>>
>For the parsing of this data, here is a pyparsing approach. Once
>parsed, the pyparsing ParseResults data structures can be massaged into
>a queryable list. See the examples at the end for accessing the
>individual parsed fields.
>
>-- Paul
>
>data = """
>ID=8688 IID=98889998 execute begin - 01.21.2007 status enabled
>locked working.lock
>status running
>status complete
>
>
>ID=9009 IID=87234785 execute wait - 01.21.2007 status wait
>waiting to lock
>status wait
>waiting on ID=8688
>
>"""
>from pyparsing import *
>
>integer=Word(nums)
>idref = "ID=" + integer.setResultsName("id")
>iidref = "IID=" + integer.setResultsName("iid")
>date = Regex(r"\d\d\.\d\d\.\d{4}")
>
>logLabel = Group("execute" + oneOf("begin wait"))
>logStatus = Group("status" + oneOf("enabled wait"))
>lockQual = Group("locked" + Word(alphanums+"."))
>waitingOnQual = Group("waiting on" + idref)
>statusQual = Group("status" + oneOf("running complete wait"))
>waitingToLockQual = Group(Literal("waiting to lock"))
>statusQualifier = statusQual | waitingOnQual | waitingToLockQual \
>    | lockQual
>logEntry = idref + iidref + logLabel.setResultsName("logtype") + "-" \
>+ date + logStatus.setResultsName("status") \
>+ ZeroOrMore(statusQualifier).setResultsName("quals")
>
>for tokens in logEntry.searchString(data):
>    print tokens
>    print tokens.dump()
>    print tokens.id
>    print tokens.iid
>    print tokens.status
>    print tokens.quals
>    print
>
>prints:
>
>['ID=', '8688', 'IID=', '98889998', ['execute', 'begin'], '-',
>'01.21.2007', ['status', 'enabled'], ['locked', 'working.lock'],
>['status', 'running'], ['status', 'complete']]
>['ID=', '8688', 'IID=', '98889998', ['execute', 'begin'], '-',
>'01.21.2007', ['status', 'enabled'], ['locked', 'working.lock'],
>['status', 'running'], ['status', 'complete']]
>- id: 8688
>- iid: 98889998
>- logtype: ['execute', 'begin']
>- quals: [['locked', 'working.lock'], ['status', 'running'],
>['status', 'complete']]
>- status: ['status', 'enabled']
>8688
>98889998
>['status', 'enabled']
>[['locked', 'working.lock'], ['status', 'running'], ['status',
>'complete']]
>
>['ID=', '9009', 'IID=', '87234785', ['execute', 'wait'], '-',
>'01.21.2007', ['status', 'wait'], ['waiting to lock'], ['status',
>'wait'], ['waiting on', 'ID=', '8688']]
>['ID=', '9009', 'IID=', '87234785', ['execute', 'wait'], '-',
>'01.21.2007', ['status', 'wait'], ['waiting to lock'], ['status',
>'wait'], ['waiting on', 'ID=', '8688']]
>- id: 9009
>- iid: 87234785
>- logtype: ['execute', 'wait']
>- quals: [['waiting to lock'], ['status', 'wait'], ['waiting on',
>'ID=', '8688']]
>- status: ['status', 'wait']
>9009
>87234785
>['status', 'wait']
>[['waiting to lock'], ['status', 'wait'], ['waiting on', 'ID=',
>'8688']]
Paul,
Thanks! That's a great module. I've been going through the docs and
it seems to do exactly what I need...
I appreciate your help!
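As a rough illustration of how the parsed fields might be "massaged
into a queryable list", the sketch below indexes records by ID and IID
and works out which entry is waiting on which. To keep it runnable with
the standard library alone, the records are copied by hand from the
printed output above rather than produced by pyparsing; the names
records, by_id, by_iid, waiting_on and blocked are hypothetical.

```python
# Records copied by hand from the dump() output above, as plain dicts.
records = [
    {"id": "8688", "iid": "98889998", "status": ["status", "enabled"],
     "quals": [["locked", "working.lock"], ["status", "running"],
               ["status", "complete"]]},
    {"id": "9009", "iid": "87234785", "status": ["status", "wait"],
     "quals": [["waiting to lock"], ["status", "wait"],
               ["waiting on", "ID=", "8688"]]},
]

# Index the records by ID and by IID for direct lookup.
by_id = {rec["id"]: rec for rec in records}
by_iid = {rec["iid"]: rec for rec in records}


def waiting_on(rec):
    """Return the IDs this record is waiting on, taken from its qualifiers."""
    return [qual[-1] for qual in rec["quals"] if qual[0] == "waiting on"]


# Map each waiting ID to the IDs it waits on.
blocked = {}
for rec in records:
    targets = waiting_on(rec)
    if targets:
        blocked[rec["id"]] = targets
print(blocked)
```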
Re: log parser design question
On Mon, 29 Jan 2007 23:11:32 -0600, avidfan <[EMAIL PROTECTED]> wrote:
>Paul,
>
>Thanks! That's a great module. I've been going through the docs and
>it seems to do exactly what I need...
>
>I appreciate your help!
http://www.camelrichard.org/roller/page/camelblog?entry=h3_parsing_log_files_with
Thanks, Paul!
