Hi,

This often happens when you write code: you hack, hack and hack around, then after a while you realize your code needs some cleanup. I'm at this point with the .ini parsing functions in all core packages of Openstack.
As we're supposed to be a team (even though I really feel too much alone doing the heavy work...), I'd like to discuss the problem publicly.

When Roland was doing some packaging work, we had discussions on how to parse .ini files inside Openstack. I first wrote a very ugly shell script function of about 150 lines, which is completely lame for such a small task. Then he decided (rightly) that it was crap, and tried rewriting it in python. The current code in openstack-pkg-tools is from him, and looks like this:

    python -c "import configobj
    config=configobj.ConfigObj('${FILE}')
    config['${SECTION}']['${DIRECTIVE}']='${VALUE}'
    config.write()"

Unfortunately, the Openstack .ini files aren't really ini files, and can have entries like this:

    [composite:ec2]
    use = egg:Paste#urlmap
    /services/Cloud: ec2cloud

As a consequence, the class configobj which is in the minimal default python installation in Debian crashes. Roland then decided to patch the nova api-paste.ini to fix the problem, and transformed some of the directives this way:

    diff -u etc/nova/api-paste.ini.orig etc/nova/api-paste.ini:
    [...]
    -/: meta
    +/ = meta
    [...]

This somehow worked, but what none of us saw is that the Essex version of some Openstack packages, currently in Wheezy, does contain some non-standard .ini files. As a consequence, the upgrade crashes, as per this bug:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700620

Moreover, in Openstack Grizzly, which will be out soon, there are even more URL-style directives like the one above, so we have even more problems.

So, I'm at a point where the python code using configobj is not a viable solution. I didn't like using it and patching nova api-paste.ini (I'm not trying to put the blame on Roland, but it wasn't my idea), and now I'm even more convinced it's not the solution to our problem.

In Grizzly, there's a new python module called python-oslo.config. It's very easy to use, and knows how to parse these files.
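To make the two directive styles shown above concrete, here is a minimal sketch of splitting one line into its name and value, whichever separator it uses. The helper name ini_split_line and the "name|value" output format are made up for this illustration and are not part of openstack-pkg-tools; it also assumes that section headers and comments start in the first column.

```shell
#!/bin/sh
# Hypothetical helper, not part of openstack-pkg-tools: print "name|value" for
# one directive line, whichever of the two separators ("=" or ":") it uses.
ini_split_line () {
    case "$1" in
        \[*|\#*|"") return 1 ;;    # section header, comment or empty line
    esac
    # The name stops at the first blank, "=" or ":"; the value is the rest of
    # the line, so a ":" inside the value (egg:Paste#urlmap) is left alone.
    printf '%s\n' "$1" | sed -n \
        's/^[[:space:]]*\([^=:[:space:]]\{1,\}\)[[:space:]]*[=:][[:space:]]*\(.*\)$/\1|\2/p'
}

ini_split_line 'use = egg:Paste#urlmap'      # prints: use|egg:Paste#urlmap
ini_split_line '/services/Cloud: ec2cloud'   # prints: /services/Cloud|ec2cloud
ini_split_line '/: meta'                     # prints: /|meta
```

A parser that only accepts "=" has nothing to split on in the last two lines, which is why those entries are the ones that break.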
The problem is that using a python module means adding a Pre-Depends, which breaks the flow of Debconf. Indeed, here's what it does currently:

# apt-get install openstack-proxy-node
1/ some downloads
2/ Absolutely all debconf questions
3/ Installation of all packages

But if I add a Pre-Depends: python-oslo.config, it will go like this:

# apt-get install openstack-proxy-node
1/ some downloads
2/ some debconf questions, for example mysqld password
3/ some install with dpkg, including python-oslo.config
4/ some more questions for the packages that have the Pre-Depends:
5/ some install with dpkg of the packages that are not yet installed

I don't want the 2nd version. Openstack operators all want to answer all questions, then go to the coffee machine and rest 10 minutes during the setup... ;)

So, I continue to think that everything should be done in shell script. Not the way I wrote it the first time though. The goal is to have the same implementation as this:

https://github.com/openstack/oslo-config/blob/master/oslo/config/iniparser.py

So, I wrote 2 versions: one in pure sh, and one using awk. Both use regular expressions, so they are easy to fix if the config files were to evolve in format (which is most probable seeing the history of nova.conf: from flag file, to config file, to .ini format, to what it is now ...).

I have attached both versions, and I would like you, my fellow Openstack Debian packaging team members, to comment on the implementation. Which of the 2 seems the most maintainable, easy to understand, and above all the most correct and bug free?

* Note that I'm aware that the awk version doesn't do all of what is described before the function, and would need some more work.
* Would any of you use a totally different approach to the problem?

For the moment, I would lean toward the grep + sed version, as I find it easier to read and understand, plus there's a unique parsing logic for both get and set, which is nicer.
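For those who want the idea before opening the attachments, here is a minimal sketch of the per-line classification that a regular-expression based parser relies on: each line is a section header, a comment, a directive, or something else. The function name ini_line_kind and the words it prints are assumptions made for this example only; they appear in neither attachment.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the attachments: print the kind of one
# configuration line, using one regular expression per kind.
ini_line_kind () {
    if printf '%s\n' "$1" | grep -q '^[[:space:]]*\[.*][[:space:]]*$' ; then
        echo section       # [DEFAULT] or [composite:ec2]
    elif printf '%s\n' "$1" | grep -q '^[[:space:]]*#' ; then
        echo comment       # anything starting with #
    elif printf '%s\n' "$1" | grep -q '^[[:space:]]*[._/a-zA-Z0-9]\{1,\}[[:space:]]*[=:]' ; then
        echo directive     # name = value, or /path: value
    else
        echo other         # empty line, continuation, ...
    fi
}

ini_line_kind '[composite:ec2]'             # prints: section
ini_line_kind '/services/Cloud: ec2cloud'   # prints: directive
```

If the file format changes again, only the expression for the affected kind of line has to be touched, which is the maintainability argument made above.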
If you have suggestions on how to improve this, or how to do it better in a totally different way, please do explain how you would do it and share some code.

*** WARNING ***

With these functions, I'm breaking the current openstack-pkg-os API, so I will have to edit all core projects (nova, cinder, keystone, glance, quantum, ceilometer and heat). I do *expect some breakage* doing it, since I'm also changing the order of the parameters in the calls, and removing some parameters which I think are bad. All this is error prone. So if you need to get a working snapshot of Folsom from my FTP, please do it right now, before I start uploading new versions. As for the Grizzly release G3, it's currently broken anyway until all this work is done (so please be patient and wait for me to fix cinder, quantum and nova in Grizzly, which are currently failing at the postinst stage).

Thanks already if you read all the above, and even more to all those who will take the time to voice an opinion and contribute a better implementation.

Cheers,

Thomas
#!/bin/sh

# Params: $1 = get or set (get reads a directive and returns its value, set writes a new value in it)
#         $2 = config file path (example: /etc/nova/nova.conf)
#         $3 = .ini file section (example: DEFAULT)
#         $4 = directive name (example: sql_connection)
#         $5 = only present in "set" mode: new value to write in the .ini file
# Note that if $3 = NO_SECTION, then get or set mode will not manage sections at all,
# and will totally ignore the sections in the .ini file.
#
# Example (get the value of keystone hostname in your Nova box):
# pkgos_inifile get /etc/nova/api-paste.ini filter:authtoken auth_host
#
# Returns: $RET: either NOT_FOUND, or the (previous, in the case of set) value of the searched directive
pkgos_inifile () {
    local ACCESS_MODE MYCONFIG SEARCH_SECTION SEARCH_DIRECTIVE NEW_VALUE TMPFILE AWK_MATCH
    ACCESS_MODE=${1} ; MYCONFIG=${2} ; SEARCH_SECTION=${3} ; SEARCH_DIRECTIVE=${4}
    if [ "x${ACCESS_MODE}" = "xset" ] ; then NEW_VALUE=${5} ; else NEW_VALUE="pkgos_inifile_function_called_with_wrong_access_mode" ; fi

    RET=NOT_FOUND
    if [ ! -r "${MYCONFIG}" ] ; then return ; fi

    # awk rules shared by get and set: track the current section, and set "hit"
    # when the current line is the searched directive in the searched section.
    AWK_MATCH='
        BEGIN { if (section == "NO_SECTION") { no_section = 1 ; in_section = 1 } }
        { hit = 0 }
        !no_section && /^[ \t]*\[.*\][ \t]*$/ {
            name = $0 ; sub(/^[ \t]*\[/, "", name) ; sub(/\][ \t]*$/, "", name)
            in_section = (name == section)
        }
        in_section && match($0, /^[ \t]*[^ \t=:]+[ \t]*[=:][ \t]*/) {
            name = substr($0, 1, RLENGTH) ; gsub(/[ \t=:]/, "", name)
            hit = (name == directive)
        }'

    # Both modes start by reading the current value: the first match wins.
    RET=$(awk -v section="${SEARCH_SECTION}" -v directive="${SEARCH_DIRECTIVE}" "${AWK_MATCH}"'
        hit {
            value = substr($0, RLENGTH + 1) ; sub(/[ \t]+$/, "", value)
            print value ; found = 1 ; exit
        }
        END { if (!found) print "NOT_FOUND" }' "${MYCONFIG}")

    if [ "x${ACCESS_MODE}" = "xset" ] && [ "${RET}" != "NOT_FOUND" ] ; then
        TMPFILE=$(mktemp -t pkgos_inifile.XXXXXX)
        # Rewrite only the first matching line, keeping its separator (= or :).
        # Caveat: awk -v interprets backslash escapes in the new value.
        awk -v section="${SEARCH_SECTION}" -v directive="${SEARCH_DIRECTIVE}" -v value="${NEW_VALUE}" "${AWK_MATCH}"'
            hit && !done {
                sep = (substr($0, 1, RLENGTH) ~ /=/) ? " = " : ": "
                print directive sep value ; done = 1 ; next
            }
            { print }' "${MYCONFIG}" > "${TMPFILE}" && cat "${TMPFILE}" > "${MYCONFIG}"
        rm -f "${TMPFILE}"
    fi
}
#!/bin/sh

set -e

# Params: $1 = get or set (get reads a directive and returns its value, set writes a new value in it)
#         $2 = config file path (example: /etc/nova/nova.conf)
#         $3 = .ini file section (example: DEFAULT)
#         $4 = directive name (example: sql_connection)
#         $5 = only present in "set" mode: new value to write in the .ini file
# Note that if $3 = NO_SECTION, then get or set mode will not manage sections at all,
# and will totally ignore the sections in the .ini file.
#
# Example (get the value of keystone hostname in your Nova box):
# pkgos_inifile get /etc/nova/api-paste.ini filter:authtoken auth_host
#
# Returns: $RET: either NOT_FOUND, or the (previous, in the case of set) value of the searched directive
pkgos_inifile () {
    local ACCESS_MODE MYCONFIG SEARCH_SECTION SEARCH_DIRECTIVE NEW_VALUE CNT CUR_SECTION LINE DIRECTIVE DIRECTIVE_TYPE VALUE ESC_DIRECTIVE ESC_VALUE
    ACCESS_MODE=${1} ; MYCONFIG=${2} ; SEARCH_SECTION=${3} ; SEARCH_DIRECTIVE=${4}
    if [ "x${ACCESS_MODE}" = "xset" ] ; then NEW_VALUE=${5} ; else NEW_VALUE="pkgos_inifile_function_called_with_wrong_access_mode" ; fi

    CUR_SECTION=NO_SECTION
    CNT=0
    RET=NOT_FOUND
    if [ ! -r "${MYCONFIG}" ] ; then return ; fi

    # Iterate through all lines of the file. The loop reads from a redirection
    # rather than from "cat | while", which would run it in a subshell and
    # lose both RET and the effect of "return".
    while IFS= read -r LINE ; do
        CNT=$((CNT + 1))
        # This is a section block: [DEFAULT]
        if printf '%s\n' "${LINE}" | grep -q '^[[:space:]]*\[.*][[:space:]]*$' ; then
            # In NO_SECTION mode, CUR_SECTION stays NO_SECTION and always matches
            if [ "${SEARCH_SECTION}" != "NO_SECTION" ] ; then
                CUR_SECTION=$(printf '%s\n' "${LINE}" | sed -e 's/^[[:space:]]*\[//' -e 's/][[:space:]]*$//')
            fi
        # This is a directive: auth_host = 127.0.0.1
        elif printf '%s\n' "${LINE}" | grep -q '^[[:space:]]*[._/a-zA-Z0-9]*[[:space:]]*[=:]' ; then
            # This is a directive which uses the equal sign (directive = value)
            if printf '%s\n' "${LINE}" | grep -q '^[[:space:]]*[._/a-zA-Z0-9]*[[:space:]]*=' ; then
                DIRECTIVE=$(printf '%s\n' "${LINE}" | cut -d= -f1 | awk '{print $1}')
                DIRECTIVE_TYPE="equal"
                # -f2- keeps values which themselves contain an equal sign
                VALUE=$(printf '%s\n' "${LINE}" | cut -d= -f2- | sed -e 's/^[[:space:]]*//')
            # This one uses the colon sign (/directive: value)
            else
                DIRECTIVE=$(printf '%s\n' "${LINE}" | cut -d: -f1 | awk '{print $1}')
                DIRECTIVE_TYPE="colon"
                VALUE=$(printf '%s\n' "${LINE}" | cut -d: -f2- | sed -e 's/^[[:space:]]*//')
            fi
            if [ "${CUR_SECTION}" = "${SEARCH_SECTION}" ] && [ "${DIRECTIVE}" = "${SEARCH_DIRECTIVE}" ] ; then
                RET=${VALUE}
                if [ "x${ACCESS_MODE}" = "xset" ] ; then
                    # Escape \, / and & so that sed takes them literally
                    ESC_DIRECTIVE=$(printf '%s\n' "${DIRECTIVE}" | sed -e 's|[\\/&]|\\&|g')
                    ESC_VALUE=$(printf '%s\n' "${NEW_VALUE}" | sed -e 's|[\\/&]|\\&|g')
                    if [ "${DIRECTIVE_TYPE}" = "equal" ] ; then
                        sed -i "${CNT}s/.*/${ESC_DIRECTIVE} = ${ESC_VALUE}/" "${MYCONFIG}"
                    else
                        sed -i "${CNT}s/.*/${ESC_DIRECTIVE}: ${ESC_VALUE}/" "${MYCONFIG}"
                    fi
                fi
                return
            fi
        # This is a comment block
        elif printf '%s\n' "${LINE}" | grep -q '^[[:space:]]*#' ; then
            :
        fi
    done < "${MYCONFIG}"
}