[PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev.
When creating a Dwarf_Abbrev in dwarf_getabbrev (__libdw_getabbrev) we already check it is fully readable from the .debug_abbrev section. So whenever we reread it later using the attrp pointer we don't have to check it again. Introduce get_uleb128_unchecked to use for ulebs we know are safe to read directly. Signed-off-by: Mark Wielaard --- libdw/ChangeLog | 10 ++ libdw/dwarf_child.c | 19 +-- libdw/dwarf_getabbrevattr.c | 10 +- libdw/dwarf_getattrs.c | 20 +--- libdw/dwarf_hasattr.c | 22 +- libdw/memory-access.h | 18 ++ 6 files changed, 48 insertions(+), 51 deletions(-) diff --git a/libdw/ChangeLog b/libdw/ChangeLog index eb1cb709..3e3b7169 100644 --- a/libdw/ChangeLog +++ b/libdw/ChangeLog @@ -1,3 +1,13 @@ +2017-12-26 Mark Wielaard + + * memory-access.h (__libdw_get_uleb128_unchecked): New function. + (get_uleb128_unchecked): New define. + * dwarf_child.c (__libdw_find_attr): Use get_uleb128_unchecked to + read attr name and form. + * dwarf_getabbrevattr.c (dwarf_getabbrevattr): Likewise. + * dwarf_getattrs.c (dwarf_getattrs): Likewise. + * dwarf_hasattr.c (dwarf_hasattr): Likewise. + 2017-05-09 Ulf Hermann Mark Wielaard diff --git a/libdw/dwarf_child.c b/libdw/dwarf_child.c index cc95fb3f..248338eb 100644 --- a/libdw/dwarf_child.c +++ b/libdw/dwarf_child.c @@ -1,5 +1,5 @@ /* Return child of current DIE. - Copyright (C) 2003-2011, 2014 Red Hat, Inc. + Copyright (C) 2003-2011, 2014, 2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2003. @@ -43,36 +43,27 @@ internal_function __libdw_find_attr (Dwarf_Die *die, unsigned int search_name, unsigned int *codep, unsigned int *formp) { - Dwarf *dbg = die->cu->dbg; const unsigned char *readp; /* Find the abbreviation entry. */ Dwarf_Abbrev *abbrevp = __libdw_dieabbrev (die, &readp); if (unlikely (abbrevp == DWARF_END_ABBREV)) { -invalid_dwarf: __libdw_seterrno (DWARF_E_INVALID_DWARF); return NULL; } - /* Search the name attribute. */ - unsigned char *const endp -= ((unsigned char *) dbg->sectiondata[IDX_debug_abbrev]->d_buf - + dbg->sectiondata[IDX_debug_abbrev]->d_size); - + /* Search the name attribute. Attribute has been checked when + Dwarf_Abbrev was created, we can read unchecked. */ const unsigned char *attrp = abbrevp->attrp; while (1) { /* Get attribute name and form. */ - if (unlikely (attrp >= endp)) - goto invalid_dwarf; unsigned int attr_name; - get_uleb128 (attr_name, attrp, endp); + get_uleb128_unchecked (attr_name, attrp); - if (unlikely (attrp >= endp)) - goto invalid_dwarf; unsigned int attr_form; - get_uleb128 (attr_form, attrp, endp); + get_uleb128_unchecked (attr_form, attrp); /* We can stop if we found the attribute with value zero. */ if (attr_name == 0 && attr_form == 0) diff --git a/libdw/dwarf_getabbrevattr.c b/libdw/dwarf_getabbrevattr.c index 3b4da99c..57fe3637 100644 --- a/libdw/dwarf_getabbrevattr.c +++ b/libdw/dwarf_getabbrevattr.c @@ -1,5 +1,5 @@ /* Get specific attribute of abbreviation. - Copyright (C) 2003, 2004, 2005, 2014 Red Hat, Inc. + Copyright (C) 2003, 2004, 2005, 2014, 2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2003. @@ -53,10 +53,10 @@ dwarf_getabbrevattr (Dwarf_Abbrev *abbrev, size_t idx, unsigned int *namep, { start_attrp = attrp; - /* Attribute code and form are encoded as ULEB128 values.i - XXX We have no way to bounds check. */ - get_uleb128 (name, attrp, attrp + len_leb128 (name)); - get_uleb128 (form, attrp, attrp + len_leb128 (form)); + /* Attribute code and form are encoded as ULEB128 values. + Already checked when Dwarf_Abbrev was created, read unchecked. */ + get_uleb128_unchecked (name, attrp); + get_uleb128_unchecked (form, attrp); /* If both values are zero the index is out of range. */ if (name == 0 && form == 0) diff --git a/libdw/dwarf_getattrs.c b/libdw/dwarf_getattrs.c index 0da8b5ba..7f55fafc 100644 --- a/libdw/dwarf_getattrs.c +++ b/libdw/dwarf_getattrs.c @@ -1,5 +1,5 @@ /* Get attributes of the DIE. - Copyright (C) 2004, 2005, 2008, 2009, 2014 Red Hat, Inc. + Copyright (C) 2004, 2005, 2008, 2009, 2014, 2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2004. @@ -51,7 +51,6 @@ dwarf_getattrs (Dwarf_Die *die, int (*callback) (Dwarf_Attribute *, void *), if (unlikely (abbrevp == DWARF_END_ABBREV)) { -invalid_dwarf: __libdw_seterrno (DWARF_E_INVALID_DWARF); return -1l; } @@ -61,24 +60,15 @@ dwarf_getattrs (Dwarf_Die *die, int (*callback) (Dwarf_Attribute *, void *), const unsigned char *
[PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.
If we don't cache the attrcnt and use bitfields for the has_children and code we can reduce the size of struct Dwarf Abbrev from 32 to 24 bytes on 64bit architectures and from 28 to 20 bytes on 32bit arches. Signed-off-by: Mark Wielaard --- libdw/ChangeLog | 7 +++ libdw/dwarf_getabbrev.c | 7 +++ libdw/dwarf_getattrcnt.c | 19 +-- libdw/libdwP.h | 15 +++ 4 files changed, 34 insertions(+), 14 deletions(-) diff --git a/libdw/ChangeLog b/libdw/ChangeLog index 3e3b7169..c284dbbd 100644 --- a/libdw/ChangeLog +++ b/libdw/ChangeLog @@ -1,3 +1,10 @@ +2017-12-26 Mark Wielaard + + * libdwP.h (struct Dwarf_Abbrev): Pack struct. Remove attrcnt, + use bitfields for has_children and code. + * dwarf_getabbrev.c (__libdw_getabbrev): Don't count attrs. + * dwarf_getattrcnt.c (dwarf_getattrcnt): Count attrs. + 2017-12-26 Mark Wielaard * memory-access.h (__libdw_get_uleb128_unchecked): New function. diff --git a/libdw/dwarf_getabbrev.c b/libdw/dwarf_getabbrev.c index ef51b847..a3a68b37 100644 --- a/libdw/dwarf_getabbrev.c +++ b/libdw/dwarf_getabbrev.c @@ -1,5 +1,5 @@ /* Get abbreviation at given offset. - Copyright (C) 2003, 2004, 2005, 2006, 2014 Red Hat, Inc. + Copyright (C) 2003, 2004, 2005, 2006, 2014, 2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2003. @@ -121,8 +121,7 @@ __libdw_getabbrev (Dwarf *dbg, struct Dwarf_CU *cu, Dwarf_Off offset, abb->attrp = (unsigned char *) abbrevp; abb->offset = offset; - /* Skip over all the attributes and count them while doing so. */ - abb->attrcnt = 0; + /* Skip over all the attributes and check rest of the abbrev is valid. */ unsigned int attrname; unsigned int attrform; do @@ -134,7 +133,7 @@ __libdw_getabbrev (Dwarf *dbg, struct Dwarf_CU *cu, Dwarf_Off offset, goto invalid; get_uleb128 (attrform, abbrevp, end); } - while (attrname != 0 && attrform != 0 && ++abb->attrcnt); + while (attrname != 0 && attrform != 0); /* Return the length to the caller if she asked for it. */ if (lengthp != NULL) diff --git a/libdw/dwarf_getattrcnt.c b/libdw/dwarf_getattrcnt.c index 2bfb4ac5..a05976d4 100644 --- a/libdw/dwarf_getattrcnt.c +++ b/libdw/dwarf_getattrcnt.c @@ -1,5 +1,5 @@ /* Get number of attributes of abbreviation. - Copyright (C) 2003, 2004 Red Hat, Inc. + Copyright (C) 2003, 2004, 2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2003. @@ -40,7 +40,22 @@ dwarf_getattrcnt (Dwarf_Abbrev *abbrev, size_t *attrcntp) if (abbrev == NULL) return -1; - *attrcntp = abbrev->attrcnt; + const unsigned char *abbrevp = abbrev->attrp; + + /* Skip over all the attributes and count them while doing so. */ + int attrcnt = 0; + unsigned int attrname; + unsigned int attrform; + do +{ + /* We can use unchecked since they were checked when the Dwrf_Abbrev +was created. */ + get_uleb128_unchecked (attrname, abbrevp); + get_uleb128_unchecked (attrform, abbrevp); +} + while (attrname != 0 && attrform != 0 && ++attrcnt); + + *attrcntp = attrcnt; return 0; } diff --git a/libdw/libdwP.h b/libdw/libdwP.h index 82b47d09..fbb8ed08 100644 --- a/libdw/libdwP.h +++ b/libdw/libdwP.h @@ -1,5 +1,5 @@ /* Internal definitions for libdwarf. - Copyright (C) 2002-2011, 2013-2016 Red Hat, Inc. + Copyright (C) 2002-2011, 2013-2017 Red Hat, Inc. This file is part of elfutils. Written by Ulrich Drepper , 2002. @@ -212,13 +212,12 @@ struct Dwarf /* Abbreviation representation. */ struct Dwarf_Abbrev { - Dwarf_Off offset; - unsigned char *attrp; - unsigned int attrcnt; - unsigned int code; - unsigned int tag; - bool has_children; -}; + Dwarf_Off offset; /* Offset to start of abbrev into .debug_abbrev. */ + unsigned char *attrp; /* Pointer to start of attribute name/form pairs. */ + bool has_children : 1; /* Whether or not the DIE has children. */ + unsigned int code : 31; /* The (unique) abbrev code. */ + unsigned int tag; /* The tag of the DIE. */ +} attribute_packed; #include "dwarf_abbrev_hash.h" -- 2.14.3
Simpler abbrev parsing using less memory
Hi, When we added bounds checking to almost all data reading functions (commit 7a05347 libdw: Add get_uleb128 and get_sleb128 bounds checking) we also added extra checks to the abbrev reading. But since we didn't really have bounds for the "raw" Dwarf_Abbrev reading functions we just "guessed" the maximum of a uleb128. This wasn't really correct and not really needed. A struct Dwarf_Abbrev can only be created by __libdw_getabbrev, which checks the whole abbrev (code, tag, children and attribute names/forms) is valid already. So whenever we use the attrp pointer from the Dwarf_Abbrev to read the name/forms we already know they are in the .debug_abbrev bounds). [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev. So the first patch introduces a get_uleb128_unchecked function that is used for re-reading such uleb128 values. The second patch reduces the size of the struct Dwarf_Abbrev by not storing the attrcnt and by using bitfields for has_children and code. [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev. The attrcnt was only used by the dwarf_getattrcnt function. Which is only used in one testcase. And which seems mostly unnecessary for real programs. The function now explicitly counts the attrs instead of using a cached value. The combined patches very slightly reduces the time for parsing abbrevs and make the abbrev cache somewhat smaller. On my machine eu-readelf -N --debug-dump=info libstdc++.so.debug goes down from 1.79 to 1.71 secs. And max rss goes down from 15.296 to 14.684 kbytes. Cheers, Mark