gcc - Detecting if a C++ class is abstract purely on DWARF information - Stack Overflow


Context

I'm creating a DWARF parser that must be able to detect which classes are abstract. The parser is built to parse the output of one specific code base. I have a working method that handles about 90% of the cases, but I want to reach full coverage of that code base. To show what the final 10% is about, I created a minimal, reproducible example.

Based on the C++ definition, we have the following statements:

  • An abstract class cannot be instantiated by itself, but can be used as a base class.
  • An abstract class is a class that either defines or inherits at least one function for which the final overrider is pure virtual.

Setup

In my example I have the following:

  • Class Interface which is abstract, and only has pure virtual functions
  • Class Partial which is abstract, because it derives from an abstract class and does not override the function bFunc(), so the final overrider of bFunc() is still pure virtual.
  • Class Full which is concrete (i.e. not an abstract class), as the final overriders of both functions are not pure virtual.

The above classes are used as interfaces. I furthermore define two classes that I use to create objects:

  • Class Foo, based on the Partial class
  • Class Bar, based on the Full class

The code is shown below:

// Pure virtual abstract class, used as an interface
class Interface {
  public:
    virtual void aFunc() = 0;
    virtual void bFunc() = 0;
};

// Still abstract Interface
class Partial: public Interface {
  public:
    void aFunc() override {}
};

// Concrete Interface
class Full: public Interface {
  public:
    void aFunc() override {}
    void bFunc() override {}
};

// Concrete
class Foo : public Partial {
  public:
    void fooFunc(){ }
    void bFunc() override {}
};

// Concrete
class Bar : public Full {
  public:
    void barFunc(){ }
};

Foo      inst_foo;
Bar      inst_bar;
Partial* inter_foo = &inst_foo;
Full*    inter_bar = &inst_bar;


// Main
int main() {

    return 0;
}
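
As a cross-check of the classification above, the expected result for each class can be asserted at compile time with std::is_abstract from <type_traits>. This is only a sketch: it repeats the class definitions (without fooFunc() and barFunc(), which do not affect abstractness) so that it compiles on its own, and it gives the ground truth that a DWARF-based detector should reproduce.

```cpp
#include <type_traits>

class Interface {
  public:
    virtual void aFunc() = 0;
    virtual void bFunc() = 0;
};

class Partial : public Interface {
  public:
    void aFunc() override {}
};

class Full : public Interface {
  public:
    void aFunc() override {}
    void bFunc() override {}
};

class Foo : public Partial {
  public:
    void bFunc() override {}
};

class Bar : public Full {};

// The compiler knows the final overriders, so it can answer directly.
static_assert(std::is_abstract<Interface>::value, "Interface is abstract");
static_assert(std::is_abstract<Partial>::value, "Partial is abstract");
static_assert(!std::is_abstract<Full>::value, "Full is concrete");
static_assert(!std::is_abstract<Foo>::value, "Foo is concrete");
static_assert(!std::is_abstract<Bar>::value, "Bar is concrete");
```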

Class diagram

        ┌─────────┐      
        │Interface│      
        └────▲────┘      
             │           
      ┌──────┴─────┐     
  ┌───┴───┐    ┌───┴──┐  
  │Partial│    │ Full │  
  └───▲───┘    └───▲──┘  
      │            │     
   ┌──┴──┐      ┌──┴──┐  
   │ Foo │      │ Bar │  
   └─────┘      └─────┘  

Problem

Now let us examine the DWARF debug_info dump. I've separated the class descriptions from the variable descriptions. The full example can be found at:

Class-info

<1><33>: Abbrev Number: 15 (DW_TAG_class_type)
    <34>   DW_AT_name        : (string) Foo
    <38>   DW_AT_byte_size   : (implicit_const) 8
    <38>   DW_AT_decl_file   : (implicit_const) 1
    <38>   DW_AT_decl_line   : (data1) 22
    <39>   DW_AT_decl_column : (implicit_const) 7
    <39>   DW_AT_containing_type: (ref4) <0x348>
    <3d>   DW_AT_sibling     : (ref4) <0xe4>
 <2><41>: Abbrev Number: 9 (DW_TAG_inheritance)
    <42>   DW_AT_type        : (ref4) <0x1a7>, Partial
    <46>   DW_AT_data_member_location: (implicit_const) 0
    <46>   DW_AT_accessibility: (implicit_const) 1  (public)
...
<1><fd>: Abbrev Number: 15 (DW_TAG_class_type)
    <fe>   DW_AT_name        : (string) Bar
    <102>   DW_AT_byte_size   : (implicit_const) 8
    <102>   DW_AT_decl_file   : (implicit_const) 1
    <102>   DW_AT_decl_line   : (data1) 29
    <103>   DW_AT_decl_column : (implicit_const) 7
    <103>   DW_AT_containing_type: (ref4) <0x348>
    <107>   DW_AT_sibling     : (ref4) <0x18e>
 <2><10b>: Abbrev Number: 9 (DW_TAG_inheritance)
    <10c>   DW_AT_type        : (ref4) <0x260>, Full
    <110>   DW_AT_data_member_location: (implicit_const) 0
    <110>   DW_AT_accessibility: (implicit_const) 1 (public)
...
<1><1a7>: Abbrev Number: 13 (DW_TAG_class_type)
    <1a8>   DW_AT_name        : (strp) (offset: 0xa5d): Partial
    <1ac>   DW_AT_byte_size   : (implicit_const) 8
    <1ac>   DW_AT_decl_file   : (implicit_const) 1
    <1ac>   DW_AT_decl_line   : (data1) 9
    <1ad>   DW_AT_decl_column : (implicit_const) 7
    <1ad>   DW_AT_containing_type: (ref4) <0x348>
    <1b1>   DW_AT_sibling     : (ref4) <0x23d>
 <2><1b5>: Abbrev Number: 9 (DW_TAG_inheritance)
    <1b6>   DW_AT_type        : (ref4) <0x348>, Interface
    <1ba>   DW_AT_data_member_location: (implicit_const) 0
    <1ba>   DW_AT_accessibility: (implicit_const) 1 (public)
...
<1><260>: Abbrev Number: 13 (DW_TAG_class_type)
    <261>   DW_AT_name        : (strp) (offset: 0x21e3): Full
    <265>   DW_AT_byte_size   : (implicit_const) 8
    <265>   DW_AT_decl_file   : (implicit_const) 1
    <265>   DW_AT_decl_line   : (data1) 15
    <266>   DW_AT_decl_column : (implicit_const) 7
    <266>   DW_AT_containing_type: (ref4) <0x348>
    <26a>   DW_AT_sibling     : (ref4) <0x316>
 <2><26e>: Abbrev Number: 9 (DW_TAG_inheritance)
    <26f>   DW_AT_type        : (ref4) <0x348>, Interface
    <273>   DW_AT_data_member_location: (implicit_const) 0
    <273>   DW_AT_accessibility: (implicit_const) 1 (public)
...
<1><348>: Abbrev Number: 13 (DW_TAG_class_type)
    <349>   DW_AT_name        : (strp) (offset: 0x332d): Interface
    <34d>   DW_AT_byte_size   : (implicit_const) 8
    <34d>   DW_AT_decl_file   : (implicit_const) 1
    <34d>   DW_AT_decl_line   : (data1) 2
    <34e>   DW_AT_decl_column : (implicit_const) 7
    <34e>   DW_AT_containing_type: (ref4) <0x348>
    <352>   DW_AT_sibling     : (ref4) <0x404>

Variable-info

<1><e9>: Abbrev Number: 11 (DW_TAG_variable)
    <ea>   DW_AT_name        : (strp) (offset: 0x2e8c): inst_foo
    <ee>   DW_AT_decl_file   : (implicit_const) 1
    <ee>   DW_AT_decl_line   : (data1) 34
    <ef>   DW_AT_decl_column : (implicit_const) 10
    <ef>   DW_AT_type        : (ref4) <0x33>, Foo
    <f3>   DW_AT_external    : (flag_present) 1
    <f3>   DW_AT_location    : (exprloc) 9 byte block: 3 0 0 0 0 0 0 0 0    (DW_OP_addr: 0)
...
<1><193>: Abbrev Number: 11 (DW_TAG_variable)
    <194>   DW_AT_name        : (strp) (offset: 0xfe3): inst_bar
    <198>   DW_AT_decl_file   : (implicit_const) 1
    <198>   DW_AT_decl_line   : (data1) 35
    <199>   DW_AT_decl_column : (implicit_const) 10
    <199>   DW_AT_type        : (ref4) <0xfd>, Bar
    <19d>   DW_AT_external    : (flag_present) 1
    <19d>   DW_AT_location    : (exprloc) 9 byte block: 3 8 0 0 0 0 0 0 0   (DW_OP_addr: 8)
...
<1><242>: Abbrev Number: 11 (DW_TAG_variable)
    <243>   DW_AT_name        : (strp) (offset: 0x1eda): inter_foo
    <247>   DW_AT_decl_file   : (implicit_const) 1
    <247>   DW_AT_decl_line   : (data1) 36
    <248>   DW_AT_decl_column : (implicit_const) 10
    <248>   DW_AT_type        : (ref4) <0x256>
    <24c>   DW_AT_external    : (flag_present) 1
    <24c>   DW_AT_location    : (exprloc) 9 byte block: 3 10 0 0 0 0 0 0 0  (DW_OP_addr: 10)
<1><256>: Abbrev Number: 6 (DW_TAG_pointer_type)
    <257>   DW_AT_byte_size   : (implicit_const) 8
    <257>   DW_AT_type        : (ref4) <0x1a7>, Partial
...
<1><31b>: Abbrev Number: 11 (DW_TAG_variable)
    <31c>   DW_AT_name        : (strp) (offset: 0x73b): inter_bar
    <320>   DW_AT_decl_file   : (implicit_const) 1
    <320>   DW_AT_decl_line   : (data1) 37
    <321>   DW_AT_decl_column : (implicit_const) 10
    <321>   DW_AT_type        : (ref4) <0x32f>
    <325>   DW_AT_external    : (flag_present) 1
    <325>   DW_AT_location    : (exprloc) 9 byte block: 3 18 0 0 0 0 0 0 0  (DW_OP_addr: 18)
<1><32f>: Abbrev Number: 6 (DW_TAG_pointer_type)
    <330>   DW_AT_byte_size   : (implicit_const) 8
    <330>   DW_AT_type        : (ref4) <0x260>, Full

Analysis

To check whether a class is abstract, I currently test whether the DW_AT_containing_type attribute of the class refers to the class itself. This is a cheap method that works for 90% of the code base the parser is intended for. In the example above it works for the Interface class, as its DW_AT_containing_type has the value 0x348 and its class definition starts at offset 0x348.
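
The self-reference check can be sketched as follows. ClassDie and containingTypeIsSelf are hypothetical names for a simplified view of a DW_TAG_class_type entry; they are not part of any real DWARF library.

```cpp
#include <cstdint>

// Hypothetical, simplified view of a DW_TAG_class_type DIE (assumption).
struct ClassDie {
    std::uint64_t offset;           // offset of the DIE itself, e.g. 0x348
    bool has_containing_type;       // whether DW_AT_containing_type is present
    std::uint64_t containing_type;  // DIE offset that DW_AT_containing_type refers to
};

// Heuristic only: true when DW_AT_containing_type points back at the DIE itself.
constexpr bool containingTypeIsSelf(const ClassDie& die) {
    return die.has_containing_type && die.containing_type == die.offset;
}
```

With the values from the dump, this is true for Interface (offset 0x348, containing type 0x348) and false for Partial (offset 0x1a7, containing type 0x348). Every class in the dump points at 0x348, so the attribute appears to identify the class at the root of the virtual hierarchy rather than abstractness itself. Reading a self-reference as "abstract" is therefore an assumption that happens to hold for Interface.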

This doesn't work for the classes Partial and Full. Looking purely at the DWARF description of the class itself, there is no distinction between Partial, which is abstract, and Full, which isn't.

The complete DWARF info also describes the member functions. For the sake of clarity, I've kept only the relevant attributes.

<1><348>: Abbrev Number: 13 (DW_TAG_class_type)
    <349>   DW_AT_name        : (strp) (offset: 0x332d): Interface
<2><3c7>: Abbrev Number: 16 (DW_TAG_subprogram)
    <3c8>   DW_AT_external    : (flag_present) 1
    <3c8>   DW_AT_name        : (strp) (offset: 0xf55): aFunc
    <3d2>   DW_AT_virtuality  : (implicit_const) 1  (virtual)
    <3d5>   DW_AT_containing_type: (ref4) <0x348>
    <3d9>   DW_AT_declaration : (flag_present) 1
<2><3e7>: Abbrev Number: 10 (DW_TAG_subprogram)
    <3e8>   DW_AT_name        : (strp) (offset: 0x1293): bFunc
    <3f2>   DW_AT_virtuality  : (implicit_const) 1  (virtual)
    <3f5>   DW_AT_containing_type: (ref4) <0x348>
    <3f9>   DW_AT_declaration : (flag_present) 1
    <3f9>   DW_AT_object_pointer: (ref4) <0x3fd>
...
<1><1a7>: Abbrev Number: 13 (DW_TAG_class_type)
    <1a8>   DW_AT_name        : (strp) (offset: 0xa5d): Partial
<2><220>: Abbrev Number: 10 (DW_TAG_subprogram)
    <221>   DW_AT_name        : (strp) (offset: 0xf55): aFunc
    <22b>   DW_AT_virtuality  : (implicit_const) 1  (virtual)
    <22e>   DW_AT_containing_type: (ref4) <0x1a7>
    <232>   DW_AT_declaration : (flag_present) 1
...
<1><260>: Abbrev Number: 13 (DW_TAG_class_type)
    <261>   DW_AT_name        : (strp) (offset: 0x21e3): Full
<2><2d9>: Abbrev Number: 16 (DW_TAG_subprogram)
    <2da>   DW_AT_name        : (strp) (offset: 0xf55): aFunc
    <2e4>   DW_AT_virtuality  : (implicit_const) 1  (virtual)
    <2e7>   DW_AT_containing_type: (ref4) <0x260>
    <2eb>   DW_AT_declaration : (flag_present) 1
<2><2f9>: Abbrev Number: 10 (DW_TAG_subprogram)
    <2fa>   DW_AT_name        : (strp) (offset: 0x1293): bFunc
    <304>   DW_AT_virtuality  : (implicit_const) 1  (virtual)
    <307>   DW_AT_containing_type: (ref4) <0x260>
    <30b>   DW_AT_accessibility: (implicit_const) 1 (public)
    <30b>   DW_AT_declaration : (flag_present) 1

Unfortunately, GCC does not distinguish virtual from pure virtual via the DW_AT_virtuality attribute, even though DWARF 5 defines a separate value for pure virtual functions. Although Interface::aFunc() and Interface::bFunc() are pure virtual, only the plain virtual value (1) is emitted.
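
For reference, the DWARF standard defines three values for DW_AT_virtuality. The sketch below lists them and shows the check a parser would ideally rely on. The helper names isVirtual and isPureVirtual are hypothetical. On the dump above every virtual function carries the value 1, so isPureVirtual never fires there, which is exactly the problem.

```cpp
#include <cstdint>

// Values of DW_AT_virtuality as defined by the DWARF standard.
enum DwVirtuality : std::uint8_t {
    DW_VIRTUALITY_none = 0x00,
    DW_VIRTUALITY_virtual = 0x01,
    DW_VIRTUALITY_pure_virtual = 0x02
};

// True for any virtual function, pure or not.
constexpr bool isVirtual(std::uint64_t virtuality) {
    return virtuality == DW_VIRTUALITY_virtual ||
           virtuality == DW_VIRTUALITY_pure_virtual;
}

// True only if the producer actually marks the function as pure virtual.
constexpr bool isPureVirtual(std::uint64_t virtuality) {
    return virtuality == DW_VIRTUALITY_pure_virtual;
}
```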

Nevertheless, the above shows that we can make a distinction by analysis of the implemented functions.

  • Assume the Interface class is abstract, as its DW_AT_containing_type is self-referencing.
  • Register the virtual functions that it declares.
    • The Interface class declares two virtual functions: aFunc() and bFunc().
  • Check whether the derived interfaces also declare those functions.
  • If all virtual functions of the base are declared in the derived class, assume the derived class is not abstract anymore.
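
The steps above can be sketched as follows. This is a minimal illustration under several assumptions, not a complete detector: ClassInfo, ClassTable, looksAbstract and exampleTable are hypothetical names for data the parser would already have extracted, only single inheritance is handled, functions are matched by name only (overloads are not distinguished), and overrides are accumulated along the whole chain so that Foo, which inherits aFunc() from Partial, is still classified as concrete. The entries for Foo and Bar are assumed from the source code, since their DW_TAG_subprogram children are not shown in the dump.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model of what the parser extracted (assumption).
struct ClassInfo {
    std::string base;                         // DW_TAG_inheritance target, empty for a root
    bool self_containing;                     // DW_AT_containing_type refers to the class itself
    std::set<std::string> virtual_functions;  // children that carry DW_AT_virtuality
};

using ClassTable = std::map<std::string, ClassInfo>;

// Heuristic: walk from `name` up to the root class. The root's virtual
// functions are assumed to be pure; the class is treated as abstract if any
// of them is not re-declared somewhere between `name` and the root.
bool looksAbstract(const ClassTable& classes, const std::string& name) {
    std::set<std::string> declared;  // virtual functions declared below the root
    std::string current = name;
    // The hop limit guards against a cyclic base chain in malformed input.
    for (std::size_t hops = 0; hops <= classes.size(); ++hops) {
        const auto it = classes.find(current);
        if (it == classes.end()) return false;  // unknown class: no evidence
        const ClassInfo& info = it->second;
        if (info.base.empty()) {
            if (!info.self_containing) return false;  // root not assumed abstract
            for (const auto& fn : info.virtual_functions) {
                if (declared.count(fn) == 0) return true;  // no override found
            }
            return false;
        }
        declared.insert(info.virtual_functions.begin(), info.virtual_functions.end());
        current = info.base;
    }
    return false;
}

// The hierarchy from the example, entered by hand.
ClassTable exampleTable() {
    return {
        {"Interface", {"", true, {"aFunc", "bFunc"}}},
        {"Partial", {"Interface", false, {"aFunc"}}},
        {"Full", {"Interface", false, {"aFunc", "bFunc"}}},
        {"Foo", {"Partial", false, {"bFunc"}}},
        {"Bar", {"Full", false, {}}},
    };
}

// Usage: the heuristic reproduces the expected classification for the example.
void checkExample() {
    const ClassTable table = exampleTable();
    assert(looksAbstract(table, "Interface"));
    assert(looksAbstract(table, "Partial"));
    assert(!looksAbstract(table, "Full"));
    assert(!looksAbstract(table, "Foo"));
    assert(!looksAbstract(table, "Bar"));
}
```

The sketch inherits the weakness of the heuristic itself: a root class that mixes pure and non-pure virtual functions would make its derived classes look abstract even when they are not.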

I don't believe the method is foolproof, but it is the best method I could think of.

Question

My question is: Is there a better way to determine if a class is abstract?
