info:c_memory_structure

Differences

This shows you the differences between two versions of the page.

info:c_memory_structure [2012/10/15 08:17] – [Determining C memory layout] moritz
info:c_memory_structure [2012/10/16 17:22] – [Obtaining the debug section data] moritz
Line 17: Line 17:
 </code>
 
-On 32-bit machines, it is assumed that an <code>int</code> has a length of 32 bits, a ''char'' 8 and a ''short'' 16 bits. However, these are just assumptions and will not hold in general. As most computers today have a word size of 64 bits these assumptions are false.
+On 32-bit machines, it is assumed that an ''int'' has a length of 32 bits, a ''char'' 8 and a ''short'' 16 bits. However, these are just assumptions and will not hold in general. As most computers today have a word size of 64 bits these assumptions are false.
 
 On the Python side I want to be able to use the following code:
Line 50: Line 50:
 The only program that really knows the structure layout in memory is the compiler, because it decides on the in-memory layout. Hence, I started investigating whether there was a way to extract this information from the compiler.
 
-===== The missing piece: .debug_section =====
+===== The missing piece: .debug_info section =====
 
-C compilers can store additional information to the compiled output in program sections. For ELF files, there is an optional ''.debug_section'' that contains compiler-specific data. It has no standardized structure. Internally it is used to help debugging a program, for example to determine what variable is at what memory location etc. It also stores debug information about the C structures, which is what I am looking for.
+C compilers can store additional information to the compiled output in program sections. For ELF files, there is an optional ''.debug_info'' (may have a different name depending on the compiler) that contains compiler-specific data. It has no standardized structure. Internally it is used to help debugging a program, for example to determine what variable is at what memory location etc. It also stores debug information about the C structures, which is what I am looking for.
 
 Different compilers come with different tools for accessing the debug section content: for GCC there is objdump, for IAR there is ielfdump. Both can print the debug section in a human-readable form. However, the structure is not documented, so the programmer has to reverse engineer it. (GCC is open source, which is not the case for the IAR C compiler.)
Line 99: Line 99:
 Here is the output on my machine:
 
-<code>
-
-struct:     file format elf64-x86-64
+<code>struct:     file format elf64-x86-64
 
 Contents of the .debug_info section:
Line 243: Line 241:
  
 This is a lot of output. Let's explain the different parts.
+
+Each line represents either the start of a new entity or adds an attribute to one. All entities and attributes have a unique address, encoded in hex. Each entity has a type, which determines the attributes it has. The entity line consists of a nesting level, the unique address followed by the entity's type ID and a human readable name of the type. Attribute definitions start with the unique ID followed by the attribute name to be defined and a colon. After the colon, the actual value of the attribute follows. It can be a number, a string or a pointer to another entity. The pointer is encoded as ''<0x34>'' which references the entity with unique ID ''34''. This way, the debug info tree can be traversed.
+
+==== Finding the size information ====
+
 
 First, it tells us what format the file has. In this case it is ''elf64-x86-64'', so we know that pointers are 64 bits long on this platform.
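
The entity and attribute line layout described in this revision can be turned into a small parser. The following is only a minimal sketch in Python, not the code used on this page: the regular expressions assume the layout that ''objdump --dwarf=info'' from GNU binutils typically prints, which may differ between versions and will differ for other tools such as ielfdump. The ''SAMPLE'' text is hand-written for illustration and is not taken from the output above, and the names ''parse_debug_info'' and ''resolve'' are hypothetical.

```python
import re

# Assumed entity line layout: " <level><address>: Abbrev Number: N (DW_TAG_...)"
ENTITY_RE = re.compile(r"^\s*<(\d+)><([0-9a-f]+)>: Abbrev Number: (\d+) \((\w+)\)")
# Assumed attribute line layout: "    <address>   DW_AT_name   : value"
ATTR_RE = re.compile(r"^\s*<([0-9a-f]+)>\s+(\w+)\s*:\s*(.*)$")
# A pointer to another entity is written as "<0x34>"
REF_RE = re.compile(r"^<0x([0-9a-f]+)>")

# Hand-written illustrative sample, NOT real compiler output.
SAMPLE = """\
 <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
    <2e>   DW_AT_name        : point
    <34>   DW_AT_byte_size   : 8
 <2><38>: Abbrev Number: 3 (DW_TAG_member)
    <39>   DW_AT_name        : x
    <3b>   DW_AT_type        : <0x50>
 <1><50>: Abbrev Number: 4 (DW_TAG_base_type)
    <51>   DW_AT_byte_size   : 4
    <52>   DW_AT_name        : int
"""


def parse_debug_info(text):
    """Return a dict mapping entity address -> {"level", "tag", "attrs"}."""
    entities = {}
    current = None
    for line in text.splitlines():
        match = ENTITY_RE.match(line)
        if match:
            level, addr, _abbrev, tag = match.groups()
            current = {"level": int(level), "tag": tag, "attrs": {}}
            entities[int(addr, 16)] = current
            continue
        match = ATTR_RE.match(line)
        if match and current is not None:
            _addr, name, value = match.groups()
            ref = REF_RE.match(value)
            # Store pointers as integers so they can be used as dict keys;
            # every other value is kept as the raw string.
            current["attrs"][name] = int(ref.group(1), 16) if ref else value.strip()
    return entities


def resolve(entities, entity, attr):
    """Follow a pointer attribute to the entity it references."""
    return entities[entity["attrs"][attr]]


entities = parse_debug_info(SAMPLE)
member = entities[0x38]
member_type = resolve(entities, member, "DW_AT_type")
print(member["attrs"]["DW_AT_name"], member_type["attrs"]["DW_AT_byte_size"])
```

Keeping the entities in a dictionary keyed by address makes following a pointer such as ''<0x50>'' a plain lookup, which is all that is needed to walk from a structure member to the size of its type.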
info/c_memory_structure.txt · Last modified: 2012/10/16 17:23 by moritz
