Introduction

Object Introspection (shortened to OI and pronounced like the "oy" in "boy") is a memory profiling technology for C++ objects. It provides the ability to dynamically instrument applications to capture the precise memory occupancy of entire object hierarchies, including all containers and dynamic allocations. All this with no code modification or recompilation!

How does it do this?

In lieu of more detailed documentation (outside of the code, obviously!), here is a brief description of the OI technology. There is a core technology and two flavors in which it can be consumed: oid, a hands-off, classic debugger-style tool that controls a target process, and OIL, an API that provides programmatic object introspection.

Type reconstruction: A C++ object is described in detail by the debug data generated by a compiler (the -g flag with the clang and gcc compiler toolchains). OI consumes this DWARF debug data to reconstruct types from a generated binary. Given a known root type, we reconstruct the entire type hierarchy, including all the descendant types: think of an object as a tree of types that is rooted at a known base type.
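To make the "tree of types" idea concrete, here is a minimal sketch. TypeNode, countTypes, exampleRequestTree and the Request struct are all hypothetical names invented for illustration; they are not OI's internal representation, and the tree is built by hand here rather than read from DWARF.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical root type that we want to introspect.
struct Request {
  int id;
  std::vector<std::string> tags;
};

// Hypothetical node in a reconstructed type tree (not OI's real data structure).
struct TypeNode {
  std::string name;                // type name as it would appear in the debug data
  std::size_t staticSize;          // sizeof() the type, in bytes
  std::vector<TypeNode> children;  // descendant types reachable from this type
};

// Depth-first walk that counts every type in the tree, root included.
std::size_t countTypes(const TypeNode& node) {
  std::size_t total = 1;
  for (const TypeNode& child : node.children) {
    total += countTypes(child);
  }
  return total;
}

// Hand-built tree for Request; a real tool would derive this from debug data.
TypeNode exampleRequestTree() {
  TypeNode ch{"char", sizeof(char), {}};
  TypeNode str{"std::string", sizeof(std::string), {ch}};
  TypeNode vec{"std::vector<std::string>", sizeof(std::vector<std::string>), {str}};
  TypeNode id{"int", sizeof(int), {}};
  return TypeNode{"Request", sizeof(Request), {id, vec}};
}
```

Calling countTypes(exampleRequestTree()) visits Request, int, std::vector<std::string>, std::string and char. Real type graphs can also contain pointers and cycles, which this by-value sketch deliberately ignores.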

Code Generation: We then iterate over the type tree to auto-generate C++ code that performs operations on each node of the tree. For example, if we have a std::vector of std::string objects, then we know that a vector has an iterator and that a string's dynamic size can be measured with its size() method (individual containers have space optimization schemes, such as Short String Optimization for strings, which we take into account).
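As a rough illustration of what per-type measurement code for the std::vector of std::string example could look like, here is a hand-written sketch. It is not the code OI generates: the function names are hypothetical, the inline-buffer check is a heuristic that assumes a typical SSO layout, and it uses capacity() to approximate heap usage (real allocations usually include extra room for the terminator and allocator overhead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Heuristic: true when the character buffer lives inside the std::string
// object itself, i.e. the string is using its short-string buffer.
bool storedInline(const std::string& s) {
  const auto obj = reinterpret_cast<std::uintptr_t>(&s);
  const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
  return buf >= obj && buf < obj + sizeof(std::string);
}

// Approximate heap bytes owned by one string: zero when stored inline,
// otherwise the capacity of its heap buffer.
std::size_t stringHeapBytes(const std::string& s) {
  if (storedInline(s)) {
    // Sanity check: an inline buffer must fit inside the object.
    assert(s.capacity() < sizeof(std::string));
    return 0;
  }
  return s.capacity();
}

// Approximate footprint of a vector of strings: the vector object, its
// element storage, and each string's heap buffer.
std::size_t vectorOfStringsBytes(const std::vector<std::string>& v) {
  std::size_t total = sizeof(v);
  total += v.capacity() * sizeof(std::string);
  for (const std::string& s : v) {
    total += stringHeapBytes(s);
  }
  return total;
}
```

For a vector holding a short string and a 100-character string, only the long string contributes heap bytes beyond the vector's own element storage on implementations that use a short-string buffer.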

JIT Compilation: The auto-generated C++ code is then JIT compiled into x86-64 object code. Depending on how OI is being used, this object code is then either relocated for the address space of a target process (oid) or ready to be executed directly (OIL).

Dynamic Instrumentation: The generated object code is copied into a dedicated text segment in the target process, ready for execution. Standard text modification techniques are employed to hijack threads at specific points in program execution; these locations are specified by the user. The hijacked thread is then redirected to execute the object code for a specific object. Introspection results are written to a dedicated data segment in the target process during execution of the introspection code.

Object Processing: Data generated from the object introspection process is then copied out of the target address space by the oid debugger and processed. This includes reconstructing the entire object tree for the captured data and attributing results correctly.

The above is a very high-level view of the basic implementation, but there are many more aspects which we hope you'll be interested in. Please bear in mind that this is very much a work in progress and that the initial design and implementation fit the specific needs of Meta. Our experience is that OI has opened up many new and exciting ways of viewing our application memory footprints in live applications, and we hope it can do the same for you. However, please check out the current constraints to see whether you can make use of OI today, and let us know of your requirements if not (please feel free to contribute!).
