Technical Note on `cPyMemTrace` Performance¶

pymemtrace.cPyMemTrace contains Python profilers and tracers written in ‘C’ that logs the runtime information including the Resident Set Size for every Python and C call and return or object allocation and de-allocation as required.

Test Program and Data¶

This tests the performance of cPyMenTrace with a real world program. The program reads remote sensing structured text files containing static data and dynamic arrays of floating point data and summarised them as HTML pages. The input was 3 files totalling 1.3Mb.

Platform was a Mac Book Pro (late 2018) 2.7 GHz Intel Core i7 running macOS Ventura 13.5.

The command used was:

time tdlastohtml --log-process=0.25  example_data/LAS/data tmp/LASTOHTML_A

Profiling¶

The pymemtrace.cPyMemTrace.Profile class that logs Python and C code. This ignores Python line, opcode and exception events. pymemtrace provides a decorator pymemtrace.cpymemtrace_decs.profile() that will profile a function. For example:

from pymemtrace import cpymemtrace_decs

@cpymemtrace_decs.profile(
    message="LASToHTML Profile d_rss_trigger=0",
    d_rss_trigger=0,
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

In this exercise there were 4.4m profiling events. When d_rss_trigger=-1 then only events that change the RSS by >= the page size (4096 bytes) are reported. When d_rss_trigger=0 then all events are reported.

Profiling Performance¶

Here is the cost of profiling given various options:

**cPyMemTrace Profiling Performance**¶
Setup	Time (s)	Time Rel.	Log Size	Log Lines	Notes
No Profiling	0.765	1x	0	0	Baseline execution.
Profile	6.81	8.9x	399K	2,095	`d_rss_trigger=-1`
Profile, all events	32.4	42x	831M	4,428,053	`d_rss_trigger=0`

As expected reporting every event is about four times as expensive (in time) compared to those events that trigger large changes in the RSS.

Tracing¶

The pymemtrace.cPyMemTrace.Trace class that logs pure Python code, it ignores C call, and return events. pymemtrace provides a decorator pymemtrace.cpymemtrace_decs.trace() that will profile a function. For example:

from pymemtrace import cpymemtrace_decs

@cpymemtrace_decs.trace(
    message="LASToHTML Trace d_rss_trigger=0",
    d_rss_trigger=0,
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

In this exercise there were 7m profiling events. When d_rss_trigger=-1 then only events that change the RSS by >= the page size (4096 bytes) are reported. When d_rss_trigger=0 then all events are reported.

Tracing Performance¶

Here is the cost of tracing given various options:

**cPyMemTrace Tracing Performance**¶
Setup	Time (s)	Time Rel.	Log Size	Log Lines	Notes
No Tracing	0.765	1x	0	0	Baseline execution.
Trace	13.0	17x	443K	2,326	`d_rss_trigger=-1`
Trace, all	47.6	62x	1.3G	7,044,826	`d_rss_trigger=0`

This is similar to Profiling but the event count is much larger and that shows up in the runtime cost.

Reference Tracing¶

The pymemtrace.cPyMemTrace.ReferenceTracing class that logs every allocation and de-allocation. The log file format is different to the Profile/Traace file format. See cPyMemTrace Log File Format for the details. pymemtrace provides a decorator pymemtrace.cpymemtrace_decs.reference_tracing() that will profile a function.

Here are the code examples, firstly reference tracing all except the builtin types (int, str etc.). This is the default:

@cpymemtrace_decs.reference_tracing(
    message="LASToHTML Reference Tracing include_builtins=False",
    include_builtins=False,
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

Then reference tracing including all the Python builtins:

@cpymemtrace_decs.reference_tracing(
    message="LASToHTML Reference Tracing include_builtins=True",
    include_builtins=True,
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

Now reference tracing but only reporting on a specific type and no builtin types:

@cpymemtrace_decs.reference_tracing(
    message=(
        "LASToHTML Reference Tracing include_builtins=False"
        " include_tp_names=['LASSection',]"
    ),
    include_builtins=False,
    include_tp_names=['LASSection',],
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

Now reference tracing but only reporting on a specific type and no builtin types with no garbage collection run:

@cpymemtrace_decs.reference_tracing(
    message=(
        "LASToHTML Reference Tracing include_builtins=False"
        " include_tp_names=['LASSection',]"
    ),
    include_builtins=False,
    include_tp_names=['LASSection',],
    gc_collect_on_exit=-1,
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

Finally reference tracing but only reporting on a specific type and no builtin types but with a full Garbage Collection run before closing the log file:

@cpymemtrace_decs.reference_tracing(
    message=(
        "LASToHTML Reference Tracing include_builtins=False"
        " include_tp_names=['LASSection',]"
        " gc_collect_on_exit=2"
    ),
    include_builtins=False,
    include_tp_names=['LASSection',],
    gc_collect_on_exit=2, # The default.
)
def process_arguments(args, log_level):
    # Execution code here.
    pass

Reference Tracing Performance¶

Here are the results compared to the baseline of no tracing:

**cPyMemTrace Reference Tracing Performance**¶
Setup	Time (s)	Time Rel.	Log Size	Log Lines	Notes
No Tracing	0.765	1x	0	0	Baseline execution.
Ref Trace, no builtins	27.3	36x	170M	700K	`include_builtins=False`
Ref Trace, with builtins	59.9	78x	1.0G	4M	`include_builtins=True`
Ref Trace, single class	1.573	2.0x	3.2K	16	`include_tp_names=['LASSection',]` `gc_collect_on_exit=-1`
Ref Trace, single class, with GC	1.841	2.4x	6.1K	28	`include_tp_names=['LASSection',]`

Reference tracing is extremely invasive as it records every Python object allocation and de-allocation. Including builtins can double the executions times compared to ignoring them.

Reference tracing a single type is remarkably effective and efficient. The Garbage Collection run adds very little to the execution time.

Technical Note on `cPyMemTrace` Performance¶

Test Program and Data¶

Profiling¶

Profiling Performance¶

Tracing¶

Tracing Performance¶

Reference Tracing¶

Reference Tracing Performance¶

Table of Contents

Previous topic

Next topic

This Page

Technical Note on cPyMemTrace Performance¶

Test Program and Data¶

Profiling¶

Profiling Performance¶

Tracing¶

Tracing Performance¶

Reference Tracing¶

Reference Tracing Performance¶

Technical Note on `cPyMemTrace` Performance¶