# Replications & Static plots

## Overview
To generate static plots that account for replications, I adapted a script originally developed by @ctena so that it can produce images directly from the profiler output paths.
The `summarize_and_plot.py` script is designed to provide a clear and robust analysis of performance logs. Instead of working interactively through the dashboard, it runs in batch mode: you set it up once, execute it, and it generates a full set of plots and summaries in a chosen directory. This makes it especially handy when you need figures for reports or publications.
A key strength of the script is that it uses a trimmed mean to combine results from multiple replications. This helps smooth out random fluctuations or outliers (like an unusually slow run caused by a system hiccup), giving a more reliable picture of typical performance. The script also supports optional mapping files to simplify the often fine-grained section/subsection labels into higher-level groups, making results easier to interpret. You can even mark labels as `ignore` in the mapping file to exclude them from the analysis.
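To illustrate how a trimmed mean tames outliers, here is a pure-Python sketch of the idea (not the script's actual implementation):

```python
def trimmed_mean(values, proportion=0.2):
    """Drop the lowest and highest `proportion` of values, then average the rest."""
    values = sorted(values)
    k = int(len(values) * proportion)  # number of values to drop at each end
    trimmed = values[k:len(values) - k] if k else values
    return sum(trimmed) / len(trimmed)

# Five replications of one timed section; one run was hit by a system hiccup.
times = [10.2, 10.4, 10.3, 18.9, 10.1]
print(trimmed_mean(times))  # the 18.9 outlier no longer skews the average
```

With five replications and a 20% trim, the single slow run is discarded entirely, so the summary reflects typical performance rather than the worst case.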
## What you need before running
- The profiler outputs structured like this:

  ```
  runs/
  ├── profiler_output_1/
  │   └── ...
  ├── profiler_output_2/
  │   └── ...
  ```

- `profiler_output_<replication>` → one folder per replication.
- Inside each, `profiler_data_time_size_<processes>.csv` files with logs for each process count.
- Mapping file (optional): a CSV that renames sections and subsections into human-readable categories.
  - If a section/subsection is not found in the mapping, it will appear as `Unknown`.
  - If a section is explicitly mapped to `"ignore"`, it will be excluded from the results.
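The `Unknown`/`ignore` semantics can be sketched as follows. Note that the column names (`section`, `subsection`, `group`) and the `load_mapping`/`apply_mapping` helpers are illustrative assumptions, not the script's actual mapping format:

```python
import csv
import io

# Hypothetical mapping CSV: real column names in summarize_and_plot.py's
# mapping file may differ; this only illustrates the Unknown/ignore rules.
SAMPLE_MAPPING_CSV = """\
section,subsection,group
io,read,I/O
debug,trace,ignore
"""

def load_mapping(csv_file):
    """Build a (section, subsection) -> group lookup from an open CSV file."""
    return {(row['section'], row['subsection']): row['group']
            for row in csv.DictReader(csv_file)}

def apply_mapping(mapping, section, subsection):
    """Unmapped labels become 'Unknown'; 'ignore' drops the entry (None)."""
    group = mapping.get((section, subsection), 'Unknown')
    return None if group == 'ignore' else group

mapping = load_mapping(io.StringIO(SAMPLE_MAPPING_CSV))
print(apply_mapping(mapping, 'io', 'read'))     # mapped to its group
print(apply_mapping(mapping, 'misc', 'other'))  # not in the file -> Unknown
```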
## How to run
At the bottom of the script, you can configure the paths inside the `__main__` block:

```python
if __name__ == '__main__':
    time_log_pattern = r'/path/to/runs/profiler_output_<replication>/profiler_output_time/profiler_data_time_size_<processes>.csv'
    mem_log_pattern = r'/path/to/runs/profiler_output_<replication>/profiler_output_time/profiler_data_time_size_<processes>.csv'
    image_directory = r'/path/to/output/images'
    mapping_path_for_time = "/path/to/mapping.csv"
    mapping_path_for_mem = "/path/to/mapping.csv"

    # Initialize the analysis object
    analysis_tool = HermesLogsAnalysis(
        time_log_pattern,
        image_directory,
        mapping_path_for_time,
        mapping_path_for_mem,
        replications=2  # Number of replications
    )

    # Summarize results
    simple_times, simple_mem = analysis_tool.summarize()

    # Generate plots
    analysis_tool.plot_times(simple_times)
    analysis_tool.plot_mem(simple_mem)
    analysis_tool.plot_speedup(simple_times)
    analysis_tool.plot_efficiency(simple_times)
```
Key points to adapt:

- `time_log_pattern` / `mem_log_pattern`: paths to the profiler CSVs, where `<replication>` and `<processes>` will be replaced automatically.
- `image_directory`: folder where the plots will be saved.
- `mapping_path_for_time` / `mapping_path_for_mem`: optional CSVs with section mappings.
- `replications`: how many profiler replications you want to average over.
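For intuition, the placeholder substitution applied to these patterns likely amounts to something like the sketch below (`concrete_path` is a hypothetical helper, not part of the script):

```python
# Example pattern with the two placeholders used by the script.
pattern = ('/path/to/runs/profiler_output_<replication>/'
           'profiler_output_time/profiler_data_time_size_<processes>.csv')

def concrete_path(pattern, replication, processes):
    """Substitute one replication index and one process count into the pattern."""
    return (pattern
            .replace('<replication>', str(replication))
            .replace('<processes>', str(processes)))

print(concrete_path(pattern, 2, 16))
# /path/to/runs/profiler_output_2/profiler_output_time/profiler_data_time_size_16.csv
```

The script iterates over every replication folder and every process count it finds, so each combination resolves to one concrete CSV like the one above.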
After editing, run the script located in the visualization package with:

```shell
python summarize_and_plot.py
```
This will generate the following plots in the output directory:

- `Times.png` → execution time vs. processes
- `Memory.png` → memory usage vs. processes
- `Speedup.png` → scaling performance
- `Efficiency.png` → parallel efficiency
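Speedup and efficiency follow the standard definitions S(p) = T(1)/T(p) and E(p) = S(p)/p, which is presumably what `plot_speedup` and `plot_efficiency` are based on. A minimal sketch of those two computations:

```python
def speedup(times_by_procs):
    """S(p) = T(1) / T(p), relative to the single-process run."""
    t1 = times_by_procs[1]
    return {p: t1 / t for p, t in times_by_procs.items()}

def efficiency(times_by_procs):
    """E(p) = S(p) / p; 1.0 means perfect scaling."""
    return {p: s / p for p, s in speedup(times_by_procs).items()}

# Illustrative (made-up) trimmed-mean times per process count.
times = {1: 100.0, 2: 55.0, 4: 30.0}
print(speedup(times))     # > 3x at 4 processes
print(efficiency(times))  # drops below 1.0 as processes increase
```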