CHANGELOG and README updates (#270)
* Initial commit for CHANGELOG updates
* Update README
connersaeli authored Sep 11, 2023
1 parent 1ec813b commit a6b57b3
Showing 2 changed files with 25 additions and 8 deletions.
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,22 @@
# Changelog

## [2.0.0] - 2023-09-11

### Added
- Add support for Prometheus as a performance data source. (https://github.com/ubccr/supremm/pull/268)
- Add 'datasource' field to output summarization record. (https://github.com/ubccr/supremm/pull/269)
- Add better support for the configuration check utility in `supremmconf.py`. (https://github.com/ubccr/supremm/pull/263)
- Add support for configurable job uniqueness. (https://github.com/ubccr/supremm/pull/265)

### Changed
- Update from Python 2.7 to Python 3.6.
- Update template paths to explicitly include the PCP version. (https://github.com/ubccr/supremm/pull/255)
- Use multithreaded archive indexing by default. (https://github.com/ubccr/supremm/pull/250)

### Fixed
- Update indexing in plugins to use integer division when processing jobs with more than 64 nodes. (https://github.com/ubccr/supremm/pull/264)
- Fix string encoding from byte array to UTF-8 for PCP on RHEL 8. (https://github.com/ubccr/supremm/pull/261)

## [1.4.1] - 2020-10-14

### Fixed
16 changes: 8 additions & 8 deletions README.md
@@ -100,23 +100,23 @@ developers. As always, the definitive reference is the source code itself.
The summarization software processing flow is approximately as follows (see the sketch after this list):

- Initial setup including parsing configuration files, opening database connections, etc.
- - Query an accounting database to get list of jobs to process and list of PCP archives containing data.
+ - Query an accounting database to get the list of jobs to process.
- For each job:
-   - retrieve the PCP archives that cover the time period that the job ran;
-   - extract the relevant datapoints from the PCP archives;
+   - retrieve performance data that cover the time period the job ran;
+   - extract the relevant datapoints per timestep;
    - run the data through the **preprocessors**;
    - run the data through the **plugins**;
    - collect the output of the **preprocessors** and **plugins** and store it in an output database.
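
A minimal, self-contained sketch of this loop follows. Every name in it (`fetch_jobs`, `extract_datapoints`, `ToyPlugin`, and so on) is a hypothetical stand-in chosen for illustration, not the actual supremm API:

```python
# Illustrative sketch of the per-job summarization loop described above.
# All names here are hypothetical stand-ins, not the actual supremm API.

def fetch_jobs():
    """Stand-in for the accounting-database query."""
    return [{"id": 42, "start": 0, "end": 60}]

def extract_datapoints(job):
    """Stand-in for locating the archives that cover the job's time
    window and yielding one set of datapoints per timestep."""
    for t in range(job["start"], job["end"], 10):
        yield {"timestamp": t, "mem.freemem": 1024 - t}

class ToyPlugin:
    """Stand-in summarizer: tracks the minimum of one metric."""
    name = "toy_minfree"

    def __init__(self):
        self.minimum = float("inf")

    def process(self, step):
        self.minimum = min(self.minimum, step["mem.freemem"])

    def results(self):
        return {"min_freemem": self.minimum}

def summarize(preprocessor_classes, plugin_classes):
    for job in fetch_jobs():
        # Fresh preprocessor/plugin instances per job; preprocessors come
        # first in the list so they see each timestep before the plugins.
        stages = ([cls() for cls in preprocessor_classes]
                  + [cls() for cls in plugin_classes])
        for step in extract_datapoints(job):
            for stage in stages:
                stage.process(step)
        record = {s.name: s.results() for s in stages}
        print(job["id"], record)  # stand-in for the output-database write

summarize([], [ToyPlugin])
```

Running it prints one summary record per job; the real framework writes that record to the output database instead.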

**preprocessors** and **plugins** are both Python modules that implement a
defined interface. The main difference between a preprocessor and a plugin is
- that the preprocessors run first and their output is avialable to the plugin
+ that the preprocessors run first and their output is available to the plugin
code.

- Each **plugin** is typically responsible for generating a job-level summmary for a PCP metric or group of PCP metrics. Each module
+ Each **plugin** is typically responsible for generating a job-level summary for one or more performance metrics. Each module
defines (see the sketch after this list):
- an identifier for the output data;
- - a list of PCP metrics;
+ - a list of required performance metrics;
- a mode of operation (either only process the first and last datapoints or process all data);
- an implementation of a processing function that will be called by the framework with the requested datapoints;
- an implementation of a function that will be called at the end to return the results of the analysis.
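
A hypothetical skeleton showing how those five items might map onto a module is below. The class layout, attribute names, and method signatures are assumptions made for illustration, so consult an existing plugin under `supremm/plugins/` for the real interface:

```python
# Hypothetical plugin skeleton mirroring the five items listed above.
# Attribute names and signatures are illustrative assumptions, not the
# real supremm interface.

class ExamplePlugin:
    name = "example"                    # identifier for the output data
    requiredMetrics = ["mem.freemem"]   # required performance metrics
    mode = "all"                        # process all data (a "firstlast"
                                        # mode would receive only the first
                                        # and last datapoints)

    def __init__(self):
        self.values = []

    def process(self, timestamp, data):
        # Processing function called by the framework with each set of
        # requested datapoints.
        self.values.append(data[0])
        return True  # keep receiving datapoints

    def results(self):
        # Called at the end to return the results of the analysis.
        if not self.values:
            return {"error": "no data"}
        return {"min": min(self.values), "max": max(self.values)}
```

A registered module of this shape would be instantiated per job and driven by the framework loop sketched earlier.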
@@ -134,8 +134,8 @@ databases (Open XDMoD being the main one).
If you are interested in doing plugin development, then a suggested starting
point is to look at some of the existing plugins. The simplest plugins, such as
the block device plugin (`supremm/plugins/Block.py`), use the framework-provided
- implementation. A more complex example is the Slurm cgroup memory processor
- (`supremm/plugins/SlurmCgroupMemory.py`) that contains logic to selectively
+ implementation. A more complex example is the cgroup memory processor
+ (`supremm/plugins/CgroupMemory.py`) that contains logic to selectively
ignore certain datapoints and to compute non-trivial statistics on the data.

If you are interested in understanding the full processing workflow, then the
