Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix syslog output #103

Merged
merged 6 commits into from
Nov 16, 2023
Merged

Conversation

1000TurquoisePogs
Copy link
Member

@1000TurquoisePogs 1000TurquoisePogs commented Nov 2, 2023

This PR solves the following:

  1. launcher timestamps were not matching server time (local versus UTC)
  2. launcher timestamps were not matching server style (missing milliseconds)
  3. syslog printing was including timestamps, despite syslog already having timestamps. now, via regex launcher detects and omits timestamps. it makes lines easier to read.
  4. Attempt to fix a bug where a server trying to print a line longer than 1024 would hang due to the syslog routing logic only having a 1024 buffer. now, there is no limit but the output will be truncated to 512 chars to make the syslog easier to read.

…uncation to improve syslog readability

Signed-off-by: 1000TurquoisePogs <[email protected]>
Signed-off-by: 1000TurquoisePogs <[email protected]>
Signed-off-by: 1000TurquoisePogs <[email protected]>
Signed-off-by: 1000TurquoisePogs <[email protected]>
@1000TurquoisePogs 1000TurquoisePogs changed the base branch from v2.x/master to v2.x/staging November 9, 2023 12:53
@1000TurquoisePogs 1000TurquoisePogs changed the title Fix component syslog handling and truncate to 512 length Fix syslog output Nov 9, 2023
Signed-off-by: 1000TurquoisePogs <[email protected]>
@DivergentEuropeans DivergentEuropeans requested a review from a team November 13, 2023 12:26
Copy link
Member

@DivergentEuropeans DivergentEuropeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't fix the duplicate/false-positive syslog issue completely, but this overall improves my code & fixes multiple bugs so a huge thank you. I added a couple suggestions that may do it even better. Overall, great change 👍 but I have some review q's

timestamp match appears to be correct✔️

Removed server timestamps from syslog to avoid duplicate logging of time ❔
11.21.24 S0542560 +2023-10-23 15:21:24.088 <ZWED:459721> ZWESMVD INFO (_zsf.install,index.js:352) ZWED0031I - Server is ready at -->
2023317 08:06:52.32 S0107658 00000090 +<ZWED:34014622> ZWESMVD INFO (_zsf.install,index.js:350) ZWED0031I - Server is ready at

this change I am less sure about because our system log has the date as "2023317" (OR am I reading it wrong?) I'm not sure how that coincides with 11/13/2023. Should we really be removing the date if the syslog appears to be fucked or maybe that's some issue on our system

syslog output per line is capped at 512 bytes, extra characters will be omitted ❔ I need to test this

Added milliseconds logging to match the server timestamps ❓ see my comment

const char *sys_message_id = jsonArrayGetString(zl_context.sys_messages, i);
if (sys_message_id && (index_of_string_limited(input_string, input_length, sys_message_id, 0, SYSLOG_MESSAGE_LENGTH_LIMIT) != -1)) {

//exclude "ZWE_zowe_sysMessages" messages to avoid spam.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is useful 👍 and this reduces some spam like

11.21.19 S0542560 +ZWE_zowe_sysMessages_0=ZWEL0021I
but it doesn't reduce other spam like

+applyEnviron setting ZWE_zowe_sysMessages=[...]

It's probably not within the scope to catch all the false positives but spam reduction should probably be a method, i.e. a filter and this filter can be added to, in the future with list of strings that exclude a message from being eligible to be checked something like,

const char* exclude_list[] = {"ZWE_zowe_sysMessages", "export ZWE_zowe_sysMessages", "applyEnviron setting ZWE_zowe_sysMessages", NULL};

and the method would be like
if (is_excluded(input_string, exclude_list)) {
...

bool is_excluded(const char* input_string, const char** exclude_list) {
    for (const char** current = exclude_list; *current != NULL; current++) {
        size_t exclude_length = strlen(*current);
        if (strncmp(input_string, *current, exclude_length) == 0) {
            return true;
        }
    }
    return false;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this you can remove the need for ZWE_SYSMESSAGES_EXCLUDE_LEN too

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2023317 is the current date in julian format, as that is what SYSLOG has always used for dates, which converts to 2023-11-13. Whereas the "example line" above it is from the joblog output which does NOT have a date preceding it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think applyEnviron is a debug-level message (zowe logLevel=debug/trace) so i didn't catch that but it would be good to remove.
i also note that the app-server prints out the env vars too, in a way that isnt caught by this code and shouldnt be. it would be better to make a pr in app-server to omit them.

- Bugfix: Removed server timestamps from syslog to avoid duplicate logging of time (#103)
- Bugfix: Avoided hang on components when components were logging messages longer than 1024 characters. (#103)
- Enhancement: syslog output per line is capped at 512 bytes, extra characters will be omitted (#103)
- Enhancement: Added milliseconds logging to match the server timestamps (#103)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But where is this added? I am confused. You created

struct timeval now; gettimeofday(&now, NULL); int milli = now.tv_usec / 1000; snprintf(result.value+strlen(result.value), 5, ".%03d", milli);

  1. but milli isn't being logged anywhere I can see
  2. Also, milli is based on gettimeofday so you should probably add a comment here that to use a reliable milliseconds, you need to call gettimeofday every time you use milli in a log message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. it is, in the job log. you wouldnt see it in the syslog because it gets omitted on purpose.
  2. thats not relevant information to end users (changelog). thats an implementation detail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its right here
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant add a code comment @1000TurquoisePogs to 2)

@1000TurquoisePogs
Copy link
Member Author

i will add one note: although i try to truncate the syslog output to 512, it doesnt appear i ever get there. i think the wtoprintf function in zowe-common-c may not do multi-line syslog correctly? i have not investigated further, but may in a future ticket.

@gadyck
Copy link

gadyck commented Nov 13, 2023

Note - WTO processing has a limit of 126 characters of text, and function wtoPrintf() will truncate any formatted line that exceeds that limit. Log messages written to a USS file have no limit that I am aware of. Log messages written to a dataset are limited by the LRECL specified when the dataset was allocated. Log messages written to SYSOUT are limited by the LRECL specified when the DDNAME was allocated.

@DivergentEuropeans
Copy link
Member

DivergentEuropeans commented Nov 13, 2023

@1000TurquoisePogs the "disapprove" was for the changelog btw, the other comments are just good suggestions

@DivergentEuropeans
Copy link
Member

But anyway, getting tired now apologies - see you responded, I'll check later

Copy link
Member

@DivergentEuropeans DivergentEuropeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments clarify things, but we need a way to track what we need to exclude if we're not going to do it in this code....

@1000TurquoisePogs
Copy link
Member Author

The build machine is currently down but we know this builds internally, so merging.

@1000TurquoisePogs 1000TurquoisePogs merged commit 5cf08b8 into v2.x/staging Nov 16, 2023
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants