-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Format the code in the Compiler component using clang-format #7434
Comments
Thanks for undertaking this. The formatting looks generally fine to me, and the visual delta between what we have now and the new format doesn't really stand out (which is a good thing, in my opinion). A few initial questions:
where the pointer definitions differ in the placement of the
|
In situations like this, I think we could either 1) change clangformat to right align all the
@dsouzai your |
Similarly, I noticed that parentheses and braces were preserved as is. Existing code is inconsistent in the use of braces around single statements in if/else, and in the cases of switch. I think it would be good if consistency was enforced.
versus
and
versus
|
For While we are here, maybe we could also explore the clang-tidy static analysis tool in addition to clang-format. (list of clang-tidy checks) |
I guess that's a matter of taste. I prefer to think of each case as a block of code and like that to be emphasized - hence, I almost always put braces around the blocks of code in the cases. If most others prefer not to do that, that's fine and I can live with that, but I think we should have a consistent approach. Both styles appear in the compiler today, even in the absence of variable declarations - though I don't know how many of the braces without variable declarations might have been introduced by me. :-) |
I also prefer the brace scopes for each
No harm here! I think this is another example of how the coding standard should be updated to state that all |
I think one of the goals of this exercise is to apply consistency in formatting rules throughout the code base. Each of us has different perspectives for how we prefer things to look (and we'll likely never reach consensus on), but I think we will all agree that regardless of the style chosen it should be applied consistently throughout the code base [1]. Even after the auto formatting has been applied there appear to be examples where the formatting is inconsistent for different parts of the language syntax or even the comment style. @dsouzai : based on your experience with this tool, do you think it is feasible/reasonable for all of these formatting inconsistencies to be smoothed out in the initial, single commit? |
@ThanHenderson Yeah I had noticed that, but then I figured since the namespace merging would happen in the same PR as the code formatting, it would get washed away as part of the formatting. That said, I realized that the regex was relatively straightforward to fix, so I updated it to remove spaces before and after.
I haven't spent too much time with this tool aside from the work needed to create this issue; I mostly just piggy-backed off the work of others in the past. I'll have to read up the documentation and play around with it more to see if I can answer some of the questions brought up above. My initial guess though is that we likely won't be able to smooth out absolutely everything in one go, and that some things will need some manual intervention on an ongoing basis. However, that shouldn't prevent us from going ahead with this for a couple of reasons
I'll play around with it though and see. If anyone else also has experience with clang-format, I'd appreciate your insights. |
Heh, I should mention, since I only looked at the diff from #3390, I missed this bit in the clang format file that may have impacted how the code looks:
I think the webkit file has expanded since Robert looked at it way back in 2018. Once I go through all the various config options, I'll run the formatter again and see how it looks. |
The syntax inconsistencies should be minimal after clang-format is applied; provided we specify all of the rules that we care about in the I think the important thing to do here is identify exactly the inconsistencies that we find, and devise the appropriate rules for clang-format in light of them. Where there any other inconsistencies that people noticed? For example, though it may not be everyone's preference, going with option 2 here should eliminate all inconsistencies with pointer types. Certain other inconsistencies, like the braced- w.r.t. the comment formatting, clang-format has some comment formatting options and general options that also apply to comments. However, they are pretty basic. |
That shouldn't affect it since that section was separated by a |
I went through the clang-format documentation, and played around with the config a bit. You can look at the latest branch https://github.com/dsouzai/omr/tree/clangformat_20240813. I'm using
as it is the latest stable release. An important fact I came across from reading online is it is important everyone use the same version of clang-format. Apparently, the output of newer versions of clang-format will not be a strict superset of the output of older versions. Therefore, whatever version of clang-format is chosen will need to be the one we use for good. I had to do
to run The diff of the clang-format file is now 1,2c1
< ---
< # BasedOnStyle: WebKit
---
> BasedOnStyle: WebKit
4a4
> AlignArrayOfStructures: Right
5a6
> AlignConsecutiveBitFields: false
7c8,9
< AlignEscapedNewlines: Right
---
> AlignConsecutiveMacros: false
> AlignEscapedNewlines: Left
9a12
> AllowAllArgumentsOnNextLine: false
12a16
> AllowShortEnumsOnASingleLine: false
14a19
> AllowShortLambdasOnASingleLine: true
19c24
< AlwaysBreakTemplateDeclarations: No
---
> AlwaysBreakTemplateDeclarations: MultiLine
22c27,29
< BraceWrapping:
---
> BitFieldColonSpacing: After
> BraceWrapping:
> AfterCaseLabel: false
24c31
< AfterControlStatement: false
---
> AfterControlStatement: Never
32a40,41
> BeforeLambdaBody: false
> BeforeWhile: false
34,36c43,48
< SplitEmptyFunction: true
< SplitEmptyRecord: true
< SplitEmptyNamespace: true
---
> SplitEmptyFunction: false
> SplitEmptyRecord: false
> SplitEmptyNamespace: false
> BreakAdjacentStringLiterals: true
> BreakAfterAttributes: Never
> BreakArrays: false
38,39c50,51
< BreakBeforeBraces: WebKit
< BreakBeforeInheritanceComma: false
---
> BreakBeforeBraces: Custom
> BreakBeforeInlineASMColon: Never
41d52
< BreakConstructorInitializersBeforeComma: false
43c54
< BreakAfterJavaFieldAnnotations: false
---
> BreakInheritanceList: BeforeComma
45,48c56,58
< ColumnLimit: 0
< CommentPragmas: '^ IWYU pragma:'
< CompactNamespaces: false
< ConstructorInitializerAllOnOneLineOrOnePerLine: false
---
> ColumnLimit: 120
> CommentPragmas: '^ IWYU pragma:|SPDX-License-Identifier:'
> CompactNamespaces: true
53a64,65
> EmptyLineAfterAccessModifier: Never
> EmptyLineBeforeAccessModifier: LogicalBlock
55,56c67,68
< FixNamespaceComments: false
< ForEachMacros:
---
> FixNamespaceComments: true
> ForEachMacros:
59a72
> IncludeBlocks: Preserve
73c86,91
< IndentCaseLabels: false
---
> IndentAccessModifiers: false
> IndentCaseBlocks: false
> IndentCaseLabels: true
> IndentExternBlock: NoIndent
> IndentGotoLabels: false
> IndentPPDirectives: None
76,78c94,99
< JavaScriptQuotes: Leave
< JavaScriptWrapImports: true
< KeepEmptyLinesAtTheStartOfBlocks: true
---
> InsertBraces: false
> InsertNewlineAtEOF: true
> KeepEmptyLinesAtEOF: true
> KeepEmptyLinesAtTheStartOfBlocks: false
> Language: Cpp
> LineEnding: LF
83,85c104,105
< ObjCBlockIndentWidth: 4
< ObjCSpaceAfterProperty: true
< ObjCSpaceBeforeProtocolList: true
---
> PPIndentWidth: 0
> PackConstructorInitializers: Never
93,95c113,121
< PointerAlignment: Left
< ReflowComments: true
< SortIncludes: true
---
> PointerAlignment: Right
> QualifierAlignment: Leave
> ReferenceAlignment: Right
> ReflowComments: true
> RemoveBracesLLVM: false
> RemoveParentheses: Leave
> RemoveSemicolon: false
> SeparateDefinitionBlocks: Always
> SortIncludes: false
97a124
> SpaceAfterLogicalNot: false
98a126
> SpaceAroundPointerQualifiers: Before
99a128
> SpaceBeforeCaseColon: false
100a130,131
> SpaceBeforeCtorInitializerColon: true
> SpaceBeforeInheritanceColon: true
101a133,134
> SpaceBeforeSquareBrackets: false
> SpaceInEmptyBlock: false
106a140,142
> SpacesInLineCommentPrefix:
> Minimum: 1
> Maximum: -1
109c145
< Standard: Cpp11
---
> Standard: c++03
112,115d147
< ---
< Language: ObjC
< PointerAlignment: Right
< ... |
After looking at the docs, as @ThanHenderson said, using
is the easiest solution for this. |
It doesn't look like clang-format supports anything like that. However, doing
and
shows a handful of doxygen keywords in the compiler dir. It should be fairly straightforward to do a search and replace of either the |
I believe even the original clang-format config file enabled reflowing comments. According to the doc, the reflow respects the columnLimit, which is set to 120. As such, the comments should be consistent at least wrt to the column limit. Now, we could lower the columnLimit to say 80 or something like that, but I was just using what was in #3390. Do you have an example of the inconsistency? |
Since clang-format v15, you can specify
However, it comes with the warning
So it's not something that can be done when applying to a big chunk of code as we're proposing to do. |
After looking at the docs and online, all the inconsistencies brought up in this issue thus far can't be addressed in one go; it'll have to be something that we clean up as we come across them. Really, only the pointer alignment issue can be addressed. |
|
Prompted by a discussion with @hzongaro on a PR I opened yesterday, Would it be possible to add a rule which enforces comments documenting conditional directives as outlined in the OMR C and C++ Coding Standards? I know the coding standards are more of a loose guideline throughout many components of the codebase, but these comments do appear frequently in both OMR and OpenJ9, and I personally find them very useful for readability. For example: #if defined(A) || defined(B)
...
#elif !defined(C) /* defined(A) || defined(B) */
...
#else /* !defined(C) */
...
#endif /* defined(A) || defined(B) */ I'm working on a Python script that does exactly this and so far it appears to be feasible to do programmatically. I was intending to use it to do a one time sweeping change to introduce missing comments or fix incorrect ones, but maintaining these comments would be sort of inconvenient without a formatting tool like the one we're planning on using here. On the other hand, from some cursory research, it doesn't really seem like the kind of thing that clang-format does. Perhaps those who are more familiar with clang-format know whether this sort of thing is possible. |
I think it's a good idea in principle. I'm not familiar with a way to do this via clang-format, and I don't imagine there would be one. A nice stack-based approach for adding in the comments may suffice for the initial pass, and taking care to fix anything suspicious. Once that initial pass is done, for a CI solution we could scan for commentless |
When I had done my initial research online, it didn't look like A script as part of the CI to check that all modified files in a PR do not have a commentless |
Overview
In the August 1, 2024 OMR Architecture Meeting, we discussed changing the code formatting style of the compiler component. A prior effort involved an attempt to format the entire OMR project, but ultimately that was not a direction the community as a whole agreed on1. The current attempt to format only the compiler component stems from the fact that original effort was led by the compiler component, and it also had the largest buy-in.
The main ongoing benefit of this effort, aside from the fact that it will yield code that is formatted in a style that is generally satisfactory to the majority of the compiler developers, is the ability to run tools such as linters that can be used to enforce the coding style; this reduces the burden on reviewers allowing them to focus purely on the functional aspects of a pull request. Targeting only the compiler component also has the added benefit of providing a "proof of concept" that other components can use to re-evaluate their position; the barrier to entry would also be lower as the majority of the infrastructure would already be in place.
The outcome of the August 1st OMR meeting is the following set of actions:
.clang-format
file to the compiler component,The proposed code format is the style that was accepted in the previous attempt, namely style no. 6 which is based on #3390.
To compare the code with and without this style applied, please see:
Plan
clang-format
(probably the latest stable version)clang-format
on the machine that runs the Line Endings jobScripts
Fix up multiple namespace declarations:
pre-commit git hook:
Rebase helper (though this may not work for multiple commits changing the same file):
Azure pipeline code format checker:
Footnotes
https://github.com/eclipse/omr/issues/4631 ↩
The text was updated successfully, but these errors were encountered: