Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GitHub actions for CI #163

Merged
merged 7 commits into from
Feb 25, 2021
Merged

Use GitHub actions for CI #163

merged 7 commits into from
Feb 25, 2021

Conversation

mrkkrp
Copy link
Collaborator

@mrkkrp mrkkrp commented Jul 13, 2020

No description provided.

@mrkkrp mrkkrp force-pushed the use-github-actions branch 4 times, most recently from 3f97b23 to b13ea3d Compare July 13, 2020 20:50
@mrkkrp
Copy link
Collaborator Author

mrkkrp commented Jul 13, 2020

@NorfairKing we need to fix the validity tests on Windows. Apparently Windows build has been broken for a long time...

@NorfairKing
Copy link
Collaborator

@mrkkrp Is it an incorrect test or is there a bug?

@mrkkrp
Copy link
Collaborator Author

mrkkrp commented Jul 13, 2020

I only checked that all failures are validity-related, so I decided that it will be best to delegate them to you. There were other failures, TH-related, those I fixed myself.

@NorfairKing
Copy link
Collaborator

@mrkkrp I had a look at it but I don't have a windows machine to do any debugging.
It looks like there are either problems with the way the validity of paths is defined or there are real bugs.

@NorfairKing
Copy link
Collaborator

I just rebased this on top of master to see if CI passes now.

@NorfairKing
Copy link
Collaborator

@Martinsos this still fails on windows but not linux. Do you have any idea why?

@Martinsos
Copy link
Contributor

@Martinsos this still fails on windows but not linux. Do you have any idea why?

I just took a quick look at appveyor and I see it passed after merging my PR (https://ci.appveyor.com/project/chrisdone/path/builds/37821901) -> is it because something changed in this PR that caused validity tests to fail, or is it because appveyor was not running validity tests?

I haven't been paying attention to validity tests (because they were either passing or were not run, not sure), so it might be that they need to be updated, or there is still some problem in the code. I will be able to catch some time over the weekend to give it a more detailed look.

@NorfairKing
Copy link
Collaborator

@Martinsos the validity tests pass on linux, so it would be great if you could have a look. I cannot reproduce the failure locally.

@Martinsos
Copy link
Contributor

@Martinsos the validity tests pass on linux, so it would be great if you could have a look. I cannot reproduce the failure locally.

Sure, I will give it a look in the next couple of days! Just to make it clear, I also don't have local development setup on Windows so I will not be able to test locally, I normally do all the testing regarding windows on the CI - but nevertheless, I am sure we can figure it out. I will probably be asking more details about the validity tests once I get into them.

@NorfairKing
Copy link
Collaborator

@Martinsos I think the next step will be to make the tests fail on linux, not to get them to pass on windows.

@Martinsos
Copy link
Contributor

Martinsos commented Feb 21, 2021

@NorfairKing Getting them to fail on Linux would be valuable in order to make it possible to catch these during development on Linux machine?
Right now ValidityTests test Path module, which picks its implementation based on the OS of the machine it was compiled on.
We should probably look into modifying ValidityTests so that we can run them explicitly for the Path.Windows or Path.Posix, similar as right now we run unit tests first directly for Path.Windows and then directly for Path.Posix.
That would however require rewriting of the ValidityTests, for which I don't have time at the moment -> I might have in a month or two.

In the meantime we at least have Validity Tests running in CI on both Linux and Win!

I took some time today to investigate specific errors and I managed to categorize them by two causes, here is what I found out:

</> related validity errors:

On Windows, path starting with \ is considered relative path (by System.FilePath.Windows.isRelative). parseRelDir/File in Path will successfullly parse such path (because FilePath.isRelative returns true) and it will also keep the \ at the start of the path - normalization will let it be. So parseRelDir "\\a" will be Path "\\a".

On the other hand, </> is very simple - it only concatenates two internal FilePaths and does no normalization. So if given Path "a\\" (which is valid relative dir) and Path "\\a" (which is valid relative file), it will return Path "a\\\\a" -> it wasn't expecting there will be separator at the start of the second filepath, and now we have two consecutive separators, which is not a normalized path. So if we take its internal value and parse it again, it will be normalized and internal value will be different, violating the Validity property. Specifically, parseRelFile "a\\\\a" == Path "a\\\\a" does not stand.

Possible solutions that I can think of:

  • In </>, do normalization. Maybe this could be problematic performance wise? I personally wouldn't worry about it but I understand if that could be the complaint.
  • Give up on the problematic property. But it feels like we are loosing a lot with that, and if I am correct this property is guaranteed in the docs, so I don't think this is an option.
  • When doing parseRelDir, remove that starting separator -> but again, it seems like that separator on Windows might be relevant so removing it could change the meaning of path (I am not sure about this). Might be worth investigating but also doesn't sound very viable.

So at the end, I think doing normalization in </> sounds like the best solution.

"\\\\foo" is a drive

The rest of the errors are all caused by "\\\\foo".

The thing is, System.FilePath.Windows.isDrive considers considers "\\\\foo" to be drive, and I don't think anybody expected that while writing code in Path and using FilePath.isDrive.
I am not sure if this is bug in FilePath or if "\\\\foo" should be considered a drive btw -> we should investigate this further, what "\\\\foo" represents on Win.
EDIT: I did a quick search, and found that there is a kind of path called UNC path, which is path to a shared network resource, and those start with \\ on windows, so it is indeed a valid absolute path on Windows.

So, for example parent error. parent is implement in such way that if FilePath.isDrive is true for the given path, it will just return it as it is. The problem is, parent returns Path a Dir, and variant expected from Dir in the rest of the code is that it ends with separator. In this case, it will return "\\\\foo" as dir, but there is no separator at the end, so property in Validity instance that checks for this condition fails.

Solution here: Add \ at the end of the path when parent is called on the path that is drive.

There are two more failed tests caused by this, I didn't have time to dig into them but I see they are also caused by "\\\\foo".

Conclusion

  1. As @NorfairKing suggested, it would be nice if we could run ValidityTests locally both for Path.Posix and Path.Windows. This requires a rewrite of ValidityTests. In the meantime they are running on CI so this is probably not urgent.
  2. Failing validity tests need to be fixed on one-by-one basis.

I unfortunately have very little time in the next month or so (since my startup wasp-lang.dev (written in Haskell) got into YCombinator and now I am working whole day every day on that) so I can't take these on, but I think individual tests should be easy to fix anyway once the approach is decided upon, and rewriting ValidityTests is not urgent. I can jump in quickly into conversation though, if I can help.

Btw some thoughts: it seems that the Path codebase in general heavily relies on certain properties/variants of FilePath value that is inside Path (Path FilePath), but those are not enforced / checked on many places in the code. Looking at particular piece of the codebase, it can be tricky to figure out why smth does not work because to figure out all assumptions made regarding what this FilePath looks like, one has to go all the way back to parser. If these assumptions were documented wherever they are made, that might be one step forward ("Here we expect that path does not end with '/')". Or, if instead of assumptions, there were checks or normalizations, it might also be easier to reason about the code because I know it was normalized and I know what it should look like after normalization. I am talking relatively abstractly and I understand that in practice it might not be so easy or practical to make such kind of improvements, but it does feel like the situation could be improved in some way. All of the errors above were caused by this.

@NorfairKing
Copy link
Collaborator

I unfortunately have very little time in the next month or so (since my startup wasp-lang.dev (written in Haskell) got into YCombinator and now I am working whole day every day on that) so I can't take these on, but I think individual tests should be easy to fix anyway once the approach is decided upon, and rewriting ValidityTests is not urgent. I can jump in quickly into conversation though, if I can help.

Oh that's you! Congratulations to you and your brother :D

Or, if instead of assumptions, there were checks or normalizations,

Keep this train of thought going. You're about to invent validity-based testing :)

Right now ValidityTests test Path module, which picks its implementation based on the OS of the machine it was compiled on.

Ah right, I forgot about that! Yeah this would be non-trivial but I think an easy-er solution would be to compile the same test suite suite file twice but with different flags. We could do this by having two almost-identical sections in the cabal file that refer to the same test directory but use different compile flags; one for windows and one for posix.

</> related validity errors:
"\\foo" is a drive

This looks like the tests have found some real bugs.
Yay, that's what they are for.
We should be happy about that!
Personally I don't have any windows machines and have not used windows since 2012, so I don't really care about Windows myself, but path does, so we should get this in order.

@mrkkrp @chrisdone @sjakobi @harendra-kumar does any of you use windows or have the time to dig into these bugs?

@Martinsos
Copy link
Contributor

Martinsos commented Feb 22, 2021

Oh that's you! Congratulations to you and your brother :D

Ha thanks :D! We are dealing with a lot of files -> reading templates and then generating whole project with relatively complicated structure, so we found Path to be indispensable, I love it.

Keep this train of thought going. You're about to invent validity-based testing :)

Congrats on the Validity package, I saw that you are the author (gave it a star)! I was actually thinking somewhat more in the direction of capturing/ensuring these variants directly in the code, so that while I read the code I am aware of them. For example, Path could have smart constructor that does normalization. I don't think it can be done in the measure that would make validity tests redundant, but still it could make it easier to reason about the implementation without checking the tests. But I am guessing there were reasons to not do it also, as I said maybe possible performance impact hm.

Ah right, I forgot about that! Yeah this would be non-trivial but I think an easy-er solution would be to compile the same test suite suite file twice but with different flags. We could do this by having two almost-identical sections in the cabal file that refer to the same test directory but use different compile flags; one for windows and one for posix.

I didn't think of that! It might still need a little bit (or maybe none) of rewriting to ensure that validity properties are correctly defined for both platforms, but that shouldn't be much work.

This looks like the tests have found some real bugs.
Yay, that's what they are for.
We should be happy about that!

Haha yes, you are right :D 🎉 !

@NorfairKing
Copy link
Collaborator

we found Path to be indispensable, I love it.

Same for me with https://github.com/NorfairKing/smos.

gave it a star

Thanks 🤗

Path could have smart constructor that does normalization.

Path Does have constructors that do normalisation: https://github.com/commercialhaskell/path/blob/master/src/Path/Include.hs#L709
I think I'm misunderstanding what you meant. Would you elaborate please?

@Martinsos
Copy link
Contributor

Martinsos commented Feb 23, 2021

Path Does have constructors that do normalisation: https://github.com/commercialhaskell/path/blob/master/src/Path/Include.hs#L709
I think I'm misunderstanding what you meant. Would you elaborate please?

You are right, I didn't think it through completely - I think I disregarded these parsers since they can't be used in the </> as it is currently defined (it is pure) due to MonadThrow. That also explain why complete normalization is not done in </> -> it would be impractical if user had to worry about possible errors coming from it.
What I was actually thinking about is having a lightweight, pure normalization method, that could be used more freely -> one that doesn't throw any errors and is used internally to ensure some properties. But now that I am looking into it, those actually already exist, in Include.hs: normalizeDir and normalizeFilePath - so we probably just need to call them on some additional places (like in </>) and that is it really.
Sorry for blabbering, this happens when I share thoughts on something that I didn't think through properly!

@NorfairKing
Copy link
Collaborator

Sorry for blabbering, this happens when I share thoughts on something that I didn't think through properly!

No worries! This happens to me too :)

IIRC, one of @chrisdone 's goals was that </> was just ++.
If we can show an example where that doesn't work, then I'm sure it's discuss-able, but it would be nice if we don't have to go there.
To finish up this PR, let's disable the failing tests and create issues for fixing them.

@chrisdone
Copy link
Member

Indeed, the definition is literally that

(</>) (Path a) (Path b) = Path (a ++ b)

I found normalisation on append as in other libraries to be confusing.

@NorfairKing NorfairKing merged commit 05a747b into master Feb 25, 2021
@NorfairKing
Copy link
Collaborator

@Martinsos I can't seem to figure out why CI fails, would you mind having a look? It looks like the sources are missing somehow, and/or the checkout action is not run...

@Martinsos
Copy link
Contributor

Sorry for blabbering, this happens when I share thoughts on something that I didn't think through properly!

No worries! This happens to me too :)

IIRC, one of @chrisdone 's goals was that </> was just ++.
If we can show an example where that doesn't work, then I'm sure it's discuss-able, but it would be nice if we don't have to go there.
To finish up this PR, let's disable the failing tests and create issues for fixing them.

It works if left path has separator at the end, and right path has no separator at the start. However, as shown by tests above, there are cases in the current code where one or the other condition failed (parent produces dir that has no separator at the end, parsing \foo file leaves separator at start). So either we fix those issues individually and hope there will be no more of those, or we put in place some mechanisms to ensure those properties, or do some normalization in </>. But if we want to keep things lightweight, then probably it is best to start by fixing individual errors.

@Martinsos
Copy link
Contributor

Martinsos commented Feb 26, 2021

@Martinsos I can't seem to figure out why CI fails, would you mind having a look? It looks like the sources are missing somehow, and/or the checkout action is not run...

@NorfairKing As you said, it seems that source is missing. checkout step is not executed - it has the icon that indicates it was skipped. Looking at the code in the ci.yaml, I see following:

uses: actions/checkout@v2
      if: github.event.action == 'opened' || github.event.action == 'synchronize' || github.event.ref == 'refs/heads/main'

Now, these seem to be conditions under which checkout is done, and I am guessing the idea behind this is to target PRs and the main/master branch. I don't really understand why would you want to conditionally execute only checkout, I would instead assume such condition should be applied to the whole workflow (because if I don't do the checkout, nothing will work anyway), but I didn't read the whole ci.yaml so maybe I am missing smth.
We are still working on our github actions for Wasp (they are missing deployment), but we are using following to conditionally execute workflow: https://github.com/wasp-lang/wasp/blob/master/.github/workflows/ci.yaml#L3 , maybe that would be a better approach.

Anyway, problem seems to be in refs/heads/main -> There is main here beacuse of the new trend to use main for the main branch, but this repo is using master, so checkout step does not trigger on master. So the fix should be changing main to master here.

@chrisdone
Copy link
Member

chrisdone commented Feb 26, 2021

Typically the parsers do the proper normalization and then the rest of the combinators are trivial.

Windows file path formats are very complicated.

I think that path can be considered a good and complete package if it will only accept a very restricted subset of the plethora of syntactic varieties added by Microsoft.

For example, C:Project\wibble.sln "a relative path from the current directory of the C:" drive can be thrown out.

\Program Files\wibble.exe is an "absolute path from the root of the current drive" -- which makes it both absolute and relative. That can be thrown out.

I instead push for letting users write whatever junk they want when providing input to your program, and then your program runs "canonicalize path" on it to get something sensible that has a restricted meaning. That includes preventing the programmer from inputting junk like "\foo" as a relative directory in their source code.

I think the filepath package is serving as a crutch and confusing us rather than serving our judgement well.

Let's not get lost in the weeds serving the 1% case when the most important case on Windows file paths is C:\wibble\wobble.

The "universal naming convention" (\\machine\path) will be used by people wanting to deal with network drives; but this is a substantial departure from the rest of the platforms -- Linux and MacOS don't have a separate path syntax for network drives (they just mount them). I'm inclined to push for omitting this syntax until someone genuinely comes along needing it; and then frame the discussion as a separate package. UNCs are some awkward mid-point between file paths and URIs, they name a machine, but not a protocol.

@hasufell
Copy link

hasufell commented May 22, 2022

Windows file path formats are very complicated.

FYI, I wrote a proper parser of those windows filepaths based on this ABNF grammar which I extended.

\Program Files\wibble.exe is an "absolute path from the root of the current drive" -- which makes it both absolute and relative. That can be thrown out.

Kind of depends what you mean with relative. I'd argue what filepath currently means is: "semantics of the filepath depends on CWD". Following that definition \Program Files\wibble.exe is a relative path, but still distinct from Program Files\wibble.exe.

The path library seems to assign a different meaning to relative, as it uses this concept to ensure that </> behaves well. Those are different definitions.

I think the filepath package is serving as a crutch and confusing us rather than serving our judgement well.

Could you be more specific?

The problems I see currently are:

  1. semantics of isRelative (I don't know of any language that solved this better without restricting the subset of possible user input... filepath package can not restrict the user input. That would render the library broken, as it's a low-level, not a high-level library)
  2. semantics of isDrive (as in, takeDrive "\\\\localhost\\c$\\foo\\bar" == "\\\\localhost", which I find confusing and is not what other languages do)

At any rate... if I were to finish the ABNF parser and make a release... I could expose an API that allows access to the parsed ADT, which libraries that need more sophisticated information, could leverage.


Also note that there are many more forms, some of which cannot be validated or understood without performing IO:

\\.\UNC\localhost\c$\foo\bar                       -> \\localhost\c$\foo\bar
\\.\GLOBALROOT\GLOBAL??\UNC\localhost\c$\foo\bar   -> \\localhost\c$\foo\bar
\\.\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar -> C:\foo\bar (if Harddisk0\Partition2 is C:)
\\.\HarddiskVolume2\foo\bar                        -> C:\foo\bar (if HarddiskVolume2 is C:)

You can see the full hierarchy of those names via https://docs.microsoft.com/en-us/sysinternals/downloads/winobj

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants