Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pathInfo has leading slash as an empty string #4

Open
Woody88 opened this issue May 17, 2020 · 14 comments
Open

pathInfo has leading slash as an empty string #4

Woody88 opened this issue May 17, 2020 · 14 comments

Comments

@Woody88
Copy link
Owner

Woody88 commented May 17, 2020

pathInfo property from the Request type outputs this ["", "path1", "path2"] for such path /path1/path2.

We do not need that empty string at the beginning there it is best to drop to the forward-slash from the path. Like this, we will end up with something like ["path1", "path2"] which is cleaner.

@JordanMartinez
Copy link
Contributor

Just came across this myself. How did httpure solve it?

@Woody88
Copy link
Owner Author

Woody88 commented May 30, 2020

Not sure. But, I think that we can just String.drop 0

@JordanMartinez
Copy link
Contributor

Looks like they drop empty strings

@JordanMartinez
Copy link
Contributor

I think Routing Duplex's method is a bit more efficient for parsing all parts of a given path.

@Woody88
Copy link
Owner Author

Woody88 commented May 30, 2020

Yea, it seems like they are also parsing query value which is not the purpose of pathInfo. I don't want this lib to have a lot of overhead. That is the reason why I am using something like the url package to handle all of that for me. pathInfo should really just return the url path segments.

P.S: I am assuming now that the URL parsing is more efficient than the PS version. I might be totally wrong.

@paluh
Copy link
Contributor

paluh commented May 30, 2020

Do we really need to provide pathInfo as a separate field? As @JordanMartinez pointed out users like me who use routing duplex or boomerang want full path and query string as a single value so this step is an additional overhead which is not useable and configurable... Maybe we should introduce polimorphism to the request row and allow the user to choose what she/he wants to include. We can do this efficiently with Record.Builder… I don't know.

@paluh
Copy link
Contributor

paluh commented May 30, 2020

It is somewhat related so I want to point out that for example query parsing is context dependent which can be somewhat surprising. It can depend whether a backend expects given request to be encoded by browser or not (so be prepared to handle space encoding against the spec):

purescript-contrib/purescript-uri#38 (comment)

purescript-hyper/hyper#51

@JordanMartinez
Copy link
Contributor

I think this extra field should be dropped. Perhaps it can be calculated and inserted into the normal request record via Record.Builder in a separate library that builds on top of this? This was the main reason why I stopped using httpure.

@paluh
Copy link
Contributor

paluh commented May 31, 2020

I fully agree with @JordanMartinez. We can provide utils for quick and opinionated parsing but this should be an opt-in in some way.
P.S.
I've also updated library links in my comment under URI issue about decoding + in queries.

@Woody88
Copy link
Owner Author

Woody88 commented Jun 1, 2020

Dumping @paluh and I discussion on slack related to this ticket:

paluh 6:45 PM

  • Sorry I've included two question in the first one - so let me clearly list them again:
  • What is the purpose of Request - what should be included as a minimum?
  • Should user be able to extend / customize somehow how Request is build?
  • If Request is closed and well defined, how much of the processing overhead is "OK" for some opinionated pieces like pathInfo or query. From my perspective this fields bring no value for our framework or apps.

6:52
I think that minimalism is an OK approach so provide only the minimum: status code, headers array, body bytestring, full path. The rest could be a set of utils.
6:52
But I'm a hard core user probably.
6:54
Also from my perspective these additional fields (like some headers etc.) are not a problem if they are not a huge overhead with unnecessary allocations for every request (like parsing and building object and array for an opinionated path representation).

Woody 7:03 PM
First in foremost, I think that we should probably start working on the Issue that you opened which is ttitled “Add note about motivation behind this lib”. For me, warp should act as micro web server framework that people can use easily if they do not crazy features like servant, yesod, or even paylod(PS world) - it should also have everything that web framework author would need to make some the similar frameworks that I just mentioned.

What is the purpose of Request - what should be included as a minimum?

If we are talking about bare minimum, then an approach like Hyper resonates more with me : https://github.com/purescript-hyper/hyper/blob/master/src/Hyper/Request.purs#L28-L34

Should user be able to extend / customize somehow how Request is build?

Extend it in what way that can help them? Example of what your are thinking would much appreciated

If Request is closed and well defined, how much of the processing overhead is “OK” for some opinionated pieces like pathInfo or query. From my perspective this fields bring no value for our framework or apps.

I dont understand the “how much overhead” for example I am using Url module - when I parse the whole url, including queries, everything is already there it’s more of a matter of assign the existing value to different field to give context or make it easy to use as a micro webserver framework in my opinion

paluh 7:08 PM
I like hyper representation but it doesn't work with HTML forms!
7:09
Here is an issue: purescript-hyper/hyper#51
7:10
But the essence is in the comment of mine under uri which I've already linked: purescript-contrib/purescript-uri#38 (comment)
7:10
Maybe we should move this discussion to the github to be able to answer in a systematic manner.
7:12
Also url parsing is useless in the case of usage of dedicated libs for url handling like routing-duplex (I'm using this in production).
7:12
In other words even an Object for a query is an opinionated representation of something which is not specified in the URI RFC!
7:13
In the RFC uri part is a properly bounded part / string with some limited charset.
7:14
Additionally pathInfo representation as an Array could be also inapropriate (edited)
7:14
If you want a List - which is an efficient stack and doesn't require allocation when you do pop (edited)
7:15
Which is important detail in the context of parsing it. (edited)
7:17
These are not theoretical details / problems. We have tried to use hyper (you can check my contributions and also issues list). I'm currently using httpure with problems (please check commits + issues list there too) :-)
7:18
In other words - hyper representation without url parsing would be a perfect Request for me but like I said I'm probably a hard core user 😉

Woody 7:21 PM
So, you are saying you do not want any url parsing. However, if someone wants to use warp as how would they filter paths?
7:21
It should provide some form of easy filtering

paluh 7:21 PM
It is like single call to his favorite library
7:22
case myParseQuery parse of
path1 -> endpoint1
path2 -> enpdoint2
_ -> repsonse404 (edited)

Woody 7:22 PM
So, you want to have a new person that decide to use warp to always think about their own parsing lib?

paluh 7:23 PM
You can provide this function if like it so much and think that this is the proper way of parsing query.
7:23
But I don't think that call to a pure function is a problem and nobody who wants to build something serious are going to use unidirectional parsing (edited)
7:24
I'm talking about web app context and of course that is my opinion.

Woody 7:26 PM
Well I am building backends that directly communicating which each other just like the API gateway project that I was testing

paluh 7:27 PM
How do you build URIs?
7:27
Do you hard code Strings ?

Woody 7:28 PM
It depends

paluh 7:28 PM
And that is the main point!
7:28
IT DEPENDS

Woody 7:30 PM
Alright, that was a good question
7:30
You made me see something a bit clearer here (edited)
7:30
The only thing is that the end user has a lot of things to thing about
7:30
I also thing about newbies
7:31
so. a good guide will be important to guide them step by step

paluh 7:31 PM
Talking about newbies I think that teaching people to hard coded uri strings is really strange practice from my perspective. (edited)

Woody 7:32 PM
When you want to prototype things its also nice when the tool doesnt get too much in your way too (edited)

paluh 7:33 PM
It was always really strange for me that Haskell people encourage SQL strings, URI strings and talk about strong typing on the same page of their web framework. (edited)

Woody 7:33 PM
I hear ya
7:34
but when I need to quickly spinup something which I want to prototype. It’s nice not having the need to create a special module for routing

paluh 7:36 PM
I understand that. On the other hand dispatch (so parsing) is I think always the first step which allows routing.
7:36
so like I've written it is only a single call to your favorite parsing function in a router case expression.
7:37
You can provide it - I'm not against some sane and simple, opinionated utils.
7:37
Like you suggested in our previous meet up.

Woody 7:40 PM
Ok, I think that I am reaching a conclusion. To summarize: I think that the Hyper approach of a request is nice. However, to be able to solve problem such as decoding “+”, it is best not to parse the url and allow the user to quickly install their own package and call their parsing function directly (edited)
7:40
lastly providing some simple and opinionated function that parse url for say fast prototyping should help (edited)
7:40
however, it might be best to warn the end user to not go that way in prod

paluh 7:42 PM
You know that I like this proposition 🙂

Woody 7:54 PM
Yea, I fully agree that is how I would do it anyway. I prefer the crazy type tricky stuff to be carried out by other framework that buids on top of warp and wai. I just dont want wai and warp to get in the way of people
7:54
So, that is why listening to you and jordan was important to me.
7:54
However, I just need clear concrete explanation
7:54
Maybe because Im slow lol

paluh 7:55 PM
I like this approach of yours to use simplest possible approach in these base libraries
🙏
1

Woody 7:55 PM
I want warp to become the node-http of purescript, wishful thiking lol

paluh 7:55 PM
If you stick to these principles and we provide different frameworks on top
7:56
I think it can become de facto standard.
7:56
I think that clear statement in the README could help. Why it is so "simple". Why do you want to possibly use it when building something larger.
7:57
Why and how do you want to use it when want for quick prototyping (using some predefined utils).

Woody 7:58 PM
Yea, I need to open a few issue similar to this #3

paluh 7:58 PM
I hope that we have enough power to carry this ecosystem stuff to the "full maturity" :-)

@JordanMartinez
Copy link
Contributor

JordanMartinez commented Jun 1, 2020

To summarize: I think that the Hyper approach of a request is nice. However, to be able to solve problem such as decoding “+”, it is best not to parse the url and allow the user to quickly install their own package and call their parsing function directly (edited)
lastly providing some simple and opinionated function that parse url for say fast prototyping should help
however, it might be best to warn the end user to not go that way in prod

Yes, that's exactly the kind of approach that should be taken here.

Let's say this framework did not have any url parsing, but just provided the url. To increase performance by not creating a new Record value, one could insert into the Request record what is currently pathInfo via Record.Builder before it runs any of the application. Thus, it might look like:

run settings \req callback -> app (addMyFields req) callback
where
  addMyFields :: NotNewtypedRequest -> ExpandedRequestType
  addMyFields req = 
    Record.insert (SProxy :: SProxy "pathInfo") (parseUrl req.rawPathInfo) req

This is something a framework (or even a small function) could do on top of this web server library. It would get you the quick prototyping you might want without forcing those who want their own routing library to pay for the performance cost of calculating what pathInfo is.

@Woody88
Copy link
Owner Author

Woody88 commented Jun 13, 2020

Let's say this framework did not have any url parsing, but just provided the url. To increase performance by not creating a new Record value, one could insert into the Request record what is currently pathInfo via Record.Builder before it runs any of the application. Thus, it might look like:

run settings \req callback -> app (addMyFields req) callback
where
  addMyFields :: NotNewtypedRequest -> ExpandedRequestType
  addMyFields req = 
    Record.insert (SProxy :: SProxy "pathInfo") (parseUrl req.rawPathInfo) req

This is something a framework (or even a small function) could do on top of this web server library. It would get you the quick prototyping you might want without forcing those who want their own routing library to pay for the performance cost of calculating what pathInfo is.

I have been looking at this approach but the example that your showing isn't using Record.Builder it actually makes a copy of the req. I tried making an example using Record.Builder but the compiler doesn't seem to like any of my approaches.

@paluh
Copy link
Contributor

paluh commented Jun 13, 2020

To have mutating request we should start from a builder I think. It could be something like Builder Unit Request. To avoid copying at all we create a builder in unsafe way being sure that request value is not used anywhere else:

unsafeRequestBuilder :: Builder Unit Request
unsafeRequestBuilder request = unsafeBuilder (const request)
  where
    unsafeBuilder :: (a -> b) -> Builder a b
    unsafeBuilder = unsafeCoerce

Of course this would complicate Application type etc. so we should consider if we really want to avoid this single copy of an object and pay for it with this builder based approach. I'm not sure.

SIDE NOTE:

This Builder switch could probably provide a possibility to introduce additional type parameters to the Application and Middelware. Currently we have a monoidal Middleware type:

type Middleware = Application -> Application

This can be limiting in the real life - it is really hard to have composable application which doesn't introduce additional context into the stack and not changes the type of the Monad etc. On the other hand this is also basic and easy to understand type so probably we should stick to this Middleware definition in the wai layer.

Just to provide a context - on our current backends (currently httpure based) we have a categorical compositions of middlewares (we avoid indexed monads on purpose).
So we have predefined monad stack (this should be taggless encoded probably but we are moving towards run anyway now ;-)) in App:

type AppM ctx st = ReaderT { | ctx } (StateT { | st} Aff)
type App ctx st = AppM ctx st HTTPure.Response

and then we have Middlware which is able to modify the context but also type:

type Middleware ctx st ctx' st' = App ctx st  App ctx' st'
type Middleware' ctx st = App ctx st  App ctx st

This way we can introduce / transform staff like: take configuration and provide db connection, wrap endpoint in transaction, provide state with session which can be modified during request flow etc.:

let
    apiMiddleware
      = handleLogging
        <<< handleSession
        <<< apiDbMiddleware
        <<< dbTransactionMiddleware
        <<< reqMetrics
in
   case RoutingDuplex.parse Route.route url of
       ...
       Right (Route.Api r') → apiMiddleware $ Api.run r'
       ...

But I think I don't want to introduce this into the wai and these kind of solution could be in contrib and of course in purescript-cookbook :-)

@paluh
Copy link
Contributor

paluh commented Jun 13, 2020

@Woody88 just showed me that he is starting from something like Builder {} Request in his prototype . And I fully agree that it is better (because the API is simpler) even though there would be one additional copy of an empty record when we run the builder ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants