HLearn.History #56
Thanks again for the detailed comments! I'll try to respond to everything point-by-point, but if I miss something let me know.
API & HistoryT
I really like the idea of a I've thought about moving this monad into a separate library to make it more general purpose. As a separate library, this wouldn't depend on subhask at all, which I think would make it much easier to adopt. I haven't done this yet because I think it would slow down my development time a bit. But if you think it'd be useful I'd be up for it. Can you say a bit more about your use case?
Relationship to Criterion
Totally agree. That's been one of my wishlist features for a while now. Another feature I wanted added is to measure
SubHask
One of my goals with subhask is to make all instances of I haven't yet implemented anything related to monad transformers in subhask. The reason is I haven't thought enough about the consequences of having a monad in one category transform a monad in another category. This shouldn't be an obstacle to making
Reportable versus Optimizable
I don't think there should be any issues when using The
Now to address a point you didn't bring up. There's an ugly side to the
in the Univariate.hs file. I want to get the |
My use case for History is pretty much yours, lol. See https://github.com/tonyday567/digit-recognizer/blob/master/testing/BenchKnn.hs#L208. I'd like to replace all the time and timeIOs with report and reportIOs (or similar), and include the info wrapped up in Criterion.Extended. wrt the ugly side, that sig is due to infoType. If you gave up on automatically using the type as the report collection label, and accepted a hard coded label (of type s) eg
or you could go the whole hog, and use a String as the label as beginFunction does, making it:
or even
|
I went back and looked at the There are still a few odd constraints on some of the functions. I think I can remove them too, but I'm off to bed now :) |
Looking at how to integrate History.Timing with History, I ran into a brick wall. The problem that I'm trying to solve is that the Report/Measure thing is related to the context used in the step. So Report as written matches CountInfo as written (and also goes with the hardcoded getCPUTime in runHistory for example). The ideal would be for Report/Context to be built up using disparate effects. The components look like pre-computation data gathering (eg getCPUTime), post-computation compute, a running total and display of the data. I came up with this data type:
An experiment using this is here: https://github.com/tonyday567/HLearn/blob/ghc7.10dev/src/HLearn/History/Measure.hs But it turned out to be a dead end: it's hard to use with the types being swallowed. |
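The data type from the comment above is not preserved in this page. As a rough illustration only, the four components it lists (pre-computation gathering, post-computation compute, a running total, and display) could be bundled into a record like the one below. Every name here (`Measure`, `gather`, `combine`, `initial`, `display`, `measure`, `cpuTime`, `callCount`) is hypothetical and is not taken from HLearn or from the linked experiment. Keeping the sample and total types as visible type parameters is one possible way to keep the types from being swallowed.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical record (not HLearn's): 'pre' is the raw sample type,
-- 'acc' is the type of the running total.
data Measure pre acc = Measure
  { gather  :: IO pre                    -- sample taken before and after the step
  , combine :: pre -> pre -> acc -> acc  -- fold a (before, after) pair into the total
  , initial :: acc                       -- starting value of the running total
  , display :: acc -> String             -- render the running total
  }

-- CPU time in picoseconds, accumulated across steps.
cpuTime :: Measure Integer Integer
cpuTime = Measure
  { gather  = getCPUTime
  , combine = \before after total -> total + (after - before)
  , initial = 0
  , display = \ps -> show (fromIntegral ps / 1.0e9 :: Double) ++ " ms"
  }

-- A step counter that needs no sampling at all.
callCount :: Measure () Int
callCount = Measure
  { gather  = return ()
  , combine = \_ _ n -> n + 1
  , initial = 0
  , display = \n -> show n ++ " steps"
  }

-- Run one action under a measure, threading the running total through.
measure :: Measure pre acc -> acc -> IO a -> IO (a, acc)
measure m total action = do
  before <- gather m
  result <- action
  after  <- gather m
  return (result, combine m before after total)

main :: IO ()
main = do
  (x, t1) <- measure cpuTime (initial cpuTime) (evaluate (sum [1 .. 1000000 :: Int]))
  (y, t2) <- measure cpuTime t1 (evaluate (product [1 .. 20 :: Int]))
  print (x, y)
  putStrLn (display cpuTime t2)
```

Swapping `cpuTime` for `callCount` changes what is collected without touching `measure`, which is the kind of generalisation the comment is after. Whether this composes well with the real `runHistory` is untested.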
This change shouldn't require adding a new We can create two new functions that are analogues of the
I haven't actually tested these functions, but I think they should work. Then the next step is to write a
Then in order to get the message printed to the screen, Does that explanation make sense? |
I think so - will give it a try! It won't give you real time information though. The part I left out was that I'd like (personally) to collect GC information that you can get from GHC.Stats, so I wanted to write:
but that would then involve a rewrite of runHistory and report. So I was thinking of how to generalise (adding cache misses etc, without having to rewrite all the time). |
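For reference, a minimal sketch of collecting GC information around a single action with `GHC.Stats`. It assumes a recent GHC (8.2 or later), where the API is `getRTSStats`; the GHC 7.10 series discussed in this thread exposed a different function, `getGCStats`. The `GCDelta` record and `withGCStats` are hypothetical names for illustration and are not part of HLearn.

```haskell
import Control.Exception (evaluate)
import Data.Int (Int64)
import Data.Word (Word32, Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Hypothetical summary of what changed in the RTS while an action ran.
data GCDelta = GCDelta
  { collections  :: Word32  -- number of garbage collections
  , allocated    :: Word64  -- bytes allocated
  , gcCpuNs      :: Int64   -- CPU time spent in GC, in nanoseconds
  , mutatorCpuNs :: Int64   -- CPU time spent in the mutator, in nanoseconds
  } deriving Show

-- Run an action and report the difference in RTS statistics.
-- Returns Nothing for the delta when the program was not started with
-- statistics collection enabled, since getRTSStats would throw in that case.
withGCStats :: IO a -> IO (a, Maybe GCDelta)
withGCStats action = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then do
      result <- action
      return (result, Nothing)
    else do
      before <- getRTSStats
      result <- action
      after  <- getRTSStats
      let delta = GCDelta
            { collections  = gcs after - gcs before
            , allocated    = allocated_bytes after - allocated_bytes before
            , gcCpuNs      = gc_cpu_ns after - gc_cpu_ns before
            , mutatorCpuNs = mutator_cpu_ns after - mutator_cpu_ns before
            }
      return (result, Just delta)

main :: IO ()
main = do
  (total, delta) <- withGCStats (evaluate (sum [1 .. 1000000 :: Integer]))
  print total
  print delta
```

Statistics are only collected when the RTS is started with `-T`, for example by compiling with `-with-rtsopts=-T`. Without it the sketch falls back to `Nothing` rather than failing.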
Ahh... I see. That's actually a really interesting idea... I'll have to start thinking about that too. |
I did some testing of rdtsc and it has outstanding metrics, compared with getCPUTime and getCurrentTime. It has to be the future of high performance regressions. |
That looks super nice! After a quick look through, this might be using the same API I was hoping to use to measure cache performance. |
I don't think you're going to be that lucky, sorry. |
I was thinking of tackling a FIXME surrounding incorporation of History.Timing into the History module. Here are a few initial thoughts and ideas.
API & HistoryT
report is the main (and only) access to the History machinery.
I'd like to include an ability to be selective in what's timed, and quickly turn an IO chunk say, into a History chunk (by adding report to a line and liftIOing other stuff). Say we have:
becoming
Which all seems doable and very cool. At this point, however, I jump straight to thinking about a HistoryT, so the report api can handle an IO a etc, but I lack a MonadIO to throw in there.
So question number 1 is whether to give up on a HistoryT given no MonadIO etc, or whether that would be a simple compatibility patch.
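The before/after snippets referred to above did not survive in this page. Purely as an illustration of the idea, here is a minimal sketch of a `HistoryT`-like transformer built on `Control.Monad` and `MonadIO` rather than on subhask. The `HistoryT` synonym, this `report` signature, and `pipeline` are assumptions made for the example; they are not the HLearn API.

```haskell
import Control.Exception (evaluate)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import System.CPUTime (getCPUTime)

-- Hypothetical stand-in for a HistoryT: a log of (label, CPU picoseconds)
-- layered over any monad that supports IO.
type HistoryT m = StateT [(String, Integer)] m

-- Hypothetical 'report': time one action with getCPUTime and record the
-- elapsed CPU time under the given label.
report :: MonadIO m => String -> m a -> HistoryT m a
report label action = do
  start  <- liftIO getCPUTime
  result <- lift action
  end    <- liftIO getCPUTime
  modify' ((label, end - start) :)
  return result

-- An IO chunk turned into a History chunk: only the line wrapped in
-- 'report' is timed, everything else is lifted with liftIO.
pipeline :: HistoryT IO ()
pipeline = do
  liftIO (putStrLn "setting up (untimed)")
  total <- report "sum" (evaluate (sum [1 .. 1000000 :: Int]))
  liftIO (print total)

main :: IO ()
main = do
  timings <- execStateT pipeline []
  mapM_ print (reverse timings)
```

This only shows that selective timing is straightforward once a `MonadIO` instance is available. It says nothing about how such a transformer would interact with subhask's category-polymorphic monads.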
Relationship to Criterion
I do a lot of benchmarking, and the guts of criterion have the wrong types for what I often want to do: quickly annotate computations with debugging and timing information, rather than run statistical analysis over multiple runs, without any continuation, which is what criterion is built around.
I often pick apart times into GC and Mutation via some criterion functionality. I also think it would benefit the History monad if you could start and stop the timing within a report chunk, but still be able to report timings at higher levels. The report API could morph to something like this:
SubHask versus Control.Monad
Being able to benchmark in the middle of a computation is a nice goal in the broader Haskell toolkit (there might be stuff out there that already does this, but I haven't come across it).
I'm unsure whether to head for a more general History (or HistoryT) based on Control.Monad or stick to SubHask.
My use case, however, is pretty centered on HLearn, so targeting a boiler-plate monad may not get the job done.
Very generally, to what extent can normal monads play nice with subhask?
Reportable versus Optimizable
History.Timing (one off reporting) is starting to drift away from the idea of reporting on optimization (or other) loopings. There might be a case for clear separation of looping constructs like stopping criteria versus straight reporting constructs (like when to report timings).
I haven't dug into how History is used in a looping context yet.
I'm sure there's lots of other issues I haven't thought of yet.
Tony