mlaug has quit [Quit: The Lounge - https://thelounge.chat]
mlaug has joined #jruby
satyanash has quit [Quit: kthnxbai]
satyanash has joined #jruby
drbobbeaty has quit [Ping timeout: 256 seconds]
drbobbeaty has joined #jruby
<enebo[m]> kares: I don't remember because it was over 3 years ago but did we ever discuss removing _strptime for strptime to avoid the logic and ruby code for just rewrapping data back into a RubyHash and then out into a handful of simpler values for new!
<enebo[m]> I do remember we got a lot of perf out of caching the parsing of the format string but I am not sure why we didn't boil the ocean
<enebo[m]> I don't even mention what I just suggested so I am not sure why I eliminated that. I suppose it may be because we still need a _strptime which returns a hash as a public API
<enebo[m]> (Also I am not looking at working on this but I was just thinking about it because we are working on a new sprintf impl for 9.4)
<kares[m]> Recall working on some of it for a few days in IDEA which had a bug of loosing FS changes on a restart ... while I was just about to start committing.
<kares[m]> actually no, I was porting some other bits due perf. from C code which JRuby still does in Ruby. believe having those aligned would have helped avoiding the Hash rewraps
<enebo[m]> kares: ok well something we can think about in the future. We perform considerably better than MRI so it is just a blue sky thing
<enebo[m]> I plan on sprintf cache and also a size prediction metric so I expect to see some gains there when it is finished
<kares[m]> we do? guess the cache did not exist when I did the RubyDate stuff since we weren't back than
<enebo[m]> I added the cache in the commit above
<enebo[m]> it was a big jump in perf
<nirvdrum[m]> TruffleRuby added a fast path for the default time format string in the default logger and that had a fairly large impact. https://github.com/oracle/truffleruby/pull/2361/files#diff-2165f1135ce1cc7280599d50b87438aee142f40977cc937dbfd175a8c79795fd is most of the changes. That has a small bug that was fixed in a subsequent commit.
<nirvdrum[m]> It might be worth looking at something similar.
<enebo[m]> nirvdrum: I did not look too long but I mostly just see precalculated paddings
<enebo[m]> and I think LazyIntRope so you basically have a pre-calced set of ropes you just potentially merge when you use the value?
mlaug4 has joined #jruby
siasmj_ has joined #jruby
mlaug has quit [*.net *.split]
siasmj has quit [*.net *.split]
siasmj_ is now known as siasmj
mlaug4 is now known as mlaug
<nirvdrum[m]> It simplifies the full parser to just the features used by the default time format string. strftime needs to account for a ton of features that aren't really used all that often. The usage of the restricted set is optimistic. If the format string actually uses any feature outside of that restricted set, it returns `null`, which is a bit like a deopt. From there, there's a cache from the format string to the set of nodes needed to
<nirvdrum[m]> actually parse it.
<nirvdrum[m]> You could probably simplify it to something like `if (format == DEFAULT_LOG_TIME_FORMAT) { use_fast_parser() } else { use_full_parser() }`
<nirvdrum[m]> LazyIntRope is just another layer to all of that.
<nirvdrum[m]> Anyway, since it sounds like you're looking at caching the parse result. I'm mostly just pointing out that we had good luck with default time format for Ruby's built-in logger.
<enebo[m]> ah I see. I did not really see it in the code but I did not look at it very long
<enebo[m]> nirvdrum: as it turns out our strftime impl has some invariant processing every time because it is not caching the format processing at all but does use a per thread parser
<enebo[m]> just adding a simple cache does improve perf a little bit so that is ok
<enebo[m]> but I also notice we use String and CharSequence and then convert it back to bytes using Charsets
<enebo[m]> So I think even without making a scaled back lesser but simpler parser there seems to be quite a bit to squeeze out
<enebo[m]> nirvdrum: The common parser is an idea I outlined for our persistence format a little bit at fosdem
<enebo[m]> script/module/class bodies all only have a small fraction of instrs for most files loaded and we can bump scope without leaving current simpler interpreter