#jruby on 2021-11-09 — irc logs at libera.irclog.whitequark.org

2021-10-13 17:53 ChanServ changed the topic of #jruby to: Get 9.3.1.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

05:20 mlaug has quit [Quit: The Lounge - https://thelounge.chat]

05:21 mlaug has joined #jruby

06:09 satyanash has quit [Quit: kthnxbai]

06:09 satyanash has joined #jruby

13:25 drbobbeaty has quit [Ping timeout: 256 seconds]

14:56 drbobbeaty has joined #jruby

16:21 <enebo[m]> kares: I don't remember because it was over 3 years ago, but did we ever discuss removing _strptime for strptime to avoid the logic and Ruby code for just rewrapping data back into a RubyHash and then out into a handful of simpler values for new?

16:22 <enebo[m]> I do remember we got a lot of perf out of caching the parsing of the format string but I am not sure why we didn't boil the ocean

16:25 <enebo[m]> I found a commit with some info: https://github.com/jruby/jruby/commit/f2e7b82a3f49dedec71073919d7315f35a68b737

16:26 <enebo[m]> I don't even mention what I just suggested so I am not sure why I eliminated that. I suppose it may be because we still need a _strptime which returns a hash as a public API

16:26 <enebo[m]> (Also I am not looking at working on this but I was just thinking about it because we are working on a new sprintf impl for 9.4)

16:46 <kares[m]> Recall working on some of it for a few days in IDEA which had a bug of losing FS changes on a restart ... while I was just about to start committing.

16:49 <kares[m]> actually no, I was porting some other bits due to perf. from C code which JRuby still does in Ruby. believe having those aligned would have helped avoid the Hash rewraps

16:55 <enebo[m]> kares: ok well something we can think about in the future. We perform considerably better than MRI so it is just a blue sky thing

16:55 <enebo[m]> I plan on sprintf cache and also a size prediction metric so I expect to see some gains there when it is finished

17:14 <kares[m]> we do? guess the cache did not exist when I did the RubyDate stuff since we weren't back then

17:33 <enebo[m]> I added the cache in the commit above

17:33 <enebo[m]> it was a big jump in perf

17:41 <nirvdrum[m]> TruffleRuby added a fast path for the default time format string in the default logger and that had a fairly large impact. https://github.com/oracle/truffleruby/pull/2361/files#diff-2165f1135ce1cc7280599d50b87438aee142f40977cc937dbfd175a8c79795fd is most of the changes. That has a small bug that was fixed in a subsequent commit.

17:41 <nirvdrum[m]> It might be worth looking at something similar.

18:23 <enebo[m]> nirvdrum: I did not look too long but I mostly just see precalculated paddings

18:24 <enebo[m]> and I think LazyIntRope so you basically have a pre-calced set of ropes you just potentially merge when you use the value?

19:51 mlaug4 has joined #jruby

19:53 siasmj_ has joined #jruby

20:00 mlaug has quit [*.net *.split]

20:00 siasmj has quit [*.net *.split]

20:00 siasmj_ is now known as siasmj

20:00 mlaug4 is now known as mlaug

20:09 <nirvdrum[m]> It simplifies the full parser to just the features used by the default time format string. strftime needs to account for a ton of features that aren't really used all that often. The usage of the restricted set is optimistic. If the format string actually uses any feature outside of that restricted set, it returns `null`, which is a bit like a deopt. From there, there's a cache from the format string to the set of nodes needed to actually parse it.

20:11 <nirvdrum[m]> You could probably simplify it to something like `if (format == DEFAULT_LOG_TIME_FORMAT) { use_fast_parser() } else { use_full_parser() }`

20:11 <nirvdrum[m]> LazyIntRope is just another layer to all of that.

20:12 <nirvdrum[m]> Anyway, since it sounds like you're looking at caching the parse result, I'm mostly just pointing out that we had good luck with the default time format for Ruby's built-in logger.
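
The optimistic fast path described above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not TruffleRuby's or JRuby's actual code: the class `OptimisticStrftime`, the `Token` interface, the supported directive set (`%Y %m %d %H %M %S` plus literal text), and the `fullFormatter` fallback are all hypothetical names chosen for the example. The restricted compiler returns `null` for any format it does not understand, the outcome is cached per format string, and rejected formats are handed to the full implementation.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical sketch: a restricted strftime compiler with a fallback to a full formatter.
final class OptimisticStrftime {
    /** One compiled piece of a format string: a literal char or a zero-padded numeric field. */
    interface Token { void append(StringBuilder out, LocalDateTime t); }

    // Sentinel cached for formats the restricted compiler rejected
    // (ConcurrentHashMap cannot store null values). Compared by identity.
    private static final List<Token> UNSUPPORTED = new ArrayList<>();

    private final Map<String, List<Token>> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, LocalDateTime, String> fullFormatter;

    OptimisticStrftime(BiFunction<String, LocalDateTime, String> fullFormatter) {
        this.fullFormatter = fullFormatter; // stand-in for the full-featured implementation
    }

    String format(String pattern, LocalDateTime t) {
        // The format string is compiled at most once; later calls reuse the tokens.
        List<Token> tokens = cache.computeIfAbsent(pattern, p -> {
            List<Token> compiled = compileRestricted(p);
            return compiled == null ? UNSUPPORTED : compiled;
        });
        if (tokens == UNSUPPORTED) {
            return fullFormatter.apply(pattern, t); // the "deopt" path
        }
        StringBuilder out = new StringBuilder(pattern.length() + 8);
        for (Token token : tokens) token.append(out, t);
        return out.toString();
    }

    /** Returns null when the pattern uses anything outside the restricted set. */
    static List<Token> compileRestricted(String pattern) {
        List<Token> tokens = new ArrayList<>();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c != '%') {
                tokens.add((out, t) -> out.append(c)); // literal character
                continue;
            }
            if (++i == pattern.length()) return null;  // dangling '%'
            switch (pattern.charAt(i)) {
                case 'Y': tokens.add((out, t) -> pad(out, t.getYear(), 4)); break;
                case 'm': tokens.add((out, t) -> pad(out, t.getMonthValue(), 2)); break;
                case 'd': tokens.add((out, t) -> pad(out, t.getDayOfMonth(), 2)); break;
                case 'H': tokens.add((out, t) -> pad(out, t.getHour(), 2)); break;
                case 'M': tokens.add((out, t) -> pad(out, t.getMinute(), 2)); break;
                case 'S': tokens.add((out, t) -> pad(out, t.getSecond(), 2)); break;
                default: return null; // flags, widths, %N, %z, ... are left to the full parser
            }
        }
        return tokens;
    }

    // Zero-pads a non-negative value to at least `width` digits.
    private static void pad(StringBuilder out, int value, int width) {
        String digits = Integer.toString(value);
        for (int i = digits.length(); i < width; i++) out.append('0');
        out.append(digits);
    }
}
```

With a stand-in fallback, `new OptimisticStrftime((p, t) -> "full:" + p).format("%Y-%m-%dT%H:%M:%S", LocalDateTime.of(2021, 11, 9, 20, 11, 5))` yields `2021-11-09T20:11:05`, while a format such as `%A` is routed to the fallback. Caching the rejection as well as the compiled tokens means an unsupported format pays the failed compile only once.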

20:31 <enebo[m]> ah I see. I did not really see it in the code but I did not look at it very long

20:32 <enebo[m]> nirvdrum: as it turns out our strftime impl has some invariant processing every time because it is not caching the format processing at all but does use a per thread parser

20:32 <enebo[m]> just adding a simple cache does improve perf a little bit so that is ok

20:32 <enebo[m]> but I also notice we use String and CharSequence and then convert it back to bytes using Charsets

20:33 <enebo[m]> So I think even without making a scaled back lesser but simpler parser there seems to be quite a bit to squeeze out
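
The String-to-bytes round trip mentioned above can be avoided by writing ASCII output straight into a byte buffer. The following Java sketch shows the idea under stated assumptions: `AsciiByteAppender` and its methods are hypothetical names for this example and are not JRuby's API, and the sketch handles only non-negative numbers and ASCII literals.

```java
import java.util.Arrays;

// Hypothetical sketch: append formatted fields directly as ASCII bytes,
// skipping the intermediate String and the Charset encode step.
final class AsciiByteAppender {
    private byte[] buf = new byte[32];
    private int len;

    /** Appends a single ASCII character. */
    void appendAscii(char c) {
        if (c > 0x7F) throw new IllegalArgumentException("sketch handles ASCII only");
        ensure(1);
        buf[len++] = (byte) c;
    }

    /** Appends a non-negative value, zero-padded to at least `width` digits. */
    void appendPadded(int value, int width) {
        if (value < 0) throw new IllegalArgumentException("sketch handles non-negative values only");
        int digits = 1;
        for (int v = value; v >= 10; v /= 10) digits++;
        int total = Math.max(digits, width);
        ensure(total);
        int pos = len + total;
        // Write the digits right to left, then fill the remaining width with zeros.
        for (int v = value, i = 0; i < digits; i++, v /= 10) buf[--pos] = (byte) ('0' + v % 10);
        while (pos > len) buf[--pos] = (byte) '0';
        len += total;
    }

    /** Returns a copy of the bytes written so far. */
    byte[] toBytes() {
        return Arrays.copyOf(buf, len);
    }

    // Grows the buffer so that `extra` more bytes fit.
    private void ensure(int extra) {
        if (len + extra > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + extra));
        }
    }
}
```

Appending `2021`, `-`, `11`, `-`, `9` with widths 4, 2, and 2 produces the same bytes as `"2021-11-09".getBytes(StandardCharsets.US_ASCII)`, without allocating the intermediate `String`.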

20:37 <enebo[m]> nirvdrum: The common parser is an idea I outlined for our persistence format a little bit at fosdem

20:38 <enebo[m]> script/module/class bodies all only have a small fraction of instrs for most files loaded and we can bump scope without leaving current simpler interpreter