<lopex[m]> bleh, I remember that MRI spawns threads on bignums on some threshold
<lopex[m]> one of the wackiest impls
<lopex[m]> btw i'll be coding in .net now, but everything works on linux, even sqlserver
<headius> I am interested in attempting a bytecode compiler for Joni again
<lopex[m]> yeah for compiler for joni
<headius> There's a number of popular benchmarks right now that are dominated by the creation of the regular expression engine
<lopex[m]> and native stack
<lopex[m]> and after threshold go to heap one
<headius> Yeah especially that. Lots and lots of int arrays
<lopex[m]> I mean after some stack depth
<lopex[m]> there's also repeat stack but it's simpler
<headius> I'm interested in compiling in such a way that I could compile literal regular expressions right into the JRuby jit code
<lopex[m]> so make it on IR level ?
<lopex[m]> wild
<lopex[m]> I think I could assist in some way
<headius> Potentially yes, or just compiled as additional synthetic functions in the same class file
<lopex[m]> but this would be big
<headius> For small expressions it would be easy to just emit it right into the bytecode
<lopex[m]> also there's a possibility for some eliminations not available for current ast analyzer
<lopex[m]> in joni
<headius> Makes sense
<headius> So I would probably do this as a separate compiler library that would go along with Joni if you want to go that extra level, and then JRuby can orchestrate which expressions to compile and where to emit the bytecode
<lopex[m]> and then this memoization mri introduced
<headius> I think it should be pretty easy to get at least basic expressions compiling in just a couple days work
<lopex[m]> but it would be desirable to keep joni as a standalone library still
<headius> The way I imagine the compiler library would be as a plug-in for Joni that can also be called as an optimizer API from JRuby
<lopex[m]> I wonder what techniques V8 and others apply, it would be worth to steal some ideas
<headius> I know truffle has a non-backtracking regex implementation
<lopex[m]> yeah
<headius> They can't use it for everything but for the places where they can use it I assume it's quite fast
<lopex[m]> but for now all those analyze regexps at parse to decide which impl to use
<headius> Right
<headius> We may decide there's a number of really common expressions worth having pre-compiled as part of JRuby's build
<headius> Certainly anything in the core of JRuby that's used heavily
<lopex[m]> afaik someone from truffle made that scan
<lopex[m]> er
<lopex[m]> that was those from MRI
<lopex[m]> to decide if that memoization thing was worth the shot
<headius> What exactly does it memoize?
<lopex[m]> somewhere there https://github.com/ruby/ruby/pull/6486
<lopex[m]> "Therefore, as a new way to prevent ReDoS, we propose to introduce cache-based optimization for Regexp matching. As CS fundamental knowledge, automaton matching result depends on the position of input and state. In addition, matching time explosion is caused for repeating to arrive at the same position and state many times. Then, ReDoS can be prevented when pairs of position, and state arrived once is recorded (cached). In fact, under such an
<lopex[m]> optimization, it is known as the matching time complexity is linear against input size [1]."
<headius> Interesting
<lopex[m]> but it's not backported to onigmo afaik
<headius> Have you looked at this PR very much?
<lopex[m]> they claim that memory usage goes quadratic but for lots of those regexps time remains linear
<lopex[m]> no, just glimpsed
<lopex[m]> I suspect it just might be some caching on backtracking
<headius> That's what it sounds like
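To make the "caching on backtracking" idea concrete, here is a minimal sketch in Ruby. It is not the code from the MRI PR and not how Joni or Onigmo work internally; `TinyMatcher` and its tiny pattern syntax (literals, `.`, and `x*`) are made up for illustration. It only shows the principle from the quoted text: once a (pattern position, input position) pair is known to fail, it never needs to be explored again.

```ruby
# Hypothetical toy matcher: anchored at both ends, supports literal characters,
# "." (any character) and "x*" (zero or more of the preceding character).
class TinyMatcher
  attr_reader :steps

  def initialize(pattern, memoize: true)
    @pattern = pattern
    @memoize = memoize
    @steps = 0
  end

  def match?(text)
    @text = text
    @failed = {} # [pattern index, text index] pairs already known to fail
    @steps = 0
    run(0, 0)
  end

  private

  def run(pi, ti)
    key = [pi, ti]
    # A pair that failed once will fail again, so skip the repeated work.
    return false if @memoize && @failed[key]

    @steps += 1
    result = step(pi, ti)
    @failed[key] = true if @memoize && !result
    result
  end

  def step(pi, ti)
    return ti == @text.length if pi == @pattern.length

    here = ti < @text.length && (@pattern[pi] == "." || @pattern[pi] == @text[ti])
    if @pattern[pi + 1] == "*"
      # Greedy: consume one more character first, then backtrack to skipping the star.
      (here && run(pi, ti + 1)) || run(pi + 2, ti)
    else
      here && run(pi + 1, ti + 1)
    end
  end
end

plain = TinyMatcher.new("a*a*a*a*b", memoize: false)
cached = TinyMatcher.new("a*a*a*a*b", memoize: true)
input = "a" * 30 # no trailing "b", so every path has to fail

plain.match?(input)
cached.match?(input)
puts "steps without cache: #{plain.steps}"
puts "steps with cache:    #{cached.steps}"
```

With the cache the number of explored states is bounded by (pattern length + 1) × (input length + 1), which is where the linear-time claim comes from; the price is the memory for the table.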
<lopex[m]> https://miro.medium.com/max/1100/1*6fFlZ4Lt7Uxnq1P13C8FiA.webp
<lopex[m]> the commits didn't look very invasive either
<lopex[m]> but in MRI "monorepo" it's a bit more complicated to analyze
drbobbeaty has quit [Quit: Textual IRC Client: www.textualapp.com]
drbobbeaty has joined #jruby
<headius> enebo: that refinement issue with tzinfo is because they `send(:using, ...)` and it looks like they are removing that
<headius> when you send using it is no longer in lexical scope and so we don't activate refinements
<headius> I mentioned the other day that we might need to start treating all scopes as potentially refined, but that's not going to happen soon
<headius> this case we might be able to hack by detecting send(:using, ...) and send(:refine, ...) as though they're the direct calls
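For context on why lexical scope matters here, a small sketch of how refinements behave with a direct `using` call. The module names (`Shout`, `WithRefinement`, `WithoutRefinement`) are invented for illustration and have nothing to do with tzinfo's code. The example deliberately uses the direct call only; the `send(:using, ...)` form is the case under discussion above.

```ruby
# Hypothetical refinement adding String#shout.
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module WithRefinement
  using Shout # active from here to the end of this module body

  def self.call(str)
    str.shout
  end
end

module WithoutRefinement
  # No `using` in this lexical scope, so String#shout is not visible here.
  def self.call(str)
    str.shout
  rescue NoMethodError
    :no_method
  end
end

p WithRefinement.call("hi")
p WithoutRefinement.call("hi")
```

Because activation is decided by where `using` appears in the source, a compiler has to spot the call while building the scope, which is why a dynamically dispatched `send` is easy to miss.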
<enebo[m]> It has been a couple of months but for some reason I thought they even made that change to send for jruby
<enebo[m]> perhaps they thought they were helping and changing it for version reasons or something like that
<headius> for some bug in 9.0.5.0
<headius> according to the comment
<enebo[m]> ah ok well a bit of a misdirection then
<enebo[m]> I am going to look at an erroneous warning we give saying ... was found which also happens in that run
<enebo[m]> I also have seen it during MRI runs so there is some bad warn logic
<enebo[m]> well good. case closed as far as why
<headius> enebo: I think this will make it work for this specific case, sending from within the same module
<headius> enebo: yeah it works
<enebo[m]> nice. perhaps extract out that logic just to make the main method a little smaller...it just marks maybeIsRefinement right?
<headius> Well, it kind of is its own utility method right now
<headius> I can try to finesse the logic a bit
<enebo[m]> ah ok. I thought it was in buildCall
donv[m] has quit [Quit: You have been kicked for being idle]
<enebo[m]> "Come celebrate the greatest lake, and test your frozen fortitude. A hearty bunch, we gather around at closing time, strip to our skivvies (or not - you do you), and purify ourselves in the cleansing waters of Lake Chipotle. Guac is extra."
<headius> ahh lake Chipotle
<enebo[m]> It is one of the feel good things of minneapolis
<enebo[m]> Fascinating too that it has not been fixed
<headius> now I'm hungry for chipotle
<headius> I'm testing the new psych impl, how do you feel about releasing with it
<headius> I just pushed 5.1.0.pre1 and will have a PR to build against it shortly
<enebo[m]> I have been expecting that but we definitely need to figure out if we can test it significantly beyond unit tests
<headius> yeah the answer to that is probably no
<enebo[m]> The question is also do we put this into 9.3 since it will also still get the CVE issue action
<headius> I did wonder that myself
<headius> probably should but I am less eager there due to stability
<enebo[m]> damn the regexp perf bug is not fixed
<enebo[m]> I tested it but I must have hit 9.4 for 9.3 test and 9.4 still had random issue so the search is fast
<headius> boo
<headius> I saw that
<headius> psych tests needed updating but CI looks ok otherwise
<headius> you know we could probably do a bad implementation of __callee__ by having alias calls push some stack, as gross as that may be
<enebo[m]> ripper uses __callee__ with aliasing (which I replaced locally)
<headius> yeah
<headius> there's no easy fix right now
<headius> but we need to rework call path for kwargs so there's an opportunity
<headius> we only pass in the method to be used for super, which is always the original name
<headius> FWIW I disagree that that person's example is a "good" example of metaprogramming, but I get the pattern
<headius> one method that does many things based on how it is called... does not sound like a good design to me
<enebo[m]> yeah all I can say is I have seen it at least once in the wild
<enebo[m]> which is likely to exist more than only one place
<enebo[m]> For ripper it is just to make 60 methods which have 2 behaviors and have a single impl
<headius> yeah
<enebo[m]> There are other ways to do that which do not involve __callee__ but it is more golfed
<enebo[m]> which is why it will win out over just doing an eval
<headius> AliasMethod doesn't even keep the aliased name right now
<enebo[m]> that could be changed at least
<headius> if it did, there's no place to pass it to the contained method
<enebo[m]> but would that be enough?
<enebo[m]> yeah
<headius> if we always called through AliasMethod and never bypassed it, these could push callee on some stack
<headius> I also just realized we don't optimize __method__ to just use the method's real name when possible
<headius> so it forces a frame, like block_given? does
<headius> I need to make an indy CallSite that works like an intrinsic when the method is what we expect
subbu has joined #jruby
<headius> psych PR looks ok
subbu has quit [Ping timeout: 260 seconds]
<headius> $ jruby -e 'def foo; puts __callee__; end; alias bar foo; bar'
<headius> bar
<headius> fix isn't super pretty but it works
<headius> need to patch indy separately because it bypasses AliasMethod.call
<headius> actually this is not sufficient because it will see the alias's callee from further up call stack
<headius> would have to be more like kwargs flags
<headius> so yeah back to my point about needing a call path overhaul anyway
<headius> really just needs to go on call stack, one way or another
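A short sketch of the behavior being chased here, as plain Ruby: `__method__` reports the name the method was defined with, while `__callee__` reports the name it was actually called by, so the two differ when going through an alias. `Greeter` is a made-up class for illustration.

```ruby
class Greeter
  def hello
    # __method__ is the original definition name,
    # __callee__ is the name used at the call site.
    [__method__, __callee__]
  end
  alias hi hello
end

p Greeter.new.hello
p Greeter.new.hi
```

This is the pattern where many aliases share one body and branch on `__callee__`, which is why the aliased name has to travel along with the call rather than live on the method object.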
subbu has joined #jruby
<enebo[m]> yeah
subbu_ has joined #jruby
subbu has quit [Ping timeout: 260 seconds]
<headius> I merged in the send(:using, ...) fix
<headius> seems like my random change might have regressed one test but it doesn't always fail 🤔
<enebo[m]> it randomly fails
<enebo[m]> wow bisect did the amazing coinflip
<enebo[m]> 5 goods in a row and last one was the bad one
<headius> what sort of testing do you think would make you feel ok about psych update
<headius> it should be possible for an app to use older one, fwiw
<headius> just specify in bundle
<enebo[m]> well I wonder if anything in rails hits json enough to have feels
<enebo[m]> like unit tests in one of the main rails gems
<headius> yeah maybe?
<enebo[m]> I think maybe looking to see on rubygems.org what things specify psych
<headius> oh that's a good idea
<enebo[m]> That may be unusual I guess based on it being in stdlib but someone might
<enebo[m]> they all should be
<headius> most of these uses seem like they're probably a bit peripheral, like roundtripping from some markup to yaml
<headius> there was one change required in psych tests for the 1.2 upgrade
<headius> anchors used as keys have to have a space before the colon.... like { *MY_ANCHOR : "value"}
<headius> I don't know if that is common enough to be a problem
<enebo[m]> another idea is closed psych issues as a source of project links
<headius> sequel is failing on master because nokogiri update now includes a library that only works on Java 11+
<headius> not sure if that has been followed up on but Jeremy mentioned it somewhere
subbu_ has quit [Quit: Leaving]
subbu_ has joined #jruby
subbu_ has quit [Quit: Leaving]
subbu has joined #jruby
<enebo[m]> I have the commit which regressed joni. I am asking lopex about it
subbu has quit [Ping timeout: 252 seconds]
<headius> ok
<headius> rails seems to work ok with new psych
<enebo[m]> you could add .json onto url but I suppose the simple data will not be the issue
<enebo[m]> jruby -e 'not_between 1...3'
<enebo[m]> I got a repro on the errant warning
<headius> seems like some skipped or excluded tests in psych repo now pass on new snakeyaml
<enebo[m]> parser seems to see range ... as forward only in the case of warning :)
<headius> nice
<enebo[m]> oh well that's cool
<enebo[m]> I believe the joni commit can be reverted. It appears to tweak next loc to check so I think it was meant to check less but lopex no doubt will grok it a lot better
<headius> well that's good
<enebo[m]> the regression is 100x slower
subbu has joined #jruby
subbu has quit [Read error: Connection reset by peer]
subbu has joined #jruby
subbu has quit [Ping timeout: 248 seconds]