<lopex[m]>
bleh, I remember that MRI spawns threads on bignums on some threshold
<lopex[m]>
one of the wackiest impls
<lopex[m]>
btw I'll be coding in .NET now, but everything works on Linux, even SQL Server
<headius>
I am interested in attempting a bytecode compiler for Joni again
<lopex[m]>
yeah for compiler for joni
<headius>
There's a number of popular benchmarks right now that are dominated by the creation of the regular expression engine
<lopex[m]>
and native stack
<lopex[m]>
and after threshold go to heap one
<headius>
Yeah especially that. Lots and lots of int arrays
<lopex[m]>
I mean after some stack depth
<lopex[m]>
there's also repeat stack but it's simpler
<headius>
I'm interested in compiling in such a way that I could compile literal regular expressions right into the JRuby jit code
<lopex[m]>
so make it on IR level ?
<lopex[m]>
wild
<lopex[m]>
I think I could assist in some way
<headius>
Potentially yes, or just compiled as additional synthetic functions in the same class file
<lopex[m]>
but this would be big
<headius>
For small expressions it would be easy to just emit it right into the bytecode
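As a rough illustration of compiling a small expression into straight-line code instead of interpreting it, here is a minimal Ruby sketch. It is not how Joni or JRuby compile anything: `compile_simple` is a hypothetical helper, it handles only literal characters and `.` matched against the whole string, and it generates Ruby source rather than JVM bytecode.

```ruby
# Hypothetical toy compiler: turns a pattern made only of literal characters
# and "." (any character) into a lambda that tests a whole-string match.
def compile_simple(pattern)
  # One comparison per literal character; "." needs no check beyond length.
  checks = pattern.each_char.with_index.map do |ch, i|
    ch == "." ? nil : "s[#{i}] == #{ch.inspect}"
  end.compact
  body = (["s.length == #{pattern.length}"] + checks).join(" && ")
  # The generated source is a single boolean expression with no loop or dispatch.
  eval("->(s) { #{body} }")
end

matcher = compile_simple("a.c")
p matcher.call("abc")
p matcher.call("abd")
```

The point of the sketch is only that a fixed, small pattern can collapse into a handful of inlined comparisons that the surrounding JIT can optimize together with the caller.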
<lopex[m]>
also there's the possibility of some eliminations not available to the current AST analyzer
<lopex[m]>
in joni
<headius>
Makes sense
<headius>
So I would probably do this as a separate compiler library that would go along with Joni if you want to go that extra level, and then JRuby can orchestrate which expressions to compile and where to emit the bytecode
<lopex[m]>
and then there's this memoization MRI introduced
<headius>
I think it should be pretty easy to get at least basic expressions compiling in just a couple days' work
<lopex[m]>
but it would be desirable to keep joni as a standalone library still
<headius>
The way I imagine the compiler library would be as a plug-in for Joni that can also be called as an optimizer API from JRuby
<lopex[m]>
I wonder what techniques V8 and others apply, it would be worth stealing some ideas
<headius>
I know truffle has a non-backtracking regex implementation
<lopex[m]>
yeah
<headius>
They can't use it for everything but for the places where they can use it I assume it's quite fast
<lopex[m]>
but for now all of those analyze regexps at parse time to decide which impl to use
<headius>
Right
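A minimal sketch of that kind of parse-time decision, assuming a deliberately crude textual check. `pick_engine` and `NEEDS_BACKTRACKING` are hypothetical names, and the feature list is an assumption about what commonly forces a backtracking engine, not a description of what Truffle, V8, or Joni actually test for.

```ruby
# Hypothetical, conservative check: backreferences (\1, \k<name>), lookaround
# ((?=, (?!, (?<=, (?<!) and atomic groups ((?>) are treated as requiring a
# backtracking engine. A real implementation would inspect the parsed AST
# rather than the pattern text.
NEEDS_BACKTRACKING = /\\[1-9]|\\k<|\(\?<?[=!]|\(\?>/

def pick_engine(regexp)
  regexp.source.match?(NEEDS_BACKTRACKING) ? :backtracking : :automaton
end

p pick_engine(/a+b/)
p pick_engine(/(a)\1/)
```

Erring on the side of `:backtracking` is the safe direction here, since the backtracking engine can handle every pattern while the automaton one cannot.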
<headius>
We may decide there's a number of really common expressions worth having pre-compiled as part of JRuby's build
<headius>
Certainly anything in the core of JRuby that's used heavily
<lopex[m]>
afaik someone from truffle made that scan
<lopex[m]>
er
<lopex[m]>
that was those from MRI
<lopex[m]>
to decide if that memoization thing was worth a shot
<lopex[m]>
"Therefore, as a new way to prevent ReDoS, we propose to introduce cache-based optimization for Regexp matching. As CS fundamental knowledge, automaton matching result depends on the position of input and state. In addition, matching time explosion is caused for repeating to arrive at the same position and state many times. Then, ReDoS can be prevented when pairs of position, and state arrived once is recorded (cached). In fact, under such an optimization, it is known as the matching time complexity is linear against input size [1]."
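The idea in that quote can be sketched with a toy backtracking matcher that records every (state, position) pair it has visited. This is an illustrative sketch only: the instruction set, `PROG`, and `vm_run` are hypothetical and are not taken from MRI, Onigmo, or Joni, and the matcher ignores captures entirely.

```ruby
require "set"

# Hypothetical instruction set for a toy backtracking VM:
#   [:char, c]      consume one character equal to c
#   [:split, a, b]  try continuing at a; if that fails, try b
#   [:jmp, a]       continue at a
#   [:match]        succeed only at end of input
# Program for (a|a)*b, anchored at both ends.
PROG = [
  [:split, 1, 6],
  [:split, 2, 4],
  [:char, "a"],
  [:jmp, 5],
  [:char, "a"],
  [:jmp, 0],
  [:char, "b"],
  [:match]
].freeze

# Returns [matched, number of (state, position) visits].
def vm_run(prog, input, memoize: true)
  seen = Set.new
  steps = 0
  run = lambda do |pc, pos|
    steps += 1
    if memoize
      # A repeat visit means this pair already failed (a success would have
      # ended the whole match), so there is no point exploring it again.
      return false unless seen.add?(pc * (input.length + 1) + pos)
    end
    op, a, b = prog[pc]
    case op
    when :char  then pos < input.length && input[pos] == a && run.call(pc + 1, pos + 1)
    when :split then run.call(a, pos) || run.call(b, pos)
    when :jmp   then run.call(a, pos)
    when :match then pos == input.length
    end
  end
  [run.call(0, 0), steps]
end

input = "a" * 14
p vm_run(PROG, input, memoize: false) # visit count grows exponentially with input length
p vm_run(PROG, input)                 # visit count grows linearly with input length
```

The cache holds at most one entry per (state, position) pair, so the work is bounded by program size times input length, at the cost of the memory for the `seen` set.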
<headius>
Interesting
<lopex[m]>
but it's not backported to onigmo afaik
<headius>
Have you looked at this PR very much?
<lopex[m]>
they claim that memory usage goes quadratic but for lots of those regexps time remains linear
<lopex[m]>
no, just glimpsed
<lopex[m]>
I suspect it just might be some caching on backtracking
<headius>
enebo: I think this will make it work for this specific case, sending from within the same module
<headius>
enebo: yeah it works
<enebo[m]>
nice. perhaps extract out that logic just to make the main method a little smaller...it just marks maybeIsRefinement right?
<headius>
Well, it kind of is its own utility method right now
<headius>
I can try to finesse the logic a bit
<enebo[m]>
ah ok. I thought it was in buildCall
donv[m] has quit [Quit: You have been kicked for being idle]
<enebo[m]>
"Come celebrate the greatest lake, and test your frozen fortitude. A hearty bunch, we gather around at closing time, strip to our skivvies (or not - you do you), and purify ourselves in the cleansing waters of Lake Chipotle. Guac is extra."
<headius>
ahh lake Chipotle
<enebo[m]>
It is one of the feel-good things of Minneapolis
<enebo[m]>
Fascinating too that it has not been fixed
<headius>
now I'm hungry for chipotle
<headius>
I'm testing the new psych impl, how do you feel about releasing with it?
<headius>
I just pushed 5.1.0.pre1 and will have a PR to build against it shortly
<enebo[m]>
I have been expecting that but we definitely need to figure out if we can test it significantly beyond unit tests
<headius>
most of these uses seem like they're probably a bit peripheral, like roundtripping from some markup to yaml
<headius>
there was one change required in psych tests for the 1.2 upgrade
<headius>
anchors used as keys have to have a space before the colon.... like { *MY_ANCHOR : "value"}
<headius>
I don't know if that is common enough to be a problem
<enebo[m]>
another idea is closed psych issues as a source of project links
<headius>
sequel is failing on master because the nokogiri update now includes a library that only works on Java 11+
<headius>
not sure if that has been followed up on but Jeremy mentioned it somewhere
subbu_ has quit [Quit: Leaving]
subbu_ has joined #jruby
subbu_ has quit [Quit: Leaving]
subbu has joined #jruby
<enebo[m]>
I have the commit which regressed joni. I am asking lopex about it
subbu has quit [Ping timeout: 252 seconds]
<headius>
ok
<headius>
rails seems to work ok with new psych
<enebo[m]>
you could add .json onto url but I suppose the simple data will not be the issue
<enebo[m]>
jruby -e 'not_between 1...3'
<enebo[m]>
I got a repro on the errant warning
<headius>
seems like some skipped or excluded tests in psych repo now pass on new snakeyaml
<enebo[m]>
parser seems to see range ... as forward only in the case of warning :)
<headius>
nice
<enebo[m]>
oh well that's cool
<enebo[m]>
I believe the joni commit can be reverted. It appears to tweak next loc to check so I think it was meant to check less but lopex no doubt will grok it a lot better
<headius>
well that's good
<enebo[m]>
the regression is 100x slower
subbu has joined #jruby
subbu has quit [Read error: Connection reset by peer]