sagax has joined #jruby
<kares[m]> enebo: that I actually recall we had JavaSupport no Impl but back the day truffle folks needed a base "interface" and ended up with the abstract class
<kares[m]> the JavaSupportImpl is redundant for JRuby at this point
<kares[m]> * enebo: that I actually recall - there used to be JavaSupport no Impl but back the day truffle folks needed a base "interface" and ended up with the abstract class
<headius> good to know
<headius> back from dental work... face is numb but I will get some work done 👍
<headius> ok now that I'm back to work I will look into these JIT failures during specs
<headius> I thought we used to be green in JIT mode but these may all be the same regression
<enebo[m]> kares: ah ok yeah so maybe we can kill impl
<headius> edipo.federle: are you still looking into
drbobbeaty has quit [Quit: Textual IRC Client:]
<headius> last few comments describe the actual bug here... basically because we will immediately JIT the block but still run it in interp once, we end up with two different values for a `//o` regexp if the block jits on its own
<enebo[m]> looks like in #6748. A raw append caused a regression on inspect for Rubyarray. It used to rbStrCat which would have promoted an encoding change
<headius> I'm not sure how to fix this but it is a very narrow set of conditions to trigger it
<enebo[m]> heh
<enebo[m]> So DRegex makes two forms startup and JIT and neither will pass anything to each other so they both make the same regexp
<enebo[m]> This is not normally an issue with regexps since they are made eagerly during AST creation
<headius> right
<headius> and it is not an issue for method bodies because if we execute once in interp we will see the regexp and use it in JIT
<headius> the problem is that the block executes in interp AFTER we jit
<enebo[m]> In 3.0 there are LOTS of all sites are unique issues and this one was one I did not look at
<enebo[m]> but this shows up as a test
<headius> (this is ignoring a lot of concurrency issues that are up in the air because CRuby has not responded to an issue I filed)
<headius> i.e. what should happen if two pieces of code run the same `//o` regexp at the same time
<enebo[m]> yeah I mean we can assume this is not deterministic in that case
<enebo[m]> how could it be?
<headius> this block case is just in a single thread though, just a weird side effect of how we always execute an unjitted block once before transitioning to jit
<headius> I suppose we could defer block jit until after the interpreted call returns
<headius> that would better reflect when it is actually enabled
<enebo[m]> headius: what was in your blah.rb in that comment?
<enebo[m]> I see that spec but I am confused why that would execute with threshold=0
<enebo[m]> from non-JIT
<enebo[m]> I am assuming it involves the snippet from the spec but what was in there exactly
<headius> the regexp executes twice in a 2.times loop
<enebo[m]> fwiw I somewhat doublt threshold=0 is the only case
<headius> with threshold=0 it will jit right away but because of how we transition to jitted block we still run it in interpreter once
<headius> pretty sure it is
<enebo[m]> if we ran startup 10 times and JIT went on 11th wouldn't we still make a new regexp?
<headius> threshold=1 might also be an issue but anything higher would have the `//o` regexp already
<headius> no, because of the fix I link there
<headius> I modified dregexp compilation to act like a normal static regexp if is is `//o` and already executed once
<headius> but in this case jit happens before interp and it does not see that it will be populated
<headius> if threshold=1 then we know interp will run at least once before we jit so it will see the already-compiled regexp
<headius> if threshold=0 and it is a method body we go straight into JIT
<headius> oh and my blah.rb there is just the body of the spec printing out the regexp
<enebo[m]> alright this is because JIT makes regexp at/before the time the interp instr makes it so it cannot grab it from the instr
<headius> right
<headius> and because we will still interp the block at least for the first time, we get two values
<enebo[m]> we need a promise added :)
<headius> yeah that is basically what we ant
<headius> want
<headius> a promise we can pass into the jitted code somehow
<enebo[m]> but we only need it for this very odd case
<headius> but since we can't do that I think deferring jit until after the first interp call returns would work?
<enebo[m]> assuming it does
<headius> it is a bit weird but we just move the tryJit to a finally rather than pre-call
<enebo[m]> but I suppose it may not matter in that case
<headius> well returns or raises
<enebo[m]> either case it executes
<headius> either way we don't trigger jit until we know we have done that first interp
<enebo[m]> but perhaps not to the point of hitting the regexp
<enebo[m]> which makes this a little weirder
<headius> also true
<enebo[m]> AtomicCallSite
<headius> short of having a global lookup table of `//o` regexp I don't have a better solution
<enebo[m]> yeah we could make a hack just for dregexps based on position for //o
<enebo[m]> ordering is not an issue though is it? I would think not. This is merely that bytecode generation is happening in parallel with execution
<enebo[m]> JIT generated block will never naturally get called before interp can hit the same point
<enebo[m]> "naturally"
<enebo[m]> I say this because interp could run something which never (or takes a long time) to get to first regexp and the JIT version would get called through some other mechanism a second time and then use JIT which would maybe have different captured values or passed values
<enebo[m]> but in that case I suppose it goes back to your original unanswered question of determinancy
<headius> yeah I am not trying to handle the concurrent execution cases at all here, that is a much more tangled issue
<enebo[m]> I don't even think it is a valid issue personally
<headius> this case is just unfortunate ordering of jit + interp execute + jit execute
<enebo[m]> If you have n threads calling something //o you cannot lock down which one wins without killing the reason you would have that situation
<enebo[m]> I think most programmers using //o in that case would get it and then change how they write their code
<headius> I guess another area this could be improved is modifying block jit to execute the jitted code right away, but I think we can't do that because the surrounding scope context may not be right?
<headius> I don't recall what the problem actually is though
<enebo[m]> well it definitely may still be executing interp already
<enebo[m]> we will make a JIT version back to a hard scope (e.g. method) but it may never execute that new version and just stay in the old interp'd one (think loop do)
<headius> right
<headius> in this case it never re-enters the containing scope but it executes fine
<enebo[m]> Simplest idea so far is to just make deregexp/o special and have some safe lookup table
<headius> so I was trying to remember when it would NOT be fine to just go straight into a jitted block from an interpreted container scope
<headius> I think simplest idea is to move the jit edge to after block interp
<headius> that does fix this case
<enebo[m]> but then sync compile
<headius> ignore concurrency issues
<enebo[m]> so we lose some small up front cost but this is very uncommon
<headius> not trying to fix those
<enebo[m]> does this actually work?
<headius> this will still just submit the jit to a background thread normally, but under threshold=0 we do not background jit so this will be ready for the next call in the same thread
<headius> yes, it works
<enebo[m]> I thought we submitted this to compile threads and it then would race until next call
<enebo[m]> oh wait never mind
<enebo[m]> I see why it would work
<headius> I modified background jit to not background when threshold=0
<headius> since the intent is clearly to jit immediately
<enebo[m]> but I guess all methods would lose the ability to concurrently compile while interp version is still running
<enebo[m]> but your fix would not need to even disable I think because the regexp would have a valud in the instr
<headius> we do some sync around that, like swapping in the jitted method only once
<headius> yeah that is exactly it
<headius> just deferring the attempt to jit until interp has run at least once
<enebo[m]> yeah ok .. sorry I was not tracking why you said you modified to disable conc JITTing
<headius> so we know `//o` will have been encountered if they would be encountered
<headius> the real problem was doing jit and THEN still interpreting once before transition
<enebo[m]> yeah
<enebo[m]> well I guess I am having a hard time seeing what would break from this
<headius> yeah it seems pretty benign
<enebo[m]> if it only happens =0 and it is also synch then that case definitely has nothing bad happening
<headius> it also makes it a lot clearer that we will not actually jit a block until it executes once
<enebo[m]> non-zero could mean you may not ever actually JIT
<headius> unless the method surrounding it jits first
<enebo[m]> but with that said that would be fine
<enebo[m]> in that case you would only call the block once
<headius> yeah
<enebo[m]> JITd things will start a bit later (or never if the scope never stops executing in interp) but we will reduce concurrency considerations to some degree
<enebo[m]> with threshold > 0 we will conc compile but we will start later so possibly we may enter a second interp'd call to the same thing because JIT thread(s) have not replaced things yet
<headius> at some point maybe we want a concurrency-safe atomic lookup table for these but I raised the issue with CRuby and they have no answer
<enebo[m]> (and that can already happen anyways but it does not add in first interp run as part of the time)
<headius> concurrent jit could absolutely fail to see a `//o` if the interp doesn't hit it or is still executing during jit
<headius> so yeah this helps that case too by deferring jit until the first call has completed
<enebo[m]> yeah I think the bonus of not JIT'ing if it never returns is very weird property if you method count. Not like most methods only execute 25 times and never return on the 25th call :)
<headius> yeah I didn't think of that case but it is a possibility
<headius> so if none of the calls to the block ever return, jit will never be submitted
<headius> clearly I would like this best if we did not have to continue interpreting an unjitted block
<headius> I am still not sure what breaks
<enebo[m]> I think it might be worth examining warmup for a larger app but I don't think this will have any cost.
<enebo[m]> I am sure we could make a case where it does but I am just thinking in general the methods we JIT are just called a lot and are not massive or something like that
<headius> I will repush ahorek's jit spec run as part of a series of fixes PR
<headius> enebo: if you look at my comment in that PR there's another case that we can't fix without a global lookup
<headius> specifically, any case that expects a regexp object to produce the same physical object every time
<headius> since jit and interp will do their own compile step the first time
<enebo[m]> yeah
<enebo[m]> although in this case I am surprised
<headius> this is not just regexp either.. things like floats and frozen strings may not always be the same identical object either
<enebo[m]> Ok I have a different solution for this one
<enebo[m]> I believe we really more or less make a regexp during parse to validate it
<enebo[m]> for non dregexp we should just eagerly make it
<headius> might be able to do that
<headius> I remember something about encoding changing at runtime could cause problems though
<headius> like if we parsed the script with one encoding and then someone changed the internal encoding it should do something different on first encountering the regexp
<headius> but that is wack
<enebo[m]> That also would be prone to when the method was called
<enebo[m]> internal_encoding that is
<headius> yes
<enebo[m]> So I don't think we can do much about that and whjo knows maybe MRI will change the same regexp in that case
<headius> it might indeed, I don't know
<headius> and we might actually handle that right alread
<headius> already
<headius> I remember debating this with brixen or evanphxat some point
<enebo[m]> RubyRegexp.preprocessCheck(runtime, value);
<enebo[m]> RubyRegexp.newRegexpParser(runtime, value, (RegexpOptions)options.clone());
<enebo[m]> one of these two is damn near just making the regexp
<headius> yeah
<enebo[m]> to the point where I believe we can just literally make it since we know it is valid
<enebo[m]> we have a precedent for this with symbol too
<headius> we could also just pre-cache it and use it if it is still valid, otherwise recompiling it
<headius> that would cover 99% of cases even if the weird encoding flippers see the weird behavior
<enebo[m]> yeah I guess I don't know what the edge case with respect ot encodings
<headius> I don't have to fix tainting do I?
<enebo[m]> haha
<enebo[m]> no
<headius> we are basically just going to remove it in 9.4
<enebo[m]> yeah I don't think tainting is even in 3 is it?
<enebo[m]> yeah I think it was iced in 2.7 (at least as far as being used)
<headius> $ rvm ruby-3.0.1 do ruby -e 'x = "foo"; x.taint; p x.tainted?'
<headius> false
<headius> noop
<headius> -e:1: warning: Object#taint is deprecated and will be removed in Ruby 3.2
<headius> -e:1: warning: Object#tainted? is deprecated and will be removed in Ruby 3.2
<headius> deprecated since 2.7 I think
<headius> confirmed
<headius> so we will noop in 9.4 like cruby does
<headius> I assume you will not object if I just tag off taint specs that fail only with jit threshold=0
<headius> it's probably the same issue... string or regexp should remain tainted but we transition and create a new one
<headius> yeah this is not something I can fix right now
<headius> that block will always run once in interp and once in jit so it will make two regexp
<headius> your fix would work though
<headius> actually no it wouldn't because I can't supply the jit code with a regexp object
<headius> enebo: don't spin your wheels on that too much... this can't be fixed without a global table of literal regexp
<headius> unless you want to set up a global table of literal regexp 😀
drbobbeaty has joined #jruby
drbobbeaty has quit [Ping timeout: 250 seconds]