<mattpatt[m]> @headius: switching it to base off 9.3 fixed the mri:int failures. I added the Rubyspec spec:ruby:fast task and now it's failing weirdly. Bumping to v2 of the setup-java action (which will do Maven caching for you) has broken everything.
<mattpatt[m]> So, reasonable progress then... 🥳
<mattpatt[m]> oooh, not the v2 setup-java action, just attempting to 'unset' gem-related env vars.
<mattpatt[m]> If someone could take a quick look at the failed `spec:ruby:fast` job in https://github.com/fidothe/jruby/actions/runs/1448727436, I'd be interested if these failures are expected or (particularly the cancelled one) something seen before on Travis.
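For context, the setup-java change mentioned above is just a workflow-step tweak; a minimal sketch, assuming a Temurin JDK (the distribution and version here are illustrative, not necessarily what the jruby workflow uses):

```yaml
# Illustrative GHA workflow step; the real jruby workflow may differ.
- uses: actions/setup-java@v2
  with:
    distribution: 'temurin'   # v2 requires an explicit distribution (assumed here)
    java-version: '8'
    cache: 'maven'            # v2's built-in dependency caching for Maven
```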
<enebo[m]> mattpatt: It is pretty likely the 5F,1E is ok. I do see some failures locally in my FC env involving IPv6
<edipofederle[m]> <headius> "edipo.federle: are you still..." <- headius: hi, yes, I plan to finish it this weekend, is that ok?
<enebo[m]> edipo.federle: it is fine
<mattpatt[m]> @enebo the Etc.getlogin failure is in a spec with Travis-specific code so I'm not mega surprised about that
<mattpatt[m]> enebo: the one that worries me is the 'cancelled' JDK 8 spec-ruby-fast job, because it cancelled itself
<mattpatt[m]> not sure if it did because the JDK 11 one failed
<mattpatt[m]> or because of something I caused
<mattpatt[m]> it seems to happen just after the Etc.getlogin failure
<enebo[m]> mattpatt: oh it cancelled itself...I thought you did somehow
<enebo[m]> I don't know that much about GHA and the times I have used it my stuff was all green to begin with
<enebo[m]> 5:46 for Java 8 and a tiny bit less on Java 11 for those sections, so it seemed like 8 was either done without output or very close
<mattpatt[m]> enebo: I think the main problem we'll have is that Travis' runners were very full-fat Linux machines, and it looks like the GHA ones are much more stripped down, so there'll be a lot of places where implicit dependencies will bite us because they've vanished
<enebo[m]> but do you think it is possible we went over some resource limit?
<enebo[m]> that cancelled job was running like 20s longer than the other one
<mattpatt[m]> the machines have 7GB RAM, and the auto-kill timeouts are measured in hours
<enebo[m]> heh ok
<mattpatt[m]> it's weird
<enebo[m]> yeah so the other theory is that perhaps one job failing led to cancelling the other? I have not seen that with the other GHA things we are running
<enebo[m]> Although it is possible the ones we fail on fail after the others all finish
<enebo[m]> I can refire your PR run right? Let's just re-run and see if we get the same result
<mattpatt[m]> the other jobs explicitly list `fail-fast: false` in their strategy section
<enebo[m]> the theory that one killed the other still makes the most sense to me since they are in their own matrix (says the guy who knows almost nothing about GHA)
<enebo[m]> Another test would be to not put those two jobs in the same matrix and see if they both then complete
<mattpatt[m]> Almost certainly unrelated: The Travis setup also had redis-server running. What needs that? A quick search in the code for redis turns nothing up
<enebo[m]> haha
<mattpatt[m]> There were some issues related to socket handling for redis, but I couldn't connect the dots
<enebo[m]> My only substantial experience with GHA was setting up 3 OS builds of the jruby-launcher rust port
<enebo[m]> I found myself cloning lots of crap in GHA recipeland until it all worked
<enebo[m]> so is that just because we pick an existing image?
<enebo[m]> ruby-build perhaps needs it for other things so they just include it
<enebo[m]> err I guess we don't use ruby-build although I suppose that makes sense since we are a java project
<enebo[m]> No redis in default image which I think this page lists what should be in it
<mattpatt[m]> This was in Travis, not GHA - it's just listed as a service to install and run at the bottom of .travis.yml
<enebo[m]> oh
<enebo[m]> ok. That might have been for -Ptest, where I think we used to run (or one job may still have run) some app server and integration tests
<enebo[m]> Actually let me see what phase it was. mkristian used to run lots of integration with stuff
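For reference, enabling a service like this on Travis is a short stanza in .travis.yml (illustrative fragment only, not the whole file):

```yaml
# Illustrative .travis.yml fragment; Travis starts the listed service on the build VM.
services:
  - redis-server
```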
<mattpatt[m]> aha, got the cancelled job thing:
<mattpatt[m]> jobs.<job_id>.strategy.fail-fast
<mattpatt[m]> "When set to true, GitHub cancels all in-progress jobs if any matrix job fails. Default: true"
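A minimal sketch of the fix implied here, assuming the two spec jobs share one matrix (job name, JDK versions, and steps are illustrative; the real build/spec commands are elided):

```yaml
# Illustrative matrix job; names are assumptions, real build steps are elided.
jobs:
  spec-ruby-fast:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false            # default is true: one failing matrix job cancels the rest
      matrix:
        java-version: ['8', '11']
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v2
        with:
          distribution: 'temurin'
          java-version: ${{ matrix.java-version }}
          cache: 'maven'
      - run: jruby -S rake spec:ruby:fast   # placeholder for the real build + spec commands
```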
<enebo[m]> nice!
<enebo[m]> mattpatt: so if that is out of the way I guess I can try and figure out why we fail some of these tests. I get a few IPv6 errors locally so I guess I even have a starting point
<enebo[m]> fwiw I think we can probably just tag these out for now since it has been this way for at least a couple of years
<enebo[m]> Nothing new is broken
<enebo[m]> the getlogin error is new but that may just be a bad test assuming the env will act a particular way
<mattpatt[m]> If you like, I can move all the jobs over and then we can triage expected vs unexpected failures and take it from there
<mattpatt[m]> my main worry was if new and exciting things were failing
<mattpatt[m]> which would suggest bigger problems with the differences between the Travis and GHA stacks
<enebo[m]> yeah so far I think travis vs GHA may just be some env differences and we will have to triage those
<mattpatt[m]> There's a Travis job called 'MRI core jit' that runs `jruby -S rake test:mri:core:fullint`, and 'MRI core jit jdk11' that runs `jruby -S rake test:mri:core:jit`. Is the 'MRI core jit' job just badly named?
<mattpatt[m]> Or is it badly named and in need of a JDK 8 job that runs `jruby -S rake test:mri:core:jit` too?
<enebo[m]> mattpatt: it is just misnamed: `:fullint => ["-X-C", "-Xjit.threshold=0", "-Xjit.background=false"]`
<enebo[m]> `-X-C` means interpreted and `threshold=0` means it will go from startup interpreter to full interpreter the first time it is called
<enebo[m]> HAHA this is pretty esoteric. The label is just wrong. "MRI core full interp" would be a better name
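If both Travis jobs come over to GHA, a hedged sketch of clearer names wired to the same rake targets (job names, JDK pairings, and setup are assumptions; checkout/JDK setup steps are elided):

```yaml
# Illustrative jobs only; the real workflow layout may differ.
jobs:
  mri-core-full-interp:           # the job Travis mislabels "MRI core jit"
    runs-on: ubuntu-latest
    steps:
      - run: jruby -S rake test:mri:core:fullint   # -X-C + jit.threshold=0: full interpreter only
  mri-core-jit-jdk11:
    runs-on: ubuntu-latest
    steps:
      - run: jruby -S rake test:mri:core:jit
```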
<enebo[m]> mattpatt: thanks for helping out with this
<mattpatt[m]> enebo: wish it luck, i'm off to bed :-) https://github.com/fidothe/jruby/actions/runs/1450673803
<enebo[m]> wowzers!
<enebo[m]> do org accounts have limits per month?
<mattpatt[m]> lots of scope for refactoring once the footgun errors are removed
<enebo[m]> mattpatt: It will be cool to see this run
<mattpatt[m]> sorry, bad pasteboard
<mattpatt[m]> TL;DR, probably not for a public repo
<headius> nice... FWIW this is the only way to get parallel execution, even though it bloats up the list of checks a ton
<headius> I wish you could get parallel jobs without adding a check entry