sagax has joined #jruby
<MattWelke[m]> I'm having some trouble dockerizing a test JRuby app. I'm going for the modern approach where you have multi-stage builds: you use the first stage to build your runtime and your app, then copy just what's needed to run your app into the final stage.
<MattWelke[m]> This is the app, which I've got working when I run it on my machine (using sdkman! to get Java and rbenv to get JRuby): https://github.com/mattwelke/jruby-javalin-example
<MattWelke[m]> The error I get when I try to run my image as a container is:... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/ef08f6eb9df65f7533c0f37d7cfd688ae66586cb)
<MattWelke[m]> I'm not surprised by this error because I'm not copying in the jars that jbundle downloaded for me. I can't figure out where jbundler put them though in my `build-ruby` image. On my machine, jbundler puts jars into `$HOME/.m2`, like its docs say it would. But I'm not using Maven in my builder image, as far as I know. Either way, the builder image has no `HOME` env var. I tried adding some `ls` commands in my Dockerfile to try to look around for the jars as the build executes, but had no luck.
<MattWelke[m]> I'm using Java 17 for one of the builders but Java 11 to fetch my dependencies. This is intentional. I noticed when using JRuby on my local machine that jbundler appears not to support anything newer than Java 11. But I want to use Java 17 to run my app for performance reasons. My understanding is that a newer JVM can run class files compiled for older Java versions, since the JVM is backwards compatible with older class files.
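(For orientation, a minimal sketch of the multi-stage layout being described, assuming the official jruby Docker images and jbundler's `jbundle` CLI. The image tags, paths, and `app.rb` entry point are assumptions, not the contents of the linked repo's Dockerfile.)
```
# build stage: resolve gems and jars (tags/versions are assumptions)
FROM jruby:9.3-jdk11 AS build
WORKDIR /app
COPY Gemfile Gemfile.lock Jarfile Jarfile.lock ./
RUN gem install bundler -v 1.17.3 && gem install jbundler && \
    bundle install && jbundle install
COPY . .

# runtime stage: jbundler resolves jars into the local Maven repository,
# so the final image needs ~/.m2 (or vendored jars) alongside the app
FROM jruby:9.3-jdk17
WORKDIR /app
COPY --from=build /app /app
COPY --from=build /root/.m2 /root/.m2
CMD ["jruby", "app.rb"]
```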
<MattWelke[m]> A few things that might look odd, which I'm open to feedback on. I might be missing important things.
<MattWelke[m]> I'm using bundler version 1.17.3 instead of the most recent version. This is also intentional because I noticed when trying to use the most recent version that jbundler didn't work. I commented about this on https://github.com/mkristian/jbundler/issues/90.
<MattWelke[m]> I'm copying in the Gemfile and Jarfile (and their lock files) from the build context into the runtime image. That's because I noticed a different error when I tried to run the app before doing this. I forgot to write down that error, but I googled around and learned that jbundler needs to be around at runtime in order to set up the classpath for the running app.
rcrews[m] has joined #jruby
<kares[m]> Hey Matt, believe JBundler is effectively dead by now and has been replaced by jar-dependencies (part of JRuby).
<kares[m]> jar-dependencies either loads jars from the Maven repo, or you can vendor the jars with your app/gem.
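(For the record, a rough sketch of those two modes following jar-dependencies' README conventions; the gem name and the guava coordinates are just examples, and `Jars::Installer.vendor_jars!` is the vendoring entry point as documented there.)
```
# my_gem.gemspec -- declare the jar; jar-dependencies resolves it from Maven
Gem::Specification.new do |s|
  s.name    = 'my_gem'
  s.version = '0.1.0'
  s.requirements << "jar 'com.google.guava:guava', '31.0.1-jre'"
  s.add_runtime_dependency 'jar-dependencies'
end

# Rakefile -- vendor the resolved jars into the gem/app for shipping
task :vendor_jars do
  require 'jars/installer'
  Jars::Installer.vendor_jars!
end

# at runtime, jar-dependencies' require_jar puts a declared jar on the classpath:
#   require_jar 'com.google.guava', 'guava', '31.0.1-jre'
```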
<MattWelke[m]> Thanks! I found both jbundler and jar-dependencies when looking around. I couldn't find a clear answer about what each one was for but I saw jar-dependencies mentioned in jbundler's repo, so I concluded that the relationship between them was that jbundler worked at a higher level and somehow leveraged jar-dependencies.
<MattWelke[m]> I'll look more into jar-dependencies next time I hack on this.
<MattWelke[m]> Just checked again. I had it backwards. jbundler was mentioned in the jar-dependencies readme. But the reason I directed my attention to jbundler instead was because the jar-dependencies readme begins by talking about gemspecs and bundling gems. That's not what I was looking for. I was looking for something to help me manage the Java dependencies my app has. In my little test app, that's Javalin.
<MattWelke[m]> But the jbundler readme jumps right into talking about Jarfile and Jarfile.lock. It immediately felt to me like it was a tool meant to track and download jars you're interested in, like how one would use Maven or Gradle in JVM projects. That's what drew me to it.
<kares[m]> Honestly I haven't used JBundler that much but there was a way to lock down jars - would have expected a way to vendor them as well, again I am not sure and would need to play with things a bit. But I am sure jar-dependencies has the vendoring functionality.
drbobbeaty has quit [Ping timeout: 246 seconds]
drbobbeaty has joined #jruby
<boc_tothefuture[> headius: are you the right person for my icon question above? I did search again and couldn't find anything.
<enebo[m]> boc_tothefuture: You can use the logo. It was made for our use when we worked at Engine Yard, and it was made specifically so as not to run afoul of the use of Java's Duke
<boc_tothefuture[> Thanks!
richbridger has quit [Remote host closed the connection]
richbridger has joined #jruby
<headius> Good morning!
<headius> back in the saddle
<kares[m]> Good afternoon!
<headius> boc_tothefuture: hopefully you also found jruby/collateral which has the vector originals for the various logos and logotypes
<headius> anewbhav: obviously startup time is a known issue but there are some ways to mitigate... maybe you can describe your setup a bit for us?
<headius> We still support "drip" which is a JVM preloader that has helped some folks. You might also look at "thein" (I think that's the name) which is a JRuby-compatible preloader for Rails
<headius> Matt Welke: The maven+jruby situation is kinda all over the place and we are seeking to take over the related gems so we can get that straightened out. That whole ecosystem was maintained by someone who is no longer active in JRuby, so it is a bit confusing and undermaintained right now
<headius> related gems/artifacts... there's a combination of Ruby and Java libraries that support JRuby+Maven (several of which we use to build and test JRuby itself)
<boc_tothefuture[> @headius: I hadn't.. but that is perfect.
<headius> enebo: 9.3.2
<enebo[m]> yeah
<enebo[m]> we cannot consider it until we get GHA worked out but that is pretty close
<headius> I think we were close two weeks ago, what is outstanding now?
<headius> ah that is a good call
<headius> I will look at mattpatt PR today
<enebo[m]> I also just opened an issue with strftime perf improvements
<headius> noice
<enebo[m]> I would like to merge it but not until we have GHA
<enebo[m]> strftime for the patterns a logger uses is 3.3x faster
<enebo[m]> but I imagine that will generally translate to most format strings
<enebo[m]> It is a fairly complicated format so perhaps it gets faster the longer it is but it is all pretty linear
<enebo[m]> Once we get a callsite cache then this can be made faster again too
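(The branch itself isn't shown here, but a measurement along those lines could look like this with benchmark-ips; the timestamp pattern is an assumption, not the one from the PR.)
```
require 'benchmark/ips'   # gem install benchmark-ips

t = Time.now
Benchmark.ips do |x|
  # a typical logging timestamp pattern
  x.report('strftime') { t.strftime('%Y-%m-%d %H:%M:%S.%L %z') }
end
```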
<headius> yeah for sure
<headius> looking at GHA PR and my first thought is about all these jobs
<enebo[m]> HAHAH
<enebo[m]> yeah it is a lot when you see it in this UI
<headius> at least they scroll now (for a while they just made the "checks" box gigantic) but there's sure a lot of them
lopex[m] has joined #jruby
<lopex[m]> heh I guess you already know about this https://openjdk.java.net/jeps/8277131
<enebo[m]> some of these can be grouped, like all the spec/ruby ones, but they need to toggle the group to not cancel on first fail
<headius> lopex: I did not know about the JEP but I knew this was coming
<headius> this is a fairly natural progression from coroutines
<lopex[m]> so internal green scheduler ?
<enebo[m]> That seems to just be a config. For me though I am ok with refining this and just getting something showing reasonable results first
<headius> if we could get some time to play with experimental stuff we could have true fibers within a week
<lopex[m]> M:N like BEAM ?
<headius> I believe it's more like a fiber scheduling system based on the coroutine work in loom
<lopex[m]> so cooperative..
<headius> yeah mostly that I guess... I am behind on it but they have taken care to make coroutines live well with blocking calls and locks and such so I guess there could be scheduler intervention at those points too
<headius> if we're lucky they are basically implementing the new Ruby 3 scheduler model and we'll be able to just plug it in
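(For context, the Ruby 3 model being referred to: a thread installs a scheduler object and Ruby calls its hooks instead of blocking. A minimal sketch of the interface, with hook names from Ruby 3.0's Fiber scheduler documentation; the class name is purely hypothetical and the hook bodies are elided.)
```
class LoomBackedScheduler
  def io_wait(io, events, timeout); end    # called where IO would block
  def kernel_sleep(duration = nil); end    # Kernel#sleep lands here
  def block(blocker, timeout = nil); end   # mutex/queue waits
  def unblock(blocker, fiber); end         # wake a blocked fiber
  def close; end                           # drain remaining work at thread exit
end

Fiber.set_scheduler(LoomBackedScheduler.new)
Fiber.schedule { puts "runs as a non-blocking fiber under the scheduler" }
```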
<headius> we have also been tossing around thoughts about what JDK level to go to when we drop 8
<enebo[m]> "The scheduler for virtual threads is a work stealing ForkJoinPool, that works in first-on-first-out (async) mode, and with parallelism set to the number of available processors."
<headius> clearly it won't be 9, but if we do it next year could it be 17?
<lopex[m]> 17 is LTS I guess ?
<enebo[m]> Feels like this will mostly just allow a lot of people to stop handrolling this
<headius> enebo: yeah so probably work-stealing at blocking-call points due to coroutine work
<headius> it already will work steal on subjob boundaries but this would expand it into fiber yields etc
<enebo[m]> perhaps some magic on seeing blocking beneath java?
<headius> they already have hooks to know when you are going to do a blocking IO or lock so at that point they can deschedule the current coroutine and run another one for a while
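(From JRuby that could eventually look something like this through Java integration. This is a sketch assuming a Loom-enabled JDK; the factory method name comes from the Loom previews and could still change.)
```
# each submitted task runs on a virtual thread; blocking calls like sleep
# unmount it so the carrier (platform) thread can run other tasks
executor = java.util.concurrent.Executors.newVirtualThreadPerTaskExecutor

10_000.times do |i|
  executor.submit { sleep 0.1; puts i }
end
executor.shutdown
```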
<enebo[m]> yeah
<enebo[m]> The evolution of Java has been interesting to watch
<lopex[m]> and I guess it will affect jit as well right ? since similar things arise like biasing etc
<enebo[m]> It started where nothing could ever change to build up its reliability across a large number of vendors
<enebo[m]> Then when externally pressured they realized they needed to improve things quicker
<enebo[m]> Now the firehose is wide open
<headius> it is funny and sad that I/we told them how useful this would be over a decade ago at JVMLS and I was shot down by none other than Doug Lea who tried to tell me that thread switching wasn't that bad and nobody needs more than 1000 concurrent jobs
<headius> thanks Doug
<enebo[m]> yeah rear guard was defeated by the time they switched release strides
<enebo[m]> Honestly adding new stuff under the covers is a much simpler sell than externally visible features
<enebo[m]> but people have been wishing for externally visible features more and they are providing them
<headius> we could see some interesting use cases if we got our fibers hooked up to Loom
<enebo[m]> This is sort of the shark deciding it had to start swimming, but the difference between this and Ruby is that people actually wanted these features for over a decade
<headius> yeah and were told they didn't really want them
<enebo[m]> yeah from google employees :) /ducks
<headius> so back to GHA...
<headius> this is promising already... mri:core passes along with most of the smaller suites
<enebo[m]> getlogin is not working on the image so I think we tag it
<headius> I will look into the failing suites and work with mattpatt to tweak them, or else we merge and just fix on branch
<enebo[m]> we have 6 other failures in spec/ruby which I actually see on fedora core as well but they are ipv6 or socket constants. We should probably tag all of these since they were never working
<enebo[m]> spec:ji has an issue with no 'touch' (which seems simple to work around) but there might be another issue in there since a bunch of stuff is giving unhelpful error messages. I will look; maybe it is not having javac or something
<headius> JI specs failing is a bit surprising
<headius> aha
<enebo[m]> it could all just be touch but that is the easy error
<enebo[m]> errors above those errors are mysterious so I figured something we compile is not happening which leads to weird errors...it is just a guess though
<headius> how can touch not exist
<enebo[m]> on a stripped down linux instance?
<enebo[m]> I find it funny it is not there but it is pretty easy to see how it happens
<headius> well are these really stripped down? it is a basic POSIX utility
<headius> it would be like not having ls
<enebo[m]> Is it POSIX?
<headius> it is
<enebo[m]> Even if it wasn't it is GNU coreutils
<headius> yeah something else might be causing this
<headius> weird path or something
<headius> /usr/local/bin/touch or something
<headius> but that would still be really weird
<enebo[m]> yeah I thought of that but why would touch not be in the same place as cp and rm (unless those fail too)
<headius> TestSocketAddrinfo#test_addrinfo_new_inet [/home/runner/work/jruby/jruby/test/mri/socket/test_addrinfo.rb:90]:
<headius> <[46102, "127.0.0.2"]> expected but was
<headius> <[46102, "127.0.0.1"]>.
<headius> this is clearly a GHA ifconfig issue
<enebo[m]> ah yeah I forgot about that
<headius> that's the only one failing in mri:stdlib
<enebo[m]> yeah that is a bad MRI test
<enebo[m]> err no it isnt
<enebo[m]> ai = Addrinfo.new(["AF_INET", 46102, "localhost.localdomain", "127.0.0.2"])
<enebo[m]> assert_equal([46102, "127.0.0.2"], Socket.unpack_sockaddr_in(ai))
<headius> oh hmm
<enebo[m]> I thought it would be based on how much they hard-code 127.0.0.1 here
<headius> why would that run differently here
<enebo[m]> are we reverse looking up something based on localhost.localdomain?
<headius> that seems likely
<headius> well or at least a good possibility
<headius> but I have no 127.0.0.2 locally and it works
<enebo[m]> return RubyArray.newArray(runtime, addrinfo.ip_port(context),
<enebo[m]> addrinfo.ip_address(context));
<headius> hah
<enebo[m]> ip_address is the thing
<headius> yeah it goes back to the address and asks for its address
<headius> which may resolve back to 127.0.0.1 in this env
<enebo[m]> yeah and this dips into Java impl of this
<enebo[m]> heh it is possible AddrInfo changes this before that point in initialize too
<headius> looks like it from the JDK source
<headius> I don't see anything weird in the "get" methods
<headius> this passes for you on fedora?
<enebo[m]> I get no failures locally with test:mri
<headius> test:mri:stdlib is a subset, but I think included in test:mri
<enebo[m]> I get the 6 we see in spec/ruby but those I always thought were some env issue with ipv6
<enebo[m]> Let me try running locally
<enebo[m]> I thought that was the last part of doing spec:mri
<headius> task mri: ['test:mri:core:int', 'test:mri:extra:int', 'test:mri:stdlib:int']
<headius> so yeah it should run
<enebo[m]> yeah I just ran through these on my new_strftime branch before landing since I will not get CI
<headius> reading through the code nothing jumps out so it would help to be able to step through this and see where the address goes wrong
<enebo[m]> GHA is basically just a container thing right?
<enebo[m]> Can we run them locally?
<headius> ubuntu 20.04
<headius> probably
<enebo[m]> but it has to be "some" ubuntu 20.04 image since it does not seem to have touch or getlogin
<headius> I have 20.04 running on docker locally, will try to repro
<headius> I always forget to install jdk headless
<headius> of course it works fine
<headius> I will try mri:stdlib run on this container and hope it fails
<headius> huh that one didn't fail but the ipv6 version did
<headius> TestSocketAddrinfo#test_addrinfo_new_inet6 [/jruby/test/mri/socket/test_addrinfo.rb:520]:
<headius> <[42304, "::1"]> expected but was
<headius> <[42304, "127.0.0.1"]>.
<headius> so something along that path is re-resolving this back to 127.0.0.1
<headius> I can repro at command line
<headius> I think I see it
<headius> if nodename is given it prefers that for getting the InetAddress
<headius> # bin/jruby -rsocket -e 'p Socket.unpack_sockaddr_in(Addrinfo.new(["AF_INET6", 42304]))'
<headius> ConcurrencyError: Detected invalid array contents due to unsynchronized modifications with concurrent users
<headius> woah
<headius> not bounds-checking the incoming array at all
<enebo[m]> heh
<headius> I may have a fix
<headius> basically when it has both a host and an address it needs to use the InetAddress.getByAddress form
<headius> so it gets the requested hostname with the requested address rather than looking up the hostname
<headius> it was basically ignoring the address if you give a hostname in this form
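(The shape of that fix, sketched via JRuby's Java integration; this illustrates the JDK API difference being described, not the actual patch.)
```
# getByName on a literal IP just parses it, no lookup happens
numeric = java.net.InetAddress.getByName("127.0.0.2")

# getByAddress keeps the caller's hostname AND the caller's address,
# instead of resolving the hostname and discarding the address
addr = java.net.InetAddress.getByAddress("localhost.localdomain", numeric.getAddress)

addr.getHostAddress   # => "127.0.0.2" -- what the test expects back
```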
<headius> I can't repro the ip4 version but that should work
<headius> the rubyspec failures almost look like it is laying out the sockaddr wrong for current platform
<headius> BasicSocket#recvmsg_nonblock using IPv6 using a connected but not bound socket raises Errno::ENOTCONN ERROR
<headius> Expected Errno::ENOTCONN
<headius> but got: NotImplementedError (recvmsg_nonblock is not implemented)
<headius> that is pretty weird
<enebo[m]> headius: looks good to me
<headius> I'm not thrilled about creating another InetAddress to get the address bytes but not sure what logic to use to turn an arbitrary string into appropriate ip4 or ip6 bytes
<headius> it should see this as an IP and not try to resolve so it should be cheap enough
<headius> ```Etc.getlogin.should == `logname`.chomp```
<headius> the related failure in rubyspec probably indicates `logname` is not available either
<enebo[m]> but getlogin was also an issue as well, right?
<headius> hmm this does check if logname can be run
<enebo[m]> Just to show that PR
<headius> nice
<enebo[m]> HAHAHA so logname does exist but getlogin doesn't
<headius> our etc should be using ffi though
<enebo[m]> and touch is not in PATH or there
<headius> 1) Dir globs (Dir.glob and Dir.[]) respects jar content filesystem changes
<headius> Failure/Error: `zip -d #{jar_path} glob_target/bar.txt`
<headius> Errno::ENOENT:
<headius> No such file or directory - zip
<headius> another one
<headius> installing zip and touch would fix JI specs
<enebo[m]> we can write a ruby touch in a few lines
<headius> makes no sense for it not to be there so I still suspect something else is up
<enebo[m]> I would say the same of zip but that I guess depends on what we are doing with zip
<enebo[m]> sure
<headius> looks like this is using it to manipulate jars
<headius> could just be a jar command
<headius> jar uf
<headius> how can touch not be there 😡
<enebo[m]> HAHAHA logname is there
<headius> yeah I am still stumped on that one... mspec has poor language for failures but I think what is failing may actually be the getlogin call
<enebo[m]> I thought it was getlogin returning nil
<headius> hmm or not?
<headius> Etc.getlogin.should == `logname`.chomp
<headius> error is Expected "runner" == ""
<headius> I assume the order is the same so getlogin returned "runner" and logname returned "" or just failed outright
<headius> in the log above
<headius> aha
<headius> ```logname: no login name```
<headius> how
<enebo[m]> I was going to say spec/mspec is better than test/unit in that it makes expected vs actual simpler to write, but I cannot read that error string
<headius> The "logname: no login name" error is due to logname expecting a tty session which ssh normally does not provide and has nothing to do with your script. The -t option forces a tty so logname completes successfully
<headius> rando comment on some linux forum
<enebo[m]> ah yeah
<headius> this would need a patch in rubyspec
<enebo[m]> This happens for other things too
<enebo[m]> Interesting to think no one else is running this spec on GHA
<headius> travis probably set up a pty for their runs and gha doesn't
<enebo[m]> how are they running it?
<headius> hmm good question
<enebo[m]> Or maybe they are running an instance which does set up a pty
<enebo[m]> I would argue the PR would make sense regardless
<enebo[m]> it can still pass without tty/pty
<headius> I don't see anything odd in the rubyspec GHA
<enebo[m]> and they don't tag
<headius> the spec even has special logic for env TRAVIS
<enebo[m]> for MRI
<headius> this spec is bad
<headius> it also has logic that checks if Etc.getlogin returns nil, and if it does, it makes sure that Etc.getlogin returns ENV['USER']
<headius> which will probably never pass
<headius> I have patches for zip and logname -t
<headius> I merged the Addrinfo thing, will push this other stuff shortly
<headius> hmm great, logname -t is a gnuism
<headius> enebo: FileUtils also provides touch so I'll just use that
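(i.e. roughly this in place of shelling out, with the path standing in for the spec's own:)
```
require 'fileutils'
FileUtils.touch('glob_target/bar.txt')   # stdlib equivalent of `touch`
```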
MateuszFryc[m] has joined #jruby
<MateuszFryc[m]> Hi all, I am wondering if some of you could shed some light on an issue I am struggling with: I migrated from JRuby 1.7.2 to JRuby 9.3.1.0. I am sending queries written in Ruby to a Java server, and since the migration the JVM has crashed twice in the last two days with a fatal error during compilation of a method from JRuby. As far as I understand, the crash occurs during OSR compilation by the C2 compiler, with OSR_BCI equal to 408,
<MateuszFryc[m]> and what is peculiar is that each time it crashed, OSR_BCI indicated the 408th byte.
<MateuszFryc[m]> I tried to reproduce this issue, but without a success so far.
<MateuszFryc[m]> Could you please advise how to approach such a problem?
<MateuszFryc[m]> I attach small excerpt from hs_err file: https://pastebin.com/2Ytxw43s
<headius> Congrats on finding a jvm bug
<MateuszFryc[m]> # JRE version: OpenJDK Runtime Environment (17.0+35) (build 17+35-2724)
<MateuszFryc[m]> So it has not much in common with JRuby (the fact that the crash occurs on compilation of parser_magic_comment), I suppose.
<MateuszFryc[m]> because it is peculiar that the crash showed up on compiling this particular method twice. However, other compilations of this method where OSR_BCI was different than 408 finished successfully.
<headius> Usual procedure here is filing an issue with openjdk directly, which I can help with
<headius> There's really nothing we can do in jruby other than provide workarounds like not compiling this particular method
<headius> Any crash in the jvm compiler is entirely a jvm bug and they are usually pretty quick to jump on it
<MateuszFryc[m]> ok, I understand.
<headius> You could try disabling osr
<headius> Sometimes disabling the tiered compiler can help as well
<MateuszFryc[m]> well I have never submitted any bug to openjdk, so it would probably be helpful if you could direct me somehow.
<headius> Via JVM flags
<MateuszFryc[m]> Well, but as I understand it, this kind of configuration can decrease the overall performance of my app.
<headius> You need to be an open jdk member to directly file bugs but you could also post this to the public hotspot compiler list and have the same effect
<headius> Disabling osr is unlikely to affect performance since most methods will compile normally. Osr just allows it to replace code that's running on the stack, like a long loop
<headius> If it can't osr then it will just replace the method for future calls
<headius> If the bug is actually in how it is compiling this method then we'll need to find another option
<MateuszFryc[m]> yeah, I read some articles about this technique. Some long-running loops could underperform then, if I understand correctly.
<headius> Only if they never exit. If they are hot and eventually return then future calls should use the optimized version
<MateuszFryc[m]> But of course it would be worth trying, and checking experimentally whether it has a visible impact on my server.
<MateuszFryc[m]> If the issue is only with compilation of this particular method parser_magic_comment, then most likely I will be able to work around it by excluding this single method from compilation... I guess.
<headius> And if this is failing in C2 using tiered compilation, the worst case is that the method is only compiled by C1 and should still be plenty fast
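(Concretely, the knobs being suggested look roughly like this; JRuby forwards `-J` options to the JVM, and the class/method in the exclude line is a placeholder, since the crash excerpt doesn't name the Java class that owns parser_magic_comment.)
```
jruby -J-XX:-UseOnStackReplacement app.rb    # skip OSR compiles entirely
jruby -J-XX:-TieredCompilation app.rb        # or -J-XX:TieredStopAtLevel=1 for C1 only
jruby '-J-XX:CompileCommand=exclude,some/Class.someMethod' app.rb   # never JIT one method
```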
<headius> Anyway that would be my first recommendation for you to try, and we should raise this up to openjdk folks one way or another
<headius> Especially since this is in a 17 release
<headius> Clearly it would help if we could reproduce but if you know exactly what method is failing that might be enough
<MateuszFryc[m]> Well I tried to reproduce it, but without success; I didn't come across the same compilation (same method (parser_magic_comment) / compilation type OSR / phase 4 / OSR_BCI 408)
<MateuszFryc[m]> I have two hs_err files; perhaps they would be handy?
<headius> For sure
<headius> I'm on mobile for a few minutes but I will have a look at them in a bit
<MateuszFryc[m]> Perhaps a stupid question, because the program which crashes doesn't belong to me. Is there some sensitive information in such an hs_err file which I wouldn't like to leak out? Should I sanitize it somehow, can you advise?
<headius> There definitely could be environment specific information, like file paths. Hard to say what would really be sensitive
<headius> It's just text so have a look at it and see if anything looks risky
<headius> I know they try to keep sensitive information out of there so mostly file paths and call stacks you can't really avoid
<MateuszFryc[m]> what about binary data? from registers etc?
<mattpatt[m]> @headius: will be around for a bit once kid is down if you want to look at GHA stuff
<headius> mattpatt: yeah cool, I will push some of these other fixes and have you rebase or merge and see how we're doing
<headius> I am leaning towards merging before it is green just to make it easier to test small fixes and to get the green jobs running
<MattWelke[m]> Hey headius just wanted to say thanks for being so active here and answering questions, including my own. I'm not using JRuby at work right now so it's mostly a weekend thing for me because I find the project interesting. I'll read up on your answers next time I'm tinkering.
<MattWelke[m]> P.s. hope you're feeling better. Saw the Twitter messages.
<headius> Yeah just a bit tired at this point
<headius> enebo: this login spec also has a fallback to the 'id' command, which is also posix.2 and may not have the same tty issues
<headius> Matt Welke: let us know if you get to the point of running something larger on JRuby, we're happy to help you bootstrap
<enebo[m]> headius: I wonder why id is not used
<headius> yeah not sure
<headius> it could also read LOGNAME env but there are clearly a bunch of possible fallbacks here and no clear way to choose the right one
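(For illustration, the kind of fallback chain being weighed here; a sketch, not the spec's actual code.)
```
require 'etc'

login = Etc.getlogin ||              # nil without a controlling terminal
        ENV['LOGNAME'] ||            # POSIX environment fallbacks
        ENV['USER'] ||
        `id -un 2>/dev/null`.chomp   # `id` is POSIX and needs no tty
```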
<MateuszFryc[m]> this is the hs_err log file from one crash.
<MateuszFryc[m]> if you could help me @headius to send it to the openjdk team, I would be grateful :)
<headius> yup, that's definitely a hotspot bug
<MateuszFryc[m]> the question is also whether they will be able to handle it without a way to reproduce it.
<headius> SEGV no less
<headius> Mateusz Fryc: I have yet to dump a crash on them they couldn't figure out
<MattWelke[m]> <headius> "Matt Welke: let us know if you..." <- Yeah for sure. I doubt that'd happen at my work. We have many languages in our stack already, none of which are Ruby. But we do use the JVM a lot. Some things I want to explore out of curiosity though are writing Apache Beam pipelines and ML stuff in Ruby.
<headius> feel free to post there what you've told us here + hs_err dump
<headius> drop my name if you like and I will monitor the list
<headius> I'd open an issue directly but I'm not sure about procedure for crashes like this
<MattWelke[m]> In the Beam ecosystem, right now, the Java stuff is better than the Python stuff (and those are the only two langs officially supported). Lots of Beam users hate having to code in Java. Static typing can slow them down when writing data pipelines. Hence me being curious what it'd be like using Ruby.
<MateuszFryc[m]> ok, thank you. I will do it this week, now I go to supper ;] thank you for your help and time.
<headius> Matt Welke: yeah that sounds like a great use case
<headius> Mateusz Fryc: if you get a chance open a JRuby issue too, so we can track discussion there
<headius> it's not a bug we can fix but we like to have a record of it in our tracker
<MateuszFryc[m]> Ok.
<headius> enebo: without knowing why one would use `logname` instead of `id` this might be workable
<headius> I will open an issue on ruby/spec for discussion
<mattpatt[m]> headius: think I'm out of the woods and kid is asleep, I'll be about for an hour or so before I turn into a zombie
<enebo[m]> heh that is way too much logic for this too. If it is that hard to determine a username then perhaps it should be its own gem
<enebo[m]> not that we want a spec library to depend on a gem
<enebo[m]> I guess env edges are always where you see this sort of logic though
<headius> we could just tag this for now and untag when it is fixed
<headius> I just don't know what the "best" way to get login user is
<headius> BTW I have not been able to find any standard image for the GHA env
<headius> it may exist but naive searches did not find it
<mattpatt[m]> headius: doubt there's a standard image accessible outside GHA
<mattpatt[m]> if they're anything like Travis there's a lot of extra stuff baked in
<mattpatt[m]> and it'll change regularly
<mattpatt[m]> My assumption is that the main difference between Travis and GHA as far as our tests are concerned is that it seems like Travis ran stuff in a login shell and GHA is using a non-interactive shell
<headius> yeah seems like that
<mattpatt[m]> (Kinds of errors here smell like login/interactive vs non-interactive)
<headius> travis did publish a docker image you could run that would be identical to their env, but that was years ago
<headius> mattpatt: jruby-9.3 branch has all my current fixes: `jar` instead of `zip`, FileUtils.touch instead of `touch`, and an Addrinfo.new fix
<mattpatt[m]> rebasing now
<headius> that should get test:mri:stdlib and spec:ji green at least
<headius> enebo: looks like logname on BSDish does not have the terminal requirement
<MateuszFryc[m]> so basically, I should post an email to hotspot-compiler-dev@openjdk.java.net and attach the hs_err file? I thought there would be some site where I could upload it. Correct me if I am wrong ;)
<headius> Mateusz Fryc: yes do that but put hs_err in a gist or something and link to it
<MateuszFryc[m]> ok
<enebo[m]> bleh
<MateuszFryc[m]> thank you once again.
<headius> Mateusz Fryc: of course you can try other JDK versions and maybe it will go away... this method has been in JRuby for a long time and this is the first time it has crashed the JVM
<MattWelke[m]> <headius> "BTW I have not been able to find..." <- Is this helpful? https://github.com/actions/virtual-environments
<MattWelke[m]> GHA runs workflows in VMs by default. You have to opt in to using a Docker container to run the workflow in. Their docs on networking for backing services talk about that, because it has implications for whether you use `localhost` or a DNS name derived from part of the workflow YAML to connect to your backing service.
<MattWelke[m]> But there they describe the environment present in the VM your workflow runs in.
<headius> mattpatt: these are all really close to green as it is
<headius> Matt Welke: that's useful information about the env at least
<headius> the two main types of things we are seeing failing are terminal-related and socket-related
<MattWelke[m]> Ah. That's all I've ever been able to find on the environments sorry.
<headius> no worries
<mattpatt[m]> Matt Welke: that's super useful
<headius> we had to do this dart-throwing to get other suites running in GHA so it's fine
<enebo[m]> mattpatt: I think you should just squash and we can merge and then solve these one by one on jruby-9.3
<headius> agree
<MattWelke[m]> You can check out https://docs.github.com/en/actions/using-containerized-services/about-service-containers though for examples on running your workflow in the runner (a VM) vs. in a container running on the runner.
<mattpatt[m]> okay, I'll do that. It needs to be two commits because of the whole reusable-workflow thing requiring SHAs.
<mattpatt[m]> Once we're green we can look at getting the snapshot deploy task working.
<enebo[m]> mattpatt: cool
<mattpatt[m]> headius enebo: Updated. sha ref means the whole thing will error until it's merged
<mattpatt[m]> merging it should magically make it work provided the SHAs of the commits don't change
<mattpatt[m]> if that's a problem, then we could do one merge for the reusable workflow file, which is harmless by itself, and then do a merge where the PR checks will actually run the actions
<mattpatt[m]> tbh, that's probably better
<mattpatt[m]> lmk
<enebo[m]> mattpatt: merged.
<mattpatt[m]> enebo: welp, the CI workflow is running now, so the SHAs work now :-)
<MateuszFryc[m]> headius: I sent an email with a description of the problem to the hotspot email group. Hopefully someone will have a chance to take a look at it; unfortunately I don't know how to track the discussion when it starts. So, when it reaches you, please open the issue in JRuby yourself.
<headius> ok
<mattpatt[m]> headius: first complete run of GHA CI on the 9.3 branch is done. Do you want to pick a file to go over? Not sure what you've already been looking at
<headius> nice let's have a look
<headius> odd failure on JI specs
<headius> this is one of the files I changed wrt `touch` and `zip`
<headius> mattpatt: I don't have a process in mind... just looking at these failures and trying to suss out why they fail on GHA but not travis
<mattpatt[m]> i'm not sure what's what
<mattpatt[m]> the last Travis build I can find is from at least a year ago
<headius> well your job may be done, but if you look at these and see something obvious we can patch it
<headius> most will require a bit of digging to figure out what's different in this environment
<headius> e
<mattpatt[m]> the JI specs at least all fail on the same spec
<headius> ugh yeah it fails locally too... i'll figure out what I didn't fix
<headius> ```
- `zip -d #{jar_path} glob_target/bar.txt`
+ `jar uf #{jar_path} glob_target/bar.txt`
```
<headius> bleh
<headius> oh duh
<headius> -d delete
<headius> maybe that's why this used zip
<mattpatt[m]> hrm
<headius> I would rather not have to install zip to do this but jar doesn't support deleting a file
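(One way to avoid the external tool entirely would be the JDK's zip filesystem provider via Java integration. A sketch, with `jar_path` standing in for the spec's variable:)
```
uri = java.net.URI.create("jar:file:#{File.expand_path(jar_path)}")
fs  = java.nio.file.FileSystems.newFileSystem(uri, java.util.HashMap.new)
begin
  # deleting an entry is what `zip -d` did and what `jar uf` cannot do
  java.nio.file.Files.deleteIfExists(fs.getPath("glob_target/bar.txt"))
ensure
  fs.close
end
```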
<mattpatt[m]> they claim zip is installed
<headius> there must be some pathing issue affecting that and `touch`
<mattpatt[m]> maybe it's a $PATH thing?
<mattpatt[m]> gimme a sec. just finding windows. seems my mechanical keyboard woke the kid
<mattpatt[m]> had to relocate further away
<headius> hah, mine told me last spring that his fellow remote learners complained about the noise
<headius> I do type hard and fast
<mattpatt[m]> no obvious path manipulation in the workflow file itself
<headius> I don't know that we do anything odd with path in the JI run either
<mattpatt[m]> gonna put a test workflow up and see about env info
<mattpatt[m]> touch and zip, right?
<headius> yeah that's the two I patched in the JI specs
<headius> touch change seems fine because it just uses a Ruby util for it but the zip one is not working
<headius> we can just use zip if we can sort out the pathing on GHA
<headius> mattpatt: other thing that might be a clue is how ruby/spec runs in GHA itself: https://github.com/ruby/spec/blob/master/.github/workflows/ci.yml
<headius> may be something configured differently that allows stuff to work better
<headius> I don't see anything obvious though
<mattpatt[m]> well that was unhelpful
<headius> Looks like it should work
<mattpatt[m]> did that from inside Rake so it was being invoked with jruby -S
<mattpatt[m]> PATH looks sane, which finds zip
<mattpatt[m]> so that looks good
<mattpatt[m]> so problem must be deeper in
<headius> I need a little lie down so handing off to enebo
<mattpatt[m]> headius enebo: if you use `/usr/bin/zip -d` instead of `zip -d` it passes spec:ji
<mattpatt[m]> yuck
<enebo[m]> hmm
<enebo[m]> mattpatt: we could `which zip` and, when none is found, just hard-code the path with a comment that GHA requires it
<enebo[m]> This is not really ideal but it would be good to move past this
<enebo[m]> if PATH is really the path in your output at that point, the only issue I can think of is that there is a file called zip higher up which is not marked with an execute bit
<mattpatt[m]> enebo: i'm trying again with only `zip -d`
<mattpatt[m]> if that fails i'm going to dump PATH into the output from the spec and see if that's different to the $PATH from the root of the Rakefile
<enebo[m]> Your Rakefile experiment looking for a zip in each PATH dir might be worth it too
<enebo[m]> although a changed PATH seems likely
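(That experiment is only a few lines if it's dropped into the spec or the Rakefile; a diagnostic sketch:)
```
# report every `zip` candidate on PATH and whether it is actually executable
ENV['PATH'].split(File::PATH_SEPARATOR).each do |dir|
  candidate = File.join(dir, 'zip')
  puts "#{candidate} executable=#{File.executable?(candidate)}" if File.exist?(candidate)
end
```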
<mattpatt[m]> Oh wow: Failure/Error: puts `which zip`
<mattpatt[m]> okay, I gotta sleep. enebo: I'll poke at this again in the morning.
<enebo[m]> ok. Thanks for the efforts!
aquijoule_ has joined #jruby
richbridger has quit [Ping timeout: 256 seconds]