#jruby on 2022-02-09 — irc logs at libera.irclog.whitequark.org

2022-01-19 17:17 ChanServ changed the topic of #jruby to: Get 9.3.3.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

00:40 subbu has quit [Quit: Leaving]

00:40 subbu has joined #jruby

00:40 subbu has quit [Remote host closed the connection]

04:20 subbu has joined #jruby

04:52 subbu has quit [Quit: Leaving]

05:37 <headius> byteit101: I'll get back to that soon. Just trying to get these gems green and released so we can start to use them on master.

05:38 <headius> kares: could you give your approval to re-license the strscan code on the following PR?

05:38 <headius> https://github.com/ruby/strscan/pull/25

05:38 <byteit101[m]> why would you want green gems instead of red gems? Aren't rubies red? What are you doing with emeralds?

05:38 <headius> That's a great point

05:39 <byteit101[m]> coming soon: JEmerald

05:39 <headius> Luckily there's still one failure so it's ruby red right at the moment

05:39 <byteit101[m]> :-D

07:13 <headius> how is it we have never added a utility method to mask byte[x] with 0xFF to get unsigned byte value

07:18 <headius> I guess Byte.toUnsignedInt is good enough but long

08:23 <headius> Darn down to 1F but it is in encoding negotiation code. I'll pick this up tomorrow

15:59 <headius> I'm going to push what I have on stringio... the last failure is a case where we should be raising an error trying to transcode UTF-8 into Windows-31J and I have gone over the encoding negotiation logic and can't find the problem

16:00 <headius> we may be able to get stringio pushed without this one fixed if I can just show it's something in JRuby

16:00 <headius> today I want to attempt to port timeout back into Ruby so we can have efficient executor-based logic in the gem

16:01 <headius> it would be easier to get them to incorporate some JRuby-specific Ruby code rather than an ext and a -java gem

16:04 <headius> enebo: it came to my attention last night that Byte.toUnsignedInt is a thing... how would you feel about replacing every line of code where we manually & 0xFF with a call to that method?

16:04 <headius> I would prefer if it was something like Array.getUnsignedByte(byte[], index) but that does not appear to exist

16:06 <enebo[m]> If it is semantically the same then sure. Is it a primitive?

16:06 <enebo[m]> I assume it will prevent mistakes too

16:07 <headius> it does not appear to be an intrinsic in 8

16:07 <headius> or 11

16:08 <headius> it was added in 8

16:08 <enebo[m]> I think we have accidentally at times forgotten to mask so perhaps this will prevent that

16:09 <headius> it at least is a clear pattern

16:09 <enebo[m]> In those cases I think we somehow forget it can have sign bit

16:09 <headius> not obvious looking at byte[i] & 0xFF why we are doing it
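
The masking under discussion lives in Java code, where `byte` is signed. As a rough illustration only, the same arithmetic can be reproduced in Ruby by unpacking a byte as signed and then masking it. The helper names below are hypothetical and are not part of JRuby.

```ruby
# Hypothetical helpers that mimic Java's signed `byte` from Ruby.

# Read the byte at `index` the way a Java `byte[]` would expose it:
# as a signed 8-bit value.
def signed_byte_at(bytes, index)
  bytes.byteslice(index, 1).unpack1("c")
end

# Mask with 0xFF to recover the unsigned value. This is the same
# operation Java's Byte.toUnsignedInt(b) performs on a byte.
def unsigned_byte_at(bytes, index)
  signed_byte_at(bytes, index) & 0xFF
end

data = "\xE9".b
puts signed_byte_at(data, 0)    # signed view of 0xE9
puts unsigned_byte_at(data, 0)  # unsigned view of 0xE9
```

A named helper makes the intent visible at the call site, which is the readability point raised in the conversation.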

16:09 <enebo[m]> So I am not sure this will prevent that error but sure

16:09 <enebo[m]> yeah. This will definitely help people reading our code

16:10 <enebo[m]> in some code I wrote recently for parser line/offset I pushed all the specific bitmath into a method

16:10 <enebo[m]> well methods but it did make intent obvious as well as consolidate that logic

16:11 <headius> yeah I was going to add such a helper but decided to check if anything had been added to JDK

16:11 <headius> I'll put together a PR today

16:11 <enebo[m]> The other thing we do is return int when I wish we were doing byte

16:11 <enebo[m]> but I think we use int when it could be a codepoint or a byte

16:12 <headius> main issue is not being able to work with a byte as unsigned

16:12 <headius> so we default to int to just get it unsigned right away

16:12 <enebo[m]> yeah

16:12 <enebo[m]> I wish Java had type aliasing like Rust

16:13 <enebo[m]> Then you can make an int into RubyByte or whatever and it would enforce that alias

16:13 <enebo[m]> as if it was a unique type

16:13 <headius> yeah

16:13 <enebo[m]> Seems like something Java could do too

16:14 <enebo[m]> The Java way is class RubyByte { private final int value; } which sucks because you just want it to be an int declared as RubyByte

16:14 <enebo[m]> Especially since EA sucks

16:16 <headius> hah yeah for sure

16:18 <enebo[m]> it is also nice to do things like 'type SymbolTable = HashMap<RubySymbol, IRubyObject>'

16:18 <enebo[m]> I guess though if you are a language with a lot of ceremony this all helps

16:21 <enebo[m]> man emacs why are you like this

16:22 <enebo[m]> it changed `coding: binary` to `coding: no-conversion`

16:22 <headius> hah yeah that doesn't seem like the same thing

16:58 <headius> https://github.com/ruby/stringio/pull/21

16:59 <headius> updated with fixed impl, build and CI additions, and gemspec for -java gem

16:59 <headius> moving right along

17:24 subbu has joined #jruby

20:20 subbu has quit [Quit: Leaving]

20:33 subbu has joined #jruby

21:27 <headius> well, that took longer than expected

21:27 <headius> pure-Ruby timeout based on the Java timeout passes all but one test that uses a quirk of exception raising to `throw` instead of `raise` so the error cannot be caught by the target thread. I'm not sure if we can support it, or if we even want to

21:27 <headius> https://github.com/jruby/jruby/pull/7083

21:27 <headius> enebo: check that out

21:28 <headius> the throw behavior is pretty wack... you can see it in the Timeout::Error definition... basically, it overrides Timeout::Error#exception so that when `raise` tries to call it, it throws 😳

21:29 <headius> since throw can't be rescued, the exception bubbles out until caught by the timeout code, and then it is propagated as a normal raise

21:29 <headius> I never understood this logic before but I do now

21:29 <headius> it is some old bug that you should be able to force the timeout block to raise the error even if it rescues it
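
A minimal single-threaded sketch of the trick described here, assuming a hypothetical `EscapingError` class and `run_guarded` helper. It is not the timeout library's code, which does more than this; it only shows how overriding `#exception` to `throw` lets an error bypass a `rescue` inside the block and resurface as a normal raise.

```ruby
# Hypothetical error class: when `raise` asks it for an exception
# object, it throws to its catch tag instead of returning one.
class EscapingError < RuntimeError
  def initialize(message, tag)
    super(message)
    @tag = tag
  end

  def exception(*)
    begin
      throw(@tag, self)
    rescue UncaughtThrowError
      # No matching catch is active, so behave like a normal exception.
    end
    self
  end
end

# Hypothetical stand-in for the timeout wrapper: catches the throw,
# then re-raises the error once the block has been unwound.
def run_guarded
  tag = Object.new
  error = EscapingError.new("timed out", tag)
  escaped = catch(tag) do
    yield error
    nil
  end
  raise escaped if escaped
end

def demo
  swallowed = false
  message = nil
  begin
    run_guarded do |error|
      begin
        raise error
      rescue StandardError
        swallowed = true # never reached: a throw is not rescuable
      end
    end
  rescue EscapingError => e
    message = e.message
  end
  [swallowed, message]
end

p demo
```

The inner `rescue StandardError` never sees the error because `throw` unwinds past it. Only after `catch` returns does `run_guarded` raise it for real.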

21:29 <headius> we have never had this reported

21:30 <enebo[m]> main motivation here was to make it simpler to merge?

21:34 <enebo[m]> Here is half my mega PR: https://github.com/jruby/jruby/pull/7082

21:35 <enebo[m]> I am hopeful I will be faster in marshal#dump. MRI tests depend on dump working to test load...so there are a small number of failures still in dump but we pass so many more things now

21:35 <headius> Yeah that is the motivation, but it also passes more tests because it was easier to figure out the behavior differences from the CRuby version

21:35 <enebo[m]> err small number still in load

21:35 <headius> I'm dubious on the value of this weird throw hack they use so I'm not sure if I'm going to keep pushing on that or not

21:36 <enebo[m]> The only 3 in ruby/spec is not having T_OBJECT, T_DATA, and one test which uses the same bytes in a symbol with two or more encodings

21:36 <headius> I have not done any performance testing either and it will certainly be slower due to Ruby blocks

21:36 <headius> But at least it shows it's possible

21:36 <enebo[m]> If I had nothing else to do I would masquerade T_DATA where they do it for core types then we could pass those both

21:37 <headius> What's our total failures on dump and load now?

21:37 <enebo[m]> I am curious now that I just said that...Dir being dumped will mark it as T_DATA in MRI

21:37 <enebo[m]> hmm

21:37 <enebo[m]> I only did load/restore so far but those are 3 and 3

21:38 <enebo[m]> we have 11 in dump

21:38 <enebo[m]> 1 in something called float

21:38 <enebo[m]> dump should be about same number assuming I do not tackle T_DATA

21:39 <enebo[m]> MRI is 12 F/E at the moment but those are almost all just stuff we never passed

21:40 <enebo[m]> I do have 3 FIXMEs but I think 2 of those are failures in test/mri so that is good

21:43 <headius> Very nice

21:44 <headius> Did you make any changes that you think directly improve performance?

21:44 <headius> With compatibility in better shape we could probably do some benchmarking and get clever about optimizing it

21:45 <enebo[m]> I suspect things will be a little slower but maybe not measurable

21:45 <headius> I was always really reluctant to make performance changes in there because it started to take the code too far away from CRuby's structure

21:45 <enebo[m]> we have to mark partial objects into a set

21:45 <enebo[m]> tracking that is needed for compat but is an extra data structure

21:45 <headius> Ah sure

21:46 <headius> Yeah that was a sticky area I could never quite get right, dealing with partial objects and the various ways you can dump and load

21:46 <enebo[m]> links between symbols and data are done directly so we are not constantly looking to see if type is ';' vs '@' but that is probably not visible

21:46 <headius> We could possibly make that a thread local structure and not have to recreate it since dumping load are mostly leaf operations

21:46 <enebo[m]> Struct building is not doing toString and equals on two symbols instances

21:47 <enebo[m]> So struct unmarshalling may be faster

21:47 <headius> Dump and load

21:47 <enebo[m]> By and large I would expect not much difference though

21:47 <headius> Generic serialization is tough to optimize on any runtime but I would be interested in trying once you land everything

21:47 <enebo[m]> There is also extra state to know if we should freeze

21:48 <enebo[m]> Oh and it may end up being faster if you pass proc since we were calling it for everything

21:48 <enebo[m]> Like you load a nested array and it would call it on sub-elements even though it is only part of the object being loaded

21:48 <enebo[m]> which is why we have a partial list
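
For context, the proc feature being discussed is the optional second argument to `Marshal.load`. A small sketch of how it is called while a nested array is loaded follows; the `load_with_trace` helper and constant names are made up for illustration. The proc returns the object unchanged so the result does not depend on how a given Ruby version treats the return value.

```ruby
# Hypothetical helper: load `data` and record every object the proc sees.
def load_with_trace(data)
  visited = []
  tracer = lambda do |obj|
    visited << obj
    obj # hand the object back unchanged
  end
  loaded = Marshal.load(data, tracer)
  [loaded, visited]
end

PAYLOAD = Marshal.dump([1, [2, 3]])
LOADED, VISITED = load_with_trace(PAYLOAD)

p LOADED
p VISITED
```

The inner array shows up in `VISITED` as well as the outer one, which is the sub-element behavior described above.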

21:49 <enebo[m]> we already also had partial as a boolean (and MRI does also have both)

21:49 <enebo[m]> Less indirection happens for some things like class names where it would go to the big switch before

21:50 <enebo[m]> None of this is major stuff though

21:51 <enebo[m]> It is weird to me that when 1.9 and m17n came into being someone said "I know I will add special ivars to symbols/strings to say what the encoding should be"
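
The "special ivars" can be seen by dumping the same bytes under different encodings. This is a small sketch with a hypothetical `dump_of` helper: UTF-8 and US-ASCII strings carry a short `E` instance variable set to true or false, binary strings carry nothing, and other encodings carry an `encoding` instance variable holding the encoding name.

```ruby
# Hypothetical helper: dump a copy of `str` tagged with `encoding`.
def dump_of(str, encoding)
  Marshal.dump(str.dup.force_encoding(encoding))
end

UTF8_DUMP   = dump_of("abc", "UTF-8")
ASCII_DUMP  = dump_of("abc", "US-ASCII")
BINARY_DUMP = dump_of("abc", "BINARY")
SJIS_DUMP   = dump_of("abc", "Windows-31J")

# Print each stream so the trailing ivar section is visible.
[UTF8_DUMP, ASCII_DUMP, BINARY_DUMP, SJIS_DUMP].each { |dump| p dump }
```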

21:52 <headius> I don't know how frequently that proc feature is used but that sounds way better

21:53 <enebo[m]> I do wonder if anyone uses it

21:53 <headius> I wonder if there's any good Marshal benchmarks out there

21:53 <enebo[m]> load this data but let me potentially change it on the way out with whatever

21:53 <headius> Yeah

21:54 <enebo[m]> I could see someone wrapping some type with something else or something but it just feels odd

21:54 <enebo[m]> I cannot really think of a concrete example of that

21:54 <headius> The encoding variable thing is pretty weird but I suppose it was easier than changing the format in some drastic way

21:54 <enebo[m]> I think one issue is I do not see a version number in the data stream

21:55 <headius> Should be the first two bytes but I don't know if it's changed in years and years
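
A quick way to check this is to read the first two bytes of any dump and compare them with the constants Ruby exposes. The `marshal_version` helper name is hypothetical.

```ruby
# Hypothetical helper: return the [major, minor] header of a Marshal stream.
def marshal_version(data)
  data.bytes.first(2)
end

MARSHAL_HEADER = marshal_version(Marshal.dump(nil))

p MARSHAL_HEADER                                     # => [4, 8]
p [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]   # => [4, 8]
```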

21:56 <enebo[m]> Someone will thank me later but all bodies in main object method outline to nicely named methods so if you are looking at an arbitrary backtrace you can tell what is happening

21:56 <headius> I'm not sure they even changed it when they added encoding

21:57 <enebo[m]> ah yeah it happens in the header

21:57 <enebo[m]> err constructor

21:57 <enebo[m]> ok part of me thought it should be in here but I did not look at this at all

21:58 <enebo[m]> ok so that is pretty weird. They should fix this encoding stuff

21:58 <enebo[m]> especially in 3.1 where utf is just the default

21:58 <enebo[m]> symbols are us-ascii or utf-8 with nothing in the data stream at all

21:59 <headius> The other new thing I saw in timeout is that there's a scheduler hook so you can define how to time out an operation in your own way

21:59 <enebo[m]> and then special encoding is just directly a type or something

21:59 <headius> I'm not sure if CRuby is falling back on that by default or if they still usually run the Ruby code

21:59 <enebo[m]> heh

21:59 <enebo[m]> well that is interesting

21:59 <headius> The Ruby code is still as inefficient as ever, creating a thread for every call to timeout

22:00 <enebo[m]> I suppose it is for some other out of band condition so you can just abort without a timeout

22:00 <enebo[m]> no point waiting as a worker if the rest of the app went down

22:00 <headius> TruffleRuby is using Evan's implementation pretty much unmodified which uses a single timeout thread and request objects similar to our executor approach

22:00 <headius> Very unlikely it passes as many tests as ours does

22:01 <headius> Yeah I'm guessing it is so folks like ioquatix can define a native timeout of their own that just nukes the thread or fiber from orbit

22:02 <headius> I would also be very surprised if anyone has a scheduler timeout that can pass all of these tests because some of them are pretty weird

22:02 <headius> But maybe that's the point

23:25 subbu has quit [Quit: Leaving]

23:38 <headius> oh so close... io-wait 0.2.2.pre1 works but net-protocol doc parsing fails in ripper

23:51 <headius> kares: the next big priority will be net-http and our hacks to avoid SSL Context stuff

23:51 <headius> we can now install the gem but because it restores a bunch of code we comment out it breaks http stuff:

23:51 * headius sent a code block: https://libera.ems.host/_matrix/media/r0/download/libera.chat/da3da1f2c98072d8bd5ea7eb0eb90662c2a33410