<headius> byteit101: I'll get back to that soon. Just trying to get these gems green and released so we can start to use them on master.
<headius> kares: could you give your approval to re-license the strscan code on the following PR?
<byteit101[m]> why would you want green gems instead of red gems? Aren't rubies red? What are you doing with emeralds?
<headius> That's a great point
<byteit101[m]> coming soon: JEmerald
<headius> Luckily there's still one failure so it's ruby red right at the moment
<byteit101[m]> :-D
<headius> how is it we have never added a utility method to mask byte[x] with 0xFF to get unsigned byte value
<headius> I guess Byte.toUnsignedInt is good enough but long
<headius> Darn down to 1F but it is in encoding negotiation code. I'll pick this up tomorrow
<headius> I'm going to push what I have on stringio... the last failure is a case where we should be raising an error trying to transcode UTF-8 into Windows-31J and I have gone over the encoding negotiation logic and can't find the problem
<headius> we may be able to get stringio pushed without this one fixed if I can just show it's something in JRuby
<headius> today I want to attempt to port timeout back into Ruby so we can have efficient executor-based logic in the gem
<headius> it would be easier to get them to incorporate some JRuby-specific Ruby code rather than an ext and a -java gem
<headius> enebo: it came to my attention last night that Byte.toUnsignedInt is a thing... how would you feel about replacing every line of code where we manually & 0xFF with a call to that method?
<headius> I would prefer if it was something like Array.getUnsignedByte(byte[], index) but that does not appear to exist
<enebo[m]> If it is semantically the same then sure. Is it a primitive?
<enebo[m]> I assume it will prevent mistakes too
<headius> it does not appear to be an intrinsic in 8
<headius> or 11
<headius> it was added in 8
<enebo[m]> I think we have accidentally at times forgotten to mask so perhaps this will prevent that
<headius> it at least is a clear pattern
<enebo[m]> In those cases I think we somehow forget it can have sign bit
<headius> not obvious looking at byte[i] & 0xFF why we are doing it
<enebo[m]> So I am not sure this will prevent that error but sure
<enebo[m]> yeah. This will definitely help people reading our code
<enebo[m]> in some code I wrote recently for parser line/offset I pushed all the specific bitmath into a method
<enebo[m]> well methods but it did make intent obvious as well as consolidate that logic
<headius> yeah I was going to add such a helper but decided to check if anything had been added to JDK
<headius> I'll put together a PR today
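[editor's note] A minimal sketch of the masking pattern discussed above, comparing `byte[i] & 0xFF` with `Byte.toUnsignedInt` (a real JDK method, added in Java 8). The class name `UnsignedByteSketch` and the helper `getUnsignedByte` are hypothetical names for illustration only, not existing JDK or JRuby APIs.

```java
final class UnsignedByteSketch {
    // Hypothetical helper of the kind wished for in the chat; the name is an
    // assumption and does not exist in the JDK or in JRuby.
    static int getUnsignedByte(byte[] bytes, int index) {
        return Byte.toUnsignedInt(bytes[index]);
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0xE3, 0x41 };

        // Java bytes are signed, so widening to int sign-extends.
        System.out.println(bytes[0]);                      // -29

        // The manual mask and the JDK method give the same unsigned value.
        System.out.println(bytes[0] & 0xFF);               // 227
        System.out.println(Byte.toUnsignedInt(bytes[0]));  // 227

        // The helper hides the indexing and the conversion together.
        System.out.println(getUnsignedByte(bytes, 1));     // 65
    }
}
```

The two forms are semantically the same; the method call mainly documents intent at the call site.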
<enebo[m]> The other thing we do is return int when I wish we were doing byte
<enebo[m]> but I think we use int when it could be a codepoint or a byte
<headius> main issue is not being able to work with a byte as unsigned
<headius> so we default to int to just get it unsigned right away
<enebo[m]> yeah
<enebo[m]> I wish Java had type aliasing like Rust
<enebo[m]> Then you can make an int into RubyByte or whatever and it would enforce that alias
<enebo[m]> as if it was a unique type
<headius> yeah
<enebo[m]> Seems like something Java could do too
<enebo[m]> The Java way is class RubyByte { private final int value; } which sucks because you just want it to be an int declared as RubyByte
<enebo[m]> Especially since EA sucks
<headius> hah yeah for sure
<enebo[m]> it is also nice to do things like 'type SymbolTable = HashMap<RubySymbol, IRubyObject>'
<enebo[m]> I guess though if you are a language with a lot of ceremony this all helps
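[editor's note] A rough sketch of "the Java way" described above: a wrapper class that gives an int a distinct type. `RubyByte` is the name used in the chat; the range check, the `of` factory, and the `value()` accessor are assumptions added for illustration and are not JRuby code.

```java
// Wrapper that makes "an int that is really an unsigned byte" its own type.
// Unlike a type alias, every instance is a separate heap object unless the
// JIT manages to eliminate the allocation.
final class RubyByte {
    private final int value;

    RubyByte(int value) {
        // Assumed invariant for this sketch: the value fits in 0..255.
        if (value < 0 || value > 0xFF) {
            throw new IllegalArgumentException("not an unsigned byte: " + value);
        }
        this.value = value;
    }

    // Convert a signed Java byte into its unsigned representation.
    static RubyByte of(byte raw) {
        return new RubyByte(Byte.toUnsignedInt(raw));
    }

    int value() {
        return value;
    }

    public static void main(String[] args) {
        RubyByte b = RubyByte.of((byte) 0xFF);
        System.out.println(b.value()); // 255

        // A method declared as take(RubyByte b) would now reject a plain int
        // at compile time, which is the enforcement the alias would provide.
    }
}
```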
<enebo[m]> man emacs why are you like this
<enebo[m]> it changed `coding: binary` to `coding: no-conversion`
<headius> hah yeah that doesn't seem like the same thing
<headius> updated with fixed impl, build and CI additions, and gemspec for -java gem
<headius> moving right along
subbu has joined #jruby
subbu has quit [Quit: Leaving]
subbu has joined #jruby
<headius> well, that took longer than expected
<headius> pure-Ruby timeout based on the Java timeout passes all but one test that uses a quirk of exception raising to `throw` instead of `raise` so the error cannot be caught by the target thread. I'm not sure if we can support it, or if we even want to
<headius> enebo: check that out
<headius> the throw behavior is pretty wack... you can see it in the Timeout::Error definition... basically, it overrides Timeout::Error#exception so that when `raise` tries to call it, it throws 😳
<headius> since throw can't be rescued, the exception bubbles out until caught by the timeout code, and then it is propagated as a normal raise
<headius> I never understood this logic before but I do now
<headius> it is some old bug that you should be able to force the timeout block to raise the error even if it rescues it
<headius> we have never had this reported
<enebo[m]> main motivation here was to make it simpler to merge?
<enebo[m]> Here is half my mega PR: https://github.com/jruby/jruby/pull/7082
<enebo[m]> I am hopeful I will be faster in marshal#dump. MRI tests depend on dump working to test load...so there are a small number of failures still in dump but we pass so many more things now
<headius> Yeah that is the motivation, but it also passes more tests because it was easier to figure out the behavior differences from the CRuby version
<enebo[m]> err small number still in load
<headius> I'm dubious on the value of this weird throw hack they use so I'm not sure if I'm going to keep pushing on that or not
<enebo[m]> The only 3 in ruby/spec is not having T_OBJECT, T_DATA, and one test which uses the same bytes in a symbol with two or more encodings
<headius> I have not done any performance testing either and it will certainly be slower due to Ruby blocks
<headius> But at least it shows it's possible
<enebo[m]> If I had nothing else to do I would masquerade T_DATA where they do it for core types then we could pass those both
<headius> What's our total failures on dump and load now?
<enebo[m]> I am curious now that I just said that...Dir being dumped will mark it as T_DATA in MRI
<enebo[m]> hmm
<enebo[m]> I only did load/restore so far but those are 3 and 3
<enebo[m]> we have 11 in dump
<enebo[m]> 1 in something called float
<enebo[m]> dump should be about same number assuming I do not tackle T_DATA
<enebo[m]> MRI is 12 F/E at the moment but those are almost all just stuff we never passed
<enebo[m]> I do have 3 FIXMEs but I think 2 of those are failures in test/mri so that is good
<headius> Very nice
<headius> Did you make any changes that you think directly improve performance?
<headius> With compatibility in better shape we could probably do some benchmarking and get clever about optimizing it
<enebo[m]> I suspect things will be a little slower but maybe not measurable
<headius> I was always really reluctant to make performance changes in there because it started to take the code too far away from CRuby's structure
<enebo[m]> we have to mark partial objects into a set
<enebo[m]> tracking that is needed for compat but is an extra data structure
<headius> Ah sure
<headius> Yeah that was a sticky area I could never quite get right, dealing with partial objects and the various ways you can dump and load
<enebo[m]> links between symbols and data are done directly so we are not constantly looking to see if type is ';' vs '@' but that is probably not visible
<headius> We could possibly make that a thread local structure and not have to recreate it since dumping load are mostly leaf operations
<enebo[m]> Struct building is not doing toString and equals on two symbols instances
<enebo[m]> So struct unmarshalling may be faster
<headius> Dump and load
<enebo[m]> By and large I would expect not much difference though
<headius> Generic serialization is tough to optimize on any runtime but I would be interested in trying once you land everything
<enebo[m]> There is also extra state to know if we should freeze
<enebo[m]> Oh and it may end up being faster if you pass proc since we were calling it for everything
<enebo[m]> Like you load a nested array and it would call it on sub-elements even though it is only part of the object being loaded
<enebo[m]> which is why we have a partial list
<enebo[m]> we already also had partial as a boolean (and MRI does also have both)
<enebo[m]> Less indirection happens for some things like class names where it would go to the big switch before
<enebo[m]> None of this is major stuff though
<enebo[m]> It is weird to me that when 1.9 and m17n came into being someone said "I know I will add special ivars to symbols/strings to say what the encoding should be"
<headius> I don't know how frequently that proc feature is used but that sounds way better
<enebo[m]> I do wonder if anyone uses it
<headius> I wonder if there are any good Marshal benchmarks out there
<enebo[m]> load this data but let me potentially change it on the way out with whatever
<headius> Yeah
<enebo[m]> I could see someone wrapping some type with something else or something but it just feels odd
<enebo[m]> I cannot really think of a concrete example of that
<headius> The encoding variable thing is pretty weird but I suppose it was easier than changing the format in some drastic way
<enebo[m]> I think one issue is I do not see a version number in the data stream
<headius> Should be the first two bytes but I don't know if it's changed in years and years
<enebo[m]> Someone will thank me later but all bodies in main object method outline to nicely named methods so if you are looking at an arbitrary backtrace you can tell what is happening
<headius> I'm not sure they even changed it when they added encoding
<enebo[m]> ah yeah it happens in the header
<enebo[m]> err constructor
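[editor's note] A small sketch of the version check on the first two bytes of a Marshal stream. The format version is major 4, minor 8. The acceptance rule used here (major must match, minor must not be newer) is an assumption about how a reader would typically apply it, and `MarshalHeaderSketch` and `checkVersion` are hypothetical names, not JRuby's actual implementation.

```java
final class MarshalHeaderSketch {
    static final int MARSHAL_MAJOR = 4;
    static final int MARSHAL_MINOR = 8;

    // Reads the two header bytes and rejects streams this reader cannot handle.
    static void checkVersion(byte[] stream) {
        if (stream.length < 2) {
            throw new IllegalArgumentException("marshal data too short");
        }
        int major = Byte.toUnsignedInt(stream[0]);
        int minor = Byte.toUnsignedInt(stream[1]);

        // Assumed rule: same major version, minor no newer than ours.
        if (major != MARSHAL_MAJOR || minor > MARSHAL_MINOR) {
            throw new IllegalArgumentException(
                "incompatible marshal format " + major + "." + minor);
        }
    }

    public static void main(String[] args) {
        // Header 4.8 followed by the type byte for nil ('0').
        checkVersion(new byte[] { 4, 8, '0' });
        System.out.println("header accepted");
    }
}
```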
<enebo[m]> ok part of me thought it should be in here but I did not look at this at all
<enebo[m]> ok so that is pretty weird. They should fix this encoding stuff
<enebo[m]> especially in 3.1 where utf is just the default
<enebo[m]> symbols are us-ascii or utf-8 with nothing in the data stream at all
<headius> The other new thing I saw in timeout is that there's a scheduler hook so you can define how to time out an operation in your own way
<enebo[m]> and then special encoding is just directly a type or something
<headius> I'm not sure if CRuby is falling back on that by default or if they still usually run the Ruby code
<enebo[m]> heh
<enebo[m]> well that is interesting
<headius> The Ruby code is still as inefficient as ever, creating a thread for every call to timeout
<enebo[m]> I suppose it is for some other out of band condition so you can just abort without a timeout
<enebo[m]> no point waiting as a worker if the rest of the app went down
<headius> TruffleRuby is using Evan's implementation pretty much unmodified which uses a single timeout thread and request objects similar to our executor approach
<headius> Very unlikely it passes as many tests as ours does
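[editor's note] A rough sketch of the general "single timeout thread plus request objects" idea mentioned above, using a `ScheduledExecutorService` so that no thread is created per call. This is not JRuby's, TruffleRuby's, or the timeout gem's actual code; `ExecutorTimeoutSketch` and its `timeout` method are hypothetical, and the sketch assumes the body blocks in an interruptible call and that nothing else interrupts the thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ExecutorTimeoutSketch {
    // One shared daemon thread services every timeout request.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "timeout-sketch");
            t.setDaemon(true);
            return t;
        });

    static <T> T timeout(long millis, Callable<T> body) throws Exception {
        Thread target = Thread.currentThread();
        // The "request": interrupt the calling thread once the deadline passes.
        ScheduledFuture<?> request =
            TIMER.schedule(target::interrupt, millis, TimeUnit.MILLISECONDS);
        try {
            return body.call();
        } catch (InterruptedException e) {
            // Assumption: only the timer interrupts this thread.
            throw new TimeoutException("execution expired");
        } finally {
            request.cancel(false);
            // Clear a late interrupt that arrived after the body finished.
            // A tiny race remains if the timer fires during cancel; a real
            // implementation would need to close it.
            Thread.interrupted();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(timeout(1000, () -> "done"));
        try {
            ExecutorTimeoutSketch.<String>timeout(50, () -> {
                Thread.sleep(5000);
                return "late";
            });
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

Each call costs one scheduled task rather than one new thread, which is the efficiency difference being compared against the pure-Ruby version.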
<headius> Yeah I'm guessing it is so folks like ioquatix can define a native timeout of their own that just nukes the thread or fiber from orbit
<headius> I would also be very surprised if anyone has a scheduler timeout that can pass all of these tests because some of them are pretty weird
<headius> But maybe that's the point
subbu has quit [Quit: Leaving]
<headius> oh so close... io-wait 0.2.2.pre1 works but net-protocol doc parsing fails in ripper
<headius> kares: the next big priority will be net-http and our hacks to avoid SSL Context stuff
<headius> we can now install the gem but because it restores a bunch of code we comment out it breaks http stuff: