<headius> good morning!
apaokin[m] has quit [Ping timeout: 246 seconds]
BZK[m] has quit [Ping timeout: 256 seconds]
rcrews[m] has quit [Ping timeout: 252 seconds]
jimtng[m] has quit [Ping timeout: 252 seconds]
olleolleolle[m] has quit [Ping timeout: 252 seconds]
enebo[m] has quit [Ping timeout: 252 seconds]
kares[m] has quit [Ping timeout: 260 seconds]
byteit101[m] has quit [Ping timeout: 264 seconds]
nilsding has quit [Ping timeout: 246 seconds]
headius has quit [Ping timeout: 256 seconds]
PavanNambi[m] has quit [Ping timeout: 260 seconds]
apaokin[m] has joined #jruby
rcrews[m] has joined #jruby
jimtng[m] has joined #jruby
enebo[m] has joined #jruby
<enebo[m]> it must be false!
<enebo[m]> I don't get it
olleolleolle[m] has joined #jruby
<enebo[m]> oh the new vs old csv
<enebo[m]> yeah
<enebo[m]> In ruby profile I did see it was in a looping method all in Ruby but most of it was not self time
<enebo[m]> I could not get that to happen and I did threshold 0
byteit101[m] has joined #jruby
<enebo[m]> I guess I will need to try this again
<enebo[m]> I also think we should bench strscan separately as well
<enebo[m]> ah that could be the difference I guess
<enebo[m]> I was just running with default JIT settings then set to 0
kares[m] has joined #jruby
<enebo[m]> old csv with indy also I take it
<enebo[m]> I believe the main difference which will shake out is old csv tried as hard as possible to remove Ruby execution
<enebo[m]> native is basically best case JIT
<enebo[m]> From what I recall I was further apart but I may have been using my split_opts branch
<enebo[m]> yeah I did not look for loops but I still think that is a good idea
<enebo[m]> It is not in a ton of code, but without being able to inline, if there is a while there is a good chance it will never jit
headius has joined #jruby
<enebo[m]> In this case I think it could be much faster but in many cases that while is near the top and only called once
<enebo[m]> (although as a server it would get there I suppose)
<enebo[m]> immediately compiling while would probably stick out on single run things
<enebo[m]> while
<enebo[m]> I would like to see if that ends up being like 5 methods or 500
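A minimal sketch of the policy being floated here, assuming a simple call-count JIT threshold; the type and method names below are illustrative only, not JRuby's real JIT internals.

    // Hypothetical sketch; CompilableMethod, hasExplicitLoop(), and the
    // threshold handling are made-up names, not JRuby's JIT machinery.
    interface CompilableMethod {
        // Would be set by the IR builder when it sees a while/until construct.
        boolean hasExplicitLoop();
    }

    final class LoopAwareJitPolicy {
        private final int threshold;

        LoopAwareJitPolicy(int threshold) {
            this.threshold = threshold;
        }

        boolean shouldCompile(CompilableMethod method, int callCount) {
            // A method that spins inside an explicit loop may only ever be
            // entered once, so it never reaches the call-count threshold;
            // compile it immediately instead of waiting.
            if (method.hasExplicitLoop()) {
                return true;
            }
            return callCount >= threshold;
        }
    }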
<headius> after 50 iters the top four items in profile are transcoding and regex
<enebo[m]> but I suspect loops are pretty uncommon
<headius> actually top seven
<headius> probably 70% of profile
<enebo[m]> now do that profiling for old
<enebo[m]> That bench + data was from an old reported issue and what makes it more interesting (to me) is that it does have some mbc utf-8 data in it
<enebo[m]> If we picked a pure ASCII file I wonder if we would see a big shift
<headius> interestingly 3.77% is getting thread-local joni stack when it is empty
<headius> so it is not caching right or just oddly heavy
<enebo[m]> not caching would be an amazing find
<headius> I'll do a profile of old_csv too
nilsding has joined #jruby
<headius> heavier regex use in the new one could explain it if we're allocating a lot extra
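A generic sketch of the per-thread caching pattern being questioned here (not joni's actual code; StackPool and acquire are made-up names): the point is to hand back a reused per-thread buffer rather than allocate a matcher stack on every regex call.

    // Generic per-thread buffer cache sketch, not joni's real implementation.
    final class StackPool {
        private static final ThreadLocal<int[]> CACHED_STACK =
                ThreadLocal.withInitial(() -> new int[64]);

        // Return the thread's cached buffer, growing (and re-caching) it only
        // when the requested capacity exceeds what is already stored.
        static int[] acquire(int minCapacity) {
            int[] stack = CACHED_STACK.get();
            if (stack.length < minCapacity) {
                stack = new int[minCapacity];
                CACHED_STACK.set(stack);
            }
            return stack;
        }
    }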
<enebo[m]> heh...I probably was using the split_opts branch since it does a lot of split(",")
<enebo[m]> but not all strings are clean 7 bit
<enebo[m]> I should land it removing my more aggressive attempt and converting more regexps to strings but I really want that feature in
<enebo[m]> right now it is passing for regexps of length 1
<headius> you could land the conservative stuff for 9.4.2 at least
<enebo[m]> yeah I just don't want to lose track of this
<enebo[m]> but I guess I sort of already did
<enebo[m]> once we can have some callsite data it can become even faster
<enebo[m]> since it can then switch to making an executor and not perform selection logic
<enebo[m]> (split can still realize it needs non-ascii after picking ascii so there is still something)
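Roughly what a callsite-cached split strategy could look like, as a sketch only; SplitCallSite and the Strategy names are hypothetical, and the real selection logic would also have to consider the separator and encoding.

    // Hypothetical callsite cache for split strategy selection; names are
    // illustrative only, not JRuby's actual split implementation.
    final class SplitCallSite {
        private enum Strategy { UNSET, SINGLE_BYTE_ASCII, GENERIC }

        private Strategy cached = Strategy.UNSET;

        Strategy strategyFor(boolean ascii7bit) {
            if (cached == Strategy.UNSET) {
                // Selection logic runs once per callsite rather than on every call.
                cached = ascii7bit ? Strategy.SINGLE_BYTE_ASCII : Strategy.GENERIC;
            } else if (cached == Strategy.SINGLE_BYTE_ASCII && !ascii7bit) {
                // The fast-path assumption broke (non-ASCII data showed up),
                // so fall back to the generic splitter from here on.
                cached = Strategy.GENERIC;
            }
            return cached;
        }
    }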
<headius> zero regex use on old_csv
<enebo[m]> heh
<enebo[m]> THE ANSWER HAS BEEN FOUND
<enebo[m]> hopefully at least it is not using match data on new one
<enebo[m]> but I think it is largely internal to strscan right?
<headius> I can check the back trace on those regex calls
<headius> Got to pick a sick kid up from school, brb
<enebo[m]> Do you remember which version of maven started swapping the order in .xml file generation?
<headius> It would be good to see your branch after the jet has run
<enebo[m]> after burners
<headius> I do not remember where that happened but you mentioned upgrading in here so maybe you can search for it
<enebo[m]> yeah ok
BZK[m] has joined #jruby
<headius> I just know the default maven on Fedora had the problem when I installed Fedora 36 or 37
<enebo[m]> I just told them to get a newer maven
<enebo[m]> and any new maven will have the same ordering as us
<enebo[m]> (unless they changed something :)
<headius> Yeah I hope not
<headius> 100% of regex use in new csv seems to be strscan
PavanNambi[m] has joined #jruby
<enebo[m]> yeah. strscan is used in a lot of stuff for "performance"
<enebo[m]> so we should definitely bench how well we work and potentially tune as much as we can
<enebo[m]> split_opts seemingly not making a lot of difference here but as I said the mixed char data will basically just be slow path
<enebo[m]> On my machine (using java 8) the difference between old and new with threshold 0 and indy is 1.8s vs 1.6s, so about 12.5% slower
<enebo[m]> Thinking adding a single byte split for utf-8 would help but I am not sure how many versions of things to add :)
<headius> maybe just not converting enough splits to make a difference
<enebo[m]> I could print out histogram of selections. It is possible the ascii one is never called at all
<enebo[m]> I definitely think for new performance putting an instant JIT for loops would fix a bulk of the gap
<enebo[m]> I am going to figure out how many loops happen running various commands
<enebo[m]> I think explicit loops are uncommon enough where jitting them immediately would not affect warmup
<enebo[m]> segments.pop while segments.any? {|s| String === s }
<enebo[m]> segments.push 0 while segments.size < 2
<enebo[m]> segments.pop while segments.size > 2
<enebo[m]> lol so some weird stuff I did not expect to see but with that said this is all in the same method
<enebo[m]> 11 while loops in csv (new one)
<headius> is this mostly in one method?
<headius> I did not track down what looping method is getting jitted here
<enebo[m]> This appears to mostly be different methods in csv
<enebo[m]> I need to make sure these are being called but I guess they must since we lazily build methods
<enebo[m]> I just put a counter and print into buildConditionalLoop
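The instrumentation being described is presumably nothing more than a throwaway counter along these lines (a sketch, not the actual change to buildConditionalLoop):

    // Throwaway counting instrumentation of the sort described above; this is
    // a sketch, not the real patch to the IR builder.
    import java.util.concurrent.atomic.AtomicInteger;

    final class LoopBuildCounter {
        private static final AtomicInteger LOOPS_BUILT = new AtomicInteger();

        // Called from the loop-building path (e.g. buildConditionalLoop);
        // the running total shows how many while/until constructs were built.
        static void record(String sourceLocation) {
            System.err.println("loop #" + LOOPS_BUILT.incrementAndGet() + " at " + sourceLocation);
        }
    }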
<enebo[m]> 47 loops in gem list
<enebo[m]> I find this really odd though...openssl/buffering.rb has 6 methods with whiles in them show up
<headius> just starting up?
<enebo[m]> yeah
<enebo[m]> I am truly confused though...it is more than actually exist in the file
<enebo[m]> says 6 times but there are only 3 methods. Somehow we enter here twice
<enebo[m]> oh I see an until ok
<enebo[m]> so 3 untils and 3 whiles
<enebo[m]> and seemingly they all get used
<enebo[m]> I should perhaps do this as a percentage of what would get compiled normally
<enebo[m]> 139 methods+blocks jit on gem list
<enebo[m]> but the 47 would be more than 47 since it likely has blocks in those methods
<enebo[m]> gem list is not really an important use case per se since I think users will --dev
<headius> I fixed one more and it looks like everything older than Nov 23 should be punted
<headius> mostly aspirational things getting kicked down the road
<enebo[m]> ok
<enebo[m]> oh you have something which was supposed to land for .11 sitting there
<enebo[m]> I think we did not land it for bake time
<headius> I see two PRs: update Psych and update joni + strscan
<headius> the latter is probably safe enough even if it is adding some features to strscan
<headius> psych I dunno but that was important for the CVE nonsense
<enebo[m]> 197 loops on that boom-app rspec bug (so rails loading coverage+rspec)
<headius> it would force 9.3 to yaml 1.2 and safe_load logic
<enebo[m]> joni+strscan was the PR
<enebo[m]> I think we decided to not stuff it in immediately before .10 release
<headius> it might be possible to switch older psych to newer snakeyaml library for CVE but that still forces yaml 1.2
<enebo[m]> not that we were only considering doing it
<headius> and I'd need to get a branch and update release for old psych
<enebo[m]> I honestly don't know enough about YAML to know if that is risky or not
<headius> ok I can review the strscan thing... it is already done on master in 9.4.1
<headius> it doesn't seem high risk but there are a few gotchas in aliasing
<headius> (yaml 1.2)
<enebo[m]> 455 methods /blocks JITd Rails app
<enebo[m]> 197 more things might be a lot
<enebo[m]> possibly interesting observation...a number of methods have nested loops
<headius> joni+strscan for 9.3 can land any time
<enebo[m]> in bootstrapping a number of these revolve around loops doing things with paths
<headius> ah sure
<enebo[m]> I think the main issue with the idea is it needs to be synchronous compiles
<headius> right
<enebo[m]> which could impact startup a lot more than normal JIT
<headius> I have hardly profiled JIT overhead at all, I'm sure it is heavier than it needs to be
<headius> other than irreducible costs like loading the bytecode into JVM
<enebo[m]> I have no doubt stuff like LVA is pretty expensive
<headius> ah so the other items on my list were in jffi
<headius> I will see about getting a centos VM to test glibc thing
<enebo[m]> cool
<headius> should I merge joni + strscan update into 9.3 branch?
<enebo[m]> I was thinking if I added dtoa to jnr-posix I could fail on non-native and probably land the printf stuff sooner than later
<headius> this also is a much better strscan that passes all tests so I think even with the feature additions it's a good move
<enebo[m]> eventually a pure-Java one could get added
<headius> yeah you should go for it on dtoa
<enebo[m]> oh I remember the issue with strscan
<enebo[m]> not that I think we care
<enebo[m]> it supports regexps in scan and that is not supported in 2.6
<enebo[m]> It will not break existing 2.6 uses but it will not complain
<headius> right
<enebo[m]> I think that is ok
<headius> I agree
<enebo[m]> it may lead to spec fails :P
<headius> PR is green so it seems ok
<headius> (with updated MRI strscan tests)
<enebo[m]> but 2.6 mri strscan tests?
<headius> no
<headius> those fail
<enebo[m]> just edge-case stuff like error cases?
<headius> I don't recall actually
<enebo[m]> while true
<enebo[m]> csv definitely was gone over with a fine-tooth comb
<enebo[m]> for MRI at least
<headius> other environments are starting to pick up the bad maven
<enebo[m]> -> { @s.scan("aoeu") }.should raise_error(TypeError)
<headius> looks like they are rolling an update out incrementally because most jobs work
<enebo[m]> ah so I remembered this backward
<enebo[m]> it was only regexp in 2.6 and string was accepted later
<headius> ahh right
<enebo[m]> I expected to see a failure for this but I am surprised it is a single spec
<headius> yeah low risk there
<enebo[m]> this csv parser is some pretty big and sophisticated looping methods
<enebo[m]> a lot of code and a bunch of nexts
<headius> fun
<headius> lots of cyclomatic complexity
<enebo[m]> I am not putting it down but I imagine these methods are pretty large in IR
<enebo[m]> The other random thought reading this is whether we are setting CR on substrings
<enebo[m]> You mentioned transcoding but us marking stuff as VALID vs 7BIT would have a huge difference
<headius> could audit cr propagation
<enebo[m]> we heavily rely on generic encoding methods in a lot of our string methods but we can definitely improve things; looking more greedily for 7bit up front can end up much faster
<enebo[m]> Split on ascii string with "," ended up being like 80% quicker
<enebo[m]> which might mean something is not optimizing well in the string helpers we use but I think it is largely because it is just some dead simple loop
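For reference, the 7-bit check that makes the ASCII path so cheap is essentially a single byte scan like this sketch (names are illustrative; JRuby's real coderange constants and helpers live in its string support code):

    // Minimal sketch of an up-front 7-bit scan: if every byte is < 0x80 the
    // string can be marked 7BIT and split/index can work on raw bytes with
    // no multibyte decoding at all.
    final class CodeRangeScan {
        static boolean isSevenBit(byte[] bytes, int start, int len) {
            for (int i = start; i < start + len; i++) {
                if ((bytes[i] & 0x80) != 0) {
                    return false;   // high bit set: multibyte or broken data
                }
            }
            return true;
        }
    }

Once a string is known to be 7-bit, splitting on "," reduces to searching for one byte, which is why the ASCII split path comes out so much faster.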
<headius> there's probably many rounds of CR optimizations missing from cruby
<enebo[m]> yeah I would not be surprised if they added stuff and we never noticed. I do not really follow their development very closely.
<enebo[m]> They just tend to end up a reference when I want to understand behavior
<enebo[m]> 100% of all lines delivered to get_lines is CR 0
<headius> hah
<headius> well that's not helpful
<enebo[m]> yeah I guess I need to figure out where it came from and whether it reasonably should have one
<enebo[m]> we make so many micromistakes around sharing code
<enebo[m]> some callers call through a method which looks up a separator but then calls another method which has to have same logic because other things call it directly instead
<enebo[m]> in case of each line stuff it means checking opts twice and checking separator arg twice
<headius> yeah bad refactoring probably
<enebo[m]> really small overhead obviously
<enebo[m]> the main split method is 130ish lines with several call outs to other ways (enumerate) in a few places
<enebo[m]> I definitely feel we get here because we tend to port some methods closely
<enebo[m]> interestingly this method does not even care about CR or even if the string is a valid string
<enebo[m]> It just seems to take it at its word but after the recent mail gem issue with joni I wonder how much code is literally forgiving on broken data
<headius> hmm I booched something in chr fix
<headius> centos installing on my other machine
<enebo[m]> lunch
<headius> oops, >> vs >>>
<headius> I can never remember the right one
<enebo[m]> yeah bit math sux
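For the record, >> is Java's arithmetic right shift (it copies the sign bit in from the left) and >>> is the logical shift (it fills with zeros); they only disagree when the sign bit is set, which is exactly where picking the wrong one bites.

    // >> (arithmetic) vs >>> (logical) right shift demo.
    public class ShiftDemo {
        public static void main(String[] args) {
            int x = -8;                        // binary ...11111000
            System.out.println(x >> 1);        // -4          (sign bit copied in)
            System.out.println(x >>> 1);       // 2147483644  (zero shifted in)

            int y = 8;                         // non-negative: no difference
            System.out.println(y >> 1);        // 4
            System.out.println(y >>> 1);       // 4
        }
    }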
subbu has joined #jruby
<enebo[m]> IO.gets has a fast path which will set CR but slow path does not seem to have that logic
<enebo[m]> Testing to see if it actually is hitting that path or not
<enebo[m]> OpenFile.getlineFast which I thought was setting cr is not providing any way for it to be anything other than 0
<enebo[m]> I suspect cr was int* in MRI
<enebo[m]> So slow path of getline has no cr logic and fast path is looking at a value which I think is always 0
<enebo[m]> We pass all the things so I am guessing cr is merely for optimization and not for correctness in these methods
<headius> ok so I finally got this tested on centos
<enebo[m]> yeah cr is passed &cr
<headius> the builds I did in CI against an earlier debian are still picking up newer glibc so it did not help
<headius> building on centos directly does produce binaries that work, but it's not reproducible unless I can fix it in CI
<enebo[m]> yuck
<enebo[m]> So I am really curious will fixing cr calculation on IO speed things up
<headius> it is frustrating that glibc does not have better support for building against a new version and still working on older ones
<headius> so I think we punt this and I'll try to come up with something else
<headius> I will open an issue against 9.4.3 so it doesn't get lost
<enebo[m]> I imagine in some methods it will just say "oh no cr I better walk it once"
<enebo[m]> but in each_line it just says "you are utf-8 great...utf-8 walking"
<headius> yeah if it is walking a lot of strings it doesn't need to, that would be overhead
<headius> I did see some calls to CR calculation stuff in profile but not up very high
<enebo[m]> so each_line is not doing that second walk but other methods are likely scanning or at least using CR for opts
<enebo[m]> but IO is not making a CR other than 0 so it just opts out all IO created data from playing
<enebo[m]> It might only be getline although I suspect that is a very common way of working with data
<enebo[m]> This may yield something
<enebo[m]> headius: so OpenFile can have a cr field which is only used for temporary cr calcs or each of these methods can make int[] cr = new int[] { 0 };
<enebo[m]> using a field prevents the box but it may be confusing
<enebo[m]> part of me feels like the JVM should properly be able to escape-analyze away single value arrays as out params but I guess it is not that simple :)
<headius> yeah would need to make sure any shared state is accessed under lock
<headius> but it is probably doing that
<enebo[m]> I believe all IO methods have lock() in places but yeah I think it maybe would be brittle since you would have to remember this
<enebo[m]> I could see accidentally doing it outside the lock and creating potentially very strange errors
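For context, the two shapes being weighed look roughly like this sketch (not OpenFile's real code): a throwaway int[1] out-parameter per call versus a reused scratch field that must only be touched while holding the stream's lock.

    // Sketch only of the two options: per-call int[1] box vs reused field.
    final class GetlineCrSketch {
        // Option 1: fresh single-element array per call, mirroring MRI's `int *cr`.
        // No shared state, but it allocates unless escape analysis removes the box.
        static int getline(byte[] buf, int[] crOut) {
            crOut[0] = computeCodeRange(buf);
            return buf.length;                 // placeholder for "bytes consumed"
        }

        // Option 2: a reused field avoids the allocation, but every read/write
        // of it has to happen under the stream lock (synchronized here stands
        // in for that) or the value can be clobbered by another thread.
        private int scratchCr;

        synchronized int getlineWithField(byte[] buf) {
            scratchCr = computeCodeRange(buf);
            return buf.length;
        }

        // Placeholder coderange calculation; the real constants and scanning
        // logic live elsewhere in JRuby.
        private static int computeCodeRange(byte[] buf) {
            for (byte b : buf) {
                if ((b & 0x80) != 0) return 2;
            }
            return 1;
        }
    }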
<enebo[m]> Part of me wonders how just nuking all CR out of IO code would help or hurt
<enebo[m]> in gets it is still doing lots of checks and logic but it is always 0
<enebo[m]> but if you accept IO does not do CR does removing all that logic speed up IO at the cost of other things being able to use it
<enebo[m]> This is a very long way of wondering how much cr costs relative to how much it helps
<enebo[m]> I predict it is a huge benefit for ASCII data but that is anecdotal
<enebo[m]> I am going to make a bench of IO.gets that calls something which uses CR
<headius> IO with transcoding should definitely be able to set CR since we know it has successfully converted
<headius> without that we don't really scan for chars so it would need one scan somewhere
<headius> so I also pushed this today: https://github.com/jruby/jruby/pull/7693
<headius> turns on color for backtraces and ir.print if running on a tty
<enebo[m]> nice
<headius> This Maven bug is getting annoying now. Who knows how long until GitHub updates to a fixed version, if the fix has even been released yet
<headius> I guess we just need to force all jobs to use a known working version for now
subbu has quit [Quit: Leaving]
<enebo[m]> interesting. you did something clever to avoid out param for cr but it is not hooked up to anything
<enebo[m]> you know pos will be int and you create an int using a long which loses the second half of the long
<enebo[m]> but second half is where you were stuffing CR data
<enebo[m]> so I will reaudit the methods in question but I think I can make smaller changes to keep CR properly updated
<enebo[m]> Although looking at this method I don't think it does anything other than figuring out CR (which is fine) but we are not setting it so it is just dead work. For fun I will delete this call and see if this is faster
<headius> Ha I must have gotten distracted
<enebo[m]> who knows. A lot of this is really old too
<enebo[m]> It is possible it was hooked up at some point
<headius> fwiw JVM should be able to eliminate temporary boxes if they inline and don't escape but it's not guaranteed of course
<headius> JVM as in Hotspot
<enebo[m]> Is a precondition that the place it escapes has been inlined right?
<enebo[m]> in theory my bench uses index on each line and if 7bit it should just byte index
<enebo[m]> I should probably do aref perhaps to just make it as dramatic as possible
<headius> yes must inline
<headius> and can't be used as part of any branching structure, unless they have improved that
<headius> we might get better escape analysis with a single-field carrier object also since arrays have bounds checks and things
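The single-field carrier variant would be something like the sketch below; whether Hotspot scalar-replaces it still depends on the allocation being inlined and never escaping, so it is a hope rather than a guarantee.

    // Sketch of a single-field carrier used as an "out" slot instead of int[1];
    // no bounds check on access, and scalar replacement can turn it into a
    // plain local when the allocation inlines and never escapes.
    final class IntRef {
        int value;
    }

    final class CarrierExample {
        static int parse(byte[] buf, IntRef crOut) {
            crOut.value = buf.length == 0 ? 0 : 1;   // placeholder coderange
            return buf.length;
        }

        static int caller(byte[] buf) {
            IntRef cr = new IntRef();                // candidate for scalar replacement
            int consumed = parse(buf, cr);
            return consumed + cr.value;
        }
    }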
<enebo[m]> this is just bit mathing it into the return result
<enebo[m]> using long instead of int
subbu has joined #jruby
<headius> yeah
<headius> when we can do that it's obviously ideal
<headius> I just did it for the chr fix
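The "bit math it into the return result" trick amounts to packing two ints into one long; a sketch of the shape (not the actual chr fix), which also shows where a plain (int) cast on the caller side silently drops the high half:

    // Sketch of packing (pos, cr) into one long return value to avoid an out
    // param. The caller has to unpack both halves; just casting the long to
    // int keeps pos but silently drops cr.
    final class PackedResult {
        static long pack(int pos, int cr) {
            // Mask pos so a negative value doesn't sign-extend over cr's half.
            return ((long) cr << 32) | (pos & 0xFFFFFFFFL);
        }

        static int pos(long packed) {
            return (int) packed;              // low 32 bits
        }

        static int cr(long packed) {
            return (int) (packed >>> 32);     // high 32 bits, zero-fill shift
        }
    }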
<headius> enebo: I will manually merge that irb fix and avoid his pom.xml juggling commits
<enebo[m]> ok
<enebo[m]> A bit noisy but CR hooked up is setting CR to 7bit and then using simpler index
<enebo[m]> The actual cost of opening and reading perhaps makes it noisier
<headius> seems like a good improvement though
<enebo[m]> well when it matters it will matter a lot like perhaps a more complicated regexp
<enebo[m]> I thought aref would be pretty visible and it is not so much but it is showing something
<enebo[m]> I am loading from a YAML test file we had, so many lines are shorter
<enebo[m]> I could best case this and make each line 10k long but who has data like that
<enebo[m]> Going to hook up another method I see not doing this and then look at the slow path the csv bench is using
<headius> ok
<headius> looks good to me
<enebo[m]> MRI slow path does not seem to bother with CR
<headius> maven downlevel for GHA jobs
<headius> enebo: I think something you fixed fixed this one: https://github.com/jruby/jruby/issues/7689
<enebo[m]> Yeah that is a duplicate
<headius> ah nice
<headius> ok
<enebo[m]> good news...I just made the OLD version of csv faster: https://gist.github.com/enebo/e40f4d59dd41438b51e6b96d784711a0
<headius> haha
<enebo[m]> oh and that is without the split opts
<enebo[m]> but I did not see much improvement with that so?
<headius> FWIW the old one is also fastercsv by JEG so it's not like the really old csv is winning or something
<enebo[m]> oh of course I didn't
<enebo[m]> It had no CR
<enebo[m]> yeah not much different but I think it is because much of it is UTF-8 and not 7 bit
<enebo[m]> In fact I think nearly 100% have mbc
<enebo[m]> I have some huge ascii .csv file with the same lines over and over
<headius> bleh, irb must have some updated tests in 1.4.2
<headius> enebo: I pushed those maven downgrades to master so things should work now
<headius> enebo: this might be a snakeyaml bug: https://github.com/jruby/jruby/issues/7698
<headius> or invalid yaml
<enebo[m]> hmm
<enebo[m]> some dangling reference?
<enebo[m]> but all YAML is utf-8
<enebo[m]> I believe it is including an encoding reference as an ivar for a string which only have \u char in it so maybe it just knows that is utf-8?
<enebo[m]> So it appears to be a tiny bit faster and almost breaks loose initially
<enebo[m]> Not sure what is up with that really fast first pass
<enebo[m]> headius: you should merge this one: https://github.com/jruby/joni/pull/61
<enebo[m]> looks obviously correct except if it is an encoding which has no valid java charset
<headius> I'll have a look at that