<byteit101[m]> hmm, lots of warning: unknown module org.jruby.dist specified to --add-opens
<headius> Oh?
<byteit101[m]> and spec:ruby hung
<byteit101[m]> fresh master checkout
<byteit101[m]> NING: Unknown module: org.jruby.dist specified to --add-opens... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/032923e22252785e588e6703ad1402bebd70e2fe>)
<byteit101[m]> (etc)
<byteit101[m]> jdk 19 and ant 1.10.13
<byteit101[m]> windows 10
<headius> That should be the right module name running like a ruby command line
<byteit101[m]> Yes, but it seems to be under java.base not org.jruby.dist for some reason
<byteit101[m]> command line is .\bin\jruby.bat -S rake spec:ruby
<headius> Hmm
<headius> Ok could be a problem
<headius> In any case it's showing up for you so open a bug
cgautam[m] has quit [Quit: You have been kicked for being idle]
<headius> so I'm looking at this CSV thing and the biggest impact is character transcoding... removing "UTF-8" from the CSV foreach call doubles performance on 9.4.1
<headius> an allocation trace shows the top item as Joni `Ptr` instances, which we use to track the current pointer into the input and output byte arrays... it's >32% of the allocations in this benchmark run
<headius> not Joni, sorry, jcodings transcoder Ptr
<headius> preallocating that once per OpenFile/IO helps a little bit
<headius> the other big allocator is Joni of course
<enebo[m]> headius: if you believe the rdoc on the old fastercsv it says it also transcodes to utf-8
<headius> What I know is that removing the explicit internal encoding UTF-8 from the reporters script greatly improves the performance, but I have not validated that it's still parse is correctly
<enebo[m]> so just re-reading the older versions comments bit more I think only the headers are processed as UTF-8 and the rest is agnostic
<enebo[m]> so that could be a reasonable explanation for some of it
<headius> So in this case he is causing it to also transcode the rest of the content which it did not do before
<headius> A theory
<enebo[m]> I did a time profile between the two versions of fastercsv and there is a lot more going on in the newer version in Ruby itself as well
<enebo[m]> yeah if so that would be an obvious chunk perhaps enough to say these are close enough
<headius> The one he linked is the old one you tried?
<enebo[m]> I opened this issue
<headius> Oh right duh
<enebo[m]> yeah that was the version of fastercsv that I noticed was 2x faster
<enebo[m]> 1.7 was 1.9 by default and it did support m17n but fastercsv is not explicitly transcoding and default encoding was ascii
<enebo[m]> In notes from that other issue that did have a sizeable diff in perf even without transcoding
<enebo[m]> but largely due to complexity of walking UTF-8 over byte indexing
<headius> It turns out my layover in Newark is like 7 hours so I'm going to find a quiet place to work
<enebo[m]> yow
<enebo[m]> I spend a little time looking and I noticed things like String#split is prettyyu minor perf wise on older one but StringScanner#scan is pretty large in comparison
<enebo[m]> If this is in fact due to being UTF-8 then it should be simple enough to just make the test string UTF-8 and see the old one make split take much longer (which it may)
<enebo[m]> This profile is what made me opt != for fixnum (which eliminated half those calls on the old version)
<enebo[m]> Not important but just an aside
<headius> Nice
<enebo[m]> So another weird aspect of the UTF-8 transcoding detail
<enebo[m]> It is UTF-8 by default in 9.4
<enebo[m]> It shouldn't be doing anything
<headius> The transcoding overhead I saw was during IO
<headius> I think the encoding being passed in is passed through to IO as the external and internal and codings so get enables and extra layer of trends coating
<enebo[m]> hmm
<headius> JRuby 1.7 can't run on Java 17 because the modules got locked down
<enebo[m]> I have been using 8 for a reasonable comparison between the two
<enebo[m]> not specifically for this but in the other issue where the reporter is on 8
<enebo[m]> that is where I found the split optimizations but if I went up to 17 + indy I did get even more perf relative to MRI
<headius> I also noticed a lot of interpreter time in profile until I ran with jit threshold 0
<enebo[m]> This is in part because some toplevel methods do not run enough to JIT but are somewhat substantial
<enebo[m]> I also did the same thing which dropped down to 2.2s once that happened
<headius> yeah I figured it is some toplevel parser loop
<enebo[m]> old csv was much less Ruby
<headius> oy vey
<enebo[m]> One thing I also just realized is I did not check for exceptions
<enebo[m]> I did not spend much time on this to be fair but I did profile it
<headius> I just found a transcoding-related method used by this hot path that always allocates a StringBuilder to build an error message even if there was no error
<enebo[m]> huzzah
<headius> #2 in alloc profile
<enebo[m]> hahah
<enebo[m]> So what is even happening in transcode...isn't this a UTF-8 IO source being transcoded to UTF-8?
<headius> I dunno, your example had ISO-8859-1 in that call and if I remove it entirely it complains about the file not being UTF-8
<enebo[m]> ah yeah. the funny part here is I just grepped to see if I had a csv bench and this happened to be in my snippets dir
<enebo[m]> Perhaps the encoding: bit is strange in real life...or not
<enebo[m]> I guess the input is opened as 8 bit ascii so it cannot just use UTF-8
<enebo[m]> In this case perhaps this is a weird bench and it may just come down to 1.7 not really transcoding stuff
<headius> yeah could be
<headius> but I have found some improvements nonetheless
<enebo[m]> 1.7 is 1.9.x so it did supposedly handle transcoding but I do not think the layer was really working until later in 2.0 support
<enebo[m]> You said if you remove encoding: it gets a lot closer?
<enebo[m]> MRI between the two versions is like 5% slower or something around there
<enebo[m]> so it does slow down between the two versions but it is so close
<headius> if I remove encoding: altogether it errors
<enebo[m]> This was another reason why I figured this was a reasonable issue since we tank a lot more
<enebo[m]> ah it is real data and is really some ascii 8 bit
<headius> ok with my two fixes alloc profie is now all joni and strings
<enebo[m]> lol emacs is very unhappy selecting data in that file
<headius> 70GB of int[] in one minute sampled profile
<enebo[m]> joni needs some threadlocal temp buff love?
<enebo[m]> or is this all in-flight data which needs there own string
<headius> this is the machine workspace
<enebo[m]> oh joni
<headius> it could be cached
<enebo[m]> my brain read it as jcodings even though I even wrote it as joni
<enebo[m]> many many of the fields in joni are int
<enebo[m]> we just need to stuff them all in an array and unsafe write them
<headius> ah this is from Region alloc
<headius> so results of match
<enebo[m]> yeah region feels like it could mostly be [1, 2]
<headius> pretty sure this already has an optimization when there's only one region so not sure how much we can improve this
<enebo[m]> The problem of course some amount of time region is also two arrays of values for multiple matches
<enebo[m]> but many many regexps only have a single begin,end
<headius> actually it does not
<enebo[m]> yeah it makes two arrays all the time
<enebo[m]> I also think if nothing else this could just become one array as well
<headius> ugh why are these fields public
<headius> yeah that would be good too
<enebo[m]> heh
<enebo[m]> single array is doable without breaking an API
<enebo[m]> but yeah on matchless searches it should use an array with no fields
<enebo[m]> err just one beg/end
<enebo[m]> I believe exposting beg/end as public fields will require a major rev and those are pervasively accessed
<headius> yikes
<headius> they sure are
<enebo[m]> I believe the motivation was to "pre inline" what would have been a simple method
<enebo[m]> and there is a lot of fretting about whether everything is monomorphic
<headius> so we can't make any improvement here without a big change in JRuby
<enebo[m]> but the methods have more if stmts here and there to have a single version
<enebo[m]> not until they allow us to impl [] methods
<enebo[m]> I do think we could do this largely mechanically if we added signature for it
<headius> I'm gonna start that process
<enebo[m]> hahah
<headius> yeah first step add the accessors
<headius> deprecate direct field access and release
<enebo[m]> Is my brain messing with me...isn't there a zero method for primitive arrays?
<headius> you're got Ruby brain
<enebo[m]> or Rust brain maybe
<headius> a zero method?
<enebo[m]> Rust has ETOOMANYFUNCTIONS
<enebo[m]> reset all elements in an array to a value
<headius> ah nope nothing like that
<headius> Arrays.fill(ary, 0)
<enebo[m]> aha
<enebo[m]> yeah was what I was thinking of
<enebo[m]> If this had had an API for beg/end it sould have combined the arrays and removed the length field
<headius> yeah that is an easy change once encapsulated
<enebo[m]> but for things which don't care about more than a single beg/end then specialization would be needed
<enebo[m]> there is also some historyRoot which I suspect is not normally used
<headius> yeah either a branch in the accessors or a second Region class
<enebo[m]> ah yeah it is a feature
<enebo[m]> another reason to specialize
<enebo[m]> I guess having a single impl has its benefits
<enebo[m]> null fields with one impl vs not
<enebo[m]> I also would like to see region be thread local
<enebo[m]> maybe
<headius> looks like some joni releases went out without milestones?
<headius> huh we never set up a CI job for joni so I added one
<enebo[m]> headius: ah I did not realize we even had milestones for joni
<headius> yeah they don't get used much but they're there
<enebo[m]> ok thanks for doing them. I probably will do that next time 😀
<headius> initial change to Region accessors is there
<headius> need to release 2.1.47 with the deprecated fields and then 2.2 for their privatization
<enebo[m]> headius: heh...wtf is Ptr :)
<headius> you mind if I move that forward
<enebo[m]> I don't think so. It is surprising it is that few
<headius> it's an in/out var for byte index
<enebo[m]> ah so it needs boxing anyways
<headius> yeah there may be a more efficient way but smarter use of Ptr instances is best we can do for now really
<headius> offset changes could possibly be returned in the EConvResult
<headius> you'd have to manually adjust though
<enebo[m]> I imagine internally in joni the beg/end will be much larger change set
<headius> oops
<headius> this commit doesn't replace any of the reads
<headius> or the other field accesses
<headius> pushed too early
<enebo[m]> yeah I thought there were goign to be a ton
<headius> it's not that big actually
<headius> in jojni
<headius> jonjninionni
<enebo[m]> ah
<enebo[m]> I suppose I am just magnifying how much region is accessed/stored
<enebo[m]> string/regexp are most in JRuby itself although I know we leverage it in some other classes
<enebo[m]> lopex has entered the chat :)
<lopex[m]> numbers
<enebo[m]> I am likely done hacking today once I eat lunch but I almost have my split stuff cleaned up
<enebo[m]> I realize simpleRegexp detection is fairly complicated
<enebo[m]> e.g. making regexp into string splits
<enebo[m]> but my split cleanup has been pretty fun. The MRI port had us doing a lot of things I don't think mattered so I think it is looking a lot easier to understand
<enebo[m]> passing an 'i' which would only be 0 or 1 and depended on lim determining a boolean limit which would then interact with the 'i'.
<enebo[m]> I am still not convinced MRI needs that i either but I killed limit and i from any callers
<enebo[m]> The only obvious point of i I can find is 1 also signifies an explicit limit was passed but it does not use the value for that purpose
<headius> Yeah sounds good
<headius> I will roll with this Joni change before my flight and then review remaining 9.4.1 issues
<headius> My random fix seems nearly there but something is hanging in MRI core suite with it
<enebo[m]> ok
<headius> enebo: could you release joni for me pleeeeease
<headius> apparently I never set up my gpg key on this machine and don't have it handy
<headius> nevermind I managed
<headius> hmm
<headius> those MRI core CI runs are hanging without my branch
<headius> shite
<headius> strscan gem accesses Region fields
<headius> I guess it makes sense but this means updating strscan will break on 9.4.0 and older strscan will not work on 9.4.1
<headius> well, this will just be a failure until I can address it and we should release 9.4.1 with joni 2.1.47
<headius> I will yank this commit out of the optimization PR
<headius> I pushed a PR with the new SingleRegion impl
<headius> going dark for flight but I might pop back in later