#jruby on 2021-10-19 — irc logs at libera.catirclogs.org

2021-10-13 17:53 ChanServ changed the topic of #jruby to: Get 9.3.1.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

14:19 <headius> Good morning!

18:45 <puritylake[m]> Good evening

18:45 <puritylake[m]> Man been run off my feet this week

18:46 <puritylake[m]> Barely any time to do anything but I am free now

18:47 <enebo[m]> puritylake: howdy

18:49 <enebo[m]> so I nearly have the parser part of sprintf where all formats are correctly parsed

18:49 <puritylake[m]> How are you?

18:49 <enebo[m]> I decided to try and solidify that part so it should make converting to the new format a little less frustrating

18:50 <enebo[m]> puritylake: doing well.

18:50 <enebo[m]> The new sprintf stuff ported actually passes some specs the old one does not

18:51 <enebo[m]> puritylake: I can do a summary of what is involved and why this is an interesting project

18:51 <puritylake[m]> Sure, sounds good

18:51 <enebo[m]> sprintf itself is a huge piece of esoteric craziness

18:52 <enebo[m]> there is also a layer of stuff Ruby added (I think involving %{name} to extract fields out of hashes)
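
For readers who have not met the hash-based forms mentioned above, here is a small Ruby illustration. The hash keys and values are made up for the example; `%{name}` inserts the value's string form, while `%<name>` is followed by an ordinary conversion.

```ruby
# %{name} looks the key up in the hash and inserts the value's string form.
greeting = format("Hello %{name}", name: "JRuby")

# %<name>05d looks the key up and then applies a normal %05d conversion.
padded = format("%<count>05d", count: 42)

puts greeting  # Hello JRuby
puts padded    # 00042
```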

18:52 <enebo[m]> Our implementation and in fact most implementations seem to be written into a massive switch statement in a loop

18:53 <enebo[m]> We do not handle %a or %A and at one point a couple of us looked at adding them

18:53 <enebo[m]> if you look at the code (org.jruby.util.Sprintf) then you will see that a number of format specifiers in printf like %d or %b share the same chunk of code

18:54 <enebo[m]> in that chunk of code is lots of if statements

18:54 <enebo[m]> For some it is obvious they apply only to a particular format (like %u) but others are unclear unless you spend a lot of time reading some fairly complicated code

18:55 <enebo[m]> So motivation #1 was to untangle these sections into more straightforward separate methods even if it means some additional duplication

18:56 <enebo[m]> The second tangle of the printf implementation is that the format string (e.g. "%0.2d") is parsed every time you call printf, and this happens in the same code which is building up the result string

18:56 <enebo[m]> This is two activities which have been stirred together

18:56 <enebo[m]> This not only makes understanding what is happening more difficult, it also limits our ability to eliminate work

18:57 <enebo[m]> If at a sprintf call site (place we call it) we have a literal string (e.g. "%2d") then we can just parse that format string once and save it

18:58 <enebo[m]> Then each time we revisit that particular sprintf we just use what we have already parsed

18:58 <enebo[m]> (at this point in time saving this off is out of scope until this is working and debugged)
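
A minimal Ruby sketch of the two ideas described above: parsing the format string separately from building the result, and reusing the parsed form on later calls. Everything here is an assumption for illustration. The `FormatToken` struct, the method names, and the global cache are hypothetical and are not the code in the new_sprintf branch, and only `%d`, `%s` and `%%` with an optional `0` flag and width are handled.

```ruby
# Hypothetical token type; the real FormatToken in JRuby may look different.
FormatToken = Struct.new(:zero_pad, :width, :conversion)

# Phase 1: parse the format string into literal Strings and FormatTokens.
def parse_format(format)
  parts = []
  rest = format
  until rest.empty?
    if (m = rest.match(/\A%(0)?(\d+)?([ds%])/))
      parts << (m[3] == "%" ? "%" : FormatToken.new(!m[1].nil?, m[2]&.to_i, m[3]))
    elsif (m = rest.match(/\A[^%]+/))
      parts << m[0]
    else
      raise ArgumentError, "unsupported format: #{rest}"
    end
    rest = m.post_match
  end
  parts
end

# Phase 2: one small method per conversion instead of one big switch.
def format_d(token, arg)
  value = Integer(arg)
  sign = value.negative? ? "-" : ""
  digits = value.abs.to_s
  width = token.width || 0
  return (sign + digits).rjust(width) unless token.zero_pad
  sign + digits.rjust([width - sign.length, 0].max, "0")
end

def format_s(token, arg)
  arg.to_s.rjust(token.width || 0)
end

def render(parts, args)
  args = args.dup
  parts.map do |part|
    next part if part.is_a?(String)
    raise ArgumentError, "too few arguments" if args.empty?
    arg = args.shift
    part.conversion == "d" ? format_d(part, arg) : format_s(part, arg)
  end.join
end

# Stand-in for per-call-site caching: each distinct format is parsed once.
PARSED = Hash.new { |cache, fmt| cache[fmt] = parse_format(fmt) }

def cached_sprintf(format, *args)
  render(PARSED[format], args)
end

puts cached_sprintf("%03d items for %s", 7, "enebo")  # 007 items for enebo
```

A real implementation would presumably hang the parsed tokens off the call site rather than a global hash; the hash is only the shortest way to show the reuse.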

18:58 <enebo[m]> So this is the basic problem description.

18:59 <enebo[m]> Let me know if you have any comments or questions

19:01 <puritylake[m]> Makes sense to me, I assume a trip to the Ruby docs might be in order to make sure everything is up to mRuby?

19:02 <enebo[m]> well it would not hurt to learn sprintf beyond the simple stuff most of us will do

19:03 <puritylake[m]> I have used sprintf in the past, albeit in C, but I am no expert on it... yet

19:03 <enebo[m]> ok

19:04 <enebo[m]> For example '%*1$.*3$*2d' is a valid specifier :)

19:04 <enebo[m]> I have never actually seen anyone use that
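
The specifier above combines several features at once. As a gentler illustration in Ruby, these are simpler relatives of it that use one feature at a time: a width taken from the argument list with `*`, and absolute argument positions with `N$`. The values are made up for the example.

```ruby
# "*" takes the field width from the argument list (here 5).
right = format("%*d|", 5, 42)

# The "-" flag combined with "*" left-justifies within that width.
left = format("%-*d|", 5, 42)

# "N$" selects an argument by absolute position.
swapped = format("%2$s %1$s", "world", "hello")

puts right    # "   42|"
puts left     # "42   |"
puts swapped  # "hello world"
```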

19:05 <enebo[m]> but I have written the new parser already so it will generate a FormatToken which contains the data

19:05 <puritylake[m]> Pretty sure I heard a saying "imagine an obscure minuscule part of a code base, you can't remove it cause someone somewhere uses it" lol

19:06 <enebo[m]> I should say we have two test suites we can run to make sure we are passing all we passed with the old implementation

19:06 <enebo[m]> So let's talk a bit about the process of what you will work on

19:07 <enebo[m]> There is new code in SprintfParser and old code in Sprintf

19:08 <enebo[m]> In the new code I have converted %duibBoxX already

19:08 <enebo[m]> What I did was look for the letter I want to convert like %e and then in Sprintf I look for case 'e':

19:09 <enebo[m]> That will end up being a pretty big blob of code for eEgG

19:10 <enebo[m]> Actually I should not use the most complicated one as an example

19:10 <enebo[m]> %c is probably a better starting one

19:11 <enebo[m]> more or less you will make a method in SprintfParser called format_c() like format_idu() but the body of format_c you will paste from that section of the switch statement from Sprintf

19:11 <enebo[m]> It won't just compile as you will need to make some smallish changes to make it work with the new FormatToken object

19:12 <enebo[m]> but I have two methods already converted so you can look at what changes I made when I moved the code over

19:18 <enebo[m]> lol...someone was at the door

19:18 <enebo[m]> https://gist.github.com/enebo/f75b341c5d3db6ee05c1385f8097872c

19:18 <enebo[m]> At the moment the only way to turn on the new system is to set the env variable SPRINTF=anything

19:18 <enebo[m]> If it is a format specifier that is not supported it will just fall back to the old system

19:19 <enebo[m]> but the nice thing about this is that you can test with SPRINTF set and without to compare the outputs

19:20 <enebo[m]> That gist above is how to run both of the tests for printf code that we have (MRI internal test suite and the ruby/spec projects test suite)
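
One way to do the with-and-without comparison for a single one-liner is a small Ruby helper like the sketch below. It is not taken from the gist. The helper name `compare_sprintf` is hypothetical, and the default of a `jruby` executable on the PATH is an assumption. The only detail from the conversation is that setting the SPRINTF environment variable to any value turns on the new code.

```ruby
require "open3"

# Runs the same one-liner twice, once with SPRINTF unset and once with it set,
# and returns both outputs plus whether they match.
def compare_sprintf(code, interpreter: "jruby")
  # A nil value removes the variable from the child process environment.
  old_out, = Open3.capture2e({ "SPRINTF" => nil }, interpreter, "-e", code)
  new_out, = Open3.capture2e({ "SPRINTF" => "1" }, interpreter, "-e", code)
  { old: old_out, new: new_out, same: old_out == new_out }
end
```

For example, `p compare_sprintf('printf("%05d", 42)')` would show both outputs side by side on a machine where the branch's `jruby` is first on the PATH.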

19:21 <enebo[m]> One exciting thing is that I have fixed a few things we never supported while working on the parser, and I have little fear that I was unexpectedly breaking something else

19:21 <enebo[m]> This was why I mentioned that we attempted to add %a/%A. We failed because of the complexity of weaving it into that big switch

19:23 <enebo[m]> https://github.com/jruby/jruby/tree/new_sprintf

19:23 <puritylake[m]> I have that branch set as current at the moment

19:23 <enebo[m]> cool

19:23 <enebo[m]> And I just sort of picked this out as a fun and actually a pretty important thing to work on

19:24 <enebo[m]> If this is not fun or you are frustrated then you can say so and we can try and figure something else out

19:25 <puritylake[m]> I should be fine, if I get frustrated I'll stop for maybe a day and work on some personal stuff and come back at it the next day

19:25 <enebo[m]> Also I spent the last few days trying to make sure we passed all tests with what I ported over so my memory of this is good atm

19:25 <enebo[m]> So likely any question you have I will be pretty familiar with the code

19:27 <enebo[m]> If you have never written a recursive descent parser then try looking at the Lexer (hahah well sometimes the line between a lexer and a parser is fuzzy)

19:28 <enebo[m]> Hopefully we will not need a lot more changes there but it is I think quite a bit simpler to understand than the old loop

19:29 <puritylake[m]> I wrote a lisp-like language for my final college project which was earlier this year, I should be able to figure it out albeit mine wasn't very complex lol

19:30 <enebo[m]> The indexed (unnumbered) parsing is the only icky bit

19:30 <puritylake[m]> Had written a parser combinator for it in Swift but had to change to C# and couldn't figure out its generics

19:30 <puritylake[m]> Well not in time

19:31 <enebo[m]> heh...well not as cool as parser combinators but it is a work horse

19:31 <enebo[m]> descent parsers are easy to write anyways

19:32 <puritylake[m]> My problem with writing for something more complex than Lisp is I am unsure how to structure and iterate over the AST

19:33 <enebo[m]> yeah in other languages moving over the data requires some explicit code to allow it

19:33 <enebo[m]> If you look at IRBuilder we walk through our AST or you can look at tool/ast which does it a little differently

19:34 <enebo[m]> in JRuby 1.7 we did use an AST interpreter. That is surprisingly simple in that you just make each node type have an interpret() method
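
The "each node type has an interpret() method" idea can be sketched in a few lines of Ruby. The node classes below are hypothetical and belong to a toy arithmetic language. They are not JRuby's AST classes, and the `env` hash of variable bindings is an assumption of the sketch.

```ruby
# A literal number evaluates to itself.
class NumberNode
  def initialize(value)
    @value = value
  end

  def interpret(_env)
    @value
  end
end

# A variable is looked up in the environment passed down the tree.
class VariableNode
  def initialize(name)
    @name = name
  end

  def interpret(env)
    env.fetch(@name)
  end
end

# An addition interprets both children and combines the results.
class AddNode
  def initialize(left, right)
    @left = left
    @right = right
  end

  def interpret(env)
    @left.interpret(env) + @right.interpret(env)
  end
end

# Tree for: 1 + (x + 2)
tree = AddNode.new(NumberNode.new(1),
                   AddNode.new(VariableNode.new(:x), NumberNode.new(2)))
puts tree.interpret({ x: 39 })  # 42
```

Walking the tree is just the recursion inside each `interpret`, which is why no separate traversal code is needed in this style.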

19:34 <enebo[m]> you could decouple that if you wanted but it worked well enough

19:35 <puritylake[m]> I've had half baked ideas to write a language in it itself, a la pypy

19:35 <puritylake[m]> Although I think technically pypy is written in RPython

19:36 <enebo[m]> yeah it is

19:36 <enebo[m]> it is close enough to be metacircular

19:36 <enebo[m]> or I will give it that :)

19:37 <enebo[m]> A lot of people are into the idea of self-hosting a language impl. I guess I am as well but it is not clear-cut that it is always a good idea

19:37 <puritylake[m]> Everything comes with advantages and disadvantages

19:38 <puritylake[m]> If there was a perfect solution we'd all be using it

19:38 <enebo[m]> yeah exactly

19:41 <puritylake[m]> Old sprintf is a chunky file

19:41 <enebo[m]> yeah. The new one will still be pretty chunky

19:42 <enebo[m]> just a bit more separation

19:43 <puritylake[m]> Ya, new one kinda decouples the process

19:43 <puritylake[m]> Liking the look of it so far

19:45 <enebo[m]> It is possible once the first phase of conversion is done some other types like Arg can change

19:45 <enebo[m]> We have a requirement of passing in arguments as an Array or primitive array even in cases where we only pass in a single value

19:45 <enebo[m]> The extra creation of a data structure is a tiny performance cut

19:46 <enebo[m]> but this is why this activity is valuable: the more we simplify this, the easier it will be to make other changes

19:48 <enebo[m]> SPRINTF=1 jruby -e 'printf("%.*d", 2, 1)'

19:48 <enebo[m]> This is currently broken. I will try and fix it now and then we should be green with new for both test suites

19:58 <puritylake[m]> Is there meant to be no failing tests in the jruby/spec test suite?

19:59 <enebo[m]> nope

19:59 <enebo[m]> err no sprintf tests or are you seeing something else?

19:59 <enebo[m]> 2E?

20:00 <enebo[m]> 2 errors with too few arguments?

20:00 <puritylake[m]> No just zero failures on the specs, the first command you have in the gist

20:01 <puritylake[m]> I cleaned the files before rebuilding on the new branch

20:01 <enebo[m]> the idea was that everything in both command lines should not have anything failing or erroring

20:01 <enebo[m]> but there are 2F for the second one involving upto

20:02 <enebo[m]> I just fixed that locally but I think it hit a different error so a little more debugging and both should be green

20:02 <puritylake[m]> Ah cool, just making sure

20:03 <enebo[m]> upto for "00" calls sprintf %.*d internally and the new parser was going off the rails
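
For reference, this is what the two pieces discussed above are expected to produce in Ruby once they work: `%.*d` takes the precision (minimum digit count) from the argument list, and `String#upto` on zero-padded digit strings keeps the padding. The values are the ones from the conversation.

```ruby
# Precision 2 comes from the first argument, the value 1 from the second.
padded = format("%.*d", 2, 1)

# Digit-only strings are stepped numerically while keeping their width.
range = "00".upto("03").to_a

puts padded  # 01
p range      # ["00", "01", "02", "03"]
```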

20:03 <puritylake[m]> Thought I might have had some failing tests to make go green as I go

20:03 <enebo[m]> yeah I was hoping to give you as complete a parser as I can so it is just making sure you move over code without having to debug it

20:03 <enebo[m]> but I will push a fix once I have it and then on my machine I will be green for both of those command lines

20:04 <puritylake[m]> Cool, should I hold off til then?

20:04 <enebo[m]> naw. This problem will not bite you per se

20:04 <enebo[m]> and I will have it fixed in next 20 minutes so I doubt you will hit it before I fix it

20:05 <enebo[m]> moving the code over will take some time

20:06 <puritylake[m]> Cool, heading for a shower anyway then gonna settle in for the night, hopefully get something done tonight or at least gain some more knowledge of how I can make things work

20:07 <puritylake[m]> It's one thing being told how to do a problem but actually looking at the code is another

20:07 <enebo[m]> oh yeah this will take some time to start to grok it

20:07 <enebo[m]> so many independent variables

20:08 <puritylake[m]> I'll update you on my progress as I feel necessary

20:10 <enebo[m]> sure thing

22:00 drbobbeaty has quit [Ping timeout: 265 seconds]