<headius> Good morning!
<puritylake[m]> Good evening
<puritylake[m]> Man been run off my feet this week
<puritylake[m]> Barely any time to do anything but I am free now
<enebo[m]> puritylake: howdy
<enebo[m]> so I nearly have the parser part of sprintf where all formats are correctly parsed
<puritylake[m]> How are you?
<enebo[m]> I decided to try and solidify that part so it should make converting to the new format a little less frustrating
<enebo[m]> puritylake: doing well.
<enebo[m]> The new sprintf stuff ported actually passes some specs the old one does not
<enebo[m]> puritylake: I can do a summary of what is involved and why this is an interesting project
<puritylake[m]> Sure, sounds good
<enebo[m]> sprintf itself is a huge piece of esoteric craziness
<enebo[m]> there is also a layer of stuff Ruby added (I think involving %{name} to extract fields out of hashes)
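For reference, a small Ruby illustration of the hash-based references mentioned here. The keys `name` and `count` are made up for the example: `%{name}` substitutes the value as-is, while `%<count>03d` looks the value up and then applies the usual flags and width.

```ruby
# Named references pull their values out of a hash argument.
# %{name}      : plain substitution, no formatting applied
# %<count>03d  : lookup by key, then formatted like %03d
line = format("%{name} has %<count>03d items", name: "cart", count: 7)
puts line
```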
<enebo[m]> Our implementation and in fact most implementations seem to be written into a massive switch statement in a loop
<enebo[m]> We do not handle %a or %A and at one point a couple of us looked at adding them
<enebo[m]> if you look at the code (org.jruby.util.Sprintf) then you will see that a number of format specifiers in printf like %d or %b share the same chunk of code
<enebo[m]> in that chunk of code is lots of if statements
<enebo[m]> For some it is obvious they are only for a particular format (like %u) but others are unclear unless you spend a lot of time reading some fairly complicated code
<enebo[m]> So motivation #1 was to untangle these sections into more straightforward separate methods even if it means some additional duplication
<enebo[m]> The second tangle of the printf implementation is that the format string (e.g. "%0.2d") is processed every time you call printf and it happens in the same code which is building up the result string
<enebo[m]> These are two activities which have been stirred together
<enebo[m]> This not only makes understanding what is happening more difficult, it also limits our ability to eliminate work
<enebo[m]> If at a sprintf call site (place we call it) we have a literal string (e.g. "%2d") then we can just parse that format string once and save it
<enebo[m]> Then each time we revisit that particular sprintf we just use what we have already parsed
<enebo[m]> (at this point in time saving this off is out of scope until this is working and debugged)
<enebo[m]> So this is the basic problem description.
<enebo[m]> Let me know if you have any comments or questions
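A minimal Ruby sketch of the separation described above: parse the format string into tokens once, then render from the tokens on every call. Everything here is hypothetical (`Token`, `parse_format`, `render`, `format_d`, `format_s`, `cached_sprintf`, `FORMAT_CACHE`); it is not JRuby's SprintfParser or FormatToken, and it only handles `%d`, `%s` and `%%` with the `-`, `+` and `0` flags.

```ruby
# Hypothetical names throughout; a tiny subset of sprintf for illustration.
Token = Struct.new(:literal, :flags, :width, :precision, :type)
SPEC = /%([-+0]*)(\d+)?(?:\.(\d+))?([ds%])/

# Phase 1: parse the format string into tokens (only needs to happen once).
def parse_format(fmt)
  tokens = []
  pos = 0
  while (m = SPEC.match(fmt, pos))
    tokens << Token.new(fmt[pos...m.begin(0)]) if m.begin(0) > pos
    tokens << Token.new(nil, m[1], m[2]&.to_i, m[3]&.to_i, m[4])
    pos = m.end(0)
  end
  tokens << Token.new(fmt[pos..-1]) if pos < fmt.length
  tokens
end

def justify(text, token)
  width = token.width || 0
  token.flags.include?("-") ? text.ljust(width) : text.rjust(width)
end

# One small method per specifier instead of one shared chunk of ifs.
def format_d(token, value)
  digits = value.abs.to_s
  digits = digits.rjust(token.precision, "0") if token.precision
  sign = if value < 0 then "-" elsif token.flags.include?("+") then "+" else "" end
  zero_fill = token.flags.include?("0") && !token.flags.include?("-") && token.precision.nil?
  return justify(sign + digits, token) unless zero_fill
  sign + digits.rjust((token.width || 0) - sign.length, "0")
end

def format_s(token, value)
  text = value.to_s
  text = text[0, token.precision] if token.precision
  justify(text, token)
end

# Phase 2: build the result string from already-parsed tokens.
def render(tokens, args)
  remaining = args.dup
  tokens.map do |token|
    case token.type
    when nil then token.literal
    when "%" then "%"
    when "d" then format_d(token, Integer(remaining.shift))
    when "s" then format_s(token, remaining.shift)
    end
  end.join
end

# Parsed tokens are saved per format string and reused on later calls.
FORMAT_CACHE = Hash.new { |cache, fmt| cache[fmt] = parse_format(fmt) }

def cached_sprintf(fmt, *args)
  render(FORMAT_CACHE[fmt], args)
end

puts cached_sprintf("%05d|%-4s|%.2d|%+d 100%%", 42, "ab", 7, 3)
```

The cache is keyed on the format string only to keep the sketch short; saving the parsed result at the call site itself is the idea from the chat, and it is described there as out of scope for now.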
<puritylake[m]> Makes sense to me, I assume a trip to the Ruby docs might be in order to make sure everything is up to mRuby?
<enebo[m]> well it would not hurt to learn sprintf beyond the simple stuff most of us will do
<puritylake[m]> I have used sprintf in the past, albeit in C but I am no expert on it... yet
<enebo[m]> ok
<enebo[m]> For example '%*1$.*3$*2d' is a valid specifier :)
<enebo[m]> I have never actually seen anyone use that
<enebo[m]> but I have written the new parser already so it will generate a FormatToken which contains the data
<puritylake[m]> Pretty sure I heard a saying "imagine an obscure minuscule part of a code base, you can't remove it cause someone somewhere uses it" lol
<enebo[m]> I should say we have two test suites we can run to make sure we are passing all we passed with the old implementation
<enebo[m]> So let's talk a bit about the process of what you will work on
<enebo[m]> There is new code in SprintfParser and old code in Sprintf
<enebo[m]> In the new code I have converted %duibBoxX already
<enebo[m]> What I did was look for the letter I want to convert like %e and then in Sprintf I look for case 'e':
<enebo[m]> That will end up being a pretty big blob of code for eEgG
<enebo[m]> Actually I should not use the most complicated one as an example
<enebo[m]> %c is probably a better starting one
<enebo[m]> more or less you will make a method in SprintfParser called format_c() like format_idu() but the body of format_c you will paste from that section of the switch statement from Sprintf
<enebo[m]> It won't just compile as you will need to make some smallish changes to make it work with the new FormatToken object
<enebo[m]> but I have two methods already converted so you can look at what changes I made when I moved the code over
<enebo[m]> lol...someone was at the door
<enebo[m]> At the moment the only way to turn on the new system is to set the env variable SPRINTF=anything
<enebo[m]> If it is a format specifier that is not supported it will just fall back to the old system
<enebo[m]> but the nice thing about this is that you can test with SPRINTF set and without to compare the outputs
<enebo[m]> That gist above is how to run both of the tests for printf code that we have (MRI internal test suite and the ruby/spec projects test suite)
<enebo[m]> One exciting thing is I have fixed a few problems while working on the parser that we have never supported and I had little fear I was unexpectedly breaking something else
<enebo[m]> This was why I mentioned that we attempted to add %a/%A. We failed because of the complexity of weaving it into that big switch
<puritylake[m]> I have that branch set as current at the moment
<enebo[m]> cool
<enebo[m]> And I just sort of picked this out as a fun and actually a pretty important thing to work on
<enebo[m]> If this is not fun or you are frustrated then you can say so and we can try and figure something else out
<puritylake[m]> I should be fine, if I get frustrated I'll stop for maybe a day and work on some personal stuff and come back at it the next day
<enebo[m]> Also I spent the last few days trying to make sure we passed all tests with what I ported over so my memory of this is good atm
<enebo[m]> So likely any question you have I will be pretty familiar with the code
<enebo[m]> If you have never written a recursive descent parser then look at the Lexer (hahah well sometimes a lexer and parser is a fuzzy line)
<enebo[m]> Hopefully we will not need a lot more changes there but it is I think quite a bit simpler to understand than the old loop
<puritylake[m]> I wrote a lisp-like language for my final college project which was earlier this year, I should be able to figure it out albeit mine wasn't very complex lol
<enebo[m]> The indexed (unnumbered) parsing is the only icky bit
<puritylake[m]> Had written a parser combinator for it in Swift but had to change to C# and couldn't figure out its generics
<puritylake[m]> Well not in time
<enebo[m]> heh...well not as cool as parser combinators but it is a work horse
<enebo[m]> descent parsers are easy to write anyways
<puritylake[m]> My problem with writing for something more complex than Lisp is I am unsure how to structure and iterate over the AST
<enebo[m]> yeah in other languages moving over the data requires some explicit code to allow it
<enebo[m]> If you look at IRBuilder we walk through our AST or you can look at tool/ast which does it a little differently
<enebo[m]> in JRuby 1.7 we did use an AST interpreter. That is surprisingly simple in that you just make each node type have an interpret() method
<enebo[m]> you could decouple that if you wanted but it worked well enough
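A small Ruby sketch of the "each node type has an interpret() method" style of AST interpreter. The node classes here (`NumberNode`, `VarNode`, `AddNode`, `MulNode`) are invented for illustration and are not JRuby's actual node types.

```ruby
# Illustrative node classes only; each one knows how to evaluate itself.
class NumberNode
  def initialize(value)
    @value = value
  end

  def interpret(_env)
    @value
  end
end

class VarNode
  def initialize(name)
    @name = name
  end

  # Variables are looked up in the environment hash passed down the tree.
  def interpret(env)
    env.fetch(@name)
  end
end

class AddNode
  def initialize(left, right)
    @left = left
    @right = right
  end

  # Interior nodes recurse into their children and combine the results.
  def interpret(env)
    @left.interpret(env) + @right.interpret(env)
  end
end

class MulNode
  def initialize(left, right)
    @left = left
    @right = right
  end

  def interpret(env)
    @left.interpret(env) * @right.interpret(env)
  end
end

# AST for: 1 + x * 3
tree = AddNode.new(NumberNode.new(1),
                   MulNode.new(VarNode.new(:x), NumberNode.new(3)))
puts tree.interpret({ x: 4 })
```

Walking the tree is just the recursive `interpret` calls, which is why this style needs no separate visitor; decoupling evaluation from the nodes would move that logic into a separate walker.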
<puritylake[m]> I've had half baked ideas to write a language in it itself, a la pypy
<puritylake[m]> Although I think technically pypy is written in RPython
<enebo[m]> yeah it is
<enebo[m]> it is close enough to be metacircular
<enebo[m]> or I will give it that :)
<enebo[m]> A lot of people are into the idea of self-hosting a language impl. I guess I am as well but it is not clear-cut that it is always a good idea
<puritylake[m]> Everything comes with advantages and disadvantages
<puritylake[m]> If there was a perfect solution we'd all be using it
<enebo[m]> yeah exactly
<puritylake[m]> Old sprintf is a chunky file
<enebo[m]> yeah. The new one will still be pretty chunky
<enebo[m]> just a bit more separation
<puritylake[m]> Ya, new one kinda decouples the process
<puritylake[m]> Liking the look of it so far
<enebo[m]> It is possible once the first phase of conversion is done some other types like Arg can change
<enebo[m]> We have a requirement of passing in arguments as an Array or primitive array even in cases where we only pass in a single value
<enebo[m]> The extra creation of a data structure is a tiny performance cut
<enebo[m]> but this is why this activity is valuable: the more we simplify this the easier it will be to make other changes
<enebo[m]> SPRINTF=1 jruby -e 'printf("%.*d", 2, 1)'
<enebo[m]> This is currently broken. I will try and fix it now and then we should be green with new for both test suites
<puritylake[m]> Are there meant to be no failing tests in the jruby/spec test suite?
<enebo[m]> nope
<enebo[m]> err no sprintf tests or are you seeing something else?
<enebo[m]> 2E?
<enebo[m]> 2 errors with too few arguments?
<puritylake[m]> No just zero failures on the specs, the first command you have in the gist
<puritylake[m]> I cleaned the files before rebuilding on the new branch
<enebo[m]> the idea was that everything in both command lines should not have anything failing or erroring
<enebo[m]> but there are 2F for the second one involving upto
<enebo[m]> I just fixed that locally but I think it hit a different error so a little more debugging and both should be green
<puritylake[m]> Ah cool, just making sure
<enebo[m]> upto for "00" calls sprintf %.*d internally and the new parser was going off the rails
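To make the failing case concrete, a short Ruby illustration of `%.*d` and of the all-digit `String#upto` behavior as seen in MRI. The values are chosen for the example; the claim that upto goes through `%.*d` internally comes from the chat, and the snippet only shows the matching visible behavior.

```ruby
# '*' in the precision slot takes the precision from the argument list,
# so this formats the value 9 with a precision of 2 (zero-padded digits).
padded = format("%.*d", 2, 9)

# With all-digit strings, upto counts numerically but keeps the width
# of the starting string, which matches what "%.*d" with width 2 produces.
steps = "08".upto("11").to_a

puts padded
puts steps.inspect
```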
<puritylake[m]> Thought I might have had some failing tests to make go green as I go
<enebo[m]> yeah I was hoping to give you as complete a parser as I can so it is just making sure you move over code without having to debug it
<enebo[m]> but I will push a fix once I have it and then on my machine I will be green for both of those command lines
<puritylake[m]> Cool, should I hold off til then?
<enebo[m]> naw. This problem will not bite you per se
<enebo[m]> and I will have it fixed in next 20 minutes so I doubt you will hit it before I fix it
<enebo[m]> moving the code over will take some time
<puritylake[m]> Cool, heading for a shower anyway then gonna settle in for the night, hopefully get something done tonight or at least gain some more knowledge of how I can make things work
<puritylake[m]> It's one thing being told how to do a problem but actually looking at the code is another
<enebo[m]> oh yeah this will take some time to start to grok it
<enebo[m]> so many independent variables
<puritylake[m]> I'll update you on my progress as I feel necessary
<enebo[m]> sure thing
drbobbeaty has quit [Ping timeout: 265 seconds]