jacob22_ has quit [Quit: Konversation terminated!]
jacob22_ has joined #pypy
epony has joined #pypy
mattip has quit [Ping timeout: 260 seconds]
mattip has joined #pypy
lritter has joined #pypy
lritter has quit [Ping timeout: 248 seconds]
slav0nic has joined #pypy
Atque has joined #pypy
<Corbin>
https://bpa.st/VKHA I successfully switched to bytecode, and now my JIT traces are very nice-looking. Sadly, the overall performance is still not what I want. I know that I need to optimize the bytecode emitter and do some peephole work; are there any more obvious JIT things to clean up?
<Corbin>
I did test to see whether I need .can_enter_jit() on my bytecode JIT driver. I don't think I need it, but the traces are certainly more precise with it, since GOTO is a reliable loop marker.
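(As a point of reference, a minimal sketch of where the hints could sit in an RPython bytecode loop, with can_enter_jit placed only at backward GOTOs; all of the names here — bytecode, pc, stack, GOTO, dispatch — are placeholders, not the actual interpreter.)

    from rpython.rlib.jit import JitDriver

    driver = JitDriver(greens=['pc', 'bytecode'], reds=['stack'])

    def interpret(bytecode):
        pc = 0
        stack = []
        while pc < len(bytecode):
            driver.jit_merge_point(pc=pc, bytecode=bytecode, stack=stack)
            op = ord(bytecode[pc])
            if op == GOTO:                      # GOTO is a placeholder opcode
                target = ord(bytecode[pc + 1])
                if target <= pc:
                    # Backward jump: the reliable loop marker, so hint the JIT here.
                    driver.can_enter_jit(pc=target, bytecode=bytecode, stack=stack)
                pc = target
            else:
                pc = dispatch(op, pc, bytecode, stack)   # placeholder dispatch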
<cfbolz>
Corbin: nice
<cfbolz>
The most expensive-looking things are the news (the allocations) at the end
<Corbin>
cfbolz: Thank you so much for your advice! I feel like I leveled up this weekend.
<Corbin>
Yeah, allocating all the pairs and float boxes is not great. Could those be virtualizable, or would that just create more overhead?
<cfbolz>
Cheers
<cfbolz>
Corbin: virtualizables are for things that are mutated, and exist before the loop
<cfbolz>
What are the pairs actually storing?
<cfbolz>
Pairs of pairs with the occasional float?
<Corbin>
Yeah. Either it's the output type, which is a triple of floats (f, (f, f)), or it's a linked list (f, (f, (f, ...))), because that's how the bytecode machine represents lists.
<Corbin>
I haven't readded packed arrays yet, but my test algorithm doesn't have any.
<cfbolz>
Corbin: you can type specialize your pairs. Everything is immutable, right?
<Corbin>
Er, correction: not just lists, but stacks in the bytecode machine are also pairs.
<Corbin>
Yeah, everything's immutable. Specializing pairs makes sense.
<cfbolz>
So just have a number of rpython classes
<cfbolz>
And a smart constructor
<cfbolz>
Plus accessors
<cfbolz>
The question is how many variants you want
<Corbin>
I would be okay with using metaprogramming to generate a few dozen of them. This only makes sense if I can collapse two boxes into one, right?
<Corbin>
cfbolz: Oh, stupid question, but I have to ask: I spend like 3% of time in copysign(), according to perf, and that's the hottest frame. I really only use it to implement abs() for floats. Is there a faster technique?
<cfbolz>
Corbin: two boxes or even three
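(To make the suggestion concrete: a hedged sketch of type-specialized immutable pairs with a smart constructor and accessors; the class and method names are invented, and the real set of variants would depend on which shapes actually occur.)

    class Value(object):
        _immutable_ = True

    class Float(Value):
        _immutable_fields_ = ['value']
        def __init__(self, value):
            self.value = value

    class Pair(Value):
        # Generic fallback: two boxed values.
        _immutable_fields_ = ['left', 'right']
        def __init__(self, left, right):
            self.left = left
            self.right = right
        def first(self):
            return self.left
        def second(self):
            return self.right

    class PairOfFloats(Value):
        # Specialized variant: both floats stored unboxed in a single object,
        # collapsing what would otherwise be three allocations into one.
        _immutable_fields_ = ['f0', 'f1']
        def __init__(self, f0, f1):
            self.f0 = f0
            self.f1 = f1
        def first(self):
            return Float(self.f0)
        def second(self):
            return Float(self.f1)

    def make_pair(left, right):
        # Smart constructor: picks the most specialized representation.
        if isinstance(left, Float) and isinstance(right, Float):
            return PairOfFloats(left.value, right.value)
        return Pair(left, right)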
<cfbolz>
Corbin: copysign, I have no idea honestly
<Corbin>
No worries; I realized it was a demanding question as soon as I asked it. Sorry.
<cfbolz>
Corbin: why is copysign better than just calling abs?
<Corbin>
I don't remember. I suspect it isn't.
<cfbolz>
yeah, I'd go with abs
<cfbolz>
seems the JIT knows about that better
<cfbolz>
and produces x & 0x7FFFFFFFFFFFFFFF :-)
<cfbolz>
we could teach the JIT some optimizations about it too, eg that the result is >= 0
<Corbin>
Interesting how it can do that, but can't look inside copysign() with a constant argument. Probably because nobody bothered to optimize it, because copysign() has always been kind of like this. Sometimes history really sucks.
<Corbin>
I should stop complaining and contribute something useful. I could strength-reduce copysign to abs or negate.
<cfbolz>
Corbin: is copysign(1, x) always equal to abs?
<Corbin>
cfbolz: I think so? I'm ready to be wrong.
<Corbin>
...Wait, I'm holding the wrong argument constant, aren't I.
<cfbolz>
I don't know 🤷‍♀️
<cfbolz>
Corbin: "I'm ready to be wrong." is a dangerous level of precision when floats are involved 😆
<Corbin>
Okay, I now get a float_abs() jitcode. Thanks for the help.
<cfbolz>
Corbin: if you want to add some optimizations for that, would be a fun first jit patch ;-)
<cfbolz>
it's idempotent!
<cfbolz>
the result is positive!
<Corbin>
Yes, I should.
<LarstiQ>
non-negative!
<cfbolz>
yep
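(Not PyPy's optimizer API, just the two facts from above written out as checks a peephole rule could rely on; NaN is left out, since float comparisons involving NaN are false anyway:)

    def check_abs_facts(x):
        assert abs(abs(x)) == abs(x)   # idempotent: abs of abs is abs
        assert abs(x) >= 0.0           # result is non-negative

    for x in [0.0, -0.0, 3.5, -3.5, float('inf'), float('-inf')]:
        check_abs_facts(x)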
greedom has joined #pypy
<LarstiQ>
getting flashbacks to |x| not being differentiable in optimization problems
<cfbolz>
LarstiQ: details
<LarstiQ>
cfbolz: important details in some cases :)
greedom has quit [Remote host closed the connection]
_0az3 has joined #pypy
greedom has joined #pypy
greedom has quit [Remote host closed the connection]
slav0nic has quit [Ping timeout: 248 seconds]
<mattip>
testing scipy with pypy, I am seeing
<mattip>
ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
<mattip>
I think it is due to a circular import in cython-related code