cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake https://pypy.org | IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end and https://libera.irclog.whitequark.org/pypy | the pypy angle is to shrug and copy the implementation of CPython as closely as possible, and staying out of design decisions
itamarst has quit [Quit: Connection closed for inactivity]
MiguelX4135 has quit [Quit: Ping timeout (120 seconds)]
MiguelX413 has joined #pypy
<arigato> korvo: as far as I remember, none of our backends write real sin or cos instructions
<arigato> instead they write calls to functions written in C (maybe directly in the libc, or maybe the version produced by rlib)
<korvo> arigato: That would explain the hiccups I got earlier when I used math.sin() as a reference. It's fast enough most of the time, but it wasn't meant to be called ~48000 times/s.
<korvo> I'm currently using a Remez-optimized polynomial. It sounds great and only costs seven multiplications, six of which are FMAs. I think I'd need to literally call the CPU's sine instruction to go faster.
<korvo> ...And rumor is that the CPU has a lot of bad edge cases that I probably want to ignore anyway, but I won't know until I've tried.
<arigato> there's no FMA optimization, either
<arigato> but great, if that's already faster than using math.sin()
<korvo> Aw, that's unfortunate. I've gotten so used to seeing the `x * y + z` pattern, I shouldn't have assumed that it would be an FMA automatically.
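A minimal sketch of the kind of polynomial sine korvo describes, written with Horner's rule so that every step has the `x * y + z` shape. The coefficients here are plain Taylor coefficients used as stand-ins, not korvo's Remez-fitted values, and the name `poly_sin` is hypothetical. The degree is also lower than what the conversation implies, so this is only reasonable on roughly [-pi/2, pi/2].

```python
import math

# Coefficients of sin(x)/x as a polynomial in x*x.
# ASSUMPTION: these are Taylor coefficients, used only for illustration;
# a Remez fit would pick slightly different values to minimise the
# maximum error over the chosen interval.
COEFFS = [
    1.0,
    -1.0 / 6.0,
    1.0 / 120.0,
    -1.0 / 5040.0,
    1.0 / 362880.0,
    -1.0 / 39916800.0,
]


def poly_sin(x):
    """Approximate sin(x) for x in about [-pi/2, pi/2] (hypothetical helper)."""
    x2 = x * x
    acc = COEFFS[-1]
    # Horner's rule: each iteration is one multiply followed by one add,
    # i.e. the pattern that a fused multiply-add instruction would cover.
    for c in reversed(COEFFS[:-1]):
        acc = acc * x2 + c
    return x * acc


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, math.pi / 2):
        print(x, poly_sin(x), math.sin(x))
```

As arigato notes, each `acc * x2 + c` step is emitted as a separate multiply and add rather than a single FMA, so the multiplication count is the useful cost measure here.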
<korvo> glibc, at least, computes a Taylor series and is concerned with getting under 1 ULP at double precision: https://github.com/bminor/glibc/blob/master/sysdeps/ieee754/dbl-64/s_sin.c
<arigato> it might be possible to detect the pattern and replace it with a VFMADD132SD instruction (ugh what a name), but we can't do that for pypy, because it might cause the lower-precision bits to change slightly
<arigato> perhaps better, if you really need it, would be to add an rlib function that does the fma
<arigato> and that is what the JIT could turn into VFMADD132SD
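To illustrate why silently fusing `x * y + z` can change the low bits, here is a small sketch that emulates a fused multiply-add with exact rational arithmetic. The name `fused_multiply_add` is hypothetical and is not an existing rlib function; it only models the single-rounding behaviour of an FMA and is far too slow for real use.

```python
from fractions import Fraction


def fused_multiply_add(x, y, z):
    """Emulate x*y + z with one rounding at the end (hypothetical helper).

    Fraction arithmetic on floats is exact, so the only rounding happens
    in the final conversion back to float, as in a hardware FMA.
    """
    return float(Fraction(x) * Fraction(y) + Fraction(z))


if __name__ == "__main__":
    x = 1.0 + 2.0 ** -30
    y = 1.0 - 2.0 ** -30
    z = -1.0
    # Unfused: x * y is exactly 1 - 2**-60, which rounds to 1.0 before the add.
    print(x * y + z)
    # Fused: the exact product survives until the single final rounding.
    print(fused_multiply_add(x, y, z))
```

With these inputs the unfused expression gives 0.0 while the fused one gives -2**-60, which is the kind of difference that rules out applying the rewrite automatically to existing Python code.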
<korvo> Yeah. At that point I should probably get the autovectorizer to recognize what I'm doing.
<korvo> Meanwhile: I have built libsail, sail, and isla-sail at the correct versions. OCaml and Dune have run out of build errors to throw at me. Next step will be the actual compilation of the SAIL models, I guess.
<korvo> I should have sail-arm building, but for some reason `make gen_ir` doesn't build the ir/armv9.ir file. It does build ir/armv9.toml and it does print out a little message "Sail 0.18.0", so surely victory will come in the morning when I see some overlooked flag.
<korvo> sail-riscv and sail-cheriot will require some bash trickery because I am not in the mood to write out the names of 89 modules twice, changing every "32" to "64". But I am sure that I will soon have riscv.toml and cheriot.toml and a curious lack of any usable .ir files.
infernix has quit [Ping timeout: 246 seconds]
infernix has joined #pypy
slav0nic has joined #pypy
dustinm has quit [Ping timeout: 252 seconds]
dustinm has joined #pypy
itamarst has joined #pypy
[Arfrever] has quit [Ping timeout: 246 seconds]
[Arfrever] has joined #pypy
Dejan has joined #pypy
Dejan has quit [Quit: Leaving]
dmalcolm_ has joined #pypy
dmalcolm has quit [Ping timeout: 252 seconds]
slav0nic has quit [Ping timeout: 265 seconds]