#pypy on 2025-01-09 — irc logs at libera.irclog.whitequark.org

2022-11-09 10:48 cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake https://pypy.org | IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end and https://libera.irclog.whitequark.org/pypy | the pypy angle is to shrug and copy the implementation of CPython as closely as possible, and staying out of design decisions

05:09 itamarst has quit [Quit: Connection closed for inactivity]

05:41 MiguelX4135 has quit [Quit: Ping timeout (120 seconds)]

05:42 MiguelX413 has joined #pypy

05:56 <arigato> korvo: as far as I remember, none of our backends write real sin or cos instructions

05:57 <arigato> instead they write calls to functions written in C (maybe directly in the libc, or maybe the version produced by rlib)

05:58 <korvo> arigato: That would explain the hiccups I got earlier when I used math.sin() as a reference. It's fast enough most of the time, but it wasn't meant to be called ~48000 times/s.

05:59 <korvo> I'm currently using a Remez-optimized polynomial. It sounds great and only costs seven multiplications, six of which are FMAs. I think I'd need to literally call the CPU's sine instruction to go faster.
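For readers following along, the polynomial approach korvo describes can be sketched roughly as below. This is a minimal sketch and not korvo's code: the coefficients are plain Taylor coefficients used as a stand-in (a Remez fit would produce slightly different values, which the log does not give), the name `poly_sin` is hypothetical, and the input is assumed to be already reduced to [-pi/2, pi/2]. It happens to use seven multiplications, five of them in the `x * y + z` shape; the exact arrangement korvo uses is not shown in the log.

```python
import math

# Coefficients of P in sin(x) ~= x * P(x*x), highest degree first.
# ASSUMPTION: these are Taylor coefficients, used only as a stand-in
# for the Remez-optimized values, which are not given in the log.
TAYLOR_COEFFS = [
    -1.0 / 39916800.0,  # x**11 term
    1.0 / 362880.0,     # x**9 term
    -1.0 / 5040.0,      # x**7 term
    1.0 / 120.0,        # x**5 term
    -1.0 / 6.0,         # x**3 term
    1.0,                # x term
]


def poly_sin(x):
    """Approximate sin(x) for x in [-pi/2, pi/2] using Horner's scheme."""
    x2 = x * x                  # one plain multiplication
    acc = TAYLOR_COEFFS[0]
    for c in TAYLOR_COEFFS[1:]:
        acc = acc * x2 + c      # each step has the multiply-add shape
    return x * acc              # final plain multiplication


# Largest deviation from math.sin on a grid over [-pi/2, pi/2].
worst = max(
    abs(poly_sin(i * math.pi / 200.0) - math.sin(i * math.pi / 200.0))
    for i in range(-100, 101)
)
print(worst)
```

Whether each `acc * x2 + c` step becomes a single fused instruction depends on the compiler or JIT; in plain Python it is a rounded multiply followed by a rounded add.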

06:00 <korvo> ...And rumor is that the CPU has a lot of bad edge cases that I probably want to ignore anyway, but I won't know until I've tried.

06:00 <arigato> there's no FMA optimization, either

06:01 <arigato> but great, if that's already faster than using math.sin()

06:03 <korvo> Aw, that's unfortunate. I've gotten so used to seeing the `x * y + z` pattern, I shouldn't have assumed that it would be an FMA automatically.

06:12 <korvo> glibc, at least, computes a Taylor series and is concerned with getting under 1 ULP at double precision: https://github.com/bminor/glibc/blob/master/sysdeps/ieee754/dbl-64/s_sin.c

06:13 <korvo> So does fdlibm: https://www.netlib.org/fdlibm/k_sin.c

06:51 <arigato> it might be possible to detect the pattern and replace it with a VFMADD132SD instruction (ugh what a name), but we can't do that for pypy, because it might cause the lower-precision bits to change slightly

06:52 <arigato> perhaps better, if you really need it, would be to add an rlib function that does the fma

06:53 <arigato> and that is what the JIT could turn into VFMADD132SD
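The concern about the low-order bits can be illustrated with a small sketch. It assumes nothing about PyPy or rlib internals: `fused_multiply_add` is a hypothetical helper that emulates a single-rounding multiply-add using exact rational arithmetic, which is slow but shows why silently replacing `x * y + z` with a fused instruction can change results.

```python
from fractions import Fraction


def fused_multiply_add(x, y, z):
    """Emulate an FMA: compute x*y + z exactly, then round once.

    ASSUMPTION: hypothetical helper for illustration only. Converting a
    float to Fraction is exact, and float(Fraction) rounds to nearest.
    """
    return float(Fraction(x) * Fraction(y) + Fraction(z))


x = 1.0 + 2.0 ** -52   # smallest double above 1.0
y = 1.0 - 2.0 ** -52   # exactly representable, just below 1.0
z = -1.0

# Unfused: x*y is exactly 1 - 2**-104, which rounds to 1.0 before the add.
unfused = x * y + z

# Fused: the exact product survives until the single final rounding.
fused = fused_multiply_add(x, y, z)

print(unfused, fused)
```

Here the unfused expression yields `0.0` while the fused one keeps the tiny residual `-2**-104`, so a pattern-matching rewrite would change observable results. An explicit function that user code opts into avoids that surprise.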

06:56 <korvo> Yeah. At that point I should probably get the autovectorizer to recognize what I'm doing.

06:58 <korvo> Meanwhile: I have built libsail, sail, and isla-sail at the correct versions. OCaml and Dune have run out of build errors to throw at me. Next step will be the actual compilation of the SAIL models, I guess.

08:01 <korvo> I should have sail-arm building, but for some reason `make gen_ir` doesn't build the ir/armv9.ir file. It does build ir/armv9.toml and it does print out a little message "Sail 0.18.0", so surely victory will come in the morning when I see some overlooked flag.

08:02 <korvo> sail-riscv and sail-cheriot will require some bash trickery because I am not in the mood to write out the names of 89 modules twice, changing every "32" to "64". But I am sure that I will soon have riscv.toml and cheriot.toml and a curious lack of any usable .ir files.

08:32 infernix has quit [Ping timeout: 246 seconds]

08:34 infernix has joined #pypy

10:00 slav0nic has joined #pypy

10:52 dustinm has quit [Ping timeout: 252 seconds]

10:53 dustinm has joined #pypy

12:58 itamarst has joined #pypy

13:26 [Arfrever] has quit [Ping timeout: 246 seconds]

13:29 [Arfrever] has joined #pypy

14:33 Dejan has joined #pypy

17:13 Dejan has quit [Quit: Leaving]

19:12 dmalcolm_ has joined #pypy

19:15 dmalcolm has quit [Ping timeout: 252 seconds]

22:19 slav0nic has quit [Ping timeout: 265 seconds]