<antocuni>
TL;DR: the benchmarks are not too bad, but not too good either. I would like to discuss alternatives before merging it
<phlebas>
antocuni: it's just that it would make the implementation easier on our side
<phlebas>
since it'd just be a single side stack and we just update a pointer to where the top is
<antocuni>
I see
<phlebas>
i need to think about it some more, especially why one would do interleaved trackers
<antocuni>
yes exactly, I'm trying to think the same
<antocuni>
the problem is that we don't know what reasonable usage patterns will emerge
<antocuni>
right off the bat, I'm tempted to say that nobody will ever want to use interleaved trackers: one per function should be fine
<antocuni>
but if we decide to make it illegal, we surely need to add a check in debug mode
<antocuni>
because on CPython (and PyPy) it will "just work", so nobody will ever notice that their code is wrong/illegal
<phlebas>
yes, indeed
<mattip>
some people asked me about benchmarks for the 0.0.2 release, maybe a subject for a follow-on blog post?
<antocuni>
uhm, maybe
<phlebas>
mattip: yes, good idea. we should summarize what we have for ujson and piconumpy again in a convenient table. unfortunately, ABI mode for graalpython currently looks really bad - but the universal mode with bitcode numbers are fine ;)
<antocuni>
pypy+piconumpy also didn't look very good last time I tried it
<antocuni>
phlebas: what do I need to do to try graalpython "with bitcode numbers" (whatever it means)?
<mattip>
well, even if it looks bad, it should give an indication of where we are
<phlebas>
you just need to use graalpython to build the extension, and then run the benchmark with that
<phlebas>
the extension binary is identical to the universal one that pypy generates; it just has an additional section containing the bitcode, which the pypy and cpython loaders ignore
<antocuni>
ah ok
<phlebas>
for piconumpy, that makes the .so file about 2x as large
<antocuni>
so, if I load a universal extension compiled by CPython or PyPy, it will work but will be slower
<antocuni>
phlebas: maybe I'm doing something wrong, but for me the ujson benchmark is 11.5x slower on graalpython than pypy: https://paste.opendev.org/show/807521/
* antocuni
--> lunch
<mattip>
if there is a blogpost, including full instructions to reproduce the table would also be nice, starting from "download these files, clone this repo"
<Hodgestar>
<Hodgestar>
phlebas: Conceptually trackers should really be independent of each other. If it is good for there to be a separate list under the hood, I think that should be hidden from the user. Maybe it could be a list of "tracker_id, handle" instead of just a list of "handle"? And perhaps "tracker_id, NULL" to represent the first handle of a tracker? Then even if the list grows, it should require quite an odd situation for any tracker method to be called when all its tracker_ids are not close to the end of the list?
<Hodgestar>
If that specific trick won't work, perhaps there is some similar idea that will.
<phlebas>
antocuni: for the ujson-loads benchmark, we're currently 3x slower than cpython running in our CI (0.3s for cpython per iteration, 1s for us); for runge_kutta with piconumpy, we're 7s to cpython's 4s in our CI
<antocuni>
phlebas: what result do you get if you run ujson-loads on your own machine? It's very weird that on CI it's 3x slower and on bencher7 is 11x slower, isn't it?