itamarst has quit [Quit: Connection closed for inactivity]
korvo has quit [Ping timeout: 250 seconds]
korvo has joined #pypy
<nimaje>
seems like some bot found buildbot and tried stuff like sql injections on it
<cfbolz>
eh
<cfbolz>
seems it stopped now
marmoute has joined #pypy
<marmoute>
Okay, so my current testing with pypy gives disappointing results.
<marmoute>
I am looking into using pypy for some of the SoftwareHeritage workload.
<marmoute>
In addition to the core work of hashing stuff, sending stuff around and talking to the database, they have a large "graph" built from Python objects that use attrs and a lot of validators.
<marmoute>
These objects show up as taking a two-digit percentage in some profiles, so I thought that maybe pypy could help here.
<marmoute>
I tried applying pypy on the "storage server" side. It mostly does:
<marmoute>
- receive graph information (including data blob) through HTTP (Flask),
<marmoute>
- validate them (the part I was hoping to speed up),
<marmoute>
- store data in the database (PostgreSQL for my tests),
<marmoute>
In that case, pypy gives very similar timing (about 60s user CPU time on the server side, maybe 1 minute slower than CPython)
<marmoute>
So I tried applying pypy on the client side. For the case I tried, it mostly does:
<marmoute>
- download some data from the internet
<marmoute>
- find data on disk,
<marmoute>
- hash data
<marmoute>
- build the model graph (the part I am hoping to speed up)
<marmoute>
- send data to the server using some http rpc.
<marmoute>
In that case pypy is about twice as slow >.<
<marmoute>
Am I missing something obvious?
<cfbolz>
marmoute: how many C libraries are being used?
<marmoute>
In theory, not too many. The main bit is postgresql, and I think it is using the non-compiled version
<cfbolz>
right
<cfbolz>
marmoute: and you are testing for long enough that the JIT has a chance to produce code (ie a few minutes of running)?
<marmoute>
It runs for 1 hour
<marmoute>
(for the server test)
<marmoute>
The client test runs for between 3 and 5 minutes
<cfbolz>
right
<cfbolz>
both sound sufficient
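A quick way to check whether the JIT has warmed up is to time the same work in successive batches and see whether later batches get faster. The following is only a minimal sketch: `build_graph` is a hypothetical stand-in for the real model-graph code, and the batch sizes are arbitrary assumptions.

```python
import time


def build_graph(n):
    """Hypothetical stand-in workload: build a binary tree of plain dicts."""
    nodes = [{"id": i, "children": []} for i in range(n)]
    for i in range(1, n):
        # Attach node i to its parent in the implicit binary tree.
        nodes[(i - 1) // 2]["children"].append(nodes[i])
    return nodes


def time_batches(func, batches=5, calls_per_batch=20, size=2000):
    """Return the wall-clock time of each successive batch of calls."""
    timings = []
    for _ in range(batches):
        start = time.perf_counter()
        for _ in range(calls_per_batch):
            func(size)
        timings.append(time.perf_counter() - start)
    return timings


timings = time_batches(build_graph)
for index, seconds in enumerate(timings):
    print(f"batch {index}: {seconds:.4f}s")
```

Running the same script under CPython and pypy and comparing the later batches, rather than the first one, keeps warm-up cost out of the comparison.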
<cfbolz>
marmoute: the next steps would probably be to do some profiling to see whether anything in particular looks super slow on pypy
<marmoute>
apparently, the interfacing with libpq in psycopg is done through ctypes
<fijal>
that would do it
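To get a rough feel for per-call ctypes overhead on a given interpreter, one can time a trivial C call against an equivalent pure-Python call. This is only a sketch under assumptions: it targets a POSIX system, uses libc's `strlen` as a stand-in for libpq, and the helper names `py_len` and `seconds_per_call` are made up for illustration. It says nothing about psycopg itself.

```python
import ctypes
import ctypes.util
import time

# Assumption: POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t


def py_len(data):
    """Pure-Python counterpart used as the baseline."""
    return len(data)


def seconds_per_call(func, arg, calls=200_000):
    """Average wall-clock seconds for one call of func(arg)."""
    start = time.perf_counter()
    for _ in range(calls):
        func(arg)
    return (time.perf_counter() - start) / calls


payload = b"select 1"
ctypes_cost = seconds_per_call(strlen, payload)
python_cost = seconds_per_call(py_len, payload)
print(f"ctypes strlen: {ctypes_cost * 1e9:.0f} ns/call")
print(f"python len:    {python_cost * 1e9:.0f} ns/call")
```

Comparing the two numbers under CPython and under pypy gives a hint of how much of a database-heavy run could be going into the foreign-function layer rather than into Python code the JIT can optimize.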
<fijal>
marmoute: if you can organize me some way to look at the profiles I would be willing to have a quick look?
<marmoute>
I am using `psycopg`, the package for psycopg version 3
<marmoute>
fijal: Thanks for the offer! I need to figure out how to gather some profiles in the lasagna first.
<marmoute>
/me double-checks for other unsuspected compiled bits
<marmoute>
fijal: the recommended way to profile pypy is vmprof, right?
<fijal>
yeah....
<fijal>
I don't think it works super well, but there isn't anything better
<fijal>
it was a very insightful journey for me, but not necessarily one that left the artifact of a great product
<cfbolz>
I just merged the "known bits" jit optimization, btw
<cfbolz>
it lets the JIT reason much better about bit manipulation, which is uncommon in python but super helpful for pydrofoil
<cfbolz>
eg the jit now knows that even + even is an even number
<fijal>
heh
<cfbolz>
fijal: useful, isn't it? 😆
<fijal>
definitely fun
<cfbolz>
fijal: in the emulators it's used for stuff like this: you have a pointer p in a register that you dereference. the emulated CPU checks that it's aligned (and throws a fault if not). then you dereference p + 8
<fijal>
I can imagine
<cfbolz>
with the changes we can remove the alignment check for p + 8, because it follows from p being aligned
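The idea can be illustrated with a tiny "known bits" abstract domain. This is a hedged sketch, not PyPy's actual implementation: the `KnownBits` class, the 64-bit width, and the helper names are assumptions for illustration, and the addition rule follows the same shape as the Linux kernel's `tnum_add`.

```python
from dataclasses import dataclass

WIDTH = 64  # assumed word size for this sketch
FULL = (1 << WIDTH) - 1


@dataclass(frozen=True)
class KnownBits:
    ones: int      # bits known to be 1
    unknowns: int  # bits whose value is not known; all other bits are known 0

    def contains(self, n):
        """True if the concrete integer n is compatible with what is known."""
        return (n & ~self.unknowns & FULL) == self.ones


def const(n):
    """A fully known constant."""
    return KnownBits(n & FULL, 0)


def add(a, b):
    """Known-bits result of a + b (modulo 2**WIDTH)."""
    sum_unknowns = (a.unknowns + b.unknowns) & FULL
    sum_ones = (a.ones + b.ones) & FULL
    sigma = (sum_unknowns + sum_ones) & FULL
    # Bits where an unknown input could change a carry become unknown too.
    carries = sigma ^ sum_ones
    unknowns = (carries | a.unknowns | b.unknowns) & FULL
    return KnownBits(sum_ones & ~unknowns & FULL, unknowns)


EVEN = KnownBits(ones=0, unknowns=FULL & ~1)      # lowest bit known to be 0
ALIGNED8 = KnownBits(ones=0, unknowns=FULL & ~7)  # low three bits known 0

print(add(EVEN, EVEN) == EVEN)              # even + even is still even
print(add(ALIGNED8, const(8)) == ALIGNED8)  # aligned p + 8 is still aligned
```

Because the low three bits of `p + 8` are still known to be zero, a second alignment check on that address can be proven redundant and dropped.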
<fijal>
I'm far more skeptical about the social side of writing python interacting with optimizations than I am skeptical about rpython toolchain being good
<cfbolz>
:-)
<cfbolz>
yeah
<fijal>
I believe we addressed what people say makes python slow without making python significantly faster
<cfbolz>
but I'm sure somebody somewhere wrote a weird python program that benefits too ;-)
<fijal>
in any practical sense
<fijal>
yeah, I'm sure you both can write software that runs fast on pypy (I did for sure) and there is some stuff that just does
korvo has quit [Quit: Client closed]
itamarst has joined #pypy
jcea has joined #pypy
marmoute has quit [Quit: Client closed]
[Arfrever] has quit [Ping timeout: 256 seconds]
[Arfrever] has joined #pypy
marmoute has joined #pypy
korvo has joined #pypy
mjacob_ is now known as mjacob
marmoute has quit [Quit: Client closed]
marmoute has joined #pypy
<marmoute>
fijal: So I have a 166MB file produced by `-m vmprof -o /tmp/vmprof.log`. Is that what you want?
jcea has quit [Ping timeout: 268 seconds]
<korvo>
Yes. Try `vmprofshow /tmp/vmprof.log` to get an overview.
<cfbolz>
marmoute: you can also try the new (experimental) converter to the firefox profiler format, to reuse the firefox ui