cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake https://pypy.org | IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end and https://libera.irclog.whitequark.org/pypy | the pypy angle is to shrug, copy the implementation of CPython as closely as possible, and stay out of design decisions
jcea has quit [Remote host closed the connection]
jcea has joined #pypy
jcea has quit [Ping timeout: 246 seconds]
itamarst has quit [Quit: Connection closed for inactivity]
korvo has quit [Ping timeout: 250 seconds]
korvo has joined #pypy
<nimaje> seems like some bot found buildbot and tried stuff like sql injections on it
<cfbolz> eh
<cfbolz> seems it stopped now
marmoute has joined #pypy
<marmoute> Okay, so my current testing with pypy gives disappointing results.
<marmoute> I am looking into using pypy for some of the SoftwareHeritage workload.
<marmoute> In addition to the core work of hashing stuff, sending stuff around and talking to the database, they have a large "graph" built from Python objects that use attrs and a lot of validators.
<marmoute> These objects show up as taking a two-digit percentage in some profiles. So I thought that maybe pypy could help here.
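For a rough idea of what such a model class looks like, here is a generic attrs illustration with made-up names, not the actual SoftwareHeritage classes: every validator runs on each instantiation, which is why constructing a big graph of these objects can take a visible share of a profile.

    # generic sketch of an attrs class with validators; class and field
    # names are hypothetical, not the real model
    import attr

    @attr.s(slots=True, frozen=True)
    class Content:
        sha1 = attr.ib(validator=attr.validators.instance_of(bytes))
        length = attr.ib(validator=attr.validators.instance_of(int))

        @length.validator
        def _check_length(self, attribute, value):
            # attrs runs every validator on every instantiation, so a large
            # graph of such objects pays this cost once per node
            if value < 0:
                raise ValueError("length must be non-negative")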
<marmoute> I tried applying pypy on the "storage server" side. It mostly does:
<marmoute> - receive graph information (including data blob) through HTTP (Flask),
<marmoute> - validate them (the part I was hoping to speed up),
<marmoute> - store data in a database (postgres for my tests).
<marmoute> In that case, pypy gives very similar timing (about 60s user CPU time on the server side, maybe 1 minute slower than CPython)
<marmoute> So I tried applying pypy on the client side. For the case I tried, it mostly does:
<marmoute> - download some data from the internet
<marmoute> - find data on disk,
<marmoute> - hash data
<marmoute> - build the model graph (part I am hoping to speedup)
<marmoute> - send data to the server using some http rpc.
<marmoute> In that case pypy is about twice as slow >.<
<marmoute> Am I missing something obvious ?
<cfbolz> marmoute: how many C libraries are being used?
<marmoute> In theory, not too many. The main bit is postgresql, and I think it is using the non-compiled version
<cfbolz> right
<cfbolz> marmoute: and you are testing for long enough that the JIT has a chance to produce code (ie a few minutes of running)?
<marmoute> It runs for 1 hour
<marmoute> (for the server test)
<marmoute> The client test runs for between 3 and 5 minutes
<cfbolz> right
<cfbolz> both sound sufficient
<cfbolz> marmoute: the next steps would probably be to do some profiling to see whether anything in particular looks super slow on pypy
<marmoute> apparently, the interfacing with libpq in psycopg is done through ctypes
<fijal> that would do it
<fijal> marmoute: if you can organize me some way to look at the profiles I would be willing to have a quick look?
<LarstiQ> marmoute: https://pypi.org/project/psycopg2cffi/ doesn't look actively maintained, hmm
<marmoute> I am using `psycopg`, the package for psycopg version 3
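If I remember the psycopg 3 layout correctly (worth double-checking against the psycopg docs, this is from memory), it can report at runtime which libpq wrapper it actually loaded, which makes it easy to confirm whether the ctypes-based pure-Python implementation is in use:

    # check which libpq implementation psycopg 3 loaded; "python" is the
    # ctypes-based wrapper, "c"/"binary" are the compiled ones
    # (attribute name recalled from the psycopg 3 docs -- verify it)
    import psycopg
    print(psycopg.pq.__impl__)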
<marmoute> fijal: Thanks for the offer! I need to figure out how to gather some profiles in the lasagna first.
* marmoute double checks for other unsuspected compiled bits
<marmoute> fijal: the recommended way to profile pypy is vmprof, right ?
<fijal> yeah....
<fijal> I don't think it's super well working, but there isn't anything better
<fijal> it was a very insightful journey for me, but not necessarily leaving an artifact of a great product
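The basic vmprof workflow, as it appears further down in this log (the script name and arguments are placeholders):

    # record a profile while running the workload under pypy
    pypy -m vmprof -o /tmp/vmprof.log myscript.py ...
    # then print a flat overview of where the time went
    vmprofshow /tmp/vmprof.log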
<cfbolz> I just merged the "known bits" jit optimization, btw
<cfbolz> it lets the JIT reason much better about bit manipulation, which is uncommon in python but super helpful for pydrofoil
<cfbolz> eg the jit now knows that even + even is an even number
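A toy sketch of what a known-bits abstract value can look like, just to make the even + even example concrete; this is not PyPy's implementation, the class and the 8-bit width are made up, and the addition uses the classic "tristate number" trick (the same idea the Linux eBPF verifier uses):

    # toy known-bits domain, NOT the real PyPy code: every bit of a value
    # is either known (0 or 1) or unknown
    WIDTH = 8
    MASKALL = (1 << WIDTH) - 1

    class KnownBits:
        def __init__(self, value, mask):
            # value holds the known bit values, mask has a 1 wherever the
            # bit is unknown; unknown bits of value must be 0
            assert value & mask == 0
            self.value = value
            self.mask = mask

        def __repr__(self):
            return "".join(
                "?" if (self.mask >> i) & 1 else str((self.value >> i) & 1)
                for i in reversed(range(WIDTH)))

        def add(self, other):
            # tnum-style addition: a carry out of an unknown bit makes all
            # higher bits unknown, but fully known low bits stay known
            sv = (self.value + other.value) & MASKALL
            sm = (self.mask + other.mask) & MASKALL
            chi = ((sv + sm) & MASKALL) ^ sv
            mask = chi | self.mask | other.mask
            return KnownBits(sv & ~mask & MASKALL, mask)

    even = KnownBits(0, MASKALL & ~1)  # lowest bit known to be 0
    print(even.add(even))              # ???????0 -> the sum is known even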
<fijal> heh
<cfbolz> fijal: useful, isn't it? 😆
<fijal> definitely fun
<cfbolz> fijal: in the emulators it's used for stuff like this: you have a pointer p in a register that you dereference. the emulated CPU checks that it's aligned (and throws a fault if not). then you dereference p + 8
<fijal> I can imagine
<cfbolz> with the changes we can remove the alignment check for p + 8, because it follows from p being aligned
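Continuing the toy KnownBits sketch above (again only an illustration, not the real optimizer): an 8-aligned pointer has its three low bits known to be 0, and adding the constant 8 keeps them at 0, which is exactly the fact that lets the second alignment check be dropped.

    # reuses the toy KnownBits class sketched above
    p = KnownBits(0, MASKALL & ~0b111)  # p passed the alignment check: low 3 bits known 0
    offset = KnownBits(8, 0)            # the constant 8, fully known
    print(p.add(offset))                # ?????000 -> p + 8 is still 8-aligned,
                                        # so the second check is redundant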
<fijal> I'm far more skeptical about the social side of writing python interacting with optimizations than I am about the rpython toolchain being good
<cfbolz> :-)
<cfbolz> yeah
<fijal> I believe we addressed what people say makes python slow without making python significantly faster
<cfbolz> but I'm sure somebody somewhere wrote a weird python program that benefits too ;-)
<fijal> in any practical sense
<fijal> yeah, I'm sure you both can write software that runs fast on pypy (I did for sure) and there is some stuff that just does
korvo has quit [Quit: Client closed]
itamarst has joined #pypy
jcea has joined #pypy
marmoute has quit [Quit: Client closed]
[Arfrever] has quit [Ping timeout: 256 seconds]
[Arfrever] has joined #pypy
marmoute has joined #pypy
korvo has joined #pypy
mjacob_ is now known as mjacob
marmoute has quit [Quit: Client closed]
marmoute has joined #pypy
<marmoute> fijal: So I have a 166MB file produced by `-m vmprof -o /tmp/vmprof.log`. Is that what you want?
jcea has quit [Ping timeout: 268 seconds]
<korvo> Yes. Try `vmprofshow /tmp/vmprof.log` to get an overview.
<cfbolz> marmoute: you can also try the new (experimental) converter to the firefox profiler format, to reuse the firefox ui
korvo has quit [Quit: Client closed]
korvo has joined #pypy
marmoute has quit [Quit: Client closed]
marmoute has joined #pypy
<marmoute> I cannot get vmprofshow to show me something usable. I'll try the firefox converter
* marmoute wondered why that IRC Web Client was so bad, and that's the default libera chat "discovery" one…
<cfbolz> marmoute: please file bugs if the converter doesn't work, Christoph is happy to get them
<marmoute> I am sorry, but the converter works just fine. I don't have bugs to report
marmoute has quit [Quit: Client closed]
korvo has quit [Quit: Client closed]
jcea has joined #pypy
korvo has joined #pypy