cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake | IRC logs: and | so many corner cases, so little time
slav0nic has joined #pypy
epony has quit [Quit: QUIT]
Guest96 has joined #pypy
Atque has joined #pypy
Guest96 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Guest96 has joined #pypy
lehmrob has joined #pypy
epony has joined #pypy
Guest96 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Dejan has joined #pypy
Guest96 has joined #pypy
otisolsen70 has joined #pypy
Techcable has quit [Read error: Connection reset by peer]
Techcable has joined #pypy
lritter has joined #pypy
slav0nic_ has joined #pypy
slav0nic has quit [Remote host closed the connection]
otisolsen70 has quit [Quit: Leaving]
Diggsey has joined #pypy
<Diggsey> I'm having some issues with pypy being a lot slower than cpython. One of the issues seems to be `isinstance` - cProfile is showing a lot *more* calls to `isinstance` under pypy than with cpython, and each call is taking much longer
<Diggsey> is there any reason why either of those things should be the case?
<Diggsey> under pypy, it's called `8056582` times and spends a cumulative time of 8.412s within that function, compared to 4486368 and 2.888s for cpython
<arigato> Diggsey: using cProfile is generally not recommended as a way to get a good profile on pypy. About "isinstance", there are some code paths that may internally call isinstance() on pypy, whereas on cpython they would call some C code directly and not show up in the statistics
<mattip> is your code calling into c-extensions? These are known to be slower on PyPy
Guest96 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Guest96 has joined #pypy
Guest96 has quit [Client Quit]
lehmrob has quit [Quit: Konversation terminated!]
Guest96 has joined #pypy
<fijal> also isinstance is one of those functions that will be overrepresented, because on pypy checking the clock twice and putting info in a dictionary is quite expensive compared to just "isinstance" call
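A minimal sketch of the effect fijal describes, assuming a made-up workload (`count_ints`, `time_plain`, and `time_profiled` are hypothetical names, not part of the application discussed). It times the same `isinstance`-heavy loop with and without cProfile so the per-call bookkeeping cost can be compared on whichever interpreter runs it:

```python
import cProfile
import time


def count_ints(items):
    """Count the integers in items, making one isinstance() call per item."""
    total = 0
    for item in items:
        if isinstance(item, int):
            total += 1
    return total


def time_plain(items):
    """Run count_ints without a profiler; return (result, seconds)."""
    start = time.perf_counter()
    result = count_ints(items)
    return result, time.perf_counter() - start


def time_profiled(items):
    """Run count_ints under cProfile; return (result, seconds)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    # runcall() invokes the function with profiling enabled and
    # returns the function's own return value.
    result = profiler.runcall(count_ints, items)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical data set: two ints out of every four items.
    data = [1, "a", 2.0, 3] * 50_000
    plain_result, plain_seconds = time_plain(data)
    profiled_result, profiled_seconds = time_profiled(data)
    print(f"plain:    {plain_seconds:.4f}s")
    print(f"profiled: {profiled_seconds:.4f}s")
```

The absolute numbers depend on the interpreter and machine; the point is only that the gap between the two timings is profiler overhead, which the profile then attributes to cheap, frequently called functions.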
<cfbolz> Diggsey: what does the program do? is it something we can run ourselves?
<Diggsey> @arigato: I would love to use something else but it seems like no other profiler is supported by pypy on windows
<Diggsey> fwiw, it seems to take a similar amount of overall time without cprofile
<Diggsey> @cfbolz: unfortunately no, it's a fairly large legacy python application. I've written a script to exercise one part of it that is known to be slow
<Diggsey> I was hoping that pypy might give some performance improvements
<Diggsey> @mattip: I think it does, but I don't think they're called frequently enough to be an issue
<cfbolz> What kind of stuff does it do?
<Diggsey> it's exercising some endpoints in a flask CRUD app - it makes use of sqlalchemy to access a mysql db
<Diggsey> but a large amount of time is spent recalculating stuff in python that's changed
<Diggsey> a large amount of time is also spent just in the sqlalchemy code (ie. not actually waiting on the db)
<fijal> sqlalchemy is *very* slow
<Diggsey> yeah, I was hoping pypy might help with that...
<fijal> Diggsey: I have a non-solution solution for you, which is to not use sqlalchemy (or any other ORM really)
<fijal> it doesn't
<Diggsey> haha
<fijal> I tried optimizing it and I can go on for pages about why it doesn't
<fijal> but essentially a JIT-friendly ORM is something that I was wanting to write for a while, but never had a good use-case
<Diggsey> don't suppose you've written that up anywhere?
<Diggsey> it would be useful to understand more about what makes something jit friendly and what actually makes sqlalchemy slow
<fijal> so the main thing that makes sqlalchemy slow is that it's essentially an interpreter
<Diggsey> right, I guess I just don't really understand why that's so bad for pypy
<Dejan> I only use SQLAlchemy core
<Dejan> I wonder if there is something like ActiveJDBC ( ) in Python world...
<Dejan> That is the only "ORM" that I may try
<fijal> Diggsey: well, first of all it's slow to start with
<fijal> but two, you have very general, very megamorphic calls to a bunch of functions and you construct the whole thing each time
<fijal> so instead of having say select_something_from_something_else(param1, param2) that does the job you have the generic select(*stuff, **even_more_crazy_stuff) that differs from call to call
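The contrast fijal draws can be sketched roughly as follows. Both functions are hypothetical illustrations written for this log; they are not SQLAlchemy's API and do not show how SQLAlchemy builds queries internally. The generic version rebuilds the statement from its arguments on every call, while the specialized version has a fixed shape:

```python
def generic_select(table, *columns, **filters):
    """Hypothetical generic builder: reconstructs the SQL text on every call."""
    cols = ", ".join(columns) if columns else "*"
    sql = f"SELECT {cols} FROM {table}"
    params = []
    if filters:
        clauses = []
        # Sort the filters so the generated text is deterministic.
        for name, value in sorted(filters.items()):
            clauses.append(f"{name} = %s")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    return sql, tuple(params)


# Hypothetical specialized query: the statement text is a module-level constant.
USER_BY_ID_SQL = "SELECT id, name FROM users WHERE id = %s"


def select_user_by_id(user_id):
    """Hypothetical specialized query: only the parameter varies per call."""
    return USER_BY_ID_SQL, (user_id,)
```

Both calls below produce the same statement and parameters, but the generic one takes a different mix of positional and keyword arguments at each call site, so the work it does varies from call to call:

    generic_select("users", "id", "name", id=7)
    select_user_by_id(7)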
<Diggsey> but I mean, my app mostly only uses a few specific combinations, right? I would kind of expect a JIT to be able to beat cpython here since it can look at the actual types and specialize
<Dejan> in most cases it waits for data to come, right?
<Dejan> I would honestly expect them to be roughly the same
<Diggsey> pypy is 2x slower overall
<Dejan> then it probably heavily uses some underlying C extension
<Diggsey> and that's ignoring startup cost and making the same sequence of requests 20x
<Diggsey> for a total of 186s in the loop under pypy vs 98s in cpython
<cfbolz> yeah, that's definitely not great (and my lesson here would be "we should investigate ORMs again")
<Diggsey> it's using pymysql as the driver, so there should be no C code for that part...
<Diggsey> is there some way I can see what C calls it's making?
<fijal> Diggsey: well, pypy can't guess which part of the call is constant and should be specialized on
<fijal> (it's kind of expected that very general python code is 2x slower on pypy, your mileage may vary of course)
<fijal> I would also expect that sqlalchemy got a bit optimized for cpython, which is usually actually bad for pypy
Guest96 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Guest96 has joined #pypy
lritter has quit [Quit: Leaving]
slav0nic_ has quit [Remote host closed the connection]
slav0nic_ has joined #pypy
Guest96 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Diggsey has quit [Quit: Connection closed for inactivity]
slav0nic_ has quit [Ping timeout: 250 seconds]