cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake | IRC logs: and | insert pithy quote here
rindolf has joined #pypy
jacob22_ has quit [Ping timeout: 265 seconds]
otisolsen70 has joined #pypy
otisolsen70 has quit [Remote host closed the connection]
otisolsen70 has joined #pypy
otisolsen70 has quit [Remote host closed the connection]
jacob22_ has joined #pypy
jacob22_ has quit [Client Quit]
rindolf has quit [Quit: Shlomi Fish ("Rindolf") has left the server. “Chuck Norris was challenged to fight the world, and accepted. He bet on himself, won, and collected the bet money.”]
<cfbolz> ctismer: did you see this?
<ctismer> cfbolz: 😄 No! But I already recognized some changes in Python 3.10 which sounded like Stackless features. Include/cpython/frameobject.h:
<ctismer> enum _framestate {
<ctismer> FRAME_SUSPENDED = -1,
<ctismer> FRAME_RETURNED = 1,
<ctismer> FRAME_CREATED = -2,
<ctismer> FRAME_EXECUTING = 0,
<ctismer> FRAME_UNWINDING = 2,
<ctismer> FRAME_RAISED = 3,
<ctismer> FRAME_CLEARED = 4
<ctismer> };
<fijal> that's a new one
<cfbolz> ctismer: heh, cool
<cfbolz> fijal: pypy 6.0.0?
<fijal> cfbolz: yeah, I'm trying to understand the move from ssl in RPython to cffi
<cfbolz> ah
<fijal> the idea that old software does not run is very perplexing to me
<fijal> (and very much we are moving more and more towards that)
<fijal> ok, this is somewhat perplexing, why does stuff survive the minor collection at all?
<fijal> are some of the crazy cffi objects not supposed to live in a nursery?
<cfbolz> fijal: is that __del__ maybe
<fijal> in ffi stuff?
<cfbolz> yes
<fijal> so does have a __del__?
<cfbolz> fijal: I think so
<cfbolz> maybe it has the other kind of rpython finalizer
<fijal> it has must_be_light_finalizer
<fijal> ugh, maybe we need an armin here
<cfbolz> ctismer: how old are these ideas on the stackless side? 20 years?
<fijal> ugh ok
<fijal> cfbolz: can I have your attention for 5 min, I want to rubber duck something
<cfbolz> yup
<cfbolz> can I make a coffee first?
<fijal> yeah sure
<fijal> I will start typing
<fijal> so if you look at _cffi_ssl/_stdssl/ in "def read" there is a"char[]", length)
<fijal> and we never actually keep it around
<fijal> so... can we have a way to alloc something that does go out when we go out of scope? this is what we need, I think
<nimaje> I think I remember some context manager for something like that
<cfbolz> fijal: yeah, conceptually a with sounds right
<cfbolz> it's not clear we speed up collection though that way
<fijal> so calling ffi.release helps a bit, but not enough
<fijal> essentially if I cache the buffer in that function, it goes fast
<fijal> which makes me think that what I actually want is a malloc/free pair that I get a python handle for (that does not free it)
<fijal> is ffi.malloc a thing or do I just get a cdef?
<cfbolz> fijal: ok, but the length could change
<cfbolz> fijal: you could probably do something cheap purely in ssl
<cfbolz> keep the char[] and its length around
<cfbolz> if the previous one has the same length, reuse it
<cfbolz> if not, free it
<fijal> there is a bit of issues with threads and shit
<cfbolz> True
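A minimal sketch of the "keep the char[] and its length around" idea from cfbolz, with one cached buffer per thread to sidestep the threading concern fijal raises. `BufferCache` and `fake_alloc` are hypothetical names, and the allocator is injected so the sketch runs without cffi; in the ssl module it would presumably be a call that allocates a "char[]" of the given length.

```python
import threading


class BufferCache:
    """Reuse the previously allocated buffer when the length matches."""

    def __init__(self, alloc):
        self._alloc = alloc              # hypothetical allocator: alloc(length) -> buffer
        self._local = threading.local()  # one cached (length, buffer) pair per thread

    def get(self, length):
        cached = getattr(self._local, "entry", None)
        if cached is not None and cached[0] == length:
            # Same length as last time: reuse. The contents are stale,
            # so the caller must not assume a zeroed buffer.
            return cached[1]
        # First call or a different length: drop the old buffer and allocate anew.
        buf = self._alloc(length)
        self._local.entry = (length, buf)
        return buf


# Stand-in allocator that records each real allocation.
calls = []


def fake_alloc(length):
    calls.append(length)
    return bytearray(length)


cache = BufferCache(fake_alloc)
a = cache.get(16)
b = cache.get(16)   # reused, no new allocation
c = cache.get(32)   # length changed, so a new buffer is allocated
```

Only a change of length triggers a fresh allocation, so a read loop with a constant buffer size pays the allocation (and any finalizer cost) once per thread.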
<fijal> like, what I really want is malloc/free with a scope that does not create anything with a __del__, I think
<cfbolz> fijal: yup, that would be a real C stack allocation
<cfbolz> But not trivial
<cfbolz> fijal: or we might find a way to tell the garbage collector that the lightweight finalizer no longer needs to be called
<cfbolz> That kind of thing is possible with regular finalizers
<fijal> yeah ok, so I got the improvement indeed
<fijal> right, so I can hack at the ssl module and have immediate improvements or I can hack at pypy to make it smarter
<fijal> but I'm not sure if "let's remove this from a list of things to finalize" is such a winner
<fijal> ok, I think this is bad?
<fijal> like we advertise that's the way to do it, but it's veeeeery slow on pypy
<cfbolz> fijal: can you write a micro bench?
<fijal> I think so, but I failed with a normal TLS ping-pong
<fijal> I have a basic redis setup, which is a bit more than a microbenchmark
<cfbolz> fijal: no, I mean very concretely, just calling new
<fijal> right
<fijal> TypeError: initializer for ctype 'char' must be a string of length 1, not str
<fijal> it probably means "bytes" these days
<fijal> I keep running into really bizarre issues
<ctismer> cfbolz: Yes, in 1999 I decided to decouple the Python stack from the C stack. If Python had followed this idea earlier, we would have saved all the mess with generators and yield from that they use for async. Very messy.
<fijal> oh god
<fijal> ok, so tempfile depends on hashlib which depends on tempfile
<fijal> not sure how to deal with that
<fijal> how did it ever work is a good question, I guess, but I suppose I changed some files?
<fijal> cfbolz: I think that's the benchmark, it's 5x slower
<cfbolz> fijal: ok, but then basically writing a context manager that wraps malloc sounds not bad?
<fijal> yeah, I think that's what we should do
<cfbolz> fijal: can even be local to ssl
<fijal> have a cffi way to have malloc/free (or some form of new) as a context manager
<fijal> I think that's a very common pattern?
<cfbolz> 🤷‍♀️
<cfbolz> probably?
<fijal> passing buffers to C functions that you later forget seems incredibly common
<cfbolz> but needs armin/cffi maintainers and waiting then
<fijal> we have rffi version of that (more than one)
<cfbolz> yep
<fijal> yeah something along those lines, sure adding things to cffi seems a bit bad, but let's see
<fijal> there is ffi.release already - so someone thought about it
<fijal> hah, did you know that has a "PYPYLOG" as a colorer?
<cfbolz> fijal: that's pygments, I think?
<fijal> let's wait for armin to get up
<fijal> yeah seems like it's easy to fix, it's just a question what's the best way to do it
<fijal> _ssl, _sqlite, _hashlib, __decimal (?), _sha3
<fijal> those are all examples where this would be better
yizawa has joined #pypy
<cfbolz> fijal: cool
stkrdknmibalz has quit [Quit: WeeChat 3.0.1]
Julian has joined #pypy
Julian has quit [Ping timeout: 252 seconds]
Julian has joined #pypy
<ronan> fijal: cdata objects can already be used as context managers
<ronan> using that makes your microbenchmark 3x faster
<cfbolz> An immediate win
<cfbolz> I'd say we should do that right away?
<ronan> cfbolz: yes, however it's still 3x slower than raw malloc
<ronan> I guess because of the __del__
<cfbolz> Hrm
<cfbolz> Right
<ronan> BTW, here's my version of fijal's bench:
yizawa has quit [Quit: Connection closed for inactivity]
<fijal> thanks!
<fijal> ronan: that should be quite easily fixable to not have __del__ if it's a context manager, right?
<fijal> should I try that?
<fijal> would be cool to update the examples too, I think
<ronan> fijal: that would change cffi semantics, because the cdata itself is the context manager
<fijal> right
<fijal> meh
<ronan> it's probably doable without changing cffi itself, but not "quite easily"
<fijal> wait a second, do we really rely on the fact that we can stash x somewhere?
<fijal> it would not work anyway, right?
<fijal> on PyPy, ffi.release() frees the memory immediately.
<fijal> from cffi docs, so no, it's not quite legal to use it after with
<fijal> ronan: I think the semantics would not change if (used in a context manager) would not have a __del__
<fijal> I'm not sure how to communicate "used in a context manager" though
<ronan> yes, but you can do anything inside the with, so x.__enter__ should return something that's exactly like x, except for __del__
<ronan> "used in a context manager" is the return value of __enter__()
<fijal> yes, but we already have that inside cffi
<fijal> like W_CDataNewNonStd returns object with _finalize_ and not __del__, for example
<fijal> which I don't know what it does
<ronan> fijal: in all cases, the object still has an applevel __del__
<fijal> why?
<fijal> I don't see the applevel del?
Julian has quit [Quit: leaving]
<ronan> fijal: sorry, you're right, there isn't one
<fijal> so i think we could return the non-standard one?
<arigato> fijal: I *think* you can try that:
<arigato> declare "malloc" and "free" inside cdef()
<arigato> alloc = ffi.new_allocator(lib.malloc)
<arigato>, sorry doesn't work
greedom has joined #pypy
<arigato> it seems that the standard cdata objects that you can use in "with" constructions also have at least a lightweight finalizer
<arigato> you can probably write a general helper, to use in "with TempBuf(size) as p: ...p is a cdata object here..."
<arigato> and get the top speed that way on pypy
<arigato> if you try and it works we can think about adding that to cffi itself
<arigato> write the helper as regular python code I mean
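A rough sketch of what such a "with TempBuf(size) as p" helper might look like as regular Python code. The class body is an assumption, not cffi API: it pairs an explicit malloc with an explicit free and defines no `__del__`, so nothing is registered with the GC for finalization. The `malloc` and `free` callables are injected so the sketch runs without cffi; with cffi they would presumably be `lib.malloc` and `lib.free` declared inside cdef(), as arigato suggests above. `fake_malloc` and `fake_free` are stand-ins for illustration only.

```python
class TempBuf:
    """Context manager that frees its allocation on exit and has no __del__."""

    __slots__ = ("_malloc", "_free", "_size", "_ptr")

    def __init__(self, size, malloc, free):
        self._malloc = malloc   # hypothetical: malloc(size) -> pointer-like object
        self._free = free       # hypothetical: free(pointer) -> None
        self._size = size
        self._ptr = None

    def __enter__(self):
        ptr = self._malloc(self._size)
        if not ptr:
            # A falsy result is treated as an allocation failure.
            raise MemoryError("could not allocate %d bytes" % self._size)
        self._ptr = ptr
        return ptr

    def __exit__(self, exc_type, exc, tb):
        # Free unconditionally, even when the body raised.
        ptr, self._ptr = self._ptr, None
        self._free(ptr)
        return False            # never swallow exceptions


# Stand-in allocator: a dict of live "addresses" to sizes.
live = {}
next_addr = [0x1000]


def fake_malloc(size):
    addr = next_addr[0]
    next_addr[0] += size
    live[addr] = size
    return addr


def fake_free(addr):
    del live[addr]


with TempBuf(64, fake_malloc, fake_free) as p:
    inside = dict(live)     # one live allocation while inside the block
after = dict(live)          # released as soon as the block exits

try:
    with TempBuf(8, fake_malloc, fake_free):
        raise ValueError("body failed")
except ValueError:
    pass
after_error = dict(live)    # also released when the body raises
```

As with ffi.release(), the pointer must not be used after the with block, since the memory is gone at that point.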
the_drow has quit [Read error: Connection reset by peer]
ambv has quit [Read error: Connection reset by peer]
graingert[m] has quit [Write error: Connection reset by peer]
daubers has quit [Remote host closed the connection]
jryans has quit [Write error: Connection reset by peer]
jryans has joined #pypy
<arigato> alternatively, we could think harder about the problem and come up with a way to have real stack allocations in the JIT
daubers has joined #pypy
the_drow has joined #pypy
graingert[m] has joined #pypy
ambv has joined #pypy
greedom has quit [Remote host closed the connection]
<arigato> thinking about a pair of rpython functions that do malloc and free, but which the JIT knows about and if both appear in a trace (and the size is small enough) it gets turned into a stack allocation; the real problem is what to do if an intermediate guard leaves the trace
<arigato> we'd need to insert a "copy_outside_the_stack" operation; and if we then compile the bridge, then we optimize again "copy_outside_the_stack" followed by "free"
<arigato> that would only work if the rpython code is super careful
<arigato> or maybe the jit optimizer needs to check all usages of the variable returned by "alloc", and check that they are all OK
greedom has joined #pypy
<arigato> ...well, maybe the first step would be to teach the JIT about lightweight finalizers. I don't think we do so far
<arigato> too many possible directions to go, none looks easy
<fijal> heh
<fijal> I think I would start with a pure-python wrapper
<arigato> sounds sane
<fijal> and see if using it in pypy cffi modules helps, then we can think
<fijal> stack allocation in the jit sounds nice, but I think those functions might not even be jitted or something
<fijal> they probably are I guess
<fijal> but seems a bit fishy, maybe a pure python wrapper that has an optimization on top would be ok though
<arigato> yet another simplified hack for the JIT: add a function in rgc to say "well now I don't really care any more about my lightweight finalizer", and if we see pairs malloc/call-to-that in a trace, then we ignore the lightweight finalizer and optimize like usual
<arigato> yes, I see the point
greedom has quit [Remote host closed the connection]
greedom has joined #pypy
greedom has quit [Remote host closed the connection]
jacob22 has joined #pypy
mattip_ has joined #pypy
<mattip_> ronan: is hpy-0.0.3 ready for merging to py3.7 ?
mattip_ has quit [Client Quit]
stkrdknmibalz has joined #pypy