cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake | IRC logs: and | insert pithy quote here
lritter has quit [Ping timeout: 255 seconds]
lritter has joined #pypy
kor1 has quit [Quit: Leaving.]
smarr has quit [Quit: Connection closed for inactivity]
ambv has joined #pypy
lritter has quit [Quit: Leaving]
<antocuni> mattip: I tried to edit your hackmd documents to add some notes, but apparently instead of editing your document I created a new one
<antocuni> anyway, my version is now here:
<antocuni> grep for @antocuni to see my notes
ambv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Julian has joined #pypy
Julian has quit [Quit: leaving]
ambv has joined #pypy
Julian has joined #pypy
Julian has quit [Quit: leaving]
stkrdknmibalz has quit [Ping timeout: 246 seconds]
shodan45 has quit [Ping timeout: 268 seconds]
shodan45 has joined #pypy
<arigato> Jin^eLD, habnabit_: I think it's just a case of missing use case from me. Indeed I would have noticed that cffi_start_python() shouldn't be static if I really had a use case.
<arigato> in the meantime, yes, the workaround of implementing another function to call cffi_start_python() is good enough
<arigato> mattip: my windows machine doesn't have AVX512 (most probably), otherwise I would definitely help
<Jin^eLD> arigato: ok, thx for the feedback
<arigato> ...or try to help at least
<arigato> Jin^eLD: I'll at least mention this workaround in the documentation. Fixing it this way would pollute the global namespace with an extra name, so I'm still undecided about it
<arigato> it might actually introduce issues if you try to link two unrelated cffi embedded modules into the same big application
<arigato> if you call cffi_start_python() from the main application, which one is called?
<arigato> at least using your workaround each module can give a different name to the exported wrapper function
<Jin^eLD> arigato: indeed it would have helped a lot if this was documented, gave me quite a hard time yesterday as I kept thinking that I was missing something as the docs said that should just be callable
<Jin^eLD> :)
<arigato> right, but also it's not in any header file, so I can see how the docs are very confusing :-)
<Jin^eLD> could we by any chance get a cffi_stop_python() function too? :> I know Python is leaky when reinitializing, but I guess those leaks could be fixed eventually... reinitializing the interpreter is one of the things that I miss in cffi
<arigato> at the moment I fear it would be hard to implement it on PyPy
<Jin^eLD> and I guess you would not want to deviate features between cpython and pypy?
<arigato> as much as possible, yes
<arigato> you should be able to take everything you wrote with cpython in mind and swap pypy and it should work, ideally
<arigato> and vice-versa
<Jin^eLD> I see, to be honest I never tried pypy :) and it was my first time embedding python stuff into C
<Jin^eLD> writing an application that allows running user scripts to do stuff
<Jin^eLD> so it's the embedding scenario
<Jin^eLD> I think I got everything set up by now with the exception that I am still fighting a bit in running a new user script (i.e. one that gets replaced on the fly) in a clean environment - that's where the python-reinit question comes from
<arigato> the current model is that the python interpreter stays initialized forever in your application, and we hope the user script doesn't do crazy things like messing up with the builtins module---just like we have anyway to assume the user script doesn't do crazy things like erase all user files
<Jin^eLD> my concern was less about malicious intent, and more about userscripts having some leftovers that could cause undesired behavior
<Jin^eLD> right now I am trying to help myself by removing the loaded user stuff from sys.modules
<arigato> yes, the best you can do is try to run the user script in its own new module, and remove the module at the end; this should minimize the unintentional leftovers
<arigato> cleaning sys.modules, exactly
<Jin^eLD> but I still have some issues loading a user script (using zipimporter) that organizes its own code in several .py files
<arigato> yes, you'll end up fighting python's complicated and inflexible module system
<Jin^eLD> basically if I load the main module and that module happens to import localsubmod the latter is not being found although its inside the zip
<Jin^eLD> aha :) I thought it was flexible, but indeed complicated :) but i do not think I have enough python knowledge to make a qualified statement
<arigato> depends a bit what you're after, but I'd recommend the hackish but simple solution: unzip the file in some directory, insert in front of sys.path that directory, run the script, and at the end restore sys.path and sys.modules to their original states
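A minimal sketch of the unzip-and-restore approach arigato describes, assuming the user script is a flat zip of .py files. The helper name `run_user_zip` and its parameters are hypothetical and not part of cffi or PyPy.

```python
import importlib
import sys
import tempfile
import zipfile


def run_user_zip(zip_path, entry_module):
    """Unpack zip_path, import entry_module from it, then undo the changes.

    Hypothetical helper: both the name and the signature are illustrative.
    """
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as workdir:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(workdir)
        sys.path.insert(0, workdir)
        # Make sure the import system notices the freshly written files.
        importlib.invalidate_caches()
        # Snapshot as late as possible so only the user's imports are undone.
        saved_modules = dict(sys.modules)
        try:
            return importlib.import_module(entry_module)
        finally:
            sys.path[:] = saved_path
            sys.modules.clear()
            sys.modules.update(saved_modules)
```

Because the files sit in an ordinary directory at the front of sys.path, sibling modules inside the archive can import each other by plain name. The returned module object stays usable after the temporary directory is gone, since its code is already loaded.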
<Jin^eLD> that could work, I was hoping to take the easy way out of just loading the zip, was surprised that included sources that are within the zip can't find each other then...
<Jin^eLD> basically that's the reproduction scenario:
stkrdknmibalz has joined #pypy
<arigato> there are some very complicated systems that should make this possible without unzipping, but there is quite a lot to learn (and I entirely forgot)
<Jin^eLD> well, if you're saying the above pasted attempt is a PITA, then I'd rather not waste time on it and do the unzipping; I was still hoping that I just missed something trivial
<arigato> I would classify it as a PITA; maybe others here have a simple solution but I don't know it
<Jin^eLD> :)
<arigato> just do 'copy = sys.modules.copy()' at the start, and at the end something like 'sys.modules.clear(); sys.modules.update(copy)'
kor1 has joined #pypy
<arigato> around 'try: finally:' to make sure it's always called
<Jin^eLD> wouldn't sys.modules.clear() nuke the builtins making it impossible to continue doing anything reasonably?
<arigato> uh :-) no, I think in this case it should still work
<Jin^eLD> I'll try it :)
<arigato> sys.modules.clear() only makes import statements not find the modules, but everything else should continue working
<arigato> of course you can also be more conservative and do:
<arigato> for key in list(sys.modules): if key not in copy: del sys.modules[key]
<arigato> for example
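The more conservative cleanup could be wrapped as a context manager, sketched here. The name `restored_modules` is hypothetical. Note that the loop iterates over a copy of the keys, because deleting entries while iterating sys.modules itself raises RuntimeError.

```python
import contextlib
import sys


@contextlib.contextmanager
def restored_modules():
    """Forget every module that was first imported inside the with-block."""
    before = set(sys.modules)
    try:
        yield
    finally:
        # list(...) takes a snapshot of the keys so the dict can shrink safely.
        for name in list(sys.modules):
            if name not in before:
                del sys.modules[name]
```

Usage would look like `with restored_modules(): run_the_user_script()`, where `run_the_user_script` stands in for whatever loads the user code. Unlike the clear-and-update variant, this only removes new entries; it does not put back modules that the script deleted or replaced.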
<Jin^eLD> that's pretty much what I have been thinking about, but guys in #python suggested I might get away with just removing the loaded user module script, because everything that it does gets attached to it
<Jin^eLD> removing from sys.modules that is
<arigato> no, not if it does further imports
<arigato> these new imported modules will also get added in sys.modules
<Jin^eLD> so if usermod does "import somethingelse" then it gets attached on the same level as usermod itself?
<Jin^eLD> I see
<Jin^eLD> thanks for pointing that out
<arigato> yes
<arigato> unless you make a package instead of a module, but that is normally done with extra subdirectories
<Jin^eLD> the idea was that users could organize the zip how they like, with the exception that the app will call "__userscript__" as entry point
<arigato> you can hack until the initial user module is imported under the name "myscripts.initialmodule" instead of "initialmodule", and then everything it imports will be "myscripts.someothername" by default, but that's again PITA territory
<Jin^eLD> now you know why I was so much looking for a reinit way ;)
<arigato> yes :-)
<Jin^eLD> I think your suggestion with a copy and clearing works just fine
<Jin^eLD> did not get any errors
<arigato> cool
<Jin^eLD> thanks!
<arigato> it should restore the initial module state, but it's not safe against manual changes done to these modules (e.g. the script doing `import sys;"bar"`)
<arigato> hope that's good enough.
<Jin^eLD> well yes, a script could mess up anything in Python since everything is some sort of a writable object
<Jin^eLD> I looked into sandboxing at the beginning but quickly saw that it just makes no sense
<Jin^eLD> but I think it's good enough
<arigato> OK
<Jin^eLD> since the app "belongs to the user" i.e. it's not a service on some server, but something that you'd run yourself, so I doubt one would be messing themselves up with their own scripts :)
<arigato> makes sense :-)
<Jin^eLD> arigato: what happens to running code in the module when the module suddenly gets removed from sys.modules?
<mattip> arigato: fwiw the reproducer in the hackmd does not depend on AVX512F
<mattip> antocuni: thanks for the thoughts
ambv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<mattip> antocuni: I merged your changes back to the original hackmd document
<arigato> mattip: ah
<arigato> Jin^eLD: sys.modules is just used when running "import" statements. If there is code still running in a module, the module itself is still valid, and anything already imported in it is still valid, recursively
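A small illustration of that point, built around a throwaway in-memory module. The name "usermod" is hypothetical. The function keeps working after its module is removed from sys.modules, because it still references the module's globals.

```python
import sys
import types

# Build a throwaway module in memory; "usermod" is a made-up name.
usermod = types.ModuleType("usermod")
source = "import math\ndef area(r):\n    return math.pi * r * r\n"
exec(source, usermod.__dict__)
sys.modules["usermod"] = usermod

area = usermod.area
del sys.modules["usermod"]

# Only future "import usermod" statements are affected; existing
# references to the module and to what it imported remain valid.
result = area(1.0)
```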
<mattip> I wonder if we can reason about the problem thus:
<mattip> the third argument to PyUFunc_GenericFunctionInternal is a struct with two pointers
<mattip> outside the call, the struct is valid
<mattip> inside the call it is getting zeroed out
<mattip> what register will the struct be passed in on?
<Jin^eLD> arigato: is there any way to kill or terminate it from the outside?
<arigato> Jin^eLD: not really
<Jin^eLD> that's another scenario where I am trying to work around the missing reinit
<Jin^eLD> doh :(
<arigato> if code is running, it might be caught in an infinite or very long loop anyway
<arigato> it could be running in a different thread though
<Jin^eLD> arigato: well, in cpython I would have tried doing something like Py_AddPendingCall() with a PyErr_SetString(PyExc_InterruptedError, "nuke") or something like that
<arigato> Jin^eLD: which might or might not work
<arigato> pretty sure at least some versions of CPython will only invoke Pending Calls in the "main thread", and which thread is the "main thread" when embedding CPython is very unclear
<Jin^eLD> mhhhh
<Jin^eLD> this whole embedding stuff is more complicated than I anticipated
<arigato> yes, it's not completely thought out, neither in CPython nor in PyPy
<Jin^eLD> perhaps I should have gone with Lua :)
<Jin^eLD> I had some spidermonkey/js experience in the past and it was just a huge PITA to get that right
<Jin^eLD> especially to make sure your C code would not crash due to JS gc
<arigato> ...ah
<Jin^eLD> so this time I thought OK, let's try something else
<Jin^eLD> there was Lua, but I am not familiar with it, now thinking if that would have been a better choice
<arigato> Lua is definitely muuuuuuuuuch simpler
<Jin^eLD> and then of course - Python...
<arigato> but Lua is also a much simpler language with limitations when trying to do something large
<Jin^eLD> I do not think I am going to do something large, so it might have been a better choice.. I am just not familiar with it at all
<Jin^eLD> and Python is also more widespread in terms of userbase
<Jin^eLD> everybody knows Python, so that's nicer towards the users I guess, which was one of my considerations
<arigato> (I can grumble about Python developers trying to fix the embedding issues by adding even more complexity, which of course doesn't make anything simpler, but well)
<Jin^eLD> :)
<Jin^eLD> I had the feeling that embedding is not a very well tested and not very often used scenario...
<Jin^eLD> of course I got that feeling only after I started to try to use it...
<arigato> I guess everybody who tries hits the same snags as you do
<arigato> mattip: normally, if you know exactly which calling function and which called function are involved, we could find out what is wrong by looking at the assembler for both
<mattip> I have a debugger with the assembler,
<mattip> if you have some time I could send a zoom link and share my screen
<arigato> would be cool, but tonight I'm off now
<mattip> ok
<arigato> if you can grab the C sources and the corresponding assembler for both, at least I could tell you if it looks correct or a C compiler bug
ambv has joined #pypy
ambv has quit [Ping timeout: 255 seconds]
ambv has joined #pypy
<mattip> Sebastian from NumPy is debugging this with me. He says
<mattip> I found something by stepping through at the assembly level, I think: is stored in the first part of XMM6, and this is NULLed by PyPy, but should be non-volatile
<mattip> Although this is obviously not a floating point number, it is definitely where the value is stored.
<mattip> </endquote>
<mattip> Here is the C code and the assembler generated (by stopping in the debugger)
<mattip> and the documentation for the x64 calls says "In other words, user-written assembly language routines must be updated to save/restore XMM6 and XMM7 before/after the function when being ported from x86 to x86-64."
seberg has joined #pypy
ambv has quit [Ping timeout: 252 seconds]
<arigato> mattip: "ah"
<arigato> I'm pretty sure that on 32 bits and on linux it's xmm8-xmm15
<arigato> that might clearly be the problem
<mattip> should I try to add them to CALLEE_SAVE_REGISTERS in x86/runner.py?
<mattip> or does it need a more complicated fix
<arigato> let me check
<arigato> no, that constant does not list xmm registers, only non-xmm ones
<arigato> I think that on linux none of the xmm registers need to be saved, whereas on windows you need to save xmm6-xmm15
<arigato> we use xmm15 as a scratch register, which is already wrong
<arigato> then xmm0-xmm14 are listed in X86_64_XMMRegisterManager in
<mattip> from what I can gather, it seems the XMMRegisterManager is tied to float values, where xmm6, xmm7 are used as general purpose registers
<arigato> it's a real bug in our code on win64, probably my fault when I tried to port it to win64
<arigato> in linux all xmm regs are save_around_call_regs, but that's not the case on windows
<mattip> anything I can do to expedite a fix?
<arigato> it's suboptimal but you could reduce X86_64_XMMRegisterManager.all_regs to be a list containing only [xmm0,xmm1,xmm2,xmm3,xmm4] and have regloc.X86_64_XMM_SCRATCH_REG be xmm5
<arigato> at least on WIN64
<arigato> grepping for xmm[1-9], I think none of the xmm registers is used explicitly apart from there and in, which looks correct
<mattip> I can try that and see if it fixes the problem, pending a more extensive fix later
<arigato> yes
<arigato> this fix will just make the JIT never use xmm6-xmm15
<arigato> it's actually very unclear if it's a good idea to do anything else, because if we use more xmm registers, then we need to save them in the prologue of all functions produced by the JIT, and that's additional cost everywhere even in functions that don't use floats
<mattip> ok
<arigato> yes, we might get rid of X86_64_XMM_SCRATCH_REG and use xmm0-xmm5 instead of xmm0-xmm4, but apart from that, I'm very unsure we could do better
<arigato> (like 32-bit which uses xmm0-xmm7 and has no XMM_SCRATCH_REG)
yuiza has joined #pypy
<mattip> the reproducer passes with the changes you suggested: reduce X86_64_XMMRegisterManager.all_regs to be a list containing only [xmm0,xmm1,xmm2,xmm3,xmm4] and have regloc.X86_64_XMM_SCRATCH_REG be xmm5
<Jin^eLD> arigato: another question on the embedding scenario... still thinking about how to force stop those user scripts; so let's say I have my python embedding_init_code that loads and runs the user script; what if it would spawn a multiprocessing.Process() first and load the user code there? then I could simply terminate it from C via a C->Python function which has the process handle?
lritter has joined #pypy
<bbot2> Started: [mattip: test branch, win64-xmm-registers]
<bbot2> Started: [mattip: test branch, win64-xmm-registers]
<bbot2> Started: [mattip: test branch, win64-xmm-registers]
seberg has quit [Ping timeout: 255 seconds]
seberg has joined #pypy
seberg has quit [Client Quit]
<antocuni> mattip, arigato: wow, congrats on tracking down the numpy+pypy segfault, that was hard
<bbot2> Failure: [mattip: test branch, win64-xmm-registers]
<bbot2> Failure: [mattip: test branch, win64-xmm-registers]
<bbot2> Failure: [mattip: test branch, win64-xmm-registers]