#pypy on 2021-07-17 — irc logs at libera.irclog.whitequark.org

2021-05-30 17:38 cfbolz changed the topic of #pypy to: #pypy PyPy, the flexible snake https://pypy.org | IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end and https://libera.irclog.whitequark.org/pypy | insert pithy quote here

00:18 lritter has quit [Ping timeout: 255 seconds]

00:19 lritter has joined #pypy

03:20 kor1 has quit [Quit: Leaving.]

05:07 smarr has quit [Quit: Connection closed for inactivity]

08:47 ambv has joined #pypy

08:50 lritter has quit [Quit: Leaving]

09:34 <antocuni> mattip: I tried to edit your hackmd documents to add some notes, but apparently instead of editing your document I created a new one

09:34 <antocuni> anyway, my version is now here: https://hackmd.io/cXjf406FRJqZfgUxsgWOAQ?view

09:34 <antocuni> grep for @antocuni to see my notes

11:39 ambv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

11:50 Julian has joined #pypy

12:25 Julian has quit [Quit: leaving]

12:28 ambv has joined #pypy

12:58 Julian has joined #pypy

13:07 Julian has quit [Quit: leaving]

14:13 stkrdknmibalz has quit [Ping timeout: 246 seconds]

15:16 shodan45 has quit [Ping timeout: 268 seconds]

15:21 shodan45 has joined #pypy

15:43 <arigato> Jin^eLD, habnabit_: I think it's just a case of missing use case from me. Indeed I would have noticed that cffi_start_python() shouldn't be static if I really had a use case.

15:43 <arigato> in the meantime, yes, the workaround of implementing another function to call cffi_start_python() is good enough

15:44 <arigato> mattip: my windows machine doesn't have AVX512 (most probably), otherwise I would definitely help

15:45 <Jin^eLD> arigato: ok, thx for the feedback

15:45 <arigato> ...or try to help at least

15:46 <arigato> Jin^eLD: I'll at least mention this workaround in the documentation. Fixing it this way would pollute the global namespace with an extra name, so I'm still undecided about it

15:47 <arigato> it might actually introduce issues if you try to link two unrelated cffi embedded modules into the same big application

15:48 <arigato> if you call cffi_start_python() from the main application, which one is called?

15:49 <arigato> at least using your workaround each module can give a different name to the exported wrapper function

15:49 <Jin^eLD> arigato: indeed it would have helped a lot if this was documented, gave me quite a hard time yesterday as I kept thinking that I was missing something, since the docs said that it should just be callable

15:49 <Jin^eLD> :)

15:50 <arigato> right, but also it's not in any header file, so I can see how the docs are very confusing :-)

15:50 <Jin^eLD> could we by any chance get a cffi_stop_python() function too? :> I know Python is leaky when reinitializing, but I guess those leaks could be fixed eventually... reinitializing the interpreter is one of the things that I miss in cffi

15:52 <arigato> at the moment I fear it would be hard to implement it on PyPy

15:52 <Jin^eLD> and I guess you would not want to deviate features between cpython and pypy?

15:52 <arigato> as much as possible, yes

15:53 <arigato> you should be able to take everything you wrote with cpython in mind and swap pypy and it should work, ideally

15:54 <arigato> and vice-versa

15:55 <Jin^eLD> I see, to be honest I never tried pypy :) and it was my first time embedding python stuff into C

15:55 <Jin^eLD> writing an application that allows to run user scripts to do stuff

15:56 <Jin^eLD> so it's the embedding scenario

15:57 <Jin^eLD> I think I got everything set up by now with the exception that I am still fighting a bit in running a new user script (i.e. one that gets replaced on the fly) in a clean environment - that's where the python-reinit question comes from

15:58 <arigato> the current model is that the python interpreter stays initialized forever in your application, and we hope the user script doesn't do crazy things like messing with the builtins module, just like we have to assume anyway that the user script doesn't do crazy things like erase all user files

15:59 <Jin^eLD> my concern was less about malicious intent, and more about userscripts having some leftovers that could cause undesired behavior

15:59 <Jin^eLD> right now I am trying to help myself by removing the loaded user stuff from sys.modules

15:59 <arigato> yes, the best you can do is try to run the user script in its own new module, and remove the module at the end; this should minimize the unintentional leftovers

16:00 <arigato> ...by cleaning sys.modules, exactly

16:00 <Jin^eLD> but I still have some issues loading a user script (using zipimporter) that organizes its own code in several .py files

16:00 <arigato> yes, you'll end up fighting python's complicated and inflexible module system

16:00 <Jin^eLD> basically if I load the main module and that module happens to import localsubmod, the latter is not found although it's inside the zip

16:01 <Jin^eLD> aha :) I thought it was flexible, but indeed complicated :) but i do not think I have enough python knowledge to make a qualified statement

16:02 <arigato> depends a bit what you're after, but I'd recommend the hackish but simple solution: unzip the file in some directory, insert in front of sys.path that directory, run the script, and at the end restore sys.path and sys.modules to their original states
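
A minimal sketch of the unzip-and-restore approach arigato describes, not code from the discussion. The function name `run_user_zip` is hypothetical; the entry module name `__userscript__` is the one Jin^eLD mentions later in the log. It assumes the zip holds plain `.py` files at its top level and that the user script is trusted.

```python
import importlib
import sys
import tempfile
import zipfile


def run_user_zip(zip_path, entry_module="__userscript__"):
    """Unpack zip_path, import entry_module from it, then undo the import state.

    Hypothetical helper: the name and signature are illustrative only.
    """
    saved_path = list(sys.path)
    known_modules = set(sys.modules)
    with tempfile.TemporaryDirectory() as workdir:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(workdir)
        # Put the unpacked directory first so sibling modules resolve normally.
        sys.path.insert(0, workdir)
        importlib.invalidate_caches()
        try:
            return importlib.import_module(entry_module)
        finally:
            # Restore sys.path and drop every module the script pulled in.
            sys.path[:] = saved_path
            for name in list(sys.modules):
                if name not in known_modules:
                    del sys.modules[name]
```

The returned module object stays usable after the cleanup; only later `import` statements stop seeing it.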

16:03 <Jin^eLD> that could work, I was hoping to take the easy way out of just loading the zip, was surprised that included sources that are within the zip can't find each other then...

16:04 <Jin^eLD> basically that's the reproduction scenario: https://bpa.st/LESA

16:04 stkrdknmibalz has joined #pypy

16:04 <arigato> there are some very complicated systems that should make this possible without unzipping, but there is quite a lot to learn (and I entirely forgot)

16:05 <Jin^eLD> well, if you're saying the above pasted attempt is a PITA, then I'd rather not waste time on it and do the unzipping; I was still hoping that I just missed something trivial

16:05 <arigato> I would classify it as a PITA; maybe others here have a simple solution but I don't know it

16:05 <Jin^eLD> :)

16:06 <arigato> just do 'copy = sys.modules.copy()' at the start, and at the end something like 'sys.modules.clear(); sys.modules.update(copy)'

16:06 kor1 has joined #pypy

16:06 <arigato> around 'try: finally:' to make sure it's always called
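
The copy/clear/update idea with `try: finally:` can be wrapped in a context manager. This is only a sketch; the name `preserved_sys_modules` is an assumption made for illustration.

```python
import contextlib
import sys


@contextlib.contextmanager
def preserved_sys_modules():
    """Snapshot sys.modules on entry and put it back on exit, even on error."""
    snapshot = sys.modules.copy()
    try:
        yield
    finally:
        # Mutate the existing dict in place; the import system keeps a
        # reference to this exact object, so it must not be rebound.
        sys.modules.clear()
        sys.modules.update(snapshot)
```

Anything imported inside `with preserved_sys_modules():` is forgotten by later `import` statements once the block ends.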

16:06 <Jin^eLD> wouldn't sys.modules.clear() nuke the builtins making it impossible to continue doing anything reasonably?

16:07 <arigato> uh :-) no, I think in this case it should still work

16:07 <Jin^eLD> I'll try it :)

16:07 <arigato> sys.modules.clear() only makes import statements not find the modules, but everything else should continue working

16:08 <arigato> of course you can also be more conservative and do:

16:08 <arigato> for key in list(sys.modules): if key not in copy: del sys.modules[key]

16:08 <arigato> for example

16:09 <Jin^eLD> that's pretty much what I have been thinking about, but guys in #python suggested I might get away with just removing the loaded user module script, because everything that it does gets attached to it

16:10 <Jin^eLD> removing from sys.modules that is

16:10 <arigato> no, not if it does further imports

16:10 <arigato> these new imported modules will also get added in sys.modules

16:10 <Jin^eLD> so if usermod does "import somethingelse" then it gets attached on the same level as usermod itself?

16:10 <Jin^eLD> I see

16:11 <Jin^eLD> thanks for pointing that out

16:11 <arigato> yes

16:11 <arigato> unless you make a package instead of a module, but that is normally done with extra subdirectories

16:12 <Jin^eLD> the idea was that users could organize the zip how they like, with the exception that the app will call "__userscript__" as entry point

16:12 <arigato> you can hack until the initial user module is imported under the name "myscripts.initialmodule" instead of "initialmodule", and then everything it imports will be "myscripts.someothername" by default, but that's again PITA territory

16:13 <Jin^eLD> now you know why I was so much looking for a reinit way ;)

16:13 <arigato> yes :-)

16:15 <Jin^eLD> I think your suggestion with a copy and clearing works just fine

16:16 <Jin^eLD> did not get any errors

16:16 <arigato> cool

16:16 <Jin^eLD> thanks!

16:17 <arigato> it should restore the initial module state, but it's not safe against manual changes done to these modules (e.g. the script doing `import sys; sys.foo="bar"`)

16:17 <arigato> hope that's good enough.

16:17 <Jin^eLD> well yes, a script could mess up anything in Python since everything is some sort of a writable object

16:18 <Jin^eLD> I looked into sandboxing at the beginning but quickly saw that it just makes no sense

16:18 <Jin^eLD> but I think it's good enough

16:18 <arigato> OK

16:19 <Jin^eLD> since the app "belongs to the user", i.e. it's not a service on some server but something that you'd run yourself, I doubt one would be messing themselves up with their own scripts :)

16:20 <arigato> makes sense :-)

17:34 <Jin^eLD> arigato: what happens to running code in the module when the module suddenly gets removed from sys.modules?

17:35 <mattip> arigato: fwiw the reproducer in the hackmd does not depend on AVX512F

17:36 <mattip> antocuni: thanks for the thoughts

17:36 ambv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

17:57 <mattip> antocuni: I merged your changes back to the original hackmd document

18:12 <arigato> mattip: ah

18:13 <arigato> Jin^eLD: sys.modules is just used when running "import" statements. If there is code still running in a module, the module itself is still valid, and anything already imported in it is still valid, recursively
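
A small sketch of that point: a function keeps working after its module is removed from sys.modules, because it holds its own reference to the module's globals. The module name `usermod_demo` and its contents are made up for the demonstration.

```python
import sys
import types

# Build a throwaway module in memory (stands in for a loaded user script).
mod = types.ModuleType("usermod_demo")
source = "import math\ndef area(r):\n    return math.pi * r * r\n"
exec(source, mod.__dict__)
sys.modules["usermod_demo"] = mod

area = mod.area
del sys.modules["usermod_demo"]  # only affects future "import usermod_demo"

# The function still sees its globals, including the math module it imported.
result = area(1.0)
```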

18:13 <mattip> I wonder if we can reason about the problem thus:

18:13 <mattip> the third argument to PyUFunc_GenericFunctionInternal is a struct with two pointers

18:13 <mattip> outside the call, the struct is valid

18:14 <mattip> inside the call it is getting zeroed out

18:14 <mattip> what register will the struct be passed in on?

18:14 <Jin^eLD> arigato: is there any way to kill or terminate it from the outside?

18:14 <arigato> Jin^eLD: not really

18:15 <Jin^eLD> that's another scenario where I am trying to work around the missing reinit

18:15 <Jin^eLD> doh :(

18:15 <arigato> if code is running, it might be caught in an infinite or very long loop anyway

18:15 <arigato> it could be running in a different thread though

18:16 <Jin^eLD> arigato: well, in cpython I would have tried doing something like Py_AddPendingCall() with a PyErr_SetString(PyExc_InterruptedError, "nuke") or something like that

18:17 <arigato> Jin^eLD: which might or might not work

18:17 <arigato> pretty sure at least some versions of CPython will only invoke Pending Calls in the "main thread", and which thread is the "main thread" when embedding CPython is very unclear

18:18 <Jin^eLD> mhhhh

18:18 <Jin^eLD> this whole embedding stuff is more complicated than I anticipated

18:19 <arigato> yes, it's not completely thought out, neither in CPython nor in PyPy

18:19 <Jin^eLD> perhaps I should have gone with Lua :)

18:19 <Jin^eLD> I had some spidermonkey/js experience in the past and it was just a huge PITA to get that right

18:19 <Jin^eLD> especially to make sure your C code would not crash due to JS gc

18:19 <arigato> ...ah

18:19 <Jin^eLD> so this time I thought OK, let's try something else

18:20 <Jin^eLD> there was Lua, but I am not familiar with it, now thinking if that would have been a better choice

18:20 <arigato> Lua is definitely muuuuuuuuuch simpler

18:20 <Jin^eLD> and then of course - Python...

18:20 <arigato> but Lua is also a much simpler language with limitations when trying to do something large

18:21 <Jin^eLD> I do not think I am going to do something large, so it might have been a better choice.. I am just not familiar with it at all

18:21 <Jin^eLD> and Python is also more widespread in terms of userbase

18:21 <Jin^eLD> everybody knows Python, so that's nicer towards the users I guess, which was one of my considerations

18:22 <arigato> (I can grumble about Python developers trying to fix the embedding issues by adding even more complexity, which of course doesn't make anything simpler, but well)

18:22 <Jin^eLD> :)

18:22 <Jin^eLD> I had the feeling that embedding is not a very well tested and not very often used scenario...

18:22 <Jin^eLD> of course I got that feeling only after I started to try to use it...

18:22 <arigato> I guess everybody who tries hits the same snags as you do

18:24 <arigato> mattip: normally, if you know exactly which calling function and which called function are involved, we could find out what is wrong by looking at the assembler for both

18:27 <mattip> I have a debugger with the assembler,

18:27 <mattip> if you have some time I could send a zoom link and share my screen

18:28 <arigato> would be cool, but tonight I'm off now

18:30 <mattip> ok

18:30 <arigato> if you can grab the C sources and the corresponding assembler for both, at least I could tell you if it looks correct or a C compiler bug

18:40 ambv has joined #pypy

18:47 ambv has quit [Ping timeout: 255 seconds]

18:48 ambv has joined #pypy

19:33 <mattip> Sebastian from NumPy is debugging this with me. He says

19:33 <mattip> I found something by stepping through at the assembly level, I think: full_args.in is stored in the first part of XMM6, and this is NULLed by PyPy, but should be non-volatile

19:34 <mattip> Although this is obviously not a floating point number, it is definitely where the value is stored.

19:34 <mattip> </endquote>

19:40 <mattip> Here is the C code and the assembler generated (by stopping in the debugger)

19:40 <mattip> https://gist.github.com/mattip/e7834c9c93b38d05f177a4d4e334a829

19:41 <mattip> and the documentation for the x64 calls says "In other words, user-written assembly language routines must be updated to save/restore XMM6 and XMM7 before/after the function when being ported from x86 to x86-64."

19:43 seberg has joined #pypy

20:01 ambv has quit [Ping timeout: 252 seconds]

20:03 <arigato> mattip: "ah"

20:04 <arigato> indeed, https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160 lists xmm6-xmm15 as non-volatile

20:04 <arigato> I'm pretty sure that on 32 bits and on linux it's xmm8-xmm15

20:04 <arigato> that might clearly be the problem

20:06 <mattip> should I try to add them to CALLEE_SAVE_REGISTERS in x86/runner.py?

20:06 <mattip> or does it need a more complicated fix

20:17 <arigato> let me check

20:19 <arigato> no, that constant does not list xmm registers, only non-xmm ones

20:22 <arigato> I think that on linux none of the xmm registers need to be saved, whereas on windows you need to save xmm6-xmm15

20:22 <arigato> we use xmm15 as a scratch register, which is already wrong

20:22 <arigato> then xmm0-xmm14 are listed in X86_64_XMMRegisterManager in regalloc.py

20:28 <mattip> from what I can gather, it seems the XMMRegisterManager is tied to float values, where xmm6, xmm7 are used as general purpose registers

20:29 <arigato> it's a real bug in our code on win64, probably my fault when I tried to port it to win64

20:29 <arigato> in linux all xmm regs are save_around_call_regs, but that's not the case on windows

20:32 <mattip> anything I can do to expedite a fix?

20:36 <arigato> it's suboptimal but you could reduce X86_64_XMMRegisterManager.all_regs to be a list containing only [xmm0,xmm1,xmm2,xmm3,xmm4] and have regloc.X86_64_XMM_SCRATCH_REG be xmm5

20:36 <arigato> at least on WIN64

20:38 <arigato> grepping for xmm[1-9], I think none of the xmm registers is used explicitly apart from there and in callbuilder.py:ARGUMENTS_XMM, which looks correct

20:38 <mattip> I can try that and see if it fixes the problem, pending a more extensive fix later

20:39 <arigato> yes

20:39 <arigato> this fix will just make the JIT never use xmm6-xmm15
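
To make the proposed register split concrete, here is a standalone sketch. It is not the RPython backend code: `xmm_allocation` and `WIN64_NONVOLATILE` are hypothetical names, and registers are plain strings. It only encodes what the discussion states: xmm0-xmm14 allocatable with xmm15 as scratch today, versus xmm0-xmm4 allocatable with xmm5 as scratch on WIN64.

```python
# All sixteen x86-64 XMM register names, as plain strings for illustration.
XMM = ["xmm%d" % i for i in range(16)]

# Registers the Windows x64 calling convention treats as non-volatile.
WIN64_NONVOLATILE = set(XMM[6:16])


def xmm_allocation(win64):
    """Return (allocatable registers, scratch register) for the JIT.

    Hypothetical helper mirroring the workaround discussed above.
    """
    if win64:
        # Stay entirely inside the volatile set so nothing needs saving.
        return XMM[0:5], XMM[5]
    # Existing layout described in the log: xmm0-xmm14 plus xmm15 as scratch.
    return XMM[0:15], XMM[15]
```

With the WIN64 branch, neither the allocatable list nor the scratch register overlaps xmm6-xmm15, so JIT-produced code has no non-volatile XMM register to save or restore.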

20:41 <arigato> it's actually very unclear if it's a good idea to do anything else, because if we use more xmm registers, then we need to save them in the prologue of all functions produced by the JIT, and that's additional cost everywhere even in functions that don't use floats

20:42 <mattip> ok

20:43 <arigato> yes, we might get rid of X86_64_XMM_SCRATCH_REG and use xmm0-xmm5 instead of xmm0-xmm4, but apart from that, I'm very unsure we could do better

20:44 <arigato> (like 32-bit which uses xmm0-xmm7 and has no XMM_SCRATCH_REG)

21:12 yuiza has joined #pypy

21:57 <mattip> the reproducer passes with the changes you suggested: reduce X86_64_XMMRegisterManager.all_regs to be a list containing only [xmm0,xmm1,xmm2,xmm3,xmm4] and have regloc.X86_64_XMM_SCRATCH_REG be xmm5

22:08 <Jin^eLD> arigato: another question on the embedding scenario... still thinking about how to force stop those user scripts; so let's say I have my python embedding_init_code that loads and runs the user script; what if it would spawn a multiprocessing.Process() first and load the user code there? then I could simply terminate it from C via a C->Python function which has the process handle?

22:23 lritter has joined #pypy

22:31 <bbot2> Started: http://buildbot.pypy.org/builders/rpython-win-x86-64/builds/30 [mattip: test branch, win64-xmm-registers]

22:31 <bbot2> Started: http://buildbot.pypy.org/builders/rpython-linux-x86-64/builds/467 [mattip: test branch, win64-xmm-registers]

22:31 <bbot2> Started: http://buildbot.pypy.org/builders/rpython-linux-x86-32/builds/438 [mattip: test branch, win64-xmm-registers]

22:39 seberg has quit [Ping timeout: 255 seconds]

22:50 seberg has joined #pypy

22:51 seberg has quit [Client Quit]

23:00 <antocuni> mattip, arigato: wow, congrats on tracking down the numpy+pypy segfault, that was hard

23:07 <bbot2> Failure: http://buildbot.pypy.org/builders/rpython-linux-x86-64/builds/467 [mattip: test branch, win64-xmm-registers]

23:32 <bbot2> Failure: http://buildbot.pypy.org/builders/rpython-linux-x86-32/builds/438 [mattip: test branch, win64-xmm-registers]

23:44 <bbot2> Failure: http://buildbot.pypy.org/builders/rpython-win-x86-64/builds/30 [mattip: test branch, win64-xmm-registers]