<adrien>
I'm hitting an issue with pyml ( https://github.com/thierry-martinez/pyml/issues/97 ) where the GIL is problematic in an OCaml program using threads; I've put all the Python operations in a single thread and I'm now using OCaml 5.2, but it still seems that threads other than the one dedicated to Python can collect pyml's custom values
<adrien>
and that requires holding the GIL which causes either a segfault/race-condition, or deadlock if I try to get the GIL from the finalizer
<adrien>
the part I don't understand is that I thought that with OCaml 5, values allocated from a given thread would be collected by the same thread
<adrien>
or is it that _one_ thread collects values allocated by _one_ thread but the mapping isn't fixed?
neiluj has joined #ocaml
bartholin has joined #ocaml
<octachron>
Values allocated by a domain can be collected by any domain (in the shared major heap)
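One pattern that may help with this kind of problem is to keep the finalizer away from the foreign runtime entirely: it only records the handle, and the thread that owns the GIL frees it later. The following is a minimal OCaml sketch of that idea, not pyml's actual mechanism. It assumes handles are plain integers, and `release` is a hypothetical stand-in for whatever call really frees the Python object.

```ocaml
(* Handles waiting to be released. Atomic is available since OCaml 4.12. *)
let pending : int list Atomic.t = Atomic.make []

(* Safe to call from a finalizer: it takes no lock, it only retries a
   compare-and-set until the handle has been pushed. *)
let rec defer_release (handle : int) : unit =
  let old = Atomic.get pending in
  if not (Atomic.compare_and_set pending old (handle :: old)) then
    defer_release handle

(* Called only from the thread that owns the GIL. [release] is a
   hypothetical callback that frees one handle. Returns how many
   handles were released. *)
let drain (release : int -> unit) : int =
  let handles = Atomic.exchange pending [] in
  List.iter release handles;
  List.length handles
```

The Python thread would call `drain` between jobs, so every release happens where the GIL is already held.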
contificate has joined #ocaml
<adrien>
these get freed through caml_empty_minor_heap() (4.x I guess) or caml_stw_empty_minor_heap_no_major_slice() (5.2, or at least in my case) but are custom values allocated in the minor heap?
<adrien>
thanks; I've been toying with Gc controls too but I was missing minor heap size probably; I also resurrected an older version of my code which seems to survive even without Gc controls (it didn't survive that well before I think...)
<adrien>
I still have the option of spawning another process but I'd prefer to avoid that
dmoerner has quit [Ping timeout: 252 seconds]
<discocaml>
<otini_> I thought the Python GIL was a thing of the past…?
<discocaml>
<otini_> the issue OP is using Lwt. Lwt with multiple domains is not safe, except maybe when Lwt is confined to the main domain. Don’t know if you’re using Lwt too
dmoerner has joined #ocaml
<discocaml>
<otini_> afraid I don’t know the python runtime well enough to help
<discocaml>
<otini_> pyml looks a bit dead
neiluj has quit [Ping timeout: 260 seconds]
theblatte has quit [Quit: leaving]
<adrien>
I'm using stuff on a single domain; the additional thing I have is Lwt_preemptive to communicate with a thread that I dedicate to python and which works in a blocking manner
<adrien>
but that's still a single domain I think
<adrien>
(maybe I could have the python stuff in a domain rather than a thread?)
<adrien>
with large enough Gc controls, I don't get a crash anymore, or at least not before I get output from my program so I'll probably run with that for now and come back to this issue when I'm not in the middle of a large refactoring!
<adrien>
the python gil is being removed but you need a separate build for that and it's only in 3.13 which was published very recently
<adrien>
and the performance is lower so I don't think it'll be the default for a while (but I hope distros provide both; that may be easier said than done however)
<adrien>
and pyml is indeed not very active but I don't know if it really needs to be more active (stdcompat which it depends upon would benefit from an update in opam however because it's not compatible with 5.2)
<adrien>
also, that depends on whether distros start shipping gil-less python builds; but I also just tried using Domain.spawn instead of Thread.create and it seems to work well!
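For illustration, a dedicated worker domain fed through a job queue could be sketched as below. This is an assumed structure, not adrien's code: the `job` type, `submit` and `start_worker` are made-up names, and the jobs here are plain closures where a real program would call into pyml. It only shows how work is confined to one domain and does not settle where finalizers run.

```ocaml
(* Requires OCaml >= 5.0: Domain, Mutex and Condition are in the stdlib. *)
type job = Job of (unit -> unit) | Stop

let jobs : job Queue.t = Queue.create ()
let lock = Mutex.create ()
let nonempty = Condition.create ()

(* Called from any domain or thread to hand work to the worker. *)
let submit (job : job) : unit =
  Mutex.lock lock;
  Queue.push job jobs;
  Condition.signal nonempty;
  Mutex.unlock lock

(* Runs on the dedicated domain: wait for a job, run it outside the
   lock, and repeat until Stop is received. *)
let rec worker () : unit =
  Mutex.lock lock;
  while Queue.is_empty jobs do
    Condition.wait nonempty lock
  done;
  let job = Queue.pop jobs in
  Mutex.unlock lock;
  match job with
  | Stop -> ()
  | Job f -> f (); worker ()

let start_worker () : unit Domain.t = Domain.spawn worker
```

Usage would be `let d = start_worker ()`, then `submit (Job f)` for each piece of work, and finally `submit Stop` followed by `Domain.join d`.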
theblatte has joined #ocaml
mbuf has joined #ocaml
sroso has quit [Quit: Leaving :)]
<companion_cube>
Be aware you shouldn't create more domains than you have cores
alexherbo2 has quit [Remote host closed the connection]
spew has joined #ocaml
pi3ce has quit [Ping timeout: 252 seconds]
pi3ce has joined #ocaml
<adrien>
companion_cube: I don't create any other than the one dedicated to the python interpreter so I'm pretty safe; thinking about it, I may want to create a few for some CPU-intensive tasks but I'm confident that will be <= 4 and I can assert at startup that there are enough cores
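A startup check along those lines could look like the sketch below. `required_domains` is a made-up budget for this example, and `Domain.recommended_domain_count` is the standard library's suggested upper bound on the number of domains, which may not equal the physical core count on every machine.

```ocaml
(* Hypothetical budget: the main domain, one domain for Python,
   and a couple for CPU-bound work. *)
let required_domains = 4

(* Domain.recommended_domain_count is available from OCaml 5.0. *)
let enough_cores () : bool =
  Domain.recommended_domain_count () >= required_domains

(* Fail early instead of oversubscribing the machine later. *)
let check_at_startup () : unit =
  if not (enough_cores ()) then
    failwith
      (Printf.sprintf "need %d domains, runtime recommends at most %d"
         required_domains
         (Domain.recommended_domain_count ()))
```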
<adrien>
I'm writing something incredibly specific and that should see two concurrent deployments at most so I can do that :D
<adrien>
(it takes ubuntu's migration "excuses" page, rewrites it and enhances it with up-to-date information from launchpad builders and bug tracker; and if you know enough to follow, no, I don't want to touch britney and link it more to external services!)
mbuf has quit [Quit: Leaving]
Haudegen has quit [Quit: Bin weg.]
euphores has quit [Quit: Leaving.]
euphores has joined #ocaml
spew has quit [Quit: spew]
spew has joined #ocaml
raskol has joined #ocaml
bartholin has joined #ocaml
tyzef has joined #ocaml
Haudegen has joined #ocaml
tyzef has quit [Quit: WeeChat 3.8]
nirvdrum has quit [Quit: Ping timeout (120 seconds)]
nirvdrum has joined #ocaml
Anarchos has joined #ocaml
euphores has quit [Ping timeout: 246 seconds]
euphores has joined #ocaml
bartholin has quit [Quit: Leaving]
neiluj has joined #ocaml
neiluj has quit [Ping timeout: 265 seconds]
neiluj has joined #ocaml
humasect has joined #ocaml
Haudegen has quit [Quit: Bin weg.]
Haudegen has joined #ocaml
neiluj has quit [Ping timeout: 252 seconds]
Stumpfenstiel has joined #ocaml
YuGiOhJCJ has joined #ocaml
humasect has quit [Quit: Leaving...]
neiluj has joined #ocaml
neiluj has quit [Ping timeout: 272 seconds]
<discocaml>
<mm3315> When writing a parser combinator, can it be more memory efficient to write the parser for `char list` instead of for `string`?
<discocaml>
<sim642> A `char list` will definitely use more memory than `string` for the same length
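The difference can be measured directly. Each list cell is a heap block with a header and two fields, so a `char list` costs about three words per character, while a `string` packs one character per byte. A small sketch using `Obj.reachable_words`, which counts heap words including headers (the length `n` is arbitrary):

```ocaml
let n = 1000

(* Both values are built at runtime so that they live on the heap. *)
let as_string : string = String.make n 'a'
let as_list : char list = List.init n (fun _ -> 'a')

(* Heap words reachable from each value, headers included. *)
let string_words = Obj.reachable_words (Obj.repr as_string)
let list_words = Obj.reachable_words (Obj.repr as_list)

let () =
  Printf.printf "string: %d words, char list: %d words\n"
    string_words list_words
```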
<discocaml>
<sim642> The only way I can see that `char list` may save memory is that GC could free some elements from the beginning that already have been parsed and made unreachable
<discocaml>
<sim642> Whereas a `string` is monolithic
<discocaml>
<mm3315> you mean returning a substring of a string will not use more memory if I don't modify it, right?
<discocaml>
<contextfreebeer> you should probably use a buffer instead if you want efficient parsing
<discocaml>
<contextfreebeer> or a library designed for it
Everything has joined #ocaml
neiluj has joined #ocaml
<discocaml>
<mm3315> I'll check for it, thanks
<discocaml>
<deepspacejohn> I don't think you should be needing to return substrings during parsing. You can just track your location within the string.
<discocaml>
<qrpnxz> using a `char Seq.t` would save memory
<discocaml>
<qrpnxz> this is for writing, how does it help for reading (parsing)?
<discocaml>
<contextfreebeer> oh maybe I misunderstood what they wanted to use this for
<companion_cube>
If you write, have a string Seq.t, not individual chars, at least
<companion_cube>
Writing individual characters to anything but a buffer will be fairly inefficient
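As a small illustration of writing character by character into a buffer, here is a sketch of an escaping function. The function name and the choice of escapes are made up for the example; only the `Buffer` calls come from the standard library.

```ocaml
(* Escape double quotes and backslashes, appending to a Buffer
   one character (or short string) at a time. *)
let escape (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '"' -> Buffer.add_string buf "\\\""
      | '\\' -> Buffer.add_string buf "\\\\"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf
```

The buffer grows geometrically, so the per-character cost stays small, unlike repeated string concatenation.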
neiluj has quit [Ping timeout: 252 seconds]
<discocaml>
<mm3315> Thanks for the answers @qrpnxz and <companion_cube>, but how can `char Seq.t` be better at parsing than `char list`?
<discocaml>
<mm3315> Does it provide sharing between thunks, or something?
<companion_cube>
I think char list and char seq are both terrible tbh
<discocaml>
<deepspacejohn> you can just track your position in the string without using String.sub, unless you actually need to copy a substring to the output.
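Tracking the position could be sketched as follows: a parser takes the whole input plus an offset and returns the new offset, so no substring is ever allocated. The `parser` type and the combinator names are chosen for this example and do not come from any particular library.

```ocaml
(* A parser takes the input and an offset, and returns the parsed value
   together with the offset just past it, or None on failure. *)
type 'a parser = string -> int -> ('a * int) option

let char_ (c : char) : char parser = fun s pos ->
  if pos < String.length s && s.[pos] = c then Some (c, pos + 1) else None

(* Parse one or more decimal digits without building any substring. *)
let nat : int parser = fun s pos ->
  let n = String.length s in
  let rec go i acc =
    if i < n && s.[i] >= '0' && s.[i] <= '9'
    then go (i + 1) (acc * 10 + Char.code s.[i] - Char.code '0')
    else (i, acc)
  in
  let stop, value = go pos 0 in
  if stop = pos then None else Some (value, stop)

(* Sequence two parsers, threading the offset through. *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser = fun s pos ->
  match p s pos with
  | None -> None
  | Some (x, pos') -> f x s pos'

let return (x : 'a) : 'a parser = fun _ pos -> Some (x, pos)

(* Sums of the form "12+30". *)
let sum : int parser =
  nat >>= fun a -> char_ '+' >>= fun _ -> nat >>= fun b -> return (a + b)
```

Calling `sum "12+30" 0` runs the parser from offset 0; the only allocations are the small option and tuple results.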
neiluj has quit [Ping timeout: 276 seconds]
Serpent7776 has quit [Quit: leaving]
<discocaml>
<qrpnxz> I didn't say better parsing, it would use potentially less memory because you'd be streaming the characters rather than have them all in memory in a list all at once
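The streaming idea could be sketched like this: characters are pulled from a channel on demand, so the whole input never sits in memory as a list. `chars_of_channel` and `count_lines` are names invented for the example. Note that the resulting sequence is ephemeral and can only be traversed once.

```ocaml
(* Requires OCaml >= 4.14 for Seq.of_dispenser and In_channel. *)

(* Each character is read only when the consumer asks for it. *)
let chars_of_channel (ic : in_channel) : char Seq.t =
  Seq.of_dispenser (fun () -> In_channel.input_char ic)

(* A consumer that keeps only a counter, never the characters. *)
let count_lines (chars : char Seq.t) : int =
  Seq.fold_left (fun n c -> if c = '\n' then n + 1 else n) 0 chars
```

This bounds memory use but still pays one closure call per character, which is the inefficiency companion_cube points out.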
<discocaml>
<qrpnxz> damn, sorry for the edits, irc chat
<discocaml>
<qrpnxz> terrible
<discocaml>
<mm3315> 🤯
<discocaml>
<qrpnxz> see channel description
<discocaml>
<mm3315> I'm deeply sorry for all the crimes I committed..
<discocaml>
<mm3315> anyway I got what you mean, thanks
<dh`>
if you're worried about performance, you probably shouldn't be using a parser combinator library at all
<Anarchos>
companion_cube char list, char seq, string, buffer.... I feel like it's too much. I remember caml light, when there was only mutable string.
<companion_cube>
For substrings you can use string*int*int
<companion_cube>
It sucks but there's nothing much better
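The string*int*int idea could be sketched as a small slice type. A record is used here instead of a bare triple purely for readability, and all the names are made up for the example.

```ocaml
(* A view into [base]: [len] bytes starting at [off]. No bytes are copied. *)
type slice = { base : string; off : int; len : int }

let of_string (s : string) : slice =
  { base = s; off = 0; len = String.length s }

(* Narrow an existing slice; bounds are checked against the slice itself. *)
let sub (sl : slice) ~(off : int) ~(len : int) : slice =
  if off < 0 || len < 0 || off + len > sl.len then invalid_arg "sub";
  { sl with off = sl.off + off; len }

let get (sl : slice) (i : int) : char =
  if i < 0 || i >= sl.len then invalid_arg "get";
  sl.base.[sl.off + i]

(* The only function here that copies bytes. *)
let to_string (sl : slice) : string = String.sub sl.base sl.off sl.len
```

Every slice shares the original string, so a parser can hand out as many as it likes and copy only when a result must outlive the input.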
<discocaml>
<qrpnxz> No doubt caml light also had char list. I for one very much welcome immutable string and hope standard immutable arrays or something more sophisticated happen
halloy1870 has joined #ocaml
halloy1870 has quit [Remote host closed the connection]
maybe has joined #ocaml
Anarchos has quit [Quit: Vision[]: i've been blurred!]
maybe has quit [Remote host closed the connection]
maybe has joined #ocaml
maybe has quit [Remote host closed the connection]
maybe has joined #ocaml
neiluj has joined #ocaml
neiluj has quit [Ping timeout: 260 seconds]
maybe has quit [Remote host closed the connection]
xd1le has joined #ocaml
Stumpfenstiel has quit [Ping timeout: 252 seconds]