beneroth changed the topic of #picolisp to: PicoLisp language | The scalpel of software development | Channel Log: https://libera.irclog.whitequark.org/picolisp | Check www.picolisp.com for more information
<tankf33der> Morning
<abu[m]> Good morning tankf33der!
<tankf33der> now i think we should fix the fork issue on solaris, because it makes the linux platform more correct too. Double win.
<abu[m]> Agreed
<abu[m]> Also because it may be a hidden bug in Pil21 (as the other cases yesterday)
<abu[m]> Something is wrong with the global '$Child'
<abu[m]> in the parent fork
<tankf33der> When can we start?
<abu[m]> : (vi 'llvm~forkLisp)
<abu[m]> Probably it is wrong already after the first fork
<tankf33der> rebuilding with latest
<tankf33der> bug is here
<abu[m]> Crash on second fork?
<tankf33der> it crashes on the second one of these:
<tankf33der> (unless (fork) (wait 60000) (bye))
<abu[m]> Yesterday it crashed here: (Cld: buf null) # No buffer yet
<tankf33der> remember
<abu[m]> So Cld: is probably null
<tankf33der> pil21 passed all tests on linux kernel 6.0
<abu[m]> 👍
<tankf33der> Do you need me to debug something on solaris?
<abu[m]> Yes, I have no idea what goes wrong
<abu[m]> I'm back in 30 minutes. Then I can send test dbg's
<tankf33der> Afk
<abu[m]> First try: http://ix.io/487Q (near end of 'forkLisp')
<abu[m]> Gives something like http://ix.io/487R
<abu[m]> Back in 20 min
<tankf33der> Back in 1h
<tankf33der> Back
<tankf33der> doing
<tankf33der> http://ix.io/488k
<abu[m]> ok, so (Cld:) is not null
<abu[m]> and the place of crash is the same
<abu[m]> reproducible
<abu[m]> Size 28 (difference between the two pointers) is also correct for 'child' structure
<abu[m]> But (Cld: buf null) fails
<abu[m]> This just sets the first field in the struct to null
<abu[m]> Perhaps something wrong with alloc()?
<tankf33der> easy to test
<abu[m]> This prints the size it tries to alloc
<abu[m]> I see "224 N"
<abu[m]> on the first call
<tankf33der> doing
<abu[m]> The second call does not need to allocate then
<tankf33der> http://ix.io/488m
<abu[m]> ok, 224
<abu[m]> The crash makes no sense then
<abu[m]> Can you inspect the core dump and paste the instruction?
<tankf33der> what command in gdb ?
<abu[m]> yes, gdb on the core dump
<tankf33der> bt ?
<tankf33der> http://ix.io/488n
<abu[m]> I need the instruction which caused the crash
<abu[m]> Perhaps "l" gives a listing of the current position?
<tankf33der> nope
<abu[m]> or some "print $pc" (I forgot the syntax)
<abu[m]> "x" command perhaps? With program counter register
<tankf33der> (gdb) print $pc
<tankf33der> $1 = (void (*)()) 0x100058dc8 <forkLisp+584>
<tankf33der> (gdb)
<abu[m]> Getting close!
<abu[m]> We know the position 584
<abu[m]> So please do "disas forkLisp" (very long)
<abu[m]> Somehow redirect to file and paste ;)
<abu[m]> For "disas" you have to keep pressing Enter for each page
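The gdb steps discussed here can also be run non-interactively, which avoids paging through "disas" by hand and writes the result straight to a file. A minimal sketch, assuming gdb is installed; the binary, core file and output paths are hypothetical placeholders, not taken from the log. `-batch` and `-ex` are standard gdb options, and `x/i $pc` prints the instruction at the program counter.

```python
import subprocess


def gdb_batch_argv(binary, core, commands):
    """Build a gdb command line that runs `commands` on a core dump and exits."""
    argv = ["gdb", "-batch"]
    for cmd in commands:
        argv += ["-ex", cmd]
    return argv + [binary, core]


def dump_crash_site(binary, core, function, out_path):
    """Write the faulting instruction and the disassembly of `function` to a file."""
    argv = gdb_batch_argv(binary, core, ["x/i $pc", "disassemble " + function])
    with open(out_path, "w") as out:
        # check=False: gdb's exit status is not treated as an error here
        subprocess.run(argv, stdout=out, stderr=subprocess.STDOUT, check=False)


# Hypothetical usage (all three paths are assumptions):
# dump_crash_site("./picolisp", "core", "forkLisp", "forkLisp.dis")
```

The resulting file can then be pasted in one piece, and the line marked with the offset reported by `print $pc` (here `<+584>`) is the instruction that faulted.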
<tankf33der> http://ix.io/488q
<tankf33der> downloading latest possible patches for solaris 11 os
<tankf33der> upgrade will end in several days
<abu[m]> Good, <+584>: clrx [ %i4 + %i5 ]
<abu[m]> Looks like the right place, stores null
<abu[m]> I cannot see any reason why it crashes
<abu[m]> %i5 is N
<tankf33der> good you checked
<abu[m]> Strange it gives a bus error
<abu[m]> 'dbg' output looks like a valid address
<abu[m]> What exactly does bus error mean? Non-existing address?
<abu[m]> Segfault means illegal access of existing memory I think
<abu[m]> So it looks a lot like an alignment problem!
<abu[m]> Yes! That's it!
<abu[m]> The second one, 4297738620, is not a multiple of 8
<abu[m]> Solaris requires 8-byte-alignment here
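The alignment diagnosis can be checked with plain arithmetic. A small sketch: the address 4297738620 and the element size 28 come from the log, while the base address of the array is an assumption (the second element minus 28 bytes), since the pasted 'dbg' output is not part of the log.

```python
def misalignment(addr, align=8):
    """Return how many bytes `addr` lies past the previous `align` boundary."""
    return addr % align


def element_addresses(base, size, count):
    """Addresses of `count` consecutive structures of `size` bytes starting at `base`."""
    return [base + i * size for i in range(count)]


second = 4297738620   # address quoted in the log
base = second - 28    # ASSUMPTION: the first element sits 28 bytes earlier

for addr in element_addresses(base, 28, 3):
    print(addr, misalignment(addr))
```

With a 28-byte element size, only every second element starts on an 8-byte boundary, which would fit a store that works for the first child entry and faults on the second.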
<abu[m]> Can be solved easily, but I should check the other structures too (in @src/dec.l)
<abu[m]> dbFile has the same requirement
<abu[m]> I think it would crash too
<abu[m]> Can you test that?
<tankf33der> How?
<abu[m]> Make a DB with more than one file
<abu[m]> Then access the second file
<tankf33der> damn :)
<abu[m]> I'll make an example
<tankf33der> Please
<tankf33der> i can do it without example
<tankf33der> i do not understand db completely
<abu[m]> no problem
<abu[m]> (new 2) creates a symbol in the second DB file
<abu[m]> I think it will crash on Solaris
<tankf33der> $ ../pil
<tankf33der> Bus Error (core dumped)
<tankf33der> : (pool "xxx" (2 2))
<abu[m]> 👍
<abu[m]> I change the structure definitions in @src/dec.l to be multiples of 8
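Padding a structure to a multiple of 8 keeps every array element aligned as long as the array itself starts on an 8-byte boundary. A sketch of that arithmetic; the base address is an assumed example value, and the actual field layout in @src/dec.l is not shown in the log.

```python
def round_up(size, align=8):
    """Round `size` up to the next multiple of `align` (a power of two)."""
    return (size + align - 1) & ~(align - 1)


def all_aligned(base, size, count, align=8):
    """True if every element of an array of `count` structures is aligned."""
    return all((base + i * size) % align == 0 for i in range(count))


base = 4297738592               # ASSUMPTION: an 8-byte-aligned array start
old_size = 28                   # 'child' size mentioned in the log
new_size = round_up(old_size)   # padded size

print(new_size, all_aligned(base, old_size, 8), all_aligned(base, new_size, 8))
```

The allocation sizes printed in the log, 224 before the change and 256 after it, are consistent with eight entries of 28 and 32 bytes respectively, though the log does not state this explicitly.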
<abu[m]> Before I make a release, can you test with this http://ix.io/488x for src/dec.l?
<tankf33der> doing
<tankf33der> do it for solaris only ?
<abu[m]> yes
<abu[m]> The others should be ok with this change
<tankf33der> http://ix.io/488y
<tankf33der> FIXED
<tankf33der> but @lib/test.l crash with bus error anyway
<abu[m]> yuhuu!
<abu[m]> Which test is it?
<tankf33der> digging
<tankf33der> ### kids ###
<tankf33der> (test
<tankf33der> (make
<tankf33der> (do 7
<tankf33der> (link (or (fork) (wait 2000) (bye))) ) )
<tankf33der> (flip (kids)) )
<tankf33der> this one
<abu[m]> again in (fork)?
<tankf33der> yes
<tankf33der> (unless (fork) (wait 6000) (bye))
<tankf33der> ^^^^ this works
<abu[m]> Do you still have the 'dbg' output of the pointers?
<tankf33der> i will create then
<abu[m]> good, let's check the sizes again
<tankf33der> http://ix.io/488B
<abu[m]> The pointers are good now. But what is that "0"?
<tankf33der> o
<tankf33der> wait
<tankf33der> this is my 0, i missed
<abu[m]> Just an output?
<tankf33der> yes
<tankf33der> (test NIL (pipe (prog (kill *Pid) (pr 7)) (rd)))
<tankf33der> this
<abu[m]> So now it crashes on the 8th time
<abu[m]> ok, pipe calls forkLisp
<abu[m]> I try here too to fork more than 8 times
<abu[m]> works
<tankf33der> (test 7 (pipe (protect (kill *Pid) (pr 7)) (rd)))
<tankf33der> here crashes too
<abu[m]> No idea again ;)
<abu[m]> The crash is in another place now
<abu[m]> "9 T" is printed
<abu[m]> Can you find out from core dump where it crashes now?
<tankf33der> (pipe (call *CMD "-prog (println (argv)) (bye)" "abc" 123) (read))
<tankf33der> 9 T
<tankf33der> 4297607744
<tankf33der> 256 N
<tankf33der> Bus Error (core dumped)
<tankf33der> doing
<tankf33der> <tankf33der> 4297607744
<tankf33der> is this address ok?
<abu[m]> Not a multiple of 8
<abu[m]> Where is that?
<tankf33der> line above
<tankf33der> pipe call again
<abu[m]> I mean in the base source. forkLisp()
<abu[m]> It is after "9 T", so probably after forkLisp returned
<abu[m]> Perhaps some other unaligned data
<tankf33der> http://ix.io/488F
<tankf33der> output and code
<abu[m]> yes, this was clear
<abu[m]> But *where* in 'pipe'?
<abu[m]> after "9 T" this time, so forkLisp exited
<tankf33der> $ cat pipe1.l
<tankf33der> (test NIL (pipe (prog (kill *Pid) (pr 7)) (rd)))
<tankf33der> (msg 'ok)
<tankf33der> (bye)
<tankf33der> backtrace
<tankf33der> :
<tankf33der> #0 0x000000010004a2b4 in pushOutFile ()
<abu[m]> ok
<abu[m]> So we are in the child process this time
<abu[m]> pushOutFile operates on structures on the stack
<abu[m]> so this means we also have to align all stack structures
<abu[m]> big task!
<tankf33der> http://ix.io/488J
<abu[m]> checking
<abu[m]> hmm, let's look at the pointer in pushOutFile http://ix.io/488M
<abu[m]> I think it is not aligned, because the pil21 llvm compiler does not align on the stack
<tankf33der> http://ix.io/488N
<abu[m]> The 'T' ones are from pushOutFile?
<tankf33der> i think so
<abu[m]> The last one is it
<abu[m]> 18446744071562062524
<abu[m]> This is not aligned
<tankf33der> http://ix.io/488P
<tankf33der> this is clean output, i disabled dbg in forkLisp
<abu[m]> good
<abu[m]> All are from pushOutFile, right?
<tankf33der> right
<abu[m]> These structures are allocated on the stack with alloca(). I have no control over the alignment
<abu[m]> This is tough
<tankf33der> you can release dec.l changes anyway
<tankf33der> good that we check as much as we can
<abu[m]> But it would be better also for the other CPUs if these structures were properly aligned
<abu[m]> yes
<abu[m]> ok, I release first, then think about a rewrite of some stack structures
<abu[m]> Released
<tankf33der> x64 ok
<tankf33der> s390x ok
<abu[m]> Great! But I must give up for now. Need to think more about it, needs a different strategy for all those stack structures
<abu[m]> I must call alloca() sometimes with an alignment arg, so I must extend @src/lib/llvm.l
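One common way to get an aligned block from an allocator that gives no alignment guarantee is to request `align - 1` extra bytes and round the returned address up. The sketch below models that with plain integers as addresses. It illustrates the general technique only and is an assumption, not the change that was made to @src/lib/llvm.l; the 32-byte structure size is likewise just an example value.

```python
def align_up(addr, align=8):
    """Round an address up to the next multiple of `align` (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def carve_aligned(raw_addr, size, align=8):
    """Given a raw block of size + align - 1 bytes at `raw_addr`,
    return the (start, end) of an aligned `size`-byte region inside it."""
    start = align_up(raw_addr, align)
    return start, start + size


raw = 18446744071562062524   # unaligned stack address quoted in the log
start, end = carve_aligned(raw, 32)

assert start % 8 == 0                # the carved region is aligned
assert end <= raw + 32 + 8 - 1       # and still fits in the over-sized block
print(start, end)
```

Of the addresses quoted in the log, 4297738620 and 18446744071562062524 both leave a remainder of 4 modulo 8, while 4297607744 is itself a multiple of 8.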
<tankf33der> Ok, enough for today
<tankf33der> too much picolisp in my life.
<abu[m]> yep
<tankf33der> See you
<abu[m]> Thanks a lot!!! ☺
<abu[m]> Now I found some time ☺
<abu[m]> Changed the relevant stack structures
<abu[m]> And I will release it now.
<abu[m]> Though this alignment was not required by the hardware on the architectures we tested before, it is better in any case.
<abu[m]> Uses a little more stack space but is faster.
<abu[m]> tankf33der: When you have time, please test! I hope very much that Solaris passes now too.