beneroth changed the topic of #picolisp to: PicoLisp language | The scalpel of software development | Channel Log: https://libera.irclog.whitequark.org/picolisp | Check www.picolisp.com for more information
<tankf33der> testing
<tankf33der> can you add (pool "xxx" (2 2)) (new 1) (new 2) to tests?
<tankf33der> x64 ok
<tankf33der> http://ix.io/48du
<tankf33der> coroutines crash on solaris
<tankf33der> both ext and ht are broken on s390 and solaris
<tankf33der> ===========
<tankf33der> status for now:
<tankf33der> o) x64 ok
<tankf33der> o) solaris - co crash
<tankf33der> o) s390 - ht:ex issue, rest all passed
<tankf33der> ============
<tankf33der> afk
<abu[m]> Thanks! But the Solaris (fork) and (pipe) issues are solved, right?
<abu[m]> Is it a Bus error again in the coroutines?
<tankf33der> In solaris pipe and fork and pool-new-new solved.
<tankf33der> Co crashes in a simple case
<abu[m]> Segfault?
<tankf33der> Segfault
<abu[m]> Because if it is a Bus error, it is probably still an alignment thing
<tankf33der> Pasting
<tankf33der> http://ix.io/48dI
<abu[m]> There are two 'memset's in '_co'
<tankf33der> ok
<abu[m]> So again the question is which value?
<tankf33der> what command to execute ?
<abu[m]> We need to know if it is the first memset
<abu[m]> Can you put 'dbg's around the two memsets to see which one?
<tankf33der> http://ix.io/48dO
<tankf33der> sure, show me code
<abu[m]> in '_co' in @src/flow.l
<abu[m]> I do (vi 'co)
<tankf33der> i am waiting for commands
<abu[m]> Anything
<tankf33der> i do not remember dbg syntax
<abu[m]> (dbg 1 0) etc. immediately before and after the two memset's
<abu[m]> (dbg 2 0) (dbg 3 0)
<abu[m]> When we know which one, we look at the args
<tankf33der> doing
<tankf33der> 1
<tankf33der> 3
<tankf33der> Segmentation Fault (core dumped)
<tankf33der> 2
<tankf33der> mihailp@landau:~/pil21$
<tankf33der> second
<abu[m]> Good, so the second one
<abu[m]> Now let's look at the Dst: pointer http://ix.io/48dR
<abu[m]> For test, just (co 'a (* 3 4))) is ok
<abu[m]> (co 'a (* 3 4))
<tankf33der> Doing asap
<abu[m]> (I put this line into my local ".pilrc" history)
<tankf33der> doing
<abu[m]> wait
<abu[m]> Let's also test Siz
<tankf33der> $ ./pil
<tankf33der> 2
<tankf33der> 1
<tankf33der> 18446744071561802448 P
<tankf33der> : (co 'a (* 3 4))
<tankf33der> Segmentation Fault (core dumped)
<abu[m]> ok, I have http://ix.io/48dT
<tankf33der> 1$ ./pil +
<tankf33der> : (co 'a (* 3 4))
<tankf33der> 1
<tankf33der> 2
<tankf33der> 18446744071561802352 P
<tankf33der> 65536 N
<tankf33der> Segmentation Fault (core dumped)
<abu[m]> Looks good
<abu[m]> Ok, let's test only (stack) then
<tankf33der> 18446744071561802352 P
<tankf33der> is this ok ?
<tankf33der> ends on 2
<abu[m]> I think so. It is aligned, and a (high) stack address
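[Editor's note] The "it is aligned" remark can be checked mechanically. A minimal sketch in C, assuming the printed P value is the raw pointer as an unsigned 64-bit number; `is_aligned` is a hypothetical helper for illustration and is not part of the PicoLisp sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from pil21): returns true if addr is a
   multiple of align. align must be a non-zero power of two. */
static bool is_aligned(uint64_t addr, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits selected by (align - 1)
       must all be zero when addr is aligned. */
    return (addr & (align - 1)) == 0;
}
```

Calling `is_aligned(18446744071561802352ULL, 16)` on the value pasted above reports that the address is a multiple of 16, which is consistent with the conclusion that alignment of Dst is not the problem here.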
<abu[m]> OK, now a little more tricky
<tankf33der> doing
chexum has quit [Remote host closed the connection]
chexum has joined #picolisp
<tankf33der> i hope i modified code correctly
<abu[m]> I'm sure :)
<tankf33der> $ ../pil
<tankf33der> Bus Error (core dumped)
<tankf33der> 1
<tankf33der> : (co 'a (+3 4))
<tankf33der> 2
<abu[m]> Interesting
<abu[m]> Then it is not inside memset, but the stack allocation
<abu[m]> hmm, is such stack operation not allowed on Solaris?
<tankf33der> is it aligned ?
<abu[m]> I think not an alignment issue, because then it would be Bus error
<tankf33der> now bus error
<abu[m]> hmm
<abu[m]> To be sure, let's insert "1 T": http://ix.io/48dY
<abu[m]> so we have output before and after 'stack'
<tankf33der> : (co 'a (* 3 4))
<tankf33der> $ ../pil +
<tankf33der> 1
<tankf33der> 1 T
<tankf33der> 2
<tankf33der> Bus Error (core dumped)
<abu[m]> "1 T" is not printed
<tankf33der> printed
genpaku has quit [Remote host closed the connection]
<abu[m]> (dbg 1 $T) immediately before (let S ...
<tankf33der> yes, and it printed
genpaku has joined #picolisp
<abu[m]> see above "To be sure, ..."
<abu[m]> a slightly modified dbg
<tankf33der> i already inserted it
<tankf33der> and paste here
<tankf33der> http://ix.io/48e0
<abu[m]> So it is definitely the (let S (stack ... which crashes?
<tankf33der> yes, right after (dbg 1 $T)
<abu[m]> I'm afraid this is bad news
<abu[m]> Solaris does not allow such stack manipulations
<tankf33der> good we checked as deep as possible
<abu[m]> frustrating
<abu[m]> Not sure if there is some way around that
<abu[m]> And it still does not make sense to me
<tankf33der> let's close this issue
<abu[m]> If Solaris prohibits certain stack manipulations, I would expect it to crash in another place
<tankf33der> i checked modified code several times
<tankf33der> do it again
<abu[m]> ok
hrberg has joined #picolisp
hrberg has quit [Client Quit]
<tankf33der> patch is correct
<abu[m]> I'm not sure. Coroutines are nasty.
<abu[m]> Some Solaris-Guru should debug this
<tankf33der> does not exist
<tankf33der> laid off
hrberg has joined #picolisp
<abu[m]> yeah
<abu[m]> But the same restrictions may exist in other OSes too
<abu[m]> I will meditate on what exactly 'co' does on the stack
<abu[m]> We still have the ext and ht issues
<abu[m]> Finding shared objects
<abu[m]> Does 'native' work with external libs?
<abu[m]> E.g. your libcrypt tests?
<tankf33der> let me try
<tankf33der> we should stop solaris tests, very hard to debug.
<abu[m]> good
<tankf33der> let's concentrate on s390, it is on my laptop, easy.
<tankf33der> testing libcrypt
<abu[m]> On s390x, the coroutines work, right?
<tankf33der> right
<tankf33der> libcrypt works on s390
<abu[m]> So let's hope this co trouble is *only* in Solaris ;)
<tankf33der> http://ix.io/48e1
<tankf33der> this is strace after entering (ext:Snx "sss")
<tankf33der> open("/root/pil21/lib/ext", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
<abu[m]> Pil21 makes some assumptions how freely it can jump around in the stack
<abu[m]> Perhaps the lib is not built correctly?
<abu[m]> How about an existing lib like libcrypt?
<tankf33der> problem in ext
<tankf33der> not ext.so
<abu[m]> ah!
<tankf33der> .so is lost somehow
<abu[m]> yes, expects lib/ext.so
<abu[m]> good find!
<abu[m]> Makefile only says ext.so
<tankf33der> ext.so and ht.so created in lib and ready to use
<abu[m]> 'sharedLib' in @src/main.l says (strcpy (ofs Q Len) ($ ".so"))
<abu[m]> So it should try to open "ext.so"
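[Editor's note] The suffix step described above, writing ".so" at offset Len of the path buffer, can be sketched in plain C. This is only an illustration under assumptions: `with_so_suffix` and its parameters are hypothetical names, and the real 'sharedLib' in @src/main.l is written in PicoLisp's own LLVM front-end language, not C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the step discussed above: copy base into
   buf, then write ".so" starting at offset len, analogous to
   (strcpy (ofs Q Len) ($ ".so")). Returns buf. */
static char *with_so_suffix(char *buf, size_t size, const char *base) {
    size_t len = strlen(base);
    /* sizeof ".so" is 4: three characters plus the terminating NUL. */
    assert(len + sizeof ".so" <= size);
    memcpy(buf, base, len);       /* the path without the suffix */
    strcpy(buf + len, ".so");     /* suffix plus NUL, written at offset len */
    return buf;
}
```

With a 64-byte buffer and the base "lib/ext", the result is the string "lib/ext.so", which is the kind of name the loader would then be asked to open. If the offset or the copy went wrong, the buffer would still end at "lib/ext", matching the ENOENT seen in the strace output.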
<tankf33der> let's debug, send me commands
<abu[m]> ok
<abu[m]> Let's see first if we look at the right place: http://ix.io/48e6 in 'sharedLib'
<abu[m]> oops
<abu[m]> The second '1' is better '2'
<abu[m]> but even so it would be understandable
<abu[m]> Sorry, $NIL must be $Nil
<tankf33der> opt: base.ll:5669:29: error: '%39' defined with type 'i64' but expected 'i8*'
<tankf33der> %43 = call i8* @dlsym(i8* %39, i8* %42)
<tankf33der> make: *** [Makefile:40: base.bc] Error 1
<tankf33der> ^
<tankf33der> error
<abu[m]> yeah, moment
<abu[m]> The 'and' is tricky to debug
<tankf33der> everything is tricky inside pil21 sources
<abu[m]> yeah
<abu[m]> It is LLVM's fault. Very restrictive.
<tankf33der> doing
<tankf33der> 1 NIL
<tankf33der> : (ext:Snx "sss")
<tankf33der> $ pil +
<tankf33der> -> "S"
<tankf33der> 2 T
<tankf33der> :
<tankf33der> works now
<abu[m]> oh, strange!
<abu[m]> Yes, same output here
<abu[m]> So perhaps the strcpy goes wrong?
<abu[m]> Let's look at the string
<abu[m]> Let's hope the error is not gone
<tankf33der> doing
<tankf33der> $ ../pil
<tankf33der> : (ext:Snx "sss")
<tankf33der> -> "S"
<tankf33der> |../lib/ext|
<tankf33der> 1 NIL
<tankf33der> :
<abu[m]> good!
<abu[m]> So the string operations go wrong
<abu[m]> But why does it work?
<abu[m]> without .so
<tankf33der> under strace
<tankf33der> ===
<tankf33der> write(2, "|/home/mpech/pil21/lib/ext|\n", 28|/home/mpech/pil21/lib/ext|
<tankf33der> ) = 28
<tankf33der> openat(AT_FDCWD, "/home/mpech/pil21/lib/ext.so", O_RDONLY|O_CLOEXEC) = 3
<tankf33der> ===
<tankf33der> it outputs the wrong path, but uses the correct one in openat
<abu[m]> mysterious!
<abu[m]> Something with the 'Len' calculation goes wrong? So that there is some null byte in between?
<abu[m]> No idea again ;)
<abu[m]> Looks all correct to me
<abu[m]> And such calculations are not system-dependent I think
<tankf33der> strange how several lines of code shift something and it works
<abu[m]> right
chexum has quit [Remote host closed the connection]
<tankf33der> @lib/test.l passed :)
<abu[m]> The above write in strace is immediately before (and (dlOpen ... right?
<tankf33der> above strace is:
<tankf33der> $ strace pil +
<tankf33der> run
chexum has joined #picolisp
<abu[m]> The write is from (stderrMsg ($ "|%s|\n") Q)
<abu[m]> just before dlOpen
<tankf33der> doing
<abu[m]> It is this one http://ix.io/48e9 right?
<tankf33der> $ pil +
<tankf33der> 1 NIL
<tankf33der> strcpy |/home/mpech/pil21/lib/ext|
<tankf33der> : (ext:Snx "sss")
<tankf33der> dlOpen |/home/mpech/pil21/lib/ext.so|
<tankf33der> -> "S"
<tankf33der> :
<abu[m]> yeah, how can that be?
<tankf33der> (dbg 1 $Nil)
<tankf33der> (strcpy (ofs Q Len) ($ ".so"))
<tankf33der> (stderrMsg ($ "dlOpen |%s|\n") Q)
<tankf33der> (stderrMsg ($ "strcpy |%s|\n") Q)
<tankf33der> (and
<tankf33der> (dlOpen Q)
<abu[m]> ah, this is ok then, right?
<tankf33der> right
<abu[m]> the strcpy works correctly
<abu[m]> So we have a heisenbug
<abu[m]> Without debug output it is different
<abu[m]> So we must test without debug output again
<tankf33der> doing
<abu[m]> And just use ltrace and strace
<tankf33der> $ ../pil +
<tankf33der> : (ext:Snx "sss")
<tankf33der> :
<tankf33der> -> "S"
<tankf33der> works
<abu[m]> uh
<tankf33der> impossible
<abu[m]> exactly ;)
<abu[m]> Make clean and make?
<tankf33der> now i remove debug lines, not comment them
<abu[m]> ok
<tankf33der> works
<tankf33der> unbelievable
<tankf33der> full clean and remake
<abu[m]> What if you install in another place from scratch?
<tankf33der> works
<tankf33der> damn
<tankf33der> i did it all on x64 :)
<tankf33der> wow
<abu[m]> oh :)
<abu[m]> Confusing
<tankf33der> damn
<tankf33der> i am so sorry
<abu[m]> no problem, so many versions are confusing
<tankf33der> i was wondering why make is so fast
<abu[m]> Not in emulator
<tankf33der> yeah
<tankf33der> opt is very slow on qemu
<abu[m]> So what is the state? ext:Snx fails on s390x and Solaris?
<tankf33der> yes
<abu[m]> Then perhaps an endianness issue
<abu[m]> But it is only string operations ...
<tankf33der> s390x:~/pil21/src# ../pil
<tankf33der> : (ext:Snx "sss")
<tankf33der> !? (ext:Snx "sss")
<tankf33der> strcpy |../lib/ext|
<tankf33der> strcpy |../lib/ext|
<tankf33der> ext:Snx -- Undefined
<tankf33der> ?
<tankf33der> must run away
<tankf33der> back in 1h
<abu[m]> ok, I'm away in one hour for a few hours
<abu[m]> We go cycling ☺
<abu[m]> Meanwhile I'll meditate on it
<tankf33der> back
<tankf33der> : (ext:Snx "sss")
<tankf33der> strcpy |../lib/ext|
<tankf33der> s390x:~/pil21/src# ../pil
<tankf33der> !? (ext:Snx "sss")
<tankf33der> ext:Snx -- Undefined
<tankf33der> dlopen |../lib/ext|
<tankf33der> ?
chexum has quit [Ping timeout: 268 seconds]
chexum has joined #picolisp
<abu[m]> Arrived in Biergarten ☺
chexum has quit [Remote host closed the connection]
chexum has joined #picolisp
<abu[m]> Let's trace it a little http://ix.io/48ez
<tankf33der> doing
<tankf33der> # ../pil
<tankf33der> <1> |../lib/ext|
<tankf33der> <2> |../lib/ext|
<tankf33der> : (ext:Snx "sss")
<tankf33der> !? (ext:Snx "sss")
<tankf33der> ext:Snx -- Undefined
<tankf33der> ?
<abu[m]> ok, so it fails on the last strcpy
<abu[m]> So let's look at Len
<abu[m]> It should be the length between the two '|'s
<abu[m]> Driving home now, bbl
<tankf33der> Afk
<tankf33der> back in 2h
<abu[m]> Back, now shower ;)
<tankf33der> Back
<abu[m]> Same
<tankf33der> ready
<tankf33der> doing
<tankf33der> rebuilding ~8mins
<abu[m]> ok
<tankf33der> : (ext:Snx "sss")
<tankf33der> # ../pil
<tankf33der> <1> |../lib/ext|
<tankf33der> 10 N
<tankf33der> <2> |../lib/ext|
<tankf33der> !? (ext:Snx "sss")
<tankf33der> ext:Snx -- Undefined
<abu[m]> 10 is correct
<abu[m]> I cannot see why the following strcpy could fail
<tankf33der> let me run this code without opt
<abu[m]> Q is correct, Len is correct, and ".so" is a constant string
<abu[m]> I think it cannot be opt, as it happens on different machines, right?
<tankf33der> right
<tankf33der> and works without opt in s390x, iirc
<tankf33der> s390x:~/pil21/src# ../pil
<tankf33der> <1> |../lib/ext|
<tankf33der> : (ext:Snx "sss")
<tankf33der> 10 N
<tankf33der> <2> |../lib/ext.so|
<tankf33der> -> "S"
<tankf33der> :
<abu[m]> Perfect
<abu[m]> And with opt it fails here too?
<tankf33der> yeap
<abu[m]> Strange, completely different architectures
<tankf33der> only s390x for today
<abu[m]> So LLVM has serious problems with optimization
<tankf33der> just with and without opt
<abu[m]> Shall we give up?
<tankf33der> sure. let's finish this.
<abu[m]> Good
<abu[m]> Thank you very much!
<tankf33der> see you.
<abu[m]> We did some improvements on align etc.
<abu[m]> cu ☺/