beneroth changed the topic of #picolisp to: PicoLisp language | The scalpel of software development | Channel Log: | Check for more information
avocadoist has quit [Ping timeout: 255 seconds]
rob_w has joined #picolisp
razzy has joined #picolisp
razzy has quit [Ping timeout: 255 seconds]
razzy has joined #picolisp
<razzy> is everything ok?
<abu[m]> I'm ok, and I hope everybody here too!
<fbytez> If anyone is interested, this is the kind of thing I was looking to be able do:
<fbytez> That is, just be *able* to impose limits.
<abu[m]> Hi fbytez! Nice!
<fbytez> Hi!
<abu[m]> So if a line is too long, you abort the transaction
<abu[m]> assuming that you exit the child process after the (quit)
<abu[m]> or close the connection
<fbytez> As is, yeah. It's not really the point as such though; more just being able to impose limits and then handle "errors" however you decide to do so.
<abu[m]> Yes, looks good
<beneroth> fbytez, yeah nice code. Can be used to sanitize arbitrary input, nice!
rob_w has quit [Remote host closed the connection]
chexum_ has joined #picolisp
chexum has quit [Ping timeout: 255 seconds]
razzy has quit [Ping timeout: 248 seconds]
razzy has joined #picolisp
seninha has quit [Quit: Leaving]
razzy has quit [Ping timeout: 260 seconds]
razzy has joined #picolisp
<fbytez> I was just think how nice some Haskell laziness for this sort of thing would be. Like: `(take Limit (till "\n"))`
<fbytez> *thinking
<abu[m]> Yeah, there is no lazy list processing in Pil
<abu[m]> But in this case (head Limit (till ...)) is fine too.
<abu[m]> as the list is not infinite
<fbytez> It potentially is, though.
<abu[m]> So you can easily simulate it in a loop with (char) calls
<abu[m]> Lazy constructs just hide this
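A minimal sketch of the kind of loop abu[m] describes, assuming the data arrives on the current input channel. The name `limitedLine` and the error text are hypothetical and not taken from fbytez's code. It relies on the normal stream functions, so their single-character look-ahead still applies.

```picolisp
# Hypothetical helper, not fbytez's code: read one line from the
# current input channel, giving up once it exceeds 'Limit' characters.
(de limitedLine (Limit)
   (let Cnt 0
      (make
         (until (member (peek) '(NIL "^J"))  # Stop at EOF or newline
            (when (> (inc 'Cnt) Limit)
               (quit "Line too long") )
            (link (char)) )
         (char) ) ) )  # Consume the newline itself
```

It could be called as (in "request.txt" (limitedLine 1024)), where the file name is only a placeholder. The result is a list of single characters, similar to what (line) returns.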
<fbytez> Well, that just goes back to why I wrote that example code.
<fbytez> In testing, (char)--as opposed to (rd)--didn't work for me because it hung trying to read more data. I don't get why, though.
<abu[m]> Strange
<abu[m]> (char) would hang if a multi-byte char is not completed
<fbytez> I had something like (and (= "\r" (char)) (= "\n" (char))).
<fbytez> Struggling to remember clearly now. I think I'll have to test it again.
<abu[m]> We could also pick up beneroth's proposal and use 'input'
<abu[m]> A bit different from what you do
<abu[m]> It does not limit individual lines, but the total size
<abu[m]> Minor correction: Should be (when (> (inc 'Cnt) Limit)
seninha has joined #picolisp
clacke has quit [Read error: Connection reset by peer]
razzy has quit [Ping timeout: 246 seconds]
razzy has joined #picolisp
<fbytez> `(char)` definitely hangs:
<fbytez> My socket input is coming via stdin, hence `(in NIL`. If `nextChar` is swapped with `char` it hangs.
<abu[m]> '(in NIL ' is fine, but you should put it outside the input expression: (in NIL (input (case ...
<abu[m]> otherwise the channel is switched for each char
<fbytez> Right.
<abu[m]> Where exactly does it hang?
<fbytez> I don't know actually, because I don't see any of the output at all.
<fbytez> I'll try something else...
<abu[m]> Hmm, (rd 1) does a binary read. It does not use the normal I/O channels
<abu[m]> It calls read(2) directly iirc
<abu[m]> So the input stream buffering is bypassed
<abu[m]> I still see no reason here to use (rd 1) instead of (char) directly
<abu[m]> rd calls getBinary() which calls slow()
<abu[m]> in @src/io.l
<abu[m]> But I'm not sure
<abu[m]> rd should work unless it is mixed with stream calls like char, line, read etc
<fbytez> It hangs until a second blank is read.
<fbytez> (when using `char`)
<abu[m]> Can you try if it also hangs if you call (char) instead of (nextChar)?
<fbytez> *blank line
<fbytez> `nextChar` is what works correctly.
<abu[m]> I see
<abu[m]> Do you (flush) on the sending side?
<abu[m]> I understand why nextChar hangs less
<abu[m]> stream input does a single char lookahead
<fbytez> Whatever the browsers and nc do, which presumably is, yes, it's flushed.
<abu[m]> But this is not a problem if the sender properly flushes its output
<fbytez> This is nc, w3m and firefox.
<abu[m]> The input to (in NIL ?
<abu[m]> Try (in "file" ...
<fbytez> a dup2 server socket
<fbytez> file input is different because it closes after whatever is sent.
<fbytez> Oh no, I'm thinking of when I piped or redirected file input.
<fbytez> Not sure about (in "file").
<fbytez> The difference is that the socket stays open.
<abu[m]> This looks a lot like a flush problem to me
<abu[m]> if the receiving part hangs
<fbytez> `(in "file"` is fine.
<abu[m]> Because 'in' closes the file and flushes the stream
<fbytez> Although Firefox doesn't seem to like it when its headers haven't been read.
<fbytez> A file doesn't work the same as a socket anyway though; the client isn't closing its end--waiting for a response.
<abu[m]> This is fine. But if the sender of data does not flush, the receiver cannot finish
<abu[m]> The normal pil http stuff works fine with all kinds of clients
<fbytez> Kind of besides the point though because the clients are firefox, w3m and nc, so doing the "right thing".
<fbytez> stdin is just a dup2 of the accepted socket, that's all.
rob_w has joined #picolisp
<abu[m]> Is dup2 needed? Why not (in Socket ?
<fbytez> It's a separate process.
<abu[m]> ok
<abu[m]> you could strace the involved processes
<abu[m]> look at write() in the sender and read() in the receiver
<fbytez> With `char`, there's always an extra `read()`.
<fbytez> read(0, "GET / HTTP/1.0\r\nUser-Agent: Mozi"..., 4096) = 312
<abu[m]> Not necessarily. But it does a single-byte look-ahead
<fbytez> That's ALL the headers there, the full 312 bytes. nothing prints until after it does this...
<fbytez> read(0, "", 4096)
<fbytez> = 0
<abu[m]> this is EOF
<fbytez> Yeah, but the read() is completely pointless, causing the hang. All the data has been read in the previous `read()`.
<abu[m]> (char) returns NIL on EOF
<fbytez> You don't seem to understand.
<abu[m]> No, it is not pointless
<abu[m]> It detects EOF
<abu[m]> as it returns 0, it did not hang
<fbytez> It was hung until then.
<abu[m]> and the calling (char) returns NIL
<fbytez> Both read()s had to happen before any line processing was done.
<abu[m]> The sender must either close or flush
<abu[m]> can you try $ cat file | myProg
<abu[m]> It must have to do with the sender
<fbytez> With `nextChar` ....
<fbytez> read(3, "#!/usr/bin/env pil\n\n(de http_rea"..., 4096) = 841
<fbytez> read(0, "GET / HTTP/1.0\r\nUser-Agent: Mozi"..., 4096) = 312
<fbytez> close(3) = 0
<fbytez> write(1, "HTTP/1.0 200 OK\r\nContent-type: t"..., 347) = 347
<fbytez> With `char`...
<fbytez> read(3, "#!/usr/bin/env pil\n\n(de http_rea"..., 4096) = 841
<fbytez> read(0, "GET / HTTP/1.0\r\nUser-Agent: Mozi"..., 4096) = 312
<fbytez> close(3) = 0
<fbytez> write(1, "HTTP/1.0 200 OK\r\nContent-type: t"..., 347) = 347
<fbytez> read(0, "", 4096) = 0
<abu[m]> Stream input like (char) does a look-ahead
<abu[m]> That's why the additional read()
<abu[m]> But then it should return NIL
<abu[m]> (while (line T) needs NIL
<fbytez> When the client has sent the headers in a GET, there's no more data and no closing of the socket. So `char` is incapable of being used for something like this.
<abu[m]> But (char) returns NIL only on EOF
<abu[m]> wrong what I said
<abu[m]> yes, (char) returns NIL, but this is not the point.
<abu[m]> (line) needs an empty line
<abu[m]> 'char' is *surely* capable of doing this
<fbytez> No, the problem is that the char tries to read beyond the final "\n".
<abu[m]> I have hundreds of use cases, and never use 'rd' for this
<fbytez> I'm laying it out here in black and white.
<abu[m]> Why don't you use the built-in @lib/http.l ?
<fbytez> When reading the final "\n", (char) triggers another 'read()', causing the hang.
<fbytez> You don't care about this behaviour?
<abu[m]> The built-in stuff uses chunked transfers
<abu[m]> I don't observe such hangs ☺
<fbytez> So you don't care.
<fbytez> Which is a different issue.
<abu[m]> I do not understand your setup
<fbytez> Irrelevant.
<abu[m]> Why not connect the browser directly to a socket in Pil as the standard server does?
<abu[m]> Works perfectly since 25 years
* fbytez sighs
<fbytez> don't know why I bother.
<fbytez> Like talking to a wall.
<abu[m]> It is your idea of limiting headers
<abu[m]> This is barking up the wrong tree
<abu[m]> Headers are not the problem, it is the whole transaction
<abu[m]> The payload may be huge
<abu[m]> But anyway, the logic worked as we see with a single file
<abu[m]> If you want to do multiple transactions over a persistent connection, you need to use chunked transfer
<fbytez> No, the logic didn't work.
<fbytez> "Works perfectly since 25 years".... OK...
<fbytez> Even something as simple as `hex` was broken until I got you to fix it via mailing list.
<fbytez> But, you clearly don't care about correctness. I'll waste my time further.
<fbytez> *not waste
<abu[m]> I don't remember well. It was not broken but did no error check, no?
<abu[m]> How do you define correctness?
<abu[m]> Was the hex issue 15jul22 ?
<fbytez> Sounds right.
<abu[m]> It was correct, just did no error checking
<abu[m]> This is always a tradeoff
<abu[m]> speed vs checking
<abu[m]> has nothing to do with correctness here
<fbytez> Sure...
fbytez has left #picolisp [Leaving]
<abu[m]> you too a nice evening! ;)
<beneroth> oh boys
<beneroth> I just entered now
<beneroth> well I just read now
JITn is now known as DKordic
<beneroth> hi DKordic
<DKordic> Hi beneroth
<abu[m]> Yeah, kind of sick discussion
<DKordic> hi abu[san]
<abu[m]> Hi DKordic )
<DKordic> dbytez was in /imperative mood/ :3 .
<DKordic> s/db/fb/
<beneroth> ah don't annoy yourself, abu[m]. he mentioned he is a haskell coder, they usually care more about theoretical correctness than practicability and are sensitive about it ;-)
<beneroth> DKordic, haha
<DKordic> I don't know what to say... I apologize in his place.
<abu[m]> beneroth: T DKordic No worry!
<DKordic> I understand his frustration...
<abu[m]> OK, yes, me too
<abu[m]> In any case I would not recommend reading such a stream with (rd 1)
<abu[m]> (char) looks ahead, but this is not a problem if the client sends a stream of transactions and then closes
<abu[m]> I don't understand his socket setup
<abu[m]> Anyway, no worries ;)
<beneroth> I think I managed to have issues with (char) and reading from (pipe) (other executables)... but it always was because I did mistakes in calculation how much I'm supposed to read. Or with the executable being a client and slow depending on the server and request, then I use (poll) to first check if there is something to read.
<beneroth> abu[m], I guess he has a client which sends a certain number of bytes but doesn't close.
<beneroth> of course you cannot use byte numbers with (char), as (char) might be multibyte
<beneroth> that is a mistake likely to be made
<beneroth> dunno if its the case or not
<abu[m]> He says it is firefox, w3m etc, but connects to some separate process and then via a pipe to pil
<beneroth> yeah no idea
<abu[m]> I think byte counts are not the issue
<beneroth> I can definitely say that firefox doesn't care if you don't read all of the clients HTTP headers
<beneroth> as I made my webserver that way.. receiving request, only read headers if needed, maybe not at all, maybe just GET and the URL to decide what to respond
<abu[m]> Let's see if he ever returns here. I cannot help atm
<beneroth> maybe he thought in his code example at that he can read from the stdout he previously printed to?
<beneroth> I mean it doesn't hang after reading anything. it hangs because there is nothing to read
<abu[m]> Yes. The problem is the pipe does not close after the first connection, but the next does not come
<beneroth> not sure there is any connection. what purpose is the (prin) on the top-level?
<abu[m]> I don't know why. Firefox may send several transactions over a single connection, but that's fine
<beneroth> did you look at the example abu[m] ?
<beneroth> there are no sockets involved
<abu[m]> It is the reply
<beneroth> just writing to stdout, then reading from stdin
<abu[m]> Yes, the socket is in a separate C program which connects by a pipe to stdin
<beneroth> <fbytez> It potentially is, though.
<beneroth> well nothing real can be infinite in this universe, as far as we can see :)
<beneroth> abu[m], where his code for that? the code he posted makes no sense
<abu[m]> We don't have the full code, but he described it this way
<beneroth> ah yes I see
<abu[m]> (rd 1) works for him cause it does no look-ahead
<abu[m]> But a look-ahead is fine in a normal setup
<beneroth> well define normal setup
<beneroth> he is right
<abu[m]> It hangs after the first transaction, but the second is not sent yet. That's how I understood it
<beneroth> you are also right, it's the look-ahead which had him blocking
<beneroth> yes
<beneroth> exactly
<beneroth> so no EOF
<abu[m]> yep
<beneroth> nothing sent
<beneroth> so hanging
<beneroth> all as expected
<beneroth> though
<beneroth> I mean he didn't use (char), he used (char 'num)
<beneroth> so it's not the char
<beneroth> ah
<beneroth> its the (line T) after the last line which hangs
<abu[m]> (rd 1) he uses
<beneroth> because, there is nothing coming anymore
<beneroth> yes
<beneroth> but thats in the inner level
<beneroth> on the outer level is (while (line T) ...)
<beneroth> so yes, that hangs
<beneroth> what else should it do?
<beneroth> ah
<beneroth> ok its supposed to call the inner part, which is then supposed to (quit)
<abu[m]> line would return NIL for empty line at end of header
<beneroth> yes, first loop is ok, I'd say
<abu[m]> quit is only in error case
<beneroth> he has explicit quits
<abu[m]> At the end of headers is two returns/newlines
<abu[m]> so NIL
<abu[m]> I think it should work. No idea what exactly goes wrong
<abu[m]> Even if the last return is missing, there is the newline and 'line' returns
<abu[m]> Nevermind, not my beer
<beneroth> well see the (case (nextChar))
<beneroth> so wouldn't it (quit "CR without EOL") on the very first \r\n pair?
<abu[m]> yes, this works
<abu[m]> I said to use (char) instead of (nextChar)
<beneroth> aaaah
<beneroth> yeah well then it hangs during reading of the last \n in \r\n\r\n
<abu[m]> (nextChar) seems to work
<abu[m]> cause of rd
<abu[m]> yes, but this should work
<beneroth> no
<beneroth> there is no EOF after \r\n\r\n in his setup
<abu[m]> And it always works if Firefox connects directly to Pil
<abu[m]> (line) returns on "\n"
<abu[m]> eof is not needed
<beneroth> yes, but line doesn't use (char) and look-ahead, does it?
<abu[m]> only after the last transaction
<abu[m]> line does the same look ahead
<abu[m]> char, line, till, read *all* use look-ahead
<abu[m]> the same internal stream
<beneroth> well last transaction.. you expect a new one after the first one
<beneroth> right
<abu[m]> 'rd' is separate
<beneroth> so if the sender doesn't send nor EOF after \r\n\r\n it hangs, right?
<beneroth> as designed, not wrong per se
<abu[m]> I think 'line' detects \n, then looks ahead for \r
<abu[m]> So it should work
<beneroth> the issue you discussed about is not really the code or pil (even if you both thought so), but the expectations how the pil text reading functions work
<abu[m]> and it *does* work with direct browser connection
<beneroth> you are right
<beneroth> (in (accept (port 8080)) (while (line T) (println @)]
<beneroth> second terminal, telnet localhost 8080
<abu[m]> good test
<beneroth> (in (accept (port 8080)) (while (char) (println @)]
<beneroth> hangs
<beneroth> because of the look-ahead
<abu[m]> Also without look-ahead
<beneroth> might be telnet not flushing
<abu[m]> (char) returns NIL only on EOF
<abu[m]> This is not a look ahead or flush problem
<beneroth> ah right
<beneroth> yes of course
<beneroth> yeah
<abu[m]> You must close
<beneroth> that's the same char problem I stumbled on
<beneroth> but yeah
<beneroth> its correct
<abu[m]> ok
<beneroth> use (and (poll Socket) (char))
<beneroth> :P
<abu[m]> this is also possible, yes
<abu[m]> also a kind of look ahead
<beneroth> yes, just without consuming
<beneroth> while (peek) is consuming from input, but not within pil
<beneroth> so you
<beneroth> are right
<beneroth> he/she didn't listen or not understand/process what you wrote :)
<abu[m]> (peek) just *looks* at the next char, while (char) consumes it and fills the look-ahead
<beneroth> yes, put (peek) on a pipe which has (poll ...) -> NIL also hangs, right? :)
<beneroth> that is what I meant
<abu[m]> I think he/she understood, but we all don't understand why it hangs
<abu[m]> yes (poll) checks the fd for being ready
<beneroth> yup
<beneroth> well we don't know about the sender, so the issue might well be there
<beneroth> <fbytez> When reading the final "\n", (char) triggers another 'read()', causing the hang.
<beneroth> <fbytez> You don't care about this behaviour?
<abu[m]> and if the sender is a browser, we don't know what goes wrong in between
<beneroth> so understood the issue, but didn't understand that it's supposed to work that way as (char) is designed exactly that way on purpose?
<abu[m]> Not sure if (char) hangs or (line)
<abu[m]> hmm, perhaps it is complicated because of the 'input' in between
<beneroth> (line) doesn't as I just tested. but (char) does if there is no more content to look ahead.
<abu[m]> So this is perhaps the problem
<beneroth> and I think that is correct. it's wrong to use (char) to consume the last char of a blocked connection
<abu[m]> I never used 'input' in such a case
<beneroth> maybe?
<abu[m]> Maybe the input adds another buffering?
<abu[m]> So that might explain it
<beneroth> you mean, we have two levels of char look-ahead buffers because of (input) ?
<abu[m]> (line) got \n, looks at \r and fetches it
<abu[m]> So 'input' does not stop yet
<beneroth> T
<abu[m]> and asks for *yet* another char
<beneroth> true!
<beneroth> yes
<beneroth> so the input char is doing another look-ahead
<abu[m]> yes
<abu[m]> Works when EOF follows
<beneroth> which causes the "using (char) to read the final \n" which is not correct on a blocking connection (one where (poll) returns NIL)
<beneroth> yeah
<abu[m]> but not if the connection stays open but sleeps
<beneroth> I think you solved it. makes perfect sense
<abu[m]> Seems so
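One way to observe the layered look-ahead discussed above is to count how many characters 'input' requests from its source while 'line' reads a single line. This is only a sketch, assuming pil21's (input exe . prg), where exe is evaluated each time a character is needed. The names L and N are chosen for illustration.

```picolisp
# Count how often 'input' asks its source for a character while
# 'line' reads one line. 'L' and 'N' are illustration-only names.
(let (L (chop "ab^J")  N 0)
   (input (prog (inc 'N) (++ L))
      (line T) )
   (println N) )
```

If the printed count is larger than the three characters of the line, the extra requests are look-ahead. With a list as the source they simply yield NIL, but on a connection that stays open without sending more data, such a request would block.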
<beneroth> well
<beneroth> don't be angry at people who first have to do multiple attempts finding bugs in pil before finding out that its usually them who is wrong. even more so if they already found real bugs beforehand.
<abu[m]> This whole idea of limiting the traffic at this level is not a good idea
<beneroth> I was the same for a while, I guess? maybe less dismissing, I hope
<abu[m]> Well, 'input' was your and mine proposal ;)
<beneroth> people are used to find the error in the software and not themselves, and unfortunately its often correct with most software out there
<beneroth> yeah
<beneroth> but I haven't used it yet tbh :D
<beneroth> I'm speaking about the probability of finding a real bug in pil is lower than one would think. once in a while you can find one, but in nearly all cases its wrong thinking what happens or wrong expectation what should happen.
<beneroth> the error was overlooking the layered calls to (char) and therefore multiple look-ahead buffers
<beneroth> and instead thinking its a bug in (char) when it is not
<beneroth> good training for me, thanks for continuing the discussion anyway abu[m] :)
<abu[m]> yes, true
<beneroth> <abu[m]> This whole idea of limiting the traffic at this level is not a good idea
<abu[m]> (real bugs)
<beneroth> well happens with long-running connections. more prone to off-by-one errors.
<beneroth> abu[m], yeah, you did and do a really good job. not just the design but also the implementation.
<beneroth> both can contain bugs
<abu[m]> I mean I would not know which limit to choose
<beneroth> training with a scalpel can produce some cuts now and then ;)
<abu[m]> The header is not the problem
<beneroth> yeah true
<abu[m]> We can catch an oversized header, but not oversized data
<abu[m]> I *want* to allow large data
<beneroth> exactly
<abu[m]> So if someone sends a huge transaction, it will be killed eventually
<beneroth> maybe monitor memory usage, if its over a certain threshold, kill the connection which consumes the most. but then again.. can be completely the wrong approach in a real world setting and make things worse.
<beneroth> it depends
<abu[m]> Not nice, but no showstopper
<beneroth> if you know you only need to expect small messages, whatever that is, then you can configure it and use as limit. but highly depends on the specific application
<abu[m]> right
<abu[m]> And attackers can also harm by sending many very small transactions
<beneroth> then again you probably should also add some alert/monitoring system to the limits, and inform someone if the limits are constantly reached. because then either your assumptions are now wrong, or you are under active attack.
<beneroth> letting memory run full and collapse has the same effect of sending an alert (hopefully), so whats the difference? :P
<abu[m]> There are many possibilities
<abu[m]> In any case, I would first harden httpGate
<abu[m]> both to detect multiple connections and to limit data size
<beneroth> if things go out of reasonable expectations, better crash, loudly. that will force someone to fix it. better than hiding weird incidents and then it looks normal, and when it crashes the issue is standard for a long time already
<abu[m]> yep :)
<beneroth> yeah I agree. httpGate is the first line of contact, so if you want to detect malicious clients, you want to detect and stop them as early as possible
<abu[m]> ok, I stop for today
<beneroth> good
<beneroth> recover your zen
<beneroth> have a good night :)
<beneroth> thanks for the talk!
<abu[m]> the cat is hungry
<abu[m]> and noisy
<abu[m]> Thank you too!
<beneroth> ok, recover the cat zen :)
<abu[m]> Good night! ☺
<abu[m]> T
<beneroth> always a pleasure :)
rob_w has quit [Read error: Connection reset by peer]