<aeth>
you can't just reinterpret an array of (unsigned-byte 8)s as (unsigned-byte 32)s or whatever because the type system protects you (and you can abuse CFFI for this, but it's probably slower because of the overhead)
<aeth>
so these bit operations are surprisingly necessary when dealing with binary formats
<aeth>
well, I guess if it's a file you can use file-position
* ldb
is dealing with CSI data which is not byte aligned and is big-endian
<gilberth>
The PDP-10 is word oriented but has byte pointers, pointing to a subrange of bits within a word. That's where LDB came from.
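A minimal sketch of the bit operations under discussion, assuming the data is already in a vector of octets. `second-octet` and `read-u32-be` are hypothetical helper names, not standard functions; only `ldb`, `byte`, `ash` and `logior` come from the language.

```lisp
;; (BYTE size position) names a bit field; LDB extracts it from an integer.
(defun second-octet (word)
  (ldb (byte 8 8) word))            ; bits 8..15 of WORD

(second-octet #xAABBCCDD)           ; => 204 (#xCC)

;; Hypothetical helper: assemble a big-endian (unsigned-byte 32)
;; from the four octets starting at index START.
(defun read-u32-be (octets start)
  (let ((value 0))
    (dotimes (i 4 value)
      (setf value (logior (ash value 8)
                          (aref octets (+ start i)))))))
```

Once the octets are combined into an integer, `ldb` can pull out fields that do not fall on octet boundaries, which is what non-byte-aligned formats need.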
<ldb>
PDP-10 word size is 37 bits
<ldb>
*36
<ldb>
so doing ASCII is weird
Inline has quit [Read error: Connection reset by peer]
<ldb>
you are free to make a choice to use 7bit or 8bit per byte, and other options are also available
<gilberth>
aeth: Endianness?
wbooze__ has joined #commonlisp
wbooze has quit [Ping timeout: 256 seconds]
<aeth>
gilberth: endianness is easy to reason about... if it looks wrong, you got it backwards, and nobody uses big endian anymore ;-p
<gilberth>
It's always an error to read some binary file format into memory and treat that as anything but a sequence of octets.
<ldb>
aeth: almost everything in networking is still big endian
<aeth>
unfortunately, besides floats (float-features) and string encodings, there's not much available to you out of the box afaik
wbooze__ is now known as wbooze
<aeth>
so you do have to reason in bits in CL (or find a smaller library I guess)
<aeth>
gilberth: SPIR-V has a safeguard for this, actually!
<aeth>
the magic number is intentionally long, 0x07230203... so you can use it to deduce endianness because the format is intended to be (unsigned-byte 32)s
<aeth>
and if you choose not to handle the niche formatting (I'm guessing big endian), you'll simply not recognize it as SPIR-V so things won't break, you just won't read it
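The magic-number check described here might look roughly like the following. This is a sketch only: `+spirv-magic+` and `spirv-endianness` are made-up names, and the input is assumed to be the first four octets of the file.

```lisp
(defconstant +spirv-magic+ #x07230203)

(defun spirv-endianness (octets)
  "Return :LITTLE, :BIG, or NIL for the first four octets of a file."
  (let ((le (logior (aref octets 0)
                    (ash (aref octets 1) 8)
                    (ash (aref octets 2) 16)
                    (ash (aref octets 3) 24)))
        (be (logior (ash (aref octets 0) 24)
                    (ash (aref octets 1) 16)
                    (ash (aref octets 2) 8)
                    (aref octets 3))))
    (cond ((= le +spirv-magic+) :little)
          ((= be +spirv-magic+) :big)
          ;; Neither order matches: not recognized as SPIR-V at all.
          (t nil))))
```

Because the magic number is not a palindrome at the octet level, exactly one of the two readings can match, and anything else is simply rejected.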
<gilberth>
As I said, a binary file is a sequence of octets. Treat it as such and you're fine.
<aeth>
in general, yes
<Bubblegumdrop>
Since you're on the topic, what if you know the data is ASCII?
<ixelp>
GitHub - gunnarmorling/1brc: 1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a tex [...]
<Bubblegumdrop>
the general idea seems to be spawn threads and mmap the file into segments
<Bubblegumdrop>
which I'm able to do
<ldb>
there is an extended bit so UTF-8 can be compatible
<Bubblegumdrop>
yeah it is utf8 data
<Bubblegumdrop>
actually...
<Bubblegumdrop>
@_@
<Bubblegumdrop>
I already ran into that
<Bubblegumdrop>
code-char doesn't work on utf8
<ldb>
or you can just say "it is 8-bit clean"
Inline has joined #commonlisp
<ldb>
I think UTF-32 is internally used in popular CLs
<gilberth>
Bubblegumdrop: How could it? Same story, when it's a sequence of octets, treat it like that. Decode it into your internal representation if needed.
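To make "decode it into your internal representation" concrete, here is a deliberately minimal UTF-8 decoder. It is a sketch under assumptions: `decode-utf-8` is a hypothetical name, the input is assumed to be well-formed (no validation), and the implementation is assumed to map `code-char` to Unicode code points, as SBCL and CCL do. In practice a library routine such as flexi-streams' `octets-to-string` is the safer choice.

```lisp
(defun decode-utf-8 (octets)
  "Decode a vector of well-formed UTF-8 octets into a string."
  (with-output-to-string (out)
    (let ((i 0)
          (n (length octets)))
      (loop while (< i n)
            do (let* ((b (aref octets i))
                      ;; Continuation octets implied by the lead octet.
                      (extra (cond ((< b #x80) 0)
                                   ((< b #xE0) 1)
                                   ((< b #xF0) 2)
                                   (t 3)))
                      ;; Payload bits carried by the lead octet itself.
                      (code (if (zerop extra)
                                b
                                (ldb (byte (- 6 extra) 0) b))))
                 ;; Each continuation octet contributes its low 6 bits.
                 (loop for k from 1 to extra
                       do (setf code
                                (logior (ash code 6)
                                        (ldb (byte 6 0)
                                             (aref octets (+ i k))))))
                 (write-char (code-char code) out)
                 (incf i (1+ extra)))))))
```

For example, the three octets `#xE4 #xB8 #x96` decode to the single character with code `#x4E16`.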
<Bubblegumdrop>
yeah that's what i'm working on right now
<gilberth>
ldb: But ABCL and ACL. No idea about Lispworks though.
<gilberth>
Sadly both are victim to early adoption of Unicode which once was thought to be 16-bit.
<aeth>
Windows as well
lispmacs[work] has quit [Remote host closed the connection]
<gilberth>
And it is extremely annoying, but I tend to ignore the issue and just pretend that char-code is a Unicode code point.
<aeth>
and iirc Java (so, yes, ABCL) and JavaScript (so if JSCL ever was complete)
<gilberth>
Windows is not a programming language.
<aeth>
well, Windows is not a programming language, but it has APIs
<ldb>
windows api has wide char variants
<aeth>
so you can think of it as presenting dialects of C and C++
<gilberth>
aeth: Yes, but still my code-char on Windows with SBCL and CCL is a Unicode code point and thus I don't care.
<aeth>
hopefully it doesn't mess with pathnames
<gilberth>
aeth: what difference does it make whether pathnames are encoded in UTF-8 or UTF-16 or whatever Windows happens to use? You're confusing external representation and internal representation again.
<gilberth>
Common Lisp is not C.
pfdietz has quit [Quit: Client closed]
son0p has joined #commonlisp
Alfr has quit [Remote host closed the connection]
green_ has quit [Ping timeout: 264 seconds]
Alfr has joined #commonlisp
amb007 has joined #commonlisp
rgherdt has quit [Quit: Leaving]
<Bubblegumdrop>
,(char-code #\世)
<ixelp>
(char-code #\世) => 19990
<Bubblegumdrop>
Hm.
ymir has quit [Ping timeout: 256 seconds]
<Bubblegumdrop>
,(code-char #x4e16)
<ixelp>
(code-char #x4e16) => #\U+4E16
<gilberth>
,(princ (code-char #x4e16))
<ixelp>
(princ (code-char #x4e16)) 世 => #\U+4E16
<gilberth>
Whatever that is.
<Bubblegumdrop>
my cffi:*default-foreign-encoding* is :utf8
<Bubblegumdrop>
presumably ixelp doesn't have cffi package loaded
<Bubblegumdrop>
,cffi:*default-foreign-encoding*
<ixelp>
cffi:*default-foreign-encoding* ERROR: There is no package named "CFFI" .
<Bubblegumdrop>
yea..
<gilberth>
No, why should it. And what has FFI to do with that?
amb007 has quit [Ping timeout: 260 seconds]
<Bubblegumdrop>
I'm using mmap to open a file with utf8 encoded text in it
<Bubblegumdrop>
So I'm using cffi:mem-aref to read it 1 byte at a time
<gilberth>
I'm no friend of such micro-optimizations. Have you run a benchmark on the speed difference compared to a READ-SEQUENCE approach? How many cycles do you save? When the data is still on disk even? Was the slowness of read(2) a show-stopper warranting such a specialized implementation?
<Bubblegumdrop>
the file is 13 GB
<Bubblegumdrop>
1 billion lines of text
<Bubblegumdrop>
the 1 billion row challenge.
pfdietz has joined #commonlisp
<Bubblegumdrop>
the challenge is to average/min/max the floating point values on every line, so you need to "read" the whole file
<gilberth>
Still. Try mmap(2)ing it once and then try with a reasonable buffer for read-sequence. Make sure you read data from disk. And what happens when I only have say 8GB RAM total?
<Bubblegumdrop>
All great questions to which I hope to learn the answer
<gilberth>
The floating point numbers are in textual representation?
<gilberth>
The description even says that the input data is fixed point.
<Bubblegumdrop>
yes
<gilberth>
It would have been funny if calculating the correct result needed more than 52 bits of precision and thus made every entry using doubles fail. :-)
<Bubblegumdrop>
There is a reference implementation, I'm not sure what data type they use
<gilberth>
Or the data would be skewed accordingly. I mean you can't represent 1/10 in binary FP.
<Bubblegumdrop>
yeah it's using Double.parseDouble
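Since the input is described as fixed point, one way to sidestep the 1/10 problem is to keep everything in integers. The sketch below assumes values with a decimal point and ASCII digits (for example `-12.3`), assumes an ASCII-compatible `char-code`, and uses a hypothetical name, `parse-tenths`; with one fractional digit the result is the value in tenths.

```lisp
(defun parse-tenths (octets start end)
  "Parse ASCII octets such as \"-12.3\" in [START, END) into the integer -123."
  (let ((negative (= (aref octets start) (char-code #\-)))
        (value 0))
    (loop for i from (if negative (1+ start) start) below end
          for b = (aref octets i)
          ;; Skip the decimal point; accumulate every digit.
          unless (= b (char-code #\.))
            do (setf value (+ (* value 10) (- b (char-code #\0)))))
    (if negative (- value) value)))

;; Rationals are exact, so sums of tenths stay exact:
(= (+ 1/10 2/10) 3/10)   ; => T
;; A double cannot hold 1/10 exactly:
(= 0.1d0 1/10)           ; => NIL
```

Sums, minimums and maximums can then be kept as integers and divided by 10 only when printing.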
<Bubblegumdrop>
baseline takes 5 min to run
<Bubblegumdrop>
the fastest submission is <3 seconds
<Bubblegumdrop>
1.5 seconds
<Bubblegumdrop>
pretty neat
<Bubblegumdrop>
presumably on the test box the file is in tmpfs
<gilberth>
I was about to say: That is faster than my disk can deliver the data.
<Bubblegumdrop>
it doesn't say in the readme, but they do claim to be running on an EPYC 7502P with 128 GB RAM
<Bubblegumdrop>
So presumably it's all in tmpfs.
<gilberth>
In practice though data has to come from somewhere. When I/O becomes a bottleneck, you're fast enough.
<Bubblegumdrop>
I used to dream about the days you could boot your entire OS from tmpfs. the reality is almost here.
<Bubblegumdrop>
well, it's been here, but I'm talking like full blown desktop OS
<Bubblegumdrop>
Very cool stuff.
<gilberth>
Bubblegumdrop: Use core memory.
<Bubblegumdrop>
mount /dev/cache/L1 /tmp
<Bubblegumdrop>
zoom
<gilberth>
No need to boot, just power on.
<Bubblegumdrop>
One can dream.
<gilberth>
Actually FRAM exists.
<Bubblegumdrop>
yeah I was actually just going to say, I believe I've seen systems like that, I don't know any off the top of my head though..
<gilberth>
Which is basically the same thing on a chip. It's quite fast even.
Alfr has quit [Ping timeout: 256 seconds]
<Bubblegumdrop>
I read a cool blog post about somebody's big fancy FPGA camera with 64 GB RAM, they double buffer the entire FPGA memory to the RAM then dump that to a *second* stick of RAM, kinda like boehm-gc style
amb007 has quit [Read error: Connection reset by peer]
amb007 has joined #commonlisp
<beach>
(= <something> 0) is better expressed as (ZEROP <something>).
<Bubblegumdrop>
Hm, actually, I don't even need to use flex:octets-to-string either
<Bubblegumdrop>
I could just sum up the bytes and use that as the key...
<Bubblegumdrop>
Neato. This is a fun one.
amb007 has quit [Ping timeout: 268 seconds]
amb007 has joined #commonlisp
bendersteed has quit [Quit: bendersteed]
msavoritias has quit [Ping timeout: 255 seconds]
<beach>
If I were you, I would not use those DEBUG settings, nor those other declarations until I got my algorithms right, and then only if 1. There is a performance problem, and 2. They actually help performance.
<beach>
You are asking for trouble the way it is now.
<paulapatience>
Bubblegumdrop: Even if you don't care about the Lisp parameter, one day you will start to care and forget that you used nreverse. Then you will be annoyed when you discover the reason for your bug.
<paulapatience>
s/Lisp/list/
<younder>
lessons learned today.. #+or and #-t will make a expression no be evaluated. ((or) => nil)
<younder>
Also WITH var = (expr) does not work the way i thing. puttig it after for does not make it evaluate for each iteration. use let or let*
<yitzi>
younder: Use `for x = (bla)` for evaluation on every iteration
<yitzi>
or `for x = (initial) then (following)`
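A small illustration of the difference, using a counter so the number of evaluations is visible. `loop-with-vs-for` is just a hypothetical wrapper name for the demonstration.

```lisp
(defun loop-with-vs-for ()
  (let ((counter 0))
    (flet ((next () (incf counter)))
      (list
       ;; WITH: (next) is evaluated once, before the first iteration.
       (loop with x = (next)
             for i below 3
             collect x)
       ;; FOR ... =: (next) is evaluated again on every iteration.
       (loop for i below 3
             for x = (next)
             collect x)))))

(loop-with-vs-for)   ; => ((1 1 1) (2 3 4))

;; FOR ... = ... THEN: the first form runs once, the THEN form afterwards.
(loop for i below 4
      for x = 1 then (* 2 x)
      collect x)     ; => (1 2 4 8)
```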
<younder>
My code now works again after the last refactoring. Unfortunately my spelling has not improved I see.
pfdietz has joined #commonlisp
<younder>
thx yitzi I'll try to remember that
<yitzi>
`for x =` is essentially CL:DO
X-Scale has joined #commonlisp
awlygj has quit [Ping timeout: 255 seconds]
green_ has joined #commonlisp
cmack has joined #commonlisp
zetef has joined #commonlisp
<paulapatience>
younder: #+(or) and #+(and)
<paulapatience>
nil and t could theoretically be defined as features
<paulapatience>
#+(and) being #+t basically
<younder>
good point. Still it would be an epically bad idea for a feature.
<paulapatience>
Indeed
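For reference, the four combinations behave as follows. `(or)` with no features is always false and `(and)` with no features is always true, regardless of what is in `*features*`. The function name is only there to give the forms a home.

```lisp
(defun reader-conditional-demo ()
  (list #+(or)  :never-read          ; skipped: (or) is false
        #+(and) :always-read         ; read:    (and) is true
        #-(or)  :also-always-read    ; read:    (or) is false, #- inverts
        #-(and) :also-never-read))   ; skipped: (and) is true, #- inverts

(reader-conditional-demo)   ; => (:ALWAYS-READ :ALSO-ALWAYS-READ)
```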
<beach>
The typical example given is "New Implementation of Lisp" also known as NIL.
<younder>
Seems Jonathan Rees worked on NIL Lisp and then went on to create T Scheme. I guess some people never learn right from wrong ;)
rtypo has joined #commonlisp
traidare has joined #commonlisp
<Equill>
Ha!
varjag has quit [Quit: ERC (IRC client for Emacs 27.1)]
traidare has quit [Ping timeout: 264 seconds]
danse-nr3 has joined #commonlisp
a51 has quit [Quit: WeeChat 4.2.1]
<younder>
My hunchentoot application mostly works fine. But suddenly for no apparent reason (to me) it starts sending the session-id as a GET parameter and everything goes boom. The only thing that works is restarting the app. Any idea why this happens?
<younder>
The application relies heavily on cookies to transfer state and data between pages.
<ixelp>
Hunchentoot - The Common Lisp web server formerly known as TBNL
<Equill>
"Once a request handler has called START-SESSION, Hunchentoot uses either cookies or (if the client doesn't send the cookies back) rewrites URLs to keep track of this client"
amb007 has quit [Read error: Connection reset by peer]
<ixelp>
Generic Equality and Comparison for Common Lisp
<lispmacs[work]>
okay, thank you
zetef has joined #commonlisp
<yitzi>
The other CDR docs are also at the Zenodo host, btw
green__ has quit [Ping timeout: 268 seconds]
tyson2 has joined #commonlisp
yitzi has quit [Remote host closed the connection]
<lispmacs[work]>
so, without reading the whole 12 page CDR document, and just looking at the source code, it appears you just need to implement compare
<lispmacs[work]>
hmm, no, those are not defined in terms of compare
<lispmacs[work]>
guess I'll need to read it
<ldb>
12 page is nothing
chomwitt has quit [Ping timeout: 255 seconds]
green__ has joined #commonlisp
thuna` has joined #commonlisp
shka has quit [Ping timeout: 246 seconds]
zetef has quit [Ping timeout: 240 seconds]
zetef has joined #commonlisp
<lispmacs[work]>
just one of those scenarios where a guy with 30 minutes of spare time per day wants to read a 12 page document to solve a 5 minute programming problem
<lispmacs[work]>
I was only interested in the library because I thought it would make defining some comparison operators easier for my data structure
<ldb>
then you are probably disappointed
<lispmacs[work]>
so far, yes
<ldb>
to me it is too common to spend a day or two then find out no actual progress was made
zetef has quit [Remote host closed the connection]
<ldb>
at least you have utilized your spare time, hey
alcor has quit [Remote host closed the connection]
<lispmacs[work]>
In the past, I seem to recall I used some (Haskell) library where you only needed to implement >= and then your object was fully sortable
tyson2 has quit [Remote host closed the connection]
ymir has quit [Ping timeout: 252 seconds]
dra has quit [Ping timeout: 264 seconds]
ymir has joined #commonlisp
amb007 has quit [Read error: Connection reset by peer]
attila_lendvai has quit [Ping timeout: 255 seconds]
amb007 has joined #commonlisp
<lispmacs[work]>
ldb: that would be I think exactly what I was looking for if it also gave you all the other comparison operators for free
<Alfr>
lispmacs[work], make a macro.
<lispmacs[work]>
Alfr: yes, yes, just checking if the wheel has already been invented
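A sketch of what such a macro could look like, deriving the remaining operators from a single less-than predicate. Everything here is hypothetical and for illustration only: the macro name `define-comparisons`, the `version` structure, and the choice of a strict `<` as the primitive (a total order is assumed).

```lisp
(defmacro define-comparisons (prefix less)
  "Define PREFIX>, PREFIX<=, PREFIX>= and PREFIX= using the function LESS."
  (flet ((name (suffix)
           (intern (format nil "~A~A" prefix suffix))))
    `(progn
       (defun ,(name ">")  (a b) (,less b a))
       (defun ,(name "<=") (a b) (not (,less b a)))
       (defun ,(name ">=") (a b) (not (,less a b)))
       (defun ,(name "=")  (a b) (not (or (,less a b) (,less b a)))))))

;; Example data structure with a hand-written less-than.
(defstruct version major minor)

(defun version< (a b)
  (or (< (version-major a) (version-major b))
      (and (= (version-major a) (version-major b))
           (< (version-minor a) (version-minor b)))))

;; Expands into definitions of VERSION>, VERSION<=, VERSION>= and VERSION=.
(define-comparisons version version<)
```

With that in place, `(sort list #'version<)` works as before and the derived predicates come along for free.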
<Kingsy>
could someone take a look at something for me -> https://bpa.st/FH6Q <- how is it possible that this function returns LAST, BUT the repl also contains THEY ARE THE SAME... ? that doesn't make any sense... the if is doing the same as what the case should be? no?
<ixelp>
View paste FH6Q
<Mon_Ouie>
I don't think case does what you want
<Mon_Ouie>
(case pi ((pi) "pi matches itself?") (t "it does not"))
Odin-LAP has quit [Ping timeout: 255 seconds]
<Kingsy>
hrm ok, so if I add #. in front of the case statements it works.. why?
<Alfr>
Kingsy, that's likely not what you want.
<Alfr>
Kingsy, that'd evaluate the (case ..) at read time.
<Alfr>
Kingsy, what you're seeing is, that the keys for normal-clauses are not evaluated.
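A self-contained illustration of that point. The paste itself is not reproduced here, so the variable and function names below are made up for the example.

```lisp
(defparameter *answer* 42)

(defun classify-with-case (x)
  ;; The key list (*answer*) contains the SYMBOL *ANSWER*,
  ;; not the value 42: CASE keys are literal objects.
  (case x
    ((*answer*) :matched)
    (t :no-match)))

(defun classify-with-cond (x)
  ;; COND evaluates its test forms, so the variable's value is used.
  (cond ((eql x *answer*) :matched)
        (t :no-match)))

(classify-with-case 42)          ; => :NO-MATCH
(classify-with-case '*answer*)   ; => :MATCHED
(classify-with-cond 42)          ; => :MATCHED
```

When the keys have to come from variables, `cond` with `eql` is the usual replacement for `case`.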