freakazoid333 has quit [Ping timeout: 260 seconds]
<dh`>
muurkha: be careful of the term "lock-free"
<dh`>
it doesn't mean "doesn't use locks", it's something more like "doesn't deadlock", except that there are apparently some unclear side conditions that amount to "never waits unless I say it's ok" :-/
<dh`>
(the rest of what you said sounds reasonable)
seninha has quit [Quit: Leaving]
<muurkha>
dh`: right, I meant specifically that at least one thread is always making progress (as opposed to "wait-free")
<muurkha>
which wouldn't be the case in the round-robin preemption case I described
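As an aside on the lock-free versus wait-free distinction above, here is a minimal C11 sketch of a lock-free counter. The names `counter` and `lockfree_increment` are made up for illustration and do not come from the discussion. When the loop retries, some other thread's update has landed in between, so the system as a whole keeps making progress, but no single thread has a bound on how often it retries, which is why this is lock-free and not wait-free.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared counter, updated without any locks. */
static atomic_int counter = 0;

/* Lock-free increment using a compare-and-swap retry loop.
 * If the CAS fails, another thread changed `counter` after we read it,
 * which means that other thread made progress. This thread simply retries,
 * with no upper bound on the number of attempts (so it is not wait-free). */
static int lockfree_increment(void)
{
    int seen = atomic_load(&counter);
    /* On failure, atomic_compare_exchange_strong stores the current value
     * of `counter` into `seen`, so the next attempt uses fresh data. */
    while (!atomic_compare_exchange_strong(&counter, &seen, seen + 1)) {
        /* retry */
    }
    return seen + 1;
}
```

Called from a single thread, each call returns the next integer; under contention the return values stay unique but an individual caller may loop several times.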
<muurkha>
I might actually implement the lock-free transaction semantics using (interpreter-internal) locks if I switch to multiple processors
<muurkha>
Fraser's FSTM seems to be very close to what I was thinking of, though without the interpreter
<muurkha>
however he makes a "shadow copy" when you open an object for write, and your transaction modifies the new shadow copy, but also supports upgrading read access to write access, and I haven't figured out how that can work yet
mahmutov_ has quit [Ping timeout: 246 seconds]
<dh`>
well hold on, if you're not copying when you're reading you can just do another read to copy when you want to write
<dh`>
and if you get an inconsistent version, abort
<dh`>
unless you want to make sure your transactions only ever read anything once (instead of detecting whether your multiple reads are inconsistent), but that seems quite expensive
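The "read again to copy, abort if inconsistent" idea can be sketched roughly as follows. This is a single-threaded illustration under loud assumptions: the `object` layout with a `version` field, the `tx_ref` bookkeeping struct, and the `upgrade_to_write` function are all hypothetical, and a real implementation would need atomic or otherwise synchronized reads of the shared object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared object carrying a version that committers bump. */
typedef struct {
    unsigned version;
    int wuddle;
} object;

/* Hypothetical per-transaction reference to one object. */
typedef struct {
    const object *src;      /* shared object that was opened */
    unsigned seen_version;  /* version observed at open-for-read time */
    object copy;            /* private copy, valid only once writable */
    bool writable;
} tx_ref;

/* Open for reading: no copy is made, only the version is remembered. */
static void open_for_reading(tx_ref *r, const object *o)
{
    r->src = o;
    r->seen_version = o->version;
    r->writable = false;
}

/* Upgrade to write: read the object a second time to take the copy.
 * If the version moved since the first read, the two reads are
 * inconsistent and the caller is expected to abort the transaction. */
static bool upgrade_to_write(tx_ref *r)
{
    object snapshot = *r->src;
    if (snapshot.version != r->seen_version)
        return false;
    r->copy = snapshot;
    r->writable = true;
    return true;
}
```

The upgrade succeeds when nothing committed in between and fails otherwise; it does not by itself address pointers handed out before the upgrade.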
<muurkha>
dh`: right, you need to copy when you want to write, which is fine, that's not the issue
<muurkha>
the issue is that in function f you have foobar *p = open_for_reading(handle); g();
<muurkha>
and then inside g, unbeknownst to f, you have foobar *q = open_for_writing(handle); with the same handle
<muurkha>
and then g maybe does q->wuddle++; return; and then we are faced with this problem
<muurkha>
does p->wuddle in f give us the incremented wuddle or the original wuddle? it should give us the incremented wuddle, because the write to wuddle was in the same transaction
<muurkha>
but how do you achieve that? do you walk the stack and rewrite p to point to the new shadow copy? because I want to use lazy versioning in order to keep low-priority transactions from blocking high-priority ones, you don't want to overwrite the "master" copy of the object designated by handle
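To make the problem concrete, here is a small C sketch of a naive shadow-copy scheme that exhibits the stale read. Only the names `foobar`, `wuddle`, `open_for_reading`, `open_for_writing`, `f` and `g` come from the messages above; the `tx_handle` struct and the function bodies are assumptions made for illustration and are not Fraser's FSTM.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int wuddle; } foobar;

/* Hypothetical handle: the committed "master" copy plus, once this
 * transaction has opened the object for writing, a private shadow copy. */
typedef struct {
    foobar *master;
    foobar *shadow;  /* NULL until opened for writing */
} tx_handle;

/* Naive read: returns whichever version is current at the time of the call. */
static const foobar *open_for_reading(tx_handle *h)
{
    return h->shadow ? h->shadow : h->master;
}

/* Lazy versioning: the master copy is never modified in place. */
static foobar *open_for_writing(tx_handle *h)
{
    if (!h->shadow) {
        h->shadow = malloc(sizeof *h->shadow);
        assert(h->shadow != NULL);
        *h->shadow = *h->master;
    }
    return h->shadow;
}

/* g writes through the same handle, unbeknownst to f. */
static void g(tx_handle *h)
{
    foobar *q = open_for_writing(h);
    q->wuddle++;
}

/* f returns what it sees through the pointer it obtained before calling g. */
static int f(tx_handle *h)
{
    const foobar *p = open_for_reading(h);  /* points at the master copy */
    g(h);                                   /* creates and bumps the shadow */
    return p->wuddle;                       /* still the original value */
}
```

With a master value of 41, `f` returns 41 even though a fresh `open_for_reading` on the same handle would now yield 42: `p` still points at the master copy, which lazy versioning deliberately leaves untouched.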
drmpeg has joined #riscv
<muurkha>
so that's what led me to the approach where I make a new master copy and abort any other transactions still reading the copy that p was pointing at
<muurkha>
but apparently Fraser handled that problem in a different way, and he did it in C, where "walk the stack and rewrite p" is not a thing you can do
pabs3 has quit [Quit: Don't rest until all the world is paved in moss and greenery.]
pabs3 has joined #riscv
handsome_feng has joined #riscv
Sofia has quit [Ping timeout: 240 seconds]
<dh`>
I dealt with that by not copying things, because multiversion is expensive :-)
<dh`>
but it sounds like you've got enough levels of indirection to invalidate the older handle, at which point the upper guy will reload it and get the modified one
<muurkha>
well, but the upper guy is just a function, he's not a separate transaction
Sofia has joined #riscv
<muurkha>
it's not practical to reload all your pointers to transactional variables after every function call, and I'm sure it isn't what Fraser is doing
<muurkha>
and if you retry the entire transaction in the same way, the same thing will happen again
<muurkha>
you could potentially retry with a list of handles that you need to open for writing even when the transaction requests to open them only for reading, but if you have N of them, you will have to retry the transaction N times before you get through it, doing N² - N opens
<muurkha>
you could in theory revalidate every transactional object every time you index into it, instead of getting a pointer to (a copy of) the whole object once. but that's also ruinously expensive for many common workloads, much worse than making a copy of the object when you open it
<muurkha>
(and again it isn't what Fraser is doing)
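For comparison, the "go back through the handle on every access" alternative might look like the sketch below. It is explicitly not what Fraser does, and the `tx_handle` struct, `current_version`, `TX_READ` and `f_revalidating` are hypothetical names. The sketch only re-resolves which copy is current; it does not model conflict detection against other transactions. It returns the right answer for the `wuddle` example, at the price of an extra indirection and branch on every field access.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int wuddle; } foobar;

/* Hypothetical handle: committed master copy plus optional shadow copy. */
typedef struct {
    foobar *master;
    foobar *shadow;  /* NULL until opened for writing */
} tx_handle;

/* Re-resolve the handle: the shadow copy wins once it exists. */
static const foobar *current_version(const tx_handle *h)
{
    return h->shadow ? h->shadow : h->master;
}

/* Every field read goes back through the handle instead of a cached pointer. */
#define TX_READ(h, field) (current_version(h)->field)

/* Lazy versioning: copy on first write, master stays untouched. */
static foobar *open_for_writing(tx_handle *h)
{
    if (!h->shadow) {
        h->shadow = malloc(sizeof *h->shadow);
        assert(h->shadow != NULL);
        *h->shadow = *h->master;
    }
    return h->shadow;
}

/* Reads wuddle before and after a write made through the same handle
 * and returns the difference the reader observes. */
static int f_revalidating(tx_handle *h)
{
    int before = TX_READ(h, wuddle);
    foobar *q = open_for_writing(h);  /* stands in for the call to g() */
    q->wuddle++;
    int after = TX_READ(h, wuddle);
    return after - before;
}
```

Here the reader observes the increment (a difference of 1) while the master copy is still unchanged, but every `TX_READ` pays for the lookup, which is the cost objection raised above.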
<gordonDrogon>
ugh.. when you've been looking at documents/schematics for the SD card and realise it's called by its old name.. TF
<palmer>
@atish: I'm on Matrix, so it's always a bit broken...
jmdaemon has quit [Ping timeout: 240 seconds]
sobkas has quit [Quit: sobkas]
Guest46 has joined #riscv
Guest46 has quit [Client Quit]
atishp has joined #riscv
atishp[m][m] has joined #riscv
atishp has quit [Client Quit]
<palmer>
atishp[m][m]: ya, I see ;)
<palmer>
(and for once, it wasn't broken on my end!)
<atishp[m][m]>
it's been a while..I think I got kicked when freenode craziness happened.
<atishp[m][m]>
and never joined this one
<palmer>
ya, makes sense
<palmer>
figured it's best to try and get folks to do upstream stuff here, with so many people at Rivos it's way too easy to end up doing stuff in slack and nobody else can see
jmdaemon has joined #riscv
<palmer>
I'm seeing `-smp 64` / `NR_CPUS=32` boot fine
<atishp[m][m]>
yeah. I forgot to remove CONFIG_RISCV_BOOT_SPINWAIT
<atishp[m][m]>
Now it boots fine for me as well
<palmer>
ah, OK
<atishp[m][m]>
running through your configs to see which one fails
<palmer>
so I think it might have just been over-loaded
JanC has quit [Remote host closed the connection]
<palmer>
a bunch just finished (but I only went up to -smp 64)
JanC has joined #riscv
<palmer>
I specifically remember some of the userfault ones hanging before any boot spew, which is when I figured it wasn't just too many threads
freakazoid12345 has quit [Read error: Connection reset by peer]
freakazoid12345 has joined #riscv
wingsorc has joined #riscv
ivii has quit [Remote host closed the connection]
ivii has joined #riscv
vagrantc has quit [Quit: leaving]
pecastro has quit [Ping timeout: 260 seconds]
ivii has quit [Remote host closed the connection]
epony has joined #riscv
<atishp[m][m]>
Were you running any other tests after the boot?
<palmer>
not much
<atishp[m][m]>
I am almost half way through your configs with the following config
<atishp[m][m]>
`-smp 128` / `NR_CPUS=64`
<atishp[m][m]>
it seems to boot fine
<palmer>
ya, I'm beginning to think it was just a combination of oversubscribing and triggering some existing bugs
<palmer>
maybe not, I just poked back in and something's blown up
<palmer>
(but I think it got eaten by flock/timeout/make)
<muurkha>
someone needs to change their name to [m][m][m]