gog has quit [Killed (NickServ (GHOST command used by pog))]
pog is now known as gog
gildasio has quit [Ping timeout: 260 seconds]
gildasio has joined #osdev
netbsduser has joined #osdev
leon_ is now known as leon
navi has joined #osdev
zetef has quit [Ping timeout: 260 seconds]
pog has joined #osdev
neo|desktop has joined #osdev
Gooberpatrol66 has joined #osdev
gog has quit [Ping timeout: 260 seconds]
neo_ has quit [Ping timeout: 260 seconds]
Matt|home has quit [Ping timeout: 260 seconds]
Gooberpatrol_66 has quit [Ping timeout: 260 seconds]
ski has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
zxrom has quit [Quit: Leaving]
kfv has joined #osdev
goliath has joined #osdev
bauen1 has quit [Ping timeout: 246 seconds]
elastic_dog has quit [Ping timeout: 272 seconds]
gildasio has quit [Ping timeout: 260 seconds]
gildasio has joined #osdev
xvmt has quit [Remote host closed the connection]
xvmt has joined #osdev
<mcrod>
hi
elastic_dog has joined #osdev
<nikolapdp>
hello mcrod
<mcrod>
hard faults suck
kfv has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<mcrod>
this might be why
<mcrod>
newlib's printf() calls malloc
<mcrod>
nice!
<pog>
smh
<pog>
hidden allocations
<mcrod>
i’m still not sure that’s the problem yet
<mcrod>
i don’t think it is
<mcrod>
it’s dying on __swsetup_r
<mcrod>
dies more specifically on a half word store…
bauen1 has joined #osdev
<mcrod>
ok so it isn’t calling sbrk
<pog>
brk brk
<mcrod>
i hate this :)
<mcrod>
but i’m closer and closer
<mcrod>
it’s not stack pointer issues i don’t think
<pog>
you know what, fuck it
* pog
misaligns your stack pointer
* GeDaMo
pushes pog onto the stack
m3a has quit [Ping timeout: 255 seconds]
<pog>
oh nooo
<zid>
pog can we round up all the people who like to make software that never visibly 'goes wrong' and shoot them
<zid>
people who catch exceptions then exit(0)
<zid>
people who redirect 404 page to /index.html
<zid>
etc
<pog>
yes
<pog>
please shoot me
<GeDaMo>
It's "resilient" :P
<zid>
can it resist bullet
voidah has quit [Ping timeout: 264 seconds]
<GeDaMo>
Is "bullet" some kind of software framework? :|
netbsduser has quit [Remote host closed the connection]
netbsduser has joined #osdev
<mjg>
fearless bullet
<mcrod>
ok so it’s a bus fault that escalated to a hard fault
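[Editor's sketch] On an ARMv7-M core such as the Cortex-M3, "a bus fault that escalated to a hard fault" can be confirmed from two System Control Block registers: HFSR (0xE000ED2C), whose FORCED bit marks an escalated fault, and CFSR (0xE000ED28), whose bits 8 to 15 hold the bus fault status. The helpers below are a minimal sketch that decodes register values passed in as plain integers, so they can be tried on a host; the function names are hypothetical and not part of any vendor header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the ARMv7-M fault status registers. */
#define HFSR_FORCED        (1u << 30)  /* configurable fault escalated to HardFault */
#define CFSR_IBUSERR       (1u << 8)   /* bus error on instruction fetch */
#define CFSR_PRECISERR     (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR   (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR      (1u << 11)  /* bus error while unstacking on exception return */
#define CFSR_STKERR        (1u << 12)  /* bus error while stacking on exception entry */
#define CFSR_BFARVALID     (1u << 15)  /* BFAR (0xE000ED38) holds the faulting address */
#define CFSR_BUSFAULT_MASK (0xFFu << 8)

/* Hypothetical helper: true when HFSR reports an escalation and CFSR shows a bus fault. */
static bool is_escalated_bus_fault(uint32_t hfsr, uint32_t cfsr)
{
    return (hfsr & HFSR_FORCED) != 0 && (cfsr & CFSR_BUSFAULT_MASK) != 0;
}

/* Hypothetical helper: short description of the first bus fault cause found. */
static const char *bus_fault_kind(uint32_t cfsr)
{
    if (cfsr & CFSR_PRECISERR)   return "precise data access";
    if (cfsr & CFSR_IMPRECISERR) return "imprecise data access";
    if (cfsr & CFSR_IBUSERR)     return "instruction fetch";
    if (cfsr & CFSR_STKERR)      return "exception entry stacking";
    if (cfsr & CFSR_UNSTKERR)    return "exception return unstacking";
    return "none";
}
```

On target, the two inputs would be read from the register addresses above inside the fault handler; when BFARVALID is set, BFAR gives the address of the failing access.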
<mcrod>
i didn’t think ARM bare metal would be this painful
<mcrod>
it’s 0x23
<mcrod>
that’s IMPOSSIBLE.
<mcrod>
“it” being the PC that caused the fault
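[Editor's sketch] The "PC that caused the fault" is normally read from the frame the core pushes on exception entry. On ARMv7-M without an FPU that frame is eight words: r0, r1, r2, r3, r12, lr, pc, xPSR, lowest address first. The sketch below assumes the handler has already obtained a pointer to that frame; bit 2 of the EXC_RETURN value in lr tells it whether the frame is on the main or the process stack. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word indices into the basic (no FPU) ARMv7-M exception frame. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS
};

/* Return address pushed by the core: the instruction that faulted (for
 * precise faults) or the one that was about to run. */
static uint32_t stacked_pc(const uint32_t *frame)
{
    return frame[FRAME_PC];
}

/* Link register of the interrupted code, useful for finding the caller. */
static uint32_t stacked_lr(const uint32_t *frame)
{
    return frame[FRAME_LR];
}

/* EXC_RETURN bit 2: 0 means the frame is on MSP, 1 means it is on PSP. */
static bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0;
}
```

Comparing the stacked pc and lr against `objdump -d` output or the map file is one way to sanity-check a value that looks wrong.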
<sham1>
I've already whinged about this elsewhere, but man, I'm so tired. I've been interviewing a bunch of people who are interested in a summer trainee position and next week I'll have to also do it. Exhausting
<clever>
mcrod: is VBAR set? what does that table contain?
<clever>
maybe one of the exception vectors caused the fault?
<mcrod>
that’d be weird
<clever>
i had a similar problem booting linux a few years back
<mcrod>
they do nothing but for (;;) nop
<clever>
it turns out, linux doesnt initialize the vector table early on
<mcrod>
and my hardfault handlers DO get called
<clever>
and one of the very first things an SMP build of linux does, once switching on the MMU, is grab a mutex for printk
<clever>
except, i had SMP support turned off, and loadex was an illegal opcode
<clever>
so it jumped to an undefined exception vector
<clever>
that took days to diagnose
<sham1>
You'd think that loadex would degrade to a simple load if SMP is not on, but I also understand why that doesn't happen
<clever>
prior to SMP arm, that opcode didnt exist
<mcrod>
note this is bare metal
<clever>
and to emulate that older system, it takes the normal illegal opcode route
<mcrod>
but one sec
<clever>
on the rpi2, SMP support is optional, and you need to enable it in a control reg
<sham1>
Hm, if the opcode doesn't exist then that makes sense
<zid>
sham1: Can I be a summer trainee?
<clever>
also, by pure chance, the same control bit fixed an unrelated pi3 problem
<clever>
on pi3, that allows non-secure kernel to flush the arm L2 cache
<clever>
without that, i was getting insane memory corruption on the pi3
<sham1>
zid: well the application submission time passed last month, so no
<zid>
sham1: you can't take me in the backdoor? (oi oi)
<clever>
mcrod: are you able to jtag that system?
<mcrod>
of course
<clever>
for my linux issue, the only thing that worked in the end, was to single-step until it behaved odd
<mcrod>
it dies almost immediately calling printf()
<clever>
but, single-stepping from linux entry, to post-mmu, is a huge chunk of code, including all of bunzip'ing itself
<sham1>
Unzip those buns
<clever>
so it was a lot of bisection, setting breakpoints at random spots
<clever>
and constantly rebooting
<clever>
and dealing with the fact that i'm bouncing between 3 different arm binaries (bootloader, linux premmu, linux postmmu)
<clever>
and the breakpoints fail when switching modes
<mcrod>
possibilities: newlib sucks, .bss/.data not properly initialized, some random thing i’m not doing that i’m supposed to magically just know
<clever>
mcrod: what is between _start and main()? nearly everything ive done has .bss not yet cleared
<clever>
and i have to clear it myself
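[Editor's sketch] The startup work clever describes, done between `_start` and `main()`, usually amounts to copying the initial values of `.data` from their load address to RAM and zeroing `.bss`. The function below is a minimal sketch of that step. It takes the region boundaries as parameters so it can be exercised on a host; on a real target they would come from linker-script symbols, whose names vary by project and are not given in the conversation.

```c
#include <stdint.h>

/* Hypothetical startup helper.
 * data_load  : where the .data initial values are stored (typically flash)
 * data_start : start of .data in RAM
 * data_end   : one past the end of .data in RAM
 * bss_start  : start of .bss in RAM
 * bss_end    : one past the end of .bss in RAM
 * Assumes all regions are word-aligned and word-sized. */
static void init_data_and_bss(const uint32_t *data_load,
                              uint32_t *data_start, uint32_t *data_end,
                              uint32_t *bss_start, uint32_t *bss_end)
{
    /* Copy initialized variables to their run-time location. */
    for (uint32_t *dst = data_start; dst < data_end; ) {
        *dst++ = *data_load++;
    }

    /* C requires zero-initialized statics to actually start at zero. */
    for (uint32_t *dst = bss_start; dst < bss_end; ) {
        *dst++ = 0;
    }
}
```

If either step is skipped or uses the wrong boundaries, library state kept in `.data` or `.bss` can hold garbage before the first library call runs.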
<mcrod>
i clear it
<mcrod>
when i say bare metal
<clever>
and my .text and .data are all just one big blob
<mcrod>
i really mean “i’m on my own except for newlib”
<sham1>
So more like embedded
<mcrod>
yes
<mcrod>
I guess it’s okay if I set the stack pointer to a specific address directly in the vector table
<clever>
mcrod: can you set a breakpoint in printf, and then just single-step until it malfunctions? and compare the disassembly to the pc as it runs, and see where it goes off the rails
<mcrod>
problem
<mcrod>
i can’t single step into newlib code
<clever>
why not?
<mcrod>
the source isn’t included in arm gnu embedded toolchain, just the blob
<mcrod>
which is *bull*
<clever>
single-step should still work, you just wont have source to back it up
<clever>
and you can always `objdump -d` to see the assembly
<mcrod>
right i’ve done all of that
<clever>
look at the pc and assembly, run the opcode in your head, single-step, did it go where you expected? repeat
<mcrod>
wellll
<mcrod>
i call puts(), which is actually _puts_r(), which calls _sinit(), then it promptly dies on an ldr instruction
<clever>
what is the exact ldr instruction?
<mcrod>
ldr r3, [r4, #100] @ 0x64
<clever>
and what is the value of r3 and r4?
<mcrod>
no clue
<clever>
`info registers` in gdb
<clever>
ah yes, and r3 is the dest register, so that one doesnt matter
<mcrod>
it is bad
<mcrod>
0x89ab83f8
<mcrod>
for r4
<clever>
now check the disassembly, where did it get r4 from?
<zid>
now go in reverse and see where it came from, fun
<sham1>
Not having the source code for the toolchain feels weird
<mcrod>
blame ARM
<zid>
I've never looked at the gcc or binutils source though?
<mcrod>
that’s why i’m quite annoyed
<mcrod>
i’m waiting for zi- yep
<clever>
source alone doesnt fully help, you also need debug info that maps addr to source line
<mcrod>
anyway
<clever>
it can sometimes be simpler to just throw it into a decompiler like ghidra or ida
<zid>
I used ida too much this week sorry, it's banned
<mcrod>
the next time r4 is referenced
<zid>
nikolar made me do a ctf
<zid>
not next time, mcrod
<zid>
previous time
<mcrod>
yes that’s what i mean
<mcrod>
the previous instruction is ldr r4, [r0, #8]
<mcrod>
note that we are VERY close to the top of the function
<mcrod>
only a few instructions down
<zid>
yea that's natural, and good
<zid>
it's likely just a bum arg
<zid>
puts(0x89ab83f8) likely
<mcrod>
puts(“hi”); doesn’t seem like it
<clever>
mcrod: which function exactly are you in?
<zid>
it does if your linker script is fucky
<mcrod>
_puts_r
<mcrod>
it might be
<mcrod>
thankfully i have a wonderful map file
<zid>
I would probably quickly cheat and make sure the arg register isn't loaded with 0x89.. before puts is called, it's unlikely to have come from anywhere else
<clever>
mcrod: what does `bt` report? can you pastebin the entire `objdump -d foo.elf` ?
<mcrod>
no
<mcrod>
it’s not that i don’t want to
<mcrod>
it’s work related :p
<clever>
just the bt to start with?
<mcrod>
bt is literally
<mcrod>
0x00000120: main
<mcrod>
0x00000334: _puts_r
<zid>
if it doesn't do ldr r4, =0x89sjdsd; bx puts
<zid>
I will be surprised
<mcrod>
sig handler called
<zid>
bblxlrlxlr
<clever>
if you `objdump -d foo.elf`, how sensitive is the body of main? can you pastebin just that part?
<mcrod>
irq_cm3_HardFault
<mcrod>
i. cannot. pastebin.
<clever>
do you see a `bx _puts_r` in the body of main?
<mcrod>
standby
<zid>
oh I guess it'd be ldr r0 or whatever
<clever>
zid: yep
<zid>
what's [r0, #8], I know arm likes to decorate things
<zid>
so that could be a shift or an index or anything
<clever>
zid: thats just r0 + 8, as the src addr
<zid>
ah index
<zid>
is r0.. the stack pointer?
<clever>
i already had to google it earlier, for the #100
<clever>
r0 is the first argument to a function
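[Editor's sketch] `ldr r4, [r0, #8]` loads the 32-bit word at address r0 + 8, which is what a compiler emits for reading a struct field at offset 8 through a pointer passed as the first argument. In newlib, `_puts_r` takes a `struct _reent *` as its first argument and the string second, so r0 at that point is presumably the reentrancy pointer rather than the string. The struct below is a hypothetical stand-in with 32-bit fields, not newlib's real definition; it only illustrates the addressing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit object, for illustration only. */
struct obj32 {
    uint32_t field0;  /* offset 0 */
    uint32_t field4;  /* offset 4 */
    uint32_t field8;  /* offset 8: what [base, #8] reads */
};

/* Models `ldr rd, [rn, #offset]`: read one 32-bit word at base + offset.
 * memcpy is used so the read is well-defined regardless of the source type. */
static uint32_t load_word_at(const void *base, size_t offset)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

Under this reading, a garbage r4 means the word at offset 8 of whatever r0 points to is garbage, so the object behind r0 is the thing to inspect.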
<mcrod>
i see
<zid>
what else would it be doing indexed deref on