00:32
dolphinana has quit [Remote host closed the connection]
00:32
dolphinana has joined ##raspberrypi-internals
00:48
dolphinana has quit [Quit: Leaving]
02:21
jn has quit [Ping timeout: 256 seconds]
02:22
jn has joined ##raspberrypi-internals
02:22
jn has joined ##raspberrypi-internals
02:22
jn has quit [Changing host]
05:49
Stromeko has quit [Ping timeout: 256 seconds]
05:55
Stromeko has joined ##raspberrypi-internals
08:29
bonda_000 has joined ##raspberrypi-internals
08:31
<
bonda_000 >
clever u here?
08:38
f_ has joined ##raspberrypi-internals
10:36
<
clever >
bonda_000: morning
10:37
<
bonda_000 >
Good morning
10:38
<
bonda_000 >
so it looks like Minix has its own cut-down version of libc in the source files
10:39
<
clever >
thats what i expected
10:40
<
bonda_000 >
but I also tried to build full glibc-2.39 with these flags
10:41
<
bonda_000 >
../glibc-2.39/configure CC="vc4-elf-gcc" LD="vc4-elf-ld" --prefix=/home/pi/Downloads/libc-vc4-minix --exec-prefix=/home/pi/Downloads/libc-vc4-minix --with-headers=/home/pi/Downloads/minix/minix/include/minix --with-binutils=/home/pi/Desktop/vc4-toolchain/prefix/bin
10:42
<
bonda_000 >
and it says
10:42
<
bonda_000 >
*** These critical programs are missing or too old: GNU ld gawk
10:42
<
bonda_000 >
*** Check the INSTALL file for required versions
10:43
<
clever >
is gawk installed? what does "which gawk" say?
10:43
<
bonda_000 >
says "no"
10:43
<
clever >
then that would be why its complaining about gawk
10:44
<
bonda_000 >
ok I thought it's part of binutils, it's installing now
10:44
<
clever >
nope, gawk is part of gawk
10:44
<
bonda_000 >
checking version of vc4-elf-ld... 2.23.51.20121030, bad
10:45
<
bonda_000 >
*** These critical programs are missing or too old: GNU ld
10:45
<
clever >
the vc4 binutils is based on a fairly old fork of binutils
10:45
<
bonda_000 >
I should get older glibc then?
10:45
<
clever >
you have a few choices
10:45
<
clever >
1: rebase the changes on a newer binutils
10:45
<
clever >
2: use an older glibc
10:45
<
clever >
3: use the minix libc
10:45
<
clever >
4: fix the new glibc to use an older ld
10:53
<
clever >
3 has the best chances of actually working with the minix kernel
10:55
<
bonda_000 >
yeah that's what I'm thinking too
10:55
<
bonda_000 >
it says the default compiler to use is LLVM clang but gcc is also an option
10:56
<
clever >
there is another repo (i lost the link) that adds vc4 support to llvm
11:58
jcea has joined ##raspberrypi-internals
12:15
bonda_000 has quit [Ping timeout: 268 seconds]
12:20
bonda_000 has joined ##raspberrypi-internals
12:20
<
bonda_000 >
and the task switching is done using that SystemTimer you told me about earlier?
12:27
<
clever >
bonda_000: timer and other irq handlers
12:27
<
clever >
if the system is idle for example, and something arrives on the uart, you want to switch to a task that was waiting for the uart
12:27
<
clever >
the timer just helps when multiple things want cpu, and forces them to share
12:41
<
bonda_000 >
yep I'm just looking at the decompile and there's a function
12:42
<
bonda_000 >
systimer_init
12:42
<
bonda_000 >
it's tricky somewhat, branches to core_get_func_table
12:43
<
bonda_000 >
then the return value of that, stores to gp+0x405054
12:43
<
bonda_000 >
actually does
12:43
<
bonda_000 >
bl memset
12:43
<
bonda_000 >
bl core_get_func_table
12:47
<
clever >
bonda_000: i do have working context switching in little-kernel, which is probably easier to understand than the decompile
12:52
<
bonda_000 >
what file is it
12:53
<
clever >
bonda_000: when the LK thread core (in upstream LK) wants to context switch, it calls arch_context_switch
12:53
<
clever >
that may print a bunch of debug, then it calls vc4_context_switch
12:54
<
clever >
lines 7-9 saves the full cpu state to the stack
12:54
<
clever >
lines 12-19 swaps the stack pointer, saving the old one to the old thread, and restoring the new one from the new thread
12:54
<
clever >
lines 22-30 then restores all of the state that was saved to the new stack
12:55
<
clever >
so basically 7-15 saves the current state
12:55
<
clever >
and whenever this thread gets resumed and the sp restored, 22-31 restores it back
12:55
<
clever >
new threads work via arch_thread_initialize, which creates a fake "saved state"
12:56
<
clever >
so you can then restore that, the first time you enter the thread
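The save / swap-sp / restore sequence described above can be sketched in plain C against a simulated CPU. This is only an illustration, not LK's vc4_context_switch: the register count, the `LR_INDEX` slot, and all struct and function names are hypothetical, and the real code does this in assembly on the real stack.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS    8   /* hypothetical; the real VPU saves more state */
#define LR_INDEX    7   /* hypothetical slot standing in for the link register */
#define STACK_WORDS 64

/* Simulated CPU: a few registers plus a stack pointer. */
struct cpu {
    uint32_t regs[NUM_REGS];
    uint32_t *sp;
};

/* Each thread owns a stack and remembers where its saved state sits. */
struct thread {
    uint32_t stack[STACK_WORDS];
    uint32_t *saved_sp;
};

/* Build a fake "saved state" so the first switch into the thread
 * restores zeroed registers and an entry address in the lr slot. */
void thread_initialize(struct thread *t, uint32_t entry)
{
    uint32_t *sp = t->stack + STACK_WORDS;      /* stack grows down */
    for (int i = 0; i < NUM_REGS; i++)
        *--sp = (i == LR_INDEX) ? entry : 0;
    t->saved_sp = sp;
}

void context_switch(struct cpu *c, struct thread *from, struct thread *to)
{
    /* 1: save the full register state to the current stack */
    for (int i = 0; i < NUM_REGS; i++)
        *--c->sp = c->regs[i];

    /* 2: swap stack pointers between the old and new thread */
    from->saved_sp = c->sp;
    c->sp = to->saved_sp;

    /* 3: restore the state that was saved on the new stack */
    for (int i = NUM_REGS - 1; i >= 0; i--)
        c->regs[i] = *c->sp++;
}
```

Switching into a freshly initialized thread and switching back into a suspended one go through the same restore path, which is why the fake frame has to match the layout the save step produces.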
13:00
<
bonda_000 >
is that like setjmp?
13:00
<
clever >
not entirely, setjmp doesnt change the stack
13:01
<
clever >
here is setjmp
14:30
<
bonda_000 >
I'm writing the hardware glue for Minix RS232 but they seem to be using UART interrupts
14:30
<
bonda_000 >
I've seen hermanhermitage never does that in his dump programs, he just spins in a polling loop
14:30
<
bonda_000 >
is lk using UART interrupts or you also sit in a busy loop waiting?
14:31
<
clever >
bonda_000: its using the PL011 uart with irq support
14:33
<
bonda_000 >
are you using the FIFO or reading writing char by char?
14:33
<
clever >
there is both a hw and sw FIFO
14:33
<
clever >
the hw FIFO allows small bursts and lets the irq be a bit late
14:34
<
clever >
the sw FIFO then greatly expands the buffer
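A rough sketch of the hw FIFO plus sw FIFO arrangement, with the hardware side simulated so it can run anywhere. This is not LK's driver: `struct hw_fifo` stands in for the UART receive registers, and the names and the 256-byte software buffer size are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HW_FIFO_SIZE 16    /* the depth guessed for the PL011 in this log */
#define SW_FIFO_SIZE 256   /* hypothetical; must be a power of two */

/* Stand-in for the UART's receive FIFO (register reads on real hardware). */
struct hw_fifo { uint8_t data[HW_FIFO_SIZE]; size_t rd; size_t count; };

/* Software ring buffer; head and tail are free-running counters. */
struct sw_fifo { uint8_t data[SW_FIFO_SIZE]; size_t head; size_t tail; };

static bool sw_fifo_push(struct sw_fifo *f, uint8_t b)
{
    if (f->head - f->tail == SW_FIFO_SIZE)
        return false;                           /* full: caller drops the byte */
    f->data[f->head++ & (SW_FIFO_SIZE - 1)] = b;
    return true;
}

static bool sw_fifo_pop(struct sw_fifo *f, uint8_t *b)
{
    if (f->head == f->tail)
        return false;                           /* empty */
    *b = f->data[f->tail++ & (SW_FIFO_SIZE - 1)];
    return true;
}

static uint8_t hw_fifo_read(struct hw_fifo *hw)
{
    uint8_t b = hw->data[hw->rd];
    hw->rd = (hw->rd + 1) % HW_FIFO_SIZE;
    hw->count--;
    return b;
}

/* RX interrupt handler: drain everything the hardware has buffered.
 * Returns how many bytes were dropped because the sw FIFO was full. */
size_t uart_rx_irq(struct hw_fifo *hw, struct sw_fifo *sw)
{
    size_t dropped = 0;
    while (hw->count > 0) {
        if (!sw_fifo_push(sw, hw_fifo_read(hw)))
            dropped++;
    }
    return dropped;
}
```

The hardware FIFO only has to cover the interrupt latency; the reader task then pulls from the software FIFO with `sw_fifo_pop` at its own pace.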
14:34
<
bonda_000 >
it says mini uart has 8 char fifo
14:34
<
clever >
but i'm using the PL011 uart
14:34
<
bonda_000 >
I know but is the buffer much bigger there?
14:34
<
clever >
i think it has a 16 char fifo
14:35
<
bonda_000 >
the code here seems to be written for a 16550 UART like the mini uart uses on bcm
14:37
<
bonda_000 >
I just replace the TI stuff with Broadcom stuff
14:37
<
bonda_000 >
this minix arch is for BeagleBone
14:40
<
bonda_000 >
has some kind of "oxoff" stop byte
14:40
<
bonda_000 >
to let it know when to stop receiving
14:40
<
bonda_000 >
is that the "Enter" keystroke?
14:41
<
bonda_000 >
then what is it
14:41
<
clever >
xon and xoff are dedicated bytes
14:41
<
clever >
i forget what they are, but google should know
14:42
<
clever >
its part of software flow control
14:43
<
bonda_000 >
yes it looks like a state machine here
14:43
<
clever >
hw flow control is far simpler and much better
14:43
<
clever >
when the receive fifo is full, the hw flow control tells the remote end to stop, entirely under hw control
14:44
<
clever >
and if the remote end is configured for hw flow control, it will stop transmitting
14:44
<
clever >
so you never lose a byte, and the fifo is always kept full
14:44
<
clever >
but that comes at the cost of needing 4 wires, rx/tx, and rts/cts
14:45
<
bonda_000 >
yeah i've seen that on the datasheet
14:45
<
clever >
sw flow control is just the receiver sending a special xoff byte, to tell the remote end to stop
14:45
<
clever >
but that can only happen inside the irq handler
14:45
<
clever >
and then the remote end has to receive that, and stop sending data
14:45
<
bonda_000 >
that's what I'm doing right now the irq handler
14:45
<
clever >
and what happens if there is already 16 bytes in the remote tx fifo?
14:45
<
clever >
it cant just stop on a dime
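For reference, XON is ASCII DC1 (0x11, Ctrl-Q) and XOFF is ASCII DC3 (0x13, Ctrl-S). A minimal sketch of the receiver side of software flow control follows, assuming a 256-byte receive buffer and hypothetical watermark values. The high watermark sits below the buffer size precisely because the remote end cannot stop on a dime: bytes already in its tx FIFO will still arrive after XOFF goes out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XON  0x11   /* ASCII DC1, Ctrl-Q: "resume sending" */
#define XOFF 0x13   /* ASCII DC3, Ctrl-S: "stop sending"   */

#define BUF_SIZE   256               /* hypothetical receive buffer size */
#define HIGH_WATER (BUF_SIZE - 32)   /* headroom for bytes already in flight */
#define LOW_WATER  64                /* hypothetical resume threshold */

struct flow_state {
    bool remote_stopped;   /* true after we sent XOFF and before XON */
};

/* Call from the RX path with the current buffer fill level.
 * Returns the control byte to transmit, or 0 if nothing needs sending. */
uint8_t flow_update(struct flow_state *fs, size_t buffered)
{
    if (!fs->remote_stopped && buffered >= HIGH_WATER) {
        fs->remote_stopped = true;
        return XOFF;
    }
    if (fs->remote_stopped && buffered <= LOW_WATER) {
        fs->remote_stopped = false;
        return XON;
    }
    return 0;
}
```

The gap between the two watermarks is a design choice: it keeps the receiver from toggling XOFF/XON on every byte when the fill level hovers near a single threshold.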
14:46
<
bonda_000 >
well I have no idea the chip on the uart dongle I have is some silabs
14:47
<
bonda_000 >
is that important?
14:47
<
clever >
you mentioned it only has 3 wires, so it cant do hw flow control
14:47
<
bonda_000 >
I thought minicom handles
14:48
<
bonda_000 >
this stuff for me the fifos
14:48
<
bonda_000 >
if minicom does it in timely manner read its fifos at the agreed baud rate
14:49
<
bonda_000 >
then there shouldnt be a problem no?
14:49
<
clever >
that only handles the vpu->minicom direction
14:49
<
bonda_000 >
unless my remote is dead
14:49
<
clever >
if minix cant read the fifo fast enough, for whatever reason, then you start dropping bytes in that direction
14:50
<
bonda_000 >
yeah so the irq should be handled fast enough
14:51
<
bonda_000 >
there is also a lot of typos in the mini UART register map
14:51
<
bonda_000 >
IIR and IER they mixed up descriptions
14:51
<
bonda_000 >
AUX_MU_IIR and AUX_MU_IER which one is which
14:52
<
bonda_000 >
I'm copying the code from hermanhermitage that worked but its hard to read that section
14:52
<
bonda_000 >
the bits
14:54
<
clever >
i would just get the official PL011 uart docs from ARM
14:54
<
bonda_000 >
there's less coding for me all the register names here are same to what BCM mini uart has
14:55
<
bonda_000 >
I still have to do the rest of it like figure out the threads and where are the two cores after the bootrom
14:57
<
clever >
there are many issues with the mini-uart
14:57
<
clever >
so i would recommend, just forget it exists
14:57
<
clever >
the only reason you need to even consider it, is when you get around to bluetooth support
14:58
<
bonda_000 >
serial_in(rs, OMAP3_LCR); this is the only type of line I'm replacing, the second arg should be the BCM AUX_MU register offset
14:59
<
bonda_000 >
and comment out this line offset <<= rs->reg_offset; from serial_in() and serial_out() functions
15:00
<
bonda_000 >
idk there seems to be a lot, I don't know where exactly to start
15:04
<
clever >
bonda_000: i think you should start by getting more familiar with the hw first, and ignore minix for now
15:11
<
bonda_000 >
yeah the grey box AUX_MU_IIR_REG and AUX_MU_IER_REG are mixed up
15:11
<
bonda_000 >
in the datasheet
16:10
<
bonda_000 >
what does this in arm do
16:10
<
bonda_000 >
ldm r9, {r0-r7}
16:10
<
bonda_000 >
how is it going to load 32 bytes into a 4 byte register
16:12
<
bonda_000 >
oh I see now
16:12
<
bonda_000 >
its like a stack pointer in r9
16:48
bonda_000 has quit [Ping timeout: 260 seconds]
17:36
bonda_000 has joined ##raspberrypi-internals
17:46
<
bonda_000 >
is this the ldm analogue of vc4?
17:47
<
bonda_000 >
v32ld H32(0x0,0x0),(r1)
17:49
<
clever >
bonda_000: thats a vector load, it will get an entire uint32_t[16] from the addr in r1, and load it to 0,0 in the vector registers
17:49
<
clever >
only other vector opcodes can then interact with it
17:53
bonda_000 has quit [Read error: Connection reset by peer]
17:55
bonda_000 has joined ##raspberrypi-internals
17:58
<
clever >
bonda_000: i would just use normal memcpy if all you want is to copy things
17:59
<
bonda_000 >
I just can't see the ldm instruction
17:59
<
bonda_000 >
in the decompile
17:59
<
bonda_000 >
I see stm used as push
17:59
<
bonda_000 >
but yeah I have memcpy
18:02
<
bonda_000 >
nvm found it
18:02
<
bonda_000 >
0ed021a8 23 02 ldm r6-r9,(sp++)
18:03
<
bonda_000 >
that's gonna load 16 bytes from where sp points at and increment it
18:03
<
clever >
looks like a normal ldm to pop from the stack
18:03
<
clever >
i count 32 bytes
18:04
<
clever >
r6, r7, r8, r9, 4 registers, 32bits(4bytes) each, thats 16
18:04
<
clever >
i somehow got an 8 in my math, lol
18:04
<
bonda_000 >
do u know what's `lea`?
18:04
<
clever >
load effective address
18:04
<
bonda_000 >
it's all over the place is it also some kind of load?
18:05
<
clever >
you use it like `lea r1, _start`
18:05
<
clever >
and the assembler/linker will store the relative offset between that opcode and _start
18:05
<
clever >
the cpu will then add that offset to PC to get the address of _start
18:05
<
clever >
and put the addr of _start into r1
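The arithmetic behind a pc-relative `lea` can be sketched in C as below. The function names are hypothetical, and the sketch glosses over whether the hardware uses the address of the instruction itself or of the following one as PC.

```c
#include <stdint.h>

/* What the assembler/linker stores in the opcode:
 * the distance from the lea instruction to the target symbol. */
int32_t lea_offset(uint32_t insn_addr, uint32_t target)
{
    return (int32_t)(target - insn_addr);
}

/* What the CPU does at run time: add the stored offset to PC. */
uint32_t lea_resolve(uint32_t pc, int32_t offset)
{
    return pc + (uint32_t)offset;
}
```

Because only the offset is stored, the same code still finds `_start` if the whole image is loaded at a different base address, which is the practical difference from loading an absolute address.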
18:06
<
bonda_000 >
so its a pseudo instruction?
18:06
<
clever >
its a pc-relative thing
18:07
<
bonda_000 >
in ARM we park cores1,2,3 and let core0 go into the kernel
18:07
<
bonda_000 >
and then send the message where's the entry point for parked cores
18:07
<
clever >
VPU core1 is already parked when things start
18:07
<
bonda_000 >
so you end up in bootcode with just one core?
18:08
<
bonda_000 >
and then how do you un-park it
18:09
<
bonda_000 >
I've seen 64-bit instructions had that
18:09
<
bonda_000 >
from the VCIV manual
18:09
<
clever >
/home/clever/apps/rpi/rpi-open-firmware/common/broadcom/bcm2708_chip/intctrl1.h:#define IC1_WAKEUP HW_REGISTER_RW( 0x7e002834 )
18:09
<
bonda_000 >
IC1 thats interrupt controller 1?
18:10
<
clever >
IC1 is just the name of core1 in some places
18:10
<
clever >
there it is
18:10
<
clever >
line 76 puts the top of the stack into a global variable, line 78 sets the 2nd core loose, at the core2_start function
18:11
<
clever >
core2_start then loads that global variable into sp, and jumps to core2_entry
18:11
<
clever >
which then just starts counting like mad and never exits
18:11
Herc has left ##raspberrypi-internals [Leaving]
18:13
<
bonda_000 >
btest r0, 0x10
18:13
<
bonda_000 >
version r0
18:13
<
bonda_000 >
I thought thats the hardware identifier
18:13
<
clever >
one bit of the version register is the core id
18:14
<
clever >
so core0 and core1 have slightly different identifiers
18:14
<
bonda_000 >
mine read 0x04000140h
18:14
<
bonda_000 >
other bits are also useful?
18:14
<
clever >
undocumented
18:15
<
bonda_000 >
it often compares version to 0x10000
18:16
<
clever >
that might be the core bit
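`btest r0, 0x10` tests bit 16, and 0x10000 is 1 << 16, so both observations point at the same bit. A small sketch of reading it, under the unconfirmed assumption that bit 16 of the version register is the core id (the other bits are undocumented, as noted above); the macro and function names are hypothetical.

```c
#include <stdint.h>

#define VERSION_CORE_BIT 16u   /* assumption: the bit tested by btest r0, 0x10 */

/* Extract the presumed core id (0 or 1) from a version register value. */
unsigned vpu_core_id(uint32_t version)
{
    return (version >> VERSION_CORE_BIT) & 1u;
}
```

With the value 0x04000140 reported above, bit 16 is clear, which would be consistent with that read having been done on core0.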
18:29
<
bonda_000 >
so I guess vc4 has nothing like ARM's ldm r9, {r0-r7}
18:29
<
bonda_000 >
that would load 32 bytes from where r9 points and fill r0 through r7 with those bytes
18:30
<
bonda_000 >
all the ldms I see are stack-related
18:30
<
bonda_000 >
0000 0010 0bbm mmmm ldm rb-rm,(sp++) Load registers from stack (highest first).
18:30
<
bonda_000 >
0000 0011 0bbm mmmm ldm rb-rm,pc,(sp++) Load registers from stack and final value into pc.
18:30
<
bonda_000 >
0000 0011 1bbm mmmm stm rb-rm,lr,(--sp) Store lr followed by registers onto stack.
18:30
<
bonda_000 >
0000 0010 1bbm mmmm stm rb-rm,(--sp) Store registers to stack (lowest first).
18:30
<
bonda_000 >
- rb is r0, r6, r16, or r24 for bb == 00, 01, 10, 11.
18:30
<
bonda_000 >
- rm = (rb+m)&31
18:30
<
bonda_000 >
If sp is stored, then the value after the store is stored.
18:30
<
bonda_000 >
If mmmmm is 31 and pc/lr are stored/loaded, then no register
18:30
<
bonda_000 >
but pc/lr is stored/loaded ("stm lr/ldm pc"). The same
18:30
<
bonda_000 >
applies at least to "stm r24-r7, lr, (--sp)".
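The encoding notes above can be checked with a small decoder. This sketch handles only the two plain forms (`ldm rb-rm,(sp++)` and `stm rb-rm,(--sp)`), not the pc/lr variants, and assumes the bytes `23 02` from the earlier disassembly form the little-endian 16-bit word 0x0223. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct ldm_info {
    bool     is_store;   /* true for stm, false for ldm */
    unsigned first;      /* rb */
    unsigned last;       /* rm = (rb + m) & 31 */
    unsigned count;      /* number of registers transferred */
    unsigned bytes;      /* count * 4 */
};

/* Decode 0000 0010 0bbm mmmm (ldm) or 0000 0010 1bbm mmmm (stm).
 * Returns false for any other instruction word. */
bool decode_ldm_stm(uint16_t insn, struct ldm_info *out)
{
    static const unsigned base[4] = { 0, 6, 16, 24 };   /* rb for bb = 00..11 */

    if ((insn & 0xFF00) != 0x0200)
        return false;

    unsigned bb = (insn >> 5) & 0x3;
    unsigned m  = insn & 0x1F;

    out->is_store = (insn & 0x0080) != 0;
    out->first    = base[bb];
    out->last     = (base[bb] + m) & 31;
    out->count    = m + 1;
    out->bytes    = out->count * 4;
    return true;
}
```

For 0x0223 this gives bb = 01 and m = 3, so r6 through r9: four registers and 16 bytes, matching the count worked out in the conversation.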
18:30
<
clever >
just use normal ld several times
18:30
<
bonda_000 >
0000 010o oooo dddd ld rd, (sp+o*4) Load from memory relative to stack pointer.
18:30
<
bonda_000 >
0000 011o oooo dddd st rd, (sp+o*4) Store to memory relative to stack pointer.
18:30
<
bonda_000 >
0000 1ww0 ssss dddd ld<w> rd, (rs) Load from memory.
18:30
<
bonda_000 >
0000 1ww1 ssss dddd st<w> rd, (rs) Store to memory.
18:30
<
bonda_000 >
0001 0ooo oood dddd add rd, sp, o*4 rd = sp + o*4
18:30
<
clever >
also, you're flooding again
18:30
<
bonda_000 >
0001 1ccc cooo oooo b<cc> $+o*2 Branch on condition to target.
18:30
<
bonda_000 >
0010 uuuu ssss dddd ld rd, (rs+u*4) rd = *(rs + u*4)
18:31
<
bonda_000 >
0011 uuuu ssss dddd st rd, (rs+u*4) *(rs + u*4) = rd
18:31
<
clever >
that message took 30 seconds to go thru
18:31
<
bonda_000 >
you said it doesn't handle too much text
18:31
<
bonda_000 >
ld<w> rd, (rs)
18:32
<
clever >
you can just do `ld r0, (r1+4)` for example, i believe
18:32
<
bonda_000 >
the <w> I have no clue what that is
18:32
<
clever >
the width of the load
18:32
<
clever >
8/16/32 bits
18:32
<
bonda_000 >
ld8 ld16 or ld32?
18:33
<
clever >
ldb is byte, 8 bits
18:33
<
bonda_000 >
ldh is 16bits?
18:34
<
clever >
whenever i'm in doubt, i just ask gcc to compile something for me
18:34
<
bonda_000 >
I tried compiling helloworld from vc4-toolchain and objdump showed me pretty much nothing
18:35
<
bonda_000 >
on a Pi
18:35
<
clever >
what did it show?
18:39
<
clever >
and what was the input source? how did you make helloworld.o ?
18:46
<
bonda_000 >
do you have vc4-toolchain folder?
18:47
<
clever >
i just build things with nix, i have all of the needed tools in $PATH
18:47
<
bonda_000 >
vc4-toolchain has helloworld.c
18:47
<
bonda_000 >
I compiled that
18:47
<
bonda_000 >
with vc4-elf-gcc
18:48
<
bonda_000 >
although not sure how if printf is a part of libc which you said vc4-toolchain doesn't have
18:49
<
clever >
try just `vc4-elf-objdump -dr helloworld.o`
18:53
<
clever >
yep, now it works fine
18:53
<
bonda_000 >
but you said
18:53
<
clever >
line 9, it pushes the link register to the stack
18:53
<
bonda_000 >
it doesnt have libc
18:53
<
clever >
line 10, it loads the addr of the string in .rodata
18:53
<
bonda_000 >
so how does it know about printf
18:53
<
clever >
vc4-toolchain includes newlib, which is a libc that has partial vc4 support
18:53
<
clever >
but its a mess, and ive stopped using it
18:54
<
clever >
LK includes its own libc
18:54
<
clever >
line 11 of the gist, tells the linker to shove in a 32bit addr of the string from .rodata, so that 0x0 isnt the truth
18:55
<
clever >
line 12 is a branch&link to a function
18:55
<
clever >
line 13 says to fill in the address of puts, the compiler got sneaky, and realized you're not using any printf features
18:55
<
clever >
line 14 is setting the return value to 0, and 15 returns
18:56
<
clever >
adjust helloworld.c to do things, like loading a 16bit value from memory, compile again, and gcc will answer your questions
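A tiny test file in that spirit might look like the sketch below; the file name `loads.c` and the function names are hypothetical. Compiling it with `vc4-elf-gcc -O2 -c loads.c -o loads.o` and running `vc4-elf-objdump -dr loads.o` should show which load mnemonic the compiler picks for each width.

```c
#include <stdint.h>

/* volatile keeps the compiler from optimizing the memory access away */

uint8_t load8(const volatile uint8_t *p)
{
    return *p;      /* 8-bit load */
}

uint16_t load16(const volatile uint16_t *p)
{
    return *p;      /* 16-bit load */
}

uint32_t load32(const volatile uint32_t *p)
{
    return *p;      /* 32-bit load */
}
```

Keeping each access in its own small function makes the disassembly easy to match back to the source line that produced it.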
19:01
<
bonda_000 >
okay I will try
19:02
<
bonda_000 >
and what about the supervisor call? do you use it in your code?
19:02
<
bonda_000 >
minix tells me it uses user mode and supervisor mode Idk if I really need that since I'm the only user and supervisor
19:03
<
bonda_000 >
okay so you just go with whatever state the vpu is after the initialization?
19:03
<
clever >
but it does have user and supervisor
19:03
<
clever >
ive just not investigated the details of it
19:12
<
bonda_000 >
something weird
19:13
<
clever >
you want to use `vc4-elf-gcc -c helloworld.c -o helloworld.o`
19:13
<
clever >
without -c, it will try to link a complete binary, and then it cant find things
19:56
<
bonda_000 >
if you try to compile it
19:56
<
bonda_000 >
do you see it's not storing lr
19:56
<
bonda_000 >
if there is an exception the original lr is lost
19:57
<
bonda_000 >
here it tells me:
19:58
<
clever >
bonda_000: leaf functions (those not calling another function) dont have to save the lr, because nothing modifies the lr
19:58
<
clever >
arm does the same thing
19:58
<
clever >
thats the same gist as before with no changes
19:59
<
bonda_000 >
the __user_copy_msg_pointer_failure
20:00
<
bonda_000 >
so if one of the pointers is bad, the exception handler will send me to __user_copy_msg_pointer_failure and I won't be going back to copy_msg_to/from_user
20:00
<
clever >
ive not looked into getting exceptions working on the VPU
20:00
<
clever >
so i dont know what is missing there
20:01
<
bonda_000 >
I mean it won't hurt if I push the lr on the stack
20:02
<
clever >
depends on what you're doing,
20:02
<
clever >
youll need to understand the context better
20:07
<
bonda_000 >
I don't think its good that it compiled this way
20:07
<
bonda_000 >
without saving the lr
20:08
<
bonda_000 >
it saw there is no further function calls
20:08
<
bonda_000 >
and figured it's not necessary to save the lr
20:09
<
bonda_000 >
also the memory map
20:09
<
bonda_000 >
VPU has no MMU but that doesn't mean we can't do it in software right?
20:17
<
clever >
bonda_000: you would need to replace every load and store opcode with a function call
20:17
<
clever >
which would require re-writing all asm, and major overheads
20:18
<
clever >
at that point, you're basically making an emulator
20:18
<
bonda_000 >
wish we knew how big these "SDRAM" partitions are in the alias table
20:19
<
bonda_000 >
and whats that other unnamed rectangle
20:19
<
bonda_000 >
in each of the four aliases
20:19
<
bonda_000 >
in blue
20:19
<
clever >
bonda_000: the sdram partition is up to 1gig in size, it depends on how big the ram is
20:20
<
bonda_000 >
say, 1GB
20:20
<
clever >
let me draw up a diagram....
20:21
<
clever >
read this while i draw one up...
20:27
<
clever >
basically, any access first goes thru that overlay layer, if its in one of those 3 windows, its a hit, and it does the listed thing
20:27
<
clever >
if its not in any of those 3 windows, it falls thru to the base layer
20:27
<
clever >
all 4 aliases in the base layer, refer to the same 1gig of ram
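One way to picture the base layer: assuming the four aliases are 1 GiB each and selected by the top two bits of the 32-bit bus address, stripping those bits gives the offset into the shared RAM. The sketch ignores the overlay windows mentioned above, and the function names are hypothetical.

```c
#include <stdint.h>

#define ALIAS_SHIFT       30u           /* assumption: top 2 bits pick the alias */
#define ALIAS_OFFSET_MASK 0x3FFFFFFFu   /* low 30 bits: offset within 1 GiB */

/* Which of the four aliases (0..3) a bus address falls in. */
unsigned bus_alias(uint32_t bus_addr)
{
    return bus_addr >> ALIAS_SHIFT;
}

/* Offset into RAM; identical for the same location seen through any alias. */
uint32_t bus_to_ram_offset(uint32_t bus_addr)
{
    return bus_addr & ALIAS_OFFSET_MASK;
}
```

Under that assumption, 0x00001000 and 0xC0001000 name the same RAM location through two different aliases.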
20:30
<
bonda_000 >
pvVar1 = (void *)rtos_malloc_priority(__size,0x20,1,unaff_lr | 0x80000000);
20:30
<
bonda_000 >
is what they do in vcos
20:31
<
clever >
thats just tagging the allocation with a return addr, so you can know what function to blame for the heap usage
20:31
<
clever >
its just a performance debug thing
20:32
<
bonda_000 >
well but with binary loading thing, kernel is gonna malloc memory for them to operate on
20:32
<
bonda_000 >
isn't that how it's done on the low level of OS
20:33
<
clever >
yeah, but lr doesnt have anything to do with that
20:33
<
bonda_000 >
and from the program side of view that memory should look contiguous
20:33
<
bonda_000 >
I understand
20:34
<
clever >
thats only possible if you have an mmu
20:34
<
bonda_000 >
it usually saves the lr though in the decompile. I think the program I wrote is just too trivial so no nested function calls it didn't bother saving the lr
20:36
<
bonda_000 >
but why do you call a software MMU an emulator
20:37
<
bonda_000 >
all it needs is add some arithmetic to each ld/st
20:37
<
clever >
the hardware doesnt support an MMU of any form
20:37
<
clever >
so you need to intercept every load/store
20:37
<
clever >
and there is no way to intercept just load/store
20:37
<
clever >
so you have to pass every single opcode thru software
20:37
<
clever >
and thats what an emulator does
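To make the cost concrete, here is a sketch of what a single "software MMU" load could look like, with RAM simulated as a byte array. The base-plus-size region model and all names are hypothetical. The arithmetic itself is small; the catch described above is that nothing in the hardware forces a program's own `ld`/`st` opcodes to go through a function like this.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* A process sees addresses 0..size-1, mapped to base..base+size-1 in RAM. */
struct region {
    uint32_t base;
    uint32_t size;
};

/* Software replacement for one 32-bit load: bounds check, relocate, read.
 * Returns false if the access falls outside the process's region. */
bool checked_load32(const struct region *r, const uint8_t *ram,
                    uint32_t vaddr, uint32_t *out)
{
    if (r->size < 4 || vaddr > r->size - 4)
        return false;                       /* would read past the region */
    memcpy(out, ram + r->base + vaddr, sizeof *out);
    return true;
}
```

Every load and store in every program would have to be rewritten into a call of this kind, which is the overhead and the emulator-like structure being described.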
22:03
bonda_000 has quit [Quit: Leaving]