IRC channel logs
2024-04-07.log
<azert>solid_black: as a real dumbass I just tried to boot your kernel on an arm64 board just to see what would happen. Of course it didn't work, but would you like to check my steps to see if there is anything obviously wrong?
<azert>solid_black: when you are available, I will have a few questions
<gnucode>sneek: later tell azert did you try to boot an AArch64 Mach on an AMD processor? That's daring my friend!
<sneek>azert, gnucode says: did you try to boot an AArch64 Mach on an AMD processor? That's daring my friend!
<azert>gnucode: it’s an Allwinner A64. Basically an Arm Cortex-A53
<anatoly>azert have you prepped device tree, I'd guess this is a requirement. I also wonder if it's possible to re-use device trees from linux?
<azert>anatoly: what I did is to recompile u-boot myself and use the device tree for my board that the u-boot source tree compiles during the process
<azert>u-boot alone is a singular beast of a boot loader, I don’t think we should look into Linux as much as u-boot
<azert>I was thinking of driving some leds from the gpio for debugging, or perhaps using memory to store info that will persist after the cpu reset
<azert>I cannot envision any other ways at this point
<azert>I’ve some notes on what I did, please tell me if you are interested in me sharing them. I’m not making them public because nothing works so far
<azert>My intuition is that the fancy el2 and el3 should come in handy for debugging
<azert>But I cannot find anything useful on the net
<azert>Maybe it’s just Google that fell in the shitter and me refusing to use the other alternative
<azert>There should be a standard way to do kernel debugging on arm from the higher execution levels, eg attaching some sort of gdb or at least proprietary debugger up there
<anatoly>azeem: I don't have an unused aarch64 board but I have another board with armv7
<azert>anatoly: did you ever do debugging with that?
<anatoly>azeem: can you please tell me more about u-boot and device trees? How does it end up in mach?
<azert>I just followed solid_black’s indications
<azert>You boot gnumach exactly like Linux
<azert>I’m afraid arm is playing some dark cabal game with the debugger, keeping basic info secret
<azert>I’ll just go with leds. In theory once the IO layer is initialized one can proceed with the in-kernel debugger
<anatoly>azert: won't it look like some in-kernel support for debugging (kdbg) then jtag and then gdb on the other side
<azert>anatoly: for that you need the IO to be working
<azert>It would be obvious that Arm had some lower level debuggers available since they force you into a useless firmware already, but they don’t
<azert>Even Linux allows you to kgdb only after the IO layer is initialized
<azert>When I buy a PIC processor I get an assembler and a debugger for free, that’s the bare minimum you would expect from an embedded processor
<azert>It has been like that from at least the late 90s
<azert>i know that these cortex-a were meant to run phones but then why even sell dev boards at all?
<azert>Sorry for the rants, but even grub pales as « simple » if compared to u-boot. U-boot is a full os that doesn’t even have the decency to include a tool like MS-DOS debug to help you
<azert>Ok, I figured out that the a64 has a jtag interface, but it has been hijacked by the board developer for things like the power button
<sneek>Welcome back solid_black, you have 1 message!
<sneek>solid_black, gnucode says: that I am about to submit a qoth for Q1 of 2024. I would like to mention for Alpine Hurd distribution. Do you have a git repo somewhere?
<solid_black>this is really cool that you're experimenting with it
<azert>Thing is that I’m struggling in figuring out how to debug
<solid_black>so the thing that is "obviously wrong" is that you're setting size cells / address cells to 2, and then only passing a single cell
<solid_black>however, looks like it crashes before it gets to parse that
<azert>Do you know where it crashes?
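The cell-count mismatch solid_black points out above can be illustrated with a small parser. This is a minimal sketch, not gnumach or u-boot code: `parse_reg` is a hypothetical helper, and the addresses and sizes in the demo are made-up values, not taken from azert's device tree. With #address-cells and #size-cells both set to 2, each entry needs four 32-bit big-endian cells, so a property holding a single cell cannot be decoded.

```python
import struct


def parse_reg(prop, address_cells, size_cells):
    """Split a raw `reg`-style property into (address, size) pairs.

    Hypothetical helper for illustration only. Cells are 32-bit
    big-endian words, as stored in a flattened device tree blob.
    """
    entry_cells = address_cells + size_cells
    if entry_cells <= 0:
        raise ValueError("address_cells + size_cells must be positive")
    n_cells, leftover = divmod(len(prop), 4)
    if leftover or n_cells % entry_cells:
        raise ValueError(
            f"{len(prop)} bytes is not a whole number of "
            f"{entry_cells}-cell entries"
        )
    cells = struct.unpack(f">{n_cells}I", prop)

    def join(chunk):
        # Concatenate consecutive 32-bit cells into one integer.
        value = 0
        for cell in chunk:
            value = (value << 32) | cell
        return value

    entries = []
    for i in range(0, n_cells, entry_cells):
        address = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + entry_cells])
        entries.append((address, size))
    return entries


if __name__ == "__main__":
    # Made-up example values: two address cells, then two size cells.
    good = struct.pack(">4I", 0x0, 0x40000000, 0x0, 0x20000000)
    print(parse_reg(good, 2, 2))

    # A single cell is too short when both counts are 2.
    short = struct.pack(">I", 0x40000000)
    try:
        parse_reg(short, 2, 2)
    except ValueError as err:
        print("rejected:", err)
```

Rejecting a short property outright, rather than padding it, is a design choice of this sketch; a real parser may behave differently.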
<solid_black>your u-boot says it moved the image from 0x42000000 to 0x40000000
<azert>Yes it does move stuff for no reason
<azert>I think I can, it’s your branch latest commit btw
<solid_black>it's not, let me push the latest & greatest ~~bugs~~ fixes and performance improvements
<azert>Ok I’ll try your latest and come back to you
<solid_black>I don't know how one would debug on bare metal, something something JTAG but I have no idea what it is or does
<solid_black>we're not going to have Linux-like kgdb, at least for the foreseeable future
<solid_black>but that's not going to help you if it crashes early
<azert>I know, probably leds can help
<solid_black>also one possible reason for why it would crash: it tries to write things to UART, and UART is not yet wired up super properly
<solid_black>it will read the base address from the device tree, but who knows whether that worked in your case
<azert48>If it crashed at that point, I’d be super happy
<solid_black>so, 0x84 is "adr x1, .boot_stack_end" in my build, it shouldn't be anything different in yours since we're still in pure asm at this point
<solid_black>any idea whether your board gets entered at EL{1,2,3}?
<azert48>How can loading a register fault with this?
<solid_black>no, what happens is we eret with an apparently invalid execution state indicated in spsr, and doing that works in a funny way
<solid_black>it does eret, but keeps the execution state / el as is, and sets the IL flag in PSTATE
<solid_black>so the very next instruction traps with ESR_EC(esr) = ESR_EC_IL
<azert48>That would mean it didn’t start in el1
<solid_black>i.e. you can set IL for yourself with thread_set_state
<solid_black>and I have a special-purpose exception code just for this
<solid_black>so what we have is 1. gnumach doesn't think it's entered at el1
<solid_black>let's try to remember / figure out what that ((0x7 << 6) | (5 << 0)) value stands for
<solid_black>I think I stole it from some online guide back when I was writing that part
<solid_black>but now that I have a much better understanding (& definitions) of all the SPSR bits, we should be able to break it down
<solid_black>look at aarch64/aarch64/pcb.c for the bit definitions
<solid_black>it would be a good idea to mask A (& D?) too at this point
<solid_black>it could be some stupid "secure"/"non-secure" thing, does your hardware have it?
<solid_black>it also could be that RW in either SCR_EL3 or HCR_EL2 indicates AArch32 for EL1
<azert>But I’m not sure you can set breakpoints
<solid_black>can you please see the values of SCR_EL3 and HCR_EL2?
<solid_black>"ATF will then drop into U-Boot proper (in EL2)" => sounds like u-boot and gnumach are entered at EL2
<azert>Yes I can do this, it will take me some hours or days
<solid_black>I'm planning to teach Mach to be able to run in EL2 as well, for virtualization
<solid_black>anatoly: I could tell you more about device trees if you're interested :)
<biblio>solid_black: still no new update yet.
<solid_black>azert: well, I'm dumb, we have the full register dump, so we can see what EL that was in by looking at x4
<solid_black>x5 is 0x40000084 (where we eret'ed) and x6 is 0x1c5 (what we tried to set as CPSR), as expected
<solid_black>so this must be HCR_EL2.RW being 0, which might just be the reset value
<solid_black>see, we got useful info even from the little information that we had :)
<anatoly>solid_black: not doing :-D haven't spent much time on it
<solid_black>well, and I haven't spent much time on the alpine thing, so I guess we're even :D
<anatoly>solid_black: does u-boot read dtb file?
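The breakdown of ((0x7 << 6) | (5 << 0)) that solid_black proposes above can be done mechanically. This is a minimal sketch using the architectural SPSR_ELx bit positions for an AArch64 target state (M[3:0] in bits 3:0, M[4] in bit 4, F/I/A/D in bits 6 to 9). The constant and function names are illustrative and are not the definitions found in aarch64/aarch64/pcb.c.

```python
# Architectural SPSR_ELx fields for a return to AArch64 state.
SPSR_M_MASK = 0xF          # M[3:0]: target exception level and stack pointer
SPSR_M_AARCH32 = 1 << 4    # M[4]: 0 = AArch64, 1 = AArch32
SPSR_F = 1 << 6            # FIQ mask
SPSR_I = 1 << 7            # IRQ mask
SPSR_A = 1 << 8            # SError mask
SPSR_D = 1 << 9            # debug exception mask

# Valid AArch64 encodings of M[3:0]; "t" uses SP_EL0, "h" uses SP_ELx.
MODES = {
    0b0000: "EL0t",
    0b0100: "EL1t",
    0b0101: "EL1h",
    0b1000: "EL2t",
    0b1001: "EL2h",
    0b1100: "EL3t",
    0b1101: "EL3h",
}


def decode_spsr(value):
    """Return the target mode and the exception masks set in an SPSR value."""
    flags = (("D", SPSR_D), ("A", SPSR_A), ("I", SPSR_I), ("F", SPSR_F))
    return {
        "mode": MODES.get(value & SPSR_M_MASK, "reserved"),
        "aarch32": bool(value & SPSR_M_AARCH32),
        "masked": [name for name, bit in flags if value & bit],
    }


if __name__ == "__main__":
    value = (0x7 << 6) | (5 << 0)
    print(hex(value), decode_spsr(value))
```

Under this decoding, the value is 0x1c5 (matching x6 in the register dump) and requests EL1h in AArch64 state with A, I and F masked and D left unmasked.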
<anatoly>and then passes it to linux, for example
<solid_black>though I'm not sure if it uses it for much, since you configure u-boot for a particular board at build time
<solid_black>yes, it loads it, it can print / modify nodes, and eventually it passes the tree on to the kernel that it boots
<anatoly>I'm now remembering that dtb files are under u-boot boot directory in armbian. I was "patching" board's dts to change mode of otg usb port
<anatoly>solid_black: re. "alpine" hurd, have you done more changes locally?
<solid_black>I don't remember if I pushed this, but they have done some changes to drop more libc abstractions
<solid_black>basically to hardcode musl in more places instead of "libc"
<solid_black>but that's a couple of months old by this point anyway, would need another rebase
<anatoly>I need to finish my little shell script and push stuff as well
<anatoly>that script will produce disk image within container, so basically run build container, then run script in container and you'll get an image to play with in qemu
<solid_black>another important TODO item that I started hacking on back then is netdde
<solid_black>I should push my changes, and then it'd be great if you could look into netdde
<solid_black>if I had 100x more time, I'd make us an awesome new dde, one that'd be Hurd-native from the start and would run modern linux drivers and not some ancient broken code
<anatoly>also would damo's feature for xattrs help to build an image for qemu for example?
<solid_black>you'd have to write a separate script to set static translator records with xattrs from Linux
<solid_black>I wanted to run my ideas/understanding for how virtualization would work by you
<anatoly>as I understood doing it the current way depends on hurd-specifics in ext2 which are not available on non-hurd environments but using xattrs solves this issue
<youpi>(I advise not to try to stuff the linux network stack in a translator, linux is way too volatile for a stable approach, see how netdde was supposed to be maintained, but nobody took the time to, while the bsd drivers should be really little problem)
<anatoly>solid_black: isn't it rump-stuff replacing netdde in future?
<youpi>rump seems a simpler way to get something now
<solid_black>youpi: so, I've been reading about VHE, which is an arm v8.1 feature that makes it possible to run general-purpose kernels (and not special hypervisors) in EL2
<youpi>what is EL2? something like the intel rings?
<solid_black>EL2 is hypervisor, or, with VHE, the host kernel, EL1 is guest kernel then
<youpi>ok, so sort of hypervisor level
<solid_black>you'd use all the existing vm_ APIs to manipulate what the VM sees as its physical memory
<solid_black>but it can run into exceptions, and Mach will send them off to the exception port, as usual
<youpi>how is there a difference between the syscalls from guest kernel and from guest guest
<solid_black>QEMU, or whatever hypervisor, would catch the exceptions, and do whatever it wants
<solid_black>guest guest's syscalls go to guest's kernel, that traps to EL1, not EL2
<solid_black>but when the kernel tries to write to an MMIO address to output something via UART for example,
<solid_black>well, the hypothetical QEMU port would ensure that there is no VM mapping at that "physical" address
<solid_black>so the VM task will get EXC_BAD_ACCESS => QEMU catches that, implements the UART write
<solid_black>anatoly: I looked into virtio and decided it's too complicated to be implemented in Mach, let's leave that to userland
<solid_black>oh, the guest kernel can do "hypercalls" (HVC instruction)
<solid_black>but Mach wouldn't really implement any semantic for them
<solid_black>it will just make them into exceptions and let qemu do whatever it wants to do with them
<solid_black>from what I've seen so far, PSCI is available via HVC
<solid_black>PSCI includes stuff like "start that CPU", "stop that CPU", "shut down the whole system"
<solid_black>starting/stopping vCPUs is of course just thread_resume() / thread_suspend() that qemu would do
<solid_black>this all makes sense to me nicely, and I've read that newer nanokernels do things that way
<solid_black>because a nanokernel and a hypervisor is basically one and the same, and that makes perfect sense now too
<solid_black>does Intel vt-x also let you implement something like this?
<youpi>though quite involved since the x86 semantic is stuffy
<solid_black>the big difference between this plan and KVM / Hypervisor.framework is with them, the guest runs inside your thread synchronously IIUC
<solid_black>i.e. you do kvm_run(), and it runs in your thread until a "vm exit" happens
<youpi>isn't that what you propose too?
<solid_black>whereas in this model, you make a separate task and a thread within it, and resume that
<solid_black>and a "VM exit" is a Mach exception that's delivered to you
<solid_black>you can block on incoming exceptions from the VM of course
<youpi>I don't know how userland kvm-qemu tells the kernel about the guest pagetable etc.
<youpi>but creating a task looks a very sane way to do it
<solid_black>i.e. they have a special-purpose API to map things into the guest's address space
<solid_black>and a special-purpose API (that is not ptrace) to access vCPU registers
<solid_black>whereas we'd just have a task with a vm_map, and threads with PCBs, with all the usual APIs
<solid_black>and not just user-level APIs, that'd literally be the implementation
<solid_black>all the things like paging out/in VM's "physical" memory will work in the natural way too
<solid_black>and to load a blob (a kernel, a dtb, ...) into guest's memory, you'd just vm_map it there from a file on the host
<solid_black>that lets you make it look to the guest kernel in EL1 that it's running in EL2
<solid_black>but supporting that requires a bunch of tricky code, so I'm not implementing that any time soon
<solid_black>we drop down to EL1 (enabling AArch64 for EL1 first, that was the bug), and keep booting there
<solid_black>I've also written the experimental VHE / E2H code path
<solid_black>we start booting in EL2 w/ E2H, but die on user memory access
<pavlx>Is it possible to see a picture of Gnu/hurd that is running on the computer?
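The fix solid_black describes above (enabling AArch64 for EL1 before dropping down from EL2) comes down to a single register bit. This is a minimal sketch of the bits involved, using the architectural positions of HCR_EL2.RW (bit 31), HCR_EL2.E2H (bit 34) and SCR_EL3.RW (bit 10). The helper names are hypothetical, and the log does not show what gnumach actually writes to these registers.

```python
# Architectural bit positions; helper names below are illustrative only.
HCR_EL2_RW = 1 << 31   # 1: EL1 executes in AArch64 state
HCR_EL2_E2H = 1 << 34  # 1: VHE, a host kernel runs at EL2
SCR_EL3_RW = 1 << 10   # 1: the next lower exception level is AArch64


def el1_is_aarch64(hcr_el2):
    """True if an eret from EL2 to EL1 may target AArch64 state."""
    return bool(hcr_el2 & HCR_EL2_RW)


def prepare_el1_drop(hcr_el2):
    """Return the HCR_EL2 value with RW set, leaving other bits untouched."""
    return hcr_el2 | HCR_EL2_RW


def uses_vhe(hcr_el2):
    """True if the E2H bit is set, i.e. the VHE code path is selected."""
    return bool(hcr_el2 & HCR_EL2_E2H)


if __name__ == "__main__":
    hcr = 0  # assumed starting value for the demo: RW clear, as diagnosed
    print("EL1 AArch64 before:", el1_is_aarch64(hcr))
    hcr = prepare_el1_drop(hcr)
    print("EL1 AArch64 after: ", el1_is_aarch64(hcr), hex(hcr))
```

With RW clear, an eret that requests AArch64 EL1 in SPSR names an execution state that EL1 is not configured for, which is consistent with the illegal-return behaviour discussed earlier in the log.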
<solid_black>"When the value of this PAN state bit is 1, any privileged data access from EL1, or EL2 when HCR_EL2.E2H is 1, to a virtual memory address that is accessible to data accesses at EL0, generates a Permission fault"
<solid_black>that sounds like accessing EL1-accessible memory from EL2 doesn't cause a PAN fault
<solid_black>as in, all the way, with a Hurd starting and a Unix PID 1 running as expected in userland
<solid_black>azert: the last commit may break things again for you, please try reverting it if it does
<azert>solid_black: I think your plan for virtualization is super sweet
<azert>I’ll try your fixes soon and tell you if I need to revert the last commit and where we arrive in both cases
<azert>solid_black: tried, in both cases the system reboots without any output
<azert>hinting to the fact that probably your exception handlers have been installed over the ones of u-boot
<azert>which I interpret as good news
<azert>although I was hoping for the same output..
<azert>could you tell me where the exception handlers are installed in the source?
<solid_black>azert: the exception handlers are installed by aarch64/aarch64/locore.S:load_exception_vector_table()
<solid_black>which is called from aarch64/aarch64/model_dep.c:c_boot_entry()
<azert>thanks, i'll try to skip that
<solid_black>which would mean that we succeed in enabling the MMU
<solid_black>the most likely reason for you not seeing any output is that gnumach doesn't pick up your uart
<azert>no I think it's another uart
<azert>it's not described as a pl011 in the device tree
<solid_black>and when you try to do that, you'll discover that we don't have a nice story about dynamically dispatching to the right uart
<azert>problem is that i'm not sure we arrive there, I was hoping that if I disable the interrupt handlers I'll discover if it dies accessing it
<solid_black>if you do know your uart's base address, you can insert debug prints here and there
<solid_black>because right now it's in the "my code is working, but I've no idea why" state
<azert>I'm sure it's not trivial tech
<solid_black>whereas for the last month or so, I did have the feeling that I know exactly what each individual instruction does
<azert>did you check if booting in el1 still works?
<solid_black>so yeah, point is, I need to look more into VHE / E2H
<solid_black>no reason for it not to work, but sure, let me check
<pavlx>Good evening, i go to take my dinner here in Italy, have a good day at all
<azert>solid_black: it arrives to somewhere in boot_script_exec and dies there
<azert>I think it is useless to debug the way I'm doing further, since I'd say everything works
<azert>I'd like to port/implement the serial driver, where should it be plugged in? where is it called?
<azert>is it in device_service_create ?