IRC channel logs
2024-03-07.log
<azert>What is the design plan in regard to the Device Tree? Will there be a DT server as there is an ACPI server? Will the two merge?
<damo22>azert: a simple tree parser probably needs to live in gnumach so the irqs can be parsed
<azert>damo22: Hi! ok but then the info will need to be exposed to user space drivers somehow, right?
<azert>The current acpi server has acpi_get_pci_irq
<azert>If you have pci without acpi there is probably another way to get that info
<damo22>it falls back to reading the pci_interrupt field of the pci space
<damo22>i don't know how pci works on arm
<youpi>damo22: why in gnumach? can't the userland drivers do it?
<azert>If I understand correctly, most devices on arm boards are not even connected to a pci bus
<biblio>damo22: I am trying to understand smp - I am running with smp 2. But rump kernel is getting stuck in - https://paste.debian.net/1309707/ (SMBus serial bus, revision 0x02) at pci0 dev 31 function 3 not configured.
<damo22>youpi: the irq layout needs to be parsed from the DT
<damo22>so that mach can provide the irq device
<youpi>ok but can't that happen in userland?
<youpi>I mean, userland can tell mach what to do
<damo22>uh, that would mean the first device cannot be used until the device tree server starts
<azert>I also think userland would be the right place to parse a configuration file
<biblio>damo22: is it possible to run smp without rump ?
<youpi>ah, right, we don't know where console etc. live
<youpi>well, you can, but you won't have a disk :)
<biblio>youpi: ok. got it. Yep, I faced that issue already.
<damo22>2 might be problematic because it's isolated to 1 cpu in each set
<youpi>damo22: why would that be a problem?
<azert>youpi: I suspect even the memory layout is drafted in the DT
<azert>It's not a suspicion, I've read about it
<biblio>damo22: at least some progress with -smp 4. getting "Give root password for maintenance..."
<biblio>damo22: it worked :) I had to edit /etc/fstab also for wd0.
<biblio>damo22: FYI. I am using latest gnumach master branch.
<damo22>biblio: so you have everything running on cpu0, and cpu1-3 are sitting idle waiting for someone to use them
<biblio>damo22: cat /proc/hostinfo showing "avail_cpus = 4 /* number of cpus now available */"
<damo22>you need to assign a task to the slave processor set
<biblio>damo22: fatal error: version.h: No such file or directory from $ gcc -o smp smp.c
<biblio>damo22: stress: info: [803] dispatching hogs: 7 cpu, 0 io, 0 vm, 0 hdd
<damo22>well -smp 4 will not like 7 hogs
<biblio>damo22: so far 3 processes at 100% each approx
<damo22>so the shell you spawned with ./smp /bin/bash has the 3 other cpus attached
<damo22>so you can use it to test things with > 1 cpu
<biblio>damo22: yes. I am checking output of top
<biblio>damo22: which things are pending now. so far it is working with smp 4
<azert>Maybe Mach doesn't really need a console in kernel space at all: it could implement a ring buffer a la Linux dmesg for debugging
<damo22>biblio: sometimes there are crashes or hangs
<biblio>damo22: only with smp 8 i faced a panic in the debugger
<azert>Did you try to see if compilation with make -j 3 runs faster?
<damo22>biblio: i tried compiling gnumach in parallel but it sometimes hangs or crashes and then the disk gets messed up
<biblio>damo22: ok that i can try to debug. Before I could not even boot with smp. Now, at least I can test
<biblio>azert: I have only gnumach inside this vm. now let me try
<biblio>I am getting "configure: error: C compiler cannot create executables" inside smp /bin/bash. I will check later again.
<azert>biblio: maybe you need to install build-essential
<biblio>azert: it is installed but after "/path/to/smp /bin/bash" it could not find a c compiler. I will check later.
<solid_black>the plan is to have basic support for it in gnumach, and then expose it to userland to handle the rest
<solid_black>we need to handle device trees in gnumach, because that is how we learn anything at all about the system we're running on
<solid_black>as I understand it (mainly from youpi's and damo22's explanations), on x86 (or rather "PC", what Mach calls "i386at") there are fixed, known things that you can just assume
<solid_black>like physical memory being available starting from address 0x0, or a specific interrupt number having a fixed meaning (say, keyboard or timer)
<solid_black>none of this is true in the arm world, platforms/boards wildly differ
<solid_black>we don't know where the physical memory is, we don't know what interrupt controllers there are, we don't know what interrupt numbers mean to different devices
<solid_black>the only thing we have is the device tree, passed by the bootloader
<solid_black>and by reading the device tree, we learn of physical memory, interrupt controllers & interrupt numbers, timers, UARTs, etc
<solid_black>my current plan is to have drivers in Mach for basically those few devices I just listed -- timers, interrupt controllers, and UARTs, because that's what we need to run userland code (UART, to have an early console)
<solid_black>and a lot of this is already done, certainly the device tree parser (which is a bit fun, because it needs to work prior to there being a kernel heap, so it must not allocate anything)
<solid_black>and we're already using this (info read from the device tree) to initialize memory & UART & the interrupt controller (a single GICv2, so far)
<solid_black>nothing is statically hardcoded, or at least that's the intention
<solid_black>and for example, while the initial development is happening against qemu's "virt" machine, reading things from the device tree rather than hardcoding them is what allows Mach to start booting on (qemu's emulated) raspberry pi
<solid_black>as for how specifically we'd export the device tree to userland: I was thinking we'd just make it into a Mach device, so you could device_open("dtb") and then device_map()/vm_map() it into your memory
<solid_black>but one issue with that is while qemu's bootloader page-aligns the dtb, this is not a requirement
<solid_black>meta note: if you're interested in this, or if you have any experience with baremetal arm, please do talk to me about this! preferably, over email (cc'ing bug-hurd)
<damo22>solid_black: maybe device_open('dtb') can be done inside mach to access the devicetree
<damo22>or at least the same code that parses it
<solid_black>it's not really a device, it's just a blob of memory, I was just thinking of a Mach device as an API to let userland mmap it
<solid_black>we don't need to additionally mmap it inside Mach, it's already in our memory
<damo22>is there a way to expose it as a hierarchical json object?
<solid_black>we'd probably want to expose it as a filesystem hierarchy (like Linux does), but that'd be a userland translator
<damo22>i mean, you need to parse it twice
<solid_black>we want to export the raw unparsed form from Mach to userland, just the DTB blob
<damo22>once in gnumach to get the uart and irqs
<solid_black>yes, actually more than twice probably, each driver would parse it
<damo22>it seems wasteful to write two parsers
<solid_black>it's not that bad though, it's not more complex than JSON
<solid_black>and the in-Mach parser has constraints on it as I've just said, it cannot use dynamic allocations to represent nodes for example
<solid_black>so userland could use something a lot more convenient
<solid_black>and we could try to use the existing dtb library in userland, too
<damo22>i mean, we already do MADT and HPET parsing in gnumach and then ACPICA in userland
<damo22>but that is because putting entire ACPICA into gnumach is dumb
<damo22>but this dtb thing is just a json object
<damo22>can we make a mach device that exposes a blob of json?
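[editor's note: the allocation-free in-kernel parsing described above can be illustrated with a minimal sketch. This is not the actual gnumach code; it only shows how the flattened device tree (DTB) can be walked in place, per the devicetree blob format (big-endian fields, magic 0xd00dfeed, structure-block tokens). Function names here are invented for illustration.]

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define FDT_MAGIC      0xd00dfeedu
#define FDT_BEGIN_NODE 0x1u
#define FDT_END_NODE   0x2u
#define FDT_PROP       0x3u
#define FDT_NOP        0x4u
#define FDT_END        0x9u

/* All DTB fields are big-endian 32-bit words. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walk the structure block without allocating anything: every
   name and property is read in place from the blob.  Returns -1
   on a malformed blob, else the number of nodes seen. */
static int fdt_count_nodes(const uint8_t *blob)
{
    if (be32(blob + 0) != FDT_MAGIC)
        return -1;
    uint32_t off = be32(blob + 8);      /* off_dt_struct header field */
    int nodes = 0;
    for (;;) {
        uint32_t tok = be32(blob + off);
        off += 4;
        switch (tok) {
        case FDT_BEGIN_NODE: {
            /* node name: NUL-terminated, padded to 4 bytes */
            size_t len = strlen((const char *)(blob + off)) + 1;
            off += (len + 3) & ~3u;
            nodes++;
            break;
        }
        case FDT_PROP: {
            uint32_t plen = be32(blob + off);   /* value length */
            off += 8;                           /* len + nameoff words */
            off += (plen + 3) & ~3u;            /* value, padded */
            break;
        }
        case FDT_END_NODE:
        case FDT_NOP:
            break;
        case FDT_END:
            return nodes;
        default:
            return -1;
        }
    }
}
```

Because the walk keeps only an offset, it needs no heap, which is the constraint solid_black mentions for code running before the kernel heap exists.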
<solid_black>but what would be the advantage of converting it to json, and then parsing json in userland, over just exporting the dtb as-is and parsing that?
<damo22>e.g. rust has a json parser built in
<solid_black>not built-in, you have to add a crate from crates.io
<damo22>ok looks like you already wrote something
<solid_black>I mean it, gnumach on aarch64 exists and boots and works
<solid_black>the infamous 80% of the work that takes 20% of the time is done, now we just have to deal with the remaining 20%, and progress is a lot slower
<damo22>can you try to upstream your work
<solid_black>and figuring out a way to manage interrupts that is not "virt" board specific is an important TODO
<solid_black>there certainly are non-embedded arm systems, laptops, high-end servers, what have you
<solid_black>I thought it was a point of pride that they supported every single architecture/platform under the sun
<damo22>the main developer of rump is not contributing to netbsd currently
<damo22>they don't even know if rump works on real hardware, they were surprised when i told them i was using rump to boot hurd off an actual disk
<solid_black>if you really cannot keep the runner up, fine, let's disable it
<solid_black>but we should not give them the impression that we're unresponsive
<solid_black>hm, so GICv2 supports both level- and edge-triggered interrupts
<solid_black>also do I understand it right that in the irq handler on x86 we inherently know which irq number that was, without having to query the IC for the number, because it jumps to different handlers?
<solid_black>where is the "irq flow" logic (mask/ack/unmask/eoi) implemented in i386 gnumach?
<solid_black>so can we just mask that vector and never have to deal with them?
<solid_black>why does that happen, though? in software, spurious wakeups happen when the blocking logic *cannot* tell if a waiter should be woken up or not, and leaves it to the waiter's own logic to determine whether to consider the wakeup spurious and keep waiting, or to take action
<solid_black>what sense does it make to send an interrupt if you *know* it's spurious and is just going to be ignored?
<damo22>in LAPIC land, you need to define a vector where the hw can raise a spurious interrupt
<solid_black>also how does the spl/cli/sti thing work wrt entering/exiting an irq handler? on aarch64, interrupts get automatically disabled/reenabled when entering/exiting an irq/fiq handler
<solid_black>sure, we could do that on aarch64 too I guess, but initially upon taking an interrupt and jumping to the handler, does hardware auto-disable interrupts?
<solid_black>so what happens if multiple irqs happen in quick succession?
<damo22>the stack that stores all the stuff overflows
<solid_black>"do we need to ack slave?" "no, only master" <- how does this work?
<solid_black>also: this does eoi/ack *before* calling the actual handler?
<damo22>sometimes you want this for an IPI so the ipi fires again if it needs to
<youpi>solid_black: I guess the controller ack is precisely there to avoid interrupt nesting
<youpi>so you can ack the controller when you have disabled interrupt nesting and are ready to receive another one
<youpi>(even if it means it'll only trigger after you have handled the current one)
<solid_black>so until you ack, the controller won't send any other irq?
<youpi>we want to ack the interrupt before calling the handler because the handler might make the hardware trigger another one
<youpi>and we don't want to miss that
<solid_black>(I still don't have a consistent mental model of how this works as you can see)
<solid_black>do we have to do anything special at all *to the controller* after we've run the actual handler? or is the eoi/ack that we did before running the handler "it" from the controller's perspective?
<youpi>no, we don't have to do anything to the controller after we have dealt with the hardware
<youpi>the controller doesn't care whether we have called a handler or not
<youpi>we are just telling it that we are fine with receiving another interrupt
<solid_black>(assuming interrupt handlers are quick and never block -- or is that not so?)
<solid_black>hm, so GIC has spurious interrupts too, and it too is a separate vector
<solid_black>but at least they document the reason, and that makes sense indeed
<solid_black>here's one thing that might be confusing me: in GIC docs, "ack"ing an interrupt means basically querying GIC about the interrupt number, which also tells GIC that you've started handling it
<solid_black>but in i386at/interrupt.S, acking seems to mean the same as eoi, which tells the APIC/PIC that you're *done* handling the interrupt (and you already know the number anyway), even though we're only calling the real handler after that
<youpi>solid_black: I'm not saying that we *want* to have nested interrupts. I'm just saying that they do happen, and we have not to explode, whenever the handler happens to trigger the irq again, for instance. So we have to deal with that, by telling the controller when we are ready not to explode
<youpi>originally, what people would do would be to just mask irqs completely while calling the handler
<youpi>but that poses problems with shared irqs
<youpi>so it was reshaped into just acking when we're ready to face another irq
<youpi>"eoi, which tells the APIC/PIC that you're *done* handling the interrupt"
<solid_black>yes, masking it completely (as in cli) is what I would expect
<youpi>eoi doesn't really mean we have finished handling the work involved by the interrupt
<youpi>see e.g. bottom halves in Linux
<youpi>they're called *way* after the interrupt has been triggered
<youpi>eoi just means we're done with managing the irq itself (the signal itself, not what it means for the hardware)
<youpi>shared irq = several pieces of hardware under the same irq
<solid_black>yes, I understand the idea of bottom halves and not actually doing all the work in the handler
<youpi>used to happen a lot with ISA hardware
<solid_black>ok, and why is that an issue without nested handlers?
<youpi>some irq handlers might take longer than others
<youpi>and some irq handlers might *have* to be reactive
<youpi>so irq handlers that take long would like to allow other irq calls
<solid_black>so you mean it causes latency issues because a handler can take a long time
<youpi>I don't remember the details (it's two decades since I really looked at these)
<youpi>and it so happens in our case (with userland irq handlers) that we do want that :)
<solid_black>I'd expect *that* (messaging userland) to be done in the bottom half?
<youpi>that's what we do in gnumach already
<youpi>the irq handler only queues an irq and wakes the irq handler thread
<youpi>that pushes a message on the irq port
<solid_black>"each CPU interface can see up to 1020 interrupts" -- from the GICv2 doc
<solid_black>and I believe there can be a lot more in later GIC versions
<youpi>depends on the information one has
<youpi>a linear table is not a problem when you know the index
<youpi>damo22: do you happen to remember where in gitlab one adds a runner ?
<solid_black>I do, Admin Area -> CI/CD -> Runners -> New instance runner
<youpi>does one need to be admin on the instance for that?
<youpi>ok, so I can't do it on the gnome gitlab, that's why I wasn't finding it :)
<solid_black>that's for instance-global runners, there can probably be project-local runners? not sure about that
<youpi>do you remember who you contacted for the gnome gitlab case ?
<youpi>yes, there are project-local and group-local runners
<youpi>but that's most probably the same issue
<youpi>I'm not admin on the gnome group and on the glib project
<solid_black>me, I talked to @pwithnall; you also likely need @barthalion
<solid_black>@pwithnall is one of the glib maintainers, @barthalion is a GitLab/infra admin
<solid_black>1. something external tells GIC of an event => the interrupt is in "pending" state
<solid_black>2. GIC tells the CPU about it, and as soon as the CPU unmasks irqs in DAIF (aka spl0) it jumps to the irq handler
<solid_black>3. from there, the CPU tells GIC it started to handle the irq ("ack"). the interrupt transitions from "pending" to "active", and the CPU gets back from GIC the irq *number*
<solid_black>4. the CPU runs the handler, during this time if the event happens again, it becomes "active + pending"
<solid_black>5. the CPU signals to GIC that it completed handling the interrupt ("eoi"), this transitions the state to the initial "inactive" state (or "pending" again, if it was "active + pending", in which case goto 2)
<solid_black>6. the CPU does "eret", which likely unmasks irqs again, so it can take the next one
<solid_black>but do I understand it right that the PIC/APIC/whatever on x86 is simpler than that?
<solid_black>that ack/eoi is the same thing, and there are only two states, not four?
<youpi>there is no eoi like above indeed
<youpi>i.e. there is no "active" state
<youpi>I don't really see why the active state exists in GICv2 actually
<youpi>ah, to avoid an irq if one is already pending, perhaps
<youpi>that does save some overhead for intensive loads
<solid_black>we're probably not going to use those, but in GIC, irqs have priorities
<solid_black>and a lower-priority irq won't be signalled to the cpu while it's handling a higher-prio irq
<solid_black>eh, my current implementation of the GICv2 driver is clearly very wrong
<solid_black>how much of the interrupt handling logic lives outside of i386/ ?
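[editor's note: solid_black's six-step walkthrough above is essentially a small per-interrupt state machine, which can be sketched as plain C. The states are the ones the GICv2 architecture defines; the function names are invented for illustration, and a real driver performs "ack" and "eoi" by reading GICC_IAR and writing GICC_EOIR rather than calling functions.]

```c
/* Per-interrupt GICv2 state, as in steps 1-6 above. */
enum gic_state { INACTIVE, PENDING, ACTIVE, ACTIVE_PENDING };

struct gic_irq {
    enum gic_state state;
};

/* A peripheral raises the interrupt (step 1, or the "event
   happens again while the handler runs" case in step 4). */
static void gic_raise(struct gic_irq *irq)
{
    if (irq->state == INACTIVE)
        irq->state = PENDING;
    else if (irq->state == ACTIVE)
        irq->state = ACTIVE_PENDING;
    /* already pending: nothing more to record */
}

/* The CPU acknowledges the interrupt (step 3; in hardware this is
   the GICC_IAR read, which also returns the irq number).
   Returns 1 if there was a pending interrupt to take, 0 if the
   ack was spurious. */
static int gic_ack(struct gic_irq *irq)
{
    if (irq->state != PENDING)
        return 0;
    irq->state = ACTIVE;
    return 1;
}

/* End of interrupt (step 5; the GICC_EOIR write in hardware).
   If the event recurred while active, the interrupt goes straight
   back to pending, so the CPU will take it again (goto step 2). */
static void gic_eoi(struct gic_irq *irq)
{
    irq->state = (irq->state == ACTIVE_PENDING) ? PENDING : INACTIVE;
}
```

The "active" state is what lets the controller coalesce re-raises of the same interrupt into a single extra delivery, which matches youpi's guess about avoiding redundant irqs under intensive load.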
<solid_black>and if I develop this new interrupt handling framework (primarily for aarch64), would it also be useful for x86?
<solid_black>Linux people seem to have found theirs quite useful for x86 too
<azert>Thanks solid_black for the explanations on the DT and the GIC
<azert>damo22: sad to hear that rump needs a maintainer, I guess it's a super hard plumbing and build tools task
<azert>Luckily netbsd has stable internal interfaces, with Linux it would have already rotted
<youpi>azert: no, it's not that hard
<youpi>the fact that it still works fine shows that
<youpi>and that's precisely why targeting netbsd is way better for that
<youpi>it's just that people also have other lives and switch projects
<youpi>so it's just waiting for somebody to take some time on it
<youpi>I don't think it's much time
<youpi>compared to various other kinds of projects which indeed require careful follow-up
<solid_black>in the i386at/interrupt.S code that talks about master/slave controllers, which of those is the one that is connected to the CPU directly?
<solid_black>and if an interrupt originates from the chained IC (slave), it needs to be ack'd/eoi'd on both of them?
<solid_black>how does that work wrt irq numbers? are all of the slave's irqs not mapped to a single master irq number?
<youpi>for irq numbers it's chained
<youpi>you indeed have a shared interrupt for the slave
<youpi>but also multiplexing for the slave
<youpi>I don't remember the details for x86
<youpi>but basically, somehow the cpu gets the sub-irq number from the slave
<youpi>and thus it's dispatched over different irqs
<youpi>0-7 is master, 8-15 is slave
<youpi>2 being the irq used on the master for the slave irqs
<youpi>anatoly: the Interrupt Flag of the CPU
<youpi>or the mask on the controller itself
<youpi>both can be used, but the flag on the cpu is way cheaper
<solid_black>but that 0-7/8-15 number, what "irq number namespace" is that in exactly? is that what the master sends to the CPU? is that the number of the handler that the CPU jumps to? (can that be different from what the master sends?) is that just gnumach's internal representation?
<youpi>it's the numbers that the master/slave controllers somehow send to the cpu
<youpi>it's then mapped somewhere in the 0-255 irq space of the cpu
<youpi>see PICM_VECTBASE in gnumach
<azert>I'm quite confident that on arm you have a number space for each controller
<solid_black>and what Linux does is it maps them all into internal irq numbers that don't correspond to hardware-anything
<solid_black>but I was hoping we could avoid doing that in gnumach, and instead explicitly keep track of what controller we're talking about
<azert>You should do that otherwise user space will have a hard time parsing the device tree
<youpi>solid_black: possibly one doesn't need to do any kind of mapping to irq numbers for the cpu
<youpi>since GIC allows to just read the irq number from the controller
<solid_black>azert: do you mean we should map everything into a linear space or we should not? and how's that related to userspace parsing the device tree
<youpi>we have a handler for each irq, which just pushes the irq number on the stack, then calls the shared handler
<youpi>it's dumb to have different irq handlers only to record a number
<youpi>while the notion of irq number can be purely software
<solid_black>yes, on aarch64 we just have a single handler by design (well, two of them, irq and fiq, but we're not going to use fiq, unless we want to implement support for Apple Silicon machines)
<solid_black>but that's not the point; you can (and will) have multiple chained ICs
<azert>I mean don't do that! Userspace will get the info about interrupt mapping from the DT and have no idea about the linear space
<solid_black>so for chained ICs, we need some way of tracking which irq number on which IC we're talking about
<solid_black>and it sounds like on x86 you just went with a simple fixed mapping into a linear irq number space, since you already know which ICs there are and how they're configured
<anatoly>so "unmasking" is an execution of the STI instruction (in x86)?
<youpi>(also note the gsi thing, but yes in the end we have a global numbering)
<anatoly>telling from the OS that it's ready to handle maskable interrupts?
<solid_black>anatoly: from what I've heard, the situation on risc-v is similar to aarch64, both in how interrupt controllers behave and in device tree usage
<solid_black>so these design questions should be very relevant to your work too
<solid_black>and maybe we can share the irq handling framework between aarch64/riscv (if not x86)
<solid_black>(we'd definitely want to share the device tree parser, there's nothing arch-specific about it)
<anatoly>I haven't looked into interrupts handling so far :-) Still in the area of virtual memory
<youpi>which is not about master/slave controllers
<youpi>but other kinds of controllers
<youpi>see the acpi lookups that we're adding
<youpi>I don't remember where exactly, in libpciaccess most probably
<youpi>damo22 will remember the details, probably
<solid_black>are you saying there are more than 2 interrupt controllers on x86/pc?
<youpi>PCI does *not* use the master/slave controllers
<youpi>nowadays we use the APIC controller
<youpi>(but also simpler in some way, no master/slave relation)
<solid_black>master/slave is PIC, and damo22 said that's exclusive in hardware (and in Mach configuration) from APIC, right?
<anatoly>solid_black: well I hope re sharing. Not sure if I'm correct but mach so far feels quite x86-oriented. But it's a chicken-and-egg issue
<azert>PCI devices on arm will not be on the device tree, since the PCI bus supports enumeration
<youpi>solid_black: I guess there's just one, yes
<solid_black>azert: yeah I was hoping that all the existing infrastructure (pci-arbiter, libpciaccess) could be reused mostly as-is
<solid_black>(not that I know what PCI really is, other than "a bus", mind you)
<azert>Probably libpciaccess will need a few tweaks but will work
<anatoly>so masking/unmasking of lower priority interrupts gives an OS a mechanism to better schedule handling of them. Is that right?
<youpi>gnumach used to do that with spls
<youpi>but nowadays it's really not useful
<youpi>and since it's expensive, we got rid of it
<solid_black>libpciaccess:src/hurd_pci.c has #include "x86_pci.h" :(
<solid_black>char server[NAME_MAX]; <- seriously, in a Hurd-specific source file?
<azert>But it will mostly be the same as the x86 one
<youpi>solid_black: contributions welcome
<youpi>can most probably be replaced with asprintf or such
<solid_black>though in that case I guess we know the actual limit on the resulting path length, since it gets constructed by formatting a few numbers into a template
<youpi>yes, but it's hairy to compute it
<youpi>and we're opening files anyway, so it's not really performance-critical to avoid a malloc
<solid_black>back to x86 interrupts, damo22 said there are both level- and edge-triggered ones. is there a difference in how gnumach handles/acks them? if so, where in the source is that?
<youpi>I guess it's the drivers that do care
<youpi>possibly there are currently issues about that, I haven't looked
<solid_black>for a level-triggered interrupt, wouldn't acking/eoiing it before running the handler trigger it again immediately?
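[editor's note: the master/slave cascade youpi describes above (irqs 0-7 on the master 8259A, 8-15 on the slave, chained via master line 2) can be sketched in a few lines. The port numbers are the standard PC ones; `outb()` here is a logging stub standing in for real port I/O, and the function names are invented for illustration.]

```c
#define PIC_MASTER_CMD 0x20   /* master 8259A command port */
#define PIC_SLAVE_CMD  0xa0   /* slave 8259A command port */
#define PIC_EOI        0x20   /* non-specific end-of-interrupt command */

/* Stub: record port writes so the sequence can be inspected.
   A kernel would do real port I/O (outb instruction) instead. */
static unsigned short io_log[8];
static int io_log_n;

static void outb(unsigned short port, unsigned char val)
{
    io_log[io_log_n++] = port;
    (void)val;
}

/* irqs 0-7 live on the master, 8-15 on the slave. */
static int pic_is_slave_irq(int irq)
{
    return irq >= 8;
}

/* An irq that arrived via the slave must be eoi'd on both chips:
   the slave raised it, and the master saw it come in on cascade
   line 2.  A master-only irq needs just the master eoi -- which is
   the "do we need to ack slave?" "no, only master" exchange above. */
static void pic_eoi(int irq)
{
    if (pic_is_slave_irq(irq))
        outb(PIC_SLAVE_CMD, PIC_EOI);
    outb(PIC_MASTER_CMD, PIC_EOI);
}
```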
<youpi>I guess we always ask for edge-triggered then
<solid_black>I thought that was something that the peripheral decides on
<youpi>there's the controller in between
<youpi>I guess we ask the controller to be edge
<youpi>and the driver copes with what the hardware is doing behind
<solid_black>makes sense, so the controller can convert l-t to e-t
<solid_black>this is... interesting, so my current (broken) driver for gic_v2 never actually entered the "active" state, it only forcefully cleared the "pending" state upon entering the irq handler
<solid_black>"ivect", which is a vector of interrupt handlers in a single flat irq number namespace, is referenced from non-i386-specific places (device/intr.c)
<youpi>this could be changed to make the arch-specific code provide a way to register the handler
<solid_black>hmm, so when registering an interrupt handler, not only should it not be a linear map, it should not even be an 'int irq' argument
<solid_black>we should pass the interrupt data as encoded in the device tree 'interrupts' property
<youpi>for upstream hurd, there is a contribut
<youpi>yes, on the oftc network like other debian channels
<youpi>damo22: which version of gitlab-runner are you using?
<youpi>I'm using 13.3.1+dfsg-4+hurd.1
<youpi>the token doesn't seem to work
<youpi>ERROR: Registering runner... forbidden (check registration token)
<solid_black>I think I have a much better idea now about how I want to structure this (interrupt handling)
<solid_black>it still feels very clunky at the C level, need to think of how to make that nicer
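[editor's note: to illustrate "passing the interrupt data as encoded in the device tree 'interrupts' property": with the standard three-cell arm,gic binding, each entry is (type, number, flags), where SPIs are numbered from hardware ID 32 and PPIs from 16 per the GIC architecture. This is a hedged sketch under that assumption; the struct and function names are invented, not gnumach code.]

```c
#include <stdint.h>

struct dt_irq {
    uint32_t hwirq;     /* interrupt ID as the GIC numbers it */
    int      edge;      /* nonzero if edge-triggered */
};

/* DTB property cells are big-endian 32-bit words. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode one three-cell 'interrupts' entry:
   cell 0: 0 = SPI (shared), 1 = PPI (per-cpu)
   cell 1: interrupt number within that range
   cell 2: trigger flags (1 = edge rising, 2 = edge falling,
           4 = level high, 8 = level low)
   Returns 0 on success, -1 on an unknown interrupt type. */
static int dt_parse_gic_irq(const uint8_t *cells, struct dt_irq *out)
{
    uint32_t type  = be32(cells + 0);
    uint32_t num   = be32(cells + 4);
    uint32_t flags = be32(cells + 8);

    if (type == 0)
        out->hwirq = num + 32;      /* SPIs start at hardware ID 32 */
    else if (type == 1)
        out->hwirq = num + 16;      /* PPIs start at hardware ID 16 */
    else
        return -1;
    out->edge = (flags & 0x3) != 0; /* low bits are the edge triggers */
    return 0;
}
```

Registering a handler with a decoded `struct dt_irq` (or the raw cells plus a reference to the controller node) rather than a bare `int irq` is one way to keep track of "which number on which IC", as discussed above.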