IRC channel logs

<junlingm>Hi, it seems that the Debian Hurd 20190705 image does not have GnuPG installed, so it is not possible to do an apt update.

<junlingm>in fact, the port repo is not recognized, so it is also not possible to install gnupg. Any suggestions how to fix?

<junlingm>The problem is that the key has expired, and without gnupg, there is no way to install a new key.

<AlmuHS>junlingm: there is a newer image here https://cdimage.debian.org/cdimage/ports/latest/hurd-i386/20200101/

<junlingm>thanks.

<AlmuHS>don't forget to upgrade after installing. The 2020 image also has a couple of errors

<junlingm>AlmuHS: I will keep that in mind! Thanks.

<jrtc27>1. apt doesn't need gnupg

<jrtc27>(it uses gpgv, which is a subset thereof)

<jrtc27>2. you can manually download the .deb from https://deb.debian.org/debian-ports/pool/main/d/debian-ports-archive-keyring/debian-ports-archive-keyring_2019.11.05_all.deb and then dpkg -i it

<AlmuHS>btw, when installing the 2020 image on real hardware, Hurd freezes during the extraction of the hurd package

<AlmuHS>in apt upgrade

<AlmuHS>when I install the 2020 image of Debian GNU/Hurd on real hardware and try to do "apt upgrade", Hurd freezes while unpacking the hurd package

<AlmuHS>I had the same problem on several machines: Thinkpad T60, Thinkpad T60 widescreen and Thinkpad R60e

***Server sets mode: +nt

<junlingm>I am installing in a virtualbox image.

<junlingm>anyway, installing using the 2020 netinstall iso fails at the debconf step. I will use the raw img to give it a try.

<junlingm>The raw image works! Thanks everyone for your help.

<junlingm>When I do an "apt update && apt dist-upgrade", I hit an error while configuring python3.8-minimal because of a wrong libc version. Apparently python3.8 is somehow upgraded before libc. Installing the libc0.3 package manually using dpkg fixed the problem.

<junlingm>The PCI arbiter seems to use libnetfs, and mmapping the config and regions is not available. Do we expect multiple servers to access the same PCI device?

***GeneralDuke1 is now known as GeneralDuke

<damo22>junlingm: i am in the process of upstreaming patches for libpciaccess; i think that is a bug

<damo22>youpi: how will users share pci cards if interrupts occur and there is only one handler per device?

<damo22>ie does it even make sense for a device to have multiple users

<damo22>considering we are refactoring interrupts currently, i wonder if we can make interrupts have an extra identifier that can split irqs among different users of the card

<damo22>would it even make sense

<damo22>ie, divide the bandwidth of the device among all users who have opened it

<damo22>for something like a disk, it might make some sense, although im not sure how you would mount a shared filesystem twice

<damo22>for displays, it makes no sense to flicker round robin between users' displays

<youpi>damo22: there would be only one translator for each PCI function

<youpi>if several functions use the same IRQ, that's fine

<youpi>we report the IRQ raise to all of them

<youpi>IRQs don't tell which function is concerned

<youpi>so we cannot identify anything, try to share bandwidth, etc., there is just no such information

<youpi>the information is only the IRQ number

<youpi>quite often hardware tries to use different IRQs precisely for this, but that's not guaranteed; it's not a problem we are supposed to solve ourselves anyway, we just deliver the interrupt notification to all users that asked for it

<youpi>(I specify "function", not "device", because that's the whole point of PCI devices to provide several functions, notably network cards do this so that functions can be passed-through to guests)
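
A minimal C sketch of the delivery model youpi describes: the only information is the IRQ number, so a raise is reported to every party that asked for that line. All names here (irq_subscribe, irq_deliver, MAX_HANDLERS) are hypothetical and are not the gnumach or Hurd interrupt API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8   /* arbitrary limit for this sketch */

typedef void (*irq_handler_t)(int irq, void *ctx);

struct irq_sub {
    int irq;             /* IRQ line the subscriber asked for */
    irq_handler_t fn;    /* callback to notify */
    void *ctx;           /* opaque per-subscriber data */
};

static struct irq_sub subs[MAX_HANDLERS];
static size_t nsubs;

/* Register interest in an IRQ line. Returns 0 on success, -1 if the table is full. */
int irq_subscribe(int irq, irq_handler_t fn, void *ctx)
{
    assert(fn != NULL);
    if (nsubs == MAX_HANDLERS)
        return -1;
    subs[nsubs++] = (struct irq_sub){ irq, fn, ctx };
    return 0;
}

/* Notify every subscriber of this line; nothing says which function
   raised it. Returns the number of subscribers notified. */
int irq_deliver(int irq)
{
    int notified = 0;
    for (size_t i = 0; i < nsubs; i++) {
        if (subs[i].irq == irq) {
            subs[i].fn(irq, subs[i].ctx);
            notified++;
        }
    }
    return notified;
}

/* Example handler: counts how many notifications it received. */
void count_irq(int irq, void *ctx)
{
    (void)irq;
    ++*(int *)ctx;
}
```

Each notified driver would then have to check its own function's status register to see whether the interrupt was really meant for it.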

***rekado_ is now known as rekado

<rekado>FWIW I was able to avoid the boot freeze by passing “-cpu base” to qemu-system-i386

<rekado>with the default it would very frequently get stuck while booting

<rekado>this is on rather new servers with recent AMD CPUs

<AlmuHS>rekado: which freeze are you referring to?

<AlmuHS>if you're using the Debian GNU/Hurd 2019 image, be warned that this version has a bug in libpciaccess which freezes the machine during network configuration

<youpi>rekado: it'd be interesting to test with various -cpu values, to see which piece of the x86 instruction set triggers such freezes

<AlmuHS>the bug i'm referring to only affected me on real machines, not in qemu

<AlmuHS>in the 2020 image this bug was fixed

<janneke>AlmuHS: can you point to a specific commit/patch in gnumach/hurd that fixes this? we're using a guix image, so it would be nice to know!

<janneke>ah in libpciaccess ... (sorry)

<janneke>the freeze here is earlier, is this freeze DDE-related? we are using libpciaccess 0.16, afaics

<AlmuHS>janneke: earlier than...

<AlmuHS>?

<rekado>AlmuHS: we saw it freeze while starting the disk driver: “start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] exec”

<rekado>youpi: good idea, I’ll try different values

<rekado>it doesn’t happen *all* the time with the default, but maybe 1/2 to 3/4 of all attempts

<AlmuHS>rekado: yes, this is the Debian GNU/Hurd 2019 bug i referred to before

<rekado>just to be clear: we aren’t using the Debian image.

<youpi>AlmuHS: no, the hang after exec print is not the network hang

<youpi>the network hang happens much later during boot

<youpi>if it's 1/2 to 3/4, that's often enough that you can experiment, yes

<AlmuHS>yes, but I noticed the exec bug in 2019

<youpi>the exec bug has been there for years

<youpi>possibly decades

<AlmuHS>oh, now I understand

*jrtc27 has traumatic memories of fixing possible deadlock conditions in ext2fs's remount code that gave similar symptoms

<jrtc27>(never underestimate the power of slightly underclocking virtualbox for precision, pausing when near the deadlock, then underclocking to 1% and letting it sloooowly run so you can then pause right where you want to during boot)

<jrtc27>(and also virtualbox's built in debugger)

<AlmuHS>LOL

<AlmuHS>i was using a similar debugging approach, but with Qemu and GDB

<junlingm>damo22, youpi: it may be hard to avoid race conditions if multiple servers concurrently mmap and access the same PCI reg space.

<youpi>sure, that's why the pci-arbiter will be needed

<youpi>as of now we don't actually have the issue since we have gnumach, then netdde, then xorg, with no overlapping

<junlingm>would it make more sense to allow a single server (driver) to manage a PCI device and expose multiple interfaces for functions, so that each function interface can have a server attached?

<youpi>AIUI there is no need to have a driver for the device, only drivers for the various functions

<junlingm>youpi: if a server sets up a PCI request, and another server sets up another valid request, but at the cost of stopping the first job, since both servers have equal access rights, how would an arbiter prevent this?

<youpi>junlingm: the requests go *through* the arbiter

<youpi>just like on all OSes, which make them go through the kernel

<junlingm>So if a server writes to a reg, no other servers can write to the same reg?

<youpi>the writes are going through the arbiter, where they get serialized

<junlingm>youpi: Sorry I am a bit slow. If a server sets up a data buffer and writes to a reg, how does the arbiter prevent another server from overwriting the buffer address?

<youpi>separate functions have separate regs

<janneke>AlmuHS: in a completely different effort, installing guix/hurd on my x60, i'm still struggling to get the network going; so i'm wondering if the libpciaccess problem you refer to is fixed in 0.16, which is what we use

<junlingm>I understand. But that requires that servers must trust each other?

<youpi>no, functions work completely separately

<youpi>so servers don't care what the other is doing with its own function

<junlingm>sure, but it does not prevent users from attaching two conflicting servers.

<youpi>what do you call "users" and "servers"?

<youpi>(I guess the misunderstanding comes from there)

<junlingm>a user of the computer may mistakenly attach a server twice, and the two copies of the same server will conflict.

<youpi>one will get control of the function, the other will get EBUSY

<youpi>but normally what happens is that the translator (proper term for "server") is attached to a filesystem node

<youpi>and the filesystem does take care of starting only one instance, for a given filesystem node
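
A minimal C sketch of the "one gets control, the other gets EBUSY" behaviour, assuming a single owner slot per PCI function. The structure and function names are hypothetical and are not taken from the pci-arbiter sources; the sketch is also not thread-safe, whereas a real arbiter would serialize these calls.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-function state; not a real pci-arbiter structure. */
struct pci_function {
    const void *owner;   /* NULL while nobody has claimed the function */
};

/* Try to take exclusive control of the function.
   Returns 0 on success, EBUSY if someone else already holds it. */
int function_claim(struct pci_function *f, const void *who)
{
    if (f->owner != NULL && f->owner != who)
        return EBUSY;
    f->owner = who;
    return 0;
}

/* Give the function back. Only the current owner may do so.
   Returns 0 on success, EPERM otherwise. */
int function_release(struct pci_function *f, const void *who)
{
    if (f->owner != who)
        return EPERM;
    f->owner = NULL;
    return 0;
}
```

With this shape, a second copy of the same driver started by mistake simply fails at claim time instead of fighting over the registers.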

<junlingm>Ok. that makes sense to me. Thanks.

<junlingm>Another question, does hurd or gnumach support allocating memory for DMA?

<youpi>you mean with proper address constraints?

<youpi>that's precisely one of the TODOs that damo22 is to be working on :)

<junlingm>I mean getting physical address of a memory block, and passing it to a PCI function.

<junlingm>oh. good!

<youpi>we had to have it for netdde to be able to work

<junlingm>I realized that, but not sure where the API is exposed.

<youpi>vm_allocate_contiguous

<junlingm>youpi: ah, in mach/experimental.defs! Thanks!

<janneke>oh youpi, "vm_allocate_contiguous" is what i've been wondering about trying to get netdde to work on my x60!

<janneke>i "found" gnumach/debian/patches/70_dde.patch ... and compiled that in...

<janneke>...but after that, i got stuck with a link error building the hurd

<AlmuHS>janneke: show the link error

<AlmuHS>pastebin if it's very long

<janneke>i586-pc-gnu-gcc ... -o boot => boot/deviceServer.c:1227: undefined reference to `ds_device_intr_enable'

<janneke>=> https://paste.debian.net/1155517/

<janneke>(sorry for any duplicate messages, not sure what's happening here!)

<janneke>and i haven't found yet "who" should define that function (or who shouldn't use it)

<AlmuHS>it seems that you missed a #include directive

<AlmuHS>grep this function, and search the file in which this symbol is defined

<youpi>"undefined reference" is *not* about #include

<youpi>#include do not *define* symbols

<youpi>they only declare them

<youpi>it's the libraries which define symbols

<youpi>janneke: you need to build glibc against the patched gnumach headers

<youpi>for libmachuser to contain the RPC stubs

<youpi>that's the same for all mach and hurd RPCs
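
A small C illustration of youpi's point, using a made-up function name (irq_enable_stub is hypothetical and is not the real ds_device_intr_enable). The declaration is what a header provides and is enough to compile a caller; the definition is what a library such as libmachuser must provide at link time.

```c
/* Declaration: tells the compiler the function's type.
   This is all that an #include normally gives you. */
int irq_enable_stub(int irq);

/* A caller compiles fine with only the declaration in scope. */
int enable_twice(int irq)
{
    return irq_enable_stub(irq) + irq_enable_stub(irq);
}

/* Definition: the actual code. If this were missing here and from every
   library on the link line, the linker (not the compiler) would report
   "undefined reference to `irq_enable_stub'". */
int irq_enable_stub(int irq)
{
    return irq >= 0 ? 1 : 0;
}
```

Deleting the last function and rebuilding reproduces the class of error janneke saw: compilation still succeeds, linking fails.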

<AlmuHS>ok, i'm sorry

<janneke>youpi: OK, right; i will double check that

<youpi>no worry, *all* my students are making the same mistake

<youpi>it seems no C teacher manages to get the difference through

<janneke>(and it's good to know: i certainly tried first without rebuilding libc)

<youpi>(in the heads of the students)

<jrtc27>it doesn't help that definition and declaration are easily confused....

<youpi>confused? how?

<jrtc27>in english

<youpi>ah, you mean the common words

<jrtc27>the words are too similar

<youpi>well, yes, but that's only one of so many examples in maths & computer science

<youpi>technical terms have a precise meaning, so you must use the exact words

<jrtc27>yeah

<AlmuHS>in spanish the terms are "declaración" and "definición", so it's not an advantage over english terms

<youpi>ditto in french

<jrtc27>it's just unhelpful that the two exact words given to the different terms sound so alike

<jrtc27>(and also that #define exists...)

<youpi>well, talk about authorization vs authentication ;)

<youpi>#define *is* a definition :)

<jrtc27>sure

<jrtc27>but there are two kinds :P

<AlmuHS>there is the "implementation" term too

<youpi>.h can contain both declarations and definitions, yes

<youpi>worse, a .h can both contain an extern declaration *and* an inline definition

<jrtc27>... or a non-inline definition (that's not a #define) if you're a bad person who doesn't care for -fno-common :)

<AlmuHS>yes, the definitions are usually in the .c, aren't they?

<jrtc27>usually

<AlmuHS>or the implementation

<jrtc27>people used to be a bit lax for zero-initialised globals

<jrtc27>because they didn't count as duplicates
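
A short C sketch of the kinds of declarations and definitions mentioned above, all in one translation unit. The variable and function names are invented for illustration. The repeated `int lax_global;` lines are tentative definitions, which C allows within one file; across several files they only used to link because of common symbols, and with -fno-common (the default since GCC 10) duplicates in different files become "multiple definition" errors.

```c
/* extern declaration: announces the object, reserves no storage */
extern int shared_counter;

/* Tentative definitions: file-scope declarations without an initializer.
   Repeating them in one translation unit is valid C; they all name the
   same zero-initialized object. */
int lax_global;
int lax_global;

/* The one real definition that the extern declaration refers to. */
int shared_counter = 0;

/* An inline definition of the kind a header may carry. */
static inline int bump(void)
{
    return ++shared_counter;
}

/* Reads the tentatively defined object. */
int read_lax(void)
{
    return lax_global;
}
```

Putting `extern int lax_global;` in the header and a single `int lax_global;` in one .c file avoids relying on common symbols at all.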

<AlmuHS>i learned programming in c++, so i learned to avoid globals if possible

<jrtc27>same principles apply to C too

<AlmuHS>now i'm refactoring my old SMP code to hide the globals under a set of functions, trying to follow an OOP-like model

<AlmuHS>like this https://github.com/AlmuHS/GNUMach_SMP/blob/smp-new/i386/i386/apic.c
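
A minimal C sketch of the general pattern of hiding globals behind a set of functions, assuming a small fixed-size table. The names (cpu_table_add, cpu_table_get, MAX_CPUS) are hypothetical and are not copied from the linked apic.c.

```c
#include <stddef.h>

#define MAX_CPUS 4   /* arbitrary size for this sketch */

/* File-local state: "static" keeps it invisible to other .c files. */
static int cpu_apic_id[MAX_CPUS];
static size_t ncpus;

/* Record a CPU's APIC id. Returns its index, or -1 if the table is full. */
int cpu_table_add(int apic_id)
{
    if (ncpus == MAX_CPUS)
        return -1;
    cpu_apic_id[ncpus] = apic_id;
    return (int)ncpus++;
}

/* Look up the APIC id of a CPU index. Returns -1 if out of range. */
int cpu_table_get(size_t cpu)
{
    return cpu < ncpus ? cpu_apic_id[cpu] : -1;
}

/* Number of CPUs recorded so far. */
size_t cpu_table_count(void)
{
    return ncpus;
}
```

Callers can only go through the three functions, so the table layout can change later without touching them.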

<junlingm>youpi: where is vm_allocate_contiguous implemented? I cannot seem to find it in the gnumach master branch.

<youpi>it's in the userland driver branch

<junlingm>Thanks!

<Pellescours>youpi: I don't think this will change anything but the check was not fully correct (I think) https://github.com/etienne02/gnumach/commit/8d2f2735c21c681baa3cc23d248e4022c47ae8cf

<youpi>it won't ever be >, precisely since we panic

<Pellescours>okay

<AlmuHS>but isn't it possible for the size to grow past the limit? For example, if the limit were 1000 and the previous iteration was at 950, and the size then increases by 100, the new size will be higher than the limit

<AlmuHS>(i don't know what this piece of code does, it's only a question)

<youpi>it's a one-by-one allocation that starts at 0

<AlmuHS>oh, then it's good

<youpi>you can see that by reading the function itself

<youpi>what it does to that variable
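
A minimal C sketch of the argument, not of the actual gnumach code in the linked commit (which is not reproduced here): if a counter starts at 0 and only ever grows by one, an equality check against the limit fires before the counter can exceed it. LIMIT, alloc_one and the -1 return standing in for a panic are all assumptions made for the illustration.

```c
#define LIMIT 4   /* arbitrary limit for this sketch */

static unsigned used;   /* starts at 0, only ever incremented by one */

/* Hand out the next slot. Returns its index, or -1 at the point where
   a kernel would panic instead. */
int alloc_one(void)
{
    if (used == LIMIT)   /* "==" suffices: used can never skip past LIMIT */
        return -1;
    return (int)used++;
}

/* Number of slots handed out so far. */
unsigned slots_used(void)
{
    return used;
}
```

AlmuHS's 950 + 100 scenario needs a step larger than one; with such steps the check would have to be ">=" or test the size before adding.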

2020-07-07.log