IRC channel logs

2024-02-12.log


<damo22>if you fork a process will it inherit the same task parent?
<damo22>i have set my mach_task_self() to be in a different pset, how do i launch /bin/bash with the same pset?
<damo22>is there a way to execute a new process inside the current process?
<youpi>fork probably doesn't inherit the same task parent, since technically the forker creates the forkee task
<youpi>testing will tell you
<youpi>"execute a new process inside the current process" does not mean anything
<damo22>execvpe
<youpi>if you mean "execute a new program inside the current process", that's not how things work
<youpi>see exec
<youpi>it creates a new task, copies over various stuff that has to be inherited
<youpi>and drops the previous task
<damo22>ok cool
<youpi>again, testing will tell you
<damo22>i made a process hang because i assigned it to the slave_pset and the slave_pset does not currently execute anything
<damo22>i guess i need to create a new pset and assign AP processors to it
<damo22>maybe there is a flag to enable the processor_set?
<damo22># stress -c 7
<damo22>1914295 damien 0 0 5201612 529248 26500 S 695.7 1.6 6:40.69 qemu-system-i38
<damo22># ./smp /bin/bash
<damo22># stress -c 7
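The ./smp helper itself is not shown in the log; a hypothetical sketch of such a launcher, using the classic Mach processor-set RPCs plus the Hurd's get_privileged_ports() (an illustration only, not damo22's actual program, which went to the mailing list):

    /* smp.c -- hedged sketch: move this task to another processor set,
     * then exec the given program, which inherits the assignment since
     * exec reuses the task.  Most error handling omitted for brevity.  */
    #include <errno.h>
    #include <error.h>
    #include <unistd.h>
    #include <hurd.h>
    #include <mach.h>
    #include <mach/mach_host.h>

    extern char **environ;

    int
    main (int argc, char *argv[])
    {
      host_priv_t host_priv;
      processor_set_name_array_t psets;
      mach_msg_type_number_t npsets;
      processor_set_t pset;
      error_t err;

      if (argc < 2)
        error (1, 0, "usage: smp PROGRAM [ARGS...]");

      /* Needs root: the privileged host port controls psets.  */
      err = get_privileged_ports (&host_priv, NULL);
      if (err)
        error (1, err, "get_privileged_ports");

      host_processor_sets (host_priv, &psets, &npsets);
      if (npsets < 2)
        error (1, 0, "no second (slave) processor set");

      /* Assume psets[1] is the non-default ("slave") set.  */
      host_processor_set_priv (host_priv, psets[1], &pset);

      /* Reassign this task and all its threads to the pset.  */
      task_assign (mach_task_self (), pset, TRUE);

      execve (argv[1], &argv[1], environ);
      error (1, errno, "Failed to execve %s", argv[1]);
    }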
<solid_black>hi all
<etno>o/
<solid_black>hi etno :)
<damo22>hi
<solid_black>damo22: cool work!
<solid_black>fork does not inherit the parent, no, it's same as in Unix, the new task's parent is the old task (the one that is forking)
<solid_black>also, no, exec only creates a new task if you pass EXEC_NEWTASK (or EXEC_SECURE), most of the time it really reuses the same task
<damo22>i just mailed in a one line patch for gnumach that lets you run on slave_pset
<solid_black>could someone help me understand how (& why) the remapping from physical -> high memory addresses is done in gnumach? I understand some of it, but apparently not enough
<solid_black>first question would be, gnumach is built as non-PIC, right? so it expects to be loaded at a specific address, and starts running without paging enabled, so it's physicall addresses. But how do we know there will be any RAM at that physicall adrdess range at all?
<solid_black>(did I really misspell physical with two l-s both times)
<damo22>solid_black: hmm i dont know but probably memory exists at low addresses
<damo22>unless e820 says its reserved
<solid_black>at some low addresses, yes, but how do we know it exists specifically where gnumach's PT_LOAD points to?
<damo22>as long as there is more than 16MB ram it will work probably
<etno>solid_black: I don't know, but this has to be part of the multiboot spec. If I were to speculate, I would imagine that paging is enabled and the physical addresses of the loaded blobs are in the phys memory map provided by grub... but ... :shrug:
<solid_black>but that's about RAM size and not its physical address? or, do all "PC"s have a standard RAM base address?
<damo22>0
<solid_black>ah
<damo22>:)
<damo22>it all starts with zero
<damo22>im pretty sure the chipset is configured to have ram at 0
<damo22>because it needs to be 16 bit addressable
<damo22>so first 1MB is 16 bit, then gnumach starts at 0x1000000 ? 16MB?
<solid_black>oh this is actually interesting, and must be the first time I'm seeing this
<solid_black>what does p_vaddr != p_paddr mean?
<solid_black>in PT_LOAD
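For reference, the fields in question, as declared in <elf.h>; the gnumach numbers in the comment are the ones youpi gives further down:

    /* 32-bit ELF program header.  For gnumach/i386 the two addresses
     * differ: p_paddr asks to be loaded at 16 MiB physical (0x1000000),
     * while p_vaddr says the code is linked to run at 0xC1000000.  */
    typedef struct
    {
      Elf32_Word p_type;    /* Segment type, e.g. PT_LOAD */
      Elf32_Off  p_offset;  /* Offset of the segment within the file */
      Elf32_Addr p_vaddr;   /* Virtual address the segment runs at */
      Elf32_Addr p_paddr;   /* Physical address to load it at */
      Elf32_Word p_filesz;  /* Size of the segment in the file */
      Elf32_Word p_memsz;   /* Size in memory (>= p_filesz, rest zeroed) */
      Elf32_Word p_flags;   /* PF_R / PF_W / PF_X permissions */
      Elf32_Word p_align;   /* Required alignment */
    } Elf32_Phdr;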
<solid_black>next question would be, I see that "Physical memory is direct-mapped to virtual memory starting at virtual address VM_MIN_KERNEL_ADDRESS."
<solid_black>is that *all* physical memory? some of it?
<solid_black>do I understand it right (I heard it here recently, and then saw mentions of it in the source code) that the way this is done, is gnumach is linked for a low address, and then shifted to a high virtual address using x86 segments? so it "thinks" it's running at a low address, but it's actually high virtual because of the segments, but then it's low physical again because of the mmu?
<damo22>probably
<damo22>i messed with that part and was very confused
<solid_black>then IIUC there are no segments on x86_64 (other than fsgsbase); how does that work for x86_64 gnumach?
<damo22>0xC0000000 is the offset for i386
<solid_black>also exactly what is a "linear address"? (physical? virtual?)
<solid_black>does e.g. "call" push the instruction address (with the segment base already added in), or does it push the instruction address relative to CS base?
<solid_black>which one does %eip store?
<solid_black>it must be the latter
<solid_black>but then how does eip-relative addressing (for PIC) work at all?
<solid_black>or, is the compiler *not allowed* to ever generate eip-relative addressing for -fno-pic?
<solid_black>oh, does %ebp implicitly use SS and not DS? that makes sense, but that also means all -fomit-frame-pointer builds are broken, unless we also assume CS = DS?
<damo22>solid_black: these are good questions but we have something we can work on: smp works now, and stress -c N works, but gcc does not
<damo22>something about writing to the filesystem is broken with multicore
<solid_black>so does stress -c8 run 8 times as fast?
<damo22>stress -c 8 runs on 8 cpus
<solid_black>does time(1) report it properly?
<damo22>good q
<damo22>root@zamhurd:/home/demo# time stress -c 7 -t 10
<damo22>stress: info: [974] dispatching hogs: 7 cpu, 0 io, 0 vm, 0 hdd
<damo22>stress: info: [974] successful run completed in 10s
<damo22>real 0m10.010s
<damo22>user 0m0.000s
<damo22>sys 0m0.000s
<solid_black>(also you should put error (1, errno, "Failed to execve %s", argv[1]) after that execve() call)
<damo22>ah yes
<solid_black>and in the other error () calls, don't pass 0, pass err
<solid_black>user and sys are 0?
<damo22>yes i dont know why
<damo22>when i use stress -d 5 it hangs
<damo22>it doesnt like writing to disk
<damo22>in parallel
<damo22>i think that is why gcc fails
<damo22>eg make -j7
<solid_black>does it work if you write tmpfs?
<solid_black>i.e. is it an issue with writing to disk, or with libpager / libdiskfs / libports etc?
<damo22>i dont know where it writes
<damo22>ohhh maybe its trying to write too much data?
<damo22>and fills up ram
<damo22>and gcc might run out of ram and swap hard
<damo22>if i run -j7
<damo22># stress -d 5 --hdd-bytes 50M -t 10
<damo22>this works
<damo22>maybe 4GB ram is not enough for make -j7
<damo22>on gnumach source
<damo22>heh bash completion hangs the system
<damo22>when im in smp shell
<damo22> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
<damo22> 8 root 8 -12 618264 8104 0 R 24.5 0.4 0:05.83 ext2fs
<damo22> 7 root 2 -18 356572 213212 0 R 23.3 10.2 0:05.70 rumpdisk
<damo22> 759 root -3 -23 636624 5392 0 R 21.1 0.3 0:02.56 ext2fs
<damo22> 9 root 8 -12 143892 1440 0 R 6.6 0.1 0:01.13 exec
<damo22> 4 root -3 -23 167604 1408 0 R 3.5 0.1 0:01.11 proc
<damo22>locked up
<damo22>when running ar
<damo22>with make -j3
<damo22>youpi: do we need to make all the drivers eg rumpdisk, reentrant?
<damo22>so that if another cpu enters the driver code, it can still execute while the previous cpu is still executing part of the driver?
<solid_black>presumably the kernel the driver was taken from (NetBSD, Linux) already has its own synchronization primitives, e.g. mutexes
<solid_black>and we'd map those to pthread_mutex
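A hedged sketch of that mapping, with names in the style of NetBSD's rumpuser(3) hypercall interface (the real librumpuser is considerably more elaborate):

    #include <errno.h>
    #include <stdlib.h>
    #include <pthread.h>

    /* Opaque to the rump kernel; a concrete layout is assumed here
     * purely for illustration.  */
    struct rumpuser_mtx
    {
      pthread_mutex_t mtx;
    };

    int
    rumpuser_mutex_init (struct rumpuser_mtx **mtxp, int flags)
    {
      struct rumpuser_mtx *m = malloc (sizeof *m);
      if (m == NULL)
        return ENOMEM;
      (void) flags;  /* spin/adaptive hints ignored in this sketch */
      pthread_mutex_init (&m->mtx, NULL);
      *mtxp = m;
      return 0;
    }

    void
    rumpuser_mutex_enter (struct rumpuser_mtx *m)
    {
      pthread_mutex_lock (&m->mtx);
    }

    void
    rumpuser_mutex_exit (struct rumpuser_mtx *m)
    {
      pthread_mutex_unlock (&m->mtx);
    }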
<solid_black>the same is true for regular threads on single-core, no need for SMP
<damo22>no, not quite, on UP there is only one cpu executing at a time, therefore only one thread executing at a time
<damo22>we can now have threads simultaneously executing the same code....
<damo22>all the locks now become critically important
<damo22>is glibc 100% reentrant?
<solid_black>huh?
<solid_black>I mean, yes, one CPU is executing at a time, since there's only a single CPU in the first place
<solid_black>but userland is fully preemptible, that CPU can, at any moment at all, switch to running a different thread
<solid_black>which, to userland, looks much like all those threads running concurrently
<solid_black>which means, the entirety of our userland is fully synchronized, and should not require any changes for SMP, other than latent bugs exposed by it
<damo22>no there are nasty races that could be exposed
<solid_black>this is different from kernel mode, which can just disable preemption whenever it likes to; userland cannot do that
<damo22>by having threads simultaneously executing code on different processors at the same time
<damo22>instead of context switching to and from different threads one by one
<solid_black>that's what I mean by latent bugs, yes
<damo22>thats what im hitting now
<damo22>but i dont know where to begin debugging this
<damo22>as far as i know, a reentrant function means any cpu can enter this function and complete the calculation at any time it wants to, even two cpus at the same time can, and they get the expected result
<damo22>does that mean it must be stateless?
<damo22>or the state has to be handled with locks so that only one cpu can use it at a time?
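A minimal illustration of that second option: shared state guarded by a lock, so that two CPUs entering at once still produce the expected result.

    #include <pthread.h>

    static int counter;

    /* Unsafe under SMP: two CPUs may both read the old value and both
     * store old+1, losing one increment.  */
    int
    next_id_racy (void)
    {
      return ++counter;
    }

    /* Thread-safe: the mutex serializes access to the shared state.
     * (Note this is thread-safety, not reentrancy in the strict
     * signal-safe sense youpi mentions below.)  */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    int
    next_id (void)
    {
      pthread_mutex_lock (&lock);
      int id = ++counter;
      pthread_mutex_unlock (&lock);
      return id;
    }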
<damo22>usually we dont have to care in linux userspace
<damo22>because drivers are in the kernel
<damo22>we are probably going to have endless synchronisation issues
<damo22>between servers etc
<damo22>or simultaneous requests to the same server
<damo22>simultaneous requests to the same server that actually need to write out some state are going to have issues
<damo22>i had a hang earlier where auth, proc, exec, ext2fs, rumpdisk and another ext2fs were running
<damo22>maybe i should start with those
<youpi>damo22: rumpdisk is very probably already reentrant
<youpi>glibc is not 100% reentrant, in userland terms it's called signal-safe
<youpi>the state "only" has to be in registers or stack, to be safe
<youpi>servers are already thread-safe
<youpi>bugs may lie in several places:
<youpi>- ipc support in mach might have a few races because it's been a long time since it's been used in smp
<youpi>- userland code might have concurrency issues
<youpi>the latter case could have been already exposed in up by preemption, and smp makes it way more probable
<damo22>ok
<youpi>solid_black: linear addresses are between physical and virtual
<youpi>linear = paging has been applied, segmentation has not
<youpi>so physical -> paging -> linear -> segmentation -> virtual
<solid_black>does paging get applied *before* segmentation?
<youpi>yes
<youpi>it'd *reaaaaallly* be good that you guys read the OS book from tanenbaum
<solid_black>I mean, before, as in when converting virtual to physical
<youpi>you'd understand way more what's going on
<youpi>then no that's the converse
<youpi>virtual -> segmentation -> linear -> paging -> physical
<solid_black>yes, so segmentation is before virtual, as I thought
<youpi>the PC BIOS standard says there is some memory starting at 0
<youpi>(physical)
<solid_black>so linear is after segmentation, before virtual
<youpi>that can be e.g. 512K, 640K
<solid_black>that's what I'd call virtual
<youpi>then memory starts after 1MB
<youpi>virtual is what kernel & processes see in the end
<youpi>so you have both segmentation and paging applied
<solid_black>can you answer the question about how pc-based accesses work if CS != DS?
<youpi>gnumach says through its paddr that it is to be loaded at 16MB, to leave room for grub to load stuff
<youpi>and it says through its vaddr that it is to be mapped at 0xC1000000
<youpi>give me time to write answers
<solid_black>sure, please take your time, and thanks :)
<solid_black>but I did not understand what you just said
<youpi>jmps/calls etc. usually only use absolute addresses without segment
<youpi>if you use a far version, you additionally have the cs
<solid_black>GRUB does not even look at the ELF headers, does it? it looks at multiboot
<youpi>jmp and call can also be given a relative address, if it's not too large
<youpi>call will push the absolute result
<solid_black>yes, that (about jmp/call vs ljmp/lcall) I understand
<youpi>(which is convenient to pop right after that to get it)
<solid_black>yes, but the address it pushes, is it CS-base-relative ("virtual") or zero-relative ("linear")?
<youpi>cs-relative
<youpi>userland/kernelland almost never talk about linear
<solid_black>right, that's what I expected, but then how can you use it to access .data?
<youpi>you don't
<youpi>that's what segmentation is all about
<youpi>you're not supposed to jmp/call into .data
<solid_black>but the compiler can generate PC-relative addressing for .data, no?
<solid_black>I'm not jumping there, I'm getting my eip and adding an offset to that
<youpi>that's 64bit
<youpi>it's different and indeed assumes that the segmentation is trivial
<solid_black>not necessarily, on 32-bit too
<youpi>? no
<azert>damo22: I’ve been looking into the way netbsd deals with processor sets. Netbsd has a namespace for that; the problem is that gnumach uses file handles (ports) instead
<youpi>that's why libraries use the ebx pointer
<azert>So porting netbsd user side would mean a hack at best
<damo22>azert: i wrote a trivial program to become a task on APs
<damo22>i sent it to the mailing list
<youpi>solid_black: multiboot does interpret the elf
<azert>I’ll look
<azert>Thanks damo22
<solid_black>this thing: __x86.get_pc_thunk.ax
<azert>Ideally Hurd gets a translators for cpu and sets. Maybe inside the acpi translator is the good place
<youpi>solid_black: yes
<damo22>azert, this is really a kernel thing
<damo22>it affects the scheduling of tasks
<solid_black>that gets emitted with -fPIC, sure, but does -fno-pic guarantee 100% that the compiler will never ever use something like that?
<solid_black>PC-relative addressing I mean
<youpi>no
<solid_black>or, is that still compatible with cs_base != ds_base, in some way that I'm not seeing?
<youpi>iirc the sysv abi says cs=ds
<solid_black>ah
<youpi>the thing is: the processor was initially built without that
<solid_black>wdym multiboot interprets ELF?
<youpi>thus why x86-32 doesn't have cs-relative support for ds-instructions
<youpi>solid_black: it reads paddr and vaddr from the elf header
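To picture the thunk discussed above: with -fPIC on i386 the compiler has to materialize its own %eip to reach global data, and the data access then goes through %ds even though the address was derived from %cs, which is why cs = ds must be assumed. A simplified sketch of typical gcc -m32 -fPIC output:

    /* C source: */
    extern int counter;

    int
    get_counter (void)
    {
      return counter;
    }

    /* Roughly what gcc -m32 -fPIC emits (simplified):
     *
     * get_counter:
     *     call __x86.get_pc_thunk.ax    # thunk body: movl (%esp), %eax; ret
     *     addl $_GLOBAL_OFFSET_TABLE_, %eax
     *     movl counter@GOT(%eax), %eax  # fetch &counter from the GOT
     *     movl (%eax), %eax             # fetch counter itself
     *     ret
     *
     * The call pushes a %cs-relative address, but the movls go through
     * %ds, so the trick only works when the two bases coincide.  */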
<damo22>where would i look for ipc races?
<damo22>in the ipc code itself in gnumach?
<youpi>yes, but first check whether there actually are
<youpi>by making your test program do some ipc etc.
<youpi>it's useless to spend time hunting for bugs before you actually know where bugs lie
<damo22>what is a simple rpc i can hit repeatedly without causing any damage?
<damo22>would that test it?
<youpi>you can start with kernel-only rpcs such as mach_port_names
<youpi>then you can try proc-based rpcs that get process information
<damo22>my rpc to mach_port_names() returns a mach_port_array_t of mach_port_t, what do i do with a mach_port_t?
<damo22>do i just print the number
<damo22>ok i got it to lock up
<damo22>doing one rpc in parallel
<damo22>its now idle
<damo22>but not continuing
<damo22>i mean i am calling the same rpc over and over with an openmp loop on APs
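A hedged reconstruction of such a test (not damo22's actual program): hammer mach_port_names(), a kernel-implemented RPC, from all CPUs at once via OpenMP.

    /* rpc-stress.c -- build with: gcc -fopenmp rpc-stress.c -o rpc-stress
     * Assumes the task has already been placed on the AP pset.  */
    #include <stdio.h>
    #include <mach.h>

    int
    main (void)
    {
      int i;

    #pragma omp parallel for
      for (i = 0; i < 100000; i++)
        {
          mach_port_array_t names;
          mach_port_type_array_t types;
          mach_msg_type_number_t nnames, ntypes;

          if (mach_port_names (mach_task_self (), &names, &nnames,
                               &types, &ntypes))
            continue;
          /* Both arrays come back as out-of-line memory; release them.  */
          vm_deallocate (mach_task_self (), (vm_address_t) names,
                         nnames * sizeof *names);
          vm_deallocate (mach_task_self (), (vm_address_t) types,
                         ntypes * sizeof *types);
        }
      puts ("done");
      return 0;
    }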
<biblio>hi all, I am doing experiments with hurd on riscv. Intro to riscv on baremetal in my blog: https://netuse.dynamicmalloc.com/riscv_bare_metal.html. Initial "hello world" example code https://netuse.dynamicmalloc.com/cgit/gnumach-riscv.git/commit/?id=3004496dad956569ffa7a7f35c459dba7517739f . I am still facing an issue calling external functions from kern/startup.c. Still learning...
<solid_black>biblio: this is so cool!
<solid_black>please keep going
<solid_black>one small thing: I don't think there's any need for riscv/include/mach/riscv/exec/elf.h to be an installed header?
<solid_black>same for eflags.h
<solid_black>you don't want the x86-specific stuff that's in mach_i386.defs (and you copied over to mach_riscv.defs), just drop it
<biblio>solid_black: I can remove it. It was in some other dependency. I agree, initial goal was to make it compile and replace all riscv/ files one by one.
<biblio>solid_black: the arch memory model is super simple. But, they use some extensive optimization at link time (LD).
<solid_black>does it make any sense to support user32 on riscv? (I don't know, I'm asking)
<solid_black>is multiboot a thing on risc-v? (if no, drop multiboot.h)
<solid_black>thread_status.h, you should rewrite to be riscv-specific and not x86-specific
<solid_black>which ld optimizations do you mean?
<biblio>solid_black: yes, i will do in followup changes.
<solid_black>like, linker relaxations?
<biblio>solid_black: yes linker relaxation.
<solid_black>does your branch build?
<biblio>solid_black: yes in local, let me check again.
<solid_black>(not saying that it doesn't, just asking)
<biblio>solid_black: sure.
<biblio>solid_black: also mig needs to build with riscv target.
<solid_black>yes
<solid_black>you should start by adding a riscv64-gnu target to binutils and gcc
<biblio>solid_black: yes https://github.com/riscv-collab/riscv-gnu-toolchain
<biblio>solid_black: I will write a readme doc to make it easier for others.
<solid_black>isn't that a riscv-linux-gnu toolchain?
<solid_black>as in, for Linux
<solid_black>this probably doesn't matter for kernel (gnumach) hacking, if you specify the right compiler with CC etc
<biblio>solid_black: debian also ships a riscv64 gcc package. In case you want multi-arch with the 32-bit option, you need to install from source.
<solid_black>but I prefer to build the proper whatever-gnu toolchain
<biblio>solid_black: yes
<solid_black>so what I was trying to say is that for aarch64-gnu, I found that MIG needed no changes
<solid_black>it just built with --target=aarch64-gnu
<biblio>solid_black: yes, I used similar command for mig --target=riscv64-unknown-linux-gnu
<solid_black>but that's wrong, there should not be a linux in there :(
<biblio>solid_black: i just followed https://github.com/riscv-collab/riscv-gnu-toolchain . It's just the binary name, nothing to do with the OS i think.
<solid_black>it has everything to do with the platform
<solid_black>in case of MIG, it probably doesn't check, so it may not be a big deal
<solid_black>but in general you'll get a toolchain that does wrong things if you pass whatever-linux-gnu when whatever-gnu is needed
<biblio>solid_black: ok
<solid_black>I'll try to hack on userland side of things (glibc) once you have a more complete gnumach port
<solid_black>and figure out details such as syscall interface & riscv_thread_state
<biblio>solid_black: it would be great.
<biblio>solid_black: added another patchset to fix build issue with latest gnumach. I was able to compile and run it. https://paste.debian.net/1307059/
<solid_black>👍️
<Pellescours>can't tools like valgrind help to detect the locking issues?
<youpi>inside the kernel, no
<youpi>in userland, perhaps, but porting valgrind is very involved
<youpi>I'd rather recommend working on tsan
<youpi>(and no, valgrind or tsan cannot detect all locking issues)
<Pellescours>i see
<Pellescours>I know that no tool is able to magically find all those
<azert>What I said earlier to damo22 wasn’t that cpu sets can be taken out of the kernel. What I meant is that the netbsd userspace support cannot be cleanly ported
<azert>Typically netbsd has cpu_ids, while gnumach has hosts and processor ports
<azert>shell scripting doesn’t interact very elegantly with handles, as far as I understand it
<azert>Btw what is the point of supporting multiple hosts? Was it used to allow virtualization?
<azert>One could think of exposing hosts and processors from the kernel to the filesystem, then one could use shell scripting to interact with processor sets in a hurdish way
<Gooberpatrol_66>azert: gnumach supports clustering, maybe that's it?
<azert>Gooberpatrol_66: do you mean that on a cluster gnumach would see all other hosts and processors on those hosts? How would one configure that?
<azert>Also, since I suppose you run multiple gnumachs, how do they coordinate?
<azert>I know about NetMsg, you seem to suggest a totally different level
<azert>In any case this looks like obsolete baggage to me, nobody builds clusters like that anymore
<Gooberpatrol_66> https://darnassus.sceen.net/~hurd-web/microkernel/mach/rpc/ there's a thing called norma
<azert>I can see the use of an host port for virtualization
<Gooberpatrol_66>it's not in gnumach though, but other mach versions
<azert>macOS doesn’t do that either
<Gooberpatrol_66> https://web.mit.edu/darwin/src/modules/xnu/osfmk/man/
<Gooberpatrol_66>I don't think clustering is useless, just people don't use it because no mainstream OS supports it well
<Gooberpatrol_66>and people have kind of recreated it in a clunkier way with kubernetes
<azert>Threads are incompatible with transparent clustering
<azert>Im not an expert, but kubernetes looks like something devops would use to manage servers
<azert>So a totally different beast
<Gooberpatrol_66>i mean they use it to spread workloads across machines/loadbalance
<azert>I think the best use of this is to partition the cpus into subsets to guarantee them to subhurds
<azert>And subhurds would have no way to know there are other CPUs available since their host port is not gnumach but a man sitting in the middle
<Gooberpatrol_66>azert: have you read Brent Baccala's paper on hurd clustering?
<Gooberpatrol_66> https://www.freesoft.org/software/hurd/building.pdf
<azert>Yes, I’ve seen his presentation, fascinating
<azert>I wish he merged his code
<Gooberpatrol_66>same
<rekado>his code is here: https://github.com/BrentBaccala/hurd/
<rekado>to revive the work we could look at whether the patches in https://github.com/BrentBaccala/hurd/tree/master/patches are still required.
<rekado>AFAIK he doesn’t continue working on the Hurd because of the copyright assignment policy.
<azert>I think he got pissed off by something
<azert>He probably couldn’t sign those
<Gooberpatrol_66>so...reverse engineer the patches and merge them, and then maintain the rest of the components out-of-repo?
<azert>The only real patches there are for rpctrace
<azert>Also extremely well documented
<azert>Maybe fork rpctrace if it’s really better
<azert>Like it doesn’t really have to live with the hurd
<azert>Im still wondering if his work wasn’t a good step toward a truly distributed filesystem
<Gooberpatrol_66>from the paper
<Gooberpatrol_66>> We could use a shim process on remote nodes, to allow requests for an existing page to go to a node that holds a copy of that page, instead of to the node with the disk. Two ways I can imagine this. Either ext2fs forks off multiple processes on different nodes, or a mechanism is developed to detect when a port is remote, figure out which node it (currently) resides on, and then libpager can fork a process and push it to the other node, invisible to ext2fs proper.
<Gooberpatrol_66>yikes, konversation doesn't paste well
<Gooberpatrol_66>also, maybe could wrap an existing distributed fs (ceph?)
<damo22>azert: i dont think MACH_HOST is about multiple hosts, its just to enable managing cpus as sets