IRC channel logs

2026-01-14.log


<sobkas>So is there x86_64-gnu-gcc-15 for linux?
<sobkas>I was trying to use distcc
<nexussfan>IIRC you can cross-compile hurd programs on linux
<azert>sobkas: are you sure that defining those structures for the ioctl makes rendering faster?
<azert>I don’t think anything in the Hurd implements those synchronization primitives
<sobkas>I don't think it was caused by this, but it was quite funny, so I mentioned it
<azert>It’s funny how probably they claimed the same on Linux
<azert>like someone put lots of thought to make better something that was already working
<azert>quite decently
<sobkas>?
<azert>Slide 5: https://events.static.linuxfound.org/sites/events/files/slides/Explicit-Fencing_Talk.pdf
<azert>But it can freeze the whole desktop!
<azert>Except it never happened
<sobkas>nexussfan: I hope it involves sbuild?
<sobkas>I want to use qxl but there is a memory leak in it when used with opengl, I hope to have some time to look into it
<sobkas>I was thinking about wayland, doesn't it depend on libegl which depends on libgbm?
<nexussfan>Yes i think
<sobkas>So adding hurd console support to libgbm could make a short term "solution"?
<sobkas>Because wayland is essentially a buffer mover?
<sobkas>It won't be fast but it will be something...
<sobkas>good night
<damo22>0x43468c(...)
<damo22>0x50b80f(...)
<damo22>repeat
<damo22>(gdb) p *regs
<damo22>$4 = {r15 = 0x0, r14 = 0x0, r13 = 0x8, r12 = 0x30, r11 = 0x246, r10 = 0xffffffff, r9 = 0x0, r8 = 0x8, edi = 0x7ffffffe0040,
<damo22> esi = 0x3, ebp = 0x7ffffffe0020, cr2 = 0x7ffffffdfff8, ebx = 0x7, edx = 0x50, ecx = 0x30, eax = 0xc8800000000, trapno = 0xe,
<damo22> err = 0x6, eip = 0x4344a5, cs = 0x1f, efl = 0x10246, uesp = 0x7ffffffe0000, ss = 0x17}
<damo22>i think those addresses are in pci-arbiter
<youpi>damo22: you can use `show task` to see what task that is
<youpi>and then you can addrline the binary
<damo22>ok i recompiled pci-arbiter.static with debugging symbols
<damo22>mach_msg 0x004356a5 <+21>: push %rbx
<damo22> 0x00435887 <+87>: call 0x50cd60 <mach_port_mod_refs>
<damo22> 0x0050ce3a <+218>: call 0x435830 <mig_dealloc_reply_port>
<damo22>>>>>> user space <<<<<
<damo22>0x4356a5(...)
<damo22>0x50cdec(...)
<damo22>0x43588c(...)
<damo22>0x50ce3f(...)
<damo22>0x43588c(...)
<damo22> 1 pci-arbiter (ffffffffdc2eac18): (ffffffffdc2de908) R....F.
<damo22>the stack trace goes on for pages and pages
<damo22>its a recursion
<damo22>0x43588c(...)
<damo22>0x50ce3f(...)
<damo22>but i think its also because the stack is overflowed so it cant push %rbx
<damo22>mach_port_mod_refs calls mig_dealloc_reply_port which calls mach_port_mod_refs...
<damo22>and it recurses
<damo22> https://paste.debian.net/plainh/d00a9598
<youpi>damo22: it can happen if the fs-based tls access to get the reply_port is broken, making the reply port bogus and thus the mach_port_mod_refs rpc error out, and the stub calls __mig_dealloc_reply_port
<youpi>you have the C code of mach_port_mod_refs in a glibc build tree, in build-tree/mach/RPC*mach_port_mod_refs.c
<damo22>i think you must be right
<damo22>how is FS supposed to work with amd64 smp?
<damo22>if theres no swapfs, i guess we just dont use fs in kernel?
<youpi>just as usual, switch_ktss should load the fs base on task switch
<youpi>we just don't use fs in kernel
<damo22>ok
<damo22>but it should be doing that
<youpi>it already does
<damo22>yes
<damo22>unless kernel is touching fs somewhere
<damo22>other than loading in user fs during task switch
<damo22>kernel uses FS for copyin copyout
<youpi>but that's not new with smp
<damo22>right
<damo22>inst_fetch uses fs
<damo22>could taking a trap be breaking fs?
<aculnaig>settrans -fac /tmp/site/ ./httpfs -D www.cloudfare.com
<aculnaig>./httpfs: Url must have a /, e.g., www.gnu.org/
<aculnaig>ok, then I have to append the /
<aculnaig>settrans -fac /tmp/site/ ./httpfs -D www.cloudfare.com/
<aculnaig>In the HTML parser for parsing tmp
<aculnaig>trying to open www.cloudfare.com:80/http://www.cloudfare.com/
<aculnaig>Error Page not Accesible
<aculnaig>400 Bad Request
<aculnaig>./httpfs: Error in Parsing.: Bad file descriptor
<aculnaig>settrans: ./httpfs: Translator died
<aculnaig>is it just me or is the www.gnu.org site down?
<damo22>settrans -fac /tmp/site/ ./httpfs -D https://www.gnu.org/
<aculnaig>yeah i usually try with gnu site but at the moment is down
<damo22>its working here
<damo22>but you forgot the protocol://
<aculnaig>i think the code does have some sort of control that prepends the protocol if missing
<youpi>damo22: inst_fetch is only used for emulated int80 system call, which we don't use
<damo22>yeah im not sure what is going on
<damo22>during a syscall64, do we need to touch fs?
<youpi>we should be able to just leave it as it is
<damo22>gnumach-sv$ git grep PCB_ISS
<damo22>x86_64/locore.S: addq $ PCB_ISS,%r11 /* point to saved state */
<damo22>x86_64/locore.S: addq $ PCB_ISS,%r11 /* point to saved state */
<damo22>how does PCB_ISS resolve to anything?
<youpi>it's generated in i386asm.h from i386asm.sym
<youpi>as the offset of iss in the structure pcb
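As a rough illustration of the point above, the constant is just a structure offset made visible to assembly. The sketch below uses hypothetical, simplified structures (`pcb_sketch`, `saved_state_sketch`, `PCB_ISS_SKETCH`); they are assumptions for illustration and not the real gnumach layouts or the generated header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real gnumach structures. */
struct saved_state_sketch {
    unsigned long r15, r14, r13, r12;
};

struct pcb_sketch {
    unsigned long placeholder[4];   /* whatever precedes the saved state */
    struct saved_state_sketch iss;  /* saved register state */
};

/* The kind of constant a generated header would expose to assembly. */
#define PCB_ISS_SKETCH offsetof(struct pcb_sketch, iss)

/* C equivalent of "add the iss offset to a pcb pointer". */
static struct saved_state_sketch *point_to_saved_state(struct pcb_sketch *pcb)
{
    assert(pcb != NULL);
    return (struct saved_state_sketch *)((char *)pcb + PCB_ISS_SKETCH);
}
```

Calling `point_to_saved_state(&p)` on a `struct pcb_sketch p` yields the same address as `&p.iss`, which is all the assembly addition relies on.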
<damo22>ah yes
<damo22>can a syscall be negative?
<damo22>syscall number
<youpi>iirc they are all negative :)
<damo22>because we sign extend the eax register holding the syscall number, if its negative it will set all the upper bits of rax
<damo22>not "mask" it
<youpi>I'd say that's expected
<youpi>and again, it does work in non-smp
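The difference between sign-extending and masking a 32-bit syscall number can be sketched in C. The function names are made up for illustration, and the cast from `uint32_t` to `int32_t` assumes the usual two's-complement behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend: a negative 32-bit value sets all upper bits of the result. */
static uint64_t sign_extend_32_to_64(uint32_t eax)
{
    return (uint64_t)(int64_t)(int32_t)eax;
}

/* Zero-extend ("mask"): the upper 32 bits are cleared instead. */
static uint64_t zero_extend_32_to_64(uint32_t eax)
{
    return (uint64_t)eax;
}
```

With an illustrative negative number such as -25 (bit pattern `0xffffffe7`), the first function gives `0xffffffffffffffe7` and the second gives `0x00000000ffffffe7`.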
<damo22>aha
<damo22>CPU_NUMBER_NO_STACK uses %cs:
<damo22>so that needs to be kernel segment
<damo22>but in the syscall we are in user segs?
<youpi>cs is necessarily the kernel segment
<youpi>since that's what is used to fetch instructions
<damo22>how does the cs change to kernel when syscall64 enters?
<youpi>the processor does it
<youpi>it's set in the syscall trap
<damo22>thanks
<damo22>booted with -smp 1
<damo22>--enable-ncpus=8
<damo22>-smp 2:
<damo22>Cannot load user executable module (error code 6000): pci-arbiter
<damo22>Panic(...)+0x10f
<damo22>Kernel Breakpoint trap, eip 0xffffffff81017a26, code 0, cr2 ffffffffdc2e5e
<damo22>Bad frame pointer: 0xffffffff81072895
<damo22>hmm lets see if this works
<damo22>when the syscall_call takes place, it needs to be in kernel gs as well i think
<damo22>well thats interesting, i get invalid opcode
<damo22>when about to call the actual syscall i am not allowed to swapgs
<youpi>what do you call syscall_call?
<damo22>t_debug(...)+0xb
<damo22>switch_ktss(...)+0xa5
<damo22>stack_handoff(...)+0x165
<damo22>thread_handoff(...)+0x13f
<damo22>mach_msg_trap(...)+0x18ed
<damo22>syscall64(...)+0xf2
<youpi>within switch_ktss, gsbase is still the kernel gs yes
<youpi>it's kgsbase which has the user gs
<damo22>interesting inside a syscall it can switch contexts
<youpi>sure, if it's a blocking system call, you have to switch to something else
<damo22>aha
<damo22>thats why my code is broken
<youpi>or if you wake some thread that is more prioritized
<damo22>syscall64 function is tricky
<damo22>you cant use many registers
<damo22>only r11
<youpi>but there you only need to swapgs
<youpi>unconditionally
<youpi>since you know you are doing user-to-kernel and kernel-to-user
<youpi>note that if you switch to another thread, you *won't* be returning through that syscall64 code immediately
<youpi>it's only when you get back to your thread that syscall64 will take control back, and swapgs
<youpi>switch_ktss having put back the user gs base into kgsbase for swapgs to install it into gsbase
<damo22>kernel: Invalid opcode (6), code=0
<damo22>Stopped at t_debug+0xb: TODO
<damo22>t_debug(...)+0xb
<damo22>switch_ktss(...)+0xa5
<damo22>stack_handoff(...)+0x165
<damo22>thread_handoff(...)+0x13f
<damo22>mach_msg_trap(...)+0x18ed
<damo22>syscall64(...)+0xee
<youpi>where/how are you stopping?
<youpi>(what line is switch_ktss+0xa5? you can ask gdb gnumach)
<damo22>245 fpu_load_context(pcb);
<damo22>line before is db_load_context(pcb)
<youpi>what instruction is there? the fpu_load_context macro is empty
<damo22> 0xffffffff810718e0 <+160>: call 0xffffffff8106f060 <db_load_context>
<damo22> 0xffffffff810718e5 <+165>: mov -0x8(%rbp),%rbx <----
<damo22>leave; ret;
<youpi>is there something non-zero in pcb->ims.ids.dr ?
<youpi>it looks as if something overflows into them
<youpi>they are normally only used by gdb for hardware breakpoints
<youpi>if something bogus is in there that could trigger the debug trap which doesn't understand why it's called
<damo22>(gdb) p percpu_array[0]->active_thread->pcb.ims.ids
<damo22>$11 = {dr = {0x0, 0x0, 0x0, 0x0, 0x810bb000, 0xffffffff, 0x810bb000, 0xffffffff}}
<damo22>thats the gs base address twice
<youpi>so you overflowed somehow somewhere
<damo22>wow
<youpi>that makes me realize that we need to fix i386_debug_state for 64bit
<damo22>sbs = {fsbase = 0x200000000480, gsbase = 0x0}}
<youpi>Mmm, but 0xffffffff810bb000 would be the kernel gs base
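To see why the gdb output reads as "the gs base address twice", here is a small sketch that joins two adjacent 32-bit array slots into one 64-bit value. It assumes the `dr` elements are 32 bits wide (as the gdb print suggests) and little-endian layout; `join_halves` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a low and a high 32-bit slot into the 64-bit value that a
 * single 64-bit store would have written over both (little-endian). */
static uint64_t join_halves(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}
```

`join_halves(0x810bb000, 0xffffffff)` evaluates to `0xffffffff810bb000`, so slots 4-5 and 6-7 of the array each hold one copy of that address.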
<youpi>we don't want that in the user thread state
<youpi>it's the user gs base that should be in the thread state
<youpi>did you add accessing the thread state by hand in assembly?
<youpi>I don't see why you would need such a thing
<youpi>switch_ktss handles updating kgsbase with the user thread's gs base
<youpi>and in assembly you just need to swapgs
<youpi>on return to user
<damo22>syscall64 had all that already
<youpi>had what?
<damo22>accessing the thread state
<youpi>yes, it does so for the general-purpose registers, since these are clobbered by C
<youpi>but for gs you don't need that, switch_ktss does it
<damo22>yeah i just use swapgs
<damo22>to allow MY() to work
<damo22>let me check if i pushed my latest
<damo22>just pushed now
<damo22>* 5909d070 x86_64: Implement swapgs logic
<youpi>+#if NCPUS > 1
<youpi>+ wrmsr(MSR_REG_KGSBASE, pcb->ims.sbs.gsbase);
<youpi>I don't think we want to have a different logic in smp and non-smp
<damo22>ok
<youpi>that'd avoid swapgs, sure, but hairy code means buggy code :)
<damo22>so you want swapgs in UP as well?
<youpi>that'll make the code simpler yes
<damo22>ok
<youpi>and it shouldn't be very expensive
<youpi>SWAPGS_ENTRY_IF_NEEDED: I don't see why re-writing to KGSBASE, it should already be there, that's the whole point of that msr
<youpi>SWAPGS_EXIT_IF_NEEDED: you are rewriting the percpu_array into KGSBASE before swapgs, but before swapgs, it's the user gs base that is there, that we don't want to overwrite either
<youpi>in all_intrs, you removed SET_KERNEL_SEGMENTS and CPU_NUMBER, in the USER32 case we want them
<damo22>oh
<youpi>also SET_KERNEL_SEGMENTS in ast_from_interrupts
<damo22>ok i will preserve SET_KERNEL_SEGMENTS as before
<damo22>i changed it because i had an iteration of my code where SWAPGS_* was calling it
<youpi>I wonder if syscall64 is not already without interrupts enabled
<youpi>I have no idea if that's the case or not
<youpi>it'd be good to check, because that can be trouble more generally, and the tiny window just before calling cli could already be a problem
<damo22>aha
<damo22>thanks i have a lot to work on now
<youpi>I don't see why swapgs in _syscall64_check_for_ast, aiui it already has kernel gs base there (set by syscall64)
<youpi> swapgs /* switch to user gs */ in syscall64: no we don't want that
<youpi>we're about to call the kernel code for the trap
<youpi>and in _syscall64_restore_state we are still with kernel gs base
<youpi>really, syscall64 only needs two swapgs, at very beginning and very end
<damo22>but if the syscall blocks it may switch context and never return to the syscall code to swap back?
<youpi>again, that's not a problem
<youpi>if it blocks we get into another thread, another context, which will restore its own state
<youpi>whenever we unblock, we'll get back to syscall and restore the user gs base
<youpi>which will have been put in kgsbase by switch_ktss
<youpi>(performed by the thread just before ours)
<damo22>gs is set to percpu value at entry to syscall64
<youpi1>that's swapgs yes
<youpi1>and it'll remain that way until we swapgs again within another thread, to load that user thread's gs base
<damo22>i mean before swapgs is executed, at the entrypoint into syscall64: gs is already set to kernel value
<damo22>when i run swapgs it sets to 0
<damo22>something is backwards
<youpi1>? I don't see how it can already be the kernel value
<youpi1>except if we got it wrong *before*
<youpi1>in the previous syscall
<damo22>no i set breakpoint on syscall64 this was the first syscall
<damo22>is the trap gate calling a different function that already swaps gs?
<youpi1>no, syscall64 is the first instruction executed after userland did syscall
<youpi1>but possibly it's the initial user context load that didn't put the user base gs but the kernel base gs
<damo22>maybe we dont load real kernel one at the bootstrap we can just load zeroes
<damo22>boothdr
<damo22>bedtime
<youpi1>damo22: thinking about it: maybe you'd want to test the gnumach testsuite, before running a full distribution
<youpi1>that'll give you way simpler userland programs to check
<youpi>and you can make them print debugging stuff with mach_print and such
<youpi>without having to care about tls and whatnot
<azeem>if I am trying to debug Postgres processes, and attach gdb to them, all I ever see are 2-3 threads in _start () from /lib/ld-x86-64.so.1 with no meaningful stack, am I doing something wrong?
<youpi>azeem: no, external attach should be just working, even on amd64
<sobkas>What is best option if I have some random crashes to find what happened?
<sobkas>code=139
<sobkas>and such
<youpi>you can run through gdb
<youpi>if you are using 32b, you can look at the core file
<youpi>on 64b you can symlink /servers/crash to crash-suspend so the process stays there, and you can attach to it with gdb
<sobkas>youpi: thanks
<azeem>youpi: hrm
<azeem>so the main postgres process as a reasonable stack trace, but all its children are just in _start () from /lib/ld-x86-64.so.1 (same on 32bit, with ld.so.1)
<azeem>s/as a/has a/
<azeem>I'm trying to debug the frequent test suite hang, but without gdb that's a bit hard