IRC channel logs
2023-09-29.log
<gnucode>oh...so I met my old college professor today. He has a lab with his minimal OS. He actually gave me a short tour...And he recommended that I could build my own lab with the Hurd...
<gnucode>He said something about making charges to the Hurd, then power cycling the machine...
<gnucode>He is of the opinion that OS development be done with kprintf
<damo22>sometimes i put in printf("W"); into gnumach and reboot, just to see when and how often it executes a code path
<damo22> 28 f58f6660 (((/bin/sh(119)))) [2]
<damo22>Processor set runq: count(0) low(25)
<damo22>Processor #0 runq: count(0) low(32)
<damo22>Processor #1 runq: count(0) low(32)
<damo22>runqs are empty even though there are plenty of threads
<damo22>so nothing is running, its just idling
<damo22>i think i fixed smp slowness, except i am getting a general protection fault returning from an interrupt
<damo22>what does it mean when iret faults?
<janneke>ooh, great -- hoping you find the segfault soon
<damo22>how do i debug with gdb and get a back trace without kdb getting in the way?
<damo22>i guess i can compile gnumach without kdb
<damo22>ok so it was about to call iret, and got a IPI interrupt... is that bad?
<damo22>it also had a level interrupt from ethernet
<youpi>damo22: that's supposed to happen fine
<youpi>as in: IF is supposed to be clear before calling iret, so the interrupt happens *after* the iret
<damo22>the IF was set just before the iret executed
<youpi>with a lot of IPIs you can end up with cascaded interrupts that fill the stack
<youpi>why is the IF set just before iret is executed?
<damo22>not set "just as", i meant at the point where it wanted to iret, it was still set
<damo22>is it possible that th->processor_set is not set?
<youpi>it's null in the thread_template
<youpi>and it's set to null in pset_remove_thread
<youpi>so that's probably expected it can be null
<youpi>but possibly it's always set to non-null before unlocking the thread
<youpi>thread_deallocate is the only caller of pset_remove_thread
<damo22>can i just use default_pset instead of th->processor_set ? we dont need it
<youpi>and thread_create initializes the pset from the parent_task
<youpi>so possibly the pset gets passed from tasks to tasks to threads
<youpi>well, for a start you can put an assert to check whether it does happen to be null
<youpi>not respecting the pset would mean not respecting the thread binding
<damo22>i think we only have one processor_Set
<youpi>yes but later people may create others
<solid_black>I did see that you sent a patch, and it sounds great! but I haven't looked too closely
<damo22>the run queues are almost empty and the code that schedules directly to idle processors seems to slow it down
<damo22>so while its deciding if it should schedule onto an idle processor it could just put the thread on a run queue and the idle threads will detect it
<solid_black>no really, why does it take that long to check whether it can be scheduled onto an idle processor?
<damo22>i think it takes locks that are expensive perhaps
<damo22>or there is a side effect of not putting the thread actually on a runq
<solid_black>maybe; but if that's the case everything everywhere would be super slow
<solid_black>is there a way to profile gnumach and get like a flamegraph?
<damo22>it calls iret and explodes during boot
<damo22>login: Kernel Page fault trap, eip 0x46, code 0, cr2 46
<damo22>Stopped at 0x46:Kernel Page fault trap, eip 0xc1034d35, code 0, cr2 46
<damo22> Caught Page fault (14), code = 0, pc = c1034d35
<damo22>it seems like the net_write() call caused the cpu to jump to 0
<solid_black>I saw you recreated the event once again? why? are you sure everybody got it?
<gnucode>I edited the event. I just changed the text that it said. Then google asked if I wanted to resend the the updated info.
<gnucode>Everybody on the Hurd side has accepted the invitation.
<solid_black>but according to the message I got from Google, you actually cancelled it, and then created a new one
<gnucode>I did accidentally press the delete event button, then pressed undo.
<gnucode>as far as I can tell it's still the same event.
<gnucode>As smart as you are, I bet Kent is smart too. :)
<solid_black>Kent is much smarter than me no doubt, but we're already stretching his willingness to spend his time on this
<solid_black>also I collected a list of things I'd like to discuss, not sure an hour will be enough
<gnucode>I'm going to go work out for a half hour now. And you might be right. But it is also possible that he will be very cordial.
<solid_black>also, I did enable PipeWire media support in Firefox, but it still doesn't capture video from my front/selfie camera
<gnucode>last thing...I have to use a windows computer to join the meeting. My linux machine is not accepting my camera...
<gnucode>and I don't know how to record video on this windows computer that I do not own.
<solid_black>ohhhh that's an awesome idea, I'll reboot into windows
<gnucode>hmm. I'll think about that after I work out. gotta get started.
<wleslie>kent = kent mcleod? that could be entertaining
<gnucode>wleslie: bcachefs author. kent overstreet
<xelxebar>Man, that sounds like a cool conversation to be in on!
<gnucode>xelxebar: We are planning on recording said conversation.
<gnucode>I don't want to invite everyone, because I am trying to avoid it turning into a circus. :)
<wleslie>front page of website mentions capnproto, good sign
<gnucode>solid_black I am back now. eating bkfast. will shower soon.
<gnucode>nikolar: if you wanna see a young Einstein, you could log into the meeting room now.
<gnucode>that would just let us double check things.
<solid_black>so I rebooted into Windows, let Windows Update do its thing, and now I can no longer boot into GNU/Linux
<solid_black>is anyone in the room currently? should I try joining?
<gnucode>solid_black: what distro do you use?
<solid_black>Fedora (with linux-surface stuff) on the host, typically Debian in VMs
<gnucode>gotcha. I'm a guix system fan myself. and OpenBSD.
<solid_black>it's still 1.5 hours until the actual meeting starts
<wleslie>in a week you'll be in DST and you'll have a real hard time of international meetings ^___^
<damo22>us aussies have a hard time with international meetings
<janneke>gnucode: most probably i'll be tied up this afternoon, i'll see if i can find a moment to attend...
<xelxebar>gnucode: Cool. Looking forward to the video!
<janneke>ACTION has a special day; their daughter returns home for a bit after 2months of schooling in norway
<janneke>(fresh mentioning of old news, still nice)
<wleslie>you had a good hack though, I'm glad they bought the holiday forward
<wleslie>fridays off always seem more productive
<damo22>im trying to fix the last known bug with gnumach smp
<wleslie>I saw; interrupts have been perplexing me for the last couple of months too
<damo22>maybe i should try compiling with -O0
<wleslie>it's fun that it says /page fault/. Are you getting vector 0xd or 0xe?
<damo22>i thought it was a general prot fault
<damo22>because it happens when iret is called
<damo22>but maybe its randomly hitting that or a page fault after
<damo22>when it tries to push something on the zero stack
<wleslie>could it have happened while handling the interrupt, and now that you're returning it can proceed?
<damo22>something happened on the edge of returning from an interrupt, iret caused a general prot fault
<wleslie>I mean, it's possible that the interrupt occurred earlier, but it was masked, right?
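A minimal C sketch for the vector question above (0xd versus 0xe). It only encodes the classic x86 exception numbering: 0xd is the general protection fault, 0xe is the page fault, and only the page fault loads CR2 with a faulting address. The function names are hypothetical helpers for illustration and are not part of gnumach.

```c
#include <stdbool.h>

/* Names for a few classic x86 exception vectors. */
const char *x86_exception_name(unsigned vector)
{
    switch (vector) {
    case 0x08: return "#DF double fault";
    case 0x0b: return "#NP segment not present";
    case 0x0c: return "#SS stack-segment fault";
    case 0x0d: return "#GP general protection";
    case 0x0e: return "#PF page fault";
    default:   return "other";
    }
}

/* True if the CPU pushes an error code for this vector
 * (restricted here to the classic vectors 0..19). */
bool x86_exception_has_error_code(unsigned vector)
{
    switch (vector) {
    case 0x08: case 0x0a: case 0x0b: case 0x0c:
    case 0x0d: case 0x0e: case 0x11:
        return true;
    default:
        return false;
    }
}

/* Only a page fault updates CR2; for any other vector the CR2
 * value printed in a trap message is left over from an earlier fault. */
bool x86_exception_sets_cr2(unsigned vector)
{
    return vector == 0x0e;
}
```

Under these assumptions, a trap message that reports a general protection fault together with a cr2 value is probably showing a stale cr2, so the vector number is the more reliable thing to check first.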
<damo22>i remember reading on osdev that if interrupt flag is set before iret is called, you can get a general prot fault
<wleslie>you mean once you handle one interrupt, if there's a pending interrupt, you'll hit it when you iret?
<wleslie>did you clear the previous interrupt?
<damo22>there is a comment in the old code that says you should call the EOI before handling the interrupt so it can occur again
<damo22>seems like 0x58 level interrupt is getting stuck
<wleslie>I don't have anything on how to ack an ipi (haven't gotten that far). I do have a note saying that you can take exceptions on IRET if someone loads a segment selector with a nonsense value.
<damo22>hmm maybe the initial value of gs is garbage
<wleslie>segment selectors are used to hold thread-local storage on gnu systems, and it's possible to remove those pages from the pmap I guess
<damo22>but we use gs to hold percpu area
<damo22>maybe when it restores gs, gs has a nonsense value?
<wleslie>it could be set to nonsense by the user, but I'm not sure why anyone would be messing with it outside of glibc
<wleslie>we get GP if the code segment is bogus, according to the note I have
<damo22>im pretty sure we save gs value upon entering the kernel and restore it on exit
<damo22>and in between, we set it to 0x68
<wleslie>are you looking at i386at/interrupt.S ?
<damo22>for the first time i was able to get a shell and ssh to -smp 2
<damo22>is there a simple command i can use to stress 2 cores?
<damo22>without creating files on my disk
<janneke>hmm, stress is not a GNU tool, make that
<janneke>(it _might_ be linux specific, dunno!)
<janneke>meanwhile, /me tries "guix shell stress -- stress -c 2" in a childhurd
<damo22>damn, the ethernet level triggered interrupt is stcuk
<damo22>maybe that is a ioapic specific issue
<wleslie>goodnight. hope you get some good sleep
<damo22>Explicit EOI is only supported for IOAPIC version 0x20
<damo22>qemu is emulating version 0x11 of ioapic
<nikolar>I am on my phone and can't reply on call
<gnucode>Gooberpatrol66: you have the recording I believe. thanks again for that!
<Gooberpatrol66>also my video and nikolar's comments are cut out of the screen, sorry
<gnucode>We did not hear the loud grinding noise at all.
<gnucode>honestly a video, even with poor sound, is better than nothing.
<gnucode>Kent really likes rust, and encourages us to use Rust when writing new filesystems.
<gnucode>Kent believes and much of linux's libraries could and should be modified to run in userspace.
<gnucode>and then we could use those libraries in userspace.
<Gooberpatrol66>him trying to get all of linux VFS running through fuse is cool, that makes a fuse translator an even better sell
<gnucode>I believe he said that linux's hash tables could mostly already be used on the Hurd right now in userspace.
<gnucode>true. It was nice that he was very laid back.
<gnucode>solid_black definitely helped on asking the technical questions.
<gnucode>Gooberpatrol66: had some good questions too!
<gnucode>Gooberpatrol66: thinks that we should host Linus Torvalds next and restart the age old debate of microkernels vs. monolithic kernels!
<gnucode>Kent also invited anyone to help out with bcachefs. Apparently he does a lot of mentoring for people wanting to become developers.
<gnucode>youpi: provided that I have your blessing to invite other cool software people to talk to the Hurd people...
<gnucode>would you like to join in these talks? What days and times work for you?
<gnucode>also bcachefs is licensced GPLv2. google owns the copyright on much of bachefs' code. So it will most likely stay GPLv2.
<gnucode>people are already teasing me about my "not trying to smile face".
<gnucode>Gooberpatrol66: can you put a link to Kent's patreon on the youtube description
<Gooberpatrol66>google needs to verify my face before i can put urls in the description, which takes 24hrs
<youpi>gnucode: well, the thing is that if the meeting goes bad, that can bring bad press, we don't really want that. Inviting Linus is probably not a good idea, notably
<nckx>gnucode: Congrats, it seems like as usual I missed something cool.
<damo22>youpi: it seems that qemu is emulating an old ioapic version that does not support directed EOI per irq
<damo22>this is a problem for level triggered interrupts
<damo22>there is a workaround apparently in linux where you mask the irq then change the trigger mode or something like that, which i did implement in hurd but i dont know if its working
<damo22>i introduced a check for ioapic version
<damo22>but i did not submit it as a patch yet
<damo22>i am noticing that ethernet interrupt is getting stuck, 0x58 (level)
<gnucode>nckx: haha. No worries. I just think it's funny that I have a silly smile on my face for most of the interview.
<gnucode>youpi: sounds good. It was interesting to hear kent talk about using various kernel libraries in userspace. But apparently there is a bit of work to make that happen
<gnucode>also most of the linux kernel is GPLv2 only.
<gnucode>to hear kent encourage to use various kernel libraries in userspace.*
<damo22>Kernel General protection trap, eip 0xc100a997, code 0, cr2 c10a8190
<damo22>kernel: General protection (13), code=0
<damo22>The General Protection Fault sets an error code, which is the segment selector index when the exception is segment related. Otherwise, 0.
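The error-code remark in the last message can be sketched in C as follows. This assumes the standard x86 selector error-code layout (bit 0 external event, bit 1 IDT, bit 2 table indicator, bits 3 to 15 index); the struct and function names are hypothetical and not taken from gnumach.

```c
#include <stdint.h>

/* Fields of a selector-style x86 exception error code (hypothetical names). */
struct sel_error_code {
    unsigned external; /* bit 0: caused by an event external to the program */
    unsigned idt;      /* bit 1: index refers to an IDT gate */
    unsigned ldt;      /* bit 2: TI, LDT if set, GDT if clear (only when idt == 0) */
    unsigned index;    /* bits 3..15: descriptor or gate index */
};

/* Split the error code pushed by #GP (and #NP, #SS, #TS) into its fields. */
struct sel_error_code decode_sel_error_code(uint32_t code)
{
    struct sel_error_code e;
    e.external = code & 1u;
    e.idt      = (code >> 1) & 1u;
    e.ldt      = (code >> 2) & 1u;
    e.index    = (code >> 3) & 0x1fffu;
    return e;
}

/* Descriptor index held in a segment selector such as the 0x68
 * value mentioned for gs earlier in the log. */
unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}
```

With this layout, selector 0x68 is GDT entry 13 at privilege level 0, and a fault on that selector would carry error code 0x68 rather than 0. The `code=0` in the trap output therefore suggests the fault was not attributed to a specific selector, though the log itself does not confirm the cause.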