***alMalsam1 is now known as alMalsamo
***FragByte_ is now known as FragByte
<clarity_>From what I understand, hurd is cooperative multitasking. I thought that programs have to be designed to work in that type of system. How does hurd run programs not designed for cooperative multitasking?
<youpi>no it's not cooperative multitasking
<clarity_>it's not? I was looking at gnu mach I think,
<clarity_>There currently is no kernel preemption in GNU Mach.
<clarity_>If GNU Mach were made a preemptive kernel, using continuations would probably no longer make sense as the kernel itself, that is, kernel threads can be preempted, and then their full state needs to be preserved.
<youpi>doesn't mean no user preemption
<clarity_>ah, so the gnu's are cooperative, but the programs running on hurd are preemptive?
<youpi>the hird of servers is preemptive too
<youpi>only the kernel is cooperative
<clarity_>Another question, why is IPC slow with Mach? Is it because of memory pages being swapped out during context switches?
<youpi>from what are you saying that IPC is slow with Mach?
<clarity_>I was reading that the slow IPC/context switching is the big performance issue with hurd?
<clarity_>There was an effort to port Hurd to L4 I think it was called which is a microkernel with better performance for IPC/context switching?
<youpi>I mean, if it's a paper from 10 years ago, computers have changed so much
<clarity_>I read, "the critique," and that's from 2007
<damo22>i dont understand how biblio's patch makes any difference on his system. the function fails before his code changes are executed
<youpi>yes, processors have changed so much
<youpi>what kills performance is I/O
<damo22>which layer is supposed to prefetch?
<youpi>which does see pages accesses etc.
<youpi>possibly that can be delegated
<damo22>how would the kernel know its a disk access?
<youpi>and the memory object provider can work on providing it
<damo22>so then we should disable caching in rump disk, and prefetch in the kernel instead
<youpi>I don't think caching is needed at all in rump disk actually
<youpi>a page cache in rump disk is duplicate with the caching already done by ext2fs
<damo22>the "r" prefix uses it as a character device
<damo22>Pellescours: did you try my latest acpica-nothread branch? i cant seem to figure out why the root table works but the rest doesnt
<damo22>(20:08:19) damo22: ACPI: RSDT 0x000000007FFE2337 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001)
<damo22>(20:08:19) damo22: ACPI: ? 0x4350584200000001 C3F000FF (v226 ?S? ? 53F000FF ? 87F000FE)
<clarity_>neato, I'm looking at the source for gnumach/hurd
<clarity_>is gnumach still only supporting i386? I see amd64 assembly files in the source?
***Noisytoot is now known as [
<Pellescours>damo22: not yet, but I don't understand why that was working before and not now...except if that actually never worked but it wasn't crashing/covered
<Pellescours>clarity_: some work was done, but the kernel is not yet able to boot in amd64. someone sent patches for that but they are not upstreamed
<clarity_>it seems like development is picking up on hurd
<clarity_>I'm going to look into the source a bit more and see if I can contribute after I understand it a bit more
<clarity_>I've been getting really interested in operating system kernels lately
<clarity_>it's been a good 15 years since I took the operating systems course in college, and even longer since I've coded in assembly
<damo22>youpi: did the kernel memory start address change recently?
<damo22>how can any userspace process get access to physical addresses?
<damo22>is it possible to mmap() a physical address to a virtual address?
<damo22>/dev/mem seems to be different now
<damo22>if (off >= 0xa0000 && off < 0x100000) the mem device returns i386_btop(off)
<damo22>instead of vm_page_lookup_pa(off)
<damo22>#define i386_btop(x) (((unsigned long)(x)) >> I386_PGSHIFT)
<Pellescours>damo22: your latest commit on acpica will have conflict with upstream master if ever
<damo22>Pellescours: we dont need that commit if we can figure out how to access physical memory properly
<damo22>we are duplicating code in gnumach and hurd
<damo22>the acpi tables need to be parsed from physical memory addresses
<damo22>the ACPICA code assumes it has access to raw physical memory
<Pellescours>SMP need ACPI so it’s complicated to remove it from gnumach
<damo22>yes we cannot remove that part from gnumach
<damo22>it just needs to read the irq overrides
<damo22>and parse the table regarding cpu cores
<damo22>but its also tricky because in userspace we need acpi to parse the AML without a root filesystem present
<damo22>acpi needs access to the physical addresses where the tables are stored
<Pellescours>is there other way to get this? the get_device() maybe?
<damo22>i guess we just device_open the mem device
<damo22>but using /dev/mem i thought was the same thing (for testing now)
<damo22>but the acpica code tries to use 16 bit paragraph address
<damo22>whereas, we want to skip that region and try 0xe0000 instead
<damo22>i dont think 0x40e works in 32 bit mode?
<damo22>what is confusing is that this function acpi_find_root_pointer() used to work
<Pellescours>Do you know if it stopped to work due to recent patches or to something else ?
<damo22>i upgraded my hurd system from something very old to current
<Pellescours>ok, I gonna check an old commit of acpica-nothread to see if parsing is working
<damo22>maybe we can just slog through it with gdb
<Pellescours>if 880526c1182e8b1f8d3f17ec7ebedd73d90388cc was able to find root table pointer in the past it’s no longer the case
<Pellescours>I think biblio does not have updated his hurd/gnumach and that’s why he is able to get it working
<damo22>that does not seem to be a hurd commit
<Pellescours>If it’s possible to downgrade gnumach to confirm that
<damo22>my testacpi.c program does not depend on any hurd libs except libacpica, which i have in a hurd branch just for ease of versioning with the rest of acpi
<damo22>it literally calls acpi_init(); and then the irq function
<Pellescours>can it be gnumach commit 230d7726ce55114c5c32c440c5928f104a085ba6 that change the behavior ?
<youpi>damo22: of course you can mmap a physical address, that's what page tables are for :)
<youpi>the mem device has changed behavior, though: now it refuses to map any non-reserved memory (that is used for normal allocations)
<damo22>would i be allowed to map just under 1M?
<youpi>> (08:11:51) damo22: we cant use /dev
<damo22>bootstrapping the disk requires knowledge of irq
<youpi>you're allowed to map anything which is not RAM used for normal allocations
<damo22>so therefore acpi needs to be available before disk
<youpi>so all bios-reserved areas, acpi areas etc. are fine
<youpi>that uses biosmem_addr_available to check whether the address is marked as available in e820
<damo22>memmmap has a special case for under 1M
<damo22>biosmem: 00000000000009fc00:0000000000000a0000, reserved
<damo22>biosmem: 0000000000000f0000:000000000000100000, reserved
<youpi>then mem should be allowing its mmap
<youpi>since it's not available memory
<damo22>biosmem: 0000000000feffc000:0000000000ff000000, reserved
<Pellescours>if mem was not able to memmap, then the call should have returned a non zero value to notify there was an error. And the call returns 0
<damo22>Thread 4 hit Breakpoint 1, acpi_os_map_memory (phys=4850473839169110017, size=36)
<youpi>acpi uses 32bit values, doesn't it? how do you end up with such number?
<youpi>perhaps a phys_addr_t type that is 64 while it should be 32bit?
<damo22>$1 = {signature = "RSDT8\000\000", checksum = 1 '\001', oem_id = "EBOCHS", revision = 32 ' ',
<damo22> rsdt_physical_address = 1129338946, length = 538976288, xsdt_physical_address = 4850473839169110017,
<damo22> extended_checksum = 1 '\001', reserved = "\000\000"}
<youpi>is the structure type really properly defined ?
<youpi>with proper sizes and aligns
<damo22>Thread 4 hit Breakpoint 1, acpi_tb_parse_root_table (rsdp_address=2147361591)
<damo22> at ../../libacpica/tbutils.c:230
<damo22>ACPI: RSDT 0x000000007FFE2337 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001)
<damo22>it prints that, and then i checked the rsdp contents
<youpi>> thats using the acpica code
<youpi>the acpica code could still be wrong for whatever reason
<youpi>because it assumes things that aren't true on hurd/i386
<damo22>its not the right address for the root pointer
<damo22>000000007ffe2337: 'R' 'S' 'D' 'T' '8' '\x00' '\x00' '\x00' '\x01' 'E' 'B' 'O' 'C' 'H' 'S' ' '
<damo22>000000007ffe2347: 'B' 'X' 'P' 'C' ' ' ' ' ' ' ' ' '\x01' '\x00' '\x00' '\x00' 'B' 'X' 'P' 'C'
<Pellescours>damo22: I don’t see any step that should correspond to the ACPI_MOVE_16_TO_32 when I do step by step in gdb, maybe it’s the macro definition which is not configured properly
<damo22>it wants the root pointer not the sdt_base
<damo22>but we still need my custom root table searcher, because there is no 16 bit mode
<biblio>damo22: i need to update gnumach to test your latest fix.
<biblio>damo22: are you testing on real hardware or qemu ?
<biblio>damo22: I checked API docs and examples from Linux. I could not find anything wrong yet.
<damo22>we will need to work out if its feasible to upstream the custom root table search
<damo22>i made it reusable for our purposes
<damo22>if you can find a way to reuse their root table finder instead of using mine, we should do that
<damo22>then we can remove /dev/mem call
<biblio>damo22: I did not get "their root table finder". You mean root table finder from acpi API call ?
<damo22>but it does not work on hurd currently
<biblio>damo22: we should try to use acpi_find_root_pointer() instead of your custom acpi_get_root_table_pointer(...) ?
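The odd values in the gdb dump above are consistent with the 36-byte RSDP layout being laid over the start of the RSDT at 0x7ffe2337 (the sdt_base rather than the root pointer, as damo22 notes). The following is a minimal sketch of that reinterpretation, assuming a little-endian machine and a GCC-compatible compiler for the packed attribute. The names `struct rsdp`, `rsdt_header_bytes`, `decode_as_rsdp` and `has_rsdp_signature` are illustrative only and are not ACPICA's own definitions; the byte values are the 32 bytes pasted in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ACPI 2.0+ RSDP layout, packed so the compiler inserts no padding.
 * Illustrative definition, not copied from ACPICA. */
struct rsdp {
    char     signature[8];          /* "RSD PTR " in a real RSDP */
    uint8_t  checksum;
    char     oem_id[6];
    uint8_t  revision;
    uint32_t rsdt_physical_address;
    uint32_t length;
    uint64_t xsdt_physical_address;
    uint8_t  extended_checksum;
    uint8_t  reserved[3];
} __attribute__((packed));

_Static_assert(sizeof(struct rsdp) == 36, "RSDP must be 36 bytes");

/* The 32 bytes dumped at 0x7ffe2337 in the log: an RSDT header. */
static const uint8_t rsdt_header_bytes[32] = {
    'R', 'S', 'D', 'T', '8', 0, 0, 0, 0x01, 'E', 'B', 'O', 'C', 'H', 'S', ' ',
    'B', 'X', 'P', 'C', ' ', ' ', ' ', ' ', 0x01, 0, 0, 0, 'B', 'X', 'P', 'C',
};

/* Overlay up to sizeof(struct rsdp) raw bytes onto the RSDP layout;
 * any bytes not supplied are left as zero. */
static struct rsdp decode_as_rsdp(const uint8_t *bytes, size_t n)
{
    struct rsdp r;

    assert(bytes != NULL);
    memset(&r, 0, sizeof r);
    memcpy(&r, bytes, n < sizeof r ? n : sizeof r);
    return r;
}

/* A real RSDP begins with the 8-byte signature "RSD PTR ". */
static int has_rsdp_signature(const struct rsdp *r)
{
    return memcmp(r->signature, "RSD PTR ", 8) == 0;
}
```

Decoding `rsdt_header_bytes` this way gives revision 32, rsdt_physical_address 1129338946 (the bytes "BXPC"), length 538976288 (four spaces) and xsdt_physical_address 4850473839169110017, the same numbers gdb printed, and `has_rsdp_signature` rejects it. A signature check of this kind before trusting the fields would catch a wrong pointer early.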
<damo22>if possible, its better to use the built in code, otherwise we reinventing the wheel
<damo22>but calling that function instead breaks
<biblio>damo22: just to be sure. You want to replace your acpi_get_root_table_pointer() with acpi_find_root_pointer() in future ?
<damo22>there was a hook available to override its implementation
<AlmuHS>I used this line to configure. `../configure --host=i686-gnu CC=gcc LD=ld`
<Pellescours>AlmuHS: I tried to build with the same command as you, but that’s works for me. Do you have any change compared to origin/master?
<AlmuHS>I fixed a conflict in configure.ac
<AlmuHS>I've just did it, but it doesn't works
<AlmuHS>i'm trying to compile "master" branch of my repository
<AlmuHS>but there are not any significant change in compile process
<Pellescours>do you have latest mig ? I know that headers changed a bit
<AlmuHS>meanwhile, I changed my configure line to "../configure --host=i686-gnu CC='gcc -m32' LD='ld -melf_i386' ", and now I have another error
<AlmuHS>ld: relocatable linking with relocations from format elf64-x86-64 (libkernel.a(model_dep.o)) to format elf32-i386 (gnumach.o) is not supported
<AlmuHS>upstream kernel working now, successfully by moment
<AlmuHS>i found a kernel panic in my smp kernel, and I want to check if the panic is from my work or is from upstream
<AlmuHS>I got to boot multicpus two years ago, but my previous implementation was very dirty and i'm refactoring
<AlmuHS>but, when this error appears, then i can't boot any kernel
<Pellescours>ok, I’m trying to boot your branch to test it on my side
<AlmuHS>replace N by the number of cpus of your preference
<AlmuHS>upstream without this flag seems stable, no crash
<Pellescours>fun fact, I first had a kernel panic. but after the reboot, it boot correctly
<AlmuHS>now compiling master branch with --enable-ncpus=4
<AlmuHS>my repo has a merged master branch
<AlmuHS>now checking master branch with ncpus flag
<Pellescours>in your master branch ./device/device.server.h:38:9: error: unknown type name ‘const_dev_name_t’; did you mean ‘dev_name_t’?
<Pellescours>AlmuHS: with your smp_stage2 branch, I get a kernel panic (page fault)
<AlmuHS>it's strange, because i didn't modified any harddisk controller's source code
<Pellescours>disk corrupted (multiple inode claim), but after fix corruption it boot
<AlmuHS>i will try again then. Crossing fingers
<AlmuHS>upstream kernel continues booting without problems
<AlmuHS>a friend is debugging the error, but we don't find the origin
<Pellescours>but it don’t happen as long as you don’t write to disk
<AlmuHS>i write FROM the disk (the assembly routine) but not TO the disk
<AlmuHS>the only that i remember that could generate problems is the copy of this assembly routine. I copy this in a address which is not mapped
<AlmuHS>this is the memory address in which i copy the assembly routine
<AlmuHS>memcpy((void*)phystokv(AP_BOOT_ADDR), (void*) &apboot, (uint32_t)&apbootend - (uint32_t)&apboot);
<AlmuHS>maybe. It's a preliminary work, and i don't add concurrency controls yet
<Pellescours>I tried to write a file multiple time using vim, no kp. I did make clean, and kp after some RM
<AlmuHS>but the cpus are not added to the scheduler yet
<AlmuHS>at moment, i simply start and configure the cpus, but the only working is cpu0
<Pellescours>and the corruption is really strange because it’s not left inode but real inode being corrupted (multiple inode claim)
<Pellescours>I will try to compile master with ncpu > 1 and check
<Pellescours>doing a "bisect" of your changes maybe would help, or using the kdb
<luckyluke>AlmuHS: Pellescours did you try to see where is the address causing the page fault? for example, what code is executing at that moment...
<luckyluke>from the last screenshots you sent, this would mean checking address 0xc101cc4d and decoding with addr2line
<luckyluke>but it seems the address causing the fault is 0xc101ccba, which could be some variable
<luckyluke>also, if you use qemu you can redirect the console to a serial port, so you can copy the text :)
<luckyluke>if you enable kdb, you should also be able to get a backtrace
<luckyluke>you need to add --enable-kdb at configure stage
<luckyluke>do you use gdb? I see you launch qemu with -s -S
<luckyluke>in that case, you could also try to set a breakpoint at the code address above, just before the panic
<AlmuHS>i have a friend who are doing just this
<AlmuHS>he has more experience in debugging than me
<luckyluke>so you can examine the state before the issue
<AlmuHS>i've just added a loop counter after raise each IPI, but the problem continues
<luckyluke>AlmuHS: I was having a look at your code, it seems once you start the other cpus, they execute cpu_setup() but it seems that after that, cpu_ap_main() returns and I don't see a point where the cpu should "wait"... is this correct?
<luckyluke>I see a hlt instruction in cpuboot.S, but it's commented, could it be that the cpu just goes on and eventually it causes some mess?
<AlmuHS>i created this loop to keep the cpus in a infinite loop if they are not added to scheduler
<AlmuHS>i've just reenabled the loop, but the panic continues
<luckyluke>did you check what code corresponds to the address reported?
<AlmuHS>c101cc4d: 87 43 24 xchg %eax,0x24(%ebx)
<luckyluke>you should check to which function it belongs
<AlmuHS>c101cb40 <thread_quantum_update>:
<luckyluke>there are some NCPU > 1 defines there, you could try adding some print() to see what is wrong there
<luckyluke>do you have the clock_interrupt() running on the secondary cpus?
<AlmuHS>the secondary cpus only are alive, but these are not added to the kernel yet
<AlmuHS>even, i didn't enable paging in these
<AlmuHS>I go to disable this line: machine_slot[i].running = TRUE
<curiosa>why there is a special array kind in mig called c_string ?
<curiosa>if then it is just array[n] of (MSG_TYPE_STRING_C, 8)
<curiosa>wouldn't be better to get rid of it?
<curiosa>it is really the only thing in gnumig that breaks the mig grammar
<curiosa>so it is not just weird, it is just plain ugly
<curiosa>apparently it was already there in the first commit in 1998
<curiosa>ah damo22, I've seen your talk, very interesting!
<AlmuHS>he is australian, so it can't read your messages until europe's late night
***biblio_ is now known as biblio
<damo22>demo@zamhurd:/part3/demo/git/hurd-sv/build/acpi$ sudo ./testacpi
<damo22>ACPI: RSDP 0x00000000000F5860 000014 (v00 BOCHS )
<damo22>ACPI: RSDT 0x000000007FFE2337 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001)
<damo22>ACPI: MCFG 0x000000007FFE22D3 00003C (v01 BOCHS BXPC 00000001 BXPC 00000001)
<damo22>ACPI: WAET 0x000000007FFE230F 000028 (v01 BOCHS BXPC 00000001 BXPC 00000001)
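In this final output the RSDP turns up at 0xF5860 with length 0x14, which lies in the BIOS area starting at 0xe0000 that damo22 wanted to search. As a rough illustration of what a root pointer search over that area involves, here is a minimal sketch in C. It assumes the region has already been mapped into a plain byte buffer, and it relies only on the ACPI rules that the RSDP sits on a 16-byte boundary, starts with "RSD PTR ", and that its first 20 bytes sum to zero modulo 256. The names `byte_sum`, `find_rsdp` and `make_test_rsdp` are hypothetical and are not the functions in damo22's branch or in ACPICA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_SIG     "RSD PTR "
#define RSDP_V1_SIZE 20          /* bytes covered by the ACPI 1.0 checksum */

/* Sum of len bytes modulo 256; a valid table sums to zero. */
static uint8_t byte_sum(const uint8_t *p, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + p[i]);
    return sum;
}

/* Scan buf on 16-byte boundaries for a checksum-valid RSDP.
 * Returns the offset into buf, or -1 if none is found. */
static long find_rsdp(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + RSDP_V1_SIZE <= len; off += 16) {
        if (memcmp(buf + off, RSDP_SIG, 8) != 0)
            continue;                       /* wrong signature */
        if (byte_sum(buf + off, RSDP_V1_SIZE) != 0)
            continue;                       /* bad checksum */
        return (long)off;
    }
    return -1;
}

/* Hypothetical test helper: write a revision-0 RSDP at dst whose
 * RSDT address field is rsdt_addr, then fix up the checksum byte. */
static void make_test_rsdp(uint8_t *dst, uint32_t rsdt_addr)
{
    memset(dst, 0, RSDP_V1_SIZE);
    memcpy(dst, RSDP_SIG, 8);               /* bytes 0..7: signature */
    memcpy(dst + 9, "BOCHS ", 6);           /* bytes 9..14: OEM id */
    dst[16] = (uint8_t)(rsdt_addr);         /* bytes 16..19: little-endian */
    dst[17] = (uint8_t)(rsdt_addr >> 8);
    dst[18] = (uint8_t)(rsdt_addr >> 16);
    dst[19] = (uint8_t)(rsdt_addr >> 24);
    dst[8] = (uint8_t)(0 - byte_sum(dst, RSDP_V1_SIZE)); /* byte 8: checksum */
}
```

With a 0x20000-byte buffer standing in for 0xe0000 to 0xfffff, placing a test RSDP at offset 0x15860 corresponds to the physical address 0xF5860 seen in the log, and `find_rsdp` should return that offset. Working on a buffer keeps the search independent of how the memory was obtained, whether through the mem device or some other mapping.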