IRC channel logs

2019-10-31.log

back to list of logs

<AlmuHS>youpi:
<youpi>ok, that looks good
<youpi>I guess your thread_info.h file doesn't have the field
<AlmuHS>yes
<AlmuHS>I have a default gnumach installation
<AlmuHS>wait, I go to execute "make install" in my SMP gnumach
<AlmuHS>now It detect the header https://pastebin.com/esHpUWwa
<AlmuHS>but I notice that gnumach install headers in /include/mach , meanwhile hurd search It in /usr/include/mach
<AlmuHS>I had to copy the first to the second plce
<AlmuHS>place
<AlmuHS>youpi:
<youpi>the default prefix on GNU is / indeed, you need to pass --prefix=/usr to configure to get the debian default
<AlmuHS>ok, I didn't know It
<AlmuHS>then, can I send the patch?
<youpi>you also need to put the proper ifdefs in the code
<AlmuHS>where?
<youpi>around what you added
<AlmuHS>ok
<youpi>so that both cases work as appropriate
<AlmuHS>yes
<AlmuHS>in proc/info.c ?
<AlmuHS>or in procfs/process.c ?
<youpi>possibly both
<youpi>just try to think about compatibility in both case
<youpi>imagine what happens in both case
<AlmuHS>ok, I'm working in this
<AlmuHS> http://dpaste.com/3TE3CWR
<AlmuHS>new patch
<youpi>no need to initialize it to 0 before the ifdef
<youpi>you can also test with a #error that what you want to be compiled really is compiled
<youpi>sicne your last_processor is already a long unsigned, there is no need to cast it
<youpi>apart from that it looks good
<AlmuHS>I prefered to use an auxiliary variable to clean a bit the code
<youpi>sure
<youpi>I just mean you don't need to initialize it to zero
<AlmuHS>I prefered this than add the #ifdef in the same sprintf
<youpi>since you set it right after it
<AlmuHS>oh, ok
<AlmuHS>simply this? http://dpaste.com/0BYAWZ0
<youpi>no need for casting last_processor
<AlmuHS>oh, I forgot remove It
<youpi>also, we usually do not indent just because of #ifdef
<youpi>so the two "last_processor =" lines would be indented just like above
<AlmuHS>ok, I'll fix It
<AlmuHS>now? http://dpaste.com/3D29Z31
<youpi>that looks good except the indentation in the "last_processor = " lines
<youpi>also, take care how braces are to be written for the if
<youpi>see just above, you have an example
<AlmuHS> http://dpaste.com/35Q19EE
<AlmuHS>anyways, I could pass an e-style to process.c. It has some irregular indentation
<youpi>what is e-style?
<youpi>the GNU Coding Style provides the correct /usr/bin/indent call to make
<youpi>your if braces need to be on separate lines, with some indentation
<youpi>and put a space after the if
<youpi>really, look around in the file to see the coding style
<youpi>about irregular indentation, are you sure it's not just that your editor makes tabs 4-chars for instance?
<AlmuHS>e-style is a plugin which applies the coding style what you want, to the entire file
<youpi>tabs are 8 chars
<youpi>all our files assume that
<youpi>you don't generally want to blindly apply a coding style
<youpi>because that introduces a lot of spurious changes
<youpi>which make reviewing way harder
<AlmuHS>it's irregular because, It depends of tabs
<youpi>you can make an indent run, but only to find incoherency and fix what you want
<youpi>then tell your editor that tabs are 8 spaces
<youpi>and everything be regular
<AlmuHS>but the indentation are depending of the size of this tabs
<youpi>yes, and I tell you that tabs are 8 chars
<youpi>always, for GNU source
<AlmuHS>when I set 8 tabs, same codeblocks are bad indented. but, when I set 4 tabs, some blocks previosly well indented, feels wrong
<youpi>provide examples, what I'm reading doesn't have such issue
<AlmuHS>the same process.c
<youpi>yes, but where exactly?
<youpi>all I'm reading is indented fine
<AlmuHS>227
<AlmuHS>247
<AlmuHS>247-249 (the comment)
<youpi>not sure which line that is
<youpi>since I guess you have a modified file
<AlmuHS>251
<youpi>provide the line content
<AlmuHS> start_code = end_code = 0; /* To make killall5.c consider it a
<AlmuHS>and the comment of this line
<youpi>here it looks all good
<youpi> http://dept-info.labri.fr/~thibault/blip.png
<youpi>erglb
<youpi>reload that url :)
<AlmuHS>also, It seems that some blocks mix spaces and tabs
<youpi>the blue does are just here to tell that they are tabs
<youpi>yes, again that is expected
<youpi>8 spaces are just replaced by tabs
<AlmuHS>when I change the tabs size, some pieces move, but some not
<youpi>that's only usual
<youpi>yes
<AlmuHS>ok, It was so strange for me
<youpi>really, the only thing you need is to tell your editor that tabs are 8 chars
<youpi>most source code on the net assume that, if they assume something about tabs
<AlmuHS>yes, but I have to add some spaces in some lines to fix well the line
<youpi>and if source code is so indented that 8-char tabs make them not fit on the screen, the code is too complex and should be rewritten
<youpi>for intsance?
<youpi>and show screenshots
<AlmuHS>in these files, in some lines, I had to add 2 spaces, followed with an 8 space tab
<youpi>really, for me everything is just as expected
<youpi>that's not supposed to be needed
<youpi>lines begin with tabs
<AlmuHS>how can I send screenshots by IRC?
<youpi>and then maybe a few spaces
<youpi>like I did
<youpi>put somewhere on the web
<youpi>didn't you look at my blip.png?
<AlmuHS>yes, but I don't have any similar web
<youpi>looks like https://pasteboard.co/ could do
<youpi>(random pickup from googling "paste image"
<AlmuHS> https://pasteboard.co/IErmbjc.png this is fine
<youpi>yes, this looks good
<AlmuHS>but, if the indentation was fine, It don't might to happen https://pasteboard.co/IErmGgq.png
***Server sets mode: +nt
<AlmuHS>then, can I send the patch now?
<youpi>if you have fixed the braces for the if yes
<AlmuHS> http://dpaste.com/1AXD279
<AlmuHS>check It
<AlmuHS> http://dpaste.com/0AJMBE8
<youpi>check again how it is above
<youpi>indentation is 2 chars
<youpi>the braces are indented
<youpi>compared to the if
<youpi>and the content of the braces is indented
<youpi>again, see above
<AlmuHS>ok, I'm seeing
<youpi>(I'm really pissed that all these new fancy editors are just not respecting decades-old convention that a tab is 8 chars)
<youpi>(that keeps posing problems to newcomers)
<AlmuHS> http://dpaste.com/12THG33
<AlmuHS>I'm using Gedit
<youpi>now the brace is correct, but the content is not, it's too indented
<AlmuHS>CodeBlocks has a plugin to indent file in GNU mode, but this plugin reindent ALL file, so I prefered not to use
<youpi>yes, better not use a systematic reindenter
<youpi>there are always cases when manual indentation looks better
<AlmuHS>now? http://dpaste.com/0V5GGNT
<youpi>there we are :)
<youpi>now, all that being said, here there is no need for braces since there's only one line
<youpi>it's just that it's better you learn these parts of the coding style as early as possible
<AlmuHS>yes, but It's not bad anyway
<youpi>ah, again you need space between if and (
<AlmuHS> http://dpaste.com/1T1ZYPJ edited directly in the patch
<youpi>right, that looks good
<AlmuHS>do you need a changelog?
<youpi>yes
<jrtc27>eh, I deliberately default to 4 spaces per tab because I find it more readable/looks nicer (even for low depth), but will quickly recognise files for which I should change to 8 :P
<AlmuHS>I usually use 4 space per tab in my personal code
<AlmuHS>but, if GNU standard is 8 tabs, I'll have to follow It
<jrtc27>yeah (and beware of the extra indentation when {} is involved)
<jrtc27>(and the spaces before ('s in function calls)
<jrtc27>which perhaps comes from certain early developers' use of lisp, where it's `(f (a b))`, so translates more to `f (a, b)` in C
<AlmuHS>I don't use lisp
<AlmuHS>but It can be true
<AlmuHS>youpi: I've just sent the patch to maillist
<AlmuHS>I don't tested It with hurd, I advice
<AlmuHS>I only test to compile
<youpi>(also note that comments contain sentences, and should thus start with a capital, and finish with a dot and two spaces)
<AlmuHS>ok, I'll take notes
<jrtc27>I'd also suggest you use git-format-patch so you can bundle the changelog in with the diff rather than them being 2 separate files
<AlmuHS>oh, I didn't know this! ty
<youpi>I didn't notice in your previous change log: what is in parenthesis is the name of the function being modified, not the exactly code location
<AlmuHS>because the file is said before
<AlmuHS>in the parenthesis I said not the modified function, instead I said that I modified the callers of this function
<AlmuHS>I might had been more clear, It's true
<AlmuHS>but It's so late
<youpi>it's not a question of being clear, it's a question that it is the convention
<youpi>all other changelog entries are made this way
<AlmuHS>ok, I didn't know the convention either
<youpi>see how it is done in git log
<youpi>general rule: look around how it is done, mimic it
<AlmuHS>ok
<youpi>that'll save you a lot of GNU Coding Style reading
<AlmuHS>I feel as a kid, hahaha
<AlmuHS> http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?id=d8671bc2a0fead7655b9e80736db33d84f14025c
<AlmuHS>oops, I notice a typo, sorry
<AlmuHS>+ [#include <mach/thread_info.h]) must be + [#include <mach/thread_info.h>])
<AlmuHS>youpi: can you fix It or I resend the patch?
<youpi>I'mll fix
<youpi> sending a modified patch is useless once the previous is commited
<AlmuHS>youpi: I sent a new patch. I found a new error (I'm embarrased, sorry)
<AlmuHS>now compiles without problems
<damo22>youpi: how do you install an interrupt handler from userspace?
<damo22>i found a bunch of symbols in gnumach that are pointers to functions *ivect[NINTR]
<damo22>im a bit confused how that can translate to something a userspace driver can implement
<youpi>damo22: you don't
<youpi>damo22: you just forward the event through a notification
<youpi>see the 70_dde.patch in the debian package that forwards interrupts to netdde
<youpi>it's possibly full of issues, but at least the principle is correct
<damo22>so every interrupt is going to be implemented as a separate rpc message
<youpi>yes
<damo22>ok thats kind of easier to handle
<damo22>i could have a single function that takes a parameter of the IRQ number
<damo22>that fired
<damo22>but i think i need to extend *ivect to 48 interrupts
<damo22>for possibly 2x ioapics?
<youpi>sure
<damo22>i probably dont even need all that
<damo22>but i want to allow scheduling of irqs
<damo22>it seems like we are inventing a new interrupt mechanism
<damo22>by passing it as a message
<damo22>it can be interposed
<youpi>that's useful for giving a PCI device to a subhurd, yes
<damo22>do i just set them all as intnull?
<damo22>the problem is, this is all very hardware dependent
<youpi>sure, intnull is fine
<youpi>waiting for anything to register a handler
<youpi>be it the user-land forwarding handler
<damo22>how does it decide which interrupt to register
<youpi>pic_isa.c is in i386/i386at/pic_isa.c, so it's fine to depend on x86 :)
<youpi>see the patch, the register function says which interrupt should be used
<damo22>ok
<youpi>netdde determines that from reading the PCI config space
<damo22>nice
<youpi>a proper sub-hurd forwarded would make sure the subhurd only requests interrupts for the PCI device it's given
<youpi>to avoid abuses
<youpi>though getting more interrupts than needed is not really that much a problem
<youpi>of course this model doesn't support the driver doing cli/sti
<youpi>but nowadays nobody does this any more
<damo22>does rump do cli/sti?
<youpi>no
<damo22>gooood
<youpi>afaik
<youpi>it's too dangerous to be doing this in userland anyway, you could get stuck in a blocking system call
<damo22>yeah
<youpi>and then never have the chance of calling sti
<damo22>is that why interrupts.S has cli in the middle?
<damo22>to stop interrupts from interrupting interrupt handlers?
<youpi>yes
<youpi>otherwise you may have unbound stacking
<damo22>stack overflow
<youpi>that does happen with very busy systems
<damo22>ok
<damo22>im learning a lot from reading this code
<youpi>it's much simpler to read than linux' :)
<damo22>interesting that irq2 is intnull, its not even connected to a line
<damo22>youd never receive an interrupt on irq2
<damo22>because the pics are plugged together on that line
<youpi>better have a null handler than a hang if for whatever reason the hardware invented an interrupt on irq2
<damo22>lol
<mabox>Did you saw ideologing at Cristosan.com
<mabox>Did ya ideology like at Cristosan.com
<AlmuHS>are there any secure way to test and upstream hurd compilation? Sometimes, after do "make install", hurd hang at boot. And I have to reinstall entire system
<AlmuHS>**test an
<youpi>using a sub-hurd yes
<AlmuHS>ok, I go to search about this
<youpi>note that you usually don't need to reinstall the entier system
<youpi>just unpack hurd_foo.deb
<youpi>either by booting from the debain installer, chrooting into your system and running dpkg -i
<youpi>or by mounting the disk, and unpacking by hand with ar p hurd_foo.deb | tar x
<AlmuHS>yes, but in this time, the filesystem crashed
<youpi>depends how it crashed
<AlmuHS>the translator which crashed was procfs
<youpi>I have never had to reinstalled the buildd boxes for a decade
<AlmuHS>btw, can anyone test the latest gnumach with the latest hurd? from upstream repositories. Yesterday hurd crashed after do It, but may I did any step wrong
<AlmuHS>I'm repeating the test now
<youpi>I uploaded them in debian yesterday
<AlmuHS>tonight?
<youpi>no, yesterday
<youpi>early yesterday
<AlmuHS>then only gnumach
<youpi>see http://qa.debian.org/gnumach
<youpi>see http://qa.debian.org/hurd
<AlmuHS>gnumach seems runs well, the problem seems to be in hurd
<youpi>ah, the hurd binaries may have come this morning only
<AlmuHS>I'm repeating the test now, as I said
<youpi>1140 oct. 30 08:17 hurd_0.9.git20191029-1_hurd-i386.upload
<AlmuHS>ok
<youpi>+archive delay
<AlmuHS>but, has you test It this configuration?
<AlmuHS>first test: upstream gnumach with default debian hurd. OK
<youpi>I have, sure
<AlmuHS>now I have to test this with upstream Hurd
<AlmuHS>my debian gnu/hurd is fully updated, btw
<youpi>note that upstream hurd doesn't have the debianish startup changes
<youpi>it'll expect scripts from /libexec etc.
<youpi>so in general it can't boot
<youpi>that might work by luck, but there is really no guarantee
<youpi>GNU and Debian just disagree on where files should be put
<youpi>if you want to test something in particular, just copy over that part in your system
<AlmuHS>upstream gnumach boot well, but without network
<youpi>upstream gnumach doesn't have the dde changes
<youpi>so netdde can't work
<AlmuHS>yes, I know
<AlmuHS>but boot
<AlmuHS>(excuse my spanglish)
<AlmuHS>compiling hurd. It takes a while
<AlmuHS>excuse me, I was eating lunch
<AlmuHS>youpi: upstream gnumach with upstream hurd: FAILED https://pasteboard.co/IEwY3Xq.png
<AlmuHS>or partial failed
<AlmuHS> https://pasteboard.co/IEwYFuI.png
<AlmuHS> https://pasteboard.co/IEwZmE0.png The fail is in procfs
<AlmuHS>same problem with default Debian gnumach
<AlmuHS> https://pasteboard.co/IEx0gdJ.png
<AlmuHS>in my SMP gnumach, also fails https://pasteboard.co/IEx1dHt.png
<AlmuHS>youpi: I finished my tests
<youpi>as I said, upstream's init system is very different from debian's init system, so it's not surprising that it fails
<youpi>just copy over what you actually need
<youpi>similarly, installing an upstream glibc over debian's glibc would be deemed to fail
<AlmuHS>yes, I forgot this
<AlmuHS>has procfs any debian patch?
<AlmuHS>and proc
<youpi>nothing that would be necessary
<AlmuHS>ok
<AlmuHS>and, where is proc and procfs? To copy manually
<AlmuHS>instead "make install"
<youpi>in proc/ and procfs/
<AlmuHS>in "/proc" and "/procfs"
<AlmuHS>in "/servers" I don't see It
<youpi>from proc/ and procfs/ to /hurd/
<youpi>see showtrans /proc
<AlmuHS>ok
<AlmuHS>ok, I see
<AlmuHS>ok, now Debian's gnumach with upstream proc and procfs (over Debian's Hurd) runs well
<AlmuHS>but, in my SMP gnumach, something goes wrong
<AlmuHS> https://pasteboard.co/IExcbv5.png
<AlmuHS>ok, with Debian's gnumach, the top command shows this crash
<AlmuHS>similar to my SMP gnumach
<youpi>you can attach gdb to the running procfs before triggering the crash
<AlmuHS> https://pasteboard.co/IExdm8m.png
<AlmuHS>ok, I'll try It
<jrtc27> http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/proc/info.c?id=7719472e3410d009c89d775ea05dc7f305d9ebce#n714 looks like that should have a `continue;` now too?
<jrtc27>not that I would expect that to be a segfault
<youpi>ergl, sure, the added code should be protected by an if (!err)
<jrtc27>not going to help, err gets cleared
<youpi>or continue in the if (err)
<youpi>note that that's the proc source code
<youpi>he has procfs dying, not proc
<jrtc27>ah right yes
<jrtc27>also the format string for last_process is %d yet the local is declared as long unsigned
<jrtc27>which is fine on i386, and probably harmless on x86_64
<jrtc27>(I'm assuming varargs stack slots on x86_64 are 64-bit)
*jrtc27 wonders whether libps needs to be in sync with procfs
<AlmuHS>excuse me, I was afk
<youpi>oh yes sure
<youpi>you also need to install libps
<AlmuHS>oh, ok. I go to try It
<AlmuHS>where is libps?
<jrtc27>/lib/i386-gnu/libps.so.0.3
<AlmuHS>Is It a package?
<jrtc27>built by hurd
<AlmuHS>out of hurd
<AlmuHS>?
<jrtc27>same package
<youpi>it's in the hurd-libs0.3 package
<youpi>same source package (hurd), but different binary package (hurd vs hurd-lib0.3)
<AlmuHS>ok
<AlmuHS>hurd-libs0.3/unstable,now 1:0.9.git20191029-1 hurd-i386 [instalado]
<AlmuHS>It's installed by default
<youpi>by "you also need to install libps", I mean like proc and procfs
<AlmuHS>but, Is It generated in "make", in hurd sources?
<youpi>yes
<youpi>see in libps
<AlmuHS>oh, I've just find It
<AlmuHS>libps.so ?
<youpi>yes, copy that into your lib/i386-gnu/libps.so.0.3
<AlmuHS>ok
<AlmuHS>done
<AlmuHS>in Debian's gnumach, continues same problem
<youpi>getting a backtrace with gdb would be really insightful
<AlmuHS>gdt must be executed inside the machine? or out
<youpi>inside the machine
<youpi>(fun typo, looks like your head is full of x86 :) )
<youpi>gdb /hurd/procfs the_pid_of_procfs
<AlmuHS>oh, It's strange. "ps -e | grep procfs" don't shows anything
<AlmuHS>only the same grep: pruebas 728 p0 S 0:00.03 grep procfs
<AlmuHS>I'm in my smp gnumach
<AlmuHS>ls /proc is empty too
<AlmuHS>It seems procfs if off
<AlmuHS>*is off
<youpi>showtrans /proc
<youpi>to be sure that it's still recorded
<youpi>possibly it just fails to start early
<youpi>and thus it doesn't show up
<AlmuHS>showtrans /proc don't shows anything
<youpi>then that's why
<youpi>ah wait that's perhaps mounted by init scripts nowadays
<youpi>but you can as well make it passive as well
<youpi>settrans /proc /hurd/procfs
<AlmuHS>but, If this, ps might show It, is not?
<youpi>not if it crashed already
<AlmuHS>ok, now /proc is not empty
<AlmuHS>and "showtrans /hurd/procfs" shows /hurd/procfs
<AlmuHS>but the error continues
<AlmuHS>now I can try gdb
<AlmuHS>nope, "ps -e" don't show procfs anyway
<AlmuHS>now
<youpi>if it crashed you need to let it restart by looking in /proc
<AlmuHS>ok, I'm in gdb
<AlmuHS>what I have to do
<AlmuHS>?
<youpi>xc
<youpi>grr
<youpi>c
<youpi>to let it continue
<youpi>(c as in "continue")
<youpi>then trigger the crash
<youpi>then bt
<youpi>(bt as in "backtrace")
<youpi>to get the backtrace
<AlmuHS>warning: Can't wait for pid 783: No child processes
<youpi>not a problem
<AlmuHS>gdb freeze
<youpi>that's expected
<youpi>it's letting the program continue
<AlmuHS>waiting
<youpi>waiting what ?
<youpi>did you trigger the crash?
<AlmuHS> https://pasteboard.co/IExImjHP.png
<youpi>like I said: trigger the crash
<youpi>on another console
<youpi>or through ssh
<youpi>I don't really understand what you expected, actually
<youpi>to get the backtrace of a crash, you need to trigger the crash
<AlmuHS>but I don't know how
<youpi>wasn't ps triggering the crash?
<youpi>or top, or whatever you did that triggered the crash
<AlmuHS>oh, I understand
<AlmuHS>wait, I go to reboot
<AlmuHS>I found the error
<AlmuHS>0x08033676 in process_file_gc_stat (ps=0x803f870, contents=0x1fffaa8)
<AlmuHS> at ../../procfs/process.c:237
<AlmuHS>237 last_processor = thsi->last_processor;
<AlmuHS>It's my code :(
<youpi>oh, you need to add PSTAT_THREAD_SCHED in the entries array
<youpi>along the others
<youpi>otherwise libps will not have filledit
<AlmuHS>in process.c ? (in stat writting)
<AlmuHS>or in info.c? calling thread_info()
<youpi>in the same file
<youpi>see where process_file_gc_stat is referenced
<AlmuHS>info.c, I remember
<AlmuHS>process.c is what has crash
<AlmuHS>but, what is "entries array"? stat array?
<youpi>look for "entries[]"
<youpi>or, like I said, look where process_file_gc_stat is referenced
<AlmuHS>hurd hasn't any caller for "process_file_gc_stat"
<youpi>no, but it's referenced in the file
<youpi>look for the function name
<AlmuHS>procfs/process.c:497: .get_contents = process_file_gc_stat,
<AlmuHS>or the own declaration
<AlmuHS>procfs/process.c:219:process_file_gc_stat (struct proc_stat *ps, char **contents)
<youpi>.get_contents, that's it, it's a reference to the function
<youpi>see the .needs just below
<AlmuHS>ok
<youpi>you need to add PSTAT_THREAD_SCHED, to express that it needs that information
<AlmuHS>ok
<AlmuHS>done
<AlmuHS> http://dpaste.com/03R4SDT
<AlmuHS>compiled with this mod. Booting
<AlmuHS>now works!! :-)
<AlmuHS>although top only shows 3 process
<AlmuHS>but all process are in cpu 0, anyway
<AlmuHS> https://pasteboard.co/IExVWc2.png
<AlmuHS>forcing "make -j2" to gen 2 threads, both runs over cpu 0 https://pasteboard.co/IExWLMa.png
<AlmuHS>youpi: fixed. But bad news for my smp work
<youpi>AlmuHS: could you post the output of ps -feM
<AlmuHS>ok
<youpi>I'm wondering if an idle thread shows up for the second processor
<youpi>rootdir_gc_stat would need to be fixed to include all processors information
<AlmuHS> https://pastebin.com/5RmX387T
<youpi>ah, sorry I missed a flag
<youpi>ps -feMj
<AlmuHS>ok
<AlmuHS> https://pastebin.com/8XrU0VTq
<youpi>Mmm, perhaps even ps -feMT
<AlmuHS>ok...
<youpi>ps -feMTj
<youpi>even :)
<youpi>so we have all informations
<AlmuHS> https://pastebin.com/RrfaXrYs
<youpi>yay, there are two gnumach thread with "N"
<youpi>so there is an idle thread for the second processor indeed
<AlmuHS>maybe I can know the cause. wait
<youpi>I wonder how get_idletime can determine which one is for which processor, though
<youpi>it's expected to have these two idle threads
<youpi>it's however surprising that one of them has no time
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/blob/wip/i386/i386/cpuboot.S#L105-L109
<AlmuHS>maybe is this
<youpi>does cpu_ap_main terminate?
<AlmuHS>I added this hack to avoid that, when the processor is not added to the system, It returns to the snippet and finish
<AlmuHS>ideally, after "slave_main()" the processor never regret to cpuboot.S
<AlmuHS>because slave_main() generate a new thread over this
<AlmuHS>but maybe It regret anyways and does "hlt"
<youpi>you should put printfs to make sure what happens
<AlmuHS>yes
<youpi>normally slave_main doesn't return, so put printf there
<AlmuHS>I added this hack because, when I disable pagging and slave_main(), and the processor don't enter to the system, this regret to cpuboot.S, finish the thread and dead
<AlmuHS>ok
<youpi>you'll need to release ap_config_lock before calling it
<youpi>otherwise that lock will never be released
<AlmuHS>what?
<youpi>cpu_launch_first_thread() doesn't return
<youpi>(at least it's not supposed to)
<youpi>so slave_main won't either
<AlmuHS>It might not
<youpi>it's not supposed to
<youpi>put printfs to be suer
<AlmuHS>ok
<youpi>but in the long run it won't
<youpi>so releasing a mutex after it doesn't make sense
<youpi>that'll never happen
<AlmuHS>the mutex in cpu_setup() is useless, I think
<AlmuHS>I added this as an experiment
<youpi>does the second CPU have its own PIT?
<AlmuHS>PIT?
<youpi> cpu_launch_first_thread calls startrtclock
<youpi>you wouldn't want to do that again for the second processor, if you have only one PIT
<AlmuHS>what is PIT? :-(
<youpi>the programmable interrupt timer
<AlmuHS>ok, ok
<AlmuHS>in APIC, the Local APIC has Its own timer, but Mach don't use It
<youpi>I'd say in cpu_launch_first_thread put an if (th != THREAD_NULL) before calling startrtclock
<AlmuHS>I haven't implemented It yet
<AlmuHS>yes
<youpi>so only the boot cpu does it
<AlmuHS>added
<youpi>in idle_thread, put a printf
<youpi>to make sure that it's getting started
<AlmuHS>ok
<youpi>also put prints before thread_block, and between thread_block and idle_thread_continue
<youpi>to be sure what happens there
<AlmuHS> printf("cpu %d entering in idle_thread", cpu_number());
<youpi>and also at the begining of idle_thread_continue
<youpi>concerning get_idletime in procfs, the answer is: idle threads are created in the cpu number order
<youpi>so get_idletime can assume that
<youpi>note that for later :)
<youpi>that'll be needed to get proper cpu stats in top
<AlmuHS>I added this printf in the own idle_thread() function, not in the callet
<AlmuHS>*caller
<AlmuHS> printf("cpu %d continue in idle thread", mycpu);
<AlmuHS>?
<youpi>for instance
<youpi>the message doesn't really matter as long as it's different for each print, and contains the cpu number
<AlmuHS>what do you say about get_idletime ?
<youpi>err, well, what I said
<youpi>you'll see that later
<youpi>just copy/paste what I said
<youpi>for when you get to that function to get proper cpu stats in top
<youpi>so, about that idle thread not getting time in ps, that might be "normal": clock_interrupt() is supposed to be called in order to update that
<youpi>but since there's only the pit, for which cpu0 catches the interrupt, cpu1 never gets such interrupt and never calls clock_interrupt
<youpi>so you need to program the local apic insteda of the pit, to get these interrupts
<AlmuHS>ok
<AlmuHS>damo22 said that he could help me with local apic configuration
<youpi>yes, that's it: in the ivect array, irq0 is hardwired on the hardclock() function, which calls clock_interrupt
<youpi>you want to make the local apic call that as interupt handler
<youpi>that might also actually help to make thread go to cpu1
<youpi>without it, only scheduling IPIs would get that
<youpi>I guess you haven't implemented IPIs for scheduling yet?
<AlmuHS>I've implemented a generic function to write ICR
<AlmuHS>and the startup IPI sequence
<AlmuHS>anymore
<youpi>but that's only used at startup, not by the scheduler, right?
<AlmuHS>yes
<AlmuHS>only in startup
<AlmuHS>the sequence is start_other_cpus() -> intel_startCPU() -> startup_cpu()
<AlmuHS>---> (from interrupt) cpuboot.S -> cpu_ap_main() -> cpu_setup() - ... -> slave_main()
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/46c4bc76ac9b0dffa5105f29a11cffbba9f88e20
<youpi>so apparently what you need to do is to define an aston() function and define MACHINE_AST, so that in kern/sched_prim.c's thread_setrun(), the ast_on call does something
<youpi>aston would have to send the ipi
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/e1c6fac3fcc1ac0ff7c88402057f3566e8458913
<AlmuHS>ast.c
<AlmuHS>there are a ast_check() function empty https://github.com/AlmuHS/GNUMach_SMP/blob/master/i386/i386/ast_check.c
<AlmuHS>and here there are some NCPU > 1 cases without implement (I think) https://github.com/AlmuHS/GNUMach_SMP/blob/master/kern/act.c
<youpi>basically, your IPI interrupt handler wouldn't have to do anything
<youpi>ast_on only has to record the ast reason in the need_ast array, and send the IPI
<AlmuHS>I haven't an IPI interrupt handler by own
<AlmuHS>only a startUP IPI sequence
<youpi>the interrupt handler does nothing, but on return it will call all_intrs, which checks need_ast, and if so calls take_ast
<youpi>sure, that's to be set up
<youpi>just telling you what needs t obe done for this to work
<AlmuHS>ok
<AlmuHS>but this ast is not a architecture-defined IPI, is a fixed-type IPI , is not?
<AlmuHS>It's not like INIT IPI or StartUP IPI
<youpi>ast is just a software layer
<youpi>which only requires from the hardware to manage to raise an IPI
<youpi>whatever the form that has
<AlmuHS>but I haven't clear what is the purpose of a AST. Or, even, what is exactly an AST
<youpi>it's just a software layer for various stuff in the kernel to notify another cpu to do something
<youpi>since that's all software, you don't need to implement anything but the IPI
<AlmuHS>only send an IPI with the content indicated in the call ?
<youpi>really, as I said, all it takes to implement it is to write an aston function which records the reason in the need_ast array and sends an IPI
<youpi>and add an IPI handler which does nothing
<youpi>no even need for passing the content in the IPI, just record it in need_ast
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/blob/smp/i386/i386/mp_desc.c#L232-L236
<youpi>ah, ast_on() already does the recording
<youpi>so aston() only has to trigger the IPI
<youpi>and the handler do nothing
<AlmuHS>my send_ipi() function only fills IDT (to send the IPI)
<youpi>that should be enough yes
<AlmuHS>but this need the IDT content to write
<AlmuHS>you can see an example of use just below send_ipi(), in startup_cpu()
<AlmuHS>ok, I start understand a few
<AlmuHS>but then, we need to define some IPI types (to each reason)
<AlmuHS>different signals
<AlmuHS>where I said "IDT content", I wanted to say "ICR content"
<youpi>put whatever you like, it doesn't matter
<youpi>the ast layer already manages the reasons
<youpi>so you don't need to manage them at the IPI level
<youpi>i.e. you'd have only one reason at the IPI level: ast
<AlmuHS>ok, as message passing
<youpi>note that it's when you'll start executing processes on cpu1 that mayhem may happen
<youpi>for now you do not have concurrency in drivers etc.
<youpi>once executing on cpu1 you will
<youpi>notably, the disk driver wil lbe called there
<AlmuHS>yes. I'm not an SMP expert, I'm programming a bit blindy (I only knowed until slave_main() )
<AlmuHS>I'm embarrased to say that, but It's the true :-(
<youpi>everybody started like that
<youpi>I did
<youpi>just a longer time ago than you did
<AlmuHS>then, I need a new step sequence, with the next steps to solve
<youpi>yeah, the ahci/ide disk driver will not be smp safe
<youpi>it'd probably be safer to use a rumpdisk driver
<AlmuHS>I need a new task list
***Emulatorman__ is now known as Emulatorman
<youpi>what i said for now: local apic, IPI
<AlmuHS>damien said ioapic (to share IRQ)
<youpi>then you'll want to implement cpu stats in proc for top to show them
<youpi>but in terms of implementation that should be about it
<youpi>what is next is to debug issues that will come out
<youpi>or ioapic, I don't remember the details. BAscially, something that interrupts cpus periodically
<AlmuHS>ioapic is for devices IRQ
<AlmuHS>at moment, keyboard IRQ (the tty don't reply keyboard)
<AlmuHS>Mach currently use PIC8259 controller
<AlmuHS>but damo22 said that, to share I/O IRQ between processors, could be needed to configure IOAPIC
<AlmuHS> http://paste.debian.net/plain/1111966
<youpi>we don't really need to share IRQs between processors for now
<youpi>it's fine to let cpu0 handle them all
<AlmuHS>then, why tty don't reply keyboard?
<youpi>possibly due to the pit thing, iirc they are related
<AlmuHS>It's possible
<AlmuHS> https://pasteboard.co/IEyBIUw.png
<youpi>ok, put an if to avoid these :)
<AlmuHS> if(self != THREAD_NULL) printf("cpu %d entering in idle_thread\n", cpu_number()); ?
<youpi>you can as well just test cpu_number()
<AlmuHS>yes
<AlmuHS>if(cpu_number() != 0)
<AlmuHS> printf("cpu %d entering in idle_thread\n", cpu_number());
<youpi>thinking about it on my way back home: in idle_thread_continue() you could replace machine_idle() with machine_relax(), to avoid making the cpu enter sleep mode and need an irq to be awakened
<youpi>that'll suck power, but it should allow for proper cpu scheduling
<AlmuHS>in cpu_launch_first_thread() ?
<youpi>? no, that one doesn't call machine_idle ?
<AlmuHS>idle_thread_continue() then?
<youpi>like I said, yes
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/blob/wip/kern/sched_prim.c#L1665
<AlmuHS>this?
<youpi>well, yes, there is no other call to machine_idle in idle_thread_continue
<AlmuHS>ok
<AlmuHS>compiling...
<AlmuHS>something crash https://pasteboard.co/IEyKODJ.png
<youpi>ok. To avoid issues I was thinking that you could, in thread_init(), set bound_processor to cpu_to_processor(0) instead of PROCESSOR_NULL
<youpi>normally that'd make all tasks remain on cpu0
<youpi>so you can choose what you want to possibly put on cpu1
<youpi>that'd be notably useful to make ext2fs remain on cpu0 and avoid smp issues in the ide/ahci driver
<AlmuHS>yes, I go to try It
<AlmuHS>do you say thread_init() function, or the call to this?
<AlmuHS>ok, done
<AlmuHS>there are a special SMP usecase, disabled
<AlmuHS>#if NCPUS > 1
<AlmuHS> /* thread_template.last_processor (later) */
<AlmuHS>#endif /* NCPUS > 1 */
<youpi>when I say "in thread_init", I mean inside it
<AlmuHS>yes, I've just find It
<youpi>otherwise I would say "in callers of thread_init"
<AlmuHS>compiling...
<AlmuHS>same problem
<AlmuHS>wait
<AlmuHS>It's booting, but so slow
<youpi>how are you emulating the system?
<youpi>qemu?
<AlmuHS>yes
<youpi>with kvm acceleratoin?
<AlmuHS>I think yes
<youpi>do you have several CPUs on your host?
<youpi>does top on your host report 200% cpu usage?
<AlmuHS>yes, dual cpu, 4 thread
<AlmuHS>I have enable-kvm, in fact
<youpi>ok, you could perhaps try to put a for loop around calling machine_relax()
<youpi>to call it like a thousand times
<AlmuHS>ok
<youpi>if that's not enough, try a million times :)
<AlmuHS>ok
<AlmuHS>you can follow my commits here, btw https://github.com/AlmuHS/GNUMach_SMP/commits/wip
<youpi>already checked that out :)
<AlmuHS>latest commit don't solve problem
<AlmuHS>oh, wait. I had a fail in my latest commit. I forgot i++
<youpi>that said, it still means that calling machine_relax in a loop makes gnumach slow
<youpi>with i++ that'll be the same
<AlmuHS>yes
<AlmuHS>really, what makes slow is Hurd servers boot
<AlmuHS>the boot start freezing in ext2fs and finally freeze in Hurd server bootstrap, after load exec
<AlmuHS> https://pasteboard.co/IEz0nvP.png
<youpi>did you add the bound_processor thing?
<AlmuHS>yes
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/e9d9c00ccef0fc02fd5e94e1fc985d70b7163621
<AlmuHS>what can I do?
<youpi>did you try with the bound_processor thing but keep machine_idle()?
<youpi>in case it's actually the bound_processor thing which poses problem
<AlmuHS>I don't test It, I remember
<AlmuHS>I go to try It
<youpi>oh, wait, I didn't realize: you want to replace machine_idle with machine_relax only on cpu1
<youpi>cpu0 should continue using machine_idle
<youpi>and the infinite while loop for sure made cpu0 somehow hang :)
<AlmuHS>I go to check It
<AlmuHS>same: https://pasteboard.co/IEzqoCU.png
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/56d8508c10196a8df80e822e73ab767c2a557b73
<youpi>I was thinking: to make sure to know what is happening, put a print there when cpu is not 0
<youpi>to make sure whether cpu1 is just stuck in machine_idle, or something else
<AlmuHS>after or before than machine_relax() ?
<AlmuHS>before better
<AlmuHS>I go to dinner
<youpi>before and after
<AlmuHS>compiling
<AlmuHS> https://pasteboard.co/IEzHW8H.png
<AlmuHS>youpi:
<youpi>ah, I meant to use the machine_idle version
<youpi>it's not surprising that it gets out of machine_relax
<AlmuHS>ok, XD
<youpi>I was rather wondering whether the machine_idle version does return or not
<youpi>did you look what machine_idle/relax do?
<AlmuHS>I didn't look It yet
<youpi>it's just a few lines
<youpi>you'll better understand what I'm after
<AlmuHS>in model_dep.c, I see
<AlmuHS>machine_relax() is a "nop" loop
<youpi>do you use tags in your editor?
<AlmuHS>no
<youpi>to be able to quickly jump to the definition of a function?
<youpi>you *really* should
<youpi>you won't have to care about where is what any more
<youpi>that's a so huge time saving
<AlmuHS>I usually use CodeBlocks when I want to find any symbol
<AlmuHS>but now I'm in a simple Gedit
<AlmuHS>ok. machine_idle() is a hlt
<AlmuHS>I've just opened CodeBlocks
<youpi>why using gedit?
<AlmuHS>because It's light
<AlmuHS>to quick changes, It's more simple than a fully IDE
<AlmuHS>but the slower task is transfer the changes to the VM. I have to commit and push to my repo, then fetch and merge in my VM, checkout to the Damien's branch (with Debian patches), and remerge It
<youpi>if the IDE makes it slow to make small changes, change IDE :)
<AlmuHS>no, simply I'm so lazy to open the IDE, unless It's really necessary
<AlmuHS>ok, the new print don't shows any result
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/cbd182e9e5c9b9adfc60514ea31f28008d8de3eb
<youpi>there's a misunderstanding
<youpi>I'm not saying to look at what is happening on cpu0
<youpi>that one is fine since your system can usually boot with it
<youpi>I'm saying to use machine_idle on cpu1 too, and observe there whether it does return
<AlmuHS> https://pasteboard.co/IEzQaNf.png
<AlmuHS>oh, ok
<AlmuHS>compiling
<AlmuHS> https://pasteboard.co/IEzTuCM.png
<youpi>does it boot fast?
<youpi>(note that sometimes boot gets stuck at exec startup, and it takes a couple reboots to get it past that)
<AlmuHS>no boot
<AlmuHS>simply freeze
<youpi>just to make sure: apart from the prints, there is no difference with the version that was booting fine?
<AlmuHS>nope
<youpi>so I believe it's just the boot that gets stuck, try to reboot it
<AlmuHS>the slow is not gnumach
<AlmuHS>the problem starts when ext2fs (and hurd after this) trying to boot
<youpi>ah yes there is a difference: the bound_processor thing
<youpi>did you try to reboot several times?
<AlmuHS>yes
<AlmuHS>It's the 4th
<youpi>try to revert the bound_processor thing then
<AlmuHS>ok
<youpi>to get back to something that was working before
<AlmuHS>ok
<youpi>otherwise one can't make sure what is due to what
<AlmuHS>ok, reverted bound_processor, It seems works
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/f47300b95f3560e7e61aa0abbf606a429282f5e0
<youpi>ok, so let's keep that for now, does it print cpu 1 out of idle?
<AlmuHS>I didn't see It
<AlmuHS>I go to reboot to check better
<youpi>if you don't see it, it means it doesn't happen, normally you would see loads of it :)
<youpi>you could try to replace machine_idle(mycpu) with the infinite loop while(1) ;
<youpi>just to make it hang in a cpu-busy with that should get emulated fine
<youpi>perhaps machine_relax() actually bothers virtualization
<AlmuHS> https://pasteboard.co/IEzYTxs.png
<AlmuHS>currently, both processors has machine_idle()
<youpi>if the while(1); loop boots fine as well, you can try to make int i a volatile int i, and use for (i = 0; i < 10000000; i++) ; to make it a small delay loop without using machine_relax()
<youpi>I'm all talking about cpu1
<youpi>cpu0 works just fine since it has its irqs
<AlmuHS>yes, I go to try It
<AlmuHS>compiling
<youpi>err
<youpi>one step at a time
<AlmuHS>what?
<youpi>I said first try with the while(1); loop
<AlmuHS>I misunderstand
<youpi>if you don't try each step, you don't understand what is happening
<AlmuHS>yes, It's true
<AlmuHS>recompiling
<AlmuHS>with while(1) It works well
<youpi>ok
<youpi>so many the cpu spin is not a problem
<AlmuHS>while(1) is like stop the processor
<youpi>so try the for loop
<AlmuHS>ok
<AlmuHS>compiling
<AlmuHS>with for-loop, It seems to freeze
<AlmuHS> https://pasteboard.co/IEA7ZQS.png
<youpi>ok, so it's really letting cpu1 run other threads which poses problem
<youpi>which is only to be expected, actually :)
<AlmuHS>concurrency?
<youpi>all sorts of yes
<AlmuHS>non mutual exclusion
<youpi>I'm thinking that you should first set back the bound_processor thing
<youpi>with machine_idle() on cpu1
<youpi>and make that work, before going further
<AlmuHS>set cpu1 as bound_processor ?
<youpi>no, cpu0
<AlmuHS>I did It before
<AlmuHS>do I keep cpu0's machine_idle() ?
<youpi>yes
<youpi>the idea is to keep almost similar to what was working
<youpi>and only change the bound_processor thing
<youpi>which is not supposed to b reak anything (but apparently does)
<youpi>and once working, should allow to let cpu1 use machine_relax(), for it only to find that it has nothing to execute
<youpi>and then enable running some tasks on cpu1 one by one
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/9a0bfb25b22a2ec9c2c170daf4416aecda4c4b59
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/653a1f7ba3d5ab0f47a52853ad55d6802325cdfb
<AlmuHS>my IDE broken some tabs
<AlmuHS>the system hang again
<AlmuHS>in the same place than before
<youpi>which place ?
<AlmuHS> https://pasteboard.co/IEAgb3s.png
<AlmuHS>maybe hlt is shutdown entirely the cpu1
<youpi>that's not a problem
<youpi>cpu0 shoul dbe able to continue working without problem
<youpi>I'm checking what setting bound_processor would change
<youpi>I don't see anything obvious
<youpi>I'd say put prints in thread_setrun and in thread_select to make sure that evertything happens as appropriate
<youpi>i.e. thread_setrun puts it on cpu0's runq
<youpi>and thread_select finds it there
<AlmuHS>have I to put these prints ?
<youpi>that's waht I'm saying yes
<AlmuHS>ok
<AlmuHS>here? if (myprocessor->runq.count > 0) {
<youpi>anywhere where it makes sense
<AlmuHS>It's interesting
<AlmuHS>if ((thread->state == TH_RUN) &&
<AlmuHS>#if MACH_HOST
<AlmuHS> (thread->processor_set == pset) &&
<AlmuHS>#endif /* MACH_HOST */
<AlmuHS> ((thread->bound_processor == PROCESSOR_NULL) ||
<AlmuHS> (thread->bound_processor == myprocessor))) {
<AlmuHS> simple_unlock(&pset->runq.lock);
<AlmuHS> thread_lock(thread);
<AlmuHS> if (thread->sched_stamp != sched_tick)
<AlmuHS> update_priority(thread);
<AlmuHS> thread_unlock(thread);
<AlmuHS> }