IRC channel logs

<gnu_srs1>fowler: I have the same problem with my hurd-cross :( Not solved yet. But you can log in as root by "login root <return>"

<gnu_srs1>You can login as root with no password. Be careful!!

<damo22>=> 0xc1001518 <+56>: pause

<damo22> 0xc100151a <+58>: mov 0x8(%edi),%eax

<damo22> 0xc100151d <+61>: test %eax,%eax

<damo22>that is an infinite loop

<damo22>how can 0x8(%edi) get a different value on one core?

<damo22>this is before interrupts are working

<damo22>#define simple_lock(l) \

<damo22> ({ \

<damo22> while(_simple_lock_xchg_(l, 1)) \

<damo22> while (*(volatile int *)&(l)->lock_data) \

<damo22> __asm ("pause"); \

<damo22> 0; \

<damo22> })

<damo22>i dont think this works

<damo22>if l is already locked, it will fall into the second loop and spin on the lock_data == 1 forever

<gnucode>damo22: this is quite a bit above my paygrade...is that a glibc bug or a gnumach bug?

<damo22>gnumach, im working on smp

<youpi>damo22: if it's kept locked, the loop will go on forever, sure

<youpi>that's why people are supposed to release locks at some point :)

<damo22>if there is one cpu only, and no interrupts, it cant ever release the lock

<youpi>sure

<youpi>but that's not supposed to happen

<youpi>you're not supposed to re-take a lock that you have already held

<youpi>that's why it's already interesting to boot an smp kernel, that allows you to catch such situation

<youpi>(even with one cpu)

<youpi>( that you already hold*, rather )

<damo22>even not retaking a lock

<damo22>it cant release the lock

<youpi>sure

<youpi>but it's *not* supposed to take the same lock twice

<youpi>so that's not supposed to happen

<youpi>if it happens it's a bug that needs to be fixed anyway

<damo22>simple_lock(l) will never return

<youpi>that's how spinlocks work

<youpi>if it's already held by the same cpu, yes

<youpi>again, that's how they work

<youpi>it's how OS have been using them for decades

<damo22>so we cant use simple locks when theres only one cpu

<youpi>of course we can

<youpi>there is nothing against that

<youpi>(and having several cpus doesn't solve the issue at all anyway)

<youpi>it really seems to me that you really need to read books about OS

<youpi>that explain spinlocks, spinlocks against interrupts, etc.

<damo22>yea

<youpi>e.g. "the linux kernel" from Bovet

<gnucode>I've got the dragon CS book. It's just been sitting on my shelf for 2 years...

<youpi>+understanding

<youpi>dragon isn't at all about OS programming

<damo22>why are there two nested while loops in the definition of simple_lock?

<youpi>damo22: notably, an important thing: if an interrupt handler wants to lock an slock, *all* other takers of the slock *have* to raise spl before taking the lock

<youpi>otherwise things will deadlock

<youpi>damo22: again, that's very *common* knowledge about spinlocks

<youpi>as in: there is no need to perform an atomic operation to check about the availability of the spinlock, that's useless expense

<youpi>so you have an internal loop that is just there to wait, before re-trying again

<youpi>again, that's all explained in books
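
Editor's sketch of the two-loop pattern youpi describes (often called test-and-test-and-set): the outer loop does the atomic exchange, and the inner loop only reads the lock word until it looks free. This is not the gnumach code. It uses C11 atomics, and the `sketch_*` names and the `sketch_lock_t` type are hypothetical stand-ins for illustration.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a simple lock; not gnumach's simple_lock_data_t. */
typedef struct {
    atomic_int lock_data;   /* 0 = free, 1 = held */
} sketch_lock_t;

static inline void sketch_lock_init(sketch_lock_t *l)
{
    atomic_init(&l->lock_data, 0);
}

/* One attempt: atomically store 1 and report whether the lock was free. */
static inline int sketch_lock_try(sketch_lock_t *l)
{
    return atomic_exchange_explicit(&l->lock_data, 1,
                                    memory_order_acquire) == 0;
}

static inline void sketch_lock(sketch_lock_t *l)
{
    /* Outer loop: the expensive atomic exchange. */
    while (!sketch_lock_try(l)) {
        /* Inner loop: plain reads only, until the holder releases.
         * An x86 kernel would typically execute "pause" in this loop. */
        while (atomic_load_explicit(&l->lock_data, memory_order_relaxed))
            ;
    }
}

static inline void sketch_unlock(sketch_lock_t *l)
{
    atomic_store_explicit(&l->lock_data, 0, memory_order_release);
}
```

The inner loop exits only when some other context stores 0, so the exchange is retried only when it has a chance of succeeding.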

<damo22>if you call splhigh(); simple_lock(); its not interruptable

<youpi>yes, that's exactly the purpose

<youpi>to avoid getting interrupted by a handler that'd want to take the same slock, and thus stay stuck
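
Editor's sketch of why the priority level is raised before taking a lock that an interrupt handler also takes. It is a single-threaded user-space simulation, not kernel code: every `fake_*` function and all the variables are hypothetical, and interrupt delivery is modelled as a pending flag that is serviced only while the simulated level is 0.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock_word;          /* 0 = free, 1 = held */
static int  ipl;                      /* simulated level: 0 = enabled, 1 = masked */
static bool irq_pending;
static int  handler_runs;
static bool handler_saw_lock_held;

/* Simulated handler that needs the same lock as the main code path. */
static void fake_handler(void)
{
    /* A real handler would spin here; on one CPU it would never
     * return if the interrupted code was the holder. */
    handler_saw_lock_held = (atomic_exchange(&lock_word, 1) != 0);
    if (!handler_saw_lock_held)
        atomic_store(&lock_word, 0);
    handler_runs++;
}

static void deliver_if_unmasked(void)
{
    if (ipl == 0 && irq_pending) {
        irq_pending = false;
        fake_handler();
    }
}

static void fake_raise_irq(void) { irq_pending = true; deliver_if_unmasked(); }

/* Raise the level and return the previous one. */
static int fake_splhigh(void) { int old = ipl; ipl = 1; return old; }

/* Restore the level; anything that arrived while masked runs now. */
static void fake_splx(int old) { ipl = old; deliver_if_unmasked(); }

static void critical_section(void)
{
    int s = fake_splhigh();
    while (atomic_exchange(&lock_word, 1))
        ;                             /* take the lock */
    fake_raise_irq();                 /* arrives while the lock is held: deferred */
    atomic_store(&lock_word, 0);      /* release the lock */
    fake_splx(s);                     /* handler runs here and finds the lock free */
}
```

In this model the handler runs exactly once, after the lock has been released. If `critical_section` skipped `fake_splhigh`, the handler would run inside the locked region and find the lock held.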

<youpi>again, that's explained in the Bovet book

<youpi>(which is available as pdf on e.g. https://doc.lagout.org/operating%20system%20/linux/Understanding%20Linux%20Kernel.pdf)

<youpi>(Bovet + Cesati)

<youpi>(if there's one book I definitely recommend buying to be able to read it carefully, it's this one)

<damo22>ok

<youpi>and then linux device drivers, from corbet, rubini, & gkh

<damo22>"while the current state is locked, lock it again and spin on the lock value until its unlocked"

<damo22>that relies on some interrupt to unlock it, but that can only occur on a different cpu

<damo22>with one cpu it will never get unlocked

<damo22>actually the code we have in i386/lock.h is more like:

<damo22>1. take the lock, check its previous value, if it was unlocked before, return.

<damo22>2. otherwise spin on the value of the lock until its unlocked

<damo22>3. goto 1

<damo22>the only way this process can return is if it was never locked in the first place, OR something unlocks it (can never happen)

<damo22>so we need to make sure in the single processor case, nothing locks the same lock twice

<damo22>without first unlocking it

<damo22>simple_lock(l); simple_lock(l); DEADLOCK
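
Editor's sketch of the double-acquire case above. A real spinlock would spin forever, so this version gives up after a bounded number of attempts and reports failure. The `dbg_*` names and the `max_spins` parameter are hypothetical debugging aids, not part of gnumach.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only. */
typedef struct {
    atomic_int lock_data;   /* 0 = free, 1 = held */
} dbg_lock_t;

/* Spinlock acquire that gives up after max_spins failed attempts,
 * so a self-deadlock shows up as a false return instead of a hang. */
static bool dbg_lock_bounded(dbg_lock_t *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_exchange(&l->lock_data, 1) == 0)
            return true;    /* lock was free, now ours */
    }
    return false;           /* still held: nobody released it */
}

static void dbg_unlock(dbg_lock_t *l)
{
    atomic_store(&l->lock_data, 0);
}
```

With no other CPU and no interrupt handler to call `dbg_unlock`, a second `dbg_lock_bounded` on a held lock fails no matter how large `max_spins` is, which is the unbounded spin seen with `simple_lock(l); simple_lock(l);`.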

<damo22>i moved the simple_lock macro to a function, and got this:

<damo22>#0 0xc100b546 in simple_lock (l=0xf0) at ../i386/i386/lock.c:17

<damo22>#1 0xc1001e20 in pmap_enter (pmap=0xc10a81c4 <kernel_pmap_store>, v=4135583744, pa=17575936, prot=3,

<damo22> wired=0) at ../i386/intel/pmap.c:2001

<damo22>why would the lock be at address 0xf0

<damo22>(gdb) frame 1

<damo22>#1 0xc1001e20 in pmap_enter (pmap=0xc10a81c4 <kernel_pmap_store>, v=4135583744, pa=17575936, prot=3,

<damo22> wired=0) at ../i386/intel/pmap.c:2001

<damo22>2001 PMAP_READ_LOCK(pmap, spl);

<damo22>(gdb) p &(pmap->lock)

<damo22>$2 = (simple_lock_data_t *) 0xc10a81cc <kernel_pmap_store+8>

<damo22>(gdb) x 0xf0

<damo22>0xf0: 0x00000001

<damo22>its reading the wrong address for the lock

<damo22>Pellescours: can you see why simple_lock would get the wrong address of the lock?

<damo22>while (*(volatile int *)&(l)->lock_data)

<damo22>where (l) would be substituted for (&(pmap)->lock)

<damo22>l = 0xf0

<luckyluke>damo22: maybe it's not initialized... Do you have a full stacktrace?

<youpi>damo22: take care that gdb cannot always get addresses right

<youpi>only assembly will tell you the truth

<youpi>and, yes simple_lock(); simple_lock() will deadlock, it's meant for that

<youpi>and testing with 1 cpu already allows to check that we don't have code like that

<Pellescours>for me it deadlocked in pmap_extract, and I don't see any earlier place where the lock was left held before entering there

<luckyluke>it seems I have a basic working 64-bit kernel :) https://paste.debian.net/1270421/

<luckyluke>some things are a bit strange (e.g. I see 1GB of memory in /proc/meminfo instead of 8GB) but I can use a shell

<surpador>hey, that's exciting!


2023-02-11.log