IRC channel logs

2023-02-11.log


<gnu_srs1>fowler: I have the same problem with my hurd-cross :( Not solved yet. But you can log in as root by "login root <return>"
<gnu_srs1>You can login as root with no password. Be careful!!
<damo22>=> 0xc1001518 <+56>: pause
<damo22> 0xc100151a <+58>: mov 0x8(%edi),%eax
<damo22> 0xc100151d <+61>: test %eax,%eax
<damo22> 0xc100151f <+63>: jne 0xc1001518 <pmap_extract+56>
<damo22>that is an infinite loop
<damo22>how can 0x8(%edi) get a different value on one core?
<damo22>this is before interrupts are working
<damo22>#define simple_lock(l) \
<damo22> ({ \
<damo22> while(_simple_lock_xchg_(l, 1)) \
<damo22> while (*(volatile int *)&(l)->lock_data) \
<damo22> __asm ("pause"); \
<damo22> 0; \
<damo22> })
<damo22>i dont think this works
<damo22>if l is already locked, it will fall into the second loop and spin on the lock_data == 1 forever
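For reference, the macro quoted above can be approximated in user space with C11 atomics. This is a minimal sketch, not the gnumach code: `sketch_lock_t`, `sketch_xchg`, `sketch_try_lock`, `sketch_lock` and `sketch_unlock` are hypothetical names, and `atomic_exchange` is assumed to play the role of `_simple_lock_xchg_`. It shows that the call returns at once on a free lock and only spins while some other holder keeps `lock_data` set.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for a gnumach simple lock. */
typedef struct {
    atomic_int lock_data;   /* 0 = free, 1 = held */
} sketch_lock_t;

/* Assumed equivalent of _simple_lock_xchg_: store v, return the old value. */
static int sketch_xchg(sketch_lock_t *l, int v)
{
    return atomic_exchange(&l->lock_data, v);
}

/* One atomic attempt: returns 1 if the lock was acquired, 0 if it was held. */
static int sketch_try_lock(sketch_lock_t *l)
{
    return sketch_xchg(l, 1) == 0;
}

/* Same shape as the macro: atomic attempt outside, read-only wait inside. */
static void sketch_lock(sketch_lock_t *l)
{
    while (sketch_xchg(l, 1))                 /* old value 1: someone holds it */
        while (atomic_load(&l->lock_data))    /* wait until it looks free */
            ;                                 /* the kernel macro issues "pause" here */
}

static void sketch_unlock(sketch_lock_t *l)
{
    assert(atomic_load(&l->lock_data) == 1);  /* unlocking a free lock is a bug */
    atomic_store(&l->lock_data, 0);
}
```

Calling `sketch_lock` on a lock that the caller itself already holds would never return, which is the behaviour being discussed here.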
<gnucode>damo22: this is quite a bit above my paygrade...is that a glibc bug or a gnumach bug?
<damo22>gnumach, im working on smp
<youpi>damo22: if it's kept locked, the loop will go on forever, sure
<youpi>that's why people are supposed to release locks at some point :)
<damo22>if there is one cpu only, and no interrupts, it cant ever release the lock
<youpi>sure
<youpi>but that's not supposed to happen
<youpi>you're not supposed to re-take a lock that you have already held
<youpi>that's why it's already interesting to boot an smp kernel, that allows you to catch such situations
<youpi>(even with one cpu)
<youpi>( that you already hold*, rather )
<damo22>even not retaking a lock
<damo22>it cant release the lock
<youpi>sure
<youpi>but it's *not* supposed to take the same lock twice
<youpi>so that's not supposed to happen
<youpi>if it happens it's a bug that needs to be fixed anyway
<damo22>simple_lock(l) will never return
<youpi>that's how spinlocks work
<youpi>if it's already held by the same cpu, yes
<youpi>again, that's how they work
<youpi>it's how OS have been using them for decades
<damo22>so we cant use simple locks when theres only one cpu
<youpi>of course we can
<youpi>there is nothing against that
<youpi>(and having several cpus doesn't solve the issue at all anyway)
<youpi>it really seems to me that you really need to read books about OS
<youpi>that explain spinlocks, spinlocks against interrupts, etc.
<damo22>yea
<youpi>e.g. "the linux kernel" from Bovet
<gnucode>I've got the dragon CS book. It's just been sitting on my shelf for 2 years...
<youpi>+understanding
<youpi>dragon isn't at all about OS programming
<damo22>why are there two nested while loops in the definition of simple_lock?
<youpi>damo22: notably, an important thing: if an interrupt handler wants to lock an slock, *all* other takers of the slock *have* to raise spl before taking the lock
<youpi>otherwise things will deadlock
<youpi>damo22: again, that's very *common* knowledge about spinlocks
<youpi>as in: there is no need to perform an atomic operation to check about the availability of the spinlock, that's useless expense
<youpi>so you have an internal loop that is just there to wait, before re-trying again
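The cost argument for the two nested loops can be made concrete with a deterministic, single-threaded simulation. This is only a sketch under stated assumptions: `counted_lock_t` and its counters are hypothetical, and the "other CPU" is faked by releasing the lock after `release_after` plain reads. The point is that however long the wait lasts, the number of atomic exchanges stays small while the cheap reads absorb the waiting.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical instrumented lock used only for this simulation. */
typedef struct {
    atomic_int lock_data;     /* 0 = free, 1 = held */
    int xchg_count;           /* atomic read-modify-write attempts */
    int read_count;           /* plain reads done by the inner loop */
    int release_after;        /* simulated holder releases after this many reads */
} counted_lock_t;

/* Inner-loop read; also plays the part of the other CPU releasing the lock. */
static int counted_read(counted_lock_t *l)
{
    l->read_count++;
    if (l->read_count >= l->release_after)
        atomic_store(&l->lock_data, 0);
    return atomic_load(&l->lock_data);
}

static void counted_lock(counted_lock_t *l)
{
    assert(l->release_after > 0);   /* otherwise the simulation never releases */
    for (;;) {
        l->xchg_count++;
        if (atomic_exchange(&l->lock_data, 1) == 0)
            return;                 /* old value was 0: the lock is ours */
        while (counted_read(l))     /* wait with reads only, no atomic op */
            ;
    }
}
```

Starting from a held lock with `release_after` set to 1000, the simulation performs 1000 plain reads but only two exchanges: the first failed attempt and the retry that succeeds.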
<youpi>again, that's all explained in books
<damo22>if you call splhigh(); simple_lock(); its not interruptable
<youpi>yes, that's exactly the purpose
<youpi>to avoid getting interrupted by a handler that'd want to take the same slock, and thus stay stuck
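The rule about raising spl before taking a lock shared with an interrupt handler can be illustrated with a small single-threaded model. Everything here is hypothetical and simplified: `irq_masked` stands in for a raised spl, `sim_splx` for lowering it again, and the handler merely counts the cases where it would have spun forever instead of actually spinning.

```c
#include <assert.h>
#include <stdbool.h>

static int  lock_data;            /* 0 = free, 1 = held */
static bool irq_masked;           /* stand-in for a raised spl */
static bool irq_pending;          /* interrupt deferred while masked */
static int  handler_would_spin;   /* times the handler found the lock held */

/* Handler that needs the same lock as the interrupted code. */
static void handler(void)
{
    if (lock_data) {              /* a real handler would spin here forever, */
        handler_would_spin++;     /* since the holder cannot run to unlock   */
        return;
    }
    lock_data = 1;                /* take the lock, do the work, release */
    lock_data = 0;
}

/* Deliver an interrupt now, or defer it if interrupts are masked. */
static void raise_irq(void)
{
    if (irq_masked)
        irq_pending = true;
    else
        handler();
}

/* Unmask interrupts and run anything that was deferred. */
static void sim_splx(void)
{
    irq_masked = false;
    if (irq_pending) {
        irq_pending = false;
        handler();
    }
}

/* Critical section during which an interrupt arrives. */
static void critical_section(bool raise_spl)
{
    if (raise_spl)
        irq_masked = true;
    lock_data = 1;
    raise_irq();                  /* interrupt fires while the lock is held */
    lock_data = 0;
    if (raise_spl)
        sim_splx();
}
```

With `raise_spl` false the handler runs while the lock is held, which is the deadlock case; with `raise_spl` true the handler is deferred until after the unlock and finds the lock free.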
<youpi>again, that's explained in the Bovet book
<youpi>(which is available as pdf on e.g. https://doc.lagout.org/operating%20system%20/linux/Understanding%20Linux%20Kernel.pdf)
<youpi>(Bovet + Cesati)
<youpi>(if there's one book I definitely recommend buying to be able to read it carefully, it's this one)
<damo22>ok
<youpi>and then linux device drivers, from corbet, rubini, & gkh
<damo22>"while the current state is locked, lock it again and spin on the lock value until its unlocked"
<damo22>that relies on some interrupt to unlock it, but that can only occur on a different cpu
<damo22>with one cpu it will never get unlocked
<damo22>actually the code we have in i386/lock.h is more like:
<damo22>1. take the lock, check its previous value, if it was unlocked before, return.
<damo22>2. otherwise spin on the value of the lock until its unlocked
<damo22>3. goto 1
<damo22>the only way this process can return is if it was never locked in the first place, OR something unlocks it (can never happen)
<damo22>so we need to make sure in the single processor case, nothing locks the same lock twice
<damo22>without first unlocking it
<damo22>simple_lock(l); simple_lock(l); DEADLOCK
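One way to make the double-lock case visible instead of hanging is to record the holder in the lock and check it before spinning. This is a sketch of a debugging aid, not something the log says gnumach does: `debug_lock_t`, its `owner` field, the `cpu` argument and the `-1` return value are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Hypothetical debug variant of a simple lock that remembers its holder. */
typedef struct {
    atomic_int lock_data;   /* 0 = free, 1 = held */
    atomic_int owner;       /* CPU id of the holder, NO_OWNER when free */
} debug_lock_t;

/* Returns 0 on success, -1 if the calling CPU already holds the lock. */
static int debug_lock(debug_lock_t *l, int cpu)
{
    if (atomic_load(&l->lock_data) && atomic_load(&l->owner) == cpu)
        return -1;                            /* would spin forever: report it */
    while (atomic_exchange(&l->lock_data, 1)) /* atomic attempt */
        while (atomic_load(&l->lock_data))    /* read-only wait */
            ;
    atomic_store(&l->owner, cpu);
    return 0;
}

static void debug_unlock(debug_lock_t *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu);    /* only the holder may unlock */
    atomic_store(&l->owner, NO_OWNER);
    atomic_store(&l->lock_data, 0);
}
```

With this variant, the sequence lock, lock on the same CPU returns `-1` on the second call rather than deadlocking, and a lock taken after an unlock succeeds again.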
<damo22>i moved the simple_lock macro to a function, and got this:
<damo22>#0 0xc100b546 in simple_lock (l=0xf0) at ../i386/i386/lock.c:17
<damo22>#1 0xc1001e20 in pmap_enter (pmap=0xc10a81c4 <kernel_pmap_store>, v=4135583744, pa=17575936, prot=3,
<damo22> wired=0) at ../i386/intel/pmap.c:2001
<damo22>why would the lock be at address 0xf0
<damo22>(gdb) frame 1
<damo22>#1 0xc1001e20 in pmap_enter (pmap=0xc10a81c4 <kernel_pmap_store>, v=4135583744, pa=17575936, prot=3,
<damo22> wired=0) at ../i386/intel/pmap.c:2001
<damo22>2001 PMAP_READ_LOCK(pmap, spl);
<damo22>(gdb) p &(pmap->lock)
<damo22>$2 = (simple_lock_data_t *) 0xc10a81cc <kernel_pmap_store+8>
<damo22>(gdb) x 0xf0
<damo22>0xf0: 0x00000001
<damo22>its reading the wrong address for the lock
<damo22>Pellescours: can you see why simple_lock would get the wrong address of the lock?
<damo22>while (*(volatile int *)&(l)->lock_data)
<damo22>where (l) would be replaced by (&(pmap)->lock)
<damo22>l = 0xf0
<luckyluke>damo22: maybe it's not initialized... Do you have a full stacktrace?
<youpi>damo22: take care that gdb cannot always get addresses right
<youpi>only assembly will tell you the truth
<youpi>and, yes simple_lock(); simple_lock() will deadlock, it's meant for that
<youpi>and testing with 1 cpu already allows us to check that we don't have code like that
<Pellescours>for me it deadlocked in pmap_extract and I don't see any place before it where the lock was left locked before entering there
<luckyluke>it seems I have a basic working 64-bit kernel :) https://paste.debian.net/1270421/
<luckyluke>some things are a bit strange (e.g. I see 1GB of memory in /proc/meminfo instead of 8GB) but I can use a shell
<surpador>hey, that's exciting!