I hope someone can help me with this issue; I am pretty stuck at the moment.
What I found out so far:
- the first 5 interrupts from the network chip are handled correctly; the 6th and the thousands that follow are not.
- I print various register values from the ethernet chip with printk.
- during the first 5 interrupts everything looks fine
- from interrupt 6 onward the chip seems to have vanished; even the register holding the chip's version number returns bogus values
- the first 5 interrupts seem to be caused by the initialisation (sending a loopback packet etc). All this is handled according to what the driver expects.
- at interrupt 6 the kernel tries to contact the NFS server for the boot partition
- the interrupt routine reads (bogus) 0 values from the interrupt registers of the network chip and concludes (erroneously) that the interrupt is not its own
- after thousands of such interrupts the kernel concludes that something is wrong and shuts down the interrupt => no more network
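To make the last two points concrete, here is a minimal user-space sketch of the mechanism as I understand it. It is a model only, not the real driver or kernel code: the names `fake_chip`, `irq_line`, `chip_isr` and `dispatch_irq` are hypothetical, and the window and limit values are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; names and thresholds are assumptions for illustration. */
enum irq_result { IRQ_NOT_MINE = 0, IRQ_WAS_HANDLED = 1 };

struct fake_chip {
    uint32_t int_status;    /* value returned when the status register is read */
};

struct irq_line {
    unsigned int total;     /* interrupts seen in the current window */
    unsigned int unhandled; /* how many of those no handler claimed */
    bool disabled;          /* set once the line has been shut down */
};

#define WINDOW          100000u /* assumed size of the counting window */
#define UNHANDLED_LIMIT  99900u /* assumed limit of unclaimed interrupts */

/* Handler convention: a status of 0 means "this device did not interrupt". */
static enum irq_result chip_isr(struct fake_chip *chip)
{
    uint32_t status = chip->int_status;

    if (status == 0)
        return IRQ_NOT_MINE;    /* bogus 0 reads end up here */

    chip->int_status = 0;       /* acknowledge the pending sources */
    return IRQ_WAS_HANDLED;
}

/* Core side: count unclaimed interrupts and give up when almost all are. */
static void dispatch_irq(struct irq_line *line, struct fake_chip *chip)
{
    if (line->disabled)
        return;

    if (chip_isr(chip) == IRQ_NOT_MINE)
        line->unhandled++;
    line->total++;

    if (line->total < WINDOW)
        return;

    if (line->unhandled > UNHANDLED_LIMIT)
        line->disabled = true;  /* => no more network */
    line->total = 0;
    line->unhandled = 0;
}
```

If the status register reads back as 0 on every interrupt, no handler ever claims the line, the source is never acknowledged, and the counter reaches the limit quickly.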
The only explanation I can think of is that the MMU tables get garbled after 5 interrupts, so that the virtual address that is supposed to map to the network chip's physical address now translates to somewhere else.
To check this theory, I would like some help:
- how can I translate a virtual address to a physical address? Then I could printk the physical address and check that it stays the same.
- how can I dump the MMU translation tables to check if something is damaged?
- do I understand correctly that the ARM MMU tables are separate from the Linux tables and maintained in parallel with them? Then those could also get out of sync.
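The check I have in mind for the first question is a software walk of the hardware tables, roughly as sketched below. This is a user-space model under several assumptions: an ARMv4/v5-style two-level format (1 MB sections, coarse second-level tables, 64 KB large and 4 KB small pages), a toy `phys_mem` array standing in for physical memory, and hypothetical helper names `phys_read32` and `walk`. I do not know yet whether my CPU uses exactly this layout. In the kernel, the table base would presumably come from CP15 register c2 and the reads would have to go through a mapping of physical memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy physical memory starting at PA 0; a stand-in for real RAM (assumption). */
#define PHYS_WORDS 8192u
static uint32_t phys_mem[PHYS_WORDS];

/* Hypothetical helper: read one 32-bit word at a physical address. */
static uint32_t phys_read32(uint32_t pa)
{
    return phys_mem[(pa / 4u) % PHYS_WORDS];
}

/*
 * Translate va using the first-level table at physical address ttb.
 * Returns true and stores the result in *pa, or false on a fault or on a
 * descriptor type this sketch does not model (fine tables, tiny pages).
 */
static bool walk(uint32_t ttb, uint32_t va, uint32_t *pa)
{
    /* First level: 4096 entries indexed by VA[31:20], table 16 KB aligned. */
    uint32_t l1 = phys_read32((ttb & 0xFFFFC000u) + ((va >> 20) << 2));

    if ((l1 & 3u) == 2u) {          /* 1 MB section */
        *pa = (l1 & 0xFFF00000u) | (va & 0x000FFFFFu);
        return true;
    }
    if ((l1 & 3u) != 1u)            /* fault or unmodelled type */
        return false;

    /* Second level (coarse): 256 entries indexed by VA[19:12]. */
    uint32_t l2 = phys_read32((l1 & 0xFFFFFC00u) + (((va >> 12) & 0xFFu) << 2));

    if ((l2 & 3u) == 1u) {          /* 64 KB large page */
        *pa = (l2 & 0xFFFF0000u) | (va & 0x0000FFFFu);
        return true;
    }
    if ((l2 & 3u) == 2u) {          /* 4 KB small page */
        *pa = (l2 & 0xFFFFF000u) | (va & 0x00000FFFu);
        return true;
    }
    return false;                   /* fault or unmodelled type */
}
```

Calling something like `walk` on the chip's virtual base address at every interrupt and printing the result would show whether the translation changes between interrupt 5 and interrupt 6.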
I can hardly believe that I am the only one with this problem, since I am not doing anything special. The only possibility I see is that some kernel build config setting of mine differs from what other people use. Could somebody with a working 2.6.22.6 kernel please mail me their config file (or post it on the list here) so I can compare? It would make me more than happy.
Cheers,
Tom