Most people would agree that the x86 design is full of legacy junk. But to truly understand this, I think one has to dive in and see for oneself. I’d like to talk about my little journey of discovery, in which I learnt the horrors of i8086 legacy.
Roughly three weeks ago, I decided it would be a nice experiment to pick GRUB 2 and make an i386 firmware out of it. GRUB can already run as a standalone bootloader and be part of your firmware when you combine it with coreboot (which initializes the motherboard), but I wanted to have an easy way to test this standalone mode in QEMU. The result (which, btw, is packaged in Debian as grub-firmware-qemu) behaves in exactly the same way a coreboot/GRUB would (except, of course, that it will only work in QEMU).
Initially I thought this would be a piece of cake. In QEMU there’s no motherboard to initialize, so basically the steps would be:
– Process the VGA rom with a far call.
– Switch to protected (i386) mode.
– Done! Jump to grub_main() and start as usual.
Hah! So far from reality. First of all, we start with code segment 0xf000, offset 0xfff0, which corresponds to linear address 0xffff0 (segment × 16 + offset). Our ROM is memory-mapped in the 0xf0000-0x100000 range, so we’re exactly 16 bytes before the end of our code. With no room for anything, all we can do is jump.
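As a quick sanity check of that arithmetic, here is a minimal Python sketch. The function name `real_mode_linear` is made up for illustration; the numbers are the ones quoted above.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address the i8086 forms from a 16-bit segment:offset pair."""
    return (segment << 4) + offset

reset_vector = real_mode_linear(0xF000, 0xFFF0)
window_end = 0x100000  # first address past the 0xf0000-0x100000 window

print(hex(reset_vector))          # 0xffff0
print(window_end - reset_vector)  # 16
```

Sixteen bytes is barely enough for a jump instruction, which is why the entry point can do nothing else.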
Not so bad, right? Let’s jump to the beginning of our whole ROM image, and put the initialization code there?
No way. The 0xf0000-0x100000 range in which we’re mapped is just 64 kiB in size, and our image might be bigger (we generate it dynamically with grub-mkimage, and can even include an embedded filesystem). Only the high 64 kiB are mapped there. The rest of our code sits near the top of the 4 GiB address space, which we can’t reach yet because we’re still in i8086 mode (and 640 kiB are enough for everybody, remember?).
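To make the truncation concrete, the sketch below computes which part of an image shows up in the low window. The 192 kiB image size and the helper name `visible_slice` are assumptions picked for illustration; the real size depends on what grub-mkimage was asked to include.

```python
KIB = 1024
WINDOW_SIZE = 64 * KIB  # the 0xf0000-0x100000 window

def visible_slice(rom_size: int) -> tuple:
    """Image offsets [start, end) that appear in the low 64 kiB window."""
    start = max(rom_size - WINDOW_SIZE, 0)
    return start, rom_size

rom_size = 192 * KIB  # hypothetical image size
start, end = visible_slice(rom_size)

print(hex(start), hex(end))          # 0x20000 0x30000
print(start, "bytes are out of reach in i8086 mode")
```

With this example size, the first 128 kiB of the image cannot be seen from the low window at all, so nothing placed at the beginning of the image can serve as a jump target yet.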
I opted for creating a small image with entry code, boot.img, using a hardcoded size (512 bytes). This image is later picked up by grub-mkimage and placed at the end of our ROM. So we do a relative jump to the beginning of this image:
. = GRUB_BOOT_MACHINE_SIZE - 16
. = GRUB_BOOT_MACHINE_SIZE
and proceed with (finally!) processing the VGA rom:
/* Process VGA rom. */
call $0xc000, $0x3
and switching to 32-bit i386 mode:
/* Transition to protected mode. We use pushl to force generation
of a flat return address. */
DATA32 jmp real_to_prot
But before we leave boot.img, we need to figure out where’s the rest of our code. It’s not relative to our current location because, ugh, the beginning of our ROM was truncated.
We know it’s mapped at the top of memory, and for the sake of simplicity (which was greatly missed in this experience), its 32-bit entry point is at the beginning of it. So we only need to subtract the ROM size from the 4 GiB boundary. But all this was already known by grub-mkimage when generating our ROM. And it was kind enough to embed this address in a variable:
movl grub_core_entry_addr, %edx
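The value grub-mkimage embeds follows directly from the subtraction described above. A rough sketch, again assuming a hypothetical 192 kiB image (the helper name `core_entry_addr` is invented here and is not a grub-mkimage function):

```python
FOUR_GIB = 1 << 32

def core_entry_addr(rom_size: int) -> int:
    """Address of the first ROM byte when the image ends exactly at 4 GiB."""
    return FOUR_GIB - rom_size

rom_size = 192 * 1024  # hypothetical image size
print(hex(core_entry_addr(rom_size)))  # 0xfffd0000
```

Since the ROM size is a multiple of 64 kiB, the result always has its low 16 bits clear.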
Problem is, our toolchain puts the BSS right after our code, which ends really close to the 4 GiB limit. It might not even fit in memory! There’s a chance that it might fit, depending on the size of our module selection (GRUB modules are placed right after the main body of code), but there’s no guarantee! Isn’t the top of memory a practical location?
So let’s relocate elsewhere. Recipe for relocation: current location, destination address, size. Our destination address is somewhat arbitrary; we just pick whatever we used at link time. We’ve known our size ever since grub-mkimage generated this ROM, so we arranged to have it embedded in a variable, like we did for boot.img:
Whoops, too bad, we can’t even read it, because… memory access is always absolute, and we don’t know its absolute location, so we need to make this position-independent in some way. Fortunately, we know that ROM size is a multiple of 64 kiB, so we obtain %eip and round it:
/* Relocate to low memory. First we figure out our location.
We will derive the rom start address from it. */
1: popl %esi
/* Rom size is a multiple of 64 kiB. With this we get the
value of `grub_core_entry_addr' in %esi. */
xorw %si, %si
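The rounding trick can be checked with a few lines of Python. This sketch assumes the instruction runs within the first 64 kiB of the ROM (which holds if we have just jumped to the entry point at its beginning); the ROM base and the 0x25 offset are made-up example values.

```python
def rom_start_from_eip(eip: int) -> int:
    """Clear the low 16 bits, as `xorw %si, %si` does to %esi."""
    return eip & 0xFFFF0000

rom_base = 0xFFFD0000   # hypothetical: 4 GiB minus a 192 kiB ROM
eip = rom_base + 0x25   # hypothetical offset of the `1:` label

print(hex(rom_start_from_eip(eip)))  # 0xfffd0000
```

This only works because the ROM start is 64 kiB-aligned; an arbitrary ROM size would leave stray low bits in the base address.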
At last! We can read grub_kernel_image_size:
/* … which allows us to access `grub_kernel_image_size'
before relocation. */
movl (grub_kernel_image_size - _start)(%esi), %ecx
and then proceed to relocate,
movl $_start, %edi
ljmp $GRUB_MEMORY_MACHINE_PROT_MODE_CSEG, $1f
zero the BSS, and jump to grub_main():
/* Call the start of main body of C code. */
the rest is business as usual.
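The relocation sequence above can be modelled in Python as a simplified sketch, not the actual GRUB code: the size field is read relative to the ROM base (the position-independent access), the image is copied out, and a zeroed BSS is appended. The `relocate` function, the field offset and the BSS length are all hypothetical.

```python
import struct

def relocate(rom: bytes, size_field_offset: int, bss_size: int) -> bytearray:
    """Copy the image out of ROM and append a zero-filled BSS.

    rom               -- bytes as seen starting at the ROM base (%esi)
    size_field_offset -- grub_kernel_image_size - _start, fixed at link time
    bss_size          -- hypothetical BSS length
    """
    # Base-relative read: works wherever the ROM happens to be mapped.
    (image_size,) = struct.unpack_from("<I", rom, size_field_offset)
    ram = bytearray(rom[:image_size])  # copy to the link-time location
    ram += bytes(bss_size)             # BSS starts out as zeros
    return ram

# Toy ROM: 8 bytes of "code", a 4-byte size field saying 16, 4 more bytes
# of image, then 16 bytes that are not part of the image being copied.
rom = b"\x90" * 8 + struct.pack("<I", 16) + b"\x90" * 4 + b"\xff" * 16
ram = relocate(rom, size_field_offset=8, bss_size=4)
print(len(ram))  # 20
```

The copy in RAM can then use absolute addresses freely, because it sits exactly where the linker assumed it would.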
So, was it so hard to just map the damn thing at a fixed address, say, 0xf0000, without truncating it or using weird memory locations, and use this same address as entry point?
I think I learnt my lesson: never underestimate what 30 years of legacy constraints can do to your sanity. Well, for what it’s worth, it was a nice learning experience, with a byproduct you might find useful and/or interesting yourself.