How to run a program without an operating system?

use*_*609 227 x86 assembly operating-system osdev bootloader

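How do you run a program all by itself without an operating system running? Can you create assembly programs that the computer can load and run at startup, e.g. boot the computer from a flash drive and have it run the program that is on the drive?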

Cir*_*四事件 683

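Runnable examples

Let's create and run some minuscule bare metal hello world programs that run without an OS.

We will also try them out on the QEMU emulator as much as possible, as that is safer and more convenient for development. The QEMU tests were run on an Ubuntu 18.04 host with the pre-packaged QEMU 2.11.1.

The code of all the x86 examples below, and more, is present on this GitHub repo.

How to run the examples on x86 real hardware

Remember that running examples on real hardware can be dangerous, e.g. you could wipe your disk or brick the hardware by mistake: only do this on old machines that don't contain critical data! Or even better, use cheap semi-disposable devboards such as the Raspberry Pi, see the ARM example below.

For a typical x86 laptop, you have to do something like:

  1. Burn the image to a USB stick (this will destroy the data on it!):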

    sudo dd if=main.img of=/dev/sdX
    
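  2. Plug the USB stick into the computer.

  3. Turn it on.

  4. Tell it to boot from the USB.

    This means making the firmware pick USB before the hard disk.

    If that is not the default behavior of your machine, keep hitting Enter, F12, ESC or other such weird keys after power-on until you get a boot menu where you can select to boot from the USB.

    It is often possible to configure the search order in those menus.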
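
Because dd writes silently, it can help to read the image back and compare it before rebooting. The following is only a sketch and not part of the steps above: it uses a regular file, usb-stand-in.img (a hypothetical name), in place of /dev/sdX so that it is safe to try, and a small test image instead of a real boot sector. On a real stick you would substitute the device path and run the dd commands with sudo.

```shell
# Small 512-byte test image: hlt, padding, boot signature.
printf '\364%509s\125\252' '' > main.img

# Stand-in for the USB stick: a 1 MiB file of zeros (hypothetical name).
target=usb-stand-in.img
dd if=/dev/zero of="$target" bs=1024 count=1024 2>/dev/null

# Write the image to the start of the target without truncating the rest.
dd if=main.img of="$target" conv=notrunc 2>/dev/null

# Read back exactly as many bytes as the image has and compare them.
size=$(( $(wc -c < main.img) ))
if dd if="$target" bs="$size" count=1 2>/dev/null | cmp -s - main.img; then
    verified=yes
else
    verified=no
fi
echo "write verified: $verified"
```

The comparison only covers the first $size bytes, since the rest of the target is unrelated to the image.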

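For example, on my T430 I see the following.

After turning on, this is when I have to press Enter to enter the boot menu:

[image]

Then, here I have to press F12 to select the USB as the boot device:

[image]

From there, I can select the USB as the boot device like this:

[image]

Alternatively, to change the boot order and give the USB higher precedence, I would hit F1 on the "Startup Interrupt Menu" screen, and then navigate to:

[image]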

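Boot sector

On x86, the simplest and lowest level thing you can do is to create a Master Boot Sector (MBR), which is a type of boot sector, and then install it to a disk.

Here we create one with a single printf call: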

printf '\364%509s\125\252' > main.img
sudo apt-get install qemu-system-x86
qemu-system-x86_64 -hda main.img

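Outcome:

[image]

Note that even without doing anything, a few characters are already printed on the screen. Those are printed by the firmware, and serve to identify the system.

And on the T430 we just get a blank screen with a blinking cursor:

[image]

main.img contains the following:

  • \364 in octal == 0xf4 in hex: the encoding for a hlt instruction, which tells the CPU to stop working.

    Therefore our program will not do anything: only start and stop.

    We use octal because \x hex numbers are not specified by POSIX.

    We could obtain this encoding easily with: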

    echo hlt > a.S
    as -o a.o a.S
    objdump -S a.o
    

    which outputs:

    a.o:     file format elf64-x86-64
    
    
    Disassembly of section .text:
    
    0000000000000000 <.text>:
       0:   f4                      hlt
    

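    but it is also documented in the Intel manual of course.

  • %509s produces 509 spaces. Needed to fill in the file until byte 510.

  • \125\252 in octal == 0x55 followed by 0xaa.

    These are 2 required magic bytes which must be bytes 511 and 512.

    The BIOS goes through all our disks looking for bootable ones, and it only considers bootable those that have those two magic bytes.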

    If not present, the hardware will not treat this as a bootable disk.
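
Since opcodes and magic bytes are usually looked up in hexadecimal, the octal escapes can be derived rather than memorized. A small sketch that uses only printf itself (the variable names are just for illustration):

```shell
# Convert the hex byte values into the octal digits used in printf escapes.
hlt_octal=$(printf '%o' 0xf4)           # opcode of hlt
sig_octal=$(printf '%o %o' 0x55 0xaa)   # the two signature bytes

printf 'hlt: %s, signature: %s\n' "$hlt_octal" "$sig_octal"
# prints "hlt: 364, signature: 125 252"
```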

If you are not a printf master, you can confirm the contents of main.img with:

hd main.img

which shows the expected:

00000000  f4 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |.               |
00000010  20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
*
000001f0  20 20 20 20 20 20 20 20  20 20 20 20 20 20 55 aa  |              U.|
00000200

where 20 is a space in ASCII.

The BIOS firmware reads those 512 bytes from the disk, puts them into memory, and sets the PC to the first byte to start executing them.
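
The signature check described above can be mimicked from the shell. This is a sketch that assumes a POSIX od is available; has_boot_signature is a hypothetical helper name, not a standard tool, and nosig.img is a made-up counterexample:

```shell
# Succeed if the file carries 0x55 0xaa at offsets 510 and 511
# (bytes 511 and 512 when counting from 1).
has_boot_signature() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = 55aa ]
}

# Rebuild the minimal image, plus a variant without the magic bytes.
printf '\364%509s\125\252' '' > main.img
printf '\364%511s' '' > nosig.img

for f in main.img nosig.img; do
    if has_boot_signature "$f"; then
        echo "$f: boot signature present"
    else
        echo "$f: no boot signature"
    fi
done
```

Both files are 512 bytes long, so only the last two bytes decide the result.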

Hello world boot sector

Now that we have made a minimal program, let's move to a hello world.

The obvious question is: how to do IO? A few options:

  • ask the firmware, e.g. BIOS or UEFI, to do it for us
  • VGA: special memory region that gets printed to the screen if written to. Can be used in protected mode.
  • write a driver and talk directly to the display hardware. This is the "proper" way to do it: more powerful, but more complex.
  • serial port. This is a very simple standardized protocol that sends and retrieves characters from a host terminal.

    On desktops, it looks like this:

    [image]

    Source.

    It is unfortunately not exposed on most modern laptops, but is the common way to go for development boards, see the ARM examples below.

    This is really a shame, since such interfaces are really useful to debug the Linux kernel for example.

  • use debug features of chips. ARM calls theirs semihosting for example. On real hardware, it requires some extra hardware and software support, but on emulators it can be a free convenient option. Example.
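
To make the VGA option a bit more concrete: in the standard 80x25 color text mode the buffer starts at physical address 0xb8000, and each cell takes two bytes, the character followed by an attribute byte. The arithmetic can be sketched as below. The variable names are made up for illustration, and nothing is actually written to video memory:

```shell
vga_base=$((0xb8000))   # start of the text buffer
cols=80                 # cells per row in 80x25 mode
row=1; col=5            # zero-based cell we want to address

# Each cell is 2 bytes: character, then attribute.
addr=$(( vga_base + (row * cols + col) * 2 ))

# Attribute: background colour in the high nibble, foreground in the low one.
# 0x0 is black, 0x7 is light grey.
attr=$(( (0x0 << 4) | 0x7 ))

# 'H' is 0x48. Seen as a little-endian 16-bit word, the attribute is the high byte.
cell=$(( (attr << 8) | 0x48 ))

printf 'address: 0x%x, cell word: 0x%04x\n' "$addr" "$cell"
# prints "address: 0xb80aa, cell word: 0x0748"
```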

Here we will do a BIOS example as it is simpler on x86. But note that it is not the most robust method.

main.S

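.code16
    /* %si points to the next character of the message. */
    mov $msg, %si
    /* BIOS int 0x10 function 0x0e: teletype output of the character in %al. */
    mov $0x0e, %ah
loop:
    /* Load the byte at %ds:%si into %al and advance %si. */
    lodsb
    /* A zero byte marks the end of the string. */
    or %al, %al
    jz halt
    int $0x10
    jmp loop
halt:
    /* Mask interrupts so that they cannot wake the CPU up after hlt. */
    cli
    hlt
msg:
    .asciz "hello world"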

GitHub upstream.

link.ld

SECTIONS
{
    /* The BIOS loads the code from the disk to this location.
     * We must tell that to the linker so that it can properly
     * calculate the addresses of symbols we might jump to.
     */
    . = 0x7c00;
    .text :
    {
        __start = .;
        *(.text)
        /* Place the magic boot bytes at the end of the first 512-byte sector. */
        . = 0x1FE;
        SHORT(0xAA55)
    }
}

Assemble and link with:

as -g -o main.o main.S
ld --oformat binary -o main.img -T link.ld main.o
qemu-system-x86_64 -hda main.img

Outcome:

[image]

And on the T430:

[image]

Tested on: Lenovo Thinkpad T430, UEFI BIOS 1.16. Disk generated on an Ubuntu 18.04 host.

Besides the standard userland assembly instructions, we have:

  • .code16: tells GAS to output 16-bit code

  • cli: disable maskable hardware interrupts. Those could make the processor start running again after the hlt

  • int $0x10: does a BIOS call. This is what prints the characters one by one.

The important link flags are:

  • --oformat binary: output raw binary machine code, don't wrap it inside an ELF file as is the case for regular userland executables.

To better understand the linker script part, familiarize yourself with the relocation step of linking: What do linkers do?

Cooler x86 bare metal programs

Here are a few more complex bare metal setups that I've achieved:

Use C instead of assembly

Summary: use GRUB multiboot, which will solve a lot of annoying problems you never thought about. See the section below.

The main difficulty on x86 is that the BIOS only loads 512 bytes from the disk to memory, and you are likely to blow up those 512 bytes when using C!

To solve that, we can use a two-stage bootloader. This makes further BIOS calls, which load more bytes from the disk into memory. Here is a minimal stage 2 assembly example from scratch using the int 0x13 BIOS call.

Alternatively:

  • if you only need it to work in QEMU but not real hardware, use the -kernel option, which loads an entire ELF file into memory. Here is an ARM example I've created with that method.
  • for the Raspberry Pi, the default firmware takes care of the image loading for us from a file named kernel7.img, much like QEMU -kernel does.
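
For the two-stage approach, the second stage simply lives in the sectors that follow the boot sector, which is where an int 0x13 read would fetch it from. Below is a sketch of laying out such an image with placeholder contents instead of real code. stage1.bin, stage2.bin and disk.img are hypothetical file names:

```shell
# Placeholder stage 1: 510 bytes of padding plus the boot signature.
printf '%510s\125\252' '' > stage1.bin
# Placeholder stage 2: a recognizable marker instead of real machine code.
printf 'STAGE2' > stage2.bin

# The first sector (bytes 0..511) holds stage 1.
cp stage1.bin disk.img

# The second sector starts at byte 512. seek=1 skips one 512-byte block on
# output, and conv=sync pads the short input up to a full sector with NULs.
# BIOS CHS addressing counts sectors from 1, so int 0x13 calls this sector 2.
dd if=stage2.bin of=disk.img bs=512 seek=1 conv=notrunc,sync 2>/dev/null

# Show what sits at the start of the second sector.
marker=$(dd if=disk.img bs=1 skip=512 count=6 2>/dev/null)
echo "second sector starts with: $marker"
```

Padding to whole sectors keeps the image a multiple of 512 bytes, which is the unit a sector read works in.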

For educational purposes only, here is a one stage minimal C example:

main.c

void main(void) {
    int i;
    char s[] = {'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'};
    for (i = 0; i < sizeof(s); ++i) {
        __asm__ (
            "int $0x10" : : "a" ((0x0e << 8) | s[i])
        );
    }
    while (1) {
        __asm__ ("hlt");
    };
}

entry.S

.code16
.text
.global mystart
mystart:
    ljmp $0, $.setcs
.setcs:
    xor %ax, %ax
    mov %ax, %ds
    mov %ax, %es
    mov %ax, %ss
    mov $__stack_top, %esp
    cld
    call main

linker.ld

ENTRY(mystart)
SECTIONS
{
  . = 0x7c00;
  .text : {
    entry.o(.text)
    *(.text)
    *(.data)
    *(.rodata)
    __bss_start = .;
    /* COMMON vs BSS: https://stackoverflow.com/questions/16835716/bss-vs-common-what-goes-where */
    *(.bss)
    *(COMMON)
    __bss_end = .;
  }
  /* https://stackoverflow.com/questions/53584666/why-does-gnu-ld-include-a-section-that-does-not-appear-in-the-linker-script */
  .sig : AT(ADDR(.text) + 512 - 2)
  {
      SHORT(0xaa55);
  }
  /DISCARD/ : {
    *(.eh_frame)
  }
  __stack_bottom = .;
  . = . + 0x1000;
  __stack_top = .;
}

run

set -eux
as -ggdb3 --32 -o entry.o entry.S
gcc -c -ggdb3 -m16 -ffreestanding -fno-PIE -nostartfiles -nostdlib -o main.o -std=c99 main.c
ld -m elf_i386 -o main.elf -T linker.ld entry.o main.o
objcopy -O binary main.elf main.img
qemu-system-x86_64 -drive file=main.img,format=raw

C standard library

Things get more fun if you also want to use the C standard library however, since we don't have the Linux kernel, which implements much of the C standard library functionality through POSIX.

A few possibilities, without going to a full-blown OS like Linux, include:

  • Write your own. It's just a bunch of headers and C files in the end, right? Right??

  • Newlib

    Detailed example at: https://electronics.stackexchange.com/questions/223929/c-standard-libraries-on-bare-metal/223931

    Newlib implements all the boring non-OS specific things for you, e.g. memcmp, memcpy, etc.

    Then, it provides some stubs for you to implement the syscalls that you need yourself.

    For example, we can implement exit() on ARM through semihosting with:

    void _exit(int status) {
        __asm__ __volatile__ ("mov r0, #0x18; ldr r1, =#0x20026; svc 0x00123456");
    }
    

    as shown in this example.

    For example, you could redirect printf to the UART on ARM systems, or implement exit() with semihosting.

  • embedded operating systems like FreeRTOS and Zephyr.

    Such operating systems typically allow you to turn off pre-emptive scheduling, therefore giving you full control over the runtime of the program.

    They can be seen as a sort of pre-implemented Newlib.

GNU GRUB Multiboot

Boot sectors are simple, but they are not very convenient:

  • you can only have one OS per disk
  • the load code has to be really small and fit into 512 bytes
  • you have to do a lot of startup yourself, like moving into protected mode

It is for those reasons that GNU GRUB created a more convenient file format called multiboot.

Minimal working example: https://github.com/cirosantilli/x86-bare-metal-examples/tree/d217b180be4220a0b4a453f31275d38e697a99e0/multiboot/hello-world

I also use it on my GitHub examples repo to be able to easily run all examples on real hardware without burning the USB a million times.

QEMU outcome:

[image]

T430:

[image]

If you prepare your OS as a multiboot file, GRUB is then able to find it inside a regular filesystem.

This is what most distros do, putting OS images under /boot.

Multiboot files are basically an ELF file with a special header. They are specified by GRUB at: https://www.gnu.org/software/grub/manual/multiboot/multiboot.html

You can turn a multiboot file into a bootable disk with grub-mkrescue.
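
The special header mentioned above is small. In the Multiboot 1 specification it begins with three 32-bit fields: a magic number, a flags word, and a checksum chosen so that the three add up to zero modulo 2^32. A sketch of that arithmetic follows, with flags set to 0 purely as an example value and assuming a shell with 64-bit arithmetic:

```shell
magic=$((0x1BADB002))   # Multiboot 1 header magic
flags=0                 # example value: request no optional features

# The checksum must make magic + flags + checksum == 0 (mod 2^32).
checksum=$(( (0x100000000 - (magic + flags)) & 0xFFFFFFFF ))

printf 'magic=0x%08X flags=0x%08X checksum=0x%08X\n' "$magic" "$flags" "$checksum"

# Sanity check of the invariant.
sum=$(( (magic + flags + checksum) & 0xFFFFFFFF ))
echo "sum mod 2^32: $sum"
```

In a real kernel these three values would be emitted as data words near the start of the image, typically from the assembly entry file.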

Firmware

In truth, your boot sector is not the first software that runs on the system's CPU.

What actually runs first is the so-called firmware, which is software:

  • made by the hardware manufacturers
  • typically closed source but likely C-based
  • stored in read-only memory, and therefore harder/impossible to modify without the vendor's consent.

Well known firmwares include:

  • BIOS: old all-present x86 firmware. SeaBIOS is the default open source implementation used by QEMU.
  • UEFI: BIOS successor, better standardized and more capable, but incredibly bloated.
  • Coreboot: the noble cross arch open source attempt

The firmware does things like:

  • loop over each hard disk, USB, network, etc. until you find something bootable.

    When we run QEMU, -hda says that main.img is a hard disk connected to the hardware, and hda is the first one to be tried, and it is used.

  • load the first 512 bytes to RAM memory address 0x7c00, put the CPU's RIP there, and let it run

  • show things like the boot menu or BIOS print calls on the display

Firmware offers OS-like functionality on which most OS-es depend. E.g. a Python subset has been ported to run on BIOS/UEFI: https://www.youtube.com/watch?v=bYQ_lq5dcvM

It can be argued that firmwares are indistinguishable from OSes, and that firmware is the only "true" bare metal programming one can do.

As this Coreboot dev puts it:

The hard part

When you power up a PC, the chips that make up the chipset (northbridge, southbridge and SuperIO) are not yet initialized properly. Even though the BIOS ROM is as far removed from the CPU as it could be, this is accessible by the CPU, because it has to be, otherwise the CPU would have no instructions to execute. This does not mean that BIOS ROM is completely mapped, usually not. But just enough is mapped to get the boot process going. Any other devices, just forget it.

When you run Coreboot under QEMU, you can experiment with the higher layers of Coreboot and with payloads, but QEMU offers little opportunity to experiment with the low level startup code. For one thing, RAM just works right from the start.

Post BIOS initial state

Like many things in hardware, standardization is weak, and one of the things you should not rely on is the initial state of registers when your code starts running after BIOS.

So do yourself a favor and use some initialization code like the following: /sf/answers/2275668881/

Registers like %ds and %es have important side effects, so you should zero them out even if you are not using them explicitly.
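
The reason the segment registers matter is that in 16-bit real mode every memory access goes through one of them: the physical address is the segment value times 16 plus the offset. The link.ld script above places symbols at offsets starting from 0x7c00, which is only right if %ds is 0. A sketch of the arithmetic, where phys is a hypothetical helper and the segment values are examples:

```shell
# physical address = segment * 16 + offset
phys() { echo $(( ($1 << 4) + $2 )); }

# Two different segment:offset pairs that both name the boot sector load address.
a=$(phys 0x0000 0x7c00)
b=$(phys 0x07c0 0x0000)

# If %ds happened to hold 0x07c0 while the code was linked for 0x7c00,
# a symbol at offset 0x7c10 would be read from the wrong place.
wrong=$(phys 0x07c0 0x7c10)

printf '0000:7c00 -> 0x%x\n07c0:0000 -> 0x%x\n07c0:7c10 -> 0x%x\n' "$a" "$b" "$wrong"
```

The first two lines print the same address, while the third lands at 0xf810, far away from the message the code meant to read.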

Note that some emulators are nicer than real hardware and give you a nice initial state. Then when you go run on real hardware, everything breaks.

El Torito

Format that can be burnt to CDs: https://en.wikipedia.org/wiki/El_Torito_%28CD-ROM_standard%29

It is also possible to produce a hybrid image that works on either ISO or USB. This can be done with grub-mkrescue (example), and is also done by the Linux kernel on make isoimage using isohybrid.

ARM

In ARM, the general ideas are the same.

There is no widely available semi-standardized pre-installed firmware like BIOS for us to use for the IO, so the two simplest types of IO that we can do are:

  • serial, which is widely available on devboards
  • blink the LED

I have uploaded:


小智 147

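How do you run a program all by itself without an operating system running?

You place your binary code where the processor looks for it after a reset (e.g. address 0 on ARM).

Can you create assembly programs that the computer can load and run at startup (e.g. boot the computer from a flash drive and have it run the program that is on the drive)?

General answer to the question: it can be done. It's often referred to as "bare metal programming". To read from a flash drive, you need to know what USB is, and you need some driver to work with it. The program on this drive would also have to be in some particular format, on some particular filesystem... This is something that boot loaders usually do, but your program could include its own bootloader so that it is self-contained, if the firmware will only load a small block of code.

Many ARM boards let you do some of those things. Some have boot loaders to help you with basic setup.

Here you may find a great tutorial on how to do a basic operating system on a Raspberry Pi.

Edit: This article, and the whole wiki.osdev.org, will answer most of your questions: http://wiki.osdev.org/Introduction

Also, if you don't want to experiment directly on hardware, you can run it as a virtual machine using hypervisors like qemu. See how to run "hello world" directly on virtualized ARM hardware here.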


sim*_*eco 7

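Operating System as Inspiration

The operating system is also a program, so we can also create our own program by writing one from scratch or by changing (limiting or adding) features of one of the small operating systems, and then run it during the boot process (using an ISO image).

For example, this page can be used as a starting point:

How to write a simple operating system

Here, the entire operating system fits in a 512-byte boot sector (MBR)!

Such a simple OS, or a similar one, can be used to create a simple framework that will allow us to:

make the bootloader load subsequent sectors on the disk into RAM, and jump to that point to continue execution. Or you could read up on FAT12, the filesystem used on floppy drives, and implement that.

There are many possibilities, however. For example, to see a bigger x86 assembly language OS we can explore MikeOS, an x86 operating system that is a learning tool to show how simple 16-bit, real-mode OSes work, with well-commented code and extensive documentation.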

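Boot Loader as Inspiration

Another common type of program that runs without an operating system is the boot loader. We can create a program inspired by this concept, for example using this site:

How to develop your own Boot Loader

The above article also presents the basic architecture of such programs:

  1. Correct loading into memory at the address 0000:7C00.
  2. Calling the BootMain function, which is developed in a high-level language.
  3. Showing the "Hello, world…, from low-level" message on the display.

As we can see, this architecture is very flexible and allows us to implement any program, not necessarily a boot loader.

In particular, it shows how to use the "mixed code" technique, which makes it possible to combine high-level constructs (from C or C++) with low-level commands (from Assembler). This is a very useful method, but we have to remember that:

to build the program and obtain an executable file you will need the compiler and linker of Assembler for 16-bit mode. For C/C++ you will need only a compiler that can create object files for 16-bit mode.

The article also shows how to see the created program in action and how to test and debug it.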

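UEFI applications as Inspiration

The examples above rely on the MBR sector being loaded from the data medium. However, we can go deeper, for example by playing with UEFI applications:

Beyond loading an OS, UEFI can run UEFI applications, which reside as files on the EFI System Partition. They can be executed from the UEFI command shell, by the firmware's boot manager, or by other UEFI applications. UEFI applications can be developed and installed independently of the system manufacturer.

A type of UEFI application is an OS loader such as GRUB, rEFInd, Gummiboot, and Windows Boot Manager, which loads an OS file into memory and executes it. Also, an OS loader can provide a user interface to allow the selection of another UEFI application to run. Utilities like the UEFI shell are also UEFI applications.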
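
As a sketch of what "reside as files on the EFI System Partition" can look like in practice: for removable media on x86-64, firmware conventionally falls back to the path EFI/BOOT/BOOTX64.EFI on a FAT-formatted partition. The snippet below only builds that directory layout in an ordinary directory. esp-root is a made-up directory name, and hello.efi is a hypothetical, empty stand-in for a real UEFI application:

```shell
# Directory that stands in for the root of the EFI System Partition.
esp=esp-root
mkdir -p "$esp/EFI/BOOT"

# Hypothetical application binary. A real one would be a PE/COFF image.
: > hello.efi

# Fallback boot path for x86-64 removable media.
cp hello.efi "$esp/EFI/BOOT/BOOTX64.EFI"

# List the resulting layout.
find "$esp" -type f
```

Copying the same tree onto a FAT-formatted USB stick would be the next step on real hardware, which this sketch does not attempt.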

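If we would like to start creating such programs, we can, for example, start with these websites:

Programming for EFI: Creating a "Hello, World" Program / UEFI Programming - First Steps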

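Exploring security issues as Inspiration

It is well known that there is a whole group of malicious software (which are programs too) that runs before the operating system starts.

A large group of them operates on the MBR sector or as UEFI applications, just like all the solutions above, but there are also those that use another entry point such as the Volume Boot Record (VBR) or the BIOS:

There are at least four known BIOS attack viruses, two of which were for demonstration purposes.

or perhaps another one too.

Attacks before system startup

Bootkits have evolved from proof-of-concept development to mass distribution and have now effectively become open-source software.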

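Different ways to boot

I also think that in this context it is worth mentioning that there are various ways of booting an operating system (or an executable program intended for this). There are many, but I would like to draw attention to loading the code from the network using the Network Boot option (PXE), which allows us to run a program on the computer regardless of its operating system, and even regardless of any storage medium that is directly connected to the computer:

What Is Network Booting (PXE) and How Can You Use It?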

