System Learning Daily 2

Posted on 2019-06-20

X86 Calling Convention

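The previous post ended with a question: when a function is called on x86, how are the arguments passed and how is the stack managed? All of this falls under calling conventions [1].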

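Modifying _terminal_xxx

The main sweeping change is that function arguments are now passed directly in registers instead of on the stack.

Added _terminal_putchar, which supports screen scrolling.

Changed _terminal_show to use _terminal_putchar to display characters.

The implementation is as follows: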

.global _terminal_putchar
.type _terminal_putchar STT_FUNC
/*!
    \brief  put a char on screen
            it will read content in TER_ROW and TER_COL to get correct offset
    \param  al the char to put
*/
_terminal_putchar:
    cmp al, '\n'
    je tp_s1

// the char is not newline
tp_s0:
    push ax
    mov ax, [TER_ROW]
    call _terminal_getoffset
    shl ax
    mov bx, ax
    mov ax, VRAM
    mov es, ax
    pop ax
    mov ah, DEFCOLOR
    mov es:[bx], ax
    cmp byte ptr [TER_COL], WIDTH - 1
    je tp_s1
    inc byte ptr ds:[TER_COL]
    ret

// the cursor should be at next line.
tp_s1:
    cmp byte ptr [TER_ROW], HEIGHT - 1
    jne tp_s2

// should scroll
    push ds
    call _terminal_scroll
    pop ds
    mov bl, HEIGHT - 1
    call _terminal_clearline // clear line at bottom
    sub byte ptr [TER_ROW], 1

//should not scroll
tp_s2:
    inc byte ptr [TER_ROW]
    mov byte ptr [TER_COL], 0
    mov ax, [TER_ROW]
    call _terminal_getoffset
    call _terminal_setcur
    ret

.global _terminal_scroll
.type _terminal_scroll STT_FUNC
/*!
    \brief  scroll the screen one line up
            it will not modify the TER_ROW and TER_COL
            it will not clear the line at bottom
            WARNING: it will change "ds" register
*/
_terminal_scroll:
    mov cx, (HEIGHT - 1) * WIDTH // words to move: one word per character cell
    mov ax, VRAM
    cld
    mov ds, ax
    mov es, ax
    mov si, WIDTH * 2
    mov di, 0
    rep movsw
    ret
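
To check the arithmetic behind the offset and the scroll, here is a small Python model of the text buffer. It is only a sketch: WIDTH = 80 and HEIGHT = 25 are assumed values (the post does not show the definitions), the bytearray stands in for VRAM, and cell_offset / scroll_up are hypothetical names. Each character cell takes two bytes (character plus attribute), so moving HEIGHT - 1 rows up means moving (HEIGHT - 1) * WIDTH words.

```python
WIDTH = 80    # assumed columns per row
HEIGHT = 25   # assumed rows on screen
CELL = 2      # bytes per cell: character byte + attribute byte


def cell_offset(row: int, col: int) -> int:
    """Byte offset of a cell from the start of the text buffer."""
    return (row * WIDTH + col) * CELL


def scroll_up(vram: bytearray) -> int:
    """Move rows 1..HEIGHT-1 up by one row; return the number of words moved.

    Like the assembly routine, the bottom row is left untouched.
    """
    row_bytes = WIDTH * CELL
    words = (HEIGHT - 1) * WIDTH          # one 16-bit word per cell
    # The right-hand slice is copied first, so the overlap is harmless.
    vram[0:words * CELL] = vram[row_bytes:row_bytes + words * CELL]
    return words


if __name__ == "__main__":
    vram = bytearray(WIDTH * HEIGHT * CELL)
    vram[cell_offset(1, 0)] = ord("A")    # put 'A' at row 1, column 0
    moved = scroll_up(vram)
    print(moved, chr(vram[cell_offset(0, 0)]))
```

The multiplication by CELL in cell_offset corresponds to the shl ax after _terminal_getoffset in the assembly.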

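Starting to Use a Makefile

I am not using CMake because building a mix of assembly and C source files with it is awkward (I am just not good enough at it). Make is more direct. Each subfolder runs Make to produce a bin file; the root folder combines the bin files and copies the result to the build folder. I will not write out the details, since there is nothing technically interesting in them.

About the “call” Instruction

GNU AS, Intel Syntax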


.section .data

_buf1:
.word 0x10

_buf2:
.word 0x20

_buf3:
.word 0xCC

.type _func STT_FUNC
_func:
    xor ax, ax
    ret

.section .text
_start:

    mov ax, _buf1
    mov bx, _buf2
    mov cx, _buf3

At this point ax, bx, and cx hold 0x10, 0x20, and 0xCC respectively. Here GAS moves the contents of memory, not the address. Next:

mov ax, OFFSET _buf1
mov bx, OFFSET _buf2
mov cx, OFFSET _buf3

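Assuming the .data section starts at 0x00, ax, bx, and cx will now be 0x00, 0x02, and 0x04. In other words, OFFSET gives us the address of a label. (Open question: does OFFSET yield an offset relative to the segment, or an absolute address? It appears to be the absolute address.)

Now consider the call instruction:

call _func

This obviously calls _func directly. Now look at the next example: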

mov bx, OFFSET _func
mov _buf1, bx
call _buf1
call [_buf1]

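Now _buf1 holds the address of _func. Of the two forms above, the first is wrong and the second is right. The first is simply call label, which treats _buf1 itself as the call target; the second is what we actually want: “fetch the address stored in _buf1 and jump to it.”

This shows that GAS treats a label in a mov instruction differently from a label in a call instruction: in mov a bare label means the memory contents, while in call it means the address to transfer to.

This problem really did puzzle me for quite a while.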
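
The difference shows up in the machine code. The Python sketch below builds the two encodings for 16-bit code: a direct near call is opcode E8 followed by a 16-bit displacement relative to the next instruction, while an indirect near call through memory is opcode FF with a ModRM byte whose reg field is 2 (FF /2). The function names and the addresses used in the example are made up for illustration; only the opcode layout comes from the instruction set.

```python
def encode_call_direct(call_addr: int, target: int) -> bytes:
    """`call label`: opcode E8 followed by a 16-bit relative displacement."""
    next_ip = call_addr + 3                  # E8 plus two displacement bytes
    rel16 = (target - next_ip) & 0xFFFF      # wrap to 16 bits
    return bytes([0xE8]) + rel16.to_bytes(2, "little")


def encode_call_indirect(mem_addr: int) -> bytes:
    """`call [mem]`: opcode FF /2 with a ModRM byte selecting [disp16]."""
    modrm = (0b00 << 6) | (2 << 3) | 0b110   # mod=00, reg=/2, rm=110 -> 0x16
    return bytes([0xFF, modrm]) + (mem_addr & 0xFFFF).to_bytes(2, "little")


if __name__ == "__main__":
    # Hypothetical layout: the call sits at 0x0100, _func at 0x0006,
    # and _buf1 (holding the address of _func) at 0x0000.
    print(encode_call_direct(0x0100, 0x0006).hex())
    print(encode_call_indirect(0x0000).hex())
```

With call _buf1 the assembler emits the first form with _buf1 as the target, so the CPU would start executing the data word as if it were code. With call [_buf1] it emits the second form, and the CPU reads the target from memory at run time.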

About “jmp label”


_start:
    jmp label
    jmp short label
label:
    hlt
    jmp label

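A direct jmp label like this produces machine code that contains no absolute address, only a relative one. Here is an example borrowed from 《汇编语言》 [2] (p. 180, Figure 9.3, how the transfer displacement is computed):

Offset      Machine code            Assembly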
0000        40                  s:  inc ax
0001        EB03                    jmp s0
0003        BB0300                  mov bx, 3
0006        43                  s0: inc bx
0007        EBF7                    jmp s
0009        90                      nop

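Here the jmp s0 at offset 0001 assembles to EB03; the trailing 0x03 is the relative displacement.

Why 03? Shouldn't it be 0006 - 0001 = 5? The reason is that the 8086 updates IP as soon as the instruction has been fetched into its buffer, before the instruction executes. So when jmp s0 runs, IP is already 0x0003, and the assembler stores the distance from there to the target: 0006 - 0003 = 3. The later 0xF7 is the two's complement of -9; at that point IP is 0009, and 0009 - 9 = 0000, which lands exactly on the first instruction. On processors with a multi-stage pipeline things get more complicated, and I will not ramble on about that here [3].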
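
The displacement rule can be checked with a few lines of Python. This is a minimal sketch, and encode_short_jmp is a hypothetical helper name: it takes the offset of a short jmp and the offset of its target and returns the two bytes the assembler would emit, using the fact that the displacement is measured from the instruction that follows the jmp.

```python
def encode_short_jmp(jmp_addr: int, target: int) -> bytes:
    """Encode `jmp short target` located at jmp_addr as EB + 8-bit displacement."""
    next_ip = jmp_addr + 2            # a short jmp is two bytes long
    disp = target - next_ip           # distance from the next instruction
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, disp & 0xFF])  # negative values as two's complement


if __name__ == "__main__":
    print(encode_short_jmp(0x0001, 0x0006).hex().upper())  # jmp s0 from the table
    print(encode_short_jmp(0x0007, 0x0000).hex().upper())  # jmp s from the table
```

Feeding in the two jumps from the table reproduces the EB03 and EBF7 bytes shown there.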

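Summary

Earlier questions, resolved:

Question 1 from System Learning Daily 1: the answer is the title of this post.

New question:

How the different forms of call (near, far) pair up with ret and retf.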

Reference

[1] Wikipedia, “x86 calling conventions”

[2] 王爽 (Wang Shuang), 《汇编语言》 (Assembly Language)

[3] NCCGROUP, “ARM, Pipeline and GDB, Oh My!”

Updates

2019-6-21
Add more content and publish the post.