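MIT 6.828 JOS Lab 1 Report
Preface
This is the lab report I wrote for my university's Operating Systems Practicum (honors track). It briefly records the basic ideas behind each Exercise and Challenge of the JOS labs (not every Challenge is covered; I did roughly 1 to 3 per lab). It is offered as a reference for anyone who needs it. If you are taking a related course (such as the same honors practicum), please do not copy it directly.
Source code: https://github.com/Light-of-Hers/mit-jos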
Exercise 1
Omitted.
Exercise 2
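Instructions executed before the jump to the bootloader:
[code listing not preserved]
The BIOS performs some initialization in ROM: checking RAM, initializing hardware, setting up the segment register ss and the stack pointer esp, and so on. It then jumps to the bootloader at RAM address 0x7c00 (which does not match what GDB displays).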
Exercise 3
Q1: At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
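The processor starts executing 32-bit code once ljmp $PROT_MODE_CSEG, $protcseg has been executed.
Enabling the A20 line is only a prerequisite. The switch from 16-bit to 32-bit mode happens when the boot loader, after enabling A20, sets the protected-mode enable (PE) bit in %cr0 and the ljmp reloads %cs with a 32-bit code segment. The relevant code:
[code listing not preserved]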
Q2: What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?
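The last statement executed by the bootloader, and its corresponding instruction:
[code listing not preserved]
The first instruction executed once the kernel has been loaded:
[code listing not preserved]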
Q3: Where is the first instruction of the kernel?
The first instruction the kernel executes is located at 0x10000c.
Q4: How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?
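- The e_phoff field of the ELF header gives the offset of the first program header (program segment header).
- The e_phnum field gives the number of program headers.
- The program headers are read in turn: the p_memsz field of each gives the size of the corresponding program segment, and from that the number of sectors to read is computed.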
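The last step of the list above can be sketched as plain arithmetic. This is only an illustration under assumptions: the helper names sectors_needed and first_sector are made up here, a 512-byte sector is assumed, and the kernel image is assumed to start at disk sector 1 right after the boot sector.

```c
#include <assert.h>
#include <stdint.h>

#define SECTSIZE 512u  /* assumed disk sector size in bytes */

/* Hypothetical helper: number of whole sectors that must be read to
 * cover `count` bytes destined for physical address `pa`, when reads
 * can only start on a sector boundary. */
uint32_t
sectors_needed(uint32_t pa, uint32_t count)
{
	uint32_t end = pa + count;
	uint32_t start = pa & ~(SECTSIZE - 1);  /* round down to a sector boundary */

	assert(end >= pa);  /* this sketch does not handle wrap-around */
	return (end - start + SECTSIZE - 1) / SECTSIZE;  /* round up */
}

/* Hypothetical helper: disk sector holding byte `offset` of the kernel
 * image, assuming the image begins at sector 1 (sector 0 is the boot
 * sector). */
uint32_t
first_sector(uint32_t offset)
{
	return offset / SECTSIZE + 1;
}
```

For a segment with p_memsz = 513 loaded at a sector-aligned address, sectors_needed returns 2: whole sectors are read even when only one byte of the last one is needed.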
Exercise 4
Omitted.
Exercise 5
Q: Identify the first instruction that would “break” or otherwise do the wrong thing if you were to get the boot loader’s link address wrong.
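The first instruction that goes wrong is:
[code listing not preserved]
The address of protcseg is not position independent: it is fixed at link time, whereas the BIOS always loads the bootloader at the same address (0x7c00). Changing the link address therefore makes this instruction jump to the wrong location.
Strictly speaking, the earlier lgdt gdtdesc instruction also loads the GDT from the wrong location, but its effect is not as immediate or direct as that of the ljmp.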
Exercise 6
Q: Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint?
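On entering the bootloader, those memory locations are all 0.
On entering the kernel, memory holds the following data:
[code listing not preserved]
This data is the beginning of the .text section loaded by the bootloader:
[two code listings not preserved]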
Exercise 7
Q: What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren’t in place?
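The first instruction to fail:
[code listing not preserved]
At this point %eax holds 0xf010002f. If the initial page table does not provide a proper mapping, the jump may go wrong.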
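To see why that address depends on the mapping, here is a small model of the translation the entry page table is meant to provide. It is a sketch, not kernel code: entry_translate and MAPPED_SIZE are names made up for illustration, and it assumes the layout described in the lab handout, where both [0, 4MB) and [KERNBASE, KERNBASE+4MB) map to physical [0, 4MB).

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE    0xF0000000u            /* kernel link base used by JOS */
#define MAPPED_SIZE (4u * 1024u * 1024u)   /* assumed: entry page table covers 4 MB */

/* Model of the entry mapping: returns 1 and stores the physical address
 * in *pa if `va` is mapped, or returns 0 if the access would fault. */
int
entry_translate(uint32_t va, uint32_t *pa)
{
	assert(pa != 0);
	if (va < MAPPED_SIZE) {            /* identity-mapped low region */
		*pa = va;
		return 1;
	}
	if (va >= KERNBASE && va - KERNBASE < MAPPED_SIZE) {
		*pa = va - KERNBASE;       /* high region mapped onto low memory */
		return 1;
	}
	return 0;
}
```

Under this model 0xf010002f translates to 0x0010002f, which is where the kernel code was actually loaded. Without the high mapping the same address translates to nothing.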
Exercise 8
We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form “%o”. Find and fill in this code fragment.
Modify the vprintfmt function in printfmt.c:
[code listing not preserved]
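Independently of the actual change to vprintfmt, the conversion behind "%o" can be sketched as a standalone function. format_octal is a name made up for this illustration and is not part of JOS.

```c
#include <assert.h>
#include <stddef.h>

/* Write the base-8 digits of `num` into `out` as a NUL-terminated string.
 * Returns the number of digits written, or 0 if `out` is too small. */
size_t
format_octal(unsigned long long num, char *out, size_t outlen)
{
	char tmp[32];  /* a 64-bit value needs at most 22 octal digits */
	size_t n = 0;

	assert(out != NULL);
	do {
		tmp[n++] = (char)('0' + (num & 7));  /* low 3 bits form one octal digit */
		num >>= 3;
	} while (num != 0);

	if (n + 1 > outlen)
		return 0;
	for (size_t i = 0; i < n; i++)
		out[i] = tmp[n - 1 - i];  /* digits were produced least significant first */
	out[n] = '\0';
	return n;
}
```

In vprintfmt itself the same effect is usually obtained by fetching the argument as an unsigned number and reusing the existing number-printing path with base 8.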
Q1: Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
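console.c exports the cputchar function, which is used by the putch function in printf.c:
[code listing not preserved]
putch in printf.c is in turn passed as an argument to the vprintfmt function in printfmt.c.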
Q2: Explain the following from console.c:
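[code listing not preserved]
This code handles scrolling: when the current output position crt_pos reaches or exceeds the capacity of the screen, the screen contents are shifted up by one line and crt_pos is moved back accordingly, so that crt_pos lies within the screen again.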
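The scrolling behavior can be tried on a toy screen. This sketch uses a made-up 4x3 buffer and a made-up function name, scroll_if_needed. Only the idea (move everything up one row, blank the last row, pull the cursor back one row) is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COLS 4               /* tiny hypothetical screen, for illustration only */
#define ROWS 3
#define SIZE (COLS * ROWS)

/* Scroll `buf` up by one row if `pos` has run past the last cell.
 * Returns the (possibly adjusted) cursor position. */
int
scroll_if_needed(uint16_t buf[SIZE], int pos)
{
	assert(pos >= 0 && pos <= SIZE);
	if (pos >= SIZE) {
		/* Rows 1..ROWS-1 move up to rows 0..ROWS-2. */
		memmove(buf, buf + COLS, (SIZE - COLS) * sizeof(uint16_t));
		/* Blank the last row: attribute 0x07 with a space character. */
		for (int i = SIZE - COLS; i < SIZE; i++)
			buf[i] = 0x0700 | ' ';
		pos -= COLS;
	}
	return pos;
}
```

Because output advances one cell at a time, a single one-row scroll per character is enough to keep the cursor on screen.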
Q3: Answer the following questions:
[code listing not preserved]
Q3.1: In the call to cprintf(), to what does fmt point? To what does ap point?
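fmt points to the string "x %d, y %x, z %d\n", that is, the first argument, located at 8(%ebp). ap points to the variable argument list, that is, the second argument, located at 12(%ebp).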
Q3.2: List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.
[code listing not preserved]
Q4: What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here’s an ASCII table that maps bytes to characters.
The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?
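[code listing not preserved]
The output is He110 World.
57616 is e110 in hexadecimal, and the hexadecimal bytes 72, 6c, 64 are the ASCII codes of r, l, d respectively.
On a big-endian machine it is enough to set i = 0x726c6400; 57616 does not need to change.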
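The byte-order argument can be checked without a big-endian machine by laying the bytes out explicitly. The helpers below (store_le32, store_be32, format_hello) are made up for this check. They mimic how a little-endian or big-endian CPU would store i in memory and then format the same "H%x Wo%s" string.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Store `v` in little-endian order (lowest byte first), whatever the host is. */
void
store_le32(uint32_t v, unsigned char out[4])
{
	assert(out != NULL);
	for (int k = 0; k < 4; k++)
		out[k] = (unsigned char)(v >> (8 * k));
}

/* Store `v` in big-endian order (highest byte first). */
void
store_be32(uint32_t v, unsigned char out[4])
{
	assert(out != NULL);
	for (int k = 0; k < 4; k++)
		out[k] = (unsigned char)(v >> (8 * (3 - k)));
}

/* Format the string from the question, reading `bytes` as a C string.
 * The caller must make sure one of the four bytes is zero. */
int
format_hello(char *dst, size_t dstlen, const unsigned char bytes[4])
{
	return snprintf(dst, dstlen, "H%x Wo%s", 57616u, (const char *)bytes);
}
```

Storing 0x00646c72 little-endian and 0x726c6400 big-endian yields the same four bytes in memory, 72 6c 64 00, so both print "rld". The %x conversion works on the value, not on its memory layout, which is why 57616 stays the same.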
Q5: In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?
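[code listing not preserved]
It prints whatever 4-byte value happens to sit on the stack just above the argument 3, that is, at 16(%ebp) in cprintf's frame: 8(%ebp) holds fmt and 12(%ebp) holds 3, so the second va_arg reads the next slot, which the caller never set.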
Q6: Let’s say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?
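Change its interface to:
[code listing not preserved]
where n is the number of variable arguments.
Or:
[code listing not preserved]
where the variable arguments are passed in reverse order.
If the variable arguments were passed in forward order without their count, later processing would become needlessly awkward (it would probably take at least two passes over fmt).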
Exercise 9
Q: Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which “end” of this reserved area is the stack pointer initialized to point to?
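The instruction that initializes the stack pointer:
[code listing not preserved]
The initial stack lives in a .data section:
[code listing not preserved]
As that code shows, .space KSTKSIZE statically reserves the space for the stack.
The stack pointer initially points to bootstacktop, the highest address of that stack area.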
Exercise 10
Q: How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?
When it calls itself recursively, test_backtrace first pushes x-1, then the return address is pushed, then %ebp is pushed: three 32-bit words in total.
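The words named above (argument, return address, saved %ebp) are exactly what a backtrace follows. The following is a simulation on an ordinary array rather than the real stack. walk_frames, MAX_FRAMES and the word-indexed memory model are all assumptions made for illustration, not the mon_backtrace code from this report.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRAMES 16  /* arbitrary cap for this sketch */

/* Walk a simulated x86 frame chain.
 * `mem` models memory as 32-bit words and `ebp` is a word index into it:
 *   mem[ebp]     saved ebp of the caller (0 terminates the chain)
 *   mem[ebp + 1] return address
 *   mem[ebp + 2] first argument
 * Stores each return address in `eips` and returns the frame count. */
int
walk_frames(const uint32_t *mem, uint32_t ebp, uint32_t eips[MAX_FRAMES])
{
	int n = 0;

	assert(mem != 0 && eips != 0);
	while (ebp != 0 && n < MAX_FRAMES) {
		eips[n++] = mem[ebp + 1];  /* return address sits just above saved ebp */
		ebp = mem[ebp];            /* follow the link to the caller's frame */
	}
	return n;
}
```

The real kernel does the same walk with pointers obtained from the current %ebp, and stops at the zero %ebp set up before the first C function is called.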
Exercise 11-12
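Insert into commands:
[code listing not preserved]
Add a function to monitor:
[code listing not preserved]
Insert into the debuginfo_eip function:
[code listing not preserved]
This completes the lab-1
[output listing not preserved]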
Challenge
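Colored text is implemented by embedding ANSI escape sequences, for example:
[code listing not preserved]
Here <ParamN> are parameters. The parameters that determine color are listed at http://rrbrandt.dee.ufcg.edu.br/en/docs/ansi/ .
The cga_putc function (which prints to the Qemu console) does not yet handle ANSI escape sequences, whereas output written by serial_putc (which prints to the user terminal) is interpreted by the terminal. As a result the two print different things, so cga_putc must first be modified to change how VGA output behaves.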
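As a concrete illustration of such a sequence, the sketch below builds a color-setting escape sequence from a foreground and a background parameter. build_sgr is a made-up helper. The standard parameter ranges 30 to 37 (foreground) and 40 to 47 (background) are assumed.

```c
#include <assert.h>
#include <stdio.h>

/* Build "ESC [ <fg> ; <bg> m" into `dst`.
 * `fg` must be in 30..37 and `bg` in 40..47.
 * Returns the sequence length, or -1 on bad input or a too-small buffer. */
int
build_sgr(char *dst, size_t dstlen, int fg, int bg)
{
	assert(dst != NULL);
	if (fg < 30 || fg > 37 || bg < 40 || bg > 47)
		return -1;
	int n = snprintf(dst, dstlen, "\x1b[%d;%dm", fg, bg);
	return (n < 0 || (size_t)n >= dstlen) ? -1 : n;
}
```

Printing the result of build_sgr(buf, sizeof buf, 31, 42) followed by text and then "\x1b[0m" shows red text on a green background on a terminal that understands these sequences, and resets the colors afterwards.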
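In VGA text mode, the data placed in the buffer has the following bit-field layout (see https://os.phil-opp.com/vga-text-mode/ ):
The values of the color part correspond to the following colors:
The cga_putc function in console.c is therefore modified:
[code listing not preserved]
With this change VGA supports ANSI escape sequences. It accepts only numeric parameters (up to 1023, although no parameter is actually that large) and handles only the color parameters (foreground, background) and the reset parameter. Otherwise its behavior matches that of bash: for example, a later parameter overrides an earlier one that is incompatible with it (a later foreground color overrides an earlier foreground color).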
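The core of that translation can be sketched separately from the parser. The names apply_sgr, vga_cell, ansi_to_vga and DEFAULT_ATTR are made up for this illustration and are not the code in console.c. The sketch assumes the usual text-mode layout: character in the low byte, foreground in bits 8 to 11, background in bits 12 to 14.

```c
#include <assert.h>
#include <stdint.h>

/* ANSI orders colors as black, red, green, yellow, blue, magenta, cyan,
 * white. VGA text mode orders them as black, blue, green, cyan, red,
 * magenta, brown, light grey. This table converts one index to the other. */
static const uint8_t ansi_to_vga[8] = { 0, 4, 2, 6, 1, 5, 3, 7 };

#define DEFAULT_ATTR 0x07  /* light grey on black */

/* Apply one SGR parameter to a VGA attribute byte (foreground in the low
 * nibble, background in the high nibble). Parameters this sketch does not
 * know leave the attribute unchanged. */
uint8_t
apply_sgr(uint8_t attr, int param)
{
	if (param == 0)
		return DEFAULT_ATTR;
	if (param >= 30 && param <= 37)
		return (uint8_t)((attr & 0xF0) | ansi_to_vga[param - 30]);
	if (param >= 40 && param <= 47)
		return (uint8_t)((attr & 0x0F) | (ansi_to_vga[param - 40] << 4));
	return attr;
}

/* Combine an attribute byte and a character into one 16-bit buffer cell. */
uint16_t
vga_cell(uint8_t attr, unsigned char ch)
{
	return (uint16_t)((attr << 8) | ch);
}
```

Applying parameters one after another gives the override behavior for free: apply_sgr(apply_sgr(0x07, 31), 34) ends up with a blue foreground, because the second call replaces only the foreground nibble.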
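To make setting colors convenient, a color-setting interface is added to stdio.h:
[code listing not preserved]
The interface is implemented in printf.c:
[code listing not preserved]
Some fun tests are added to init.c:
[code listing not preserved]
Test results:
Qemu Console:
User Terminal: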
Some problems about stab_binsearch
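When stabs[m].n_value == addr, the way the original code handles it seems somewhat problematic:
[code listing not preserved]
If the address addr+1 also happens to belong to an entry of the requested type, the resulting match range is wrong.
I therefore suggest changing it to:
[code listing not preserved]
Since an exact match has already been found, every stabs[m].n_value examined afterwards is necessarily greater than addr, so *region_left is not modified, and the *region_right that is found also satisfies its definition.
Then there is the final else clause:
[code listing not preserved]
I do not think it is necessary either, because the (modified) loop before it already guarantees that there is no symbol of the given type between *region_left + 1 and *region_right.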
Some extension of the console/serial input
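Keyboard input in qemu-console and user-terminal is not very comfortable, mainly in two respects:
- In user-terminal, backspace only moves the cursor back without deleting the character. Moreover, too many backspaces move the cursor in front of the prompt. This is inconsistent with the behavior of qemu-console.
- The cursor cannot be moved left or right to insert or delete in the middle of the input string.
The readline function is improved to address these two points (this also involves one change to the cga_putc function).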
Backspace
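In user-terminal, backspace only moves the cursor back, whereas in qemu-console it also deletes the character, so the two behave differently. We merely send the input over the serial line to user-terminal, and the handling and display of that input happen inside user-terminal. Its behavior should therefore be treated as the standard (moving back without deleting also helps with the cursor movement implemented later).

Modify the handling of backspace '\b' in cga_putc (strictly speaking cga_putc1, because of the earlier challenge changes):

    case '\b':
        // Change the behavior of backspace to support character insert.
        // Maintain the character in `crt_pos`.
        if (crt_pos > 0) {
            crt_pos--;
            // crt_buf[crt_pos] = (c & ~0xff) | ' ';
        }
        break;

Backspacing (at the end of the input string) is then implemented as:

    cputchar('\b'), cputchar(' '), cputchar('\b');

In user-terminal, backspace can even move the cursor in front of the prompt. Capturing the input characters in gdb shows that on (my) user-terminal, pressing backspace produces DEL (0x7f), whose ASCII code is larger than that of the space ' '. So in the original readline, when i == 0, a backspace typed in user-terminal does not enter the branch that handles backspace (its condition is (c == '\b' || c == '\x7f') && i > 0), but it does enter the branch that echoes ordinary characters (its condition is c >= ' ' && i < BUFLEN - 1). The condition for echoing ordinary characters is therefore changed to:

    c >= ' ' && c <= '~' && i < BUFLEN - 1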
Cursor movement
First we need to know what the left and right arrow keys actually send. From references ( https://stackoverflow.com/questions/22397289/finding-the-values-of-the-arrow-keys-in-python-why-are-they-triples , https://www.ascii-code.com/ ) and from capturing the input characters in gdb, user-terminal and qemu-console turn out to deliver different characters for the arrow keys:
- user-terminal sends an escape sequence:
  - left arrow: '\x1b', '[', 'D'
  - right arrow: '\x1b', '[', 'C'
- qemu-console sends an extended ASCII code:
  - left arrow: 228
  - right arrow: 229
qemu-console (kbd_intr) ignores ESC input, and user-terminal never produces extended ASCII codes, so the two cases can be handled separately without conflict.
To preserve encapsulation, only readline.c is modified, and getchar and cputchar remain the only interfaces to the lower layers. The most straightforward single-buffer algorithm is used: every insertion or deletion moves part of the buffer and re-flushes the updated part (and everything after it) to the display. A single operation costs O(N) on average, but the input is small (the buffer is only 1024 bytes, and typical terminal input stays under 50 bytes), so there is no perceptible delay. The code is as follows:

    #include <inc/stdio.h>
    #include <inc/error.h>
    #include <inc/string.h>

    // Handle the extended ASCII code input by console
    inline static void handle_ext_ascii(int c);
    // Handle the escape sequence input by serial (user-terminal)
    inline static void handle_esc_seq(void);
    // Move the cursor right
    inline static void move_right(void);
    // Move the cursor left
    inline static void move_left(void);
    // Flush buffer's [cur, tail) to the displays
    // and move the cursor back
    inline static void flush_buf(void);
    // Insert char to current cursor
    inline static void insert_char(int c);
    // Remove current cursor's char
    inline static void remove_char(void);
    // Terminate the input
    inline static void end_input(void);

    #define BUFLEN 1024
    static char buf[BUFLEN];
    // Current position of cursor
    static int cur;
    // Tail of buffer
    static int tail;
    static int echoing;

    char *
    readline(const char *prompt)
    {
        int c;

        if (prompt != NULL)
            cprintf("%s", prompt);
        cur = tail = 0;
        echoing = iscons(0);
        while (1) {
            c = getchar();
            if (c < 0) {
                cprintf("read error: %e\n", c);
                return NULL;
            } else if ((c == '\b' || c == '\x7f') && cur > 0) {
                remove_char();
            } else if (c >= ' ' && c <= '~' && tail < BUFLEN-1) {
                // Must have c <= '~',
                // because DEL(0x7f) is larger than '~'
                // and it will be input when you push
                // 'backspace' in user-terminal
                insert_char(c);
            } else if (c == '\n' || c == '\r') {
                end_input();
                return buf;
            } else if (c == '\x1b') {
                handle_esc_seq(); // only serial will input esc
            } else if (c > '\x7f') {
                handle_ext_ascii(c); // only console will input extended ascii
            }
        }
    }

    inline static void
    flush_buf(void)
    {
        for (int i = cur; i < tail; ++i)
            cputchar(buf[i]);
        for (int i = cur; i < tail; ++i)
            cputchar('\b'); // cursor move back
    }

    inline static void
    insert_char(int c)
    {
        // Guard against overflow: the escape-sequence and extended-ASCII
        // handlers call this without checking the buffer length first.
        if (tail >= BUFLEN - 1)
            return;
        if (cur == tail) {
            tail++, buf[cur++] = c;
            if (echoing)
                cputchar(c);
        } else { // general case
            memmove(buf + cur + 1, buf + cur, tail - cur);
            buf[cur] = c, tail++;
            if (echoing)
                flush_buf();
            move_right();
        }
    }

    inline static void
    remove_char(void)
    {
        if (cur == tail) {
            cur--, tail--;
            if (echoing)
                cputchar('\b'), cputchar(' '), cputchar('\b');
        } else { // general case
            memmove(buf + cur - 1, buf + cur, tail - cur);
            buf[tail - 1] = ' '; // blank out the stale last character on screen
            move_left();
            if (echoing)
                flush_buf();
            tail--;
        }
    }

    inline static void
    move_left(void)
    {
        if (cur > 0) {
            if (echoing)
                cputchar('\b');
            cur--;
        }
    }

    inline static void
    move_right(void)
    {
        if (cur < tail) {
            if (echoing)
                cputchar(buf[cur]);
            cur++;
        }
    }

    inline static void
    end_input(void)
    {
        if (echoing) {
            for (; cur < tail; cputchar(buf[cur++]))
                /* move the cursor to the tail */;
            cputchar('\n');
        }
        cur = tail;
        buf[tail] = 0;
    }

    #define EXT_ASCII_LF 228
    #define EXT_ASCII_RT 229
    #define EXT_ASCII_UP 226
    #define EXT_ASCII_DN 227

    inline static void
    handle_ext_ascii(int c)
    {
        switch(c) {
        case EXT_ASCII_LF:
            move_left();
            return;
        case EXT_ASCII_RT:
            move_right();
            return;
        }
        insert_char(c);
    }

    #define ESC_LF 'D'
    #define ESC_RT 'C'
    #define ESC_UP 'A'
    #define ESC_DN 'B'

    inline static void
    handle_esc_seq(void)
    {
        char a, b = 0;

        a = getchar();
        if (a == '[') {
            switch(b = getchar()) {
            case ESC_LF:
                move_left();
                return;
            case ESC_RT:
                move_right();
                return;
            }
        }
        // Not a recognized sequence: treat the characters as ordinary input
        insert_char(a);
        if (b)
            insert_char(b);
    }
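The current implementation still has some shortcomings, mainly that on user-terminal backspace can move the cursor back only as far as the start of the current line. I may improve this when I have time.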