Exam 1 Comments

Posted: Sun 16 March 2014

Some comments and discussion on selected Exam 1 questions are below. In general, students did well on most questions except for the hardware/software memory isolation question. It's understandable that students didn't do too well on that one, since we didn't spend much time on it, but I had hoped more of your answers would reveal a clear understanding of what it means to provide memory isolation. We'll talk more about some of these in Class 14.

Quotes are student answers (possibly slightly edited for spelling, grammar and formatting).

General Comments

1. What's the difference between a programming language and an operating system?

We talked about this in Class 14.

Similarity: They both provide abstractions for machine resources, so that users can effectively and easily control the machine.

1. An OS is essentially a program that can be written in a programming language 2. OS provides interface between computer hardware and application user. Programming language provides interface between machine resource and programmer. 3. OS manages machine resources for programs (keeps track of other programs and gives each its due memory space and CPU time). Other programs written in any programming language don’t have this privilege.

An operating system is a program that manages resources by determining how memory, hardware, and processors are used by the system, as well as something that provides abstractions, or interfaces, that allow other programs to use these computer resources or hardware systems efficiently. A programming language, on the other hand, is a tool that allows users to communicate instructions to a computer or hardware machine. While both programming languages and operating systems provide a means of communication or connection between hardware and user, the impact of a programming language is guided by the programmer, while an operating system is set and automated. As an operating system is itself a program, it is very closely linked as it was created by a programming language.

3. What instructions need to be privileged and why?

Many of your answers included statements like, "there are 16 privileged instructions" (on x86). This isn't really true: there are gazillions of privileged instructions, since any load/store instruction that accesses memory outside a process's address space is really a privileged instruction. There are a small number of opcodes that are always privileged, like LGDT, but many other instructions are privileged only sometimes, depending on their operands.

Instructions that need to be privileged are those that control system functions such as control registers. If user level programs were able to use those instructions, it would have access to all the memory and do whatever it wants - which would be bad!
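The distinction between always-privileged opcodes and sometimes-privileged instructions can be sketched with a toy model. This is only an illustration, not how any real processor is specified: the `Mode`, `Instr`, and `Trap` types, the `execute` function, and the single `limit` bound on the process's address space are all hypothetical names invented for this example.

```rust
// Toy model (hypothetical): which instructions trap in user mode?

#[derive(Debug, PartialEq, Clone, Copy)]
enum Mode {
    User,
    Kernel,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Instr {
    /// An opcode that is always privileged (like LGDT on x86).
    Lgdt,
    /// A load that is privileged only if the address is outside the
    /// process's address space.
    Load(usize),
}

#[derive(Debug, PartialEq)]
enum Trap {
    GeneralProtection,
}

/// Decide whether `instr` may run in `mode`. `limit` is the (assumed)
/// size of the process's address space, starting at address 0.
fn execute(mode: Mode, instr: Instr, limit: usize) -> Result<(), Trap> {
    match (mode, instr) {
        // In this toy model the kernel may execute anything.
        (Mode::Kernel, _) => Ok(()),
        // Always-privileged opcode: traps in user mode regardless of operands.
        (Mode::User, Instr::Lgdt) => Err(Trap::GeneralProtection),
        // A load inside the process's own space is an ordinary instruction.
        (Mode::User, Instr::Load(addr)) if addr < limit => Ok(()),
        // The same opcode with an out-of-range address is effectively privileged.
        (Mode::User, Instr::Load(_)) => Err(Trap::GeneralProtection),
    }
}

fn main() {
    let limit = 0x1000;
    println!("{:?}", execute(Mode::User, Instr::Load(0x10), limit));
    println!("{:?}", execute(Mode::User, Instr::Load(0x2000), limit));
    println!("{:?}", execute(Mode::User, Instr::Lgdt, limit));
    println!("{:?}", execute(Mode::Kernel, Instr::Lgdt, limit));
}
```

The point of the sketch is that the same `Load` opcode is allowed or trapped depending on its operand, which is why counting "privileged instructions" by opcode undercounts.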

4. What are the advantages/disadvantages of hardware-based memory isolation over software-based memory isolation?

There were a lot of conflicting answers to this one. In particular, about half of the answers claimed hardware-based was faster and half claimed software-based was faster. Both answers could be right!

As implemented today, hardware-based memory isolation is more efficient, but this is because all widely deployed software-based memory isolation runs on top of a processor that supports hardware-based memory isolation, so it incurs the overhead of hardware-based memory isolation in addition to any added costs of software-based memory isolation. When software-based memory isolation is implemented by adding dynamic checks or masks to the code, this incurs some runtime overhead (but for systems like Native Client, the actual overhead for most programs is only a few percent). But, if software-based memory isolation could be done fully statically, which would be possible with a suitably engineered Rust compiler, for example, then there would be no runtime cost, since all the checks needed to ensure memory isolation would be done at compile time.
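To make "dynamic checks or masks" concrete, here is a minimal sketch of both approaches over a toy sandbox. It is not how Native Client is implemented; the `Sandbox` type, its methods, and the `SANDBOX_SIZE` value are hypothetical, chosen only to show where the runtime cost comes from.

```rust
/// Size of the sandbox region in bytes (hypothetical value). It must be a
/// power of two so that a single AND can force an offset into range.
const SANDBOX_SIZE: usize = 1 << 12;

/// A toy sandbox: untrusted code only ever names offsets into `memory`.
struct Sandbox {
    memory: Vec<u8>,
}

impl Sandbox {
    fn new() -> Sandbox {
        Sandbox { memory: vec![0; SANDBOX_SIZE] }
    }

    /// Mask approach: clear the high bits so every offset lands inside the
    /// region. No branch is needed; out-of-range offsets wrap around.
    fn mask(offset: usize) -> usize {
        offset & (SANDBOX_SIZE - 1)
    }

    /// Store through the mask. (Rust's own bounds check on the Vec is still
    /// present here; a real sandbox would rely on the mask alone.)
    fn masked_store(&mut self, offset: usize, value: u8) {
        let safe = Sandbox::mask(offset);
        self.memory[safe] = value;
    }

    /// Load through the mask.
    fn masked_load(&self, offset: usize) -> u8 {
        self.memory[Sandbox::mask(offset)]
    }

    /// Dynamic-check approach: compare against the bound and reject
    /// out-of-range offsets instead of wrapping them.
    fn checked_load(&self, offset: usize) -> Option<u8> {
        self.memory.get(offset).copied()
    }
}

fn main() {
    let mut sb = Sandbox::new();
    // An out-of-range store is redirected to offset 5 inside the sandbox.
    sb.masked_store(SANDBOX_SIZE + 5, 42);
    println!("masked load at 5: {}", sb.masked_load(5));
    println!("checked load out of range: {:?}", sb.checked_load(SANDBOX_SIZE + 5));
}
```

Either way, every memory access pays for an extra AND or an extra compare-and-branch, which is the runtime overhead discussed above. A fully static approach would prove the offsets are in range at compile time and emit neither.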

The biggest disadvantage of software-based memory isolation is that it does not provide nearly the level of security provided by hardware-based memory isolation. The main reason for this is that it is complicated to do memory isolation in software, even under very controlled conditions like those for Native Client. All major software sandbox implementations have had serious bugs. For example, the recent Pwn2Own event found new vulnerabilities in up-to-date versions of all major browsers, including in their sandbox execution environments, that could be used to obtain arbitrary code execution (Crash, bang, boom: Down go all the major browsers at Pwn2Own, ZDNet, 14 March 2014). Even (supposedly) simple VM environments like Java have a long and sordid history of vulnerabilities. (As another example, every time you get safe Rust code to crash with a segmentation fault, this means there is a bug in the software-based memory isolation Rust is supposed to provide. Most of these are probably not exploitable, or very difficult to exploit, but they should worry you if you think Rust is mature enough to provide secure memory isolation.)

Hardware memory protection is much easier to get right, since it is done at the hardware level right when memory is accessed. Unlike software-based protection, which could be bypassed in many potential ways, the only way to circumvent hardware-based memory protection would be to find a way to set the appropriate bits in the page table (which should only be possible using privileged instructions, which low-level hardware mechanisms ensure can only be executed by the kernel), or, say, to get a cosmic ray to hit exactly the right wire on the processor to disable the checking. (If a TLB is in use, perhaps there are other bypasses based on using stale entries in the TLB, etc., but all of these should be prevented by low-level, and quite simple, hardware mechanisms.)
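The check the hardware performs on each access is simple enough to sketch in a few lines. The bit layout below is hypothetical (loosely modeled on the present, writable, and user bits of an x86 page table entry), and the `allowed` function is a simplification that ignores details such as how real processors treat kernel writes to read-only pages.

```rust
/// Permission bits in a toy page table entry (hypothetical layout).
const PRESENT: u8 = 0b001;
const WRITABLE: u8 = 0b010;
const USER: u8 = 0b100;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Mode {
    User,
    Kernel,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Access {
    Read,
    Write,
}

/// Returns true if an access of kind `access` made in `mode` is permitted
/// by the bits in `entry`. Real hardware does this on every memory access.
fn allowed(entry: u8, mode: Mode, access: Access) -> bool {
    // Nothing is allowed on a page that is not present.
    if (entry & PRESENT) == 0 {
        return false;
    }
    // User-mode code may only touch pages marked as user pages.
    if mode == Mode::User && (entry & USER) == 0 {
        return false;
    }
    // Writes require the writable bit (simplified: applies to both modes).
    if access == Access::Write && (entry & WRITABLE) == 0 {
        return false;
    }
    true
}

fn main() {
    let user_read_only = PRESENT | USER;
    let kernel_page = PRESENT | WRITABLE;
    println!("{}", allowed(user_read_only, Mode::User, Access::Read));
    println!("{}", allowed(user_read_only, Mode::User, Access::Write));
    println!("{}", allowed(kernel_page, Mode::User, Access::Read));
    println!("{}", allowed(kernel_page, Mode::Kernel, Access::Write));
}
```

Because the check depends only on a few bits that user-mode code cannot modify, there is very little room for the kind of implementation bugs that plague software sandboxes.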

Hardware-based memory isolation requires many more physical resources to be hard coded into the system and is limited to simply trapping the invalid memory access action, preventing it and looking to the kernel for further instruction. However, it is a completely sure way to make sure a process does not access another process's memory space. Software memory isolation mechanisms are much more flexible and do not require extra hardware complexity. However, if the software is somehow subverted, a process could potentially still access the wrong memory space. Therefore both are included to provide greater protection to the system.

5. What is the difference between a segmentation fault and page fault?

A segmentation fault is an actual error, whereas a page fault is closer to a cache miss. The latter occurs when the system tries to access a page that’s not mapped to physical memory, which then triggers the kernel to fetch that memory off disk. A segmentation fault occurs when a program tries to access an illegal memory address.
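The difference can be sketched as a toy address space in which each valid page is either resident in physical memory or swapped out to disk. All names here (`AddressSpace`, `PageState`, `Outcome`, `access`) and the 4096-byte page size are assumptions made for illustration; a real kernel's fault handling is far more involved.

```rust
use std::collections::HashMap;

/// Assumed page size for this toy model.
const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq, Clone, Copy)]
enum PageState {
    /// Valid page that is currently in physical memory.
    Resident,
    /// Valid page whose contents currently live on disk.
    OnDisk,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The access succeeds with no kernel involvement.
    Ok,
    /// Recoverable: the kernel loads the page and the access is retried.
    PageFault,
    /// Error: the address is not part of the process's address space.
    SegmentationFault,
}

/// Maps page numbers to their state; pages not in the map are invalid.
struct AddressSpace {
    pages: HashMap<usize, PageState>,
}

impl AddressSpace {
    fn access(&mut self, addr: usize) -> Outcome {
        let page = addr / PAGE_SIZE;
        match self.pages.get(&page).copied() {
            Some(PageState::Resident) => Outcome::Ok,
            Some(PageState::OnDisk) => {
                // Simulate the kernel fetching the page from disk, so the
                // next access to this page succeeds.
                self.pages.insert(page, PageState::Resident);
                Outcome::PageFault
            }
            None => Outcome::SegmentationFault,
        }
    }
}

fn main() {
    let mut space = AddressSpace { pages: HashMap::new() };
    space.pages.insert(0, PageState::Resident);
    space.pages.insert(1, PageState::OnDisk);

    println!("{:?}", space.access(100));    // page 0, resident
    println!("{:?}", space.access(5000));   // page 1, on disk: page fault
    println!("{:?}", space.access(5000));   // page 1 again: now resident
    println!("{:?}", space.access(50_000)); // page 12, never mapped
}
```

The page fault is invisible to the program apart from the delay, while the segmentation fault is reported to the process (normally terminating it).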

7. Should a shell (like gash) be a user-level program or part of the kernel?

Here are a few good answers to this question.

The difference between a kernel level and user level program is that a kernel level program has the capability to run privileged instructions and has more access to memory and processing hardware than a user level program does. The kernel can assign memory and processing to user level program because of the superior level of memory and processing access that the kernel possesses. The advantages of making a shell program kernel level are that processes can be run in the background with simultaneous core processing as well as independent memory allocation to each process since the kernel level shell program will have the authority to do so. A disadvantage is that the shell can be used with malicious intent since the user has too much access to memory; additionally it also opens up possibility for the user to accidentally corrupt data by allocating memory to a block that is already in use. Therefore, the cons outweigh the pros leading to the conclusion that a shell should be a user-level program.

The kernel mode allows unrestricted access and instruction of the hardware, whereas the user mode disallows such access and instruction and forces all executing code to funnel through designated system functions. The former allows substantial control of the system but also permits disaster – misuse or mistakes may cause the entire system to crash. The latter provides protection from system crashes at the expense of control. Allowing a shell to run in kernel mode would permit it much greater control over the system. Because switching between kernel and user mode is expensive, a kernel mode shell would also allow for a much faster and more efficient shell. However, misuse of the shell would not only endanger the running of the shell but also crash the entire system irrecoverably. A kernel mode shell would delegate responsibility to the developer of the shell to ensure the computer user cannot severely endanger the system. In the hands of an experienced computer user, a kernel mode shell could work. The shell, traditionally, has always run in user mode. The decision to only allow the shell user level privilege has worked for decades and has not seemed to raise any major concerns. Additionally, any necessary hardware functions the shell must use are already provided through the hardware APIs. The additional layer provides protection and consequent stability. Because most computer users probably have no idea what kernel mode entails (or even what a shell is), the shell would likely be best kept a user mode application.

Do you believe your performance on this exam will fairly reflect what you have learned so far?

Not really -- most of what I've learned has been the stuff in the Programming Assignments, which seems to be more or less orthogonal to what we're learning in lectures (and thus to the questions on this exam).

This exam isn't meant to duplicate what you're doing on the problem sets, and I think we are able to evaluate pretty well if you get what you should out of the problem sets from the code submissions and demos. The main purpose of the exam is to see how well people are understanding the concepts in the course that are not covered well by the problem sets, so in that sense, you are correct that it is orthogonal to the problem sets. I don't see that as a bad thing, though --- better to learn more things, and learn them in the medium that is most effective for learning that material, than to limit what is covered in class to things that can fit into the problem sets. I don't think it would be good to make the problem sets much bigger than they are now, but if we wanted to cover all the concepts in the class directly in problem sets they would need to be much bigger (or we'd need to leave out many important concepts).

We've learned a lot about Rust and this exam didn't really test us on those concepts, but I suppose that is what the homework is for.

Exactly — although several of the questions on the exam could definitely benefit from understanding Rust well, including the memory isolation question and modern processor design question.

Other than question 8, I felt as if this exam was extremely fair and would be a relatively good reflection on what I have learned from the course. I felt question 8 could have been expressed a bit more clearly, as I was not certain what sort of answer or level of the detail the question was eliciting.

This was, indeed, a very open ended question, and something PhD dissertations could be written about and companies are investing billions of dollars to figure out. But, I think you know enough from this class to have some good ideas on it, and most submitted answers were at the level of detail and depth I was expecting (although not nearly as creative as I was hoping for!)

Mostly, though the availability of so many resources does act somewhat as a crutch.

I don't understand this — you aren't required to use any resources you don't want to, but I guess it does require more thought on your own to reconcile different definitions and explanations in different resources (but that is an important ability to develop).

Do you have any other comments about the exam, the course so far, or what you hope for the rest of the semester?

With questions like the ones on this test, I'm always worried that I will miss some crucial detail that the questioner was "looking for." My answers tended to be a bit long as a result. I hope that's not a problem.

Short, clear answers are almost always better, but I understand that most of you have suffered under grading schemes like the SAT essays, so have been mistrained to write long answers that attempt to hit on all the secret grading criteria. In the real world, overly long writing is unlikely to be read, and it's not about hitting all points on some mysterious list of desired items, but about making a clear, concise, and convincing argument.

On another note: I understand you are tenured, but telling your class of mostly White and Asian males that you don't want to listen to White and Asian males is an easy way to have students lose faith.

I think you very much misunderstood my comments about implicit assumptions - I certainly didn't say anything about not wanting to listen to white and Asian males, only about wanting to do an experiment to try to change the classroom dynamic where we weren't hearing from anyone else. Sorry if it came across as not wanting contributions from white males; that was definitely not my intent.

Yes, but I believe it also greatly helped that a group of us met to discuss and answer the class notes to review before the test. We plan to continue to do so on a regular basis each week.

Great! I think this is very useful, and would encourage others to do it also. It is also encouraged to post comments on the course site about questions in the class notes.

I really like the book, and I think it would be good to emphasize the readings from it. Sure the sample code is written in C (would a good final project be to rewrite the samples in Rust?), but I find it to be a very interesting and accessible book to read. Reading through it before the exam really helped me to solidify my understanding of in class material.

This is a good point! I've been a bit lax in including links to the relevant book sections in the class notes, but will try to remember to do this more. I do encourage everyone to look at the book for relevant and interesting material. It would definitely be a great project to update parts of the book to use Rust examples!

I wish there was more documentation for Rust! I am glad the graders are being lenient, however. I look forward to seeing what this class will become after Rust is completed!

There's more and better documentation every day (and I do hope students in this class will continue to contribute to improving what is available)! I'm not sure a programming language is ever "completed", but I expect Rust 1.0 to be released by the end of 2014.

Are we going to go over the answers? Hopefully so

Yes, sorry it took so long! But, hopefully the discussion here is helpful, and I will discuss (at least) one of the questions in class. If there are things that it would still be useful to discuss after this, please let me know. (You can use the comments below.)

I felt that since the exam was open resources, there could be more than 8 questions. I do not think it would be a long exam if it was 10 questions or just a bit longer.

I suppose, but I think there were enough questions to cover a good range of topics, and I'm not sure what making the exam open resources has to do with the number of questions. If anything, I would need more questions in a closed resources exam to avoid the risk of unfairly evaluating students who forget some topic covered by a question.

I love operating systems. I have installed and messed with several versions of Windows and about 6 different Linux systems. And I feel like a lot of this class is about Rust and not operating systems.

I think you are spending a lot of time learning Rust for the problem sets, so I understand why you feel that way. But, if we were using some other language, you would be spending a lot of time learning that language; even if you think you already know C, to use C in a way that would be reasonable for this class would require spending far more time on C-specific issues than we have spent on Rust. If you look at what we've actually done in class, less than 20% of the class time has been on Rust.

I feel like I am spending at least ten times as much time learning to write Rust code than I am learning about operating systems in this course. Trying to complete assignments in a language with a buggy compiler and half-written documentation is frustrating. I enjoy programming, but knowing that I will not use this language going forward makes learning it feel like a waste of my time.

Programming takes a lot of time, and writing challenging low-level programs like we do in this class will take a lot of time no matter what language you are using (and I would hope you are actually learning a lot of interesting things about operating systems and other aspects of computing and programming by doing this). I'm not sure how you know what languages you'll use in the future, but of course, you do have choice in that (and you have choice in this class if you want to use Rust - as I mentioned in class, you are welcome to use any language you want, but if any code you write has a memory safety problem you get -10 on that assignment). I would hope that most people in this class are mostly using languages you haven't yet learned ten years from now, and that Rust will be successful enough that any of you who want to work on interesting projects using Rust would find plenty of opportunities to do so. The Rust documentation is still quite immature, but it is way better than it was last semester, and the source code is open so you can always find what you need, and there is a very helpful community that answers questions quickly on #rust IRC.

Could you please put on more explanations on the slides? It will be really helpful for review.

Slides are really not meant to be a stand-alone resource — they are designed to work with the lecture, and I don't think it is possible to make slides that contain a lot of text and detail but still work well in presentations. The questions on the notes should give you a good idea if you are understanding the things you should from the lecture, and you do have the videos to go back to if something isn't clear from the slides (as well as lots of other resources on this material, including the recommended readings.)

Thank you for teaching this class =]. Also, I think Rust is a good language and has a lot of potential. Although I admit it is rather infuriating when there's not a lot of documentation, and sometimes the only documentation is from a previous version where the code no longer compiles in 0.9... Rust is starting to grow on me.

Great! I expect it will continue to grow on you — there's a big learning curve to get over, but the more you learn the more you should appreciate how its design helps programmers.

I'm really enjoying this course so far and I'm glad I'm taking it. I was stumped at first by some of Rust's quirks but I'm beginning to really appreciate solving problems from a different language's perspective.

Nope, just want more problem sets that involving coding (like ps2)!

Don't worry! You'll get to do plenty of coding for ps3 and ps4 (and as much as you want for the final project).