<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>PDR: Laboratory 8: x86 Assembly Language, part 1 (32 bit)</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<link rel="stylesheet" href="../../markdown.css" />
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="pdr-laboratory-8-x86-assembly-language-part-1-32-bit">PDR:
Laboratory 8: x86 Assembly Language, part 1 (32 bit)</h1>
<p><a href="../index.html">Go up to the Labs table of contents
page</a></p>
<h3 id="objective">Objective</h3>
<p>This lab is one of two labs meant to familiarize you with the process
of writing, assembling, and linking assembly language code. The purposes
of the in-lab and post-lab activities are to investigate how various C++
language features are implemented at the assembly level.</p>
<p>There are both <a href="../lab08-32bit/index.html">32 bit</a> (<a
href="../lab08-32bit/index.md">md</a>) and <a
href="../lab08-64bit/index.html">64 bit</a> (<a
href="../lab08-64bit/index.md">md</a>) versions of this lab. This is the
<strong><em>32 bit version</em></strong>.</p>
<h3 id="background">Background</h3>
<p>The Intel x86 assembly language is currently one of the most popular
assembly languages and runs on many architectures from the x86 line
through the Pentium 4. It is a <a
href="http://en.wikipedia.org/wiki/Complex_instruction_set_computing">CISC</a>
instruction set that has been extended multiple times (e.g. <a
href="http://en.wikipedia.org/wiki/MMX_%28instruction_set%29">MMX</a>)
into a larger instruction set. In 2004 it was extended to allow for a 64
bit memory space.</p>
<h3 id="readings">Reading(s)</h3>
<ol type="1">
<li>Read the <a href="../../slides/08-assembly-32bit.html">slides on 32
bit x86</a></li>
<li>The two book chapters on x86: <a
href="../../book/x86-32bit-asm-chapter.pdf">x86 Assembly</a> and <a
href="../../book/x86-32bit-ccc-chapter.pdf">The x86 C Calling
Convention</a>.</li>
</ol>
<h2 id="procedure">Procedure</h2>
<h3 id="pre-lab">Pre-lab</h3>
<ol type="1">
<li>You should be familiar with the readings described above. They
detail the x86 material that this lab requires.</li>
<li>Complete the tutorial, which consists of reading two book chapters
that are contained in this repository: <a
href="../../book/x86-32bit-asm-chapter.pdf">x86 Assembly</a> and <a
href="../../book/x86-32bit-ccc-chapter.pdf">The x86 C Calling
Convention</a>.</li>
<li>Read through the section, below, on compiling C++ with assembly on
different architectures, as well as the vecsum program.</li>
<li>There are different program formats for different architectures, and
this pre-lab <strong>must</strong> be submitted in the submission format
for this lab (see the next section, below). If you do not submit it in
the required format (64-bit Linux), you will not receive credit for the
lab, as it will not compile.</li>
<li>Follow the pre-lab instructions in this document. They require you
to write a program in x86 assembly called mathlib.s. To see other
examples of nasm code, you should look at the vecsum.s program, as well
as the code in the nasm tutorial.</li>
<li>Make sure your mathfun.cpp takes in only the input described in the
pre-lab section! Input is to be provided via standard input (i.e.,
<code>cin</code>), not through command-line parameters.</li>
<li>Your code must compile with <code>make</code>!
<ul>
<li>And does your code work on a 64-bit Linux machine? It will need to
in order to receive credit.</li>
</ul></li>
<li>Files to download <a href="vecsum.s.html">vecsum.s</a> (<a
href="vecsum.s">src</a>), <a href="main.cpp.html">main.cpp</a> (<a
href="main.cpp">src</a>), <a href="Makefile.html">Makefile</a> (<a
href="Makefile">src</a>)</li>
<li>Files to submit mathlib.s, mathfun.cpp, Makefile</li>
</ol>
<h3 id="in-lab">In-lab</h3>
<ol type="1">
<li>Address at least one of the topics shown in the in-lab section. Be
sure to address all the issues in that topic! You will have to complete
both of these topics for the post-lab report.</li>
<li>We are looking for a brief write-up indicating that you addressed at
least one of the topics, and the results that you found. You do not need
to make it a full fledged report yet (that’s the post-lab).</li>
<li>Files to download: none (other than the results of your
pre-lab)</li>
<li>Files to submit: inlab8.pdf</li>
</ol>
<h3 id="post-lab">Post-lab</h3>
<ol type="1">
<li>Finish addressing the topics listed in the in-lab section. We are
looking for a quality write-up here, as detailed in the post-lab
section. Be sure to address all the issues in each topic!<br />
</li>
<li>Files to download: none (other than the results of your pre-lab and
in-lab)</li>
<li>Files to submit: postlab8.pdf</li>
</ol>
<hr />
<h2 id="platform-architectures">Platform Architectures</h2>
<h3 id="different-architectures">Different Architectures</h3>
<p>There are four different platforms that students are potentially
developing their code on:</p>
<ol type="1">
<li>32-bit Linux (what is on the VirtualBox image)</li>
<li>64-bit Linux (what the submission server is running, as well as what
is installed on the computers in Rice 340 and Olsson 001)</li>
<li>32-bit Mac OS X (although we doubt anybody actually has this
anymore)</li>
<li>64-bit Mac OS X</li>
</ol>
<p><strong>Your code must compile and run on the submission server,
which is a 64-bit Linux machine!</strong></p>
<p>There are three changes that will have to be made to compile your
program (and thus to the Makefile) depending on your own development
platform:</p>
<ul>
<li>You will have to determine whether to name your function
<code>vecsum</code> instead of <code>_vecsum</code> (note the lack of
underscore in the former) in vecsum.s (this file is described more
below). In the final linking step, if you get a message such as,
<code>main.cpp:(.text+0x12): undefined reference to 'vecsum'</code>,
then you should change the name of the function.<br />
</li>
<li>Some systems will have to supply a command-line parameter to
clang++; this can be put on the <code>CXX</code> or
<code>CXXFLAGS</code> macro(s) line in your Makefile</li>
<li>All systems will have a specific nasm file format option (via
<code>-f</code>) that will need to be specified.</li>
</ul>
<p>The first bullet point highlights a compatibility problem between
Linux and Mac OS X. When calling a subroutine, which in C++ would be
called <code>foo()</code>, there are two standards as to how to name the
assembly routine: you can name it either <code>_foo</code> (an
underscore is added before the name), or name it just <code>foo</code>
(with no underscore). Unfortunately, Linux uses a different standard
than Mac OS X, so we have to make (minor) code modifications in order to
compile the code on the other system: in Mac OS X, the vecsum.s file
should have the subroutine be called <code>_vecsum</code>, and under
Linux, it should be called <code>vecsum</code> (the name appears twice,
on lines 9 and 21).</p>
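<p>On the C++ side, the symbol name that the linker looks for comes from
the function declaration. The sketch below assumes a signature of
<code>int vecsum(int*, int)</code>, which is an assumption for
illustration only (the provided main.cpp is authoritative). Declaring the
routine <code>extern "C"</code> turns off C++ name mangling, so the
linker looks for the plain C symbol: <code>vecsum</code> on Linux and
<code>_vecsum</code> on Mac OS X. The C++ body is only a stand-in so that
the sketch builds without vecsum.s.</p>
<pre><code>// Assumed signature, for illustration only; check the provided main.cpp.
// extern "C" disables C++ name mangling, so the symbol is the plain C name:
// "vecsum" on Linux, "_vecsum" on Mac OS X.
extern "C" int vecsum(int* values, int count);

// Stand-in definition written in C++ so this sketch links on its own.
// In the lab, this body lives in vecsum.s instead.
extern "C" int vecsum(int* values, int count) {
    int total = 0;
    for (int i = 0; i &lt; count; ++i) {
        total += values[i];  // add each array element to the running sum
    }
    return total;
}
</code></pre>
<p>If the declaration and the assembly label disagree (for example, the
label is <code>_vecsum</code> on a Linux machine), the link step reports
an undefined reference like the one quoted above.</p>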
<p>In an effort to make sure all the files submitted conform to one
standard or the other, <strong>all assembly and C/C++ code must be
submitted in Linux form</strong> (i.e. will be called <code>foo</code>
and not <code>_foo</code>). Note that in many programs, such as the
vecsum.s that we provided you, you have to change the name in TWO
places: on the <code>global</code> line (line 9 of vecsum.s) and on the
label line (line 21 of vecsum.s). <strong>If your code does not compile
on the submission system, you will receive zero credit!</strong></p>
<p>Also note that your code must compile with <code>make</code>. We
provide a sample Makefile that will compile vecsum, so you can just
modify this Makefile to compile your pre-lab program. <strong>Please
note that you should NOT specify a <code>-o</code> flag to clang++ (not
even <code>-o a</code>)</strong>, as we want it to be named the default
(a.out). This allows easy porting between the two operating systems.</p>
<p>If you plan to develop it in Mac OS X, we suggest that you develop it
normally (putting in the <code>_</code> before the subroutine name).
Then, once you have verified everything works, remove the underscores
from <strong>all</strong> the relevant lines, and test it out on a
32-bit Linux machine, such as the VirtualBox image, before submitting
it.</p>
<h3 id="platform-specifics">Platform Specifics</h3>
<p>Each of these different platforms has different compilation lines to
allow it to compile. Some of them require changing the assembly files as
well. You only need to read the line(s) pertaining to the platform(s)
you are developing on.</p>
<p>The provided code works under both 32-bit and 64-bit Linux (although
64-bit Linux users should read below).</p>
<p><strong>32-bit Linux:</strong> The provided code and Makefile work
on 32-bit Linux. All assembly subroutine names must <strong>NOT</strong>
have a leading underscore (i.e. they should be <code>vecsum</code> and
not <code>_vecsum</code>). nasm is invoked with the <code>-f elf</code>
option. We have the <code>-m32</code> flag on the CXX macro line so that
it will work fine on a 64-bit Linux machine (including our submission
server), but it is technically not necessary (it doesn’t hurt, it just
doesn’t do anything on a 32-bit machine).</p>
<p><strong>64-bit Linux:</strong> You have to explicitly tell clang++ to
compile in 32-bit mode by passing in the <code>-m32</code> parameter
(note that this parameter is fine to pass in on 32-bit systems as well).
You may need to install the g++-multilib package - we realize that we
are not using the g++ compiler, but this installs the correct library in
the correct place (if this differs with your version of Linux, then
please let us know!). The options for nasm (<code>-f elf</code>) and the
naming of the subroutines (<code>vecsum</code>, not
<code>_vecsum</code>) are the same as 32-bit Linux.</p>
<p><strong>32-bit Mac OS X:</strong> To run the code, you will need to
rename all the assembly function names with a leading underscore
(i.e. <code>_vecsum</code> not <code>vecsum</code>). You will also have
to use the <code>-f macho</code> format for nasm, and tell clang++ to
generate the correct architecture code. In the past, this
<strong>WAS</strong> done by providing the
<code>-arch i386</code> parameter to the compiler we used previously; we
are not sure if this is necessary with clang++ (please let us know how
this works for you; we do not have access to a 32-bit Mac machine with
clang/clang++ to test this out). In the provided Makefile, try setting the
<code>CXX</code> macro line to <code>clang++ -arch i386</code>
(instead of the default <code>clang++</code>), and change the
<code>ASFLAGS</code> macro line to <code>-f macho</code> (instead of the
default <code>-f elf</code>). You will probably want to remove the
<code>-m32</code> flag on the <code>CXX</code> macro line, but be sure
to put that back in before you submit. Note that you
<strong>MUST</strong> change all of this back in order for it to compile
via the submission system! Also note that Mac OS X does not support the
format of assembly that we use in this course, which means that you will
be stuck reading the assembly in the other format we discussed in
class.</p>
<p><strong>64-bit Mac OS X:</strong> As far as we are aware, this is the
same as the 32-bit Mac OS X. If you run into problems with this, please
let us know. Note that Mac OS X does not support the format of assembly
that we use in this course, which means that you will be stuck reading
the assembly in the other format we discussed in class.</p>
<p>Below is a table summarizing the changes</p>
<table>
<colgroup>
<col style="width: 13%" />
<col style="width: 7%" />
<col style="width: 7%" />
<col style="width: 7%" />
<col style="width: 63%" />
</colgroup>
<thead>
<tr class="header">
<th>Platform</th>
<th>nasm flag</th>
<th>x86 subroutine name</th>
<th>clang++ flags</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>32-bit Linux</td>
<td>-f elf</td>
<td>vecsum</td>
<td>-m32</td>
<td>This is the platform on the 001 machines and in the VirtualBox
image; -m32 is optional, but keep it in there for compatibility with
64-bit Linux</td>
</tr>
<tr class="even">
<td>64-bit Linux</td>
<td>-f elf</td>
<td>vecsum</td>
<td>-m32</td>
<td>This is what our submission server is running, <strong>and what your
code must work on.</strong> If you have it on your computer, you must
install a few packages as well - see above</td>
</tr>
<tr class="odd">
<td>32-bit Mac OS X</td>
<td>-f macho</td>
<td>_vecsum</td>
<td>-arch i386</td>
<td>We are unsure about the -arch i386 flag’s necessity. May not be able
to print the assembly in the format discussed in class.</td>
</tr>
<tr class="even">
<td>64-bit Mac OS X</td>
<td>-f macho</td>
<td>_vecsum</td>
<td>-arch i386</td>
<td>We think this is the same as 32-bit Mac OS X platform. We are unsure
about the -arch i386 flag’s necessity. May not be able to print the
assembly in the format discussed in class.</td>
</tr>
</tbody>
</table>
<p><strong>IMPORTANT:</strong> Just to repeat, when you submit your code,
it <strong>MUST</strong> be in 64-bit Linux format.</p>
<hr />
<h2 id="pre-lab-1">Pre-lab</h2>
<h3 id="compiling-assembly-with-c">Compiling Assembly With C++</h3>
<p>For this part, you will need to download three files: <a
href="vecsum.s.html">vecsum.s</a> (<a href="vecsum.s">src</a>), <a
href="main.cpp.html">main.cpp</a> (<a href="main.cpp">src</a>), and <a
href="Makefile.html">Makefile</a> (<a href="Makefile">src</a>).</p>
<p>To compile a program written partly in x86 assembly and partly in
C++, we have to build the program in parts. We build the C++ file as we
have in the past:</p>
<pre><code>clang++ -m32 -Wall -g -c -o main.o main.cpp
</code></pre>
<p>Note that we used the -c flag, which tells the compiler to compile
but not link the program. Linking it will create the final executable –
but as there is not a <code>vecsum()</code> function defined (yet),
linking now would fail with an error stating that <code>vecsum()</code>
is undefined. The <code>-o main.o</code> part tells clang++ to put the
compilation output into the file named main.o. Note that the
<code>-o</code> flag wasn’t really necessary here (as clang++ will use
main.o by default when compiling main.cpp), but we wanted to include it,
as we are going to use it below. We include the <code>-m32</code> flag
to force it to be a 32-bit file. We also added a few more flags
(<code>-Wall -g</code>) to print all warnings and compile debugging
symbols into the program.</p>
<p>Next, we need to compile the assembly file. To do this, we enter the
following:</p>
<pre><code>nasm -f elf -g -o vecsum.o vecsum.s</code></pre>
<p>This invokes nasm, which is the assembler that we are using for this
course. We’ll get to the <code>-f elf</code> part in a moment. The
<code>-o vecsum.o</code> option is the same as with clang++ – it is
telling the assembler to put the output into a file named vecsum.o. If
you do not specify a filename with the <code>-o</code> flag, nasm picks
a default name based on the output format, which for some formats is
vecsum.obj, NOT vecsum.o – this is why we are using the
<code>-o</code> flag. We also tell it to include debugging symbols via
<code>-g</code>. The assembly file name is specified by the
<code>vecsum.s</code> at the end of the command line.</p>
<p>The new flag here is the <code>-f elf</code>. This tells the
assembler the output format for the final executable. Operating systems
can typically execute a number of different formats. As we are running
under 32 bit Linux, we specify the elf format. Mac OS X uses <code>-f
macho</code> – see the above section for more details.</p>
<p>Finally, we have to link the two files into the final executable. We
do this as before:</p>
<pre><code>clang++ -m32 -Wall -g vecsum.o main.o</code></pre>
<p>This tells clang++ to link both of the .o files created above into an
executable, which it calls <code>a.out</code>. Note that there isn’t
any compiling done at this stage (the compilation was done before) –
this just links the two object files into the final executable. Also
note that for our submitted Makefiles, we will NOT have the
<code>-o</code> flag present.</p>
<h3 id="tutorial">Tutorial</h3>
<p>Complete the C++/assembly tutorial, which consists of reading two
book chapters that are contained in this repository: <a
href="../../book/x86-32bit-asm-chapter.pdf">x86 Assembly</a> and <a
href="../../book/x86-32bit-ccc-chapter.pdf">The x86 C Calling
Convention</a>.</p>
<h3 id="vecsum">Vecsum</h3>
<p>Examine the vecsum subroutine in <a href="vecsum.s.html">vecsum.s</a>
(<a href="vecsum.s">src</a>). Use the slides and readings to help
understand what is happening in vecsum.s. Make sure you understand the
prologue and epilogue implementation, as well as the instructions used
in the subroutine.</p>
<p>Compile and run the vecsum program:</p>
<ul>
<li>Use the tutorial as a guide, but see the instructions above.</li>
<li>If you forget the gdb commands described below, see the <a
href="../../docs/gdb_summary.html">GDB command summary</a>, which has a
summary of all of these commands.</li>
<li>You can find the assembly and C++ source code in the repository (<a
href="vecsum.s.html">vecsum.s</a> (<a href="vecsum.s">src</a>), <a
href="main.cpp.html">main.cpp</a> (<a href="main.cpp">src</a>), <a
href="Makefile.html">Makefile</a> (<a href="Makefile">src</a>)). For the
C++ code compilation (i.e. main.cpp) and the final link, use the
<code>-g</code> flag, which allows the program to work well with the gdb
debugger.</li>
<li>Use the debugger to step through the assembly code, view the
register contents, and view the computer’s memory.</li>
<li>Set a breakpoint at the line in the main.cpp where the vecsum()
function is called (probably line 38).</li>
<li>Normally, you would use the <code>step</code> command to step into
the next instruction. However, since no debugging information was
included with the assembler (a shortcoming of nasm), we can’t use
<code>step</code> – it will just move to the next C++ instruction (the
<code>cout</code>). Instead, we will use <code>stepi</code>, which will
step exactly one <em>assembly instruction</em>, which is what we
want.</li>
<li>To display the assembly code that is currently being executed, enter
<code>disassemble</code>. This is just like <code>list</code>, but it
displays the assembly code instead of the C++ code.</li>
<li>Note that this prints things in a different assembly format. To set
the format to the style we are used to (and the style we are programming
in with nasm), enter <code>set disassembly-flavor intel</code>. Now
enter <code>disassemble</code> again – the format should look more
familiar. You only have to enter that set command once (unless you exit
and re-enter gdb).</li>
<li>To see the vecsum function, enter <code>disassemble vecsum</code>.
Note that this only lists the first third (or so) of the routine – up
until the <code>vecsum_loop</code> label. To see the rest of the code,
enter <code>disassemble vecsum_loop</code>, <code>disassemble
vecsum_done</code>, etc.</li>
<li>To show the contents of the registers, use the <code>info
registers</code> command.</li>
</ul>
<h3 id="pre-lab-program-mathlib.s">Pre-lab program: mathlib.s</h3>
<p>You will need to write two routines in assembly, one that computes
the product of two numbers, and one that computes the power of two
numbers.</p>
<p>The first subroutine will compute the product of the two integer
parameters passed in. The restrictions are that it <strong>can only use
addition</strong>, and thus cannot use a multiplication operation. We
will assume that both of the parameters are positive integers. It must
compute this <strong>iteratively</strong>, not recursively. The
resulting product is then returned to the calling routine. This
subroutine should be called <code>product</code>. We will assume that
values will not be provided to the subroutine that will cause an
overflow, nor will negative (or zero) parameters be passed in.</p>
<p>The second subroutine will compute the power of the two integer
parameters passed in. We will assume that the first parameter is the
base, and the second parameter is the exponent. Again, both are
integers. The restrictions on this routine are that it <strong>can only
use the multiplication</strong> routine described above – it cannot call
any exponentiation routine. Furthermore, it must be defined
<strong>recursively</strong>, not iteratively. This routine should be
called <code>power</code>.</p>
<p>You can assume that the numbers passed into both routines will both
be positive, so you need not consider negative numbers or zero.
Furthermore, as described above, no values will be used on your program
that could cause an integer overflow.</p>
<p>Both of these routines should be in a file called mathlib.s, and must
use the proper C-style calling convention. You must also provide a
mathfun.cpp file, which calls both of your subroutines – see the
main.cpp file provided as a template. The program should take in ONLY
two integers (we’ll call them <em>x</em> and <em>y</em>). It should then
print out the output of calling <code>product(x,y)</code> and
<code>power(x,y)</code>. Thus, if the input is 3 and 4, it would print
out 12 and 81.</p>
<p>Input is to be read via standard input (i.e., <code>cin</code>), not
through command-line parameters.</p>
<p>If you are going to have multiple routines in a single assembly file
(as is needed for mathlib.s), you just need one
<code>global</code> line for each subroutine that you plan on calling
from your C/C++ code.</p>
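<p>To make the expected behavior concrete, here is a rough C++ model of
the two routines together with the input and output handling that
mathfun.cpp needs. It is only a sketch: the real <code>product</code> and
<code>power</code> must be written in mathlib.s, the helper name
<code>run</code> is hypothetical, and printing one result per line is an
assumption, since the exact output format is not specified here.</p>
<pre><code>#include &lt;iostream&gt;

// C++ stand-in for the assembly routine: iterative, addition only.
// Assumes both parameters are positive, as stated in the lab.
extern "C" int product(int x, int y) {
    int result = 0;
    for (int i = 0; i &lt; y; ++i) {
        result += x;  // add x to the running total, y times
    }
    return result;
}

// C++ stand-in for the assembly routine: recursive, built on product().
// Assumes a positive exponent, so the base case is exponent == 1.
extern "C" int power(int base, int exponent) {
    if (exponent == 1) {
        return base;
    }
    return product(base, power(base, exponent - 1));
}

// Hypothetical helper: reads two integers and prints product, then power.
void run(std::istream&amp; in, std::ostream&amp; out) {
    int x = 0;
    int y = 0;
    in &gt;&gt; x &gt;&gt; y;
    out &lt;&lt; product(x, y) &lt;&lt; std::endl;
    out &lt;&lt; power(x, y) &lt;&lt; std::endl;
}
</code></pre>
<p>In mathfun.cpp, <code>main()</code> would call
<code>run(std::cin, std::cout)</code>, and the two stand-in bodies would
be replaced by <code>extern "C"</code> declarations that link against
mathlib.s. With the input 3 and 4, this model prints 12 and 81.</p>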
<hr />
<h2 id="in-lab-1">In-lab</h2>
<p>Come to lab with a functioning version of the pre-lab, and be
prepared to demonstrate that you understand how to build and run the
pre-lab programs. If you are unsure about any part of the pre-lab, talk
to a TA. The in-lab will ask you to write C++ code and examine the
generated assembly language for a variety of topics.</p>
<p>You should be able to explain and write recursive functions for the
final exam, so make sure that you understand how to implement the
pre-lab program. Speak to a TA if you have any questions.</p>
<p>The general activity of this in-lab will be to write small snippets
of C++ code, compile them so that you can look at the generated assembly
code, then make modifications and recompile as needed in order to deduce
the representation of a number of C++ constructs (listed below).</p>
<p>For the in-lab, you will need to work on the two topics shown below –
note that there will be different topics to work through on the next
lab.</p>
<p>The deliverable for this in-lab is a document named inlab8.pdf. It
must be in PDF format! See the <a
href="../../docs/convert_to_pdf.html">How to convert a file to PDF</a>
page for details about creating a PDF file.</p>
<p>In the report, you should explain all the items in <em>one</em> of
the categories below (either objects or parameter passing). For the
post-lab, you will need to have all items from both categories
explained. We are looking for significant evidence that you were able to
complete some work during the in-lab, and thus are not setting page
length requirements.</p>
<p>Recall that using the <code>-S</code> flag with clang++ will generate
the assembly code. You will also want to use the <code>-mllvm
--x86-asm-syntax=intel</code> flags.</p>
<h3 id="clang-and-the-calling-convention">clang++ and the Calling
Convention</h3>
<p>As discussed in class (specifically, <a
href="../../slides/08-x86.html#clangconventionbreak">here</a>), clang
will often optimize away many parts of the calling convention, and has
even been known to pass parameters in registers. So if you see code that
is unexpected, trace it by hand to determine what is really happening.
Likely, it is an optimization of the calling convention. To avoid this
in the in-lab, use g++ rather than clang++ to compile. The correct g++
flags are <code>-S -m32 -masm=intel</code>.</p>
<h3
id="in-lab-8-topics-you-must-do-all-of-these-for-the-post-lab-but-only-one-of-these-for-the-in-lab">In-lab
8 topics: you must do ALL of these for the post-lab, but only ONE of
these for the in-lab</h3>
<p>The questions posed below are example questions to answer. Some of
them may overlap. Others may not be necessary. And there may be
questions not listed that are worth answering. The purpose of posing
those questions is to help you think about what topics and ideas you
should address in your response - it’s not meant to be a rigid structure
that you absolutely must follow. We are going to look at whether you
have addressed the general idea of each of the list topics.</p>
<h3 id="parameter-passing">Parameter passing</h3>
<p>Show and explain assembly code generated by the compiler for a simple
function and function call that passes parameters by a variety of means.
Be sure to show what is happening both in the caller (the function which
makes a call to another function) and in the callee (the function which
is called by another function, possibly a recursive call). You do not
need to describe parts of the C calling convention we described in class
(e.g. saving registers, saving the base pointer, how the call
instruction works). The focus here is on examining in detail what
happens when parameters are passed.</p>
<ol type="1">
<li>You should explain how ints, chars, pointers, floats, and objects
that contain more than one data member such as user-defined classes are
passed by value and by reference.</li>
<li>In addition, show how arrays (you may pick any type) are passed in
C++. Be sure to show <em>both</em> how these values are passed into a
function <em>and</em> how the callee accesses the parameters inside of
the function. Recall that you can use gdb to pause execution in the
assembly code, and then see actual memory addresses – see the pre-lab
for details. This question asks about exactly where the data values are
placed, so you will need to determine at least a register-relative
address; just saying the parameter is accessed as [var] is not enough.
Be sure to ask if you do not understand.</li>
<li>How does passing values by reference work in assembly? Is it
different than passing values by pointer?</li>
</ol>
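<p>One possible starting point for this topic is a set of tiny probe
functions, one per passing style, so that each shows up as a short,
separate routine in the generated assembly. The function and struct
names below are hypothetical and chosen only for illustration. Compile
the file with the flags given above (<code>-S -m32 -masm=intel</code>
for g++) and compare how each caller sets up its arguments and how each
callee reads them.</p>
<pre><code>// Hypothetical probe functions; all names are for illustration only.
struct Pair {
    int first;
    int second;
};

// Scalar passed by value, by pointer, and by reference.
int byValue(int x)      { return x + 1; }
int byPointer(int* p)   { return *p + 1; }
int byReference(int&amp; r) { return r + 1; }

// An object with more than one data member, by value and by reference.
int structByValue(Pair p)            { return p.first + p.second; }
int structByReference(const Pair&amp; p) { return p.first + p.second; }

// An array parameter, which the callee sees as a pointer to the first element.
int arraySum(int a[], int n) {
    int total = 0;
    for (int i = 0; i &lt; n; ++i) {
        total += a[i];
    }
    return total;
}
</code></pre>
<p>Calling each function once from <code>main()</code> with distinctive
constant arguments makes the values easy to spot in the caller's
assembly.</p>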
<h3 id="objects">Objects</h3>
<ol type="1">
<li>Explain how data layout, data member access, and method invocation
works in C++ objects. For data layout, how are they kept in memory? How
does C++ keep different fields of an object “together”? For data access,
how does the assembly know which data member to access? We know how
local variables and parameters are accessed (offsets from the base
pointer) – describe how this is done for data fields. For method
invocation, we know how functions are invoked. But what about methods –
how does the assembly know which object it is being called out of?
Remember that assembly is not object oriented.</li>
<li>Describe where data is laid out for a sample C++ class. You should
include at least five data members in your class. Be sure to include
data members of different types (ints, chars, and other user-defined
classes) and different access levels (public and private) in your class.
We are looking for something descriptive here – for example, if you
define a char and an int (total of 5 bytes needed), how is it laid out
in memory?</li>
<li>Next, demonstrate how data members are accessed both from inside a
member function and from outside. In other words, describe what the
assembly code does to access data members in both of these
situations.</li>
<li>Finally, show how public member functions are accessed for your
sample class. How is the “this” pointer implemented? Where is it stored?
When is it accessed? How is it passed to member functions? When (if
ever) is it updated?</li>
</ol>
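<p>For the layout questions, the compiler itself can report sizes and
offsets, which can then be checked against the offsets that appear in
the assembly. The sketch below uses a hypothetical two-member struct (far
smaller than the five-member class asked for above) with a
<code>char</code> followed by an <code>int</code>. The 5 bytes of data
usually occupy more than 5 bytes because the <code>int</code> is aligned,
but the exact numbers depend on the compiler and flags, so print them
rather than assuming them.</p>
<pre><code>#include &lt;cstddef&gt;  // offsetof, std::size_t

// Hypothetical sample type; the names are for illustration only.
struct Sample {
    char letter;
    int number;

    // A member function receives a hidden pointer to the object ("this");
    // the data member is read at a fixed offset from that pointer.
    int getNumber() const { return number; }
};

// Byte offset of 'number' from the start of a Sample object.
std::size_t numberOffset() { return offsetof(Sample, number); }

// Total size of a Sample object, including any padding.
std::size_t sampleSize() { return sizeof(Sample); }
</code></pre>
<p>Printing <code>numberOffset()</code> and <code>sampleSize()</code>
from <code>main()</code>, then locating the same offset in the assembly
for <code>getNumber()</code>, ties the layout to the member access
code.</p>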
<hr />
<h2 id="post-lab-1">Post-lab</h2>
<p>For the post-lab, you should explore, investigate, and understand all
of the items from the in-lab list. Be able to answer “how” and possibly
“why” for each item. Use test cases and the debugger as resources.
Additionally, use resources other than yourself (e.g. books, Web,
people). Be sure to credit these sources. <strong><em>You must use (and
cite!) additional resources for this post-lab!</em></strong></p>
<p>The idea is that you will take what you started in the in-lab, and
flesh it out a bit more for the post-lab.</p>
<p>Prepare a report that explains your findings. Follow the guidelines
in the Post-lab Report Guideline section, below. In particular, you
should address how the compiler implements the construct at the machine
and assembly levels, and what led you to this conclusion. You must show
evidence of this behavior in the form of assembly code, C++,
screenshots, memory dumps, manual quotations, output, etc. Where did you
find the information that led to your conclusion (i.e. your
sources)?</p>
<p>Your report should be a PDF file called postlab8.pdf. It must be in
PDF format! See the <a href="../../docs/convert_to_pdf.html">How to
convert a file to PDF</a> page for details about creating a PDF
file.</p>
<h3 id="tips-for-getting-started-on-the-post-lab">Tips for Getting
Started on the Post-lab</h3>
<p>Think about how best to investigate the issues you choose. A good
starting point is to write a small C++ program that illustrates one of
the issues. This program should be as simple as possible.</p>
<p>Look at the assembly code associated with your C++ code. To examine
the disassembled code you have two main options: you can step through
the code in the debugger using the disassembly view, or you can have the
C++ code output to an assembly file (using the <code>-S</code> and
<code>-masm=intel</code> flags), which you can then browse or edit.</p>
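<p>For instance, a program as small as the following sketch is enough
to study how a member function is called. The file name
<code>counter.cpp</code> and the names <code>Counter</code> and
<code>bump</code> are hypothetical; only the <code>-S</code> and
<code>-masm=intel</code> flags come from the text above. It has no
<code>main</code> because <code>-S</code> only needs a source file that
compiles, not a complete program.</p>

```cpp
// counter.cpp (hypothetical file name)
// Generate an Intel-syntax assembly file with the flags described above:
//   clang++ -S -masm=intel counter.cpp
// By default this writes the assembly to counter.s.

class Counter {
public:
    int total = 0;

    // A member function: in the assembly, look for how the address of
    // the object reaches this code and how `total` is read and written.
    void add(int amount) { total += amount; }
};

// A free function that creates an object and calls the member
// function, so that the call site also appears in the assembly.
int bump(int start, int amount) {
    Counter c;
    c.total = start;   // data member accessed from outside the class
    c.add(amount);     // data member accessed from inside the class
    return c.total;
}
```

<p>Keeping the program free of <code>#include</code> lines and I/O
keeps the generated file short, which makes the labels for
<code>add</code> and <code>bump</code> easier to find.</p>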
<p>Generating assembly listings: to generate an assembly listing in
clang++, use the flags described above, and see the wiki page for
details. The most useful listing will probably include both source and
assembly code. For some issues it will be of interest to see the machine
code as well.</p>
<p>A couple of things you will notice almost immediately about these
assembly files are that they can be surprisingly long, and that they
contain a bunch of labels, directives, and instructions that at first
glance appear to have little to do with your original source program.
Don’t despair: with a little perseverance you will be able to make heads
or tails of a good bit of this.</p>
<p>Note that printing out these disassembled files is probably not your
most useful option. You will most likely find that it is significantly
easier to view the files in an editor of your choice, such as emacs. In
this way you can navigate through the file, searching for particular
labels or C++ code. Besides, you may want to make slight modifications
to your C++ code and recompile often anyway.</p>
<p>Still stuck? Some of these issues are non-trivial to figure out.
Remember that you can use basically any resource whatsoever to figure
these things out. There will almost certainly still remain some things
in the disassembled code that you do not fully understand. Don’t let
this paralyze you. Focus on devising experiments that will help you
learn more about the particular issues in lists 1 and 2. By tracing
through some parts of the code and by modifying your C++ code and
comparing the generated assembly code for the two different versions,
you should be able to come up with some reasonably good hypotheses about
what is happening. Seek out books, manuals, and web pages that explain
the issue. Keep in mind that you are required to list your sources in
your post-lab report.</p>
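<p>One way to set up such a comparison, sketched here for the parameter
passing list, is to write two versions of the same tiny function that
differ in exactly one detail. The function names below are hypothetical.
Generate the assembly for both and look at what differs in how the
argument is set up by the caller and used by the callee.</p>

```cpp
// Version 1: the argument is passed by value, so the function works
// on its own copy and the caller's variable is left unchanged.
int add_one_by_value(int n) {
    n += 1;
    return n;
}

// Version 2: the argument is passed by reference, so the function
// modifies the caller's variable directly.
int add_one_by_reference(int& n) {
    n += 1;
    return n;
}

// A caller that uses both versions, so both call sites appear in the
// generated assembly and can be compared side by side.
int call_both() {
    int a = 5;
    int b = 5;
    add_one_by_value(a);      // a is still 5 afterwards
    add_one_by_reference(b);  // b is 6 afterwards
    return a + b;
}
```

<p>Changing only one thing between the two versions makes it much
easier to attribute a difference in the assembly to that change.</p>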
<h3 id="post-lab-report-guidelines">Post-lab Report Guidelines</h3>
<p>You should submit a nicely formatted report that explains your
findings. The report should be a PDF file called postlab8.pdf. At a
minimum your post-lab report should address the items in the in-lab
list. In your report, label the items according to which list item they
came from (parameter passing or objects), and their item number within
that list.</p>
<p>Note that this is supposed to be a polished report. Code snippets
should be embedded into the document, not just printed out on a page by
themselves and added in at the end. Similarly, screen shots (which are
optional) should be embedded in the document. Highlighting portions of
code or drawing arrows between things may help make your explanations
clearer. I would expect the explanation of each item to be a page or two
long at least, including embedded code snippets and screenshots. Keep in
mind that you are only submitting one file for this report:
postlab8.pdf. Thus, everything must be included in that one file.</p>
<p>We are being very careful to not specify the length of the report,
only what we would guess is the estimated length. As long as you
properly answer the questions, the length is up to you. Double versus
single spacing is also up to you, but we prefer single spacing.</p>
<p>Other than your own experiments, feel free to use online x86 assembly
references, C++ books, and resources you may find on the Internet or
elsewhere. <strong>Discussing these issues is allowed; however, remember
that your code and final report must be your own work and that you must
credit ANY resources used.</strong></p>
<h3 id="how-much-are-we-looking-for">How much are we looking for?</h3>
<p>We want you to investigate the particular topic area from the given
list, write code to discover the answers, and learn about this topic on
your own. The questions that we pose are just meant to get you thinking
about the possible ramifications of a given question. They aren’t meant
to be specific questions that necessarily need answering one at a
time.</p>
<p>As with the previous lab, I would expect the explanation of each item
(you have to do two items) to be a page or two long, including embedded
code snippets and screenshots (obviously, we want a reasonable amount of
English text here – if you do a lot of screen shots, then your total
length will be a bit longer). Did you investigate the topic? Did you
write code to discover what you didn’t know? Was this written in a
reasonably readable format? This is what we are looking for.</p>
<p>This is somewhat vague, and purposely so – research is often vague.
If we told you exactly what to write, then there wouldn’t be much
discovery of that on your part, which would defeat the whole point of
this lab.</p>
<p><strong>We are not looking for you to spend hours and hours and hours
on this!</strong> We expect a <em>page or two</em> per list item, and you
have to do two of them, which means your final report should be about
2-4 pages long. Keep in mind that screenshots don’t count much towards
that page count.</p>
<p>The grading will be based on a set of points that we would expect you
to discover when investigating a given topic. Your grade will be based
mostly on how well you hit those points. A small portion of your grade
will be based on the overall report presentation and writing quality
(although this is a computer science class, we expect you to be able to
write in English to some extent!).</p>
</body>
</html>