80x86 Assembler, Part 8

Atrevida Game Programming Tutorial #19
Copyright 1998, Kevin Matz, All Rights Reserved.

Prerequisites:

Chapter 18: 80x86 Assembler, Part 7
Make sure you're comfortable with parameter passing in assembler (if necessary, quickly review Chapter 15, 80x86 Assembler, Part 4)

In this chapter, we'll learn how to interface assembler code with C and C++ programs.

Please note that this discussion deals with combining Turbo Assembler code with Turbo C/C++ or Borland C/C++. Of course, you can use other assemblers and/or other compilers, but the details will differ. I don't know the details, so check your manuals.

Inline assembler

The easiest way to add assembler code to your programs is to use the "asm" construct in Turbo/Borland C/C++. Watch this:

#include <stdio.h>

main ()
{
    char letter = 'F';

    printf ("My favorite letter is... ");

    asm {
        MOV AH, 02
        MOV DL, [letter]
        INT 21h
    }

    printf ("\n");
    return 0;
}

When you specify "asm" and put assembler code in the braces, the assembler code gets assembled by a built-in assembler and the machine code is put directly into the object code. If you want to use only a single assembler instruction, you can specify "asm" without any braces, like this:

asm INT 21h;

asm XOR AX, AX;

Variables in the C/C++ program can be used within the asm statement. The compiler and assembler still obey variable scoping rules, so you can access globally-defined variables, and you can access local variables, but you can't access variables declared locally in other functions. To access the contents of a variable, put square brackets around the variable name, just like in Ideal mode:

int x;
.
.
.
asm MOV [x], AX;

Recall that, in Turbo/Borland C/C++, a char is byte-sized, an int is word-sized, and a long int is doubleword-sized. floats and doubles are stored in nasty and difficult-to-use formats, so it's rare for these variable types to be accessed in an inline assembler block. (There are strategies for simulating floating-point numbers; we'll see them in a later chapter.)

You can even access structure members, like this:

#include <stdio.h>

struct person {
    char name[20 + 1];
    int age;
    int friends;
};

main ()
{
    struct person tom, linda, bob, gerda;

    asm {
        MOV [gerda.age], 25
        INC [linda.friends]
    }
}

When using inline assembler, there are a few things to watch out for:

If you use "asm" with braces, the opening brace must be on the same line as the "asm" statement:
```
asm {
    .
    .
    .
}
```
This will cause an error:
```
asm
{
    .
    .
    .
}
```
Borland put in this "feature" for compatibility with certain C/C++ compilers in the Unix world. Hooray for endorsements of bad standards.
The built-in assembler doesn't necessarily follow TASM Ideal mode standards. (No, that would be too convenient!)
The semicolon character can't be used to signify comments within an "asm" block. The following produces a compiler error:
```
asm {
    MOV AL, 20h                        ; AL = 20h
    OUT 20h, AL                        ; Send 20h to port 20h
}
```
It's still considered C/C++ code to some degree, so we have to use the traditional C comments, "/* like this */", or C++ comments, "// like this":
```
asm {
    MOV AL, 20h                        /* AL = 20h */
    OUT 20h, AL                        // Send 20h to port 20h
}
```
Every time I write inline assembler code, I have the nasty habit of using the semicolon for comments, and of course, the compiler complains every time.
C has a set of special, "sacred" registers that it doesn't want disturbed. These registers are CS, DS, SS, SP, and BP. C expects you to leave these registers as you found them -- in other words, C leaves critical information (such as the location of the code, data, and stack segments) in these registers, and it expects that information to be there when it comes back to use it later.
Now, this doesn't mean you can't use these registers. You are certainly permitted to use them -- but if you modify one of these special registers, you must set that register back to its original value when you're finished. The easiest way to do this is simply to use PUSH and POP to save and restore the registers.
Why do we need to preserve these special registers? Well, when we put inline assembler code into our C/C++ program listings, we're inserting our machine instructions in between the machine instructions produced by the compiler. The compiler has certain standards -- when generating machine code, the compiler expects certain registers to have certain values. The compiler normally puts in code at the start of a program to set up the segment registers, so that, for example, DS points to the data segment that contains the global variables. It then expects DS to remain unchanged throughout the program, so if we were to modify DS, we could imagine that certain nasty things might happen. The compiler relies on the other special registers in the same way, to keep track of the code segment and the stack.
Here's a quick example:
```
asm {
    PUSH DS
    PUSH ES
    PUSH BP

    .
    .
    Modify DS, ES, and BP here...
    .
    .

    POP BP
    POP ES
    POP DS
}
```
Yes, you can use the stack. As always, creating a stack overflow or underflow is not a healthy thing to do. How can you find out how big the stack is? If you're using the Borland/Turbo C/C++ IDE (Integrated Development Environment -- basically, the editor that came with your compiler), the stack size can normally be adjusted in one of the Options menus. If you're using the compiler from the command line, you can change the stack size with a command-line switch.
Of course, you must leave the stack in the same condition you found it. Given what we learned about parameter passing in previous chapters, you might recognize that the following might cause some problems:
```
void DoSomething (void);

main ()
{
    DoSomething ();

    .
    .
    .

    return 0;
}

void DoSomething ()
{
    asm {
        PUSH AX
        PUSH AX
        PUSH AX
    }

    return;
}
```
Recall that the return address is stored on the stack. If we put some junk data on the stack within the function, as in the above example, then when the function returns, that junk data will be used as the return address.
Basically, just follow safe stack guidelines -- leave the stack in the same condition in which you found it!
The built-in assembler doesn't handle macros.
You can't put a label inside an "asm" block. For example, this won't work:
```
asm {
    XOR AX, AX
    MOV CX, 10

LoopLabel:                  /* Built-in assembler doesn't like this! */
    ADD AX, CX
    LOOP LoopLabel
}
```
However, C lets you declare labels in your code, so that you can use C's goto statement. (You don't see labels very often in C code, because most people try to avoid gotos.) But the built-in assembler recognizes C labels, so you can do something like this:
```
asm {
    XOR AX, AX
    MOV CX, 10
}

LoopLabel:

asm {
    ADD AX, CX
    LOOP LoopLabel
}
```
Yes, that's ugly, but it works. (And of course it's ugly. It's C.)

Inline assembler is perfect for many tasks. It is much, much easier to use than the other method (assembling .ASM files to produce .OBJ files, and linking those .OBJ files with the .OBJ files created by the C/C++ compiler). As far as I can tell, inline assembler is quite popular. Apparently the excellent demo Crystal Dreams 2 by Triton was written in Borland Pascal, with virtually all of the code in inline assembler. I think that's quite clever -- you can leave some of the messy details, such as parameter passing and variable scoping, to the high-level language. (However, the Triton fellows did complain about the lack of support for macros -- see the credits at the end of Crystal Dreams 2.)

External assembly -- combining assembler with C

Inline assembler is useful for many purposes, but sometimes we need a heavy-duty solution. Using "external assembly", we can combine C and assembler by doing this: we write a bunch of assembler procedures, put them into a .ASM file, and assemble that file to get an .OBJ file. Then we link that .OBJ file with the .OBJ files that are generated by compiling an associated C program, and we get the final .EXE program.

Each source code file will be called a module. So if a program consists of the main file PROGRAM.C, and then two assembler files SUPPORT1.ASM and SUPPORT2.ASM, then the program will consist of three modules -- one C module (PROGRAM.C), and two assembler modules (SUPPORT1.ASM and SUPPORT2.ASM).

Our goal is to be able to write a procedure in assembler, and then call it from a C program, just as if it was a normal function written in C. In order to do this, we have to follow certain rules and conventions in our assembler procedures. We need to emulate a C function -- we need to do everything that a C function does -- so that we can trick C into thinking that the assembler procedure is a C function. We'll also need to add some import/export declarations to our code, so that the assembler, C compiler, and linker can cooperate.

The first thing we must do is ensure that the memory models for our assembler and C modules are the same. In assembler, that means the "small" in the "MODEL small" directive we've used in just about every assembler program. I personally use the large memory model for all my C programs, so the examples in this tutorial will also use the large memory model. So, with Turbo C/C++, if you're using the IDE, you can go to the Options pull-down menu, select Compiler, then Code Generation, and then select a memory model. (If you're using the command-line compiler, there are switches to specify the memory model -- we'll see these later.) If you're using the IDE for Borland C++ 4.5 for Windows, go to the Options menu, select Project, select 16-bit Compiler, select Memory Model, and then choose a memory model from the list.

Let's start with a short and simple C program:

#include <stdio.h>

void DisplayMessage (void);            /* (function prototype) */

main ()
{
    DisplayMessage ();

    return 0;
}

void DisplayMessage ()
{
    printf ("This is a test message, from C.\n");
}

Now, what if we wanted to write an equivalent of DisplayMessage() in assembler? (That is, just plain normal assembler, where we put our code into an .ASM file.) Well, that's reasonably easy to do... First, we'd need to declare the string -- let's make it say something different -- in the data segment...

    DATASEG

AssemblerMessage                  DB   "Hello, from Assembler!", 13, 10, '$'

Then...

    CODESEG

PROC DisplayMessage

    ; Display the AssemblerMessage string using INT 21h, Service 9:
    MOV DX, OFFSET AssemblerMessage
    MOV AH, 9
    INT 21h

    RET
ENDP DisplayMessage

This is a rather basic procedure -- it doesn't use any parameters or local variables, it doesn't return a value, and it doesn't disturb any of C's "sacred" registers -- CS, DS, SS, SP, and BP. In fact, this assembler procedure is almost ready to be integrated with a C program.

Let's start learning how to convert the DisplayMessage procedure to a procedure that is compatible with C. But first, we must learn how to share global variables between an assembler module and a C module.

Sharing global variables

There are two ways to share global variables between a C module and an assembler module:

C-to-assembler:
- Declare a global variable in the C module
- Import the global variable in the assembler module, using "EXTRN"

Assembler-to-C:
- Declare a global variable in the data segment of the assembler module
- Tell the assembler that we want to export (or "make public") this global variable
- Import the global variable in the C program

Let's see how the first option, the C-to-assembler version, works.

Let's declare these global variables in a C program:

unsigned char letter_grade = 'A';
int number_of_cats = 235;
long int cash;

They can be pre-initialized, as in the number_of_cats and letter_grade cases, or not, as in the case of cash.

Then, in the assembler module, we need to import the global variable. Of course, we do this in the data segment. We use a directive called "EXTRN" (meaning, external): we write EXTRN, then the name of the variable we wish to import, then a colon, and then we specify either BYTE, WORD, or DWORD, depending on the size of the variable:

    DATASEG

EXTRN _letter_grade : BYTE
EXTRN _number_of_cats : WORD
EXTRN _cash : DWORD

Okay, wait a minute! There appears to be an extra underscore at the start of each variable name!

Yes, there's one important catch: you have to add an underline character (underscore) to the front of the variable name in assembler. That becomes part of the variable name (in the assembler module), so if you want to access one of the variables, you must use the name with the underscore. For example:

    MOV [_letter_grade], DH
    INC [_number_of_cats]

(This is due to a C convention that says that the names of "global symbols" must have leading underscores. So we have to play C's game here.)

That's all there is to it -- after you do those steps, you should be able to access those variables in your assembler module.

Now let's see how the assembler-to-C sharing method works.

In the data segment of our assembler program, we declare variables the same way we normally do. However, we again need to start the names of sharable variables with a single underscore. So, for example:

    DATASEG

_age                              DB   5
_number_of_dogs                   DW   99
_thirtyTwoBitChecksum             DD   ?

Now, just below this, we need to inform the assembler that we want to export these global variables. To export variables, or, in other words, to make them "public", we write the "PUBLIC" directive, and then the name of the variable, like this:

PUBLIC _age
PUBLIC _number_of_dogs
PUBLIC _thirtyTwoBitChecksum

Then, finally, in our C program, we can import these variables. The variable names in C don't have the underscores at the front. Importing variables looks basically like declaring variables, except you use the C keyword "extern" at the start, like this:

extern unsigned char age;
extern int number_of_dogs;
extern long int thirtyTwoBitChecksum;

And yes, that's "extern", six letters, as opposed to "EXTRN", five letters, in the assembler form. (Yet another thing to remember...)

Now you should be able to use those global variables in your C program just as if they were declared in C.

Don't worry, an example program will be coming up shortly.

A few things to keep in mind:

Note that, when we share variables between modules, we're not creating two separate variables. Space is allocated for the variable whenever and wherever it is declared; sharing the variable means that we're allowing the compiler/assembler/linker to give the address of that variable to other modules, so the compiler/assembler/linker handling that other module can plug in that address whenever the shared variable is accessed.
Shared variables are exactly that -- shared. That means that if you set a shared variable x to 35 in your C program, and then you call an assembler procedure that accesses x (or actually, _x, in assembler), the assembler procedure will know that _x equals 35. If the assembler procedure then sets _x to 47 and returns to the C program, and then the C program reads x, it will be 47.
So if you were worried that you needed to issue commands to somehow "copy" the contents of global variables from one module to another -- don't worry, you don't.

If you forgot the sizes of different variables in Turbo/Borland C/C++, here they are. I've created two semi-convenient tables:

C                                 ---->  Assembler
------------------------------------------------------
char                                     BYTE
unsigned char                            BYTE
signed char                              BYTE
int                                      WORD
unsigned int                             WORD
signed int                               WORD
short int                                WORD
unsigned short int                       WORD
signed short int                         WORD
long int                                 DWORD
unsigned long int                        DWORD
signed long int                          DWORD
float, double (floating-point types)     (too hard -- don't bother!)
near pointers (offset only)              WORD
far pointers (segment and offset)        DWORD

Assembler    ---->  C
----------------------------------------
BYTE                char
                    unsigned char
                    signed char
WORD                int
                    unsigned int
                    signed int
                    short int
                    unsigned short int
                    signed short int
                    near pointers (offset only)
DWORD               long int
                    unsigned long int
                    signed long int
                    far pointers (segment and offset)

Parameterless, return-value-less, C-compatible assembler procedures

Now that we know how to share global variables, we can learn how to share code -- that is, assembler procedures.

When converting assembler procedures to be C-compatible, the simplest type of assembler procedure has these properties:

it doesn't take any parameters
it doesn't return a value
it doesn't access C's "sacred" registers (CS, DS, SS, SP, and BP).

In fact, when this is the case, "converting" an assembler procedure, and getting C to recognize the procedure, are pretty simple. All we need to do is:

change the name of the procedure, so that it has an underscore at the front
ensure that the memory models for the assembler and C programs are the same (I prefer the large memory model); then, optionally, you can add a NEAR or FAR "tag" on the PROC definition line (if you do this, the "tag" must match the memory model -- FAR for medium, large, and huge, and NEAR for tiny, small, and compact)
tell the assembler to make the procedure public, using PUBLIC
in the C program, "import" the external procedure as a C function, by adding a function prototype that uses the "extern" keyword

So, our original procedure looked like this (plus the data in the data segment, as shown below):

    DATASEG

AssemblerMessage                  DB   "Hello, from Assembler!", 13, 10, '$'

    CODESEG

PROC DisplayMessage

    ; Display the AssemblerMessage string using INT 21h, Service 9:
    MOV DX, OFFSET AssemblerMessage
    MOV AH, 9
    INT 21h

    RET
ENDP DisplayMessage

Now, we just need to add an underscore to the name of the procedure, so now it's _DisplayMessage. (Good news -- you can still call this procedure from elsewhere in the same assembler module -- just say "CALL _DisplayMessage".)

Then, if we wanted, we could put a "FAR" tag at the end of the "PROC _DisplayMessage" line, so it would look like this: "PROC _DisplayMessage FAR". But the default is already FAR if we're using the large memory model, so we don't need to bother with it.

Then, right at the start of the code segment, we can put this line:

PUBLIC _DisplayMessage

That will export the procedure (make it public).

So now we have:

    DATASEG

AssemblerMessage                  DB   "Hello, from Assembler!", 13, 10, '$'

    CODESEG

PUBLIC _DisplayMessage

PROC _DisplayMessage FAR

    ; Display the AssemblerMessage string using INT 21h, Service 9:
    MOV DX, OFFSET AssemblerMessage
    MOV AH, 9
    INT 21h

    RET
ENDP _DisplayMessage

And now, what must we do in order to be able to call this assembler procedure from our C program? Well, we need to put a function prototype in the C program listing. In this case, it will look like this:

extern void DisplayMessage (void);

The "extern" tells C that the function is not in the current C code listing; it's somewhere else. When the C compiler sees the "extern", it says, "okay, I'll let the linker handle that". Then the "void" just means that the function won't return a value. The name of the function is "DisplayMessage" -- as before, the leading underscore gets dropped in C. Then the "(void)" indicates that the function takes no parameters.

Once you've got that, you should be able to call DisplayMessage() just as if it was a normal C function! Let's put this code into actual assembler and C file listings, so we can see exactly how everything fits together:

------- MULTI1.C begins -------

/* MULTI1.C: "C" portion of the first multi-language demo

   This is the main module.  MULTI1_A.ASM must be assembled and linked
   with this module to create the final program.
*/

#include <stdio.h>

extern void DisplayMessage (void);

main ()
{
    printf ("This is C speaking...\n");
    DisplayMessage ();
    printf ("This is C speaking again...\n");

    return 0;
}

------- MULTI1.C ends -------

------- MULTI1_A.ASM begins -------

%TITLE "MULTI1_A.ASM: Assembler portion of the first multi-language demo"

    IDEAL

    MODEL large
    ; (No stack)
    LOCALS

    DATASEG

AssemblerMessage                  DB   "Hello, from Assembler!", 13, 10, '$'

    CODESEG

PUBLIC _DisplayMessage

; -------------------------------------------------------------------------
PROC _DisplayMessage FAR

    ; Display the AssemblerMessage string using INT 21h, Service 9:
    MOV DX, OFFSET AssemblerMessage
    MOV AH, 9
    INT 21h

    RET
ENDP _DisplayMessage
; -------------------------------------------------------------------------

END

------- MULTI1_A.ASM ends -------

I'll show you how to compile, assemble, and link these files into an executable program in just a moment.

In the MULTI1.C file, there aren't any big surprises -- there's just the "extern" prototype declaration, and then the DisplayMessage() function is called inside the main() function.

Now, the MULTI1_A.ASM file looks pretty much like the standard "template" we've been using for assembler files. There's the optional %TITLE line, and then the IDEAL directive to use TASM's Ideal mode. Then we specify the memory model.

For assembler modules that are going to be linked in with C modules, you don't need to specify a stack size. The assembler module will simply use the stack provided by the C compiler. Normally this stack is about 4K in size; again, if you need to change this, you can change the settings on your C compiler.

Then I included the LOCALS directive, which is optional (recall that it allows you to use local labels, with a "@@" prefix).

Now we come across the data segment. We've declared a variable, or actually, an array of bytes. The name, AssemblerMessage, does not need an underscore at the front because it is not going to be shared with the C module. (We'll play with variables in the next example program.)

Then in the code segment, we have the PUBLIC declaration that makes the _DisplayMessage procedure visible to the C module. And then we plug in the definition of the _DisplayMessage procedure, and a final END to end the listing.

Compiling, Assembling and Linking

Let's figure out how to get this pair of modules to compile, assemble, and link, so we can actually run the program.

It's easiest to do if you're using an IDE. The Borland C/C++ and Turbo C/C++ IDEs can handle projects (which are kind of like makefiles (ick) in Unix, but nicer), and they can automatically compile C code and assemble .ASM files (using TASM, assuming you have it). Here's how to do it using the Turbo C++ 3.0 IDE:

Start up the Turbo C++ IDE (if you're in DOS, enter "TC")
Go to the Project menu and select Open Project
Enter a filename such as "MULTI1.PRJ" -- the 8-character filename part will be the name given to the final executable, so, for example, "MULTI1.PRJ" will eventually yield the file "MULTI1.EXE".
When the project window pops up, press the Insert key to add a file
Select the main module first. In this case, it's the C module, "MULTI1.C".
(If you're back in the project window again, press Insert again.) Now select the other module, "MULTI1_A.ASM".
If you're still in the file selector, hit Escape to return to the project window.
You can work on these files by selecting them in the project window -- new editor windows should pop up to let you edit the files.
Before compiling, make sure you have the correct memory model set; go to the Options menu, select Compiler, then Code Generation, and then, in this case, select Large.
To compile the program and run it, either press Ctrl-F9, or go to the Run menu and select Run. The IDE will automatically compile and assemble all of the files in the project list, and it will link them to produce the final .EXE file, which is then executed.

(Note that, if you later edit one file and then recompile, the IDE is smart enough to know not to recompile/reassemble any source code files that haven't been updated (it does this by checking the date and timestamps on the .OBJ files). This can save a lot of time if you have a giant project.)

For the IDE for Borland C++ for Windows (version 4.5), the above process is very similar. Select New Project instead of Open Project when creating a new project. And to compile and run, choose one of the options under the Debug menu. (Of course, if you execute your program under Windows, it will run a bit slower than it would under just plain DOS.)

If you use the IDE, make sure that the directories are configured correctly, and, if necessary, check to see that TASM's directory is in your PATH in AUTOEXEC.BAT. If you get error messages, it may be due to the fact that the IDE is trying to run TASM, but can't find it. Check that the setup is correct (in TC++ 3.0, look at the Transfer and Directories options under the Options menu).

Now, what if you want to compile, assemble and link from the command line? Normally you would have to run tasm to assemble your assembler file into an object file, and then you would run tcc (for Turbo C++) or bcc (for Borland C++) to compile your C program into an object file, and then you would use tlink to join the object files into an executable. The only trouble is that there are all sorts of command-line parameters to memorize (which is difficult since, for example, the "-ml" option for tasm means something completely different from the "-ml" option for tcc and bcc!)

Actually, the Turbo Assembler manual (and, for that matter, Tom Swan's book Mastering Turbo Assembler) describes a clever, simple way to use tcc (for Turbo C++) or bcc (for Borland C++) to handle all the compiling/assembling and linking, like this:

tcc -ml multi1.c multi1_a.asm

-or-

bcc -ml multi1.c multi1_a.asm

(The "-m" switch lets you specify the memory model; "l" is the code for the large memory model.)

The only problem with that is that all the files are recompiled/reassembled, even if they haven't been modified and their .OBJ files are still fresh and intact. But at least it's easy to remember, compared to the "painful" way.

Whatever method you choose, you should end up with a .EXE file -- in this case, MULTI1.EXE. Run it and compare it with the program listings to check that it works.

Why the different filenames -- MULTI1.C and MULTI1_A.ASM?

Why do the files have the names MULTI1.C and MULTI1_A.ASM? Wouldn't the names MULTI1.C and MULTI1.ASM make more sense?

Good questions. The names are different for two reasons:

If you compile MULTI1.C, you get MULTI1.OBJ. If you were to then assemble MULTI1.ASM, you'd get MULTI1.OBJ -- overwriting the first one!
When tcc and bcc compile a C program, they basically generate an assembler source code listing as part of the compilation process. tcc and bcc have a command-line switch, "-S", which will output a file containing the assembler "translation" of your C program. The name of the output file is the same as the C program file, but with a .ASM extension instead of .C. So, for example, if you had a Hello World program called HELLO.C, and then you did this...

tcc -S hello.c

-or-

bcc -S hello.c

...then a file called HELLO.ASM would be generated. (Try it on one of your C programs and have a look at the assembler file! It will have a few more or less unreadable parts, but you can see a lot of your program's action expressed in assembler syntax!)

So, of course, if we were to use MULTI1.C and MULTI1.ASM as the names for our files, if we were to then use the "-S" option to generate an assembler listing, the compiler would output the assembler listing to MULTI1.ASM -- overwriting the "hand-written" MULTI1.ASM! So, for these reasons, it's a good idea to use different filenames for your C and assembler module source code files.

An example using global variable sharing

Let's try another example where we finally get to practice the global variable sharing techniques.

------- MULTI2.C begins -------

/* MULTI2.C: "C" portion of the second multi-language demo

   This is the main module.  MULTI2_A.ASM must be assembled and linked
   with this module to create the final program.
*/

#include <stdio.h>


/* Declare global data to be shared between this module and the
   assembler module: */
char FavoriteLetter = 'W';

/* Import the FavoriteNumber variable from the assembler module: */
extern int FavoriteNumber;


/* Import the assembler procedure DisplayGlobalCharacter(): */
extern void DisplayGlobalCharacter (void);


main ()
{
    printf ("My favorite letter is: ");
    DisplayGlobalCharacter ();
    printf ("\n");

    printf ("My favorite number is: %d\n", FavoriteNumber);

    return 0;
}

------- MULTI2.C ends -------

Then:

------- MULTI2_A.ASM begins -------

%TITLE "MULTI2_A.ASM: Assembler portion of the second multi-language demo"

    IDEAL

    MODEL large
    ; (No stack)
    LOCALS

    DATASEG

_FavoriteNumber                    DW   144

; Export the _FavoriteNumber variable:
PUBLIC _FavoriteNumber

; Import the variable FavoriteLetter from the C module:
EXTRN _FavoriteLetter:BYTE

    CODESEG

; Export the procedure _DisplayGlobalCharacter:
PUBLIC _DisplayGlobalCharacter

; -------------------------------------------------------------------------
PROC _DisplayGlobalCharacter FAR

    ; Display the character _FavoriteLetter using INT 21h, Service 2:
    MOV DL, [_FavoriteLetter]
    MOV AH, 2
    INT 21h

    RET
ENDP _DisplayGlobalCharacter
; -------------------------------------------------------------------------

END

------- MULTI2_A.ASM ends -------

Follow the same steps to get an executable. Using the IDE is probably the easiest way to do it, but if you're at the command line, use:

tcc -ml multi2.c multi2_a.asm

-or-

bcc -ml multi2.c multi2_a.asm

Parameter Passing

How does C handle parameter passing? Well, it uses the stack, in basically the same way as we examined in a previous chapter. The parameters are pushed onto the stack. Let's look at the similarities and minor differences:

If you remember back to Chapter 15, "80x86 Assembler, Part 4: Procedures, Parameter Passing, and Local Variables", the example program TEST10.ASM contained a procedure called DrawBlockOfCharacters, which looked like this (the actual "guts" of the procedure have been cut out since we're not interested in them here):

PROC DrawBlockOfCharacters
    ARG @@Height:WORD, @@Width:WORD, @@Character:BYTE = @@ArgBytesUsed
    LOCAL @@x:WORD, @@y:WORD = @@LocalBytesUsed

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; Allow params. to be addressed
    SUB SP, @@LocalBytesUsed           ; Reserve space for local vars.

    .
    .  ("guts" have been removed)
    .

    ADD SP, @@LocalBytesUsed           ; "De-allocate" local variables'
                                       ;  space
    POP BP                             ; Restore BP
    RET @@ArgBytesUsed
ENDP DrawBlockOfCharacters

So we can see that the parameters are called @@Height, @@Width, and @@Character. Then, when we want to call this procedure, we push our desired parameters onto the stack. In what order do we push the parameters -- that is, do we go left-to-right, pushing @@Height first and @@Character last, or do we go right-to-left, pushing @@Character first and @@Height last? The answer is: we go right to left, like this:

    PUSH '*'                           ; Character
    PUSH 40                            ; Width
    PUSH 6                             ; Height
    CALL DrawBlockOfCharacters         ; Draw the block.

Now, you'll be pleased to know that C also uses the right-to-left scheme. So, if we had this function prototype in C...

void DoSomething (long int alpha, char beta, int gamma);

...and then if we used this function call...

DoSomething (35, 'T', 0xABCD);

...then, since C pushes parameters on the stack in right-to-left order, C will push the gamma parameter, that is, 0xABCD, onto the stack first. Then the beta parameter, 'T', will get pushed onto the stack next (but you can't push a single byte onto the stack, so it has to be "promoted" to a word). Then the alpha parameter, 35, gets pushed onto the stack; since it is a long int, equivalent to a double word, it is pushed onto the stack as two words.

How would that actually look if it was done in assembler? Well, I put the above DoSomething() call into a very short C program and ran it through tcc/bcc using the "-S" option. The call turned into this in assembler:

   ;
   ;	    DoSomething (35, 'T', 0xABCD);
   ;	
	push	-21555
	push	84
	push	0
	push	35
        call    near ptr _DoSomething
        add     sp,8

Now we can see how the parameter ordering goes -- the first PUSH instruction pushes -21555 dec onto the stack; -21555 dec is equivalent to 0xABCD. Then the 'T' has the ASCII code 84. Then the long int 35 is broken into two words and pushed onto the stack in such a way that little-endian order (oh boy) is preserved.

Let's see that on a stack diagram:

       One word     One byte
      |<------->|   |<-->|
                                                              (Initial SP)
   SS:0000                    SP                                      .
      |                       |                                       .
     \|/                     \|/------alpha------- --beta--- --gamma--.
      *----+----*...*----+----*----+----*----+----*----+----*----+----*...
      |    |    |   |    |    | 23 | 00 | 00 | 00 | 54 | 00 | CD | AB | hex
Off-  *----+----*...*----+----*----+----*----+----*----+----*----+----*...
sets:  0000 0001     n    n+1  n+2  n+3  n+4  n+5  n+6  n+7  n+8  n+9

                    -------------------------------->         Bottom of
                          Increasing addresses                stack is
                                                             somewhere to
                    <--------------------------------         the right
                     Stack grows downward (this way)           ------>

Now it can be seen more clearly that the 35 dec parameter (00000023 hex) is in little-endian order.
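If you'd like to check that byte layout without firing up a debugger, here's a minimal sketch in portable C that models a tiny 16-bit stack and pushes the three arguments right to left. The names Stack16, stack_init, push_word and push_do_something_args are made up for this illustration, and the 16-byte stack size is arbitrary; it's a model of what the PUSH instructions do, not compiler output.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 16u

/* A toy model of a small 16-bit stack (all names are hypothetical). */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    unsigned sp;               /* offset of the current top of stack */
} Stack16;

static void stack_init(Stack16 *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = STACK_BYTES;       /* empty stack: SP starts past the end */
}

/* PUSH word: SP drops by 2, and the low byte goes at the lower address. */
static void push_word(Stack16 *s, uint16_t w)
{
    assert(s->sp >= 2);        /* refuse to overflow the toy stack */
    s->sp -= 2;
    s->mem[s->sp]     = (uint8_t)(w & 0xFFu);
    s->mem[s->sp + 1] = (uint8_t)(w >> 8);
}

/* Push the arguments of DoSomething(alpha, beta, gamma) right to left. */
static void push_do_something_args(Stack16 *s, uint32_t alpha,
                                   unsigned char beta, uint16_t gamma)
{
    push_word(s, gamma);                        /* int: one word         */
    push_word(s, beta);                         /* char promoted to word */
    push_word(s, (uint16_t)(alpha >> 16));      /* long: high word first */
    push_word(s, (uint16_t)(alpha & 0xFFFFu));  /* ...then the low word  */
}
```

After stack_init() and push_do_something_args(&s, 35, 'T', 0xABCD), the eight bytes starting at s.sp are 23 00 00 00 54 00 CD AB (hex): alpha's low word sits at the lowest address, then alpha's high word, then beta, then gamma.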

Great, so now that we know that C uses the right-to-left scheme, we can use the same parameter passing technique we've used before. So, if we wanted to write an assembler version of DoSomething(), we could use this template:

; void DoSomething (long int alpha, char beta, int gamma);

PROC _DoSomething
    ARG @@alpha:DWORD, @@beta:BYTE, @@gamma:WORD = @@ArgBytesUsed

    ; Specify LOCAL variables here, for example:
    ; LOCAL @@x:WORD, @@y:WORD = @@LocalBytesUsed

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; Allow params. to be addressed

    ; If local variables are used, uncomment the following line:
    ; SUB SP, @@LocalBytesUsed         ; Reserve space for local vars.

    ; If you use any critical registers (CS, DS, SS, SP, BP), save them
    ;  here by PUSHing them onto the stack.  (SI and DI are worth saving
    ;  too if the C caller might keep register variables in them.)

    ;
    ; (put your code here)
    ;

    ; Restore any saved registers here, using POP.

    ; If local variables are used, uncomment the following lines:
    ; ADD SP, @@LocalBytesUsed         ; "De-allocate" local variables'
    ;                                  ;  space

    POP BP                             ; Restore BP
    RET  ; IMPORTANT: Do not put a value such as @@ArgBytesUsed here!
         ; C will clean up the stack by itself!
ENDP _DoSomething

Did you notice the RET instruction at the end? Normally we would say "RET @@ArgBytesUsed" so that, after the function ends, the stack pointer SP will be adjusted so that the parameters will no longer be considered to be on the stack. But it works differently in C -- here's the minor difference!

Take a look at the assembler code generated by the C compiler again:

   ;
   ;	    DoSomething (35, 'T', 0xABCD);
   ;	
	push	-21555
	push	84
	push	0
	push	35
        call    near ptr _DoSomething
        add     sp,8

Notice the "ADD SP, 8" after the call to the procedure? Well, that's C "cleaning up" the stack. PUSH is used four times, so that's eight bytes worth of parameters on the stack. C plugs in an "ADD SP, ___" instruction and fills in the number of bytes, in this case, eight, so that the stack pointer points beyond the parameters -- thus cleaning up the stack after the procedure call.

So, the big important thing to remember then is: don't put any value after the "RET" instruction if your procedure is to be called from C!

Okay, but what if you want your procedure to be callable from both C and assembler? What then? Well, we have to be compatible with C at all costs, so we must use the plain "RET". So when we use assembler to call this procedure, we have to do what C does -- use the "ADD SP, ___" instruction to clean up the stack.

If you don't like the "ADD SP, ___" method, you can always use a list of POP instructions. Using POP four times will remove eight bytes worth of parameters, for example. Of course, that's a little slower.
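To see why the caller and the procedure have to agree on who cleans up, here's a small sketch in C that tracks nothing but the value of SP through a near call. The function name sp_after_call and its parameters are invented for this illustration; it's bookkeeping under the assumption of a near CALL (a two-byte return address), not a model of any particular compiler.

```c
#include <assert.h>

/* Toy model: follow SP (in bytes) through one procedure call.
   The caller pushes n_words word-sized parameters and does a near CALL;
   the procedure ends with "RET ret_bytes" (0 for a plain RET); the
   caller then executes "ADD SP, add_bytes" (0 if it does no cleanup).
   Returns the final value of SP. */
static int sp_after_call(int sp, int n_words, int ret_bytes, int add_bytes)
{
    assert(n_words >= 0 && ret_bytes >= 0 && add_bytes >= 0);
    sp -= 2 * n_words;   /* PUSH, n_words times              */
    sp -= 2;             /* CALL pushes the return address   */
    sp += 2;             /* RET pops the return address      */
    sp += ret_bytes;     /* extra bytes removed by "RET n"   */
    sp += add_bytes;     /* caller's "ADD SP, n"             */
    return sp;
}
```

Starting from SP = 0100h with four pushed words, sp_after_call(0x100, 4, 0, 8) (plain RET plus ADD SP, 8) and sp_after_call(0x100, 4, 8, 0) (RET 8, no ADD) both end at 0100h. If both sides clean up, sp_after_call(0x100, 4, 8, 8) ends at 0108h, and if neither does, sp_after_call(0x100, 4, 0, 0) ends at 00F8h. In both of those cases the stack is left unbalanced.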

In your comments associated with each procedure, it's a very, very good idea to specify whether each procedure requires the caller to do any cleaning up. That way, as long as you check the comments before you use a function, you'll be able to avoid some very hard-to-find, and potentially system-crashing, errors!

Returning values

C functions can return values -- for example, this function...

int CalculateAverage (int a, int b, int c);

...will return a value of type int.

When we write assembler procedures that masquerade as C functions, we can get those assembler procedures to return values. We just have to do it exactly the way C does it. And here's how C does it:

When a procedure or function returns and C (or at least the code generated by the C compiler) regains control, immediately after the stack is adjusted using "ADD SP, ___", if the procedure or function returned a value, C picks that returned value out of the appropriate register or registers.

The location of the returned value depends on what type the returned value is. If the procedure or function is returning a char (signed or unsigned), C picks the value out of AL. That means that if you want to return a char, you must put the value to return in AL in your assembler procedure. (The "return" statement in C basically puts the return value into the correct location and then executes the assembler RET instruction.)

So for chars, we must put the return value in AL. What about the other types? Well, here's the chart:

To return a...                     ...put the return value in:
--------------------------------------------------------------------------
char                                  AL  *
unsigned char                         AL  *
signed char                           AL  *

short int                             AX
short unsigned int                    AX
short signed int                      AX

int                                   AX
unsigned int                          AX
signed int                            AX

long int                           |  Place the most-significant word
long unsigned int                  |  in DX, and place the least-
long signed int                    |  significant word in AX

enum (int)                            AX

float, double, long double, etc.      (Don't bother)

near pointer (offset only)            AX
far pointer (segment:offset)          DX:AX (segment in DX, offset in AX)
--------------------------------------------------------------------------
(*) The TASM 5.0 User's Guide says AX, but chars obviously go in AL.

So, if you want to return a long int, put the most-significant word in DX, and put the least-significant word in AX, and then return! That's it -- C will pick up the return value.

To emulate void functions, which don't return any values, you don't need to do anything in particular.
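The DX:AX split for long ints is easy to get backwards, so here's a minimal sketch in C of both sides of the handshake. RegPair, return_long, read_long, read_char and check_round_trip are hypothetical names used only here; the struct simply stands in for the two registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register pair, modelling DX and AX after a RET. */
typedef struct {
    uint16_t dx;   /* most-significant word  */
    uint16_t ax;   /* least-significant word */
} RegPair;

/* What an assembler procedure must do to return a long int. */
static RegPair return_long(int32_t value)
{
    RegPair r;
    uint32_t bits = (uint32_t)value;
    r.dx = (uint16_t)(bits >> 16);
    r.ax = (uint16_t)(bits & 0xFFFFu);
    return r;
}

/* What the C caller does with DX:AX afterwards. */
static int32_t read_long(RegPair r)
{
    uint32_t bits = ((uint32_t)r.dx << 16) | r.ax;
    return (int32_t)bits;
}

/* A char result only occupies AL, the low byte of AX. */
static uint8_t read_char(RegPair r)
{
    return (uint8_t)(r.ax & 0xFFu);
}

/* Splitting a value and reading it back should give the same value. */
static void check_round_trip(int32_t value)
{
    assert(read_long(return_long(value)) == value);
}
```

For example, return_long(0x12345678) leaves 1234h in dx and 5678h in ax, and return_long(-2) leaves FFFFh in dx and FFFEh in ax.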

Let's try an example now that demonstrates both parameter passing and returning values!

Example demonstrating parameter passing and returning values

Here we go:

------- MULTI3.C begins -------

/* MULTI3.C: "C" portion of the third multi-language demo

   This is the main module.  MULTI3_A.ASM must be assembled and linked
   with this module to create the final program.
*/

#include <stdio.h>


/* Import the assembler procedure AddTwoNumbers(): */
extern int AddTwoNumbers (int alpha, int beta);


main ()
{
    int first, second;

    printf ("Enter an integer: ");
    scanf ("%d", &first);
    printf ("Enter another integer: ");
    scanf ("%d", &second);

    printf ("The sum of %d and %d is ", first, second);
    printf ("%d.\n", AddTwoNumbers(first, second));

    return 0;
}

------- MULTI3.C ends -------

And:

------- MULTI3_A.ASM begins -------

%TITLE "MULTI3_A.ASM: Assembler portion of the third multi-language demo"

    IDEAL

    MODEL large
    ; (No stack)
    LOCALS

    DATASEG

    CODESEG

; Export the procedure _AddTwoNumbers:
PUBLIC _AddTwoNumbers


; -------------------------------------------------------------------------
; int AddTwoNumbers (int alpha, int beta);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Adds two integer (word) values and returns the result.
;  Pre: In assembler, push two words onto the stack and call this
;       procedure.
;       In C, pass two ints as parameters.
; Post: In assembler, the result will be returned in AX.  YOU MUST CLEAN
;       UP THE STACK AFTER USING THIS PROCEDURE -- use "ADD SP, 4".
;       In C, an int result will be returned.  C will automatically
;       clean up the stack.
;       The "sacred" C registers (CS, DS, SS, SP and BP) are not affected.
;       Flags _are_ affected -- this allows you to check for conditions
;       such as wraparound.  No registers other than AX are affected.
; -------------------------------------------------------------------------
PROC _AddTwoNumbers FAR
    ARG @@Alpha:WORD, @@Beta:WORD = @@ArgBytesUsed
    ; No local variables needed here...

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; BP = SP to make params addressable

    MOV AX, [@@Alpha]
    ADD AX, [@@Beta]

    POP BP                             ; Restore BP
    RET  ; No, don't put @@ArgBytesUsed here!
ENDP _AddTwoNumbers
; -------------------------------------------------------------------------

END

------- MULTI3_A.ASM ends -------
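The header comment of _AddTwoNumbers mentions that the flags are left alone so that a caller can check for wraparound. Here's a rough model in C of what that 16-bit ADD produces. AddResult, add_two_numbers_16 and check_add_model are names invented for this sketch, and only the carry and overflow flags are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result record: AX plus the two flags of interest. */
typedef struct {
    uint16_t ax;        /* 16-bit sum, as left in AX            */
    int      carry;     /* CF: the unsigned result did not fit  */
    int      overflow;  /* OF: the signed result did not fit    */
} AddResult;

/* Model of "MOV AX, alpha" followed by "ADD AX, beta". */
static AddResult add_two_numbers_16(uint16_t alpha, uint16_t beta)
{
    AddResult r;
    uint32_t wide = (uint32_t)alpha + beta;

    r.ax = (uint16_t)wide;            /* only 16 bits survive in AX */
    r.carry = wide > 0xFFFFu;
    /* Signed overflow: both operands share a sign the result lacks. */
    r.overflow = ((~(alpha ^ beta) & (alpha ^ r.ax)) & 0x8000u) != 0;
    return r;
}

/* Quick self-check of the model. */
static void check_add_model(void)
{
    assert(add_two_numbers_16(2, 3).ax == 5);
    assert(add_two_numbers_16(0xFFFFu, 1).carry);       /* unsigned wrap */
    assert(add_two_numbers_16(30000, 10000).overflow);  /* signed wrap   */
}
```

So entering 30000 and 10000 into the demo program gives a 16-bit result of 9C40h, which a signed int reads as a negative number; the overflow flag is what tells an assembler caller that this happened.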

Again, if you're in the Turbo or Borland C/C++ IDE, create a project and generate an .EXE; if you're at the command line, do this:

tcc -ml multi3.c multi3_a.asm

-or-

bcc -ml multi3.c multi3_a.asm

The reason I'm repeating this is that I forgot the "-ml" option and received a "fixup overflow" error from the linker. The resulting program ran but gave incorrect results, and it took me a long time to figure out my error!
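Why would a memory-model mismatch give wrong answers instead of an outright crash? One likely explanation, offered as an assumption rather than a diagnosis of that particular build, is that near and far calls leave the parameters at different distances from BP. The sketch below works out those distances in C; CallKind and param_offset_from_bp are hypothetical names, and the arithmetic assumes word-sized parameters and the usual "PUSH BP / MOV BP, SP" prologue.

```c
#include <assert.h>

/* Size in bytes of the return address pushed by CALL. */
enum CallKind { NEAR_CALL = 2, FAR_CALL = 4 };

/* Offset from BP of the index-th word-sized parameter (0 = leftmost),
   after the usual "PUSH BP / MOV BP, SP" prologue. */
static int param_offset_from_bp(enum CallKind kind, int index)
{
    assert(index >= 0);
    return 2             /* saved BP                     */
         + (int)kind     /* return address (IP or CS:IP) */
         + 2 * index;    /* parameters to the left       */
}
```

A FAR procedure such as _AddTwoNumbers therefore expects @@Alpha at [BP+6] and @@Beta at [BP+8], while after a near call those two words would really be at [BP+4] and [BP+6]. The procedure would pick up the second parameter where it expects the first and read an unrelated word as the second, which fits the description of a program that runs but prints the wrong sum.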

Pass-by-reference parameters (i.e. pointers in the argument list)

What if you want to write a C-compatible assembler procedure that returns values via parameters that are pointers? For example:

void GetTime (int *hours, int *minutes, int *seconds, int *hundredths);

We would expect this function to return values in the locations specified by the addresses passed to the function through the argument list. In other words, doing this...

int h, m, s, hnd;
GetTime (&h, &m, &s, &hnd);

...would put the current hour into the h variable, and the current minutes value would be placed into the m variable, and so on.

How do we do this with assembler? We don't need any new programming constructs; we can do this easily using what we already know.

We simply have to recognize the fact that a pointer is an address. Yes, pointers can be scary at times, but just keep in mind that all pointers are just addresses.

With that in mind, when you specify, say, "&h" as an argument (as in the above example), remember that "&" is the address-of operator in C. So "&h" gives the address of the h variable. Is this address near (just the offset), or is it far (the segment and the offset)? That depends on the memory model. If you're using the tiny, small, or medium memory models, data addresses (pointers) are near by default; with the compact, large, and huge memory models, data addresses (pointers) are far by default.

So this means that, if we're using the large memory model, when we pass "&h" as a parameter, we're passing the 32-bit far address (i.e. segment:offset), and this address goes onto the stack, just like any other parameter. Then, in our assembler procedure, we can grab this address from the stack, and we can then store whatever values we like at this address. And that's how we can return values via pointers in the argument list.
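In C terms, "storing a value at the address we were given" is nothing more than an assignment through a pointer. Here's a minimal sketch of that idea, with the register contents passed in as ordinary arguments so that it runs anywhere. The function name store_time and its cx/dx parameters are hypothetical; the byte layout assumed is the one DOS's INT 21h, Service 2Ch uses (CH = hours, CL = minutes, DH = seconds, DL = hundredths).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: cx and dx stand in for the CX and DX registers.
   Each byte is zero-extended to an int and stored through the
   corresponding pointer parameter. */
static void store_time(uint16_t cx, uint16_t dx,
                       int *hours, int *minutes,
                       int *seconds, int *hundredths)
{
    assert(hours != NULL && minutes != NULL &&
           seconds != NULL && hundredths != NULL);

    *hours      = cx >> 8;     /* CH */
    *minutes    = cx & 0xFF;   /* CL */
    *seconds    = dx >> 8;     /* DH */
    *hundredths = dx & 0xFF;   /* DL */
}
```

Calling store_time(0x0D1E, 0x2D63, &h, &m, &s, &hnd) leaves h = 13, m = 30, s = 45 and hnd = 99. The assembler version has to do the same four stores, except that it must first fetch each destination address from the stack.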

Here's an example:

------- MULTI4.C begins -------

/* MULTI4.C: "C" portion of the fourth multi-language demo

   This is the main module.  MULTI4_A.ASM must be assembled and linked
   with this module to create the final program.
*/

#include <stdio.h>


/* Import the assembler procedure GetTime(): */
extern void GetTime (int *hour, int *minute, int *second, int *hundredths);


main ()
{
    int h, m, s, hundredths;

    GetTime (&h, &m, &s, &hundredths);

    printf ("The current time is %02d:%02d:%02d.%02d.\n", h, m, s,
        hundredths);
    /* The "%02d" is the "%d" type-specifier with a width-modification
       option: the "2" means "print at least two digits", and the "0"
       means "if the number of digits to be printed is less than 2, then
       left-pad the number with leading zeroes so that two digits are
       printed".  Thanks for the brilliant syntax, K&R... */

    return 0;
}

------- MULTI4.C ends -------

And:

------- MULTI4_A.ASM begins -------


%TITLE "MULTI4_A.ASM: Assembler portion of the fourth multi-language demo"

    IDEAL

    MODEL large
    ; (No stack)
    LOCALS

    DATASEG

    CODESEG

; Export the procedure _GetTime:
PUBLIC _GetTime


; -------------------------------------------------------------------------
; void GetTime (int *hours, int *minutes, int *seconds, int *hundredths);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Determines the system time and returns it via pass-by-reference
;       parameters.
;  Pre: In assembler, push four FAR ADDRESSES onto the stack; these addresses
;       must refer to the locations of the following parameters:
;       hundredths  (hundredths of seconds)    - DWORD sized (far) address
;       seconds     (seconds)                  - DWORD sized (far) address
;       minutes     (minutes)                  - DWORD sized (far) address
;       hours       (hours (24-hour clock))    - DWORD sized (far) address
;       In C, pass FAR POINTERS to appropriate variables.
; Post: The "sacred" C registers CS, DS, SS, SP, and BP are preserved.
;       No other registers or flags are preserved.
; -------------------------------------------------------------------------
PROC _GetTime FAR
    ARG @@AddrOfHours:DWORD, @@AddrOfMinutes:DWORD, \
        @@AddrOfSeconds:DWORD, @@AddrOfHundredths:DWORD = @@ArgBytesUsed
    ; (Note that the backslash character ("\") can be used to join long
    ;  lines, just as in C.)

    ; No local variables needed here...

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; BP = SP to make params addressable

    ; Use INT 21h, Service 2Ch to get the time:
    MOV AH, 02Ch
    INT 21h
    ; Now, CH = hours, CL = minutes, DH = seconds, and DL = hundredths.

    XOR AH, AH                         ; Let AH = 0

    MOV ES, [WORD @@AddrOfHours + 2]   ; ES = segment part of address
    MOV DI, [WORD @@AddrOfHours]       ; DI = offset part of address
    ; Note: a nicer way to do this is to use "LES DI, [@@AddrOfHours]",
    ;  but we haven't covered the LES instruction yet!

    MOV AL, CH                         ; AL = hours (CH); now AX = hours
    MOV [ES:DI], AX                    ; Store hours at specified address


    MOV ES, [WORD @@AddrOfMinutes + 2] ; ES = segment part of address
    MOV DI, [WORD @@AddrOfMinutes]     ; DI = offset part of address
    ; Or LES DI, [@@AddrOfMinutes]

    MOV AL, CL                         ; AX = AL = minutes (CL)
    MOV [ES:DI], AX                    ; Store minutes at address...


    MOV ES, [WORD @@AddrOfSeconds + 2] ; ES = segment part of address
    MOV DI, [WORD @@AddrOfSeconds]     ; DI = offset part of address
    ; Or LES DI, [@@AddrOfSeconds]

    MOV AL, DH                         ; AX = AL = seconds
    MOV [ES:DI], AX                    ; Store seconds at address


    MOV ES, [WORD @@AddrOfHundredths + 2] ; ES = segment part of address
    MOV DI, [WORD @@AddrOfHundredths]     ; DI = offset part of address
    ; Or LES DI, [@@AddrOfHundredths]

    MOV AL, DL                         ; AX = AL = hundredths
    MOV [ES:DI], AX                    ; Store hundredths at address

    POP BP                             ; Restore BP
    RET  ; No, don't put @@ArgBytesUsed here!
ENDP _GetTime
; -------------------------------------------------------------------------

END

------- MULTI4_A.ASM ends -------

Important: the _GetTime procedure assumes that the large memory model is being used, because it assumes far pointers are being pushed on the stack. So be sure to compile, assemble and link using the large memory model! Either set it up in the IDE, or do this at the command line:

tcc -ml multi4.c multi4_a.asm

-or-

bcc -ml multi4.c multi4_a.asm

Also, I mentioned the LES instruction in the comments -- I'll discuss that instruction and some other miscellaneous instructions in the next chapter. (LES is nice but not critical. You can do the same thing with two MOV instructions, as shown in the code.)
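What the pair of MOV instructions (or a single LES) does with a far pointer argument can be sketched in C as follows. FarPtr, read_word, load_far_pointer and physical_address are names made up for this illustration, and the byte array stands in for the four bytes the argument occupies on the stack.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record for a segment:offset pair such as ES:DI. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} FarPtr;

/* Read a little-endian word from a byte array. */
static uint16_t read_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Model of "LES DI, [arg]" (or the two MOVs): the offset is the word
   at the lower address, the segment is the word two bytes above it. */
static FarPtr load_far_pointer(const uint8_t *arg)
{
    FarPtr fp;
    assert(arg != 0);
    fp.offset  = read_word(arg);
    fp.segment = read_word(arg + 2);
    return fp;
}

/* Real-mode physical address: segment * 16 + offset. */
static uint32_t physical_address(FarPtr fp)
{
    return ((uint32_t)fp.segment << 4) + fp.offset;
}
```

Given the four bytes 34 12 00 A0 (hex), load_far_pointer() yields offset 1234h and segment A000h, which is why the listing reads the segment from "+ 2" and the offset from the start of the argument.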

Calling C functions from assembler

If you can call assembler procedures from C, can you call C functions from assembler? Certainly, we just have to play by C's rules, as usual.

You may never need to call C functions from assembler -- I myself have never actually encountered a situation where I really had to do this. I've played with it in test programs, but since I write the majority of a project's code in C and only use assembler for time-critical sections, like graphics routines, I've never found a need for it in an actual project. However, it is nice to be able to call a C function that makes use of a C library function -- perhaps to do some math that would be much too difficult to do in assembler.

To call a C function, we have to follow these steps:

Ensure that the C function is visible to the assembler module; that means that you must...
1. say "EXTRN _CFunctionName : PROC" in your assembler code, and
2. make sure the function is not declared "static" in your C program. C functions are externally visible by default, so putting "extern" in front of the function prototype (e.g. "extern void CFunctionName (int x, int y);") is optional
Push any parameters onto the stack in right-to-left order. (If the function is a parameterless function, don't put anything on the stack.)
Call the C function using CALL.
Clean up the stack using "ADD SP, ___" (fill in the blank with the number of bytes the parameters occupied). (If the function is a parameterless function, this isn't necessary.)
If the function returned a value, retrieve the value by looking in the appropriate location (see the chart above).

Let's look at an example. The following example program is composed of four files. The main module is SINEWAVE.C. Then there is an assembler module called SINEWAVA.ASM. Both of these modules call graphics functions provided in a simple graphics library, which is M13HLIB.ASM. And there is a C header file, M13HLIB.H, which is associated with the graphics library.

Tragically, I could not get it to work with the large memory model -- I got lots of "fixup overflow" error messages. Switching to the small memory model made the problems disappear.

The two references I have on this subject, the Turbo Assembler (version 5) User's Guide and Tom Swan's Mastering Turbo Assembler, mysteriously don't mention the large memory model at all when discussing the mixing of C and assembler. (I have the feeling that they couldn't get it to work, so they quietly left out any mention of it. If that's true, then that's pretty sleazy!)

Anyways, that's why I've had to use the small memory model. Here are the program listings:

------- M13HLIB.ASM begins -------

%TITLE "M13HLIB.ASM: Basic Mode 13h Graphics Library in Assembler"

; This simple graphics library permits the programmer to access the VGA
; Mode 13h display using either assembler, C, or C++.  Mode 13h is a
; standard video mode accessible on all standard VGA cards.  It offers
; 320x200 resolution, with 256 colors (choosable from the VGA palette),
; but only one screen page is available.
;
; The following functions are implemented:
; void SetMode13h (void);
; void SetTextMode (void);
; void PutPixel (int x, int y, unsigned char color);
; void ClearScreen (unsigned char color);
;
; For use with Turbo Assembler:
; - Carefully read the comments above each procedure before calling
;   a function in this library.
; - In general, these procedures only save and restore the "sacred" C
;   registers CS, DS, SS, SP, and BP.  Other registers, including FLAGS,
;   are generally _not_ preserved.  That means you should PUSH and POP
;   any registers (other than CS, DS, SS, SP, and BP) that you want to
;   keep when calling a procedure in this library.
; - Because these procedures are C-compatible, parameters are not removed
;   from the stack.  For procedures in this library that use parameters,
;   the calling code must clean up the stack after the function is
;   called by using a "ADD SP, ___" instruction; fill in the blank with
;   the number of bytes the parameters occupy on the stack (i.e. 2*n,
;   where n is the number of PUSH instructions used to push the
;   parameters onto the stack).
;
; For use with Borland C or Turbo C (in other words, C programs, not C++):
; - Ensure that any C module that calls functions in this library
;   includes the C header file M13HLIB.H, like this:
;   #include "m13hlib.h"
;
; For use with Borland C++ or Turbo C++ (i.e. C++ programs, not C):
; - Ensure that any C++ module that calls functions in this library
;   includes the C++ header file M13HLIB.HPP, like this:
;   #include "m13hlib.hpp"
;

    IDEAL

    MODEL small
    ; (No stack)
    LOCALS
    
    DATASEG

_VideoSegment                     EQU  0A000h
_Mode13h_ScreenWidth              EQU  320
_Mode13h_ScreenHeight             EQU  200


    CODESEG

PUBLIC _SetMode13h
PUBLIC _SetTextMode
PUBLIC _PutPixel
PUBLIC _ClearScreen


; -------------------------------------------------------------------------
; void SetMode13h (void);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Changes the display to VGA Mode 13h.
;  Pre: None.
; Post: None.  The "sacred" C registers CS, DS, SS, SP and BP are saved.
;       No other flags or registers are preserved.
; -------------------------------------------------------------------------
PROC _SetMode13h
    ; Use INT 10h, Service 0 to set the screen mode to Mode 13h:
    MOV AH, 0
    MOV AL, 13h
    INT 10h

    RET
ENDP _SetMode13h
; -------------------------------------------------------------------------


; -------------------------------------------------------------------------
; void SetTextMode (void);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Changes the display back to the standard 80x25 16-color text mode.
;  Pre: None.
; Post: None.  The "sacred" C registers CS, DS, SS, SP and BP are saved.
;       No other flags or registers are preserved.
; -------------------------------------------------------------------------
PROC _SetTextMode
    ; Use INT 10h, Service 0 to set the screen mode to text mode (Mode 3):
    MOV AH, 0
    MOV AL, 3
    INT 10h

    RET
ENDP _SetTextMode
; -------------------------------------------------------------------------


; -------------------------------------------------------------------------
; void PutPixel (int x, int y, unsigned char color);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Plots a pixel at (Column, Row) on the Mode 13h screen, using the
;       color specified by Color.
;  Pre: Column and Row must be word-sized; Color must be byte-sized.
; Post: Assuming Row and Column are within range, the pixel is plotted.
;       No range checking is performed.
;       Registers CS, DS, SS, SP, and BP are unaffected.  Any other
;       registers and flags may be modified.
; -------------------------------------------------------------------------
PROC _PutPixel
    ARG @@x : WORD, @@y : WORD, @@color : BYTE = @@ArgBytesUsed

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; Allow params to be addressed

    ; Let DI equal the offset of the pixel.  The formula is:
    ;  Offset = (@@y << 8) + (@@y << 6) + @@x
    MOV AX, [@@y]                      ; Let AX = Row parameter (@@y)
    MOV DI, AX                         ; Also let DI = Row parameter (@@y)
    MOV CL, 8
    SHL AX, CL                         ; Shift AX left by 8

    DEC CL
    DEC CL                             ; Now CL = 6
    SHL DI, CL                         ; Shift DI left by 6
    ADD AX, DI                         ; AX += DI

    MOV DI, [@@x]                      ; Let DI = Column parameter (@@x)

    ADD DI, AX                         ; DI += AX

    ; Let ES equal the video segment:
    MOV AX, _VideoSegment              ; (intermediate)
    MOV ES, AX                         ; Let ES = _VideoSegment constant

    ; Now ES:DI points to the address of the pixel.  Place the byte-sized
    ;  color value at that address:
    MOV AL, [@@color]                  ; Let AL = Color parameter
    MOV [ES:DI], AL                    ; Store AL at ES:DI

    POP BP

    RET  ; This is a C-compatible procedure (no "@@ArgBytesUsed")
ENDP _PutPixel
; --------------------------------------------------------------------------


; -------------------------------------------------------------------------
; void ClearScreen (unsigned char color);
; C-compatible assembler implementation
; -------------------------------------------------------------------------
; Desc: Clears the Mode 13h screen with a specified color.
;  Pre: Before calling this procedure, push onto the stack the
;       byte-sized color number to use.
; Post: The Mode 13h screen is cleared using the color parameter.
;       Registers CS, DS, SS, SP, and BP are unaffected.  Any other
;       registers and flags may be modified.
; -------------------------------------------------------------------------
PROC _ClearScreen
    ARG @@color:BYTE = @@ArgBytesUsed

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; Allow parameters to be addressed

    ; Let ES:DI point to the start of the video segment (ie. A000:0000):
    MOV AX, _VideoSegment
    MOV ES, AX                         ; ES = _VideoSegment
    XOR DI, DI                         ; DI = 0000h

    ; Load the color parameter into AL and AH:
    MOV AL, [@@color]
    MOV AH, [@@color]

    ; Let CX equal what should be 32000:
    MOV CX, (_Mode13h_ScreenWidth * _Mode13h_ScreenHeight / 2)
    ; (Yes, that's basically a constant on the right (it always
    ;  evaluates to the same thing), so it's permitted)

    ; Fill the screen with the specified color:
    CLD                                ; Clear direction flag: STOSW counts up
    REP STOSW
    
    POP BP                             ; Restore BP

    RET  ; This is a C-compatible procedure (no "@@ArgBytesUsed")
ENDP _ClearScreen
; -------------------------------------------------------------------------

END

------- M13HLIB.ASM ends -------
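The shift-and-add trick in _PutPixel works because 320 = 256 + 64, so (y << 8) + (y << 6) + x is the same as y * 320 + x. If you'd rather see that checked than take it on faith, here's a short sketch in C. The names pixel_offset and shifts_match_multiply are invented for this illustration; the screen dimensions are the ones used by the library.

```c
#include <assert.h>

#define SCREEN_WIDTH  320u
#define SCREEN_HEIGHT 200u

/* Same calculation as _PutPixel: (y << 8) + (y << 6) + x. */
static unsigned pixel_offset(unsigned x, unsigned y)
{
    assert(x < SCREEN_WIDTH && y < SCREEN_HEIGHT);
    return (y << 8) + (y << 6) + x;
}

/* Returns 1 if the shift form equals y * 320 + x for every
   on-screen pixel, 0 otherwise. */
static int shifts_match_multiply(void)
{
    unsigned x, y;

    for (y = 0; y < SCREEN_HEIGHT; y++)
        for (x = 0; x < SCREEN_WIDTH; x++)
            if (pixel_offset(x, y) != y * SCREEN_WIDTH + x)
                return 0;
    return 1;
}
```

The largest offset, for the pixel at (319, 199), is 63999, so every pixel fits within a single 64K segment. That also matches _ClearScreen, which writes 32000 words (64000 bytes) starting at offset 0.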

------- M13HLIB.H begins -------

/* M13HLIB.H: Header file for the "Basic Mode 13h Graphics Library in
             Assembler" (C header file)

   This header file is for use with C programs, not C++ programs.
   For C++ programs, please use the M13HLIB.HPP header file instead.
   ----------------------------------------------------------------------*/

#ifndef _M13HLIB_H
#define _M13HLIB_H

extern void SetMode13h (void);
    /* Changes the display to VGA Mode 13h, a 320x200 graphics mode
       with 256 colors and one screen page. */

extern void SetTextMode (void);
    /* Changes the display back to the standard 80x25 character
       color text mode. */

extern void PutPixel (int x, int y, unsigned char color);
    /* Plots a pixel of color "color" at coordinates (x, y).  The
       display must be in Mode 13h. */

extern void ClearScreen (unsigned char color);
    /* Clears the entire Mode 13h screen using the color specified in
       "color". */

#endif

------- M13HLIB.H ends -------

Then:

------- SINEWAVE.C begins -------

/* SINEWAVE.C: "C" portion of the C-and-assembler sine-wave demonstration

   This is the main module.  The Mode 13h graphics library, M13HLIB.ASM,
   and the other assembler module, SINEWAVA.ASM ("A" for assembler),
   must be assembled and linked with this module to create the final
   program.
*/

#include <stdio.h>
#include <math.h>

#include "m13hlib.h"

int GetSineValue (int number);

extern void DrawSineWave (void);

main ()
{
    SetMode13h ();
    ClearScreen (8);

    DrawSineWave ();

    SetTextMode ();

    return 0;
}

int GetSineValue (int number)
{
    return (int) (sin(number * 0.025) * 40.0);
}

------- SINEWAVE.C ends -------

And finally:

------- SINEWAVA.ASM begins -------

%TITLE "SINEWAVA.ASM: ASM portion of C-and-ASM sine-wave demonstration"

    IDEAL

    MODEL small
    ; (No stack)
    LOCALS


; We apparently cannot share equates between two assembler modules, so:
_Mode13h_ScreenWidth              EQU  320


    CODESEG

EXTRN _GetSineValue : PROC
EXTRN _PutPixel : PROC

PUBLIC _DrawSineWave

; --------------------------------------------------------------------------
; void DrawSineWave (void)
; C-compatible assembler implementation
; --------------------------------------------------------------------------
; Desc: Repeatedly draws a sine-wave pattern on the screen until a key is
;       pressed.
;  Pre: Simply call this procedure/function.  No parameters.  No return
;       value.  Ensure that this module is linked with the modules that
;       contain the procedures/functions GetSineValue() and PutPixel().
; Post: The "sacred" C registers CS, DS, SS, SP, and BP are preserved.
;       No other registers or flags are preserved.  Since no parameters
;       are used, no stack cleanup is required after calling this
;       procedure/function.
; --------------------------------------------------------------------------
PROC _DrawSineWave
    LOCAL @@XCoord:WORD, @@SinPosition:WORD = @@LocalBytesUsed

    PUSH BP                            ; Save BP
    MOV BP, SP                         ; Let BP = SP for stack addressing
    SUB SP, @@LocalBytesUsed           ; Make space for local variables

    ; Initialize variables:
    MOV [@@XCoord], 0
    MOV [@@SinPosition], 0

@@MainLoop:
    ; Call the GetSineValue() function in the C module.  We'll pass
    ;  the value of [@@SinPosition] as the parameter.
    PUSH [@@SinPosition]
    CALL _GetSineValue
    ADD SP, 2                          ; Clean up stack
    ; GetSineValue() returns an int, which is stored in AX.

    ; Negate the return value.  This makes the sine wave "right-side-up"
    ; (since positive coordinates in mathematics go in the "up" direction,
    ; and positive coordinates on the VGA screen go in the "down"
    ; direction).
    NEG AX

    ; Add 100 to AX to get a Y-coordinate that is centered on the screen:
    ADD AX, 100

    ; Call PutPixel() in the Mode 13h library.  The parameters are pushed
    ;  onto the stack in right-to-left order:
    ;  First, the color (10),
    ;  then, the Y-coordinate,
    ;  then, the X-coordinate.
    PUSH 10                            ; Color 10 = lt. green (std. palette)
    PUSH AX
    PUSH [@@XCoord]
    CALL _PutPixel
    ADD SP, 6                          ; Clean up stack

    ; Increment both variables:
    INC [@@XCoord]
    INC [@@SinPosition]

    ; Have we reached the right-hand side of the screen?
    CMP [@@XCoord], _Mode13h_ScreenWidth
    JL @@Bypass1

    ; If so, then go back to the left-hand side again:
    MOV [@@XCoord], 0

@@Bypass1:
    ; Was a key pressed?  Use INT 21h, Service 0Bh to see if a key was
    ;  pressed.  If a character is ready, AL equals FF hex; if no
    ;  characters are waiting, AL is zero.
    MOV AH, 0Bh
    INT 21h
    CMP AL, 0
    JNE @@PrepareToExit                ; If a key was pressed, jump...
    JMP @@MainLoop                     ; No key pressed?  Re-run loop.

@@PrepareToExit:
    ; Get the key that was pressed, using INT 21h, Service 7:
    MOV AH, 7
    INT 21h
    ; (Ignore the return value in AL.)

    ADD SP, @@LocalBytesUsed           ; De-allocate local variables
    POP BP                             ; Restore BP

    RET  ; This is a C-compatible function: no RET arguments!
ENDP
; --------------------------------------------------------------------------

END

------- SINEWAVA.ASM ends -------

To assemble and compile and link all this, either create a project in your IDE (and be sure to select the small memory model this time), or do this at the command line:

tcc -ms sinewave.c m13hlib.asm sinewava.asm

-or-

bcc -ms sinewave.c m13hlib.asm sinewava.asm
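
If the assembler listing is hard to follow in one pass, it may help to see the same arithmetic written out in a high-level language. The following is a rough C++ model of what one trip through @@MainLoop computes; it is only a sketch for study, not a replacement for SINEWAVA.ASM. The names SineWaveY and NextXCoord are hypothetical helpers invented for this sketch. Only GetSineValue comes from the listings above, and the constants 100 and 320 mirror the ones used in the assembler code.

```cpp
#include <cmath>

/* Same formula as GetSineValue() in SINEWAVE.C. */
int GetSineValue (int number)
{
    return (int) (std::sin(number * 0.025) * 40.0);
}

/* Hypothetical helper: mirrors the NEG AX / ADD AX, 100 pair, which
   flips the wave the right way up and centers it vertically. */
int SineWaveY (int sinPosition)
{
    return -GetSineValue (sinPosition) + 100;
}

/* Hypothetical helper: mirrors INC [@@XCoord] followed by the
   CMP / JL / MOV sequence that wraps back to the left-hand side. */
int NextXCoord (int xCoord)
{
    xCoord++;
    if (xCoord >= 320)      /* _Mode13h_ScreenWidth */
        xCoord = 0;
    return xCoord;
}
```

For example, SineWaveY (0) gives 100 (the center line), and NextXCoord (319) wraps around to 0. Since GetSineValue() never returns anything outside the range -40 to 40, the Y-coordinate always stays between 60 and 140, safely inside the 200-line screen.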

Combining C++ (not C) with assembler

Skip this section if you don't use C++.

So far we've been combining C programs with assembler. But what if you want to combine your C++ program with assembler?

Fortunately, everything is the same, except for this:

In your C program, you use the word "extern" to say that a certain function will be defined elsewhere (usually, in an assembler module). In C++, don't use "extern"; instead, use " extern "C" ".

For example, in C you might say this:

extern void DisplayMessage ();

In C++, we'd say this instead:

extern "C" void DisplayMessage ();

Yes, it's wild syntax (although C++'s syntax for, say, templates (or pure virtual functions!) isn't particularly inspiring either...).

Additionally, functions in your C++ program that are called from other modules (for example, from an assembler module) need the same treatment; otherwise the linker will typically complain that it can't find them. When this is the case, you must put " extern "C" " in front of both the function prototype and the function definition, like this:

/* Function prototype: */
extern "C" int PlayMusicFile (char filename[]);

/* Actual function definition (yes, we say extern "C" even though
   the code is right here in this module!): */
extern "C" int PlayMusicFile (char filename[])
{
    /* Put code here... */
}

To demonstrate this, I've taken the sine-wave demonstration program and "converted" it to C++. The SINEWAVA.ASM and M13HLIB.ASM files are exactly the same as they were for the C version above. But now, replace SINEWAVE.C with the following SINEWAVE.CPP, and replace M13HLIB.H with the following M13HLIB.HPP:

------- SINEWAVE.CPP begins -------

/* SINEWAVE.CPP: C++ portion of the C++-and-assembler sine-wave
                 demonstration

   This is the main module.  The Mode 13h graphics library, M13HLIB.ASM,
   and the other assembler module, SINEWAVA.ASM ("A" for assembler),
   must be assembled and linked with this module to create the final
   program.
*/

#include <stdio.h>
#include <math.h>

#include "m13hlib.hpp"

extern "C" int GetSineValue (int number);

extern "C" void DrawSineWave (void);

int main ()
{
    SetMode13h ();
    ClearScreen (8);

    DrawSineWave ();

    SetTextMode ();

    return 0;
}

extern "C" int GetSineValue (int number)
{
    return (int) (sin(number * 0.025) * 40.0);
}

------- SINEWAVE.CPP ends -------

And:

------- M13HLIB.HPP begins -------

/* M13HLIB.HPP: Header file for the "Basic Mode 13h Graphics Library in
                Assembler" (C++ header file)

   This header file is for use with C++ programs, not C programs.
   For C programs, please use the M13HLIB.H header file instead.
   ----------------------------------------------------------------------*/

#ifndef _M13HLIB_HPP
#define _M13HLIB_HPP

extern "C" void SetMode13h (void);
    /* Changes the display to VGA Mode 13h, a 320x200 graphics mode
       with 256 colors and one screen page. */

extern "C" void SetTextMode (void);
    /* Changes the display back to the standard 80x25 character
       color text mode. */

extern "C" void PutPixel (int x, int y, unsigned char color);
    /* Plots a pixel of color "color" at coordinates (x, y).  The
       display must be in Mode 13h. */

extern "C" void ClearScreen (unsigned char color);
    /* Clears the entire Mode 13h screen using the color specified in
       "color". */

#endif

------- M13HLIB.HPP ends -------

And, as you would expect, you can do this to get an executable:

tcc -ms sinewave.cpp m13hlib.asm sinewava.asm

-or-

bcc -ms sinewave.cpp m13hlib.asm sinewava.asm
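
Keeping M13HLIB.H and M13HLIB.HPP side by side is not the only option. A widely used alternative, sketched below, is to wrap the prototypes in an extern "C" block that only a C++ compiler sees, by testing the predefined __cplusplus macro. Treat this as one way you might reorganize the header, not as something the library comes with: the guard name _M13HLIB_SHARED_H is made up for this sketch, and you should check your manual to confirm that your compiler version defines __cplusplus.

```cpp
#ifndef _M13HLIB_SHARED_H
#define _M13HLIB_SHARED_H

#ifdef __cplusplus
extern "C" {            /* C++ only: use plain C-style names */
#endif

void SetMode13h (void);
void SetTextMode (void);
void PutPixel (int x, int y, unsigned char color);
void ClearScreen (unsigned char color);

#ifdef __cplusplus
}                       /* End of the extern "C" block */
#endif

#endif
```

With a header like this, a C program and a C++ program could both #include the same file: the C compiler never sees the extern "C" lines, and the C++ compiler applies them to every prototype inside the braces.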

What is the reason for the " extern "C" "? It has to do with the fact that C++ uses a technique called name mangling during compilation. The compiler secretly changes the names of your functions, encoding each function's argument types into its name, so that it can tell the difference between functions with the same name. (Recall that you can have functions with the same name but different argument lists.) But when you stir assembler procedures into the mix, the names of those assembler procedures don't get mangled (yes, "mangled" is the real technical term), so the mangled names the compiler generates no longer match the plain names in the assembler module, and the linker can't pair them up. The " extern "C" " tells the compiler not to mangle a function's name and to use the plain C-style name (with the leading underscore) instead, so that the names on both sides agree.
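
To make the idea concrete, here is a small C++ sketch. The function names Area and AddTwo are hypothetical and exist only for this illustration. The two Area functions can share a name because the compiler gives each one a different mangled name behind the scenes. AddTwo is declared with extern "C", so it keeps a plain C-style name that an assembler module could refer to (as _AddTwo, going by the underscore convention used throughout this chapter).

```cpp
/* Two functions with the same name but different argument lists.
   The compiler tells them apart by mangling their names. */
int Area (int side)
{
    return side * side;
}

int Area (int width, int height)
{
    return width * height;
}

/* extern "C" switches mangling off for this function, so other
   modules can find it under its ordinary name. */
extern "C" int AddTwo (int a, int b)
{
    return a + b;
}
```

The price is that an extern "C" function can't be overloaded in this way: a second extern "C" function called AddTwo with a different argument list would be rejected, because both would need the same unmangled name.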

Summary

In this chapter, we learned how to use inline assembler code. We learned how to call assembler procedures from C, and we learned how to call C functions from assembler. We also learned what changes must be made to combine C++ with assembler.

This article has certainly been the longest one yet, and I congratulate you for your patience in reading it!

In the next, and hopefully last, assembler article, we'll cover some miscellaneous instructions and other minor details.

A project

