Combining FORTRAN, C, and C++
Written By: Milad Fatenejad
Often times, you may find yourself in a situation where you need to combine source code written in different languages, specifically C, C++ and FORTRAN. For example, you may need to use a numerical method that is written in FORTRAN or maybe you want your library to work with multiple languages. Fortunately, it is conceptually easy to make these languages work together. Unfortunately, many of the details are compiler dependent. My goal here is to lay out the concepts for making FORTRAN, C, and C++ interact. I will use GCC for my examples, and while the details may be different using other compilers, the concepts will be the same.
Simple Example
Let's begin with a simple example. Suppose we have a function written in FORTRAN 90 called addints, which simply takes two integers, adds them and returns the result. I've placed this function in the file add.f90, shown below. Notice that the function prints a message when it is called.
add.f90
| Line | |
|---|---|
| 1 | function addints(a, b) |
| 2 | |
| 3 | implicit none |
| 4 | |
| 5 | integer, intent(in) :: a |
| 6 | integer, intent(in) :: b |
| 7 | |
| 8 | integer :: addints |
| 9 | |
| 10 | print *, "Adding a + b" |
| 11 | addints = a + b |
| 12 | |
| 13 | return |
| 14 | |
| 15 | end function addints |
I've written a small program in C which calls the FORTRAN function above to add the numbers 3 and 4. It then prints the result. I've shown the program, in the file call_add.c, below. Notice that the function declaration on line 3 is a little strange. The function name has an underscore added, and the function arguments have changed. Rather than two integers, the arguments are pointers to integers.
call_add.c
| Line | |
|---|---|
| 1 | #include <stdio.h> |
| 2 | |
| 3 | int addints_(int *a, int *b); |
| 4 | |
| 5 | int main() |
| 6 | { |
| 7 | int a = 3; |
| 8 | int b = 4; |
| 9 | |
| 10 | int answer = addints_(&a, &b); |
| 11 | |
| 12 | printf("3 + 4 = %i\n", answer); |
| 13 | |
| 14 | return 0; |
| 15 | } |
Now we need to compile and link call_add.c and add.f90. The steps to do so are shown below.
| Line | |
|---|---|
| 1 | $> gcc -c call_add.c |
| 2 | $> gfortran -c add.f90 |
| 3 | $> gcc call_add.o add.o -l gfortran |
| 4 | $> ./a.out |
| 5 | 3 + 4 = 7 |
The steps are briefly described below. They will be discussed in more detail later.
- Compile (but do not link) the C program using the C compiler, gcc.
- Compile (but do not link) the FORTRAN function using the FORTRAN compiler, gfortran.
- Link the two object files together. Notice that we need to also link to the gfortran standard library. The executable a.out is produced.
This simple example demonstrates three concepts that must be understood if you want to combine C, C++, and FORTRAN code.
- The arguments of the functions will be different. The FORTRAN function addints took two integers as arguments, while the C function took two pointers to integers.
- The names of the functions may be different. We call the FORTRAN function addints using the C function addints_.
- When linking, additional libraries may need to be specified. In the example above, we needed the -l gfortran linker option.
These three concepts will be described in detail below.
Why does this work?
Even though there were a few tweaks that had to be made to the source code, getting the C and FORTRAN functions to work together was relatively easy. To understand why it was so easy, you have to understand what happens when we compile and link source code. If you already understand this process well then feel free to skip to the next section.
Compiling and Linking
What some people call compiling actually consists of two separate steps: compiling and linking. This can be confusing, so you (and I) will refer to the joint process of compiling and linking source code as building. During the build process, our human readable source code (stored in .c, .f90, or .cpp files) is converted to machine code (ones and zeros) that the computer can understand and execute. Let's look at a simple example - a "Hello World!" program written in FORTRAN. I've placed this simple program in the file hello.f90, shown below:
hello.f90
| Line | |
|---|---|
| 1 | program hello |
| 2 | implicit none |
| 3 | |
| 4 | print *, "Hello, World!" |
| 5 | |
| 6 | end program hello |
We can build the source code as shown:
$> gfortran hello.f90 # Compile and link the program
$> ./a.out
Hello, World!
When we issue the command gfortran hello.f90, the compiler actually performs both steps, compiling and linking, automatically. Normally, we want to perform these two commands separately, unless our program is really simple. The -c option tells the compiler to compile, but not link, the program. When we issue the command gfortran -c hello.f90, the compiler will automatically generate a file called hello.o. A file that ends in .o is called an object file. We can then use gfortran to link the object file. The entire process is shown below:
$> gfortran -c hello.f90 # Compile, but don't link the program (generates hello.o)
$> gfortran hello.o # Link hello.o to produce an executable
$> ./a.out
Hello, World!
During the compiling step, the compiler takes your source code and turns as much of it into ones and zeros as possible. However, the compiler can't finish the job - because the code in the file it is compiling may call external functions which are located in either external libraries or other source files. To deal with this, we first compile each and every source file in our program, generating one object file for each. The object file contains all of the source code in machine readable form except any calls to functions. Again, the compiler can't fill in the code to call functions because some of those functions may exist in external libraries or other source files. So to complete the process, we have to link all of the object files with external libraries to produce a program we can run.
The key concept to understand is that there is no difference between object files produced by C, C++, or FORTRAN. They are all object files in the same format. All of the differences between the languages are dealt with in the compiling step. After the compiling step, the only thing left to fill is the code to call functions - and the process of calling functions is the same in each language. Thus, while the compiling process is different for each language, the end result (the object file) is basically the same. That is the fundamental reason why it is relatively easy to get C, C++ and FORTRAN code to communicate.
Keep it Simple
We will see that while it is possible to get functions in C/C++/FORTRAN to interact, it may be difficult. It gets more difficult the more "fancy" your programming is. Calling functions in modules (FORTRAN), or using templates and classes (C++) or structs (C), makes things more difficult. We will see that even using something as simple as strings can be very complicated. So generally, try to do the following:
- Use global functions - no modules, namespaces, etc...
- Use simple arguments - integers, floating point numbers (real in FORTRAN)
- Don't use complex data types as arguments - no structs, derived types, classes
- In C++, use only C-style arrays and strings as arguments - no std::vector or std::string
Note, that the above rules only apply to the scope and arguments of functions you want to call from other languages - they don't apply to functions elsewhere in your program.
Function Arguments
OK, let's deal with the first peculiarity that arises - the function arguments change. In the adding example above, we saw that if the FORTRAN function takes an integer then the C function declaration must take a pointer to an integer. This happens because arguments in FORTRAN are, by default, passed by reference. When you call a function in FORTRAN with an argument, the compiler actually passes the address of the variable to the function. C, on the other hand, passes arguments by value. When you call a function with an argument in C, the compiler actually passes a copy of the variable to the function. Thus, when the FORTRAN function expects an argument that is an integer we have to manually tell C to pass arguments by reference. We do this by using pointers, as in the example above. So we've identified one simple rule: Pass values by reference (using pointers) and not by value.
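To make the two calling conventions concrete, here is a minimal sketch in plain C. The names add_by_value and add_by_reference are hypothetical and are not part of the example above; add_by_reference only mimics, in C, the kind of signature that a FORTRAN function such as addints presents to a C caller.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary C style: the callee receives copies of the two integers. */
int add_by_value(int a, int b)
{
    return a + b;
}

/* FORTRAN style: the callee receives the addresses of the two integers
 * and has to dereference them. This is a hypothetical stand-in for what
 * a FORTRAN function like addints looks like from the C side. */
int add_by_reference(const int *a, const int *b)
{
    assert(a != NULL && b != NULL);
    return *a + *b;
}
```

A caller would write add_by_reference(&a, &b), just as call_add.c does with addints_. One practical consequence is that a literal such as 3 cannot be passed directly; it first has to be stored in a variable whose address can be taken.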
Array Arguments
If our functions just take simple arguments, such as integers, characters, or floating point numbers, then this is all we need to know about how function arguments are handled when getting C and FORTRAN to interact. However, we often need to pass arrays to functions. Let's look at a simple example. I've written a subroutine called matmult in FORTRAN in the file matmult.f90. This subroutine multiplies a square matrix and a vector. It takes four arguments:
- A - The matrix we are multiplying
- b - The vector we are multiplying
- c - The vector result
- n - The size of the matrixes and vectors (A is n by n, b is n by 1, and c is n by 1)
The function is shown below. Note that we are explicitly passing in the array size, n, as an argument. We could use FORTRAN-90's assumed shape array capability to eliminate the need for this argument - however, assumed shape arrays qualify as something fancy (see above), and I recommend that you avoid them.
matmult.f90
| Line | |
|---|---|
| 1 | subroutine matmult(A, b, c, n) |
| 2 | implicit none |
| 3 | |
| 4 | integer, intent(in) :: n |
| 5 | integer, intent(in) :: A(n, n) |
| 6 | integer, intent(in) :: b(n) |
| 7 | integer, intent(out) :: c(n) |
| 8 | |
| 9 | integer :: i, j |
| 10 | |
| 11 | ! Multiply the matrixes: |
| 12 | do i = 1, n |
| 13 | c(i) = 0 |
| 14 | do j = 1, n |
| 15 | c(i) = c(i) + A(i, j) * b(j) |
| 16 | end do |
| 17 | end do |
| 18 | |
| 19 | end subroutine matmult |
We're going to call this function from C. A and b are defined below, along with the correct answer.
| A | | | | b | | c |
|---|---|---|---|---|---|---|
| 1 | 2 | 3 | | 1 | | 6 |
| 0 | 1 | 1 | | 1 | | 2 |
| 1 | 0 | 0 | | 1 | | 1 |
The C program that will call matmult is shown below, in the file call_matmult.c
call_matmult.c
| Line | |
|---|---|
| 1 | #include <stdio.h> |
| 2 | |
| 3 | void matmult_(int A[3][3], int b[3], int c[3], int *n); |
| 4 | |
| 5 | int main() |
| 6 | { |
| 7 | int A[3][3] = { { 1, 2, 3}, |
| 8 | { 0, 1, 1}, |
| 9 | { 1, 0, 0} }; |
| 10 | int b[3] = { 1, |
| 11 | 1, |
| 12 | 1 }; |
| 13 | int n = 3; |
| 14 | int c[3]; |
| 15 | |
| 16 | // Call the function: |
| 17 | matmult_(A, b, c, &n); |
| 18 | |
| 19 | // Print the results: |
| 20 | printf("%i %i %i %i %i\n", A[0][0], A[0][1], A[0][2], b[0], c[0]); |
| 21 | printf("%i %i %i * %i = %i\n", A[1][0], A[1][1], A[1][2], b[1], c[1]); |
| 22 | printf("%i %i %i %i %i\n", A[2][0], A[2][1], A[2][2], b[2], c[2]); |
| 23 | |
| 24 | return 0; |
| 25 | } |
Before we build and run the program, let's examine the C function declaration on line 3. First, note that the function's return type is void. This is because we defined a subroutine in FORTRAN and not a function. Subroutines don't return anything, and hence the corresponding return type in C is void. Second, note how we are passing the arrays into the function. We explicitly write their sizes in the function declaration. Note that we didn't use any pointer notation for the first three arguments. This is because C treats arrays a lot like pointers. When you pass an array to a function in C, you are actually passing a pointer to the first element of the array. Behind the scenes, FORTRAN compilers do exactly the same thing for explicitly sized arrays like these, which is why this code works. In this case, we had the luxury of knowing the size of the arrays, 3, at compile time. If this is not the case, then the easiest thing to do is explicitly pass a pointer to the first array element. If we didn't know the value of n at compile time, we would have to change line 3 to:
void matmult_(int *A, int *b, int *c, int *n);
and we would change line 17 to:
matmult_(&(A[0][0]), &(b[0]), &(c[0]), &n);
Note, that we are now explicitly passing pointers to the first element in each array. The code will work identically in each case.
Now that we've got that sorted out, let's build and run our program to verify that it is working:
$> gfortran -c matmult.f90
$> gcc -c call_matmult.c
$> gcc matmult.o call_matmult.o -l gfortran
$> ./a.out
1 2 3 1 2
0 1 1 * 1 = 3
1 0 0 1 4
It worked! Except for the fact that the answer is wrong...the correct answer is the vector (6, 2, 1) and the answer we got was (2, 3, 4). What happened? Unfortunately for us, C and FORTRAN happen to store multidimensional arrays in different order in memory. Remember, that although in FORTRAN it appears that we are passing around entire arrays, we are actually passing around pointers to (or the memory address of) the first element in the array. For example, above we passed the address of element (0,0). This corresponds to the address of element (1, 1) in FORTRAN, since FORTRAN arrays start at 1 instead of 0. The FORTRAN compiler then assumes that the next memory address is occupied by element (2,1) - but this is the opposite of what C assumes! Thus, when you pass a 2D array to a FORTRAN function from C (or vice versa) you are actually passing the transpose of that array. If you do the math, you will see that the program above multiplied the transpose of A by b. Thus, before calling the function, we should have transposed A to get the correct answer. This rule is true for higher dimensional arrays as well. Thus we have another simple rule: Always transpose multidimensional arrays before passing them to functions.
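The transpose rule can be sketched in plain C as follows. Both function names are hypothetical: transpose_square copies a row-major matrix into column-major order, and matvec_column_major is only a pure C stand-in that reads memory the way the FORTRAN matmult subroutine does, so that the sketch can be tried without a FORTRAN compiler.

```c
#include <assert.h>

/* Copy an n-by-n matrix stored in C (row-major) order into dst so that
 * dst holds the same matrix in FORTRAN (column-major) order. For a
 * square matrix this is simply a transpose. */
void transpose_square(const int *src, int *dst, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            dst[j * n + i] = src[i * n + j];
        }
    }
}

/* Hypothetical pure C stand-in for the FORTRAN matmult subroutine: it
 * reads A in column-major order, so element (i, j) lives at A[j*n + i]
 * (using 0-based i and j), and it takes n by reference. */
void matvec_column_major(const int *A, const int *b, int *c, const int *n)
{
    for (int i = 0; i < *n; i++) {
        c[i] = 0;
        for (int j = 0; j < *n; j++) {
            c[i] += A[j * (*n) + i] * b[j];
        }
    }
}
```

Passing the untransposed A from call_matmult.c to matvec_column_major gives (2, 3, 4), while passing the output of transpose_square gives the expected (6, 2, 1). With the real FORTRAN routine, the transposed copy would be handed to matmult_ in the same way.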
Before moving on, I just want to mention that C arrays assume indexes start at 0 (element 0 is the first element of the array) while FORTRAN assumes indexes start at 1. The arrays have exactly the same size and occupy the same amount of memory - this distinction is simply notational so you just have to remember that element i in a C array is element i+1 in a FORTRAN array.
String Arguments
Warning: Calling functions with strings in FORTRAN is very compiler dependent. Make sure you look at your compiler documentation if you are not using GCC
Things get a little more complicated when we try to pass strings between C and FORTRAN. Let's demonstrate this with a simple example. In this case, I am going to call a C function from FORTRAN (not very much changes). I've written a function in C called printenv_, in the file printenv.c. This function takes two arguments: a pointer to a char (or character) and an integer. The first argument represents a FORTRAN string, and the second argument is the length of that string. The string is the name of an environment variable. The function simply gets and prints the value of that environment variable.
printenv.c
| Line | |
|---|---|
| 1 | #include <stdlib.h> |
| 2 | #include <stdio.h> |
| 3 | |
| 4 | void printenv_(char *str, int length) |
| 5 | { |
| 6 | int i; |
| 7 | for(i = 0; i < length; i++) { |
| 8 | if(str[i] == ' ') { |
| 9 | str[i] = 0; |
| 10 | break; |
| 11 | } |
| 12 | } |
| 13 | |
| 14 | printf("getenv(%s) = %s\n", str, getenv(str)); |
| 15 | |
| 16 | str[i] = ' '; |
| 17 | } |
Before describing how this function works, let's look at the FORTRAN code. The file call_printenv.f90 contains code to call printenv_. Note that, although the string "HOST" is four characters long, I've created a string of length 5 to store it. The simple printenv_ function I wrote needs the length of the string (5, in this case) to be at least 1 character longer than the name of the environment variable. This is not strictly necessary, but it simplifies the code a lot, as we will see later.
call_printenv.f90
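| Line | |
|---|---|
| 1 | program call_printenv |
| 2 | implicit none |
| 3 | external printenv |
| 4 | |
| 5 | character(len=5) :: var |
| 6 | |
| 7 | var = "HOST" |
| 8 | call printenv(var) |
| 9 | |
| 10 | end program call_printenv |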
This FORTRAN program calls printenv with the string "HOST". Note that printenv has been declared external on line 3, which tells the FORTRAN compiler that this function is defined somewhere external to call_printenv.f90. On my laptop, the value of the HOST environment variable is set to "animal3". Let's compile and run the program:
$> gcc -c printenv.c
$> gfortran -c call_printenv.f90
$> gfortran printenv.o call_printenv.o
$> ./a.out
getenv(HOST) = animal3
It seems to work as expected - now let's look at what is going on. The first thing that pops out is that the C function appears to take 2 arguments while the FORTRAN call only passes 1 argument. The second argument is the length of the string - 5 in this case. When FORTRAN passes a string to a function, it implicitly includes the length. You don't see this when you write FORTRAN code, but rest assured that the compiler does include the length when it compiles the code. Thus, the C function must be prepared to accept the length of the string along with a pointer to the start of the string. Note that the type of this hidden length is itself compiler dependent: older versions of gfortran pass it as an int, as in this example, while gfortran 8 and later pass it as a size_t.
FORTRAN and C store strings in completely different ways. As we've seen above, FORTRAN hides the length of the string along with the string. This allows you to access the length of the string using the len function. If you create a string with a certain length and assign it a string that is shorter, FORTRAN automatically appends white space to the string. In the example above, var has length 5 but on line 7 we assign it the string "HOST" which has only four characters. After line 7, var has the value "HOST " (note the additional space at the end). C doesn't store the length of the string anywhere. In C, a string is simply an array of characters, just like any other array. To know when a string ends, C functions typically assume that the memory location immediately following the last character of the string holds the value 0. Thus to determine the length of the string you simply count the number of characters until you get to the number 0. Note that this is not the character "0" it is the number zero.
So on lines 6-12 of printenv.c, I have to manually convert the FORTRAN style string into a C style string. I do this by looping through the string until I reach a blank character. Since I know that environment variables can't have spaces in them, I know that this is the end of the string. I manually insert a 0 at line 9 so that when I call the getenv, the string is in the correct format and everything works. Before the function exits, I convert the string back to FORTRAN style. Note, that I assume that there is at least one blank character at the end of the string. To do this more robustly, you should create a temporary character array in C and copy the FORTRAN string into it - making sure to strip trailing blank characters and insert the zero at the end of the string.
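The more robust approach described above might look something like the following sketch. The helper name fortran_to_c_string and its signature are assumptions made for illustration; it copies the FORTRAN string into a caller-supplied buffer instead of modifying it in place, so it also works when the string has no trailing blank.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a blank-padded FORTRAN string (src, with the given length) into
 * dst as a NUL-terminated C string, dropping trailing blanks. At most
 * dst_size - 1 characters are copied. Returns the length of the
 * resulting C string. fortran_to_c_string is a hypothetical helper. */
size_t fortran_to_c_string(const char *src, int length,
                           char *dst, size_t dst_size)
{
    assert(dst_size > 0);

    /* Walk backwards past any trailing blanks. */
    size_t n = (length > 0) ? (size_t)length : 0;
    while (n > 0 && src[n - 1] == ' ') {
        n--;
    }

    /* Truncate if the destination buffer is too small. */
    if (n > dst_size - 1) {
        n = dst_size - 1;
    }

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Inside printenv_, one could call this helper with a local character array and then pass that array to getenv. It is also worth checking the result of getenv, since it returns a null pointer when the variable is not set.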
This example demonstrates that passing strings between C and FORTRAN is a little annoying, and should be avoided if possible.
Numeric Types
In all of the examples so far we have passed integers between C and FORTRAN. What if we want to pass numbers that aren't just integers? Or what if we want to pass two byte integers or one byte integers? What about four/eight byte real numbers? Luckily, all of the types mentioned above have simple analogues in both C and FORTRAN. The table below relates the FORTRAN and C numeric types. I've only included a few types, but they should be enough to get you started.
| FORTRAN Type | C/C++ Type |
|---|---|
| integer(1) | signed char |
| integer(2) | short |
| integer(4) | int |
| integer(8) | long long |
| real(4) | float |
| real(8) | double |
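The sizes of some C types (long in particular) vary between platforms, so one way to make the correspondence explicit is to use the fixed-width types from stdint.h. The sketch below assumes, as is the case with gfortran, that the FORTRAN kind number equals the size in bytes; the fortran_* typedef names are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Fixed-width C types that match the FORTRAN kinds by size, assuming
 * (as with gfortran) that the kind number is the size in bytes. */
typedef int8_t  fortran_int1;   /* integer(1) */
typedef int16_t fortran_int2;   /* integer(2) */
typedef int32_t fortran_int4;   /* integer(4) */
typedef int64_t fortran_int8;   /* integer(8) */
typedef float   fortran_real4;  /* real(4)    */
typedef double  fortran_real8;  /* real(8)    */

/* Fail at compile time if the floating point sizes are not what the
 * table assumes on this platform. */
static_assert(sizeof(fortran_real4) == 4, "float is not 4 bytes");
static_assert(sizeof(fortran_real8) == 8, "double is not 8 bytes");
```

A C declaration for a FORTRAN routine taking an integer(8) argument would then use fortran_int8 * rather than a type whose size depends on the platform.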
Name Mangling
We saw in the last section that the arguments to functions have to be carefully changed in order to make a C/FORTRAN function call work. The other change we saw, which we've been ignoring so far, is that the name of the function itself changes. More specifically, it seems that we've had to append an underscore to the name whenever we've called a FORTRAN function from C, but why? The reason can be found in the linking step. Remember that, above, I wrote that during the compiling step all of the source code is turned into ones and zeros except the function names and calls. So the object files that are produced during the compiling step contain a bunch of ones and zeros and a bunch of function names. You might expect that the name of the function in the object file is the same as the name of the function in the source file (either C, C++, or FORTRAN) but this is only true in C. When gfortran compiles a file, it automatically tacks an underscore onto the function name. But this is not all - gfortran automatically converts all function names to lowercase. Thus, if you have a function or subroutine called Test in your FORTRAN source code, you will end up with a function called test_ in the object file. This process of converting the function name in the source code to the function name in the object file is called name mangling. We've seen that gfortran's name mangling is pretty simple:
- Add an underscore to the end of the function/subroutine name
- Convert function/subroutine name to lowercase
Note: At this point I go into a lot of detail about name mangling. If you are not interested in C++, and you are only using gcc/gfortran, then you can probably skip this section, although I think it is interesting :)
C's name mangling is even simpler...do nothing. If you make a function called test in your C file, you will end up with a function called test in your object file. This is critical, because it allows us to rename our C functions to accommodate the FORTRAN name mangling system. Unfortunately, not every FORTRAN compiler uses the same name mangling system. So, how do we know what to name our functions? We need some kind of utility which allows us to look at an object file and see what the function names are - that utility is called nm. Let's do an example using our matrix multiplication example from before. We're going to compile matmult.f90 and call_matmult.c. We're then going to use nm to see what functions are defined and called in the object files.
| Line | |
|---|---|
| 1 | $> gfortran -c matmult.f90 # Generates object file matmult.o |
| 2 | $> gcc -c call_matmult.c # Generates object file call_matmult.o |
| 3 | $> nm matmult.o |
| 4 | 00000000 T matmult_ |
| 5 | $> nm call_matmult.o |
| 6 | 00000000 T main |
| 7 | U matmult_ |
| 8 | U printf |
On line 3 above, we use nm to examine the object file matmult.o. The next line tells us that there is a function called matmult_ defined at address 0 in the file (address 0 is the beginning of the file). The capital T tells us that the function is defined at address 0 and not called at address zero. As we can see, the name mangling is exactly as we discussed - gfortran has appended an underscore just like we expected. On line 5, we use nm to examine the contents of call_matmult.o - the object file generated when we compiled call_matmult.c. Again, nm tells us that at address zero in the file, a function called main is defined. But there is more - nm also tells us that main contains two function calls - one to matmult_ and one to printf. The capital U tells us that the functions are called and not defined. Let's ignore the printf call for now. During the linking step all the linker does is match up function calls with function definitions. You can use the nm utility to understand exactly what name mangling is occurring.
Let's do a quick exercise. I've created a new file called matmult_mod.f90, where I've copied the matmult subroutine and placed it in a FORTRAN module.
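Since the mangling scheme can differ between FORTRAN compilers, one common way to keep it in a single place is a small preprocessor macro. The sketch below is only an illustration: FORTRAN_NAME is a hypothetical macro, and the addints_ defined here is a pure C stand-in so that the example builds without a FORTRAN compiler.

```c
#include <assert.h>

/* Hypothetical macro that applies gfortran's default mangling to a
 * routine name by appending one underscore. The name must already be
 * written in lowercase, since the preprocessor cannot change case. For
 * a compiler with a different scheme, only this definition would need
 * to change. */
#define FORTRAN_NAME(name) name##_

/* Pure C stand-in for the FORTRAN function addints. After macro
 * expansion it is defined under the mangled name addints_. */
int FORTRAN_NAME(addints)(const int *a, const int *b)
{
    assert(a && b);
    return *a + *b;
}
```

A caller can then write FORTRAN_NAME(addints)(&a, &b) everywhere, and running nm on the resulting object file should still show the symbol addints_.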
matmult_mod.f90
| Line | |
|---|---|
| 1 | module matmult_module |
| 2 | |
| 3 | contains |
| 4 | |
| 5 | subroutine matmult(A, b, c, n) |
| 6 | implicit none |
| 7 | |
| 8 | integer, intent(in) :: n |
| 9 | integer, intent(in) :: A(n, n) |
| 10 | integer, intent(in) :: b(n) |
| 11 | integer, intent(out) :: c(n) |
| 12 | |
| 13 | integer :: i, j |
| 14 | |
| 15 | ! Multiply the matrixes: |
| 16 | do i = 1, n |
| 17 | c(i) = 0 |
| 18 | do j = 1, n |
| 19 | c(i) = c(i) + A(i, j) * b(j) |
| 20 | end do |
| 21 | end do |
| 22 | |
| 23 | end subroutine matmult |
| 24 | end module matmult_module |
Now let's try to compile and link matmult_mod.f90 and call_matmult.c:
$> gfortran -c matmult_mod.f90 # Generates matmult_mod.o
$> gcc -c call_matmult.c # Generates call_matmult.o
$> gcc matmult_mod.o call_matmult.o -l gfortran
call_matmult.o: In function `main':
call_matmult.c:(.text+0x8a): undefined reference to `matmult_'
collect2: ld returned 1 exit status
As you can see, the linker generated an error. It is telling us that it is looking for a function called matmult_, but it can't find one. Let's use nm to examine the object file to see what happened.
$> nm matmult_mod.o
00000000 T __matmult_module_MOD_matmult
Wow - what happened? The line above is telling us that the object file matmult_mod.o doesn't have a function called matmult_ defined, but instead has a function called __matmult_module_MOD_matmult defined. This example demonstrates that the name mangling gets much worse when we use "fancy" language features like modules. This is why I have recommended that you not use modules in your FORTRAN/C interface (but DEFINITELY use them elsewhere in your program!). Note that we could rename our C function to be __matmult_module_MOD_matmult instead of matmult_ and everything would work fine - but this is obviously not ideal.
But why does gfortran do this? Why does putting a subroutine in a module affect the name mangling system? Well, let's assume that gfortran used the same name mangling system for functions in modules (it lowercased everything and appended an underscore). What would happen if we created a different module that had a function of the same name? Then there would be two functions with the same name in our program - which is not allowed. Thus FORTRAN compilers must further mangle names to account for the fact that multiple modules can have functions of the same name.
What about C++?
So far we've ignored C++, but now that we understand name mangling we can tackle it. C++ has a problem similar to the one that arises when we introduce modules into FORTRAN: C++ programs can contain several functions with the same name. Therefore we expect C++ to have a name mangling problem. Let's do a simple example. I'm going to go back to our environment variable example. I'm going to rebuild the example, but I'm going to use the C++ compiler, g++, instead of gcc to compile printenv.c. Since C++ is very nearly a superset of C, I can compile this C file with a C++ compiler.
$> g++ -c printenv.c
$> gfortran -c call_printenv.f90
$> gfortran printenv.o call_printenv.o -l stdc++
call_printenv.o: In function `MAIN__':
call_printenv.f90:(.text+0x6e): undefined reference to `printenv_'
collect2: ld returned 1 exit status
The linker has generated an error. It tells us that it can't find a function called printenv_. Let's use nm to examine printenv.o.
$> nm printenv.o
00000000 T _Z9printenv_Pci
U __gxx_personality_v0
U getenv
U printf
The first line tells us that there is a function called _Z9printenv_Pci defined at address zero, instead of the function we expected: printenv_. You have just been introduced to C++'s name mangling system which seems to make absolutely no sense. However, if we look a little deeper we can understand why the C++ system needs to be more complicated than FORTRAN's. In C++ functions can have the same name, as long as they have different arguments. Thus, in order to distinguish functions, the C++ compiler needs a way of encoding information about function arguments into the name placed in the object file. The end result is that you end up with something crazy looking. To emphasize this point, you can use the --demangle option for nm to decode the function name:
$> nm --demangle printenv.o
00000000 T printenv_(char*, int)
U __gxx_personality_v0
U getenv
U printf
Now you can see that there is a function called printenv_ which takes two arguments, a pointer to a character and an integer, defined at address zero in the file. So this shows that C++ name mangling is just a system for encoding argument types into the function name so that multiple functions can share the same name. Ok...but doesn't this make it very difficult to link C++ programs to everything else? It would make it very difficult, except for the fact that C++ has a language feature which enables us to turn off name mangling. I've demonstrated this in the file printenv.cpp.
printenv.cpp
| Line | |
|---|---|
| 1 | #include <stdlib.h> |
| 2 | #include <stdio.h> |
| 3 | |
| 4 | extern "C" { |
| 5 | void printenv_(char *str, int length) |
| 6 | { |
| 7 | int i; |
| 8 | for(i = 0; i < length; i++) { |
| 9 | if(str[i] == ' ') { |
| 10 | str[i] = 0; |
| 11 | break; |
| 12 | } |
| 13 | } |
| 14 | |
| 15 | printf("getenv(%s) = %s\n", str, getenv(str)); |
| 16 | |
| 17 | str[i] = ' '; |
| 18 | } |
| 19 | } |
You can see that printenv.cpp is identical to printenv.c except that I've wrapped the function definition in an extern "C" block. The extern "C" block tells C++ to use C's name mangling system, which is to not do anything to the names. Thus, using extern "C" we can get our C++ programs to work with C and FORTRAN. Other than this, everything we've said about C applies to C++.
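When the same declarations have to be read by both a C compiler and a C++ compiler, the extern "C" block is often wrapped in a preprocessor check, because extern "C" itself is not valid C. The following is a minimal sketch of that pattern; the routine sum_array_ is hypothetical and is only there to give the block something to declare.

```c
#include <assert.h>

/* When compiled as C++, __cplusplus is defined and the declaration is
 * wrapped in extern "C"; when compiled as C, the wrapper disappears. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical routine intended to be callable from FORTRAN as
 * sum_array(values, n): both arguments arrive by reference. */
int sum_array_(const int *values, const int *n);

#ifdef __cplusplus
}
#endif

/* The definition picks up the linkage of the declaration above, so no
 * second extern "C" is needed here. */
int sum_array_(const int *values, const int *n)
{
    assert(values && n);
    int total = 0;
    for (int i = 0; i < *n; i++) {
        total += values[i];
    }
    return total;
}
```

Compiled with either gcc or g++, the object file should contain the unmangled symbol sum_array_, which can be confirmed with nm as before.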
Standard Libraries and Compilers
You may have noticed a few things during the above examples. The first is that I've seemingly randomly been including additional linker options like -l gfortran and -l stdc++. I've also been using different compilers at different points in the building process. Let's go back to the adding example, where I wrote a simple function to add two numbers. The steps for building this program were:
| Line | |
|---|---|
| 1 | $> gcc -c call_add.c |
| 2 | $> gfortran -c add.f90 |
| 3 | $> gcc call_add.o add.o -l gfortran |
| 4 | $> ./a.out |
| 5 | 3 + 4 = 7 |
The first line compiles call_add.c. Since this is a C source file, we have to use the C compiler, gcc, to compile it. The second line involves compiling the FORTRAN source file add.f90 using the FORTRAN compiler gfortran - no surprises there. On the third line, we link the object files together to generate a program. But why do we use gcc here, why not gfortran? After all, we are linking one object file that came from FORTRAN and one that came from C. A simple rule of thumb is to use the compiler corresponding to the language your main function is written in. So if you define a main function in C, use gcc (this is the case here). If you use the program statement in FORTRAN, use gfortran (that was the case in the printenv example).
But why do we have to include the -l gfortran option? Again, nm can be useful here.
$> gfortran -c add.f90
$> nm add.o
U _gfortran_st_write
U _gfortran_st_write_done
U _gfortran_transfer_character
00000000 T addints_
As expected, the file add.o contains the definition of a function called addints_. But the file also seems to call the functions _gfortran_st_write, _gfortran_st_write_done, and _gfortran_transfer_character...which is weird because we never called any of these functions. We may not have called any of these functions - but we did use the print command. When you use print, gfortran will use the functions listed above to actually implement the printing behind the scenes. These functions are defined in the gfortran standard library. That is why we have to include the -l gfortran linker option. When you link with gfortran, the compiler automatically includes the FORTRAN library and you never have to worry about it. For the same reason, in the printenv C++ example, we linked with gfortran and we had to manually link to the C++ standard library using -l stdc++.
Why don't we have to include the C standard library when we link with FORTRAN, as in the printenv C example? The reason is that the gfortran library depends on the C standard library, so gfortran will automatically link to both. It doesn't hurt to link to the C standard library anyway, however. You can see that the process of linking to the right library can be pretty complicated, but two simple rules make it easy:
- Link your program using the compiler associated with the main function
- Link to the standard library of the other language
  - -l c for C
  - -l stdc++ for C++
  - -l gfortran for gfortran
Note that these options will change on other compilers.
Conclusions
There are a lot of nuances to getting C, C++ and FORTRAN programs to interact. But I hope that I've managed to explain some of the key concepts.