Introduction

This tool is designed to parse C header files to generate Ada package specifications that correspond to the header information (typically referred to as a thin binding).

Subprograms are defined with the Import aspect so that calls to the subprogram from Ada will directly call the subprogram from the C library.

Example

Consider the following C header file:

example.h:

#ifndef EXAMPLE_H
#define EXAMPLE_H
#define PASSWD_UMASK   022
struct defobj {
     struct syment  *symtab;
     int             nsymtab;
};
enum pwdbstat_e { PdbOk, NoPasswd, NoShadow, BusyPasswd };
char * pwdb_errstr ( enum pwdbstat_e e[], struct defobj d);
#endif

Running the command h2ads example.h would generate the following Ada specification:

example_h.ads:

pragma Ada_2012;

pragma Style_Checks (Off);
pragma Warnings (Off, "-gnatwu");

with Interfaces.C; use Interfaces.C;
with System;
with Interfaces.C.Strings;
package Example_H is
    PASSWD_UMASK : constant := 18; -- 022
    type Syment is null record;
    type Defobj is record
       Symtab : access Syment;
       Nsymtab : aliased int;
    end record
    with Convention => C_Pass_By_Copy;

    type Pwdbstat_E is (
       Pdb_Ok,
       No_Passwd,
       No_Shadow,
       Busy_Passwd)
    with Convention => C;

    function Pwdb_Errstr (
       E : access Pwdbstat_E;
       D : Defobj)
        return Interfaces.C.Strings.Chars_Ptr
    with Import => True,
         Convention => C,
         External_Name => "pwdb_errstr";

end Example_H;

pragma Style_Checks (On);
pragma Warnings (On, "-gnatwu");

Behavior

When the tool parses a file, it will generate an Ada package specification containing all the items necessary to interface with the C library. The specification is designed to be binary-compatible with the C call.

Some examples:

  • Records in Ada will have the same physical alignment as the C structure

  • Enumerated types will use representation clauses when the C enum defines values for the enumerals

  • Unions will be defined as variant records with the Unchecked_Union aspect
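As an illustration of the constructs listed above, here is a small hypothetical C header. The names color_e, value_u, and sample_s are invented for this sketch and do not come from the tool itself. Going by the rules above, the enum would be expected to need a representation clause because its enumerals have explicit values, the union would map to a variant record with the Unchecked_Union aspect, and the struct would become a record with a matching layout. The exact Ada text that h2ads emits for this input is not shown here.

```c
/* shapes.h (hypothetical header used only for illustration) */
#ifndef SHAPES_H
#define SHAPES_H

/* Explicit values: the Ada enumeration needs matching representation values. */
enum color_e { ColorRed = 1, ColorGreen = 2, ColorBlue = 4 };

/* All members share the same storage, as in an Ada Unchecked_Union. */
union value_u {
    int    as_int;
    double as_double;
};

/* A struct whose Ada record must keep the same member layout. */
struct sample_s {
    enum color_e  color;
    union value_u value;
};

#endif
```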

To prevent duplication of data, when a parsed file contains #include directives, the included header will be parsed separately to create its own Ada package specification. So, parsing a single header file can generate multiple Ada packages.

Package Naming Conventions

When a specification is generated, the package name will be the header’s file name including the file extension. For example, stdio.h will generate an Ada package called Stdio_H. When dealing with a hierarchical file structure (e.g. the sys folder in a typical include directory), an empty package will be created for the folder, and then the header will become a child of that package (e.g. sys/types.h becomes Sys.Types_H).

To create an API subsystem, the user can specify a top-level package name such that all the generated specifications become children of the top-level package. For example, if you want the generated APIs to be child packages of C_Apis, then the command:

h2ads --parent C_Apis stdio.h sys/types.h

would generate the files c_apis-stdio_h.ads and c_apis-sys-types_h.ads.

Macro Values

Macro expansions often depend on compile-time switches, or even on macros predefined by the compiler (for instance, gcc on Linux automatically defines __linux). h2ads does not know any of these values, which can cause the translation process to either fail or generate the binding incorrectly.

To get around this issue, on the command line you can specify a file that will be treated as if it is included at the top of the header file. This file can contain your own preprocessor statements (even #include), or a list of macros that are typically defined by the compiler.

h2ads --include macro_values.txt my_file.h

will parse my_file.h as if macro_values.txt were included at the top of the file.
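To make the failure mode concrete, the sketch below shows a hypothetical header whose declarations depend on a compiler-defined macro. The type name my_off_t and the macro MY_HAVE_LARGE_OFFSETS are invented for illustration. If nothing defines __linux, a parser takes the #else branch; an included macro file containing the single line #define __linux 1 would select the first branch instead, so the two runs could produce different bindings.

```c
/* my_file.h (hypothetical) */
#ifndef MY_FILE_H
#define MY_FILE_H

#ifdef __linux
/* Branch seen when __linux is defined, e.g. through an included macro file. */
typedef long my_off_t;
#define MY_HAVE_LARGE_OFFSETS 1
#else
/* Branch seen when nothing defines __linux. */
typedef int my_off_t;
#define MY_HAVE_LARGE_OFFSETS 0
#endif

#endif
```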

Note

For gcc, to get a list of the predefined macros for your version and platform, you can run the command gcc -E -dM foo.h, where foo.h is an empty file.

How to Run ‘h2ads’

Running h2ads without any arguments (or specifying --help) gives the usage information and a description of the switches.

Usage: h2ads [switches] {filename | directory} [ {filename | directory} ...]

  -h, --help              Display help
  -b, --base ARG          Specify top level of source file hierarchy (default=current)
  -c, --copyright         Insert comments at beginning of header file as copyright text
  -d, --destination ARG   Specify directory to store generated files (default=current)
  -D,  ARG                Specify macro definition
  -f, --files ARG         Specify text file containing headers to process
  -I,  ARG                Specify additional include directories (passed as '-I <directory>')
  -include, --include ARG File containing additional header information
  -n, --nested ARG        Comma-separated list of files to treat as nested packages
  -p, --parent ARG        Top-level package name
  -q, --quiet             Only display error messages
  -r, --recursive         Recursively parse directories
  -s, --spark             Add 'with Spark_Mode => Off' to packages
  -x, --no_file_extension Do not append header file extension to Ada unit name
  --stdinc                Enable parsing of standard include directory. (Do not pass '-no-standard-includes' to parser.)

h2ads takes a list of filenames and directories to process. For a filename, the tool will generate an Ada specification (plus additional specifications needed to support the #include directives). For a directory, every file in the directory will be processed to generate a specification file.

The tool will generate file(s) with the naming convention of <filename>_<extension>.ads - so stdio.h will generate stdio_h.ads. A separate file will be generated for each header file included in the original file.

Upon completion, the original header file name will be written to standard output with an indication of the success of the binding generation.

Command-Line Switches

-h, --help

Print the usage information.

-b, --base <directory name>

When parsing header files, sometimes a #include directive contains a relative path. When that is the case, you can use this option to specify the ‘base’ directory - indicating what directory the path is actually relative to.

If not specified: The path will be considered relative to the directory containing the header file in which the #include directive appears.

--compiler <compiler name>

When generating Ada bindings, assume that the C headers are compiled with the given compiler (assumed to be on the PATH). This means the bindings will be generated according to the compiler configuration, i.e. its standard header search paths and compiler-defined macros. The gcc and clang compiler families are supported.

-d, --destination <directory name>

Directory where Ada specification files will be stored.

-D <macro[=value]>

Specify a macro (and value) to be passed into the parser.

-f, --files <text file>

Rather than specifying a list of files and/or directories on the command line, you can specify a text file containing the items to process (one per line). This switch can be used instead of, or in addition to, any command-line arguments.

If not specified: Only files and directories specified as arguments will be parsed.

-I <additional directory>

Sometimes, when compiling C headers, additional directories need to be specified for the compiler to find included files. This option can be specified multiple times to supply those directories.

If not specified: Paths to all included headers should be resolvable in context.

-include, --include <text file>

This option allows you to specify a file containing additional header information, such as compiler- or user-supplied macro values. Such a file would typically contain macro definitions that would otherwise be specified on the compiler’s command line.

If not specified: All macros should be resolvable in context.

-n, --nested <headers>

In certain situations, an included header ends up creating a circular dependency. One example is when header one.h defines a type MyType, and then includes header two.h which defines struct Struct containing MyType, and then one.h later uses struct Struct.

Treating one.h and two.h as separate packages would create a circular dependency. From an Ada perspective, we really want to treat the contents of two.h as nested within one.h.

To do this, use this switch to specify a (comma-separated) list of header files that should be nested within their enclosing file.

If not specified: All header files will generate their own Ada specifications.
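The circular dependency described for this switch can be sketched in C as follows. Both files are shown in a single listing, with comments marking where each one starts; the function name struct_value is invented for this example.

```c
/* ---- one.h, first part ---- */
typedef int MyType;                 /* two.h needs this type */

/* ---- two.h (pulled in by '#include "two.h"' inside one.h) ---- */
struct Struct {
    MyType value;                   /* depends on one.h */
};

/* ---- one.h, remainder ---- */
/* one.h now depends on two.h as well, closing the circle. */
static inline MyType struct_value(struct Struct s)
{
    return s.value;
}
```

If these were generated as two separate Ada packages, each package would need to 'with' the other: one for Struct, the other for MyType. Nesting the contents of two.h inside the package for one.h avoids this.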

-p, --parent <Ada package name>

Use this option to specify the parent package name for all generated packages.

h2ads --parent C_Apis my_header.h

This will generate an Ada specification named C_Apis.My_Header_H.

If not specified: Header files at the top of the directory structure will just use the Ada specification name. Header files in subdirectories will use the subdirectory name as a level in the package hierarchy.

-q, --quiet

When this switch is used, only error messages will be displayed.

If not specified: Warning messages will also be displayed. These kinds of messages generally indicate a failure to understand a particular construct. Warning messages typically mean no Ada code will be generated for the construct - the Ada specification will usually still compile, but it may be missing some information.

-r, --recursive

If this switch is set, then when a directory is passed on the command line, that directory will be processed recursively.

If not specified: Only directories explicitly passed on the command line will be traversed.

-s, --spark

Due to the limitations of building a thin binding, most of the generated specifications will not meet the restrictions of the SPARK language subset of Ada. Typically, an Ada package that does not enable SPARK restrictions will be considered outside of the SPARK subset. However, it is possible to set the build process to treat all packages as SPARK unless otherwise noted. Setting this switch will add with SPARK_Mode => Off to the package definition.

If not specified: Packages will not contain the SPARK_Mode aspect.

-x, --no_file_extension

Because code bases can use different file extensions to mean different things, h2ads by default appends the header file extension to the filename to create the Ada unit name. For example, types.h becomes package Types_H. Using this switch will prevent this behavior, such that types.h will become package Types.

Note: In case of naming collisions (e.g. types.h and types.hxx reside in the same directory), the first file encountered during processing (not necessarily the first alphabetically) will get the unit name Types, and the second will be named Types_2 (and so on for other conflicts).

--stdinc

When parsing an include directive of the form #include <filename.h> the parser may look in its “default” location for filename.h. When using this tool to parse a non-native header structure, this could be problematic. Because of that, the tool, by default, will pass -no-standard-includes to the parser. If you want the standard library location to be used, set this switch.

If not specified: Default include directory will not be searched by the parser.

Handling Language Differences

Note: Language differences were resolved with the GNAT compiler in mind. All of the design decisions will work for other compilers, but some of those decisions were required by GNAT behavior.

File Names

Because C works on files (as opposed to Ada units), filenames are not required to be legal identifiers. In GNAT, the default behavior is that the filename must match the package name. The following conventions will be enforced when converting filenames to package names.

  • All specification files will have the extension ads

  • All dots (‘.’) in a filename will be converted to underscores (‘_’)

  • Leading and trailing underscores will be replaced by ‘U’, and then separated from the rest of the name by an underscore (e.g. __file__.h will be converted to uu_file_uu_h.ads)

  • Filenames where the first non-underscore character is a digit will have c_ added to the front (e.g. 1_file.h becomes c_1_file_h.ads)

  • All other text will follow rules in the Identifiers section.

Identifiers

Because C is case-sensitive, and allows leading and trailing underscores, there is a lot of opportunity for conflicts when converting to Ada identifiers (e.g. FooBar and _foobar). The following conventions will be enforced when converting C identifiers to Ada identifiers.

  • Text that is all lower-case will have the first letter capitalized

  • Text that is all upper-case will remain all upper-case

  • Camel-case will have individual words separated by underscores, where the first letter in each word will be capitalized (e.g. FooBar will be converted to Foo_Bar).

  • Trailing and/or leading underscores will be removed

  • Consecutive underscores will be replaced by a single underscore (e.g. foo__bar becomes Foo_Bar).

  • When the entire identifier is an Ada reserved word, it will be prefaced by C_ (e.g. select becomes C_Select).

  • When multiple C names convert to identical Ada names (e.g. foo, Foo, FOO), occurrences after the first will get a numeric suffix.

  • Duplicated names in a single context (e.g. void func (int Func)) will get The_ added as a prefix (e.g. procedure Func (The_Func : Interfaces.C.Int)).
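The first six conventions above can be sketched as a small C function. This is only an illustration of the rules as stated, not the code h2ads uses: the function to_ada_identifier, its helper, and the short reserved-word list are all hypothetical. It also assumes that the first letter after each underscore is capitalized (as in pwdb_errstr becoming Pwdb_Errstr in the earlier example), and it does not handle the numeric suffix for colliding names or the The_ prefix.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Partial list for illustration only; Ada has many more reserved words. */
static const char *const ada_reserved[] = {
    "select", "type", "begin", "end", "record", "access"
};

/* Case-insensitive equality for ASCII strings. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Hypothetical helper: writes an Ada-style name for c_name into out. */
static void to_ada_identifier(const char *c_name, char *out, size_t out_size)
{
    char body[128];
    size_t len = strlen(c_name), start = 0, n = 0, i;
    int has_lower = 0;

    /* Drop leading and trailing underscores. */
    while (start < len && c_name[start] == '_') start++;
    while (len > start && c_name[len - 1] == '_') len--;

    /* Names with no lower-case letters keep their case unchanged. */
    for (i = start; i < len; i++)
        if (islower((unsigned char)c_name[i])) has_lower = 1;

    for (i = start; i < len && n + 2 < sizeof body; i++) {
        unsigned char ch = (unsigned char)c_name[i];
        unsigned char prev = (i > start) ? (unsigned char)c_name[i - 1] : '_';

        if (ch == '_') {
            if (prev != '_') body[n++] = '_';   /* collapse runs of underscores */
        } else if (prev == '_') {
            body[n++] = (char)toupper(ch);      /* first letter of each word */
        } else {
            /* A lower-to-upper transition marks a camel-case word boundary. */
            if (has_lower && isupper(ch) && islower(prev)) body[n++] = '_';
            body[n++] = (char)ch;
        }
    }
    body[n] = '\0';

    /* Prefix Ada reserved words with C_. */
    for (i = 0; i < sizeof ada_reserved / sizeof ada_reserved[0]; i++) {
        if (equal_ignore_case(body, ada_reserved[i])) {
            snprintf(out, out_size, "C_%s", body);
            return;
        }
    }
    snprintf(out, out_size, "%s", body);
}
```

With this sketch, FooBar and foo__bar both yield Foo_Bar, and select yields C_Select, matching the examples in the list.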