Introduction
This tool is designed to parse C header files to generate Ada package specifications that correspond to the header information (typically referred to as a thin binding).
Subprograms are defined with the Import aspect so that calls to the subprogram from Ada will directly call the subprogram from the C library.
Example
Consider the following C header file example.h:

    #ifndef EXAMPLE_H
    #define EXAMPLE_H

    #define PASSWD_UMASK 022

    struct defobj {
        struct syment *symtab;
        int nsymtab;
    };

    enum pwdbstat_e {
        PdbOk,
        NoPasswd,
        NoShadow,
        BusyPasswd
    };

    char * pwdb_errstr ( enum pwdbstat_e e[], struct defobj d);

    #endif
Running the simple command h2ads example.h would generate the following Ada specification example_h.ads:

    pragma Ada_2012;
    pragma Style_Checks (Off);
    pragma Warnings (Off, "-gnatwu");

    with Interfaces.C; use Interfaces.C;
    with System;
    with Interfaces.C.Strings;

    package Example_H is

       PASSWD_UMASK : constant := 18; -- 022

       type Syment is null record;

       type Defobj is record
          Symtab  : access Syment;
          Nsymtab : aliased int;
       end record
       with Convention => C_Pass_By_Copy;

       type Pwdbstat_E is
         (Pdb_Ok,
          No_Passwd,
          No_Shadow,
          Busy_Passwd)
       with Convention => C;

       function Pwdb_Errstr
         (E : access Pwdbstat_E;
          D : Defobj) return Interfaces.C.Strings.Chars_Ptr
       with Import        => True,
            Convention    => C,
            External_Name => "pwdb_errstr";

    end Example_H;

    pragma Style_Checks (On);
    pragma Warnings (On, "-gnatwu");
Behavior
When the tool parses a file, it will generate an Ada package specification containing all the items necessary to interface with the C library. The specification is designed to be binary-compatible with the C call.
Some examples:
Records in Ada will have the same physical alignment as the C structure
Enumerated types will use representation clauses when the C enum defines values for the enumerals
Unions will be defined as variant records with the Unchecked_Union aspect
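As an illustration, a header along the following lines would exercise all three cases. This is only a sketch: the type and field names (sample, color, number) are hypothetical and do not come from any real library, and the Ada text h2ads would emit for them is not shown.

```c
/* Hypothetical header contents; every name here is made up. */

/* A struct: the generated Ada record is expected to share this layout. */
struct sample {
    char tag;
    int  count;
};

/* An enum with explicit values: the case that calls for a
   representation clause on the Ada side. */
enum color { RED = 1, GREEN = 2, BLUE = 4 };

/* A union: the case that maps to a variant record with Unchecked_Union. */
union number {
    int   as_int;
    float as_float;
};
```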
To prevent duplication of data, when a parsed file contains #include directives, the included header will be parsed separately to create its own Ada package specification. So, parsing a single header file can generate multiple Ada packages.
Package Naming Conventions
When a specification is generated, the package name will be the header’s file name including the file extension. For example, stdio.h will generate an Ada package called Stdio_H. When dealing with a hierarchical file structure (e.g. the sys folder in a typical include directory), an empty package will be created for the folder, and then the header will become a child of that package (e.g. sys/types.h becomes Sys.Types_H).
To create an API subsystem, the user can specify a top-level package name such that all the generated specifications become children of the top-level package. For example, if you want the generated APIs to be child packages of C_Apis, then the command:

    h2ads --parent C_Apis stdio.h sys/types.h

would generate the files c_apis-stdio_h.ads and c_apis-sys-types_h.ads.
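The mapping from a header path to a specification file name can be sketched in a few lines of C. This is a minimal illustration, not the tool's implementation: spec_file_name is a hypothetical helper, the parent is assumed to be a single identifier, and only the '/' and '.' handling described above is modeled (the identifier rules are ignored).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of h2ads). Pass NULL or "" for no parent.
   Returns a pointer to a static buffer that the next call overwrites. */
const char *spec_file_name(const char *parent, const char *header)
{
    static char out[512];
    char unit[256];
    size_t n = 0;
    const char *p;

    /* Start with the optional parent package, followed by a separator. */
    if (parent != NULL && *parent != '\0') {
        for (p = parent; *p != '\0' && n + 2 < sizeof unit; p++)
            unit[n++] = *p;
        unit[n++] = '/';
    }

    /* Append the header path. */
    for (p = header; *p != '\0' && n + 1 < sizeof unit; p++)
        unit[n++] = *p;
    unit[n] = '\0';

    /* Map to a file name: lower case, '/' -> '-', '.' -> '_'. */
    for (n = 0; unit[n] != '\0'; n++) {
        unsigned char c = (unsigned char)unit[n];
        if (c == '/')
            unit[n] = '-';
        else if (c == '.')
            unit[n] = '_';
        else
            unit[n] = (char)tolower(c);
    }

    snprintf(out, sizeof out, "%s.ads", unit);
    return out;
}
```

Under these assumptions, spec_file_name("C_Apis", "sys/types.h") yields c_apis-sys-types_h.ads, matching the file names given above.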
Macro Values
Macro expansions often depend on compile-time switches, or even on macros pre-defined by the compiler. (For instance, gcc on Linux automatically defines __linux.) h2ads does not know any of these values, which can cause the translation process to either fail or generate the binding incorrectly.
To get around this issue, on the command line you can specify a file that will be treated as if it is included at the top of the header file. This file can contain your own preprocessor statements (even #include), or a list of macros that are typically defined by the compiler.

    h2ads --include macro_values.txt my_file.h

will parse my_file.h as if macro_values.txt were included at the top of the file.
Note: For gcc, to get a list of the predefined macros for your version / platform, you can run the command gcc -E -dM foo.h, where foo.h is an empty file.
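To see why a missing macro value matters, consider the sketch below. The macro USE_WIDE_IDS and the type record_id are hypothetical; the first line stands in for the contents of a file passed with --include, and the rest for the header being parsed.

```c
/* Stand-in for a hypothetical macro_values.txt */
#define USE_WIDE_IDS 1

/* Stand-in for a hypothetical my_file.h: the type chosen here depends
   entirely on the macro. If USE_WIDE_IDS were undefined, the #if would
   evaluate to 0 and record_id would silently become an int. */
#if USE_WIDE_IDS
typedef long long record_id;
#else
typedef int record_id;
#endif
```

Without the definition, a parser would take the #else branch and the resulting binding would describe a different type than the one the C library was built with.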
How to Run ‘h2ads’
Running h2ads without any arguments (or by specifying --help) gives the usage information and description of the switches.
Usage: h2ads [switches] {filename | directory} [ {filename | directory} ...]
-h, --help Display help
-b, --base ARG Specify top level of source file hierarchy (default=current)
-c, --copyright Insert comments at beginning of header file as copyright text
-d, --destination ARG Specify directory to store generated files (default=current)
-D, ARG Specify macro definition
-f, --files ARG Specify text file containing headers to process
-I, ARG Specify additional include directories (passed as '-I <directory>')
-include, --include ARG File containing additional header information
-n, --nested ARG Comma-separated list of files to treat as nested packages
-p, --parent ARG Top-level package name
-q, --quiet Only display error messages
-r, --recursive Recursively parse directories
-s, --spark Add 'with Spark_Mode => Off' to packages
-x, --no_file_extension Do not append header file extension to Ada unit name
--stdinc Enable parsing of standard include directory. (Do not pass '-no-standard-includes' to parser.)
h2ads takes a list of filenames and directories to process. For a filename, the tool will generate an Ada specification (plus additional specifications needed to support the #include directives). For a directory, every file in the directory will be processed to generate a specification file.
The tool will generate file(s) with the naming convention of <filename>_<extension>.ads - so stdio.h will generate stdio_h.ads. A separate file will be generated for each header file included in the original file.
Upon completion, the original header file name will be written to standard output with an indication of the success of the binding generation.
Command-Line Switches
-h, --help
Print the usage information.
-b, --base <directory name>
When parsing header files, sometimes a #include directive contains a relative path. When that is the case, you can use this option to specify the ‘base’ directory, indicating what directory the path is actually relative to.
If not specified: The path will be considered relative to the directory where the header file containing the #include directive is located.
-c, --copyright
Many headers contain comments at the beginning of the file which can include copyright information, a description of the contents, and/or change history. As the Ada specification is just a translation of the C header, this information is usually still appropriate. By setting this switch, all comments at the beginning of the header up to the first non-comment will be copied to the beginning of the specification.
If not specified: No comments will be processed.
--compiler <compiler name>
When generating Ada bindings, assume that the C headers are compiled with the given compiler (assumed to be on the PATH). This means the bindings will be generated according to the compiler configuration, i.e. its standard header search paths and compiler-defined macros. Supported compilers are the gcc and clang families.
-d, --destination <directory name>
Directory where Ada specification files will be stored.
-D <macro[=value]>
Specify a macro (and value) to be passed into the parser.
-f, --files <text file>
Rather than specifying a list of files and/or directories on the command line, you can specify a text file containing the items to process (one per line). This switch can be used instead of or in addition to any command line arguments.
If not specified: Only files and directories specified as arguments will be parsed.
-I <additional directory>
Sometimes, when compiling C headers, additional directories need to be specified for the compiler to find included files. This option can be specified multiple times to supply those directories.
If not specified: Paths to all included headers should be resolvable in context.
-include, --include <text file>
This option allows you to specify a file containing additional header information, such as compiler- or user-supplied macro values. Such a file would typically contain macro definitions that would otherwise be specified on the compiler’s command line.
If not specified: All macros should be resolvable in context.
-n, --nested <headers>
In certain situations, an included header ends up creating a circular dependency. One example is when header one.h defines a type MyType, and then includes header two.h which defines struct Struct containing MyType, and then one.h later uses struct Struct.
Treating one.h and two.h as separate packages would create a circular dependency. From an Ada perspective, we really want to treat the contents of two.h as nested within one.h.
To do this, use this switch to specify a (comma-separated) list of header files that should be nested within their enclosing file.
If not specified: All header files will generate their own Ada specifications.
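The circular dependency described for this switch can be pictured with the sketch below, which shows the two headers inline in the order a preprocessor would see them when one.h is parsed. The member names and struct Holder are hypothetical additions for illustration; only MyType and struct Struct come from the description above.

```c
/* --- one.h, first part --- */
typedef int MyType;              /* defined by one.h */

/* --- two.h, pulled in by an #include in one.h --- */
struct Struct {
    MyType value;                /* two.h needs a type from one.h ... */
};

/* --- one.h, remainder --- */
struct Holder {                  /* hypothetical user of struct Struct */
    struct Struct item;          /* ... and one.h needs a type from two.h */
};
```

With separate packages, the package for one.h would need the package for two.h and vice versa. Assuming the switch is used as documented, a command such as h2ads --nested two.h one.h would ask the tool to nest the contents of two.h inside the package for one.h instead.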
-p, --parent <Ada package name>
Use this option to specify the parent package name for all generated packages. For example,

    h2ads --parent C_Apis my_header.h

will generate an Ada specification named C_Apis.My_Header_H.
If not specified: Header files at the top of the directory structure will just use the Ada specification name. Header files in subdirectories will use the subdirectory name as a hierarchy level for its included files.
-q, --quiet
When this switch is used, only error messages will be displayed.
If not specified: Warning messages will also be displayed. These kinds of messages generally indicate a failure to understand a particular construct. Warning messages typically mean no Ada code will be generated for the construct - the Ada specification will usually still compile, but it may be missing some information.
-r, --recursive
If this switch is set, then when a directory is passed on the command line, that directory will be processed recursively.
If not specified: Only directories explicitly passed on the command line will be traversed.
-s, --spark
Due to the limitations of building a thin binding, most of the generated specifications will not meet the restrictions of the SPARK language subset of Ada. Typically, an Ada package that does not enable SPARK restrictions will be considered outside of the SPARK subset. However, it is possible to set the build process to treat all packages as SPARK unless otherwise noted. Setting this switch will add with SPARK_Mode => Off to the package definition.
If not specified: Packages will not contain the SPARK_Mode aspect.
-x, --no_file_extension
Because code bases can have different file extensions to mean different things, the default behavior of h2ads is to append the header file extension to the filename to create the Ada unit name. For example, types.h becomes package Types_H. Using this switch will prevent this behavior, such that types.h will become package Types.
Note: In case of naming collisions (e.g. types.h and types.hxx reside in the same directory), the first file encountered during processing (not necessarily the first alphabetically) will get the unit name Types, and the second will be named Types_2 (and so on for other conflicts).
--stdinc
When parsing an include directive of the form #include <filename.h>, the parser may look in its “default” location for filename.h. When using this tool to parse a non-native header structure, this could be problematic. Because of that, the tool, by default, will pass -no-standard-includes to the parser. If you want the standard library location to be used, set this switch.
If not specified: Default include directory will not be searched by the parser.
Handling Language Differences
Note: The language differences described here were resolved with the GNAT compiler in mind. All of the design decisions will work for other compilers, but some of those decisions were required due to GNAT behavior.
File Names
Because C works on files (as opposed to Ada units), filenames are not required to be legal identifiers. In GNAT, default behavior is that the filename must match the package name. The following conventions will be enforced when converting filenames to package names.
All specification files will have the extension ads
All dots (‘.’) in a filename will be converted to underscores (‘_’)
Leading and trailing underscores will be replaced by ‘U’, and then separated from the rest of the name by an underscore (e.g. __file__.h will be converted to uu_file_uu_h.ads)
Filenames where the first non-underscore character is a digit will have c_ added to the front (e.g. 1_file.h becomes c_1_file_h.ads)
All other text will follow rules in the Identifiers section.
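The filename conventions above can be approximated with the following C sketch. It is an illustration only: ads_file_name and its helpers are hypothetical, the result is assumed to be lower case (as in the examples), and "trailing underscores" is assumed to mean the underscores just before the last dot.

```c
#include <ctype.h>
#include <string.h>

/* Append one character if there is room, keeping the buffer terminated. */
static void put(char *buf, size_t size, size_t *n, char c)
{
    if (*n + 1 < size) {
        buf[(*n)++] = c;
        buf[*n] = '\0';
    }
}

/* Lower-case a character and turn '.' into '_'. */
static char map_char(char c)
{
    return c == '.' ? '_' : (char)tolower((unsigned char)c);
}

/* Hypothetical helper (not part of h2ads). Returns a static buffer. */
const char *ads_file_name(const char *header)
{
    static char out[512];
    const char *dot = strrchr(header, '.');
    size_t stem = dot ? (size_t)(dot - header) : strlen(header);
    size_t lead = 0, trail = 0, n = 0, i;

    out[0] = '\0';

    /* Count underscores at the start and end of the stem. */
    while (lead < stem && header[lead] == '_')
        lead++;
    while (trail < stem - lead && header[stem - 1 - trail] == '_')
        trail++;

    /* A digit as the first non-underscore character gets a c_ prefix. */
    if (lead < stem && isdigit((unsigned char)header[lead])) {
        put(out, sizeof out, &n, 'c');
        put(out, sizeof out, &n, '_');
    }

    /* Leading underscores become u's, set off by one underscore. */
    for (i = 0; i < lead; i++)
        put(out, sizeof out, &n, 'u');
    if (lead > 0)
        put(out, sizeof out, &n, '_');

    /* The middle of the stem is copied with '.' -> '_'. */
    for (i = lead; i < stem - trail; i++)
        put(out, sizeof out, &n, map_char(header[i]));

    /* Trailing underscores become u's, set off by one underscore. */
    if (trail > 0)
        put(out, sizeof out, &n, '_');
    for (i = 0; i < trail; i++)
        put(out, sizeof out, &n, 'u');

    /* The extension follows, then the .ads suffix. */
    for (i = stem; header[i] != '\0'; i++)
        put(out, sizeof out, &n, map_char(header[i]));
    for (i = 0; i < 4; i++)
        put(out, sizeof out, &n, ".ads"[i]);

    return out;
}
```

The sketch reproduces the two examples in the list; names that combine several rules (for instance, a leading underscore followed by a digit) may be handled differently by the tool itself.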
Identifiers
Because C is case-sensitive, and allows leading and trailing underscores, there is a lot of opportunity for conflicts when converting to Ada identifiers (e.g. FooBar and _foobar). The following conventions will be enforced when converting C identifiers to Ada identifiers.
Text that is all lower-case will have the first letter capitalized
Text that is all upper-case will remain all upper-case
Camel-case will have individual words separated by underscores, where the first letter in each word will be capitalized (e.g. FooBar will be converted to Foo_Bar).
Trailing and/or leading underscores will be removed
Consecutive underscores will be replaced by a single underscore (e.g. foo__bar becomes Foo_Bar).
When the entire identifier is an Ada reserved word, it will be prefaced by C_ (e.g. select becomes C_Select).
When multiple C names convert to identical Ada names (e.g. foo, Foo, FOO), occurrences after the first will get a numeric suffix.
Duplicated names in a single context (e.g. void func (int Func)) will get The_ added as a prefix (e.g. procedure Func (The_Func : Interfaces.C.Int)).
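The case and underscore conventions in this list can be sketched in C as follows. This is a hedged approximation rather than the tool's code: ada_identifier is a hypothetical function, and it deliberately leaves out the reserved-word, numeric-suffix, and The_ prefix rules, which need knowledge of the surrounding context.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch (not h2ads source). Applies only the case and
   underscore conventions. Returns a static buffer that the next call
   overwrites. */
const char *ada_identifier(const char *c_name)
{
    static char out[256];
    size_t n = 0;
    int has_lower = 0;        /* all-upper-case names are left as they are */
    int word_start = 1;       /* the next letter begins a new word */
    unsigned char prev = 0;   /* previous source character */
    const char *p;

    for (p = c_name; *p != '\0'; p++)
        if (islower((unsigned char)*p))
            has_lower = 1;

    for (p = c_name; *p != '\0' && n + 2 < sizeof out; p++) {
        unsigned char c = (unsigned char)*p;

        if (c == '_') {
            /* Emit nothing yet: one '_' is written before the next word,
               which drops leading, trailing, and repeated underscores. */
            word_start = 1;
        } else {
            /* Camel-case boundary: lower-case followed by upper-case. */
            if (isupper(c) && islower(prev))
                word_start = 1;
            if (word_start && n > 0)
                out[n++] = '_';
            out[n++] = (word_start && has_lower) ? (char)toupper(c) : (char)c;
            word_start = 0;
        }
        prev = c;
    }
    out[n] = '\0';
    return out;
}
```

Applied to names from the earlier example header, this sketch turns pwdb_errstr into Pwdb_Errstr and PdbOk into Pdb_Ok, and leaves PASSWD_UMASK unchanged.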