7. Python API tutorial

Now that you are familiar with Libadalang’s Core concepts, let’s actually do some practice with the Python API.

Note

Libadalang’s Python API supports Python 3.9 and 3.10.

7.1. Preliminary setup

As seen in the section on core concepts, the first thing to do in order to use Libadalang is to create an analysis context. We’ll create a simple test.py file with the following content:

import libadalang as lal
context = lal.AnalysisContext()

This very simple program lets us check that the Python environment is properly set up to use Libadalang. Running the above file should yield no error and no output.

# Empty program output
$ python test.py
$
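If the script fails instead, a likely cause is that the interpreter cannot find the libadalang module. The following is a minimal sketch of a check that reports this without a traceback. It uses only the standard library; the helper name has_module is ours and is not part of Libadalang.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` is on the import path."""
    return importlib.util.find_spec(name) is not None


if has_module("libadalang"):
    print("libadalang is importable")
else:
    print("libadalang not found: check PYTHONPATH and your environment")
```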

7.2. Browse the tree

OK, so now let’s do something useful with Libadalang. Let’s create a program that reads all the source files given as arguments and then outputs all the object declarations they contain.

Once you have an analysis context at hand, parsing an existing source file into an analysis unit is very simple:

import libadalang as lal
context = lal.AnalysisContext()
unit = context.get_from_file("my_ada_file.adb")

Assuming that parsing went well enough for the parsers to create a tree, libadalang.AnalysisUnit.root() will return the root node associated to unit. You can then use libadalang.AdaNode.finditer() on the root node to iterate on every node in the tree via a generator:

import libadalang as lal
context = lal.AnalysisContext()
unit = context.get_from_file("my_ada_file.adb")

for node in unit.root.finditer(lambda n: True):
    pass
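As a first use of this loop, here is a small sketch that counts how many nodes of each kind a tree contains. It relies only on finditer as described above and on the Python class name of each node; the function name count_node_kinds is hypothetical.

```python
from collections import Counter


def count_node_kinds(root):
    """Count the nodes that finditer yields, grouped by Python class name."""
    # An always-true predicate makes finditer yield every node it visits.
    return Counter(type(n).__name__ for n in root.finditer(lambda n: True))
```

With a real unit, count_node_kinds(unit.root).most_common(10) would list the ten most frequent node kinds in the file.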

If there are fatal parsing errors, or if the file cannot be read, the unit root will be null, but the unit will have diagnostics that you can access via the libadalang.AnalysisUnit.diagnostics() property on the analysis unit. The property will return a list of libadalang.Diagnostic.

if unit.diagnostics:
    for d in unit.diagnostics:
        print("{}: {}".format(d.sloc_range.start, d.message))
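The same information can be wrapped in small helpers so that callers decide what to do with it. This is only a sketch: the helper names are ours, and it assumes that the start of a source location range exposes line and column attributes.

```python
def diagnostic_lines(unit, filename):
    """Return one 'file:line:column: message' string per diagnostic."""
    lines = []
    for d in unit.diagnostics:
        start = d.sloc_range.start
        lines.append("{}:{}:{}: {}".format(
            filename, start.line, start.column, d.message))
    return lines


def has_usable_tree(unit):
    """Return True when parsing produced a root node to work with."""
    return unit.root is not None
```

A caller would typically print diagnostic_lines(unit, filename) first, then skip the unit when has_usable_tree(unit) is false.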

Now what can we do with a node? One of the first things to do is to check its type: is it a subprogram specification? a call expression? an object declaration? The way to do that in Python is by calling the libadalang.AdaNode.is_a() method on a node, giving a type object as a parameter (it’s just a shortcut for isinstance). Here, we want to specifically process the nodes whose type is libadalang.ObjectDecl.

Another useful thing to do with nodes is to relate them to the original source code. The first obvious way to do this is to get the source code excerpts that were parsed to create them: libadalang.AdaNode.text() does this. Another way is to get the source location corresponding to the first/last tokens that belong to this node: libadalang.AdaNode.sloc_range() will do this, returning a libadalang.SlocRange. This provides the expected start/end line/column numbers.

print("Line {}: {}".format(node.sloc_range.start.line, repr(node.text)))
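When the full extent of a node matters, both ends of the range can be rendered together. The sketch below assumes that a SlocRange has start and end attributes, each with line and column; the helper name format_sloc_range is hypothetical.

```python
def format_sloc_range(sloc_range):
    """Render a source location range as 'line:col-line:col'."""
    start, end = sloc_range.start, sloc_range.end
    return "{}:{}-{}:{}".format(
        start.line, start.column, end.line, end.column)
```

For a node, this would be called as format_sloc_range(node.sloc_range).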

7.2.1. Accessing node fields

Another thing to do with nodes is to access their fields. Each kind of node has a specific set of fields: child nodes in the parsing tree. For instance, ObjectDecl nodes have 8 syntactic fields:

  • f_ids: identifiers for the declared objects;

  • f_has_aliased: node to materialize the presence/absence for the aliased keyword;

  • f_has_constant: node to materialize the presence/absence for the constant keyword;

  • f_mode: node to materialize the parameter passing mode (when the object declaration is used as a generic formal);

  • f_type_expr: type for the declared objects;

  • f_default_expr: expression to initialize the declared objects or provide a default value;

  • f_renaming_clause: part that follows the renames keyword when the declaration is a renaming.

  • f_aspects: list of aspects associated to this declaration.

Accessing them is as simple as using the homonym attribute on the node that contains the field. For instance, in order to get the type expression for an object declaration:

obj = get_some_object_decl()
print(obj.f_type_expr)

Note that it is always valid to access syntax fields on non-null nodes. Some fields may themselves contain a null node: for instance, the ObjectDecl.f_default_expr field is null for the V : T; object declaration.
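To illustrate the handling of an optional field, here is a sketch that summarizes an object declaration. It assumes that a null node is represented by None in Python and that f_ids can be iterated to get the declared names; describe_object_decl is a hypothetical helper name.

```python
def describe_object_decl(obj):
    """Summarize an ObjectDecl as 'names : type [:= default]'."""
    names = ", ".join(name.text for name in obj.f_ids)
    summary = "{} : {}".format(names, obj.f_type_expr.text)

    # f_default_expr is null (assumed to be None here) when the
    # declaration has no initialization expression.
    default = obj.f_default_expr
    if default is not None:
        summary += " := {}".format(default.text)
    return summary
```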

7.2.2. Final program

Put all these bits in the right order, and you should get something similar to the following program:

import sys
import libadalang as lal

context = lal.AnalysisContext()

for filename in sys.argv[1:]:
    unit = context.get_from_file(filename)
    print("== {} ==".format(filename))
    for d in unit.diagnostics:
        print("{}: {}".format(filename, d))

    if unit.root:
        for node in unit.root.finditer(lambda n: n.is_a(lal.ObjectDecl)):
            print("Line {}: {}".format(
                node.sloc_range.start.line, repr(node.text)))

If you run this program on the Ada example program, you should get:

== main.adb ==
Line 33: 'Context : constant LAL.Analysis_Context := LAL.Create_Context;'
Line 38: 'Filename : constant String := Ada.Command_Line.Argument (I);'
Line 39: 'Unit     : constant LAL.Analysis_Unit :=\n            Context.Get_From_File (Filename);'

7.2.3. Note on API discoverability

The Ada syntax is rich; as a consequence, there are many node kinds, and each has many syntax fields. Short of reading the language grammar, the best way to discover the nodes that parsing creates is to let Libadalang parse an example and print the resulting tree. This is easily done with the dump method:

# Test script

import libadalang as lal
import sys

ctx = lal.AnalysisContext()
u = ctx.get_from_file(sys.argv[1])
for d in u.diagnostics:
    print(u.format_gnu_diagnostic(d))
u.root.dump()

--  Source to parse

package Pkg is
end Pkg;

Running the above program on the pkg.ads source file yields:

CompilationUnit pkg.ads:1:1-2:9
|f_prelude:
|  AdaNodeList pkg.ads:1:1-1:1
|f_body:
|  LibraryItem pkg.ads:1:1-2:9
|  |f_has_private:
|  |  PrivateAbsent pkg.ads:1:1-1:1
|  |f_item:
|  |  PackageDecl ["Pkg"] pkg.ads:1:1-2:9
|  |  |f_package_name:
|  |  |  DefiningName "Pkg" pkg.ads:1:9-1:12
|  |  |  |f_name:
|  |  |  |  Id "Pkg" pkg.ads:1:9-1:12: Pkg
|  |  |f_aspects: None
|  |  |f_public_part:
|  |  |  PublicPart pkg.ads:1:15-2:1
|  |  |  |f_decls:
|  |  |  |  AdaNodeList pkg.ads:1:15-1:15
|  |  |f_private_part: None
|  |  |f_end_name:
|  |  |  EndName pkg.ads:2:5-2:8
|  |  |  |f_name:
|  |  |  |  Id "Pkg" pkg.ads:2:5-2:8: Pkg
|f_pragmas:
|  PragmaNodeList pkg.ads:2:9-2:9

We can see here that the parse tree for pkg.ads is made of:

  • a CompilationUnit node as the root of the tree; that node has children in 3 syntax fields:

      • its f_prelude field is an AdaNodeList node that is an empty list (i.e. it has no children itself);

      • its f_body field is a LibraryItem node, which itself has other syntax fields (f_has_private and f_item);

      • its f_pragmas field is a PragmaNodeList that is an empty list;

  • the PackageDecl node has a null f_aspects syntax field.
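A condensed view of the same structure can also be built by hand. The following sketch assumes that every node exposes a children property listing its direct children, with None standing for null fields; check this against the API reference before relying on it. The function name outline is ours.

```python
def outline(node, depth=0):
    """Return indented class names for `node` and all its descendants."""
    lines = ["  " * depth + type(node).__name__]
    for child in node.children:
        # Null syntax fields (such as f_aspects above) are skipped.
        if child is not None:
            lines.extend(outline(child, depth + 1))
    return lines
```

Printing "\n".join(outline(u.root)) then gives one line per node, without the field names and source locations that dump includes.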

7.3. Follow references

While the previous section only showed Libadalang’s syntactic capabilities, we can go further with semantic analysis. The most used feature in this domain is the computation of cross references (“xrefs”): the ability to reach the definition a particular identifier references.

7.3.1. Resolving files

As mentioned in the Core concepts section, semantic analysis requires knowing how to fetch compilation units: which source file contains a given unit, and where is that file? Teaching Libadalang how to do this is done through unit providers.

The default unit provider, i.e. the one that is used if you don’t pass anything specific to libadalang.AnalysisContext, assumes that all compilation units follow the GNAT naming convention and that all source files are in the current directory.

If the organization of your project is not so simple, you have two options currently in Python:

Be aware, though, that because there is no proper Python API to process GNAT project files, the corresponding facilities in Python are limited for the moment. If the above options are not sufficient for you, we recommend using the Ada API.

In our program, we’ll create a simple project unit provider if a project file is provided. If not, we’ll use the default settings.

Finally, let’s update our code to use Libadalang’s name resolution capabilities: when we find an object declaration, we’ll print the entity representing the type of the object declaration.

import libadalang as lal
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--project', '-P', type=str)
parser.add_argument('files', help='Files to analyze', type=str, nargs='+')
args = parser.parse_args()

provider = None
if args.project:
    project = lal.GPRProject(args.project)
    provider = project.create_unit_provider()

context = lal.AnalysisContext(unit_provider=provider)

for filename in args.files:
    unit = context.get_from_file(filename)
    print("== {} ==".format(filename))
    for d in unit.diagnostics:
        print("{}: {}".format(filename, d))

    if unit.root:
        for node in unit.root.finditer(lambda n: n.is_a(lal.ObjectDecl)):
            print("Line {}: {}".format(
                node.sloc_range.start.line, repr(node.text)
            ))
            type_decl = node.f_type_expr.p_designated_type_decl
            if type_decl:
                print("   type is: {}".format(repr(type_decl.text)))

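The most interesting line is the assignment to type_decl: it accesses the f_type_expr syntax field of the object declaration, then evaluates the p_designated_type_decl property on it to get the declaration of the type that this type expression designates.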

This time, running this updated program on the equivalent Ada version will yield something like:

== main.adb ==
Line 33: 'Context : constant LAL.Analysis_Context := LAL.Create_Context;'
   type is: 'type Analysis_Context is tagged private;'
Line 38: 'Filename : constant String := Ada.Command_Line.Argument (I);'
   type is: 'type String is array (Positive range <>) of Character;'
Line 39: 'Unit     : constant LAL.Analysis_Unit :=\n            Context.Get_From_File (Filename);'
   type is: 'type Analysis_Unit is tagged private;'

We have seen here libadalang.TypeExpr.p_designated_type_decl(), which resolves references to types, but Libadalang offers many more properties to deal with name resolution in Ada:

  • libadalang.AdaNode.p_xref() will try to resolve from any node to the corresponding declaration, much like an IDE would do when you Control-click on an identifier, for instance.

  • All the p_body_part* and p_decl_part* properties will let you navigate between the specification and body that correspond to each other for various nodes: subprograms, packages, etc.

  • libadalang.AdaNode.p_expression_type() returns the type of an expression.

  • libadalang.AdaNode.p_generic_instantiations() returns the list of package/subprogram generic instantiations that led to the creation of this node.

You can find these and all the other properties documented in your favorite language’s API reference.
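Name resolution does not always succeed, for instance on incomplete or incorrect code. In the Python API such failures surface as exceptions; to our knowledge the relevant class is libadalang.PropertyError, but treat that name as an assumption and check the API reference. A minimal sketch of a wrapper that turns such failures into None (resolve_or_none is a hypothetical helper):

```python
def resolve_or_none(query, errors):
    """Run a zero-argument query; return None if it raises one of `errors`.

    `errors` is an exception class or a tuple of exception classes.
    """
    try:
        return query()
    except errors as exc:
        print("resolution failed: {}".format(exc))
        return None
```

With Libadalang this might be called as resolve_or_none(lambda: node.f_type_expr.p_designated_type_decl, lal.PropertyError), so that a single failing declaration does not abort the whole run.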

7.3.2. Find all references

Source processing tools often need to look for all references to an entity. For instance: all references to an object declaration, all types that derive from a type T, all calls to a subprogram P, etc.

Libadalang provides several properties to answer such queries: p_find_all_references, p_find_all_derived_types, p_find_all_calls, etc. All these properties have in common that they take as argument the list of analysis units in which to look for the references. For instance, in order to look for all the references to the v object declaration in units foo.adb, bar.adb and foobar.adb, one may write:

import libadalang as lal

context: lal.AnalysisContext = ...
v: lal.ObjectDecl = ...

v_first_id = v.f_ids[0]
units = [context.get_from_file("foo.adb"),
        context.get_from_file("bar.adb"),
        context.get_from_file("foobar.adb")]

print(f"Looking for references to {v_first_id}:")
for r in v_first_id.p_find_all_references(units):
    print(f"{r.kind}: {r.ref}")

The first step is to get the DefiningName node on which to perform the query: in the A, B : Integer object declaration, for instance, this makes it possible to query the references to A specifically. The second step is to select the set of units in which to look for references. The last step is to call the p_find_all_references property and process its results.

This property returns an array of RefResult values, each of which contains two fields: ref (a BaseId node), which is the reference to the defining name, and kind (a RefResultKind enumeration value), which tells how reliable the result is: whether Libadalang computed it successfully, had to do error recovery, or failed completely (for instance because the analyzed source code is incorrect).
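Since kind tells how much each result can be trusted, it is often convenient to sort results by kind before processing them. The sketch below uses only the ref and kind fields described above; the function name group_by_kind is ours, and kinds are converted to strings so that the grouping does not depend on how the enumeration is represented.

```python
from collections import defaultdict


def group_by_kind(results):
    """Map each result kind (as a string) to the list of its references."""
    grouped = defaultdict(list)
    for r in results:
        grouped[str(r.kind)].append(r.ref)
    return dict(grouped)
```

A tool could then process the reliable references and merely report the others.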

7.3.3. List of sources in a project

Even though there is no dedicated Python API to analyze GNAT project files, Libadalang provides a convenience function to compute the list of source files in a project: libadalang.GPRProject.source_files. This is especially useful to compute the analysis units to pass to the p_find_all_* properties (described in the previous section).

This function takes the information necessary to load a project tree (name of the project file, scenario variables, etc.), a mode to determine the scope of the sources to consider (root project only, the whole project tree, the runtime, …) and just returns the list of source files:

import libadalang as lal

project = lal.GPRProject(...)
context: lal.AnalysisContext = lal.AnalysisContext(
    unit_provider=project.create_unit_provider(...),
    ...
)
id: lal.DefiningName = ...

source_files = project.source_files()
units = [context.get_from_file(f) for f in source_files]

print(f"Looking for references to {id}:")
for r in id.p_find_all_references(units):
    print(f"{r.kind}: {r.ref}")
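In a large project, some files may fail to parse. The following sketch separates the units that have a tree from those that do not, so that only the former are passed to the p_find_all_* properties. It uses get_from_file, root and diagnostics as introduced earlier and assumes that a null root is None; load_units is a hypothetical helper name.

```python
def load_units(context, filenames):
    """Parse `filenames`; return (usable units, [(filename, diagnostics)])."""
    units, failed = [], []
    for f in filenames:
        unit = context.get_from_file(f)
        if unit.root is None:
            # No tree at all: keep the diagnostics for later reporting.
            failed.append((f, list(unit.diagnostics)))
        else:
            units.append(unit)
    return units, failed
```

Units with a tree are kept even if they have diagnostics, since the parser may have recovered from the errors.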