The element module

This module provides the base class for πtree Abstract Syntax Trees (ASTs), together with serialization and deserialization. The code generated by πtree defines the specific ASTs for your project as classes derived from the base class defined here.

Generally you shouldn’t check this file in to your project. Instead, at around the same time you run πtree to generate your t_def.py or similar file, also install this module by using the pitree --install-element command.

class ndcode.piyacc.element.Element(text=None, children=None)

Class which holds a single node of a πtree AST.

children = None

Contains the direct child nodes, also of class Element or a derived class of Element.

  • Often the number of children and their meanings will be known in advance. For example, a BinaryExpression class might have left and right children, accessed via children[0] and children[1].

  • Sometimes the number of children can be arbitrary. For example, a Function class might contain an arbitrary number of statements as direct children.

It is expected that the types of the children will be known statically. In the BinaryExpression example, the children would be of class Expression or a derived class of Expression. In the Function example, the children would be of class Statement or a derived class of Statement. When the children implicitly form a tuple (a fixed number of positions, as in BinaryExpression), each position can be typed independently of the others. When they implicitly form a list (an arbitrary number, as in Function), they should ideally all have the same type.
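
The two shapes of children described above could be sketched as follows. This is only an illustration: StandInElement is a hypothetical stand-in that mimics the text and children attributes, not the real Element class, and the left and right properties are assumed conveniences rather than something πtree is known to generate.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        self.text = [''] * (len(self.children) + 1) if text is None else text


class Expression(StandInElement):
    pass


class Statement(StandInElement):
    pass


class BinaryExpression(Expression):
    """Tuple-like children: exactly two, each with a fixed meaning."""

    @property
    def left(self):  # assumed convenience accessor for children[0]
        return self.children[0]

    @property
    def right(self):  # assumed convenience accessor for children[1]
        return self.children[1]


class Function(StandInElement):
    """List-like children: any number, all of class Statement."""


expr = BinaryExpression(children=[Expression(), Expression()])
func = Function(children=[Statement(), Statement(), Statement()])
```

Here expr.left and expr.right are simply names for fixed positions, whereas func.children is iterated as a uniform sequence of statements.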

If no children argument is passed to the constructor, it will default to None and then be internally translated to a freshly constructed empty list []. This ensures that it is mutable and not shared with other instances.
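
The reason for defaulting to None instead of [] is ordinary Python behaviour: a default value is evaluated once, when the function is defined. The sketch below contrasts the two patterns using two hypothetical classes (SharedDefault and FreshDefault); neither is part of this module.

```python
class SharedDefault:
    def __init__(self, children=[]):  # one list object reused by every call
        self.children = children


class FreshDefault:
    def __init__(self, children=None):
        # A new list is built on each call, as described above.
        self.children = [] if children is None else children


a, b = SharedDefault(), SharedDefault()
a.children.append('x')
shared_leak = b.children  # b sees the item appended through a

c, d = FreshDefault(), FreshDefault()
c.children.append('x')
fresh_ok = d.children  # d is unaffected
```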

deserialize(element, ref_list)

Internal routine that supports deserialization. It should not be called directly; use the module-level element.deserialize() function instead. It will be overridden in a derived class if the class has fields, to set the fields from the attributes of the xml.etree.ElementTree node passed in as the element argument.

serialize(ref_list)

Internal routine that supports serialization. It should not be called directly; use the module-level element.serialize() function instead. This method converts self into an xml.etree.ElementTree node, populated with the recursive conversion of its children. It will be overridden in a derived class if the class has fields, to populate the attributes of the returned xml.etree.ElementTree node.

text = None

Contains strings of text to interpolate between the child nodes. Must have length len(self.children) + 1. So, for example, if there are two child nodes the contents of the node can be conceptualized as text[0] children[0] text[1] children[1] text[2].

  • For nodes with children, the text is often not very significant and may be set to all empty strings. For example, a BinaryExpression node could have self.text == ['', '', ''] and only children[0] and children[1] significant. On the other hand, it could store the operator as a string, such as '+' or '-', in the text[1] field, if needed. This would print in a natural way. A combination of approaches is also possible, so that text isn’t significant, but would be filled in before pretty-printing the tree.

  • For nodes with no children, often the text[0] value holds the content of the node. For example, an Identifier node would usually store its identifier string in text[0]. Again, this would print in a natural way.

If no text argument is passed to the constructor, it will default to None and then be internally translated to a freshly constructed list of the correct number of empty strings. This ensures that it is the right length and also that it is mutable and not shared with other instances.
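
The length rule and the "fill in later" approach described above could look like this in practice. StandInElement is again a hypothetical stand-in for Element, and the default-text expression is an assumption about how the constructor might build the list.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        # Assumed default: one empty string per gap around the children.
        self.text = [''] * (len(self.children) + 1) if text is None else text


class Identifier(StandInElement):
    pass


class BinaryExpression(StandInElement):
    pass


left = Identifier(text=['a'])    # no children: content lives in text[0]
right = Identifier(text=['b'])
expr = BinaryExpression(children=[left, right])

default_text = list(expr.text)   # snapshot of the default before editing
expr.text[1] = ' + '             # fill in the operator before printing
```

After the assignment the node still satisfies len(expr.text) == len(expr.children) + 1, which is the invariant the rest of the module relies on.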

visited = None

During serialization, this indicates nodes that have been encountered before, so that DAGs or circular constructs can be serialized and reconstructed later. It contains either None or (element, ref, seen).

Note that a node must not be used more than once as a direct child (direct children are the nodes in the self.children list). That is, a node may have at most one direct parent. Any DAGs or circular constructs must go through fields. Fields are added by creating a derived class and overriding the serialization methods appropriately (or by using the πtree generator, which does this for you).
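
The single-parent rule can be checked mechanically. The helper below, find_shared_children, is a hypothetical utility and not part of this module; StandInElement and the Goto class with its target field are likewise assumptions made for illustration.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        self.text = [''] * (len(self.children) + 1) if text is None else text


class Goto(StandInElement):
    """Hypothetical node whose 'target' field may point anywhere."""

    def __init__(self, text=None, children=None, target=None):
        super().__init__(text, children)
        self.target = target  # a field, not a direct child


def find_shared_children(root):
    """Return nodes reached more than once through children lists."""
    seen = {id(root)}
    shared = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if id(child) in seen:
                shared.append(child)  # second parent, or a cycle
            else:
                seen.add(id(child))
                stack.append(child)
    return shared


label = StandInElement(text=['loop'])
goto = Goto(target=label)                           # reference via a field
ok_tree = StandInElement(children=[label, goto])
bad_tree = StandInElement(children=[label, label])  # label has two slots
```

In ok_tree the label is referenced twice, but only once as a direct child, so the rule holds; in bad_tree the same node occupies two child positions.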

ndcode.piyacc.element.concatenate(children, factory=<class 'ndcode.piyacc.element.Element'>, *args, **kwargs)

Convenience function to concatenate an arbitrary number of nodes into one.

The nodes are concatenated into a new, initially empty node constructed by the factory function that you specify. Only the text and children are taken from the nodes being concatenated; the types of the nodes and any data in fields are ignored.

The factory argument is usually a constructor for an Element-derived object, but it can be any function. Any further arguments given after the factory argument are passed on to the factory call.

For example, suppose node a has two children and node b has one. Then the call concatenate([a, b]) is equivalent to:

Element(
  children = [a.children[0], a.children[1], b.children[0]],
  text = [a.text[0], a.text[1], a.text[2] + b.text[0], b.text[1]]
)
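
The joining rule implied by this example (the last text of one node merges with the first text of the next) could be sketched as below. This is not the module's implementation: concatenate_sketch and StandInElement are hypothetical, and the way extra arguments are forwarded to the factory is an assumption.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        self.text = [''] * (len(self.children) + 1) if text is None else text


def concatenate_sketch(nodes, factory=StandInElement, *args, **kwargs):
    text = ['']
    children = []
    for node in nodes:
        text[-1] += node.text[0]      # merge the boundary strings
        text.extend(node.text[1:])    # remaining text follows each child
        children.extend(node.children)
    # Assumed forwarding of extra arguments to the factory.
    return factory(*args, text=text, children=children, **kwargs)


x, y, z = StandInElement(), StandInElement(), StandInElement()
a = StandInElement(text=['a0', 'a1', 'a2'], children=[x, y])
b = StandInElement(text=['b0', 'b1'], children=[z])
joined = concatenate_sketch([a, b])
```

With these inputs joined.text is ['a0', 'a1', 'a2b0', 'b1'], matching the pattern in the example above.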

ndcode.piyacc.element.deserialize(fin, factory=<class 'ndcode.piyacc.element.Element'>, encoding='unicode')

Front end to the deserializer. Essentially, it reverses the process of element.serialize(), and the comments on that function apply here as well.

The tricky part with deserializing is knowing what kind of object to construct. For instance if the XML looks like this,

<root>
  <AnObject ref="0">some text</AnObject>
</root>

we want to find a constructor for a derived class of Element called AnObject. This is the role of the factory function that you pass in to deserialize(). It takes a tag name such as 'AnObject', followed by the arguments to be passed into AnObject’s constructor. Typically the factory function will be written like this:

tag_to_class = {
  'AnObject': AnObject
}
def factory(tag, *args, **kwargs):
  return tag_to_class[tag](*args, **kwargs)

It is also possible to have a more complex factory function, for instance if you have defined an AST to use as a mini-language inside another AST, and you want to defer object creation to the mini-language’s factory for those objects.
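
One way such a deferring factory might be written is sketched below. All names are hypothetical (StandInElement, MiniNode, mini_factory and the two lookup tables); the only thing taken from the text above is the factory signature of a tag name followed by constructor arguments.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        self.text = [''] * (len(self.children) + 1) if text is None else text


class AnObject(StandInElement):
    pass


class MiniNode(StandInElement):
    """Hypothetical class belonging to the embedded mini-language."""


mini_tag_to_class = {'MiniNode': MiniNode}


def mini_factory(tag, *args, **kwargs):
    return mini_tag_to_class[tag](*args, **kwargs)


tag_to_class = {'AnObject': AnObject}


def factory(tag, *args, **kwargs):
    cls = tag_to_class.get(tag)
    if cls is None:
        # Not one of ours: defer to the mini-language's factory.
        return mini_factory(tag, *args, **kwargs)
    return cls(*args, **kwargs)


outer = factory('AnObject')
inner = factory('MiniNode', text=['x'])
```

A tag unknown to both tables still raises KeyError, which keeps typos in the XML from passing silently.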

ndcode.piyacc.element.deserialize_ref(value, ref_list)

Internal routine to deserialize a reference and return an object of type Element or a derived class of Element. It is meant to be called from the deserialize() method of a derived class of Element, for deserializing fields that are references to an AST node or AST subtree.

The reference has already been processed as an integer and hence the value is the position in ref_list where the referenced object is to be found, or -1. Returns that object from ref_list, or None for -1.
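
The lookup described above amounts to very little code. The function below, deserialize_ref_sketch, is a hypothetical restatement of that description and is not the module's own code.

```python
def deserialize_ref_sketch(value, ref_list):
    """Map an integer reference to an object; -1 stands for None."""
    return None if value == -1 else ref_list[value]


# Plain strings stand in for AST nodes here.
ref_list = ['node0', 'node1']
first = deserialize_ref_sketch(0, ref_list)
second = deserialize_ref_sketch(1, ref_list)
missing = deserialize_ref_sketch(-1, ref_list)
```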

ndcode.piyacc.element.serialize(value, fout, encoding='unicode')

Front end to the serializer. Pass in the AST to be serialized and an output stream to write the serialized output to. The encoding should be passed as either 'unicode' for encoding to standard output or 'utf-8' for encoding to a file descriptor. This is somewhat hacky; refer to the code of xml.etree.ElementTree to see what it really does with this value.

The output stream will look something like:

<root>
  <AnObject ref="0"><ANestedObject /></AnObject>
  <AnotherObject ref="1" />
</root>

The object with the attribute ref="0" corresponds to the value parameter passed in here. Any direct children, grandchildren etc will be serialized inside it. Objects which are accessed through fields from those objects will be serialized separately and be given higher reference numbers. Note that those secondary objects can also have direct children, which are serialized inside those secondary objects, and so on. The <root>...</root> element then ends up being a collection of all objects with no direct parent.
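
The effect of the two encoding values can be tried in isolation with xml.etree.ElementTree. The sketch below does not call element.serialize(); it builds the sample tree by hand and assumes the serializer ends up handing the encoding to ElementTree.write() in a similar way.

```python
import io
import xml.etree.ElementTree as ET

# Rebuild the sample output shown above by hand.
root = ET.Element('root')
an_object = ET.SubElement(root, 'AnObject', {'ref': '0'})
ET.SubElement(an_object, 'ANestedObject')
ET.SubElement(root, 'AnotherObject', {'ref': '1'})
tree = ET.ElementTree(root)

# encoding='unicode' produces str, so the stream must be a text stream.
text_out = io.StringIO()
tree.write(text_out, encoding='unicode')

# encoding='utf-8' produces bytes, so the stream must be a binary stream.
bytes_out = io.BytesIO()
tree.write(bytes_out, encoding='utf-8')
```

This is why 'unicode' suits a text stream such as sys.stdout, while 'utf-8' suits a file opened in binary mode.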

ndcode.piyacc.element.serialize_ref(value, ref_list)

Internal routine to serialize a reference and return a value that can be placed in the attribute dictionary of an xml.etree.ElementTree node. It is meant to be called from the serialize() method of a derived class of Element, for serializing fields that are references to an AST node or AST subtree.

This is a special case, since other kinds of values (int, str etc) can be serialized by the json module, whereas references must be recursively converted. If the reference has already been serialized its value is returned directly, otherwise it will be added to the list ref_list and serialized. The value returned is its position in ref_list (it behaves like serializing an integer from that point on). The None value is serialized as -1.
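
The numbering scheme described above could be sketched as follows. serialize_ref_sketch is hypothetical and deliberately simplified: it finds earlier entries by scanning ref_list rather than by consulting the visited attribute, and it omits the actual XML serialization of a newly added object.

```python
def serialize_ref_sketch(value, ref_list):
    """Return the position of value in ref_list, adding it if new."""
    if value is None:
        return -1
    for position, node in enumerate(ref_list):
        if node is value:        # identity, not equality
            return position
    ref_list.append(value)       # the real routine would serialize it here
    return len(ref_list) - 1


# Plain objects stand in for AST nodes.
a, b = object(), object()
ref_list = []
refs = [
    serialize_ref_sketch(a, ref_list),
    serialize_ref_sketch(b, ref_list),
    serialize_ref_sketch(a, ref_list),     # already present: same number
    serialize_ref_sketch(None, ref_list),  # None is encoded as -1
]
```

Because an object referenced twice receives the same number both times, shared and circular references can be rebuilt on deserialization.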

ndcode.piyacc.element.to_text(root)

Convenience function to recursively extract all the text from a subtree.

The result is similar to serializing the subtree (itself, its direct children, grandchildren etc) to XML, and then throwing away all XML tags and attributes.

For example, if there are two child nodes, the value returned will be:

text[0] + to_text(children[0]) + text[1] + to_text(children[1]) + text[2]
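
Generalized to any number of children, the same interleaving could be written as the recursive sketch below. to_text_sketch and StandInElement are hypothetical stand-ins, not the module's own code.

```python
class StandInElement:
    """Hypothetical stand-in for Element: holds only text and children."""

    def __init__(self, text=None, children=None):
        self.children = [] if children is None else children
        self.text = [''] * (len(self.children) + 1) if text is None else text


def to_text_sketch(root):
    """Interleave root.text with the recursively extracted child text."""
    parts = [root.text[0]]
    for i, child in enumerate(root.children):
        parts.append(to_text_sketch(child))
        parts.append(root.text[i + 1])
    return ''.join(parts)


a = StandInElement(text=['a'])
b = StandInElement(text=['b'])
plus = StandInElement(text=['', ' + ', ''], children=[a, b])
paren = StandInElement(text=['(', ')'], children=[plus])
flat = to_text_sketch(paren)
```

For this tree the result is '(a + b)': the node types play no part, only the text lists and the order of the children.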