Optimizations#

While PyJess started as a Cython wrapper of Jess, it also contains several optimizations to make the code write better while maintaining consistency with the original Jess code. Some of these optimizations are described below, as well as the version where they were introduced.

Residue name index#

Added in version 0.5.1.

Given a Molecule, matching a new Template to that Molecule requires generating a set of candidate atoms for each TemplateAtom of the template according to its match_mode.

In the original Jess code, this is done by a full-scan on each Atom of the Molecule for each TemplateAtom at index k:

Atom* A;
Molecule* M;
Template* T;
CandidateSet* S;
int n, i;

n=Molecule_count(M);
for(i=0; m<n; m++)
{
    A=(Atom*) Molecule_atom(M,i);
    if(T->match(T,k,A))
    {
        S->atom[S->count]=A;
        S->count++;
    }
}

However, the most common match modes require the candidate Atom to have a residue_name equal to one of the TemplateAtom residue_names. To exploit this, and avoid a full-scan, we can create an index grouping the atoms of a Molecule by their residue_name, and only iterate on atoms with the right residue_name when building the candidates for a TemplateAtom:

Atom* A;
Molecule* M;
Template* T;
CandidateSet* S;
Atom** atoms;

for(name=T->residueNames(T,k);name!=NULL;name++)
{
    for(atoms=Molecule_atoms(M,resname);*atoms!=NULL;atoms++)
    {
        A=*atoms;
        if(T->match(T,k,A))
        {
            S->atom[S->count]=A;
            S->count++;
        }
    }
}

By doing so we greatly reduce the number of calls to T->match, at the cost of computing an index when a new Molecule is created. This is grealy beneficial for a large number of templates, with an additional \(O(n)\) memory requirement.

k-d tree generation#

Added in version 0.6.0.

Jess uses a k-d tree data structure to partition the molecule into geometric regions and speed-up the retrieval of atoms in each region. To generate a \(k\)-d tree from a list of Atom, the atoms are recursively partitioned on a single dimension around a pivot value, usually the median coordinate for that dimension.

The original Jess code uses qsort (the QuickSort implementation of the C standard library) to first sort the atoms on a single dimension, and then takes the middle point:

KdTreeNode* N;
int* idx;
int n;

qsort(idx,n,sizeof(int),KdTree_compare);

split = n/2;
N->index=idx[split-1];
N->type=type;

type = (type+1)%dim;
N->left = KdTreeNode_create(idx,split,type,u,dim);
N->right = KdTreeNode_create(&idx[split],n-split,type,u,dim);

While implemented very efficiently, QuickSort has an average runtime complexity of \(O(nlog(n))\). Given that the algorithm is only used here to search for the median, and that the sort order is actually irrelevant, we replaced it with the QuickSelect algorithm, which can retrieve the median with an average runtime complexity of \(O(n)\).

Approximate annulus intersection#

Added in version 0.6.0.

During the search for matches in a Scanner, Jess takes into account the flexibility of the Template by modelling the distance constraints to each TemplateAtom using an annulus. Then, it queries the \(k\)-d tree created on the candidate atoms to find candidate atoms that are included in this annulus.

In the original Jess, the traversal of the \(k\)-d tree is done by computing for each internal node of the tree whether the box formed by that node intersect the query annulus, and for each leaf whether they are contained in that annulus.

Computing the intersection between a box and an annulus requires computing Euclidean distances, and therefore products between real numbers, an operation that is among the slowest even on modern CPUs:

Annulus* A;
double minBox[d];
double maxBox[d];

double minSum=0.0;
double maxSum=0.0;
for(i=0; i<d; i++)
{
    double t1 = A->centre[i]-minBox[i];
    double t2 = A->centre[i]-maxBox[i];
    t1 *= t1;
    t2 *= t2;

    if(minBox[i]>A->centre[i] || maxBox[i]<A->centre[i])
        minSum += min(t1,t2);

    maxSum += max(t1,t2);
}

bool intersects = minSum>(A->outer*A->outer) || maxSum<(A->inner*A->inner);

To speed-up the querying, we approximate the query annulus as a bounding box, and instead compute the intersection to the \(k\)-d tree box to the bounding box, which only requires comparisons:

Annulus* A;
double minBox[d];
double maxBox[d];

bool intersects = true;
for(i=0; i<d; i++)
{
    double dmin = A->centre[i]-A->outer;
    double dmax = A->centre[i]+A->outer;
    if( !( dmin <= maxBox[i] && minBox[i] <= dmax ) )
        intersects = false;
}

As this is an approximation, it may wrongly return true on (literal) corner cases, i.e. when the intersection happens in a corner of the bounding box around the annulus. However, given that the \(k\)-d tree later checks that the points are actually included in the annulus for the leaf nodes, using this implementation will not generate false positives.

Reduced backtracking#

Added in version 0.6.0.

The Scanner in Jess iterates on the Template atoms and then aligns the Molecule atoms iteratively. When no more candidate for a TemplateAtom can be found in a Molecule, it backtracks to the previous TemplateAtom, and continues this iteration.

Since every backtracking event triggers the querying of the \(k\)-d tree, we want to minimize backtracking as much as possible. To do so, we compute an iteration order over the Template atoms that minimizes the amount of backtracking required to explore all paths. This can be done quickly by sorting the TemplateAtom using the number of Atom in the corresponding CandidateSet.

Empirically, this approach reduced the amount of \(k\)-d tree queries by a factor of 10. It is nevertheless unclear whether an optimal path can be further identified and pre-computed from the candidate Atom, possibly by filtering distant atoms first.

Warning

Out all the optimizations in this section, this one is the only one to introduce slight behavioral changes in PyJess compared to Jess. Since the order in which candidate atoms are matched have changed, the order in which a PyJess Query yields Hit objects differs to that in which Jess reports a match. This only affects the order though: all matches are still returned, and are 1-to-1 identical! This can be disabled by running with Jess.query(..., reorder=False) to use the original matching order, at the cost of a longer runtime.

Type concretization#

Added in version 0.6.0.

Jess uses generic code to support multiple types of spatial regions using function pointers to implement virtual method tables. While elegant and functional, the code never really makes uses of the genericity, and therefore the algorithm can be specialized to the appropriate Region concrete type (either Join or Annulus) to remove the overhead of calling the function pointers.

In addition, most of the geometric code is inlined, as it is called in hot paths where the compiler can apply auto-vectorization.

Dimension concretization#

Added in version 0.6.0.

The Annulus type is made to be generic over the number of dimensions, but in practice it is only used for 3-dimensions. We hardcoded the dimensions to encourage the compiler to unroll loops over the Annulus dimensions where applicable, and to use constant-size arrays rather than dynamic allocation when applicable.

Scanner memory recycling#

Added in version 0.6.0.

The original Jess code performs a lot of allocation/deallocations in hot paths, as it creates a new Scanner which allocates memory for each Template / Molecule pair to match. Effectively, most of these buffers can actually be reused across Templates for a given Molecule, provided sufficient bookkeeping. Our implementation keeps allocation to a minimum across an entire Query.