
## 9.6 Set Functions

A set function is used in set expressions (see Section 8.5 "Expression Terms") and yields an element set or a value set. To illustrate the implementation of set functions, we'll define one to calculate the symmetric difference of two sets.

The symmetric difference of two sets A and B is the set of elements contained in either A or B, but not both. For regular sets, we could express the symmetric difference in terms of the existing set operations as (A+B)-(A*B) (i.e., the union of the sets without the intersection of the sets, see Section 8.5.3 "Set Expressions"). For multisets, however, we must take the cardinality of elements into account: the cardinality of an element in the symmetric difference is the absolute difference of the cardinality of the element in sets A and B. For example, if the cardinality of element e is five in set A and three in set B, the cardinality of element e in the symmetric difference is two. The formula (A+B)-(A*B) would yield cardinality (5+3)-3=5 for element e and therefore cannot be used for multisets.
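The cardinality rule can be checked with a small self-contained sketch. It uses only JDK classes and models a multiset as a map from element to cardinality. The class `MultisetSymmDiffDemo` and its methods are illustrative names, not part of SDMetrics. The naive formula assumes an additive union and a minimum-count intersection, as implied by the arithmetic above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: a multiset is modeled as element -> cardinality.
public class MultisetSymmDiffDemo {

    // Symmetric difference: cardinality is |count in A - count in B|.
    static Map<String, Integer> symmDiff(Map<String, Integer> a,
            Map<String, Integer> b) {
        Map<String, Integer> result = new HashMap<>();
        for (String key : allKeys(a, b)) {
            int count = Math.abs(a.getOrDefault(key, 0) - b.getOrDefault(key, 0));
            if (count > 0) {
                result.put(key, count);
            }
        }
        return result;
    }

    // Naive formula (A+B)-(A*B), assuming additive union and
    // minimum-count intersection.
    static Map<String, Integer> naiveFormula(Map<String, Integer> a,
            Map<String, Integer> b) {
        Map<String, Integer> result = new HashMap<>();
        for (String key : allKeys(a, b)) {
            int countA = a.getOrDefault(key, 0);
            int countB = b.getOrDefault(key, 0);
            int count = (countA + countB) - Math.min(countA, countB);
            if (count > 0) {
                result.put(key, count);
            }
        }
        return result;
    }

    // All elements that occur in at least one of the two multisets.
    private static Set<String> allKeys(Map<String, Integer> a,
            Map<String, Integer> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = Map.of("e", 5);
        Map<String, Integer> b = Map.of("e", 3, "f", 1);
        System.out.println(symmDiff(a, b).get("e"));     // prints 2
        System.out.println(naiveFormula(a, b).get("e")); // prints 5
    }
}
```

For element e, the sketch reproduces the numbers from the example: the absolute difference yields cardinality two, while the naive formula yields five.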

The following implementation handles both regular and multisets.

```
package com.acme;

import java.util.Collection;
import java.util.Iterator;

import com.sdmetrics.math.ExpressionNode;
import com.sdmetrics.metrics.MetricTools;
import com.sdmetrics.metrics.SDMetricsException;
import com.sdmetrics.metrics.SetOperation;
import com.sdmetrics.metrics.Variables;
import com.sdmetrics.model.ModelElement;

01 public class SetOperationSymmDiff extends SetOperation {

       @Override
02     public Collection<?> calculateValue(ModelElement element,
               ExpressionNode node, Variables vars) throws SDMetricsException {

03         Collection<?> left = evalSetExpression(element, node.getOperand(0),
                   vars);
04         Collection<?> right = evalSetExpression(element, node.getOperand(1),
                   vars);

05         boolean isMultiSet = MetricTools.isMultiSet(right)
                   || MetricTools.isMultiSet(left);
06         Collection<?> result = MetricTools.createHashSet(isMultiSet);

           // process elements from the first set
07         Iterator<?> it = MetricTools.getFlatIterator(left);
08         while (it.hasNext()) {
09             processElement(it.next(), result, left, right);
           }

           // process additional elements from the second set
10         it = MetricTools.getFlatIterator(right);
11         while (it.hasNext()) {
12             Object o = it.next();
13             if (!left.contains(o)) {
14                 processElement(o, result, left, right);
               }
           }
15         return result;
       }

       @SuppressWarnings({ "unchecked", "rawtypes" })
16     private void processElement(Object o, Collection col,
               Collection<?> left, Collection<?> right) {
17         int leftCount = MetricTools.elementCount(left, o);
18         int rightCount = MetricTools.elementCount(right, o);
19         int count = Math.abs(leftCount - rightCount);
20         for (int i = 0; i < count; i++)
21             col.add(o);
       }
   }
```
Once more, we discuss the salient features of this implementation, line by line.
• 01: The class must have public visibility, a default (no-argument) constructor, and extend the abstract class com.sdmetrics.metrics.SetOperation.
• 02: Like Boolean and scalar functions (cf. Section 9.4.2 "Implementation of the Boolean Function"), the base class defines an abstract method calculateValue with identical input parameters. The return value of the method is of type java.util.Collection and holds the result set of the function. Regular sets must be represented by instances of java.util.HashSet, multisets by instances of com.sdmetrics.math.HashMultiSet. An "element set" (see Section 8.2 "Definition of Sets") contains instances of ModelElement. A "value set" contains instances of java.lang.Number (Integer or Float), or strings.
• 03-04: Method evalSetExpression evaluates set expressions. We use it to calculate the two input sets passed as arguments into the function.
• 05: Class MetricTools contains a number of static methods that are useful when dealing with sets that may be either regular or multisets. Method isMultiSet checks if a set is a multiset or a regular set.
• 06: Method MetricTools.createHashSet(boolean) creates a new, empty regular set or multiset. The Boolean parameter determines the type of set created. In our example, we create a multiset if at least one of the input sets is a multiset.
• 07: Method MetricTools.getFlatIterator(Collection) obtains an iterator over the elements in a set that returns each element in the set exactly once, even if the set is a multiset and the cardinality of the element is greater than one.
• 17: Method MetricTools.elementCount(Collection, Object) determines the cardinality of an element in a set. For regular sets, the method returns 1 if the element is contained in the set, else 0.
• 20-21: The element is added to the result set, multiple times if necessary to get the cardinality of the element right. This implementation does not check the types of the elements added to the result set. That way, the types of the elements in the input sets determine whether the result set is an element set or a value set.
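For regular sets, the two-pass structure of the implementation can be checked against the formula (A+B)-(A*B) with plain JDK collections. The following is a minimal sketch under that assumption: the class `RegularSetSymmDiffDemo` and its helper `elementCount` are hypothetical stand-ins that mimic the behavior described for regular sets (1 if contained, else 0), and do not call the SDMetrics API.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: mirrors the two-pass algorithm on java.util.Set.
public class RegularSetSymmDiffDemo {

    // Stand-in for the element count on a regular set: 1 if contained, else 0.
    static int elementCount(Set<?> set, Object o) {
        return set.contains(o) ? 1 : 0;
    }

    // Pass 1 visits every element of left; pass 2 visits only the
    // elements of right that were not already seen in left.
    static Set<Object> symmDiff(Set<?> left, Set<?> right) {
        Set<Object> result = new HashSet<>();
        for (Object o : left) {
            addByCount(o, result, left, right);
        }
        for (Object o : right) {
            if (!left.contains(o)) {
                addByCount(o, result, left, right);
            }
        }
        return result;
    }

    // Adds o as often as the absolute count difference demands
    // (0 or 1 times for regular sets).
    static void addByCount(Object o, Set<Object> result,
            Set<?> left, Set<?> right) {
        int count = Math.abs(elementCount(left, o) - elementCount(right, o));
        for (int i = 0; i < count; i++) {
            result.add(o);
        }
    }

    // Reference result: union of the sets without their intersection.
    static Set<Object> unionMinusIntersection(Set<?> a, Set<?> b) {
        Set<Object> union = new HashSet<>(a);
        union.addAll(b);
        Set<Object> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        union.removeAll(intersection);
        return union;
    }

    public static void main(String[] args) {
        Set<Integer> a = Set.of(1, 2, 3);
        Set<Integer> b = Set.of(3, 4);
        // prints "true": both approaches agree on regular sets
        System.out.println(symmDiff(a, b).equals(unionMinusIntersection(a, b)));
    }
}
```

The check in the second pass prevents shared elements from being processed twice, which matters once the result may be a multiset.
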
To use the set function, we register it with the metrics engine as follows:
```
<setoperationdefinition name="symmdiff"
    class="com.acme.SetOperationSymmDiff" />
```
Again, we deploy the compiled class file to the "bin" folder of our SDMetrics installation (path com/acme/SetOperationSymmDiff.class). After that, we can write set expressions using the new function. For example:
```
<metric name="FooBar" domain="package">
  <compoundmetric term="size(symmdiff(FooBarClassesSet, FooBazClassesSet))" />
</metric>
```