What is XML?
XML stands for Extensible Markup Language. It is a markup language: a set of rules for encoding documents so that they are both human-readable and machine-readable. This makes it a good fit when data has to be created or checked by humans but interpreted by machines.
XML consists of markup and content. The most common construct is content, enclosed by a start and an end tag as markup:
<tagName>content</tagName>
Or the same element spread over several lines (the content then also contains the surrounding line breaks):
<tagName>
content
</tagName>
The start tag, the content and the matching end tag together form an element. The content can itself contain one or several elements, which are then called child elements. This way a hierarchical structure can be built using nesting:
<tagName>
  <childTagName>
    childContent
  </childTagName>
</tagName>
The whole hierarchical structure is called a tree, and its outermost element is the root element. Working with XML always means working with this tree structure: going to the right hierarchical level and extracting, modifying or adding elements.
An element can additionally have attributes, which are located inside the start tag:
<tagName attribute1="value1" attribute2="value2">
content
</tagName>
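Looking ahead to the Java part of this article, the following minimal sketch shows how such attributes can be read with the DOM parser that ships with the JDK. The class name AttributeDemo and the method rootAttribute are hypothetical names chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the model described in this article
class AttributeDemo {

    // Parses the XML string and returns the value of the named attribute
    // on the root element (an empty string if the attribute is absent).
    static String rootAttribute(String xml, String attributeName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return root.getAttribute(attributeName);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tagName attribute1=\"value1\" attribute2=\"value2\">content</tagName>";
        System.out.println(rootAttribute(xml, "attribute1"));
        System.out.println(rootAttribute(xml, "attribute2"));
    }
}
```

Attribute values always arrive as strings, so numbers or colors stored in attributes still have to be converted by your own code.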
That was XML in a nutshell. Here are some topics for the interested reader to find out more:
- Checking if an XML document is well-formed
- Writing an XML declaration (header) for an XML document to specify the XML version
- Adding comments
- Defining a fixed format with a DTD or an XML Schema
- Avoiding tag name clashes with namespaces
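As a starting point for the first topic, well-formedness can be checked in Java by simply trying to parse the document: a parser has to reject input that is not well-formed. The sketch below uses the JDK's built-in DOM parser; the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration
class WellFormedCheck {

    // Returns true if the string can be parsed as XML, false if the parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports input that is not well-formed as a SAXException
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable or input unreadable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b>x</b></a>"));
        System.out.println(isWellFormed("<a><b>x</a></b>"));
    }
}
```

Note that this only checks well-formedness (matching tags, a single root element and so on), not whether the document follows a particular DTD or schema.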
Use for simulation
In a simulation environment, possible applications include:
- layout information (the layout gets created in the model at runtime)
- parameters of objects
- domain specific data (production plan, shipping dates, events, ...)
- all of the above*
As an example, let's define an XML document that contains some basic information about a product:
<data>
  <product>
    <color>red</color>
  </product>
</data>
We will use this later to create products in an AnyLogic process flow.
*Note: The AnyLogic source file (with the .ALP ending) is itself XML. Layout information as well as all other model information is stored in an XML format. However, there is no documentation available for this format, and it changes regularly with new versions of AnyLogic. If you want to know more about the XML of the .ALP, you can read this article.
How to work with XML in Java?
There are four different approaches to working with XML in Java:
- DOM: Loads the whole XML tree into memory. Packages: the JDK's built-in org.w3c.dom API (via javax.xml.parsers), dom4j, XOM.
- Pull: The elements are iterated over with a cursor, and your code pulls the elements it is interested in. Packages: XPP, StAX.
- Push: The parser works event-based and triggers functions in your code whenever it finds a certain element or text. Packages: SAX.
- Mapping: A representation of the XML data in the form of Java objects is built. Packages: JAXB, XStream.
Which one to use depends on your requirements. If you need to sort elements or resolve references between them, you need the full tree in memory (DOM). If you only want to search for certain elements or read data to fill your own data structures, Pull or Push may be a better fit, since neither keeps the whole document in memory. Mapping adds a layer of abstraction between parsing the XML and actually working with the data: the mapping library can switch internally between DOM, SAX, StAX and so on without your code having to change.
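To make the contrast with DOM concrete, here is a minimal sketch of the Pull approach using StAX, which is part of the JDK. It reads the color of every product from the example XML above without building a tree. The class name PullDemo and the method readColors are hypothetical names chosen for illustration; the model in this article does not use them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for illustration
class PullDemo {

    // Moves a cursor through the document and collects the text of every <color> element.
    static List<String> readColors(String xml) {
        List<String> colors = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "color".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag of a text-only element
                    colors.add(reader.getElementText().trim());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return colors;
    }

    public static void main(String[] args) {
        String xml = "<data><product><color>red</color></product></data>";
        System.out.println(readColors(xml));
    }
}
```

The code only ever sees one event at a time, which is why this style suits large files but makes sorting or cross-referencing elements harder than with DOM.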
In this example we want to read the data and create products based on it. We will do it using DOM.
The model
The model consists of a very simple process flow, a basic layout (2 nodes and one connecting path) and a file object. The idea is that objects get created according to the data in the XML file by code and enter the flow at the Enter block.
Three helper functions exist:
- readFile()
- createObject(Color color)
- parseColor(String colorString)
The function readFile() reads the file and returns its content as a String:
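java.lang.StringBuilder sb = new java.lang.StringBuilder();
while(file.canReadMore()){
// readLine() is assumed to return the line without its line break,
// so one is added back to keep tokens on adjacent lines separated
sb.append(file.readLine()).append('\n');
}
return sb.toString();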
The function createObject(Color color), which is called for each product found in the XML data:
Agent agent = new Agent();
enter.take(agent);
agent.setColor(color);
And finally parseColor(String colorString), which converts the color text from the XML content to the corresponding Java type.
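The body of parseColor is not shown here. A minimal sketch of what such a function could look like follows, assuming that the Color type is java.awt.Color and that the XML uses plain English color names; the set of supported names and the gray fallback are assumptions made for illustration, not the original implementation.

```java
import java.awt.Color;
import java.util.Locale;

// Hypothetical wrapper class; in the model this would be a function on Main
class ColorParsing {

    // Maps a color name from the XML content to a java.awt.Color.
    // Unknown or missing names fall back to gray (an arbitrary choice for this sketch).
    static Color parseColor(String colorString) {
        if (colorString == null) {
            return Color.GRAY;
        }
        switch (colorString.trim().toLowerCase(Locale.ROOT)) {
            case "red":
                return Color.RED;
            case "green":
                return Color.GREEN;
            case "blue":
                return Color.BLUE;
            case "yellow":
                return Color.YELLOW;
            default:
                return Color.GRAY;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseColor("red"));
    }
}
```

Trimming and lower-casing the input makes the function tolerant of content such as " Red " that was written on its own line in the XML file.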
How to integrate XML parsing in AnyLogic
No external JAR files have to be included in the project, because the DOM parser ships with the JDK. Only some classes have to be imported, which is done in the Imports section of Main:
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
We write a function parseXML(String inputString) that will, well, parse the XML for us.
The first part creates and prepares the DOM parser and reads in the whole XML tree.
In the second part, all product elements in the tree are retrieved as a list and iterated over.
For each product element found, its color is read and a function in the model is triggered.
try {
    // First part: prepare the DOM tree
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    ByteArrayInputStream input = new ByteArrayInputStream(inputString.getBytes(StandardCharsets.UTF_8));
    Document doc = builder.parse(input);
    doc.getDocumentElement().normalize();
    // Second part: find all product elements and iterate over them
    NodeList nList = doc.getElementsByTagName("product");
    Node nNode;
    Color color;
    for (int temp = 0; temp < nList.getLength(); temp++) {
        nNode = nList.item(temp);
        if (nNode.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement = (Element) nNode;
            // For each product entry, get its first color element ...
            color = parseColor(eElement
                    .getElementsByTagName("color")
                    .item(0)
                    .getTextContent()
                    .trim());
            // ... and trigger a function in the model
            createObject(color);
        }
    }
} catch (Exception e) {
    // Covers parser errors as well as a product without a color element
    e.printStackTrace();
}
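To try the extraction logic outside AnyLogic, the same DOM calls can be wrapped in a small standalone class. The sketch below returns the color strings instead of creating agents and skips products that have no color element instead of relying on the catch block. The class name ProductColors and the method extractColors are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical standalone variant of the parsing function, for testing without a model
class ProductColors {

    // Returns the trimmed text of the first <color> child of every <product> element.
    static List<String> extractColors(String xml) {
        List<String> colors = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList products = doc.getElementsByTagName("product");
            for (int i = 0; i < products.getLength(); i++) {
                // getElementsByTagName only returns elements, so the cast is safe
                Element product = (Element) products.item(i);
                NodeList colorNodes = product.getElementsByTagName("color");
                if (colorNodes.getLength() == 0) {
                    continue; // product without a color element: skip it
                }
                colors.add(colorNodes.item(0).getTextContent().trim());
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return colors;
    }

    public static void main(String[] args) {
        String xml = "<data><product><color>red</color></product></data>";
        System.out.println(extractColors(xml));
    }
}
```

Whether a product without a color should be skipped, reported or given a default color depends on the model; skipping is just one possible choice.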
Complete Model
Taking all this together, we now have a model that takes data in XML format from a text file and uses it to create and parameterise objects in the model.
The source files of the model can be downloaded from the AnyLogic Cloud.
Conclusion
As shown, parsing XML is quite easy. The structured text format of XML has advantages on both sides: humans can read and modify it without any special software, and in the model the tree structure makes it easy to find, retrieve and process the information. XML should therefore be considered for input data as an alternative to Excel, CSV, databases, etc.