Nov 03 2014

XML is a format commonly used for the interchange of data between software applications.

The easiest way to process XML data is to use a procedure that reads the whole document into a data structure native to the programming language in use. In PHP, the result would be an associative array of (key, value) pairs, where each value is in turn either another associative array or a primitive value of numeric or string type.

Sometimes, however, the file holding the XML document is several gigabytes in size, too large for the entire document to be held in memory. In those cases, the file must be processed as a stream: elements are read and processed one at a time.

This post explains how to work in PHP with large XML documents, by reading them as streams.
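
As a rough illustration of the streaming idea (not the PHP code from the full post), the sketch below uses `xml.etree.ElementTree.iterparse` from the Python standard library to visit elements one at a time and discard each one once it has been handled. The `record` tag name, the `count_records` helper, and the sample data are assumptions made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Stream through `source` and count <tag> elements without
    keeping the whole document in memory."""
    count = 0
    # Request "start" events as well, so the first event yields the root element.
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)
    for event, elem in events:
        if event == "end" and elem.tag == tag:
            count += 1      # real per-element processing would go here
            root.clear()    # drop finished children so memory use stays bounded
    return count


# Hypothetical sample document; a real input would be a large file on disk.
sample = io.BytesIO(
    b"<records><record id='1'/><record id='2'/><record id='3'/></records>"
)
print(count_records(sample))  # prints 3
```

The key point is the call to `root.clear()` after each element: without it, the parser would still build the full tree in memory, and streaming would bring no benefit.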

Continue reading »

Posted at 9:05 am
Jul 03 2014

JSON and XML are two formats commonly used to represent structured data in a text file. Sometimes, data exported from one tool in JSON format needs to be imported into a different tool that only accepts XML as input.

This post explains how to perform the conversion from JSON to XML using a Perl script.
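
To give an idea of what such a conversion involves (this is not the Perl script from the full post), here is a minimal Python sketch built on the standard `json` and `xml.etree.ElementTree` modules. The mapping rules are assumptions chosen for the example: object keys become child elements, array entries become repeated `item` elements, and the document is wrapped in a `root` element. It also assumes that every JSON key is a valid XML element name.

```python
import json
import xml.etree.ElementTree as ET


def build(parent, value):
    """Attach decoded JSON `value` under the XML element `parent`."""
    if isinstance(value, dict):
        for key, child in value.items():
            build(ET.SubElement(parent, key), child)
    elif isinstance(value, list):
        for child in value:
            build(ET.SubElement(parent, "item"), child)
    elif isinstance(value, bool):
        parent.text = "true" if value else "false"   # keep JSON spelling
    elif value is not None:
        parent.text = str(value)                      # numbers and strings
    # JSON null leaves the element empty


def json_to_xml(text, root_tag="root"):
    """Convert a JSON string into an XML string."""
    root = ET.Element(root_tag)
    build(root, json.loads(text))
    return ET.tostring(root, encoding="unicode")


print(json_to_xml('{"name": "Ann", "tags": ["a", "b"]}'))
# <root><name>Ann</name><tags><item>a</item><item>b</item></tags></root>
```

JSON and XML do not map onto each other one to one, so any converter has to settle on conventions like these; the tool that consumes the XML usually dictates which ones are appropriate.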

Continue reading »

Posted at 11:38 am
Oct 31 2012

In our previous post, we explained how to process a file in XML format using the XML::Simple module from CPAN.

However, that module works by reading the whole file into memory. This is not suitable when the file to be processed is large and the available RAM is limited.

Instead, we can use the XML::Parser::PerlSAX module (SAX stands for “Simple API for XML”). With this module, the file is read as a data stream, and events such as “start of element” and “end of element” are generated as the parser advances. The programmer only needs to provide an event handler package that implements methods to process these standard events.
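
The same event-driven pattern exists in most languages. As a minimal sketch (using Python's standard `xml.sax` module rather than the Perl module discussed in the post), the handler below counts elements and text characters as the events arrive. The `ElementCounter` class and the sample document are assumptions made up for this example.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Event handler: the parser calls these methods as it reads the stream."""

    def __init__(self):
        super().__init__()
        self.counts = {}      # element name -> number of start events seen
        self.text_chars = 0   # total characters of text content

    def startElement(self, name, attrs):
        # "start of element" event
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate rather than assign.
        self.text_chars += len(content)

    def endElement(self, name):
        # "end of element" event; nothing to do in this sketch
        pass


handler = ElementCounter()
xml.sax.parseString(b"<library><book>Dune</book><book/><cd/></library>", handler)
print(handler.counts)      # {'library': 1, 'book': 2, 'cd': 1}
print(handler.text_chars)  # 4
```

The handler keeps only the running totals it needs, so memory use does not grow with the size of the input.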

Continue reading »

Posted at 8:05 pm