Jul 12, 2014
Article Perl

In a Perl script, the data being processed is usually read into one or several Perl data structures: arrays and hashes. But if the volume of data to be processed is large, the available memory may not be enough to hold it all.

This post explains some possible ways to deal with this issue.

Using a database to store the information

This is the most straightforward way to deal with large data sets. A database (MySQL, PostgreSQL,…) is a tool specifically suited to handling large volumes of data. In a relational database, a set of tables, indexes, stored procedures, etc. can be defined to store and query the data being processed efficiently, making good use of the available resources.
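As a rough illustration of this approach, the sketch below keeps the records in a database table instead of a Perl hash and fetches only the rows it needs. It assumes the DBI module with the DBD::SQLite driver, chosen here only because it needs no server; the file name "records.sqlite", the table "word_count" and the sample rows are all hypothetical. With MySQL or PostgreSQL, mainly the connection string and credentials would differ.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; "records.sqlite" is a hypothetical file name.
my $dbh = DBI->connect( "dbi:SQLite:dbname=records.sqlite", "", "",
    { RaiseError => 1, AutoCommit => 1 } );

# Hypothetical table standing in for what would otherwise be a large hash.
$dbh->do("CREATE TABLE IF NOT EXISTS word_count (word TEXT PRIMARY KEY, n INTEGER NOT NULL)");
$dbh->do("DELETE FROM word_count");    # start clean so the script can be re-run

# Insert the rows one at a time inside a single transaction;
# no large in-memory structure is built.
my $ins = $dbh->prepare("INSERT INTO word_count (word, n) VALUES (?, ?)");
$dbh->begin_work;
$ins->execute( "word$_", $_ % 7 ) for 1 .. 1000;
$dbh->commit;

# Let the database do the filtering and fetch only the matching rows.
my $sth = $dbh->prepare("SELECT word, n FROM word_count WHERE n = ? ORDER BY word LIMIT 3");
$sth->execute(6);
while ( my ( $word, $n ) = $sth->fetchrow_array ) {
    print "$word\t$n\n";
}

$dbh->disconnect;
```

Only the rows returned by the query are held in memory at any time; the rest of the data stays on disk under the control of the database.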

Storing the data structures in files with DBM::Deep

Using the CPAN module DBM::Deep, a file can be “tied” to a Perl array or hash. Data stored in the data structure is then transparently written to the file instead of being kept in main memory, and the responsibility for transferring data between the file system and main memory in the most efficient way is delegated to the operating system itself.

As already mentioned, once the file is tied to the data structure, the association is transparent, and the array or hash can be used exactly like a “normal” array or hash.
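To make the transparency concrete, here is a minimal sketch that uses Perl's built-in tie with DBM::Deep and then treats the variable like any ordinary hash. The file name "tied.db" and the keys are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use DBM::Deep;

# Tie an ordinary hash variable to a file ("tied.db" is a hypothetical name).
tie my %hash, "DBM::Deep", "tied.db";

# From here on, only plain hash operations are used.
$hash{apple}  = 3;
$hash{banana} = 5;
$hash{apple}++;          # read-modify-write goes through the file
delete $hash{banana};

for my $key ( sort keys %hash ) {
    print "$key => $hash{$key}\n";
}
print "banana is gone\n" unless exists $hash{banana};
```

Nothing after the tie line refers to DBM::Deep; the rest of the code does not need to know where the data lives.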

Example 1 – Using DBM::Deep with a hash:
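The listing for this example did not survive in this copy of the post. What follows is a minimal sketch of what such a script might look like, using DBM::Deep's object interface; it is a reconstruction from the description, not the original code.

```perl
use strict;
use warnings;
use DBM::Deep;

# Open "data.db", creating it if it does not exist yet.
# By default the file is tied to a hash reference.
my $db = DBM::Deep->new("data.db");

if ( exists $db->{key} ) {
    # Later runs: the value is read back from data.db.
    print "key => $db->{key}\n";
}
else {
    # First run: store the value in the file.
    $db->{key} = "value";
    print "Stored 'value' under 'key' in data.db\n";
}
```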

The first time this sample script is run, a file “data.db” is created, and the value “value” assigned to key “key” is stored in it.

The next time the script is executed, the value previously assigned to “key” is retrieved from “data.db” and printed as the script's output.

Example 2 – Using DBM::Deep with an array
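The listing for this example is also missing from this copy. A minimal sketch of an equivalent script, assuming a hypothetical file name "array.db" and made-up element values, could look like this:

```perl
use strict;
use warnings;
use DBM::Deep;

# The "type" argument asks for an array reference instead of the default hash.
my $db = DBM::Deep->new(
    file => "array.db",                 # hypothetical file name
    type => DBM::Deep->TYPE_ARRAY,
);

$db->clear;                             # start from an empty array on every run

# Scalar elements
push @$db, "first", "second";

# Nested array and nested hash
$db->[2] = [ 1, 2, 3 ];
$db->[3] = { name => "nested" };

print "Number of elements: ", $db->length, "\n";
print "Nested array item: $db->[2][1]\n";
print "Nested hash item: $db->[3]{name}\n";
```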

As we can see, the main difference with respect to the code used for a hash is in the way arguments are passed to new(). A “type” argument needs to be passed to tell DBM::Deep to tie the file to an array reference, instead of the default hash reference.

Other than that, the array is used as a “normal” array. As we can see in the example, nested arrays or hashes can be added to the array, as well as scalar elements.


