DL4J (Deeplearning4j) – A Glance at the POM

Maven is a project management tool that facilitates the build process. It has quite a few tools on its tool belt, enough that we could talk about Maven for several posts in a row. For the purposes of getting started with DL4J, though, we will focus on what it provides us as users and practitioners. More specifically, we are going to examine the pom.xml we set up in our “Getting Started with DL4J” post. Let’s get started.

pom.xml

The pom (project object model) file is an XML configuration file. In complex projects, like the DL4J source, pom.xml files can link to other pom.xml files through the folder structure and their contents. Since we are focusing on a small project that uses the DL4J library and its partners in crime, ND4J and Canova, we will only have one pom ourselves. In the following, I’m going to touch on the parts that are instrumental to understand. By the end of the tutorial, you should know enough to start playing around in the pom.xml, update to newer versions of the libraries, and change back-end dependencies.

A Quickie on XML

If you aren’t familiar with XML, just understand that things inside <angle brackets> are called tags. Every tag that is opened needs to be closed with a matching </closing tag>. The indentation you see isn’t necessary for functionality, but it makes the file easier to read, like all code really. Tags placed between an opening tag and its closing tag are children of the enclosing tag. For example:

<parent>
    <child>Child 1</child>
    <child>Child 2</child>
    <parentChild>
        <child>Child of a child</child>
    </parentChild>
</parent>

As seen above, tags can contain not only other tags but also text. Lastly, tags can carry information inside the opening <tag> itself. These pieces of information are called attributes, and they define specific details about a tag, whether that is an identifier or meta-information for that tag. We’ll see attributes in the first section of the pom.xml file we examine below.
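
To make attributes concrete, here is a small sketch. The tag and attribute names (book, id, language, title) are made up for illustration and have nothing to do with Maven:

```xml
<!-- id and language are attributes: name="value" pairs inside the opening tag -->
<book id="42" language="en">
    <!-- title is a child tag that contains text -->
    <title>An Example Title</title>
</book>
```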

<project>

The project tag is the outermost parent tag of the pom. It tells Maven that this is the start of a project, and it includes some meta-information in the form of attributes. These attributes help Maven interpret the remainder of the document. You can almost think of it as header information.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

In nearly every scenario, we will not need to change this. The exception would be if the pom format receives a revision on the Apache Maven side and DL4J also makes the transition. For all intents and purposes, we can pay little mind to it.

Project Definition

After the <project> tag, we define some information about our own project. Maven requires the modelVersion, groupId, artifactId and version tags, but the particular values matter little for most of us working with DL4J, especially if we are just using the library. If you were to deploy your project to the Maven Central Repository, these settings would carry more weight. So feel free to put less emphasis on them unless you are in a distribution process.

  <modelVersion>4.0.0</modelVersion>

  <groupId>com.mygroup</groupId>
  <artifactId>deeplearning</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>deeplearning-bt</name>
  <url>http://maven.apache.org</url>

The <modelVersion> should correspond to the pom schema version you are writing to. In this case, 4.0.0 is the appropriate version.
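
For reference, a bare-bones pom can be sketched with only the coordinates Maven insists on. This is a minimal sketch reusing the groupId, artifactId and version from the snippet above; packaging is left out because Maven defaults to jar when it is absent, and name and url are optional:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <!-- version of the pom schema itself -->
  <modelVersion>4.0.0</modelVersion>

  <!-- the three coordinates that identify this project -->
  <groupId>com.mygroup</groupId>
  <artifactId>deeplearning</artifactId>
  <version>0.0.1-SNAPSHOT</version>
</project>
```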

<properties> – The variables of the pom!

The properties section acts as a set of variable definitions. Each child of the properties tag is a key-value pair: the tag name is the key, and the text inside it is the value. Here we are establishing the version numbers that we will use for various dependencies later in the pom. This keeps the document cleaner and more succinct.

If there are new releases of DL4J, ND4J, or Canova, we can make the appropriate update here and tell Maven to update. It’s that simple.

  <properties>
    <nd4j.version>0.4-rc0</nd4j.version>
    <dl4j.version>0.4-rc0</dl4j.version>
    <canova.version>0.0.0.5</canova.version>
  </properties>

In any subsequent part of the pom, we can refer to a property with the syntax ${property}, where property is the name of one of the tags defined above, for example ${dl4j.version}.
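
As a quick sketch of the substitution, here is a single dependency entry that uses the dl4j.version property defined above. When Maven reads the pom, it swaps the placeholder for the property’s value:

```xml
<dependency>
  <groupId>org.deeplearning4j</groupId>
  <artifactId>deeplearning4j-core</artifactId>
  <!-- resolved to 0.4-rc0, the value of <dl4j.version> in <properties> -->
  <version>${dl4j.version}</version>
</dependency>
```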

<distributionManagement>

Distribution management configures where Maven deploys built artifacts, such as a snapshot or release repository. We won’t worry much about it here, and it can most likely be removed from your pom.

  <distributionManagement>
    <snapshotRepository>
      <id>sonatype-nexus-snapshots</id>
      <name>Sonatype Nexus snapshot repository</name>
      <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    </snapshotRepository>
    <repository>
      <id>nexus-releases</id>
      <name>Nexus Release Repository</name>
      <url>http://oss.sonatype.org/service/local/staging/deploy/maven2/</url>
    </repository>
  </distributionManagement>

It has since been removed from the example pom; the block above was legacy XML from the examples git repo. I’ve kept it in this tutorial because it gives some insight into what Maven is capable of doing.

<dependencies> & <dependency>

The dependency tag tells Maven what needs to be included during the build/compile process. This is the part of the pom where we, as a community, benefit most with regard to DL4J, ND4J and Canova. The dependencies tag simply tells Maven that there is a group of individual dependency entries that need to be rounded up.

If we wish to add a new dependency, say deeplearning4j-aws or the Spark hook for DL4J, we can simply go to http://search.maven.org/, look up the groupId, artifactId and version we are interested in, and then add those to our dependencies.

  <dependencies>
    <dependency>
      <groupId>org.deeplearning4j</groupId>
      <artifactId>deeplearning4j-ui</artifactId>
      <version>${dl4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.deeplearning4j</groupId>
      <artifactId>deeplearning4j-nlp</artifactId>
      <version>${dl4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.deeplearning4j</groupId>
      <artifactId>deeplearning4j-core</artifactId>
      <version>${dl4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.nd4j</groupId>
      <artifactId>nd4j-x86</artifactId>
      <version>${nd4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.nd4j</groupId>
      <artifactId>canova-nd4j-image</artifactId>
      <version>${canova.version}</version>
    </dependency>
  </dependencies>

Note that we are using the properties defined earlier in the document for the versions. This allows us to easily update with new releases.
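
Adding the deeplearning4j-aws dependency mentioned earlier might look like the sketch below. It assumes the artifact is published under the org.deeplearning4j groupId with the same version as the other DL4J modules; confirm both on http://search.maven.org/ before relying on it:

```xml
<!-- goes inside the existing <dependencies> tag -->
<dependency>
  <!-- assumption: same groupId as the other DL4J modules -->
  <groupId>org.deeplearning4j</groupId>
  <artifactId>deeplearning4j-aws</artifactId>
  <!-- assumption: released alongside the version stored in dl4j.version -->
  <version>${dl4j.version}</version>
</dependency>
```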

<dependencyManagement>

The dependencyManagement tag provides a way to centralize version management in a multi-tiered pom system. It supplies configuration, such as version numbers, to child poms. It does not force the dependency to exist in its children, though; a child only picks up a dependency if it declares it. It exists as a way to control versioning across the entire Maven project.
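
Since our single-pom project has no children, the following is only a sketch of a hypothetical parent/child layout. The parent pins the version, and the child declares the dependency without one:

```xml
<!-- parent pom.xml (hypothetical): pins the version but adds no dependency by itself -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.deeplearning4j</groupId>
      <artifactId>deeplearning4j-core</artifactId>
      <version>${dl4j.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- child pom.xml (hypothetical): opts in, version is taken from the parent -->
<dependencies>
  <dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
  </dependency>
</dependencies>
```

A child that never lists deeplearning4j-core simply does not get it, which is the difference between dependencyManagement and dependencies.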

<build>

The build tag defines build behavior for the Maven project. The contents of this tag fall outside the focus of this tutorial, but you can expect a follow-up post digging deeper into Maven and the pom.
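
To give a flavor of what usually lives there, here is a minimal sketch that configures the standard maven-compiler-plugin. The Java level of 1.7 is an assumption for illustration, not something taken from the example pom:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <!-- assumption: compile for Java 1.7; adjust to your JDK -->
        <source>1.7</source>
        <target>1.7</target>
      </configuration>
    </plugin>
  </plugins>
</build>
```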

Closing Comments

Now that we have a rough understanding of what’s going on in pom.xml, we should feel comfortable mucking around when needed. More importantly, when a new release candidate comes out, we should know where to look. In this pom.xml, we are storing the release candidate versions in the properties. With a simple change there, we are able to tell maven to rebuild and have access to the latest ND4J, Canova and DL4J releases.

As always, thanks for reading, get programming and feel free to post any questions, corrections or just general comments. Also I’ve recently entered the twitter-verse, and you can follow me at @depiesML. I’ll tweet when new articles are posted, and share interesting Machine Learning articles.

Lastly, thank you Chris Nicholson for clarifying the functionality of a few of the tags.
