DL4J (Deeplearning4j) – A Glance at the POM

Maven is a project management tool that facilitates the build process. It offers quite a few tools on its belt, enough that we could talk about Maven for quite a few posts in a row. But for our purposes of getting started with DL4J, we will just focus on what it provides us, the user/practitioner. More specifically, we are going to examine the pom.xml we set up in our “Getting Started with DL4J” post. Let’s get started.

pom.xml

The pom (Project Object Model) file is an XML configuration file. In complex projects, like the DL4J source itself, pom.xml files can reference other pom.xml files through the project’s folder structure and their contents. Since we are focusing on a small project that uses the DL4J library and its partners in crime, ND4J and Canova, we will only have one pom ourselves. In the following, I’m going to touch on the parts that are instrumental to understand. By the end of this tutorial, you should have enough knowledge to start playing around in the pom.xml, update to newer versions of the libraries, and change back-end dependencies.
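To make this concrete, here is a rough sketch of the dependency section such a pom.xml might contain. Treat the group IDs, artifact IDs, and version numbers below as illustrative assumptions; they have shifted between releases, so check Maven Central for the coordinates that match the release you are targeting.

    <properties>
        <!-- Placeholder version property; point this at whatever release you use -->
        <dl4j.version>0.4-rc3.8</dl4j.version>
    </properties>

    <dependencies>
        <!-- The core DL4J library -->
        <dependency>
            <groupId>org.deeplearning4j</groupId>
            <artifactId>deeplearning4j-core</artifactId>
            <version>${dl4j.version}</version>
        </dependency>
        <!-- An ND4J backend; this one runs on the CPU via native BLAS -->
        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>nd4j-jblas</artifactId>
            <version>${dl4j.version}</version>
        </dependency>
        <!-- Canova, for vectorizing raw data (it follows its own version numbering) -->
        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>canova-api</artifactId>
            <version>0.0.0.14</version>
        </dependency>
    </dependencies>

Updating a library or swapping the back end is then just a matter of editing the relevant version or artifactId element and letting Maven re-resolve the dependencies.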



DL4J (Deeplearning for Java) – Getting Started

UPDATE: Hey guys, this tutorial has aged poorly when it comes to working with the newest version(s) of DL4J. The descriptive material found here is still fine (though dated). Here’s a small and quick update to get started.

ND4J, Canova & DL4J

I’m just going to take a quick moment to break down these three libraries and identify what they are. If you are interested in just getting started, skip ahead.

  1. Introduction to the Deeplearning tripod
  2. Getting started
    1. Set-up using the command prompt
    2. Set-up using Eclipse

ND4J

ND4J is an n-dimensional array scientific computing library for Java, meant to rival the offerings of the likes of NumPy. According to the ND4J page, it can outperform NumPy with the right back end. In short, it’s an easy-to-use API that operates at high efficiency without ever being locked to one linear algebra library.
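To give a flavor of the API, here is a minimal sketch, assuming an ND4J dependency is already on the classpath, that builds a small matrix and runs a couple of operations on it:

    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    public class Nd4jTaste {
        public static void main(String[] args) {
            // A 2x2 matrix of ones
            INDArray ones = Nd4j.ones(2, 2);

            // Element-wise scalar addition (returns a new array)
            INDArray threes = ones.add(2);

            // Matrix multiplication
            INDArray product = threes.mmul(threes);

            System.out.println(product);
        }
    }

The point is that none of this code cares which BLAS implementation actually does the number crunching, which brings us to the back end.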

Back End?

That’s right: ND4J itself is a consolidation of well-established, highly optimized BLAS (Basic Linear Algebra Subprograms) implementations under a unified API. What does that mean for you? It means you can switch between linear algebra libraries without ever having to touch your code. It means you can port code that runs on ND4J to the GPU with a simple swap in the Maven configuration file (the pom.xml; we’ll talk more about this later).
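As a hedged sketch of what that swap looks like (the exact backend artifact names have changed across ND4J releases, so treat these as assumptions), it amounts to replacing a single dependency in the pom.xml:

    <!-- CPU backend via native BLAS -->
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-jblas</artifactId>
        <version>${nd4j.version}</version>
    </dependency>

    <!-- To target the GPU instead, swap in a CUDA-backed artifact, for example: -->
    <!--
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-jcublas-7.0</artifactId>
        <version>${nd4j.version}</version>
    </dependency>
    -->

Your ND4J code stays exactly the same; only the dependency changes.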

Visit the ND4J webpage for more information.

Canova

Canova exists to take raw data and convert it into standardized vector formats that are easily loaded into machine learning pipelines. By raw data, think images, sound, video, and so forth. The library is still under development, but it serves as an important leg of the deep learning with Java tripod.
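As a rough sketch of the usage pattern (Canova’s package names and constructors have shifted over its development, and the project was later renamed DataVec, so treat the specifics here as assumptions), vectorizing a CSV file into DL4J-ready mini-batches looks something like this; iris.csv is just a stand-in for a local file of yours:

    import java.io.File;

    import org.canova.api.records.reader.RecordReader;
    import org.canova.api.records.reader.impl.CSVRecordReader;
    import org.canova.api.split.FileSplit;
    import org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator;

    public class CanovaTaste {
        public static void main(String[] args) throws Exception {
            // Read a comma-separated file, skipping no header lines
            RecordReader reader = new CSVRecordReader(0, ",");
            reader.initialize(new FileSplit(new File("iris.csv")));

            // Wrap the reader so it emits mini-batches of 10 rows,
            // with the label in column 4 and 3 possible classes
            RecordReaderDataSetIterator iter =
                    new RecordReaderDataSetIterator(reader, 10, 4, 3);

            while (iter.hasNext()) {
                System.out.println(iter.next());
            }
        }
    }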

DL4J

Deeplearning4j, or Deeplearning for Java, is a comprehensive deep learning offering for Java. It attempts to fill the role that Torch fills for Lua, or Theano for Python. Comparing these libraries directly may not be fair, given their different life spans, but it is a useful way to think about them. DL4J builds on top of ND4J, which means any algorithm in DL4J can be configured to use the various BLAS backends; in other words, GPU support out of the box.

One of the more attractive features of DL4J is how configurable it is. Most, if not all, of the major neural network architectures have been implemented, and their flexibility is what data scientists and machine learning enthusiasts dream of. What’s more, networks can be piped into each other with relative ease. Really, there is too much to talk about, and much of it I have yet to fully explore. So let’s jump into getting set up, and afterwards you can join me as I continue to dig into this framework.
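Before we do, here is a quick taste of what that configuration style looks like: a minimal sketch of a small two-layer network. The exact builder methods vary between DL4J releases, so treat this as an approximation rather than copy-paste-ready code for any particular version.

    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.layers.DenseLayer;
    import org.deeplearning4j.nn.conf.layers.OutputLayer;
    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.nd4j.linalg.lossfunctions.LossFunctions;

    public class Dl4jTaste {
        public static void main(String[] args) {
            // 4 inputs -> 10 hidden units -> 3 output classes
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .seed(123)
                    .list()
                    .layer(0, new DenseLayer.Builder()
                            .nIn(4).nOut(10)
                            .build())
                    .layer(1, new OutputLayer.Builder(
                                LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                            .nIn(10).nOut(3)
                            .build())
                    .build();

            MultiLayerNetwork net = new MultiLayerNetwork(conf);
            net.init();
            // net.fit(...) would then take a data iterator like the Canova one above
        }
    }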

One last thing to note: if the swappable ND4J backends were not already attractive enough, DL4J also offers hooks for Hadoop and Spark. The idea is that you prototype locally, then scale your pipeline out with Hadoop and/or Spark with little to no modification. The library is built with distributed computing in mind.
