Part I of this article is available here; it shows an example of how to work with the Atlas REST API and UI.

This article shows the basic steps for working with the Apache Atlas Java API. It covers the following points:

  1. Example description;
  2. Overview of how to work with Java API;
  3. Solution.

The whole source code for this example is located here.

1. Example description

I will explain how to work with the Java API by creating two AWS S3 objects and connecting them (lineage) through a Spark process. The result will look like this:

2. Overview of how to work with Java API

The client API is available through AtlasClientV2. To work with it, extend the AtlasClientV2 class (as this article does) or instantiate it directly, and pass parameters such as the Atlas URL, username, and password to the constructor. There are several constructors:
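One of these constructors takes an array of base URLs and a two-element array holding the username and password for basic authentication. A minimal sketch of using it, assuming a local Atlas instance at http://localhost:21000 with admin/admin credentials (the endpoint, credentials, and object name are assumptions, not values from this article):

```scala
import org.apache.atlas.AtlasClientV2

object ConnectExample {
  def main(args: Array[String]): Unit = {
    // Assumed endpoint and credentials; replace with your own.
    val atlasUrls: Array[String]   = Array("http://localhost:21000")
    val credentials: Array[String] = Array("admin", "admin")

    // Resolves to AtlasClientV2(String[] baseUrl, String[] basicAuthUserNamePassword).
    val client = new AtlasClientV2(atlasUrls, credentials)

    println(s"Created Atlas client: ${client.getClass.getSimpleName}")
  }
}
```

Several base URLs can be passed in the first array when Atlas runs in high-availability mode.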

You also have to add some Atlas libraries to your project:

"org.apache.atlas" % "atlas-client-v2"
"org.apache.atlas" % "atlas-common"

3. Solution

First and foremost, let's look at my build.sbt:
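A minimal build.sbt for this kind of project might look like the sketch below. The project name, project version, and Scala version are assumptions; the Atlas version 2.1.0 is taken from the distribution path used later in this article.

```scala
// build.sbt (sketch): name, version, and scalaVersion are placeholders.
name := "atlas-java-api-example"
version := "0.1.0"
scalaVersion := "2.12.12"

// Matches the Atlas distribution referenced in this article.
val atlasVersion = "2.1.0"

libraryDependencies ++= Seq(
  "org.apache.atlas" % "atlas-client-v2" % atlasVersion,
  "org.apache.atlas" % "atlas-common"    % atlasVersion
)
```

The single `%` is used because the Atlas artifacts are plain Java libraries without a Scala version suffix.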

Then we have to add atlas-application.properties to the resources folder of the project, taken from the ./apache-atlas-sources-2.1.0/distro/target/apache-atlas-2.1.0-bin/apache-atlas-2.1.0/conf/ directory; otherwise you will get:

ERROR AtlasBaseClient: Exception while loading configuration. org.apache.atlas.AtlasException: Failed to load application properties

In a production environment, just add the directory containing this file to the classpath.

If you want to see the logs, add log4j.properties to the resources folder and define the logging level.

I defined the constants in a separate case object:
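A minimal sketch of such an object is shown here. The object name, field names, endpoint, and credentials are hypothetical placeholders; the type names are the ones this article works with.

```scala
// Hypothetical constants object; all connection values are placeholders.
case object AtlasConstants {
  // Assumed local Atlas endpoint and credentials.
  val AtlasUrl: String = "http://localhost:21000"
  val Username: String = "admin"
  val Password: String = "admin"

  // Atlas type names used in this example.
  val S3BucketType: String     = "aws_s3_bucket"
  val S3PseudoDirType: String  = "aws_s3_pseudo_dir"
  val S3ObjectType: String     = "aws_s3_object"
  val SparkProcessType: String = "spark_process"

  // Attribute shared by all referenceable Atlas entities.
  val QualifiedName: String = "qualifiedName"
}
```

Keeping the type names in one place avoids typos in string literals scattered across the code.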

Then we have to extend AtlasClientV2:
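One way to do this is a thin subclass that forwards to the basic-authentication constructor. This is a sketch under assumptions: the class name `AtlasClient`, its parameters, and the `local` helper are hypothetical, as are the default endpoint and credentials.

```scala
import org.apache.atlas.AtlasClientV2

// Hypothetical wrapper; forwards to
// AtlasClientV2(String[] baseUrl, String[] basicAuthUserNamePassword).
class AtlasClient(url: String, username: String, password: String)
    extends AtlasClientV2(Array[String](url), Array[String](username, password))

object AtlasClient {
  // Convenience factory for an assumed local Atlas with admin/admin credentials.
  def local(): AtlasClient =
    new AtlasClient("http://localhost:21000", "admin", "admin")
}
```

A subclass like this is a convenient place to add project-specific helper methods later, while all methods of AtlasClientV2 stay available on the instance.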

Now we have to write the code that creates our objects in Atlas so that we can see the lineage. To do this, we have to create two S3 objects and one Spark process. If you check the AWS S3 object model, you will see that it consists of:

aws_s3_bucket
aws_s3_pseudo_dir
aws_s3_object

So we have to register each object and connect them together:
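The registration chain can be sketched as follows: create the bucket, then the pseudo directory that points to the bucket, then the object that points to the pseudo directory. Everything specific here is an assumption to verify against the type definitions on your Atlas server: the `s3://` qualifiedName scheme, the `objectPrefix` attribute, and the relationship attribute names `bucket` and `pseudoDirectory`. The object and function names are hypothetical.

```scala
import org.apache.atlas.AtlasClientV2
import org.apache.atlas.model.instance.{AtlasEntity, AtlasObjectId}
import org.apache.atlas.model.instance.AtlasEntity.AtlasEntityWithExtInfo

object S3Entities {
  // Creates the entity and returns a reference to it by its unique qualifiedName.
  private def register(client: AtlasClientV2, entity: AtlasEntity): AtlasObjectId = {
    client.createEntity(new AtlasEntityWithExtInfo(entity))
    new AtlasObjectId(entity.getTypeName, "qualifiedName", entity.getAttribute("qualifiedName"))
  }

  // Registers bucket -> pseudo directory -> object and returns the object's reference.
  def createS3Object(client: AtlasClientV2, bucket: String, prefix: String, key: String): AtlasObjectId = {
    val bucketQn = s"s3://$bucket"   // assumed naming scheme
    val dirQn    = s"$bucketQn/$prefix"
    val objectQn = s"$dirQn/$key"

    val bucketEntity = new AtlasEntity("aws_s3_bucket")
    bucketEntity.setAttribute("qualifiedName", bucketQn)
    bucketEntity.setAttribute("name", bucket)
    val bucketId = register(client, bucketEntity)

    val dirEntity = new AtlasEntity("aws_s3_pseudo_dir")
    dirEntity.setAttribute("qualifiedName", dirQn)
    dirEntity.setAttribute("name", prefix)
    dirEntity.setAttribute("objectPrefix", dirQn)            // assumed attribute name
    dirEntity.setRelationshipAttribute("bucket", bucketId)   // assumed relationship name
    val dirId = register(client, dirEntity)

    val objectEntity = new AtlasEntity("aws_s3_object")
    objectEntity.setAttribute("qualifiedName", objectQn)
    objectEntity.setAttribute("name", key)
    objectEntity.setRelationshipAttribute("pseudoDirectory", dirId) // assumed relationship name
    register(client, objectEntity)
  }
}
```

Referencing parents by `qualifiedName` rather than by GUID keeps the function independent of the contents of the create response; registering the same bucket or directory twice updates the existing entity instead of duplicating it.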

Now we will also create a function for spark_process:
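A sketch of such a function is given here, assuming the `spark_process` type is installed on the server and inherits `inputs` and `outputs` from the generic Process type. The object name, function name, and the `@spark` suffix in the qualifiedName are hypothetical; your model may also require additional attributes.

```scala
import java.util.Collections

import org.apache.atlas.AtlasClientV2
import org.apache.atlas.model.instance.{AtlasEntity, AtlasObjectId}
import org.apache.atlas.model.instance.AtlasEntity.AtlasEntityWithExtInfo

object SparkLineage {
  // Registers a process entity that links one input data set to one output data set.
  def createSparkProcess(client: AtlasClientV2,
                         name: String,
                         input: AtlasObjectId,
                         output: AtlasObjectId): Unit = {
    val process = new AtlasEntity("spark_process")
    process.setAttribute("qualifiedName", s"$name@spark") // assumed naming scheme
    process.setAttribute("name", name)
    // inputs and outputs are arrays of references to data set entities.
    process.setAttribute("inputs", Collections.singletonList(input))
    process.setAttribute("outputs", Collections.singletonList(output))

    client.createEntity(new AtlasEntityWithExtInfo(process))
  }
}
```

Here `input` and `output` are references to entities that are already registered, for example `new AtlasObjectId("aws_s3_object", "qualifiedName", qualifiedNameOfTheObject)`. Atlas draws the lineage graph from these two attributes.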

Let's call our functions:

As a result, you can open the Apache Atlas UI and find all the objects we created.

You can find the whole project here.


Alexey Artemov

Staff Data Engineer | MLOps | Data Architect | AWS | Databricks | Data Governance. https://www.linkedin.com/in/aartemov/