ToastedBits

Introduction
Requirements
Code
Usage
Neo4j
Conclusion

Introduction

In this post I'll share a Gradle plugin I wrote for our workplace called the Downstream Plugin. Have you ever been writing code and wondered which other projects in your organization might be making use of it? With this plugin we can answer that question! This is especially useful for in-house libraries that get shared across many projects, helping to assess how much "transitive" work is necessary when making changes to upstream projects. I hope others might find it useful in their organizations!

I'd like to thank my managers Matt Hawkes and Dale Huffman for making room in our busy schedules for "innovation weeks"; my team for proving to the larger organization that these innovation weeks are valuable (so we can keep doing them!); and lastly for the permission to release this plugin as open source!

After integrating this plugin into all of our projects, we have collected quite a number of nodes and relationships. Here is an anonymized selection from the graph for one of our shared libraries. Unfortunately, several of our projects failed to specify a version at one point. They have all been fixed since, so I need to exclude them from the results (and eventually remove them from the database). What is shown is mostly snapshot jars floating around the particular version I queried.

Using the Neo4j REST interface we've also been working on a friendlier (and much more performant) web interface than the default neo4j browser shell. Thanks to Ryan Nix for making the first implementation of this web frontend. There are still some core features to work out and permissions to acquire before (hopefully) open sourcing this piece. Until then perhaps this can be a starting point for some of your own ideas on how to use the database!

Requirements

Java, Groovy, Gradle
Neo4j Server

Optional (Strongly Recommended)

Dependency Management Solution (for hosting the plugin)
- Artifactory
- Nexus
- Archiva
- etc

You can use a filesystem to "host" this plugin, but it really shines in an organization that is making use of dependency management tools.

Code

One of the great things about Gradle is that its build scripts are written in Groovy. Not only does this give us a convenient syntax to work with, but more importantly it gives us full access to both the Groovy and Java APIs, along with the whole Java ecosystem of tools. The basic idea of this plugin is to upload the dependencies declared in each project's build script to a central graph database, Neo4j. By integrating this reporting step into our continuous integration solution (Jenkins), we can query how our projects are related to each other and quickly find which projects are using out-of-date libraries, both internal and open source!

In the plugin we make use of Groovy's HTTPBuilder to interact with the Neo4j REST API.

Begin a transaction
Find the node that represents our current project by artifactId, groupId, and version, creating it if it does not exist yet. Then:
1. Clear all outbound (dependency) relations
2. Create a new relation for each dependency
3. Label each relation with the configuration the dependency belongs to
Commit the transaction
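
To make the wire format concrete, here is a minimal standalone sketch of the JSON body that gets POSTed to the transaction endpoint for a single dependency. It uses plain maps and JsonOutput rather than the JsonBuilder closures in the plugin, and the project and dependency coordinates are made up for illustration.

```groovy
import groovy.json.JsonOutput

// Hypothetical coordinates, for illustration only
def project = [name: 'billing-service', group: 'com.example', version: '1.4.0']
def dep = [name: 'shared-lib', group: 'com.example', version: '2.0.0']
def configuration = 'compile'

// One entry per Cypher statement; values are passed as parameters, not inlined
def payload = [statements: [
    [
        // Make sure the upstream node exists
        statement : 'MERGE (up:Project {name:{upName}, group:{upGroup}, version:{upVersion}})',
        parameters: [upName: dep.name, upGroup: dep.group, upVersion: dep.version]
    ],
    [
        // Link the current project to it; the relation type is the configuration name
        statement : 'MATCH (down:Project {name:{projectName}, group:{projectGroup}, version:{projectVersion}}), ' +
                    '(up:Project {name:{upName}, group:{upGroup}, version:{upVersion}}) ' +
                    'MERGE (down)-[:' + configuration + ']->(up)',
        parameters: [projectName: project.name, projectGroup: project.group, projectVersion: project.version,
                     upName: dep.name, upGroup: dep.group, upVersion: dep.version]
    ]
]]

def body = JsonOutput.toJson(payload)
println JsonOutput.prettyPrint(body)
```

The relation type cannot be passed as a Cypher parameter, which is why the configuration name is spliced into the statement text while everything else goes through parameters.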


//TODO: Use the Gradle logger and remove references to println; scoping issues with the logger are blocking this improvement
import groovyx.net.http.*
import static groovyx.net.http.ContentType.*
import static groovyx.net.http.Method.*

buildscript {
	repositories {
		//Your dependency manager here...
		mavenCentral()
	}
	dependencies {
		classpath "org.codehaus.groovy.modules.http-builder:http-builder:0.7.1"
	}
}

//TODO: HTTPBuilder is synchronous, for better performance use AsyncHTTPBuilder - will require stricter handling of callbacks
class DownstreamPlugin implements Plugin<Project> {

	def commitTransaction(commitUrl) {
		def http = new HTTPBuilder(commitUrl)
		http.request(POST, JSON) { req ->
			response.success = { resp, data ->
				println "Commit successful: ${resp.statusLine}"
			}
			response.failure = { resp, data ->
				println "Commit failed: ${resp.statusLine}"
			}
		}
		println "END TRANSACTION"
	}

	def upstreamTransaction(project, transactionUrl, commitUrl) {
		def http = new HTTPBuilder(transactionUrl)

		project.configurations.each { conf ->
			println "Configuration ${conf.name}"
			conf.allDependencies.each { dep ->
				def builder = new groovy.json.JsonBuilder()
				builder.call (
					statements : [
						{
							statement "MERGE (up:Project {name:{upName}, group:{upGroup}, version: {upVersion}})"
							parameters {
								upName "${dep.name}"
								upGroup "${dep.group}"
								upVersion "${dep.version}"
							}
						},
						{
							statement "MATCH (down:Project {name:{projectName}, group:{projectGroup}, version:{projectVersion}}), (up:Project {name:{upName}, group:{upGroup}, version:{upVersion}}) MERGE (down)-[:${conf.name}]->(up)"
							parameters {
								projectName "${project.name}"
								projectGroup "${project.group}"
								projectVersion "${project.version}"
								upName "${dep.name}"
								upGroup "${dep.group}"
								upVersion "${dep.version}"
							}
						}
					]
				)
				http.request (POST, JSON) { req ->
					body = builder.toString()
					response.success = { resp, data ->
						println "Success adding upstream dependency ${dep.group}:${dep.name}:${dep.version}, ${resp.statusLine}"
					}
					response.failure = { resp, data ->
						println "Failure adding upstream dependency ${dep.group}:${dep.name}:${dep.version}, ${resp.statusLine}"
					}
				}
			}
		}

		commitTransaction(commitUrl)
	}

	def launchReportUpstream(project) {
		println "BEGIN TRANSACTION"
		def http = new HTTPBuilder("http://${project.downstream.host}:${project.downstream.port}/db/data/transaction")

		http.request (POST, JSON) { req ->
			def builder = new groovy.json.JsonBuilder()
			builder.call (
				statements : [
					{
						statement "MERGE (proj:Project {name:{projectName}, group:{projectGroup}, version:{projectVersion}})"
						parameters {
							projectName "${project.name}"
							projectGroup "${project.group}"
							projectVersion "${project.version}"
						}
					},
					{
						statement "MATCH (down:Project {name:{projectName}, group:{projectGroup}, version:{projectVersion}})-[rel]->() DELETE rel"
						parameters {
							projectName "${project.name}"
							projectGroup "${project.group}"
							projectVersion "${project.version}"
						}
					}
				]
			)
			body = builder.toString()
			response.success = { resp, data ->
				println "Refresh project node for ${project.name}:${project.group}:${project.version}, ${resp.statusLine}"
				upstreamTransaction(project, resp.headers.'Location', data.commit)
			}
		}
	}

	void apply(Project project) {
		project.extensions.create("downstream", DownstreamPluginExtension)

		//Upload our dependencies to a neo4j server
		project.task("reportUpstream").doLast {
			try {
				def http = new HTTPBuilder("http://${project.downstream.host}:${project.downstream.port}/db/data")
				http.request (POST, JSON) {
					response.success = { resp ->
						launchReportUpstream(project)
					}
					response.failure = { resp -> 
						println "Neo4j does not appear to be running on ${project.downstream.host}:${project.downstream.port}"
					}
				}
			}
			catch(UnknownHostException | ConnectException e) {
				println "Unable to reach ${project.downstream.host}:${project.downstream.port}"
			}
		}

		//Fetch projects who depend on us from a neo4j server
		project.task("showDownstream").doLast {
			def http = new HTTPBuilder("http://${project.downstream.host}:${project.downstream.port}/db/data/cypher")
			try {
				http.request( POST, JSON ) { req ->

					def builder = new groovy.json.JsonBuilder()
					builder.call ( [
						query : "MATCH (up:Project {name:{projectName}, group:{projectGroup}, version:{projectVersion}})<-[rel]-(down) RETURN down.name, down.group, down.version, type(rel)",
						params : {
							projectName "${project.name}"
							projectGroup "${project.group}"
							projectVersion "${project.version}"
						} ]
					)
					body = builder.toString()

					response.success = { resp, json ->
						println "Downstream projects from ${project.name}:${project.group}:${project.version}"
						def depMap = [:]
						json.data.each { d ->
							//Force the type to be a plain old Java String, not a Groovy GString. The ':' characters have special evaluation rules for a map
							def key = new String("${d[0]}:${d[1]}:${d[2]}")
							if(!depMap.containsKey(key)) {
								depMap[key] = []
							}
							depMap[key] << d[3]
						}
						depMap.each { key, value ->
							println "${key}: ${value}"
						}
					}
					response.failure = { resp, json ->
						println "Failed to retrieve downstream dependencies: ${resp.statusLine}"
					}
				}
			}
			catch(UnknownHostException | ConnectException e) {
				println "Unable to reach ${project.downstream.host}:${project.downstream.port}"
			}
		}
	}
}

class DownstreamPluginExtension {
	//Your group's default neo4j server here
	def host = "localhost"
	def port = 7474
}

//Apply the plugin to every project, so multi-project builds only need to apply this script in the root project
if (!hasProperty("_downstreamPluginApplied")) {
	ext["_downstreamPluginApplied"] = true

	allprojects {
		apply plugin: DownstreamPlugin
	}
}

Of course this code may become outdated, especially if anyone out there wishes to enhance it! I've copied the code into a GitHub repository for future enhancements and pull requests.

Usage

Run "gradle reportUpstream" to report upstream dependencies for the current project to the Neo4j database.
Run "gradle showDownstream" to list projects that make use of the current project.

Of course, showDownstream only knows what has been reported to Neo4j, so I strongly recommend integrating reportUpstream as part of your team's continuous integration solution!
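
For reference, here is a rough sketch of what a consuming project's build.gradle might look like. The script URL and the Neo4j hostname are placeholders I made up, so substitute wherever you host the plugin script and your own server. The downstream block sets the host and port properties defined in DownstreamPluginExtension.

```groovy
// build.gradle of the root project

// Assumption: the plugin script above is saved as downstream.gradle and
// served from a location you control (hypothetical URL)
apply from: 'https://repo.example.com/gradle/downstream.gradle'

// The script applies the plugin to every project, so configure the
// extension on every project too (hypothetical hostname)
allprojects {
	downstream {
		host = 'neo4j.example.com'
		port = 7474
	}
}
```

If your Neo4j server runs on localhost:7474, the downstream block can be left out entirely since those are the extension's defaults.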

Neo4j

Getting started with Neo4j is relatively straightforward.

Download the community edition and unpack it to any directory on the system you wish to host it on.
Create a user account to run the service.
Increase the ulimit for open file descriptors on that account to over 40k.
Edit the neo4j-server.conf file and uncomment the listen address line to allow remote access.
Launch the Neo4j server with bin/neo4j start.
Test the installation by accessing the Neo4j host from a browser (default port 7474). It should provide a UI to the database.

You may also wish to set up appropriate init scripts to launch the service. Alternatively, your distribution may provide a package that installs Neo4j for you.

Conclusion

Having a graph of all your team's internal projects is handy for a multitude of reasons, most notably keeping projects in sync with each other and finding projects that rely on deprecated libraries. Since Neo4j can be accessed via a REST API, it is simple to put any kind of frontend on it. For instance, our team is experimenting with a simple jQuery/d3.js visualization frontend. It has a couple of minor issues at the moment, so I will hold off on sharing it until after our next "innovation week". You can also write raw Neo4j Cypher queries against the database for really complicated use cases.
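
As one example of such a query, here is a sketch that builds the request body for finding every project that still depends on a version of a library other than the one you consider current. It reuses the Project nodes and the {param} syntax from the plugin. The library coordinates and the "current" version are hypothetical, and the body would be POSTed to the same /db/data/cypher endpoint that showDownstream uses.

```groovy
import groovy.json.JsonOutput

// Hypothetical library coordinates and "current" version, for illustration only
def library = [name: 'shared-lib', group: 'com.example', current: '2.0.0']

// Match any project pointing at a version of the library that is not the current one
def query = 'MATCH (down:Project)-[rel]->(up:Project {name:{upName}, group:{upGroup}}) ' +
            'WHERE up.version <> {current} ' +
            'RETURN down.group, down.name, down.version, up.version, type(rel)'

def body = JsonOutput.toJson([
    query : query,
    params: [upName: library.name, upGroup: library.group, current: library.current]
])
println body
```

Each returned row names a downstream project, the stale version it points at, and the configuration that pulls it in, which is usually enough to decide who needs a nudge to upgrade.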