Carola Lilienthal’s talk about architecture and technical debt at Herbstcampus 2017 reminded me that I wanted to implement some of the examples from her book “Long-lived software systems” (available only in German) with the structural analysis tool jQAssistant. The visualizations of the dependencies between different business subdomains in particular seemed like a great starting point to try out some stuff…
Recently I came across a great visualization of the classes imported by one class, made by Mike Bostock with his Hierarchical Edge Bundling in D3. I wondered how hard it would be to reimplement this visualization with jQAssistant and Neo4j and show actual dependencies between Java types. So let’s have a look!
In software development, it’s all about knowledge, both of the technology and of the business domain. But we software developers transfer only a small part of this knowledge into code, and code alone isn’t enough to get a glimpse of the bigger picture and the interrelations of all the different concepts. There will always be developers who know more about some concept than what is laid down in the source code. It’s important to make sure that this knowledge is distributed over more than one head…
There are multiple reasons for analyzing a version control system like your Git repository. See for example Adam Tornhill’s book “Your Code as a Crime Scene” or his upcoming book “Software Design X-Rays” for plenty of inspiration:
You can analyze knowledge islands, distinguish frequently changing code from stable code, and identify code that is temporally coupled to other code.
Having the necessary data for those analyses in a Pandas DataFrame gives you many possibilities to quickly gain insights into the evolution of your software system in various ways…
I recently watched Michael Feathers’ talk about Strategic Code Deletion. Michael said (among other very good things) that if we want to delete code, we have to know the actual usage of our code.
In this post, I want to show you how you can very easily gather some data and create insights about unused code.
All the work before was just there to get a nice graph model that feels more natural. Now comes the analysis part: As mentioned in the introduction, we don’t only want the hotspots that signal that something awkward happened, but also
– the trigger of the hotspot in our application, combined with
– the information about the entry point (i.e. where in our application the problem happens) and
– (optionally) the request that causes the problem (to be able to localize the problem)…
I show how I determine the parts of an application that trigger unnecessary SQL statements by using graph analysis of a call tree…
You all know word clouds!
They give you a quick overview of the top topics of your blog, book, source code, or presentation. The latter was the one that got me thinking: How cool would it be if you could start your presentation with a word cloud of the main topics of your talk…
Reading data from a software version control system can be pretty useful if you want to answer some evolutionary questions like
– Who are our main committers to the software?
– Are there any areas in the code that only one developer knows about?
– What were we working on over the last months?
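Questions like these map fairly directly onto a few Pandas operations once the commit data is in a DataFrame. The following is only a minimal sketch: the column names (`author`, `file`, `timestamp`), the 90-day window, and the handful of sample rows are made up for illustration and are not taken from a real repository.

```python
import pandas as pd

# Hypothetical sample data: one row per (commit, changed file).
commits = pd.DataFrame({
    "author": ["alice", "bob", "alice", "alice", "carol"],
    "file": ["core/a.py", "core/a.py", "ui/b.py", "ui/b.py", "core/a.py"],
    "timestamp": pd.to_datetime(
        ["2017-01-10", "2017-02-01", "2017-06-01", "2017-07-15", "2017-08-01"],
        utc=True,
    ),
})

# Who are our main committers? Count the rows per author.
main_committers = commits["author"].value_counts()

# Which files were only ever touched by a single developer?
authors_per_file = commits.groupby("file")["author"].nunique()
single_author_files = authors_per_file[authors_per_file == 1].index.tolist()

# What was worked on recently? Here: the 90 days before the latest commit
# (the window size is an arbitrary assumption).
cutoff = commits["timestamp"].max() - pd.Timedelta(days=90)
recent_files = commits.loc[commits["timestamp"] >= cutoff, "file"].value_counts()
```

Counting rows per author is a rough proxy only; depending on the question, counting distinct commits or changed lines may be more appropriate.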
Software version control systems contain a huge amount of evolutionary data. It’s very common to mine these repositories to gain some insight into how the development of a software product works. But that data needs some preprocessing first to avoid flawed analyses.
That’s why I show you how to read the commit information of a Git repository into a Pandas DataFrame!
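One possible way to do this, sketched here under a few assumptions, is to let `git log` print one tab-separated line per commit and parse that text into a DataFrame. The helper names `parse_git_log` and `read_git_log`, the chosen fields, and the column names are hypothetical and not necessarily what the full post uses.

```python
import subprocess

import pandas as pd

# Assumed output format: abbreviated hash, author name, strict ISO 8601
# author date and subject, separated by tabs (%x09).
GIT_LOG_FORMAT = "%h%x09%an%x09%aI%x09%s"


def parse_git_log(raw: str) -> pd.DataFrame:
    """Turn the tab-separated `git log` output into a DataFrame."""
    # Split at most three times so that tabs inside the subject survive.
    rows = [line.split("\t", 3) for line in raw.splitlines() if line.strip()]
    df = pd.DataFrame(rows, columns=["sha", "author", "timestamp", "subject"])
    # Normalize all time zone offsets to UTC so timestamps are comparable.
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
    return df


def read_git_log(repo_path: str) -> pd.DataFrame:
    """Run `git log` in the given repository and parse its output."""
    raw = subprocess.run(
        ["git", "-C", repo_path, "log", f"--pretty=format:{GIT_LOG_FORMAT}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_git_log(raw)
```

Calling `read_git_log(".")` inside a Git working copy would then return one row per commit. Keeping the parsing separate from the `git` call makes it easy to check the preprocessing against a small, hand-written log snippet.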