Reading a Git repo’s commit history with Pandas efficiently

There are many reasons to analyze the history stored in a version control system such as your Git repository. For plenty of inspiration, see Adam Tornhill’s book “Your Code as a Crime Scene” or his upcoming book “Software Design X-Rays”:

You can find knowledge islands, distinguish frequently changing code from stable code, and identify code that is temporally coupled to other code.

Having the data for those analyses in a Pandas DataFrame lets you quickly gain insights into the evolution of your software system in various ways…
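To make the idea concrete, here is a minimal sketch of what commit data in a DataFrame could look like. It is only an illustration under stated assumptions: the sample text is made up, it mimics the output of `git log --pretty=format:"%h|%an|%ad" --date=iso-strict`, and the column names `sha`, `author` and `timestamp` are chosen here for readability.

```python
import io

import pandas as pd

# Hypothetical sample in the shape produced by
#   git log --pretty=format:"%h|%an|%ad" --date=iso-strict
# In a real analysis this text would come from your own repository.
SAMPLE_LOG = """\
a1b2c3d|Alice|2017-03-01T10:15:00+01:00
e4f5a6b|Bob|2017-03-02T09:00:00+01:00
c7d8e9f|Alice|2017-03-05T16:30:00+01:00
"""

# Read the pipe-separated lines; keep everything as text first so that
# short hashes are never misread as numbers.
commits = pd.read_csv(
    io.StringIO(SAMPLE_LOG),
    sep="|",
    names=["sha", "author", "timestamp"],
    dtype=str,
)

# Convert the ISO 8601 strings into timezone-aware timestamps (UTC).
commits["timestamp"] = pd.to_datetime(commits["timestamp"], utc=True)

# A first, very rough hint at who knows the code: commits per author.
commits_per_author = commits["author"].value_counts()
print(commits_per_author)
```

The `|` separator is a simplification: it breaks as soon as an author name contains that character, so a real import would need a more robust delimiter.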