Here is an older cheat sheet of mine that I recently rediscovered. It collects a few git tricks for extracting data from source code repositories, either for data-driven software analytics or for getting some initial insights into a software development project.

Start mapping author names to persons via a .mailmap file

Command

git shortlog -sne | cut -f2 | sort > .mailmap

Example content of .mailmap

Ameya Pandilwar <ameya@ccs.neu.edu>
Ameya Pandilwar <ameya@pandilwar.com>
AndrejGajdos <Andrej1>
Antoine Rey <antoine.rey@free.fr>
Antoine Rey <antoine.rey@gmail.com>
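
The generated file is only a starting point: git merges two identities only once a line names the canonical identity followed by the address used in the commits. A minimal sketch of that second step, assuming that two entries with the same name belong to the same person (which still needs a manual check). The function name mailmap_merge_by_name is made up for this illustration; it reads "Name <email>" lines and maps every later address of a name onto the first one seen.

```shell
# Hypothetical helper (not part of git): read "Name <email>" lines on stdin.
# The first address seen for a name is treated as canonical; every further
# address of the same name is emitted in the .mailmap form
#   Proper Name <proper@email> <commit@email>
mailmap_merge_by_name() {
  awk '
    {
      lt = index($0, "<")
      if (lt == 0) next                  # skip lines without an address
      name = substr($0, 1, lt - 1)
      sub(/[ \t]+$/, "", name)           # trim the blank before "<"
      email = substr($0, lt)
      if (!(name in canonical)) {
        canonical[name] = email
        print name " " email
      } else if (email != canonical[name]) {
        print name " " canonical[name] " " email
      }
    }
  '
}

# Possible usage (review the result by hand before committing it):
#   git shortlog -sne | cut -f2 | sort | mailmap_merge_by_name > .mailmap
```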

Count the number of commits per file

Command

git log --no-merges --no-renames --numstat --pretty=format:"" -- '*.java' | cut -d$'\t' -f3 | grep -v '^$' | sort | uniq -c | sort -n

Example output

     16 src/test/java/org/springframework/samples/petclinic/service/AbstractClinicServiceTests.java
     18 src/main/java/org/springframework/samples/petclinic/repository/jdbc/JdbcVetRepositoryImpl.java
     18 src/main/java/org/springframework/samples/petclinic/web/PetController.java
     19 src/main/java/org/springframework/samples/petclinic/web/OwnerController.java
     23 src/main/java/org/springframework/samples/petclinic/repository/jdbc/JdbcOwnerRepositoryImpl.java
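
The --numstat output also carries the added and deleted line counts in its first two columns, which the command above throws away. A rough sketch that keeps them, so each file gets a commit count plus its total churn. The function name numstat_summary and the output layout are assumptions made for this example only.

```shell
# Hypothetical helper: read `git log --numstat --pretty=format:""` output on
# stdin (added<TAB>deleted<TAB>path) and print, per path,
#   commits<TAB>added<TAB>deleted<TAB>path
# sorted by commit count. Binary files report "-" and count as 0 lines.
numstat_summary() {
  awk -F '\t' '
    NF == 3 {
      commits[$3]++
      if ($1 != "-") added[$3] += $1
      if ($2 != "-") deleted[$3] += $2
    }
    END {
      for (path in commits)
        printf "%d\t%d\t%d\t%s\n", commits[path], added[path], deleted[path], path
    }
  ' | sort -t "$(printf '\t')" -k1,1n -k4,4
}

# Possible usage:
#   git log --no-merges --no-renames --numstat --pretty=format:"" -- '*.java' | numstat_summary
```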

Search for commit activity per author

Command

git shortlog -ns -- '*Owner*.java'

Example output

    45  michaelisvy
    12  Antoine Rey
     6  Keith Donald
     2  Tomas Repel
     1  Colin But
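
Raw commit counts are easier to compare across projects when they are turned into shares. A small sketch, assuming input in the "count, whitespace, name" layout that git shortlog -ns prints; the function name author_share is invented for this example.

```shell
# Hypothetical helper: read `git shortlog -ns` output on stdin and print each
# author's share of the listed commits as a percentage, keeping the input order.
author_share() {
  awk '
    NF {
      n++
      counts[n] = $1
      total += $1
      name = $0
      sub(/^[ \t]*[0-9]+[ \t]+/, "", name)   # drop the leading count
      names[n] = name
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%5.1f%%  %s\n", 100 * counts[i] / total, names[i]
    }
  '
}

# Possible usage:
#   git shortlog -ns -- '*Owner*.java' | author_share
```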

Search for awkward things

e.g. “todos” and “fixmes” (-i also matches TODO and FIXME)

git grep -i --perl-regexp "// *(todo|fixme)"

e.g. commented-out code

git grep --perl-regexp " //.*(=|;)" -- '*.java'
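
To see where such findings pile up, git grep can report a match count per file with -c instead of the matching lines. A minimal sketch for ranking that output; the function name rank_grep_counts is an assumption of this example, not a git feature.

```shell
# Hypothetical helper: read `git grep -c` output ("path:count") on stdin and
# print "count<TAB>path", highest count first. The count is split off at the
# LAST colon so that paths containing ":" are still handled.
rank_grep_counts() {
  awk '
    {
      i = match($0, /:[0-9]+$/)
      if (i) print substr($0, i + 1) "\t" substr($0, 1, i - 1)
    }
  ' | sort -k1,1nr -k2
}

# Possible usage:
#   git grep -c -i --perl-regexp "// *(todo|fixme)" | rank_grep_counts
```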

Browse through the commit messages

Command

git log --pretty=format:"%h %s"

Example output

0c24083 removed appserver-specific files
feca50d added jQueryUI
95cb32d used tag c:out for EL to prevent HTML injection
d88b565 migrated all dates to joda time
c4b5a98 navbar, reorganized JSP folders
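
Reading thousands of subjects gets tedious, so a quick tally of the first word of each message can hint at what kind of work dominates (added, removed, fixed and so on). A rough sketch, assuming the "%h %s" layout from the command above; subject_first_words is a made-up name.

```shell
# Hypothetical helper: read "<hash> <subject>" lines on stdin and count how
# often each first subject word occurs (lowercased, punctuation stripped).
# Note: this simple filter also strips non-ASCII letters.
subject_first_words() {
  awk '
    NF >= 2 {
      w = tolower($2)
      gsub(/[^a-z0-9]/, "", w)
      if (w != "") counts[w]++
    }
    END { for (w in counts) print counts[w], w }
  ' | sort -k1,1nr -k2,2
}

# Possible usage:
#   git log --pretty=format:"%h %s" | subject_first_words | head
```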

Count the last-changed lines per author across all Java source code files

Command

find . -type d -name ".git" -prune -o -type f -iname "*.java" -print0 | xargs -0 -n1 git blame -w -f -C -M --date=format:"|" | cut -d"|" -f 1 | cut -d"(" -f 2 | sed 's/\s*$//' | sort | uniq -c | sort -n

Example output

    152 Gordon Dickens
    364 Antoine Rey
    418 Colin But
   1226 Costin Leau
   1718 michaelisvy
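
Cutting the author out of the human-readable blame output depends on the "|" and "(" characters not appearing in paths or names. An alternative sketch, assuming git blame is run with --line-porcelain, which repeats an "author <name>" header for every line and is easier to parse. The function name count_blame_authors is invented for this example.

```shell
# Hypothetical helper: read `git blame --line-porcelain` output on stdin and
# print "lines<TAB>author", most lines first. Only header lines start with
# "author "; source lines are prefixed with a tab and are ignored.
count_blame_authors() {
  sed -n 's/^author //p' \
    | awk '{ lines[$0]++ } END { for (a in lines) print lines[a] "\t" a }' \
    | sort -k1,1nr -k2
}

# Possible usage for a single file (path is a placeholder):
#   git blame -w -C -M --line-porcelain path/to/File.java | count_blame_authors
```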
