Analysing the CVS log to summarise hours spent


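CVS is based on RCS, which is file-based. That means you don’t really have revisions that span multiple files, and the output of the “cvs log” command reflects that problem: every commit is listed again for each file it touched. The following pipeline produces a more useful summary, with one line per commit containing the date and the commit comment.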

cvs log > cvslog # puts log in this file
cat cvslog |
grep '^date: ' -A3 | # we assume max 3 lines of comments
while read -r line; do echo -n "$line |"; done | # all in one line, separated by |
sed 's/date: /\ndate: /g'| # a line per date
sed 's/[-=]\{2,\}/\n/g' | # removing line separators
sed 's/^[- |]*//g'| # remove useless - and |
grep -v '^revision '| # there exist some more boring lines
sed 's/;  author:.*; |/;/g'| # we aren't interested in author and id etc.
sed 's/ |$//g'| # remove ending |
sort -u | # sort (by date)
python guniq.py | # show only unique lines
cat > cvslog.1
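
For comparison, here is a rough Python 3 sketch of the same idea: pick out each "date:" line, glue the following comment lines onto it, and keep every resulting line only once. It is not a drop-in replacement for the pipeline above. The function name summarise_cvs_log, the max_comment_lines parameter and the exact output format are made up for this illustration, and the sketch assumes that "cvs log" separates revisions with a line of dashes and files with a line of equals signs.

```python
import re

# Separator lines in `cvs log` output: a run of dashes between revisions,
# a run of equals signs between files (assumed format, see text above).
SEPARATOR = re.compile(r"^(-{20,}|={20,})\s*$")


def summarise_cvs_log(text, max_comment_lines=3):
    """Return sorted, de-duplicated 'date: ...; comment' lines (hypothetical helper)."""
    entries = set()
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line.startswith("date: "):
            continue
        # Keep only the timestamp; drop author, state and the other fields.
        date = line[len("date: "):].split(";", 1)[0].strip()
        comment = []
        # Collect the comment up to the next separator (or the line limit).
        while i < len(lines) and len(comment) < max_comment_lines:
            if SEPARATOR.match(lines[i]):
                break
            if lines[i].strip():
                comment.append(lines[i].strip())
            i += 1
        # The set removes the copies that appear once per touched file.
        entries.add("date: " + date + "; " + " | ".join(comment))
    return sorted(entries)
```

Calling summarise_cvs_log(open("cvslog").read()) on the file written in the first step should give roughly the same lines as cvslog.1, as long as the log really has the layout assumed here.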

You need guniq.py, which works like the Unix command uniq, except that it removes duplicates found anywhere in the input, whereas uniq only removes duplicates on adjacent lines.

guniq.py:

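#!/usr/bin/python
import sys

a = []  # lines already printed, in input order
while True:
    l = sys.stdin.readline()
    if l == '':  # an empty string means end of input
        break
    if not a.__contains__(l):
        sys.stdout.write(l)  # same output as "print l,", also works on Python 3
        a.append(l)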
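
To make the difference between uniq and guniq.py concrete, here is a small Python 3 sketch that puts the two behaviours side by side. The function names adjacent_unique and global_unique are invented for this example and do not come from any library.

```python
def adjacent_unique(lines):
    """Like `uniq`: drop a line only if it equals the line just before it."""
    result = []
    for line in lines:
        if not result or result[-1] != line:
            result.append(line)
    return result


def global_unique(lines):
    """Like guniq.py: drop a line if it appeared anywhere earlier in the input."""
    seen = set()  # hypothetical helper state: every line emitted so far
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

With the input a, b, a, a, c, b, the first function keeps a, b, a, c, b (only the doubled a collapses), while the second keeps a, b, c.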

The whole thing nearly got me to boot into Windows, because TortoiseCVS/TortoiseSVN is a really cool, awesome thingy!

  1. #1 by panzi on August 15th, 2008

    This Python script is more Pythonic, more concise, and performs better:

    #!/usr/bin/env python
    import sys
    a = set()
    for l in sys.stdin:
        if not l in a:
            print l,
            a.add(l)

    :)

  2. #2 by panzi on August 15th, 2008

    REPLY:
    Oh, I think even more pythonic would be:

    #!/usr/bin/env python
    import sys
    a = set()
    for l in sys.stdin:
        if l not in a:
            print l,
            a.add(l)

    :)
