the brown-dragon blog

Importing SVN to GIT

2008-12-29

I've been trying out Git to see if I should move my SCM solution (currently in SVN) to it.

The current solution works fine. I have scripts to keep track of merges, which are traditionally the biggest pain point in SVN. I also do issue-based tracking of change-sets, which gives my users a lot of flexibility. The UI is a mix of Tk and TortoiseSVN.

I've been running about 80 users (around 10 teams) on it and it works great.

The biggest problem users face is merge conflicts. While I take care of reflected merges and directional issues, there's only so much I can do. Given all the hype around Git, I'm hoping it will help with this. If it gives me a dramatic improvement, I'll re-write the entire system in Git. And maybe Lua instead of Python... (yay!)

So anyhow, I've started moving some repositories from Subversion to Git. I work on Windows and ran into two problems.

So, instead of chasing down a fix, I spent an hour and wrote my own Subversion to Git importer. I wrote it in Chicken Scheme and it turned out to be pretty simple.

It's easy to use, and it should be easy to understand and customize if needed.

Here it is:

svn2git.scm

;; This utility parses a subversion log dump and creates a DOS batch file   ;;
;; that imports every log entry into git.                                   ;;
;; It does not handle branches and tags but those should be easy to add     ;;
;; if needed.                                                               ;;
;;                                                                          ;;
;; Q: Why not just use git-svnimport?                                       ;;
;; A: I don't have it on my windows machine. Besides this took about 30     ;;
;;    mins and was fun to do. Plus I can customize it to do *exactly*       ;;
;;    whatever I want.                                                      ;;
;;                                                                          ;;
;; Usage:                                                                   ;;
;;      1. Check out the subversion repository into a folder                ;;
;;         using "svn co <url> -r0 <folder>"                                ;;
;;      2. Run "svn log --verbose -r 0:HEAD <folder>" and save the output.  ;;
;;      3. Edit the log if needed to keep only the log entries you want.    ;;
;;      4. Load svn2git.scm into Chicken Scheme and run                     ;;
;;           (svn-2-git <log output> <folder> "somebatchfile.bat")          ;;
;;         Eg:                                                              ;;
;;           (svn-2-git "svn-logdump" "myrepo" "go-to-git.bat")             ;;
;;      5. Set EMAIL, GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME,                 ;;
;;         and GIT_COMMITTER_EMAIL if needed.                               ;;
;;      6. Run the generated batch file to import.                          ;;
;;      7. Done! Clean up the log output file and the generated batch       ;;
;;          file. The .git repository is now ready.                         ;;
;;                                                                          ;;
;; http://www.the-brown-dragon-blog.com                                     ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(require 'posix) ;; Needed for (string-match ..)

(define parse-svn-log #f)
(let ((log-paths '()))
  (set! parse-svn-log (lambda (_file _process)
    ;; Parses the svn log file and generates a list   ;;
    ;;        (rev author stamp msgline1 msgline2 ..) ;;
    ;; which it passes to the _process function.      ;;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    (let ((f (read-lines _file))
          (er (lambda (st ln)
                (error (format "Failed parsing ~S in state ~S!" ln st))))
          (le '()))
      (let loop ((cur (car f))
                 (rst (cdr f))
                 (state 'separator))
        (case state
          ((separator) (if (not (string-match "-+" cur)) (er state cur))
           (set! le '())
           (set! state 'revinfo))
          ((revinfo)
           (let ((m (string-match "r(\\d+) \\| (.*) \\| (.*) \\|.*" cur)))
             (if (not m) (er state cur))
             (set! le (list (list-ref m 1) (list-ref m 2) (list-ref m 3))))
           (set! state 'changed-paths-header))
          ((changed-paths-header)
           (if (not (string=? cur "Changed paths:"))(er state cur))
           (set! log-paths '())
           (set! state 'changed-paths))
          ((changed-paths)
           (if (string=? cur "")
               (begin
                 (set! le (append le (cons (reverse log-paths) '())))
                 (set! state 'msg))
               (begin
                 (if (string-match ".*:.*" cur) ; there is a path to check
                     (set! cur                  ; strip (from /...$
                           (cadr (string-match
                                  "(.*) \\(from /.*\\)" cur))))
                 (set! cur (substring cur 5))
                 (set! log-paths (cons cur log-paths)))))
          ((msg)
           (if (string-match "-+" cur)
               (begin
                 (set! state 'end-one)
                 (set! rst (cons cur rst)))
               (set! le (append le (cons cur '())))))
          ((end-one)
           (let ((chk #f))
             (for-each (lambda (l)
                         (if (not chk)
                             (if (not (string-match "[ \t]*" l))
                                 (set! chk #t))))
                       (cddddr le))
             (when (not chk)
                   (display (string-append "No log message for revision "
                                           (car le)
                                           "! Forcing to \"(no log message)\""))
                   (newline)
                   (set! le (append le (list "(no log message)")))))
           (_process le)
           (set! state 'separator)
           (set! rst (cons cur rst)))
          (else (er state cur)))
        (if (not (eq? rst '()))
            (loop (car rst) (cdr rst) state))))))
)

(define (bcmd _cmd _batchfile)
  ;; Writes a given command to a batchfile along with error checking ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (write-line _cmd _batchfile)
  (write-line "if errorlevel 1 exit /B" _batchfile))

(define (flp-stepup _pth)
  ;; Steps up one path element  ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (if (not (string=? "/" _pth))
      (begin
        (set! _pth (cadr (string-match "(.*)/[^/]*" _pth)))
        (if (string=? "" _pth) (set! _pth "/"))))
  _pth)

(define (find-lowest-path _pthlst)
  ;; Finds the lowest common path of the given set of paths ;;
  ;; This is useful to update only what is required in svn. ;;
  ;; Otherwise svn update takes too long.                   ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (let ((lp (car _pthlst)))
    (for-each
     (lambda (pth)
       (let loop ()
         (when (not (string=? lp pth))
               (if (> (string-length lp) (string-length pth))
                   (set! lp (flp-stepup lp))
                   (set! pth (flp-stepup pth)))
               (loop))))
     _pthlst)
    lp))

(define (import-wscm-revs _logelem _importdir _batchfile _gc?)
  ;; Generates a batch file containing commands for importing   ;;
  ;; all versions into git.                                     ;;
  ;; Also creates a large list of log message dumps (and auto-  ;;
  ;; cleans after running).                                     ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (letrec ((svn-cmd (string-append "svn update -r" (car _logelem)))
           (bp #f)
           (lp (find-lowest-path (cadddr _logelem))))
    (set! svn-cmd (string-append svn-cmd " ../" _importdir lp))
    (bcmd svn-cmd _batchfile)
    (if _gc?  (bcmd "git-gc" _batchfile))
    (set! bp (string-match "(/trunk/[^/]*)(/.*)?" lp))
    (if (not bp)
        (set! bp (string-match "(/branches/[^/]*/[^/]*)(/.*)?" lp)))
    (if (not bp)
        (set! bp (string-match "(/workspaces/[^/]*)(/.*)?" lp)))
    (if (and bp
             (not (substring=? "[Ignore:" (car (cddddr _logelem)))))
        (_import-wscm-revs _logelem _importdir _batchfile (cadr bp)))))

(define (_import-wscm-revs _logelem _importdir _batchfile _bp)
  (letrec ((svn-cmd (string-append "svn update -r" (car _logelem)))
           (cp-cmd "xcopy /S/Y ")
           (clean-cmd "rm -rf ./*")
           (msg-file (string-append _importdir "-commitmsg-" (car _logelem)))
           (git-env1 (format "set GIT_AUTHOR_NAME=~S" (cadr _logelem)))
           (git-env2 (format "set GIT_AUTHOR_DATE=~S" (caddr _logelem)))
           (git-env3 (format "set GIT_COMMITTER_DATE=~S" (caddr _logelem)))
           (git-env4 (format "set EMAIL=~A@tallysolutions.com" (cadr _logelem)))
           (git-cmdms "git-checkout master")
           (git-cmdbr "git-branch ")
           (git-chkot "git-checkout ")
           (git-cmd1 (string-append "git-add ."))
           (git-cmd2 (string-append
                      "git-commit -q -a --allow-empty --no-verify "
                      "-F ../" msg-file))
           (git-unenv1 "set GIT_AUTHOR_NAME=")
           (git-unenv2 "set GIT_AUTHOR_DATE=")
           (git-unenv3 "set GIT_COMMITTER_DATE=")
           (git-unenv4 "set EMAIL=")
           (bp #f)
           (brn #f)
           (del-msg (string-append "del ..\\" msg-file)))
    (set! brn (string-translate
               (cadr (string-match "/[^/]*/(.*)" _bp)) #\/ #\-))
    (set! cp-cmd (string-append
                  cp-cmd
                  (string-translate (string-append "../" _importdir _bp)
                                    #\/ #\\)
                  " ."))
    (set! git-cmdbr (string-append git-cmdbr brn))
    (set! git-chkot (string-append git-chkot brn))
    (call-with-output-file msg-file
      (lambda (port)
        (for-each (lambda (l) (write-line l port)) (cddddr _logelem))))
    (write-line git-cmdms _batchfile)
    (write-line git-cmdbr _batchfile)
    (bcmd git-chkot _batchfile)
    (bcmd clean-cmd _batchfile)
    (bcmd cp-cmd _batchfile)
    (bcmd git-env1 _batchfile)
    (bcmd git-env2 _batchfile)
    (bcmd git-env3 _batchfile)
    (bcmd git-env4 _batchfile)
    (bcmd git-cmd1 _batchfile)
    (bcmd git-cmd2 _batchfile)
    (bcmd git-unenv1 _batchfile)
    (bcmd git-unenv2 _batchfile)
    (bcmd git-unenv3 _batchfile)
    (bcmd git-unenv4 _batchfile)
    (bcmd del-msg _batchfile)))

(define (write-header _importdir _batchfile)
  ;; Writes a header to create the git repository and the .gitignore file. ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (bcmd (string-append "mkdir " _importdir "-git") _batchfile)
  (bcmd (string-append "cd " _importdir "-git") _batchfile)
  (bcmd "git-init" _batchfile)
  (bcmd "echo .svn/> .gitignore" _batchfile)
  (bcmd "git add .gitignore" _batchfile)
  (bcmd "git commit -q -m \"initial check-in\"" _batchfile))

(define (write-footer _batchfile)
  ;; Writes out the footer of the batch file ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (bcmd "git-gc" _batchfile)
  (bcmd "cd .." _batchfile))

(define (svn-2-git _logoutputfile _importdir _batchfile)
  ;; Wrapper that puts everything together ;;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  (let ((gc? 0))
    (call-with-output-file _batchfile
      (lambda (batchfile)
        (write-header _importdir batchfile)
        (parse-svn-log _logoutputfile
                       (lambda (logelem)
                         (set! gc? (+ gc? 1))
                         (import-wscm-revs logelem _importdir batchfile
                                         (eq? 0 (modulo gc? 10))))) ;gc every 10
        (write-footer batchfile)))))
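
The least obvious helper in the listing is probably find-lowest-path, which finds the deepest directory shared by all the changed paths of a revision so that svn update only has to touch that subtree. Here is a minimal standalone sketch of the same idea, written without the regex dependency. The names parent-path, common-path and common-path* are made up for this illustration and are not part of svn2git.scm, the sample paths are hypothetical, and it assumes absolute paths as they appear in the svn log.

```scheme
;; Standalone sketch of the "lowest common path" idea.
;; parent-path, common-path and common-path* are hypothetical names
;; used only for this illustration.

(define (parent-path pth)
  ;; Drops the last path element:
  ;;   "/trunk/a/b" -> "/trunk/a",  "/trunk" -> "/",  "/" -> "/"
  (let loop ((i (- (string-length pth) 1)))
    (cond ((<= i 0) "/")
          ((char=? (string-ref pth i) #\/) (substring pth 0 i))
          (else (loop (- i 1))))))

(define (common-path a b)
  ;; Shortens whichever path is longer (or b on a tie) until both match.
  (cond ((string=? a b) a)
        ((or (string=? a "/") (string=? b "/")) "/")
        ((> (string-length a) (string-length b))
         (common-path (parent-path a) b))
        (else
         (common-path a (parent-path b)))))

(define (common-path* paths)
  ;; Folds common-path over a non-empty list of paths.
  (let loop ((acc (car paths)) (rest (cdr paths)))
    (if (null? rest)
        acc
        (loop (common-path acc (car rest)) (cdr rest)))))

;; Hypothetical changed paths from a single revision.
(display (common-path* '("/trunk/app/src/main.c"
                         "/trunk/app/src/util/str.c"
                         "/trunk/app/README")))
(newline)
;; prints /trunk/app
```

Comparing whole path elements rather than raw string prefixes matters here: "/trunk/app" and "/trunk/apple" share the prefix "/trunk/app", but their common directory is "/trunk", which is what the step-up approach returns.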
