Textconv

From Git SCM Wiki
Jump to: navigation, search

Contents


Presentation

Git is able to use an external driver which converts the real content of an element into something else, preferably more useful for the user. Thus the git machinery (diff, blame ...) can operate on converted data.

Git diff on binary files used to only detect if the files differ and git blame would just show blame on binary content. Thanks to textconv support, git diff and blame on binary files now give an understandable and usable result.

Textconv support for "git diff" is available for Git 1.6.1 or later, and support for "git blame" starts from 1.7.2. A patch is currently in progress to add textconv support for "git gui blame": [1][2]

How to use

Textconv driver configuration

Textconv drivers must be specified for the files you want to convert.

.gitconfig

~/.gitconfig must indicate the command to execute for the the textconv driver:

[diff "<driver_name>"]
      textconv=<command>

Here is an example:

[diff "odf"]
      textconv=odt2txt

odt2txt command is specified for the odf driver, it is a simple converter from OpenDocument to plain text.

Please note that, in this case, odt2txt must have been installed.

.gitattributes

.gitattributes must associate files to a driver:

<file pattern> diff=<driver_name>

Here the continuation of the previous example:

*.odf diff=odf
*.odt diff=odf
*.odp diff=odf

It associates the odf driver for the odf, odt and odp files.

To have more information about how to use git to track OpenDocument (OpenOffice, Koffice) files, please refer to this article.

Blame and diff

Now, git blame, git gui blame and git diff will convert files if they have a textconv driver.

If you want to disable textconv:

  • for git diff and git blame: just use the --no-textconv option.
  • for git gui blame: disable the "Use TextConv for Diffs and Blames" in the File Viewer options. For more details, please look at the demo.

Demo

Let's take an example with the file textconv_demo.odt.

git blame

(patch in progress)

The command : git blame textconv_demo.odt gives the following result:

2fad25a7 (Diane Gasselin    2010-06-09 16:50:57 +0200  1) 
2fad25a7 (Diane Gasselin    2010-06-09 16:50:57 +0200  2)   Here is an example of how textconv works with blame.
2fad25a7 (Diane Gasselin    2010-06-09 16:50:57 +0200  3) 
1b6ec1d5 (Axel Bonnet       2010-06-09 16:55:54 +0200  4)   This file has been modified by several users.
1b6ec1d5 (Axel Bonnet       2010-06-09 16:55:54 +0200  5) 
1b6ec1d5 (Axel Bonnet       2010-06-09 16:55:54 +0200  6)   Instead of having a binary output, it presents a nice and
1b6ec1d5 (Axel Bonnet       2010-06-09 16:55:54 +0200  7)   readable one.
1b6ec1d5 (Axel Bonnet       2010-06-09 16:55:54 +0200  8) 
00000000 (Not Committed Yet 2010-06-09 17:46:18 +0200  9)   You only have to specify a driver in your configuration files.
00000000 (Not Committed Yet 2010-06-09 17:46:18 +0200 10) 


If not using textconv (git blame --no-textconv textconv_demo.odt), here is the kind of result you will have on binary files.

00000000 (Not Committed Yet 2010-06-11 15:10:19 +0200  1) PK^C^D^T^@^@^H^@^@0|<C9><^<C6>2^L'^@^@^@'^@^@^@^H^@^@^@mimetypeapplication/vnd.oasis.opendocument.textPK^C^D^T^@^@^H^@^@0|<C9><^@^@^@^@
00000000 (Not Committed Yet 2010-06-11 15:10:19 +0200  2) ^O<EB><AC>    6<D6>^Mϓ<9D>Q<D2>7Vt<C3>G|i<AC><AF><D3>{<C3><C3><F7>^F^_U<83><A1><9F>^WIZ^R<D2><U+05CA>u<B9>m^B^ڐ+^\w<U+063E>X<CA>i<BB><97>
2fad25a7 (Diane Gasselin    2010-06-09 16:50:57 +0200  3) <AD>^O<9D>"tr<F8>5M<FE><D6>FA>

git gui blame

(patch in progress)

The command : git gui blame textconv_demo.odt gives the following result:

Demo git gui blame.jpg


To disable textconv, launch git gui blame on your binary file.

Then, in your file viewer, go to Edit>Options... and deselect the option "Use Textconv For Diffs and Blames".

You will finaly get as result like:

Demo git gui blame no textconv.jpg


git diff

(Git 1.6.1 or later)

We are now going to see the results for git diff with and without textconv.

"git diff" gives:

diff --git a/textconv_demo.odt b/textconv_demo.odt
index b4e6a6d..c08e32e 100644
--- a/textconv_demo.odt
+++ b/textconv_demo.odt
@@ -6,3 +6,5 @@
   Instead of having a binary output, it presents a nice and
   readable one.
 
+  You only have to specify a driver in your configuration files.
+

whereas "git diff --no-textconv" only tells

diff --git a/textconv_demo.odt b/textconv_demo.odt
index b4e6a6d..c08e32e 100644
Binary files a/textconv_demo.odt and b/textconv_demo.odt differ

References

  1. [1] Patch for git-blame on the mailing-list
  2. [2] Patch for git-gui blame on the mailing-list
Personal tools