Ken Muse

What Is .gitattributes?

Sometimes you see the file in a project. It’s just sitting there, somehow influencing Git. But what is .gitattributes? In short, it is a plain text file that associates attributes and behaviors with specific file types. At its most basic, it configures how files matching certain patterns (usually file extensions) are handled. Common scenarios include configuring how line endings are handled, whether a file is treated as binary (non-diffable), and providing special processing before a commit or after checkout.

Basic Format

Like most Git settings files, a line that begins with the # character is a comment. Beyond that, each line follows the format of a file pattern followed by a whitespace-separated list of attributes. Each attribute can be in one of four states:

  • Set (boolean true). Specify the name of the attribute.
  • Unset (boolean false). Specify the name of the attribute, prefixed with a dash (-).
  • Unspecified. The file pattern has no matches or no specific configuration has been defined for the attribute. You can configure an attribute to be unspecified by prefixing it with an exclamation point (!).
  • Set to value. Follow the attribute name with an equals sign (=) and the value to apply.
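
As a rough illustration of the four states, here is a minimal sketch. The patterns and attribute choices are assumptions picked for the example, not recommendations:

```gitattributes
# Set: the text attribute is true for Markdown files
*.md    text

# Unset: the text attribute is false for PNG images (dash prefix)
*.png   -text

# Unspecified: discard any diff setting that an earlier line gave lock files
*.lock  !diff

# Set to value: the eol attribute is given the value crlf
*.sln   text eol=crlf
```

Comments sit on their own lines here because the # character only starts a comment at the beginning of a line.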

The file pattern syntax is similar to .gitignore, with two exceptions: negative patterns are not allowed, and patterns that match a directory do not recursively match the paths inside it. The patterns are case-sensitive, so matching can require additional care. For example, *.bat and *.BAT are treated as two different extensions. A common way to handle files that may have mixed case is to match both, for example with *.[Bb][Aa][Tt].
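
A short sketch of both points follows. The vendor directory name is hypothetical, and the attributes are only there to complete each line:

```gitattributes
# Matches script.bat, SCRIPT.BAT, and Script.Bat alike
*.[Bb][Aa][Tt]   text eol=crlf

# A directory pattern such as "vendor/" would not reach the files inside it.
# Matching the paths below the directory does:
vendor/**        -diff
```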

The attributes can be any of the following:

  • text: Enables end-of-line normalization and treats the file as a text file. If set to auto, Git decides whether the content is text; if it is, the line endings are converted to LF on checkin and converted to the appropriate OS line endings in the working directory.
  • binary: Marks a file as binary and not capable of differencing. This is the same as -text -merge -diff.
  • eol: Only applies to text files or where text is set to auto. The line endings are normalized to LF on checkin, but can be configured to always be converted to crlf or lf on checkout.
  • working-tree-encoding: Git recognizes ASCII, UTF-8, and similar encodings as text files, but other encodings (such as UTF-16) are interpreted as binary. When set, the file is normalized and encoded as UTF-8 on checkin, and restored to the specified encoding on checkout. This can cause issues for encodings that are not round-trip safe, such as SHIFT-JIS (Japanese). In this case, the configuration setting core.checkRoundtripEncoding should include the encoding so that Git checks the round trip and avoids data corruption.
  • ident: When set, Git replaces $Id$ with $Id: {blob name}$ on checkout, where the blob name is a 40-character hexadecimal value.
  • filter: Assigns a smudge/clean filter to a specific path. Filters are configured as part of your .gitconfig and have two parts. When a file is checked in, the clean filter is applied. When it is checked out, the smudge filter is applied. This only happens if the filter is configured. Otherwise, no filtering occurs. One of the more common filters is LFS (Large File Storage), which enables storing large files outside of the Git repo, replacing each one with a reference and downloading the content on checkout.
  • diff: If set, treats the file as text and allows differencing the file history. If a value is specified, it names the diff driver (with its command) to be used for performing the diff. Note that it is possible to configure binary differencing with this attribute. Git has built-in drivers for multiple file and language types. This is covered in more depth in the Git documentation.
  • merge: The driver to use for 3-way merges.
  • whitespace: If set, Git will flag all of the potential whitespace errors it knows about. If unset, errors are ignored. If unspecified, it uses core.whitespace to decide. It can also be set to a comma-separated list, similar to core.whitespace.
  • export-ignore: Files will not be added to archives.
  • export-subst: Expands placeholders when exported to an archive. The expansions are documented as part of the git log pretty formats.
  • delta: If unset, delta compression will be disabled.
  • encoding: Specifies the character encoding for GUI tools.
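
To make a few of these concrete, here is a hedged sketch. The file extensions are assumptions chosen for illustration, and the LFS line only has an effect when Git LFS is installed and its filter is configured:

```gitattributes
# binary: shorthand for -text -merge -diff
*.png           binary

# working-tree-encoding: UTF-16LE with CRLF in the working directory,
# stored as UTF-8 in the repository
*.ps1           text working-tree-encoding=UTF-16LE eol=crlf

# filter: the line that "git lfs track" writes for a pattern
*.psd           filter=lfs diff=lfs merge=lfs -text

# diff: select one of Git's built-in language drivers
*.cs            diff=csharp
*.py            diff=python

# export-ignore: leave housekeeping files out of archives
.gitattributes  export-ignore
.gitignore      export-ignore
```

The built-in diff drivers mainly improve the hunk headers, so a diff shows the enclosing function or class name rather than an arbitrary preceding line.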

Whenever an attribute is changed, you may need to use git add --renormalize to ensure that the files are properly normalized according to the new settings. This is especially true with the filter attribute.

To give you an idea of a starting point:

    # Set default behavior to automatically normalize line endings.
    * text=auto

    # Windows files require CRLF
    *.[cC][mM][dD] text eol=crlf
    *.[bB][aA][tT] text eol=crlf

    # Shell scripts are LF only
    *.sh text eol=lf

    # Treat SVG files as normalized text instead of binary
    *.svg text

Happy DevOp’ing!