gitattributes(5)
================

NAME
----
gitattributes - Defining attributes per path

SYNOPSIS
--------
$GIT_DIR/info/attributes, .gitattributes


DESCRIPTION
-----------

A `gitattributes` file is a simple text file that gives
`attributes` to pathnames.

Each line in `gitattributes` file is of form:

	pattern attr1 attr2 ...

That is, a pattern followed by an attributes list,
separated by whitespaces. Leading and trailing whitespaces are
ignored. Lines that begin with '#' are ignored. Patterns
that begin with a double quote are quoted in C style.
When the pattern matches the path in question, the attributes
listed on the line are given to the path.

Each attribute can be in one of these states for a given path:

Set::

	The path has the attribute with special value "true";
	this is specified by listing only the name of the
	attribute in the attribute list.

Unset::

	The path has the attribute with special value "false";
	this is specified by listing the name of the attribute
	prefixed with a dash `-` in the attribute list.

Set to a value::

	The path has the attribute with specified string value;
	this is specified by listing the name of the attribute
	followed by an equal sign `=` and its value in the
	attribute list.

Unspecified::

	No pattern matches the path, and nothing says if
	the path has or does not have the attribute, the
	attribute for the path is said to be Unspecified.

When more than one pattern matches the path, a later line
overrides an earlier line. This overriding is done per
attribute.
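
As an illustration of the states described above, here is a minimal
sketch of a `.gitattributes` file. The path patterns are hypothetical
and chosen only for this example; `text`, `diff` and `eol` are
attributes described later in this document. The comments sit on their
own lines because only lines that begin with '#' are ignored.

------------------------
# Set: every .txt file has "text" with the special value "true"
*.txt		text
# Unset: every .bin file has "diff" with the special value "false"
*.bin		-diff
# Set to a value: every .bat file has "eol" with the string value "crlf"
*.bat		eol=crlf
# A later matching line overrides an earlier one, per attribute:
# .bat files directly under docs/ get eol=lf instead of eol=crlf
docs/*.bat	eol=lf
------------------------

With this file, a path such as `README.md` matches no pattern, so all
three attributes are Unspecified for it. The outcome for any path can
be inspected with `git check-attr text diff eol -- <file>`.
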

The rules by which the pattern matches paths are the same as in
`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:

  - negative patterns are forbidden

  - patterns that match a directory do not recursively match paths
    inside that directory (so using the trailing-slash `path/` syntax is
    pointless in an attributes file; use `path/**` instead)

When deciding what attributes are assigned to a path, Git
consults `$GIT_DIR/info/attributes` file (which has the highest
precedence), `.gitattributes` file in the same directory as the
path in question, and its parent directories up to the toplevel of the
work tree (the further the directory that contains `.gitattributes`
is from the path in question, the lower its precedence). Finally
global and system-wide files are considered (they have the lowest
precedence).

When the `.gitattributes` file is missing from the work tree, the
path in the index is used as a fall-back. During checkout process,
`.gitattributes` in the index is used and then the file in the
working tree is used as a fall-back.

If you wish to affect only a single repository (i.e., to assign
attributes to files that are particular to
one user's workflow for that repository), then
attributes should be placed in the `$GIT_DIR/info/attributes` file.
Attributes which should be version-controlled and distributed to other
repositories (i.e., attributes of interest to all users) should go into
`.gitattributes` files. Attributes that should affect all repositories
for a single user should be placed in a file specified by the
`core.attributesFile` configuration option (see linkgit:git-config[1]).
Its default value is $XDG_CONFIG_HOME/git/attributes. If $XDG_CONFIG_HOME
is either not set or empty, $HOME/.config/git/attributes is used instead.
Attributes for all users on a system should be placed in the
`$(prefix)/etc/gitattributes` file.
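
To make the per-user location concrete, here is a small sketch of a
user-level configuration that points `core.attributesFile` at a
non-default file. The file name `~/.gitattributes_global` is an
assumption made for this example, not a name Git prescribes.

------------------------
[core]
	attributesFile = ~/.gitattributes_global
------------------------

Attributes listed in that file apply to every repository the user
works in, but with lower precedence than `$GIT_DIR/info/attributes`
and any `.gitattributes` file in the work tree.
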

Sometimes you would need to override a setting of an attribute
for a path to `Unspecified` state. This can be done by listing
the name of the attribute prefixed with an exclamation point `!`.


RESERVED BUILTIN_* ATTRIBUTES
-----------------------------

builtin_* is a reserved namespace for builtin attribute values. Any
user defined attributes under this namespace will be ignored and
trigger a warning.

`builtin_objectmode`
~~~~~~~~~~~~~~~~~~~~
This attribute is for filtering files by their file bit modes (40000,
120000, 160000, 100755, 100644). e.g. ':(attr:builtin_objectmode=160000)'.
You may also check these values with `git check-attr builtin_objectmode -- <file>`.
If the object is not in the index `git check-attr --cached` will return unspecified.


EFFECTS
-------

Certain operations by Git can be influenced by assigning
particular attributes to a path. Currently, the following
operations are attributes-aware.

Checking-out and checking-in
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These attributes affect how the contents stored in the
repository are copied to the working tree files when commands
such as 'git switch', 'git checkout' and 'git merge' run.
They also affect how
Git stores the contents you prepare in the working tree in the
repository upon 'git add' and 'git commit'.

`text`
^^^^^^

This attribute marks the path as a text file, which enables end-of-line
conversion: When a matching file is added to the index, the file's line
endings are normalized to LF in the index. Conversely, when the file is
copied from the index to the working directory, its line endings may be
converted from LF to CRLF depending on the `eol` attribute, the Git
config, and the platform (see explanation of `eol` below).

Set::

	Setting the `text` attribute on a path enables end-of-line
	conversion on checkin and checkout as described above. Line endings
	are normalized to LF in the index every time the file is checked in,
	even if the file was previously added to Git with CRLF line endings.

Unset::

	Unsetting the `text` attribute on a path tells Git not to
	attempt any end-of-line conversion upon checkin or checkout.

Set to string value "auto"::

	When `text` is set to "auto", Git decides by itself whether the file
	is text or binary. If it is text and the file was not already in
	Git with CRLF endings, line endings are converted on checkin and
	checkout as described above. Otherwise, no conversion is done on
	checkin or checkout.

Unspecified::

	If the `text` attribute is unspecified, Git uses the
	`core.autocrlf` configuration variable to determine if the
	file should be converted.

Any other value causes Git to act as if `text` has been left
unspecified.

`eol`
^^^^^

This attribute marks a path to use a specific line-ending style in the
working tree when it is checked out. It has effect only if `text` or
`text=auto` is set (see above), but specifying `eol` automatically sets
`text` if `text` was left unspecified.

Set to string value "crlf"::

	This setting converts the file's line endings in the working
	directory to CRLF when the file is checked out.

Set to string value "lf"::

	This setting uses the same line endings in the working directory as
	in the index when the file is checked out.

Unspecified::

	If the `eol` attribute is unspecified for a file, its line endings
	in the working directory are determined by the `core.autocrlf` or
	`core.eol` configuration variable (see the definitions of those
	options in linkgit:git-config[1]).
	If `text` is set but neither of
	those variables is, the default is `eol=crlf` on Windows and
	`eol=lf` on all other platforms.

Backwards compatibility with `crlf` attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For backwards compatibility, the `crlf` attribute is interpreted as
follows:

------------------------
crlf		text
-crlf		-text
crlf=input	eol=lf
------------------------

End-of-line conversion
^^^^^^^^^^^^^^^^^^^^^^

While Git normally leaves file contents alone, it can be configured to
normalize line endings to LF in the repository and, optionally, to
convert them to CRLF when files are checked out.

If you simply want to have CRLF line endings in your working directory
regardless of the repository you are working with, you can set the
config variable "core.autocrlf" without using any attributes.

------------------------
[core]
	autocrlf = true
------------------------

This does not force normalization of text files, but does ensure
that text files that you introduce to the repository have their line
endings normalized to LF when they are added, and that files that are
already normalized in the repository stay normalized.

If you want to ensure that text files that any contributor introduces to
the repository have their line endings normalized, you can set the
`text` attribute to "auto" for _all_ files.

------------------------
*	text=auto
------------------------

The attributes allow fine-grained control over how line endings
are converted.
Here is an example that will make Git normalize .txt, .vcproj and .sh
files, ensure that .vcproj files have CRLF and .sh files have LF in
the working directory, and prevent .jpg files from being normalized
regardless of their content.

------------------------
*		text=auto
*.txt		text
*.vcproj	text eol=crlf
*.sh		text eol=lf
*.jpg		-text
------------------------

NOTE: When `text=auto` conversion is enabled in a cross-platform
project using push and pull to a central repository the text files
containing CRLFs should be normalized.

From a clean working directory:

-------------------------------------------------
$ echo "* text=auto" >.gitattributes
$ git add --renormalize .
$ git status        # Show files that will be normalized
$ git commit -m "Introduce end-of-line normalization"
-------------------------------------------------

If any files that should not be normalized show up in 'git status',
unset their `text` attribute before running 'git add -u'.

------------------------
manual.pdf	-text
------------------------

Conversely, text files that Git does not detect can have normalization
enabled manually.

------------------------
weirdchars.txt	text
------------------------

If `core.safecrlf` is set to "true" or "warn", Git verifies if
the conversion is reversible for the current setting of
`core.autocrlf`. For "true", Git rejects irreversible
conversions; for "warn", Git only prints a warning but accepts
an irreversible conversion. The safety triggers to prevent such
a conversion done to the files in the work tree, but there are a
few exceptions. Even though...

- 'git add' itself does not touch the files in the work tree, the
  next checkout would, so the safety triggers;

- 'git apply' to update a text file with a patch does touch the files
  in the work tree, but the operation is about text files and CRLF
  conversion is about fixing the line ending inconsistencies, so the
  safety does not trigger;

- 'git diff' itself does not touch the files in the work tree, it is
  often run to inspect the changes you intend to next 'git add'.
  To catch potential problems early, safety triggers.


`working-tree-encoding`
^^^^^^^^^^^^^^^^^^^^^^^

Git recognizes files encoded in ASCII or one of its supersets (e.g.
UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other
encodings (e.g. UTF-16) are interpreted as binary and consequently
built-in Git text processing tools (e.g. 'git diff') as well as most Git
web front ends do not visualize the contents of these files by default.

In these cases you can tell Git the encoding of a file in the working
directory with the `working-tree-encoding` attribute. If a file with this
attribute is added to Git, then Git re-encodes the content from the
specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
content in its internal data structure (called "the index"). On checkout
the content is re-encoded back to the specified encoding.

Please note that using the `working-tree-encoding` attribute may have a
number of pitfalls:

- Alternative Git implementations (e.g. JGit or libgit2) and older Git
  versions (as of March 2018) do not support the `working-tree-encoding`
  attribute. If you decide to use the `working-tree-encoding` attribute
  in your repository, then it is strongly recommended to ensure that all
  clients working with the repository support it.
+
For example, Microsoft Visual Studio resources files (`*.rc`) or
PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
If you declare `*.ps1` files as UTF-16 and you add `foo.ps1` with
a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
stored as UTF-8 internally. A client without `working-tree-encoding`
support will check out `foo.ps1` as a UTF-8 encoded file. This will
typically cause trouble for the users of this file.
+
If a Git client that does not support the `working-tree-encoding`
attribute adds a new file `bar.ps1`, then `bar.ps1` will be
stored "as-is" internally (in this example probably as UTF-16).
A client with `working-tree-encoding` support will interpret the
internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
That operation will fail and cause an error.

- Reencoding content to non-UTF encodings can cause errors as the
  conversion might not be UTF-8 round trip safe. If you suspect your
  encoding to not be round trip safe, then add it to
  `core.checkRoundtripEncoding` to make Git check the round trip
  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
  set) is known to have round trip issues with UTF-8 and is checked by
  default.

- Reencoding content requires resources that might slow down certain
  Git operations (e.g. 'git checkout' or 'git add').

Use the `working-tree-encoding` attribute only if you cannot store a file
in UTF-8 encoding and if you want Git to be able to process the content
as text.

As an example, use the following attributes if your '*.ps1' files are
UTF-16 encoded with byte order mark (BOM) and you want Git to perform
automatic line ending conversion based on your platform.

------------------------
*.ps1		text working-tree-encoding=UTF-16
------------------------

Use the following attributes if your '*.ps1' files are UTF-16 little
endian encoded without BOM and you want Git to use Windows line endings
in the working directory (use `UTF-16LE-BOM` instead of `UTF-16LE` if
you want UTF-16 little endian with BOM).
Please note, it is highly recommended to
explicitly define the line endings with `eol` if the `working-tree-encoding`
attribute is used to avoid ambiguity.

------------------------
*.ps1		text working-tree-encoding=UTF-16LE eol=crlf
------------------------

You can get a list of all available encodings on your platform with the
following command:

------------------------
iconv --list
------------------------

If you do not know the encoding of a file, then you can use the `file`
command to guess the encoding:

------------------------
file foo.ps1
------------------------


`ident`
^^^^^^^

When the attribute `ident` is set for a path, Git replaces
`$Id$` in the blob object with `$Id:`, followed by the
40-character hexadecimal blob object name, followed by a dollar
sign `$` upon checkout. Any byte sequence that begins with
`$Id:` and ends with `$` in the worktree file is replaced
with `$Id$` upon check-in.


`filter`
^^^^^^^^

A `filter` attribute can be set to a string value that names a
filter driver specified in the configuration.

A filter driver consists of a `clean` command and a `smudge`
command, either of which can be left unspecified. Upon
checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file. Similarly, the
`clean` command is used to convert the contents of worktree file
upon checkin. By default these commands process only a single
blob and terminate. If a long running `process` filter is used
in place of `clean` and/or `smudge` filters, then Git can process
all blobs with a single filter command invocation for the entire
life of a single Git command, for example `git add --all`. If a
long running `process` filter is configured then it always takes
precedence over a configured single blob filter. See section
below for the description of the protocol used to communicate with
a `process` filter.
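
The precedence rule above can be sketched with a driver that defines
all three commands. The driver name `example` and the program
`my-filter` (including its `--clean`, `--smudge` and `--process`
options) are hypothetical and exist only for this illustration; a real
`process` program has to implement the long running filter protocol
described later in this document.

------------------------
[filter "example"]
	clean = my-filter --clean
	smudge = my-filter --smudge
	process = my-filter --process
------------------------

Because `process` is configured here, Git uses it in place of the
single blob `clean` and `smudge` commands.
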

One use of the content filtering is to massage the content into a shape
that is more convenient for the platform, filesystem, and the user to use.
For this mode of operation, the key phrase here is "more convenient" and
not "turning something unusable into usable". In other words, the intent
is that if someone unsets the filter driver definition, or does not have
the appropriate filter program, the project should still be usable.

Another use of the content filtering is to store the content that cannot
be directly used in the repository (e.g. a UUID that refers to the true
content stored outside Git, or an encrypted content) and turn it into a
usable form upon checkout (e.g. download the external content, or decrypt
the encrypted content).

These two filters behave differently, and by default, a filter is taken as
the former, massaging the contents into more convenient shape. A missing
filter driver definition in the config, or a filter driver that exits with
a non-zero status, is not an error but makes the filter a no-op passthru.

You can declare that a filter turns a content that by itself is unusable
into a usable content by setting the filter.<driver>.required configuration
variable to `true`.

Note: Whenever the clean filter is changed, the repo should be renormalized:
$ git add --renormalize .

For example, in .gitattributes, you would assign the `filter`
attribute for paths.

------------------------
*.c	filter=indent
------------------------

Then you would define a "filter.indent.clean" and "filter.indent.smudge"
configuration in your .git/config to specify a pair of commands to
modify the contents of C programs when the source files are checked
in ("clean" is run) and checked out (no change is made because the
command is "cat").

------------------------
[filter "indent"]
	clean = indent
	smudge = cat
------------------------

For best results, `clean` should not alter its output further if it is
run twice ("clean->clean" should be equivalent to "clean"), and
multiple `smudge` commands should not alter `clean`'s output
("smudge->smudge->clean" should be equivalent to "clean"). See the
section on merging below.

The "indent" filter is well-behaved in this regard: it will not modify
input that is already correctly indented. In this case, the lack of a
smudge filter means that the clean filter _must_ accept its own output
without modifying it.

If a filter _must_ succeed in order to make the stored contents usable,
you can declare that the filter is `required`, in the configuration:

------------------------
[filter "crypt"]
	clean = openssl enc ...
	smudge = openssl enc -d ...
	required
------------------------

Sequence "%f" on the filter command line is replaced with the name of
the file the filter is working on. A filter might use this in keyword
substitution. For example:

------------------------
[filter "p4"]
	clean = git-p4-filter --clean %f
	smudge = git-p4-filter --smudge %f
------------------------

Note that "%f" is the name of the path that is being worked on. Depending
on the version that is being filtered, the corresponding file on disk may
not exist, or may have different contents. So, smudge and clean commands
should not try to access the file on disk, but only act as filters on the
content provided to them on standard input.

Long Running Filter Process
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the filter command (a string value) is defined via
`filter.<driver>.process` then Git can process all blobs with a
single filter invocation for the entire life of a single Git
command.
This is achieved by using the long-running process protocol
(described in Documentation/technical/long-running-process-protocol.adoc).

When Git encounters the first file that needs to be cleaned or smudged,
it starts the filter and performs the handshake. In the handshake, the
welcome message sent by Git is "git-filter-client", only version 2 is
supported, and the supported capabilities are "clean", "smudge", and
"delay".

Afterwards Git sends a list of "key=value" pairs terminated with
a flush packet. The list will contain at least the filter command
(based on the supported capabilities) and the pathname of the file
to filter relative to the repository root. Right after the flush packet
Git sends the content split in zero or more pkt-line packets and a
flush packet to terminate content. Please note, that the filter
must not send any response before it received the content and the
final flush packet. Also note that the "value" of a "key=value" pair
can contain the "=" character whereas the key would never contain
that character.

-----------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
-----------------------

The filter is expected to respond with a list of "key=value" pairs
terminated with a flush packet. If the filter does not experience
problems then the list must contain a "success" status. Right after
these packets the filter is expected to send the content in zero
or more pkt-line packets and a flush packet at the end. Finally, a
second list of "key=value" pairs terminated with a flush packet
is expected. The filter can change the status in the second list
or keep the status as is with an empty list. Please note that the
empty list must be terminated with a flush packet regardless.

------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< SMUDGED_CONTENT
packet:          git< 0000
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------

If the result content is empty then the filter is expected to respond
with a "success" status and a flush packet to signal the empty content.

------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< 0000  # empty content!
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------

In case the filter cannot or does not want to process the content,
it is expected to respond with an "error" status.

-----------------------
packet:          git< status=error
packet:          git< 0000
-----------------------

If the filter experiences an error during processing, then it can
send the status "error" after the content was (partially or
completely) sent.

------------------------
packet:          git< status=success
packet:          git< 0000
packet:          git< HALF_WRITTEN_ERRONEOUS_CONTENT
packet:          git< 0000
packet:          git< status=error
packet:          git< 0000
------------------------

In case the filter cannot or does not want to process the content
as well as any future content for the lifetime of the Git process,
then it is expected to respond with an "abort" status at any point
in the protocol.

-----------------------
packet:          git< status=abort
packet:          git< 0000
-----------------------

Git neither stops nor restarts the filter process in case the
"error"/"abort" status is set. However, Git sets its exit code
according to the `filter.<driver>.required` flag, mimicking the
behavior of the `filter.<driver>.clean` / `filter.<driver>.smudge`
mechanism.

If the filter dies during the communication or does not adhere to
the protocol then Git will stop the filter process and restart it
with the next file that needs to be processed. Depending on the
`filter.<driver>.required` flag Git will interpret that as error.

Delay
^^^^^

If the filter supports the "delay" capability, then Git can send the
flag "can-delay" after the filter command and pathname. This flag
denotes that the filter can delay filtering the current blob (e.g. to
compensate network latencies) by responding with no content but with
the status "delayed" and a flush packet.

-----------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> can-delay=1
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
packet:          git< status=delayed
packet:          git< 0000
-----------------------

If the filter supports the "delay" capability then it must support the
"list_available_blobs" command. If Git sends this command, then the
filter is expected to return a list of pathnames representing blobs
that have been delayed earlier and are now available.
The list must be terminated with a flush packet followed
by a "success" status that is also terminated with a flush packet. If
no blobs for the delayed paths are available, yet, then the filter is
expected to block the response until at least one blob becomes
available. The filter can tell Git that it has no more delayed blobs
by sending an empty list. As soon as the filter responds with an empty
list, Git stops asking. All blobs that Git has not received at this
point are considered missing and will result in an error.

------------------------
packet:          git> command=list_available_blobs
packet:          git> 0000
packet:          git< pathname=path/testfile.dat
packet:          git< pathname=path/otherfile.dat
packet:          git< 0000
packet:          git< status=success
packet:          git< 0000
------------------------


After Git received the pathnames, it will request the corresponding
blobs again. These requests contain a pathname and an empty content
section. The filter is expected to respond with the smudged content
in the usual way as explained above.

------------------------
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> 0000
packet:          git> 0000  # empty content!
packet:          git< status=success
packet:          git< 0000
packet:          git< SMUDGED_CONTENT
packet:          git< 0000
packet:          git< 0000  # empty list, keep "status=success" unchanged!
------------------------

Example
^^^^^^^

A long running filter demo implementation can be found in
`contrib/long-running-filter/example.pl` located in the Git
core repository. If you develop your own long running filter
process then the `GIT_TRACE_PACKET` environment variable can be
very helpful for debugging (see linkgit:git[1]).

Please note that you cannot use an existing `filter.<driver>.clean`
or `filter.<driver>.smudge` command with `filter.<driver>.process`
because the former two use a different inter process communication
protocol than the latter one.


Interaction between checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the check-in codepath, the worktree file is first converted
with `filter` driver (if specified and corresponding driver
defined), then the result is processed with `ident` (if
specified), and then finally with `text` (again, if specified
and applicable).

In the check-out codepath, the blob content is first converted
with `text`, and then `ident` and fed to `filter`.


Merging branches with differing checkin/checkout attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you have added attributes to a file that cause the canonical
repository format for that file to change, such as adding a
clean/smudge filter or text/eol/ident attributes, merging anything
where the attribute is not in place would normally cause merge
conflicts.

To prevent these unnecessary merge conflicts, Git can be told to run a
virtual check-out and check-in of all three stages of each file that
needs a three-way content merge, by setting the `merge.renormalize`
configuration variable. This prevents changes caused by check-in
conversion from causing spurious merge conflicts when a converted file
is merged with an unconverted file.

As long as a "smudge->clean" results in the same output as a "clean"
even on files that are already smudged, this strategy will
automatically resolve all filter-related conflicts. Filters that do
not act in this way may cause additional merge conflicts that must be
resolved manually.


Generating diff text
~~~~~~~~~~~~~~~~~~~~

`diff`
^^^^^^

The attribute `diff` affects how Git generates diffs for particular
files. It can tell Git whether to generate a textual patch for the path
or to treat the path as a binary file. It can also affect what line is
shown on the hunk header `@@ -k,l +n,m @@` line, tell Git to use an
external command to generate the diff, or ask Git to convert binary
files to a text format before generating the diff.

Set::

	A path to which the `diff` attribute is set is treated
	as text, even when it contains byte values that
	normally never appear in text files, such as NUL.

Unset::

	A path to which the `diff` attribute is unset will
	generate `Binary files differ` (or a binary patch, if
	binary patches are enabled).

Unspecified::

	A path to which the `diff` attribute is unspecified
	first gets its contents inspected, and if it looks like
	text and is smaller than core.bigFileThreshold, it is treated
	as text. Otherwise it would generate `Binary files differ`.

String::

	Diff is shown using the specified diff driver. Each driver may
	specify one or more options, as described in the following
	section. The options for the diff driver "foo" are defined
	by the configuration variables in the "diff.foo" section of the
	Git config file.


Defining an external diff driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The definition of a diff driver is done in `gitconfig`, not
`gitattributes` file, so strictly speaking this manual page is a
wrong place to talk about it. However...

To define an external diff driver `jcdiff`, add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

----------------------------------------------------------------
[diff "jcdiff"]
	command = j-c-diff
----------------------------------------------------------------

When Git needs to show you a diff for the path with `diff`
attribute set to `jcdiff`, it calls the command you specified
with the above configuration, i.e. `j-c-diff`, with 7
parameters, just like `GIT_EXTERNAL_DIFF` program is called.
See linkgit:git[1] for details.

If the program is able to ignore certain changes (similar to
`git diff --ignore-space-change`), then also set the option
`trustExitCode` to true. It is then expected to return exit code 1 if
it finds significant changes and 0 if it doesn't.
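
Putting the pieces of this section together, a sketch of the two files
might look as follows. The `*.jc` pattern is hypothetical and chosen
only for this example; `jcdiff` and `j-c-diff` are the names used
above. First, in `.gitattributes`:

------------------------
*.jc	diff=jcdiff
------------------------

Then, in `$GIT_DIR/config` (or `$HOME/.gitconfig`):

----------------------------------------------------------------
[diff "jcdiff"]
	command = j-c-diff
	trustExitCode = true
----------------------------------------------------------------

Set `trustExitCode` only if `j-c-diff` really does exit with 1 when it
finds significant changes and with 0 when it does not.
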

Setting the internal diff algorithm
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The diff algorithm can be set through the `diff.algorithm` config key, but
sometimes it may be helpful to set the diff algorithm per path. For example,
one may want to use the `minimal` diff algorithm for .json files, and the
`histogram` for .c files, and so on without having to pass in the algorithm
through the command line each time.

First, in `.gitattributes`, assign the `diff` attribute for paths.

------------------------
*.json diff=<name>
------------------------

Then, define a "diff.<name>.algorithm" configuration to specify the diff
algorithm, choosing from `myers`, `patience`, `minimal`, or `histogram`.

----------------------------------------------------------------
[diff "<name>"]
	algorithm = histogram
----------------------------------------------------------------

This diff algorithm applies to user facing diff output like git-diff(1),
git-show(1) and is used for the `--stat` output as well. The merge machinery
will not use the diff algorithm set through this method.

NOTE: If `diff.<name>.command` is defined for a path with the
`diff=<name>` attribute, it is executed as an external diff driver
(see above), and adding `diff.<name>.algorithm` has no effect, as the
algorithm is not passed to the external diff driver.

Defining a custom hunk-header
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each group of changes (called a "hunk") in the textual diff output
is prefixed with a line of the form:

	@@ -k,l +n,m @@ TEXT

This is called a 'hunk header'. The "TEXT" portion is by default a line
that begins with a letter, an underscore or a dollar sign; this
matches what GNU 'diff -p' output uses. This default selection however
is not suited for some contents, and you can use a customized pattern
to make a selection.

First, in .gitattributes, you would assign the `diff` attribute
for paths.

------------------------
*.tex	diff=tex
------------------------

Then, you would define a "diff.tex.xfuncname" configuration to
specify a regular expression that matches a line that you would
want to appear as the hunk header "TEXT". Add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

------------------------
[diff "tex"]
	xfuncname = "^(\\\\(sub)*section\\{.*)$"
------------------------

Note. A single level of backslashes is eaten by the
configuration file parser, so you would need to double the
backslashes; the pattern above picks a line that begins with a
backslash, and zero or more occurrences of `sub` followed by
`section` followed by open brace, to the end of line.

There are a few built-in patterns to make this easier, and `tex`
is one of them, so you do not have to write the above in your
configuration file (you still need to enable this with the
attribute mechanism, via `.gitattributes`). The following built in
patterns are available:

- `ada` suitable for source code in the Ada language.

- `bash` suitable for source code in the Bourne-Again SHell language.
  Covers a superset of POSIX shell function definitions.

- `bibtex` suitable for files with BibTeX coded references.

- `cpp` suitable for source code in the C and C++ languages.

- `csharp` suitable for source code in the C# language.

- `css` suitable for cascading style sheets.

- `dts` suitable for devicetree (DTS) files.

- `elixir` suitable for source code in the Elixir language.

- `fortran` suitable for source code in the Fortran language.

- `fountain` suitable for Fountain documents.

- `golang` suitable for source code in the Go language.

- `html` suitable for HTML/XHTML documents.

- `java` suitable for source code in the Java language.

- `kotlin` suitable for source code in the Kotlin language.

- `markdown` suitable for Markdown documents.

- `matlab` suitable for source code in the MATLAB and Octave languages.

- `objc` suitable for source code in the Objective-C language.

- `pascal` suitable for source code in the Pascal/Delphi language.

- `perl` suitable for source code in the Perl language.

- `php` suitable for source code in the PHP language.

- `python` suitable for source code in the Python language.

- `ruby` suitable for source code in the Ruby language.

- `rust` suitable for source code in the Rust language.

- `scheme` suitable for source code in the Scheme language.

- `tex` suitable for source code for LaTeX documents.


Customizing word diff
^^^^^^^^^^^^^^^^^^^^^

You can customize the rules that `git diff --word-diff` uses to
split words in a line, by specifying an appropriate regular expression
in the "diff.*.wordRegex" configuration variable. For example, in TeX
a backslash followed by a sequence of letters forms a command, but
several such commands can be run together without intervening
whitespace. To separate them, use a regular expression in your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

------------------------
[diff "tex"]
	wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
------------------------

A built-in pattern is provided for all languages listed in the
previous section.


Performing text diffs of binary files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes it is desirable to see the diff of a text-converted
version of some binary files. For example, a word processor
document can be converted to an ASCII text representation, and
the diff of the text shown.
Even though this conversion loses
some information, the resulting diff is useful for human
viewing (but cannot be applied directly).

The `textconv` config option is used to define a program for
performing such a conversion. The program should take a single
argument, the name of a file to convert, and produce the
resulting text on stdout.

For example, to show the diff of the exif information of a
file instead of the binary information (assuming you have the
exif tool installed), add the following section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file):

------------------------
[diff "jpg"]
	textconv = exif
------------------------

NOTE: The text conversion is generally a one-way conversion;
in this example, we lose the actual image contents and focus
just on the text data. This means that diffs generated by
textconv are _not_ suitable for applying. For this reason,
only `git diff` and the `git log` family of commands (i.e.,
log, whatchanged, show) will perform text conversion. `git
format-patch` will never generate this output. If you want to
send somebody a text-converted diff of a binary file (e.g.,
because it quickly conveys the changes you have made), you
should generate it separately and send it as a comment _in
addition to_ the usual binary diff that you might send.

Because text conversion can be slow, especially when doing a
large number of them with `git log -p`, Git provides a mechanism
to cache the output and use it in future diffs. To enable
caching, set the "cachetextconv" variable in your diff driver's
config. For example:

------------------------
[diff "jpg"]
	textconv = exif
	cachetextconv = true
------------------------

This will cache the result of running "exif" on each blob
indefinitely.
If you change the textconv config variable for a
diff driver, Git will automatically invalidate the cache entries
and re-run the textconv filter. If you want to invalidate the
cache manually (e.g., because your version of "exif" was updated
and now produces better output), you can remove the cache
manually with `git update-ref -d refs/notes/textconv/jpg` (where
"jpg" is the name of the diff driver, as in the example above).

Choosing textconv versus external diff
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to show differences between binary or specially-formatted
blobs in your repository, you can choose to use either an external diff
command, or to use textconv to convert them to a diff-able text format.
Which method you choose depends on your exact situation.

The advantage of using an external diff command is flexibility. You are
not bound to find line-oriented changes, nor is it necessary for the
output to resemble unified diff. You are free to locate and report
changes in the most appropriate way for your data format.

A textconv, by comparison, is much more limiting. You provide a
transformation of the data into a line-oriented text format, and Git
uses its regular diff tools to generate the output. There are several
advantages to choosing this method:

1. Ease of use. It is often much simpler to write a binary to text
   transformation than it is to perform your own diff. In many cases,
   existing programs can be used as textconv filters (e.g., exif,
   odt2txt).

2. Git diff features. By performing only the transformation step
   yourself, you can still utilize many of Git's diff features,
   including colorization, word-diff, and combined diffs for merges.

3. Caching. Textconv caching can speed up repeated diffs, such as those
   you might trigger by running `git log -p`.
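
As a sketch of the textconv route, the following uses `odt2txt`, one of
the existing programs mentioned above, to diff word processor
documents. The driver name `odt` and the `*.odt` pattern are
illustrative choices, and the `odt2txt` program is assumed to be
installed.

In `.gitattributes`:

------------------------
# hypothetical pattern and driver name
*.odt	diff=odt
------------------------

In `$GIT_DIR/config` (or `$HOME/.gitconfig`):

------------------------
[diff "odt"]
	# assumes odt2txt is installed and on PATH
	textconv = odt2txt
	cachetextconv = true
------------------------

With this in place Git's regular diff machinery, including word-diff
and colorization, operates on the converted text, which is the main
reason to prefer textconv when a suitable converter already exists.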


Marking files as binary
^^^^^^^^^^^^^^^^^^^^^^^

Git usually guesses correctly whether a blob contains text or binary
data by examining the beginning of the contents. However, sometimes you
may want to override its decision, either because a blob contains binary
data later in the file, or because the content, while technically
composed of text characters, is opaque to a human reader. For example,
many postscript files contain only ASCII characters, but produce noisy
and meaningless diffs.

The simplest way to mark a file as binary is to unset the diff
attribute in the `.gitattributes` file:

------------------------
*.ps -diff
------------------------

This will cause Git to generate `Binary files differ` (or a binary
patch, if binary patches are enabled) instead of a regular diff.

However, one may also want to specify other diff driver attributes. For
example, you might want to use `textconv` to convert postscript files to
an ASCII representation for human viewing, but otherwise treat them as
binary files. You cannot specify both `-diff` and `diff=ps` attributes.
The solution is to use the `diff.*.binary` config option:

------------------------
[diff "ps"]
	textconv = ps2ascii
	binary = true
------------------------

Performing a three-way merge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`merge`
^^^^^^^

The attribute `merge` affects how three versions of a file are
merged when a file-level merge is necessary during `git merge`,
and other commands such as `git revert` and `git cherry-pick`.

Set::

	Built-in 3-way merge driver is used to merge the
	contents in a way similar to 'merge' command of `RCS`
	suite. This is suitable for ordinary text files.

Unset::

	Take the version from the current branch as the
	tentative merge result, and declare that the merge has
	conflicts. This is suitable for binary files that do
	not have well-defined merge semantics.

Unspecified::

	By default, this uses the same built-in 3-way merge
	driver as is the case when the `merge` attribute is set.
	However, the `merge.default` configuration variable can name
	a different merge driver to be used with paths for which the
	`merge` attribute is unspecified.

String::

	3-way merge is performed using the specified custom
	merge driver. The built-in 3-way merge driver can be
	explicitly specified by asking for "text" driver; the
	built-in "take the current branch" driver can be
	requested with "binary".


Built-in merge drivers
^^^^^^^^^^^^^^^^^^^^^^

There are a few built-in low-level merge drivers defined that
can be asked for via the `merge` attribute.

text::

	Usual 3-way file level merge for text files. Conflicted
	regions are marked with conflict markers `<<<<<<<`,
	`=======` and `>>>>>>>`. The version from your branch
	appears before the `=======` marker, and the version
	from the merged branch appears after the `=======`
	marker.

binary::

	Keep the version from your branch in the work tree, but
	leave the path in the conflicted state for the user to
	sort out.

union::

	Run 3-way file level merge for text files, but take
	lines from both versions, instead of leaving conflict
	markers. This tends to leave the added lines in the
	resulting file in random order and the user should
	verify the result. Do not use this if you do not
	understand the implications.
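
For illustration, the built-in drivers are requested by giving the
`merge` attribute a string value in `.gitattributes`. The paths below
are made-up examples, not recommendations for any particular project:

------------------------
# hypothetical paths
*.txt		merge=text
*.png		merge=binary
ChangeLog	merge=union
------------------------

Here a conflicting change to a `.png` file keeps the current branch's
version in the work tree and leaves the path conflicted, while
conflicting additions to `ChangeLog` are both kept without conflict
markers and should be checked by hand afterwards.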


Defining a custom merge driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The definition of a merge driver is done in the `.git/config`
file, not in the `gitattributes` file, so strictly speaking this
manual page is a wrong place to talk about it. However...

To define a custom merge driver `filfre`, add a section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file) like this:

----------------------------------------------------------------
[merge "filfre"]
	name = feel-free merge driver
	driver = filfre %O %A %B %L %P
	recursive = binary
----------------------------------------------------------------

The `merge.*.name` variable gives the driver a human-readable
name.

The `merge.*.driver` variable's value is used to construct a
command to run to merge the common ancestor's version (`%O`), current
version (`%A`) and the other branches' version (`%B`). These
three tokens are replaced with the names of temporary files that
hold the contents of these versions when the command line is
built. Additionally, `%L` will be replaced with the conflict marker
size (see below).

The merge driver is expected to leave the result of the merge in
the file named with `%A` by overwriting it, and exit with zero
status if it managed to merge them cleanly, or non-zero if there
were conflicts. When the driver crashes (e.g. killed by SEGV),
it is expected to exit with a non-zero status higher than
128, and in such a case, the merge results in a failure (which is
different from producing a conflict).

The `merge.*.recursive` variable specifies what other merge
driver to use when the merge driver is called for an internal
merge between common ancestors, when there is more than one.
When left unspecified, the driver itself is used for both
internal merge and the final merge.
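
Defining the driver in the configuration does not by itself apply it to
any path; paths still have to be assigned to it with the `merge`
attribute. A minimal sketch, assuming purely for illustration that the
files of interest end in `.ff`:

------------------------
# hypothetical pattern; filfre is the driver defined in the config above
*.ff	merge=filfre
------------------------

linkgit:git-check-attr[1] can be used to confirm which value the
`merge` attribute resolves to for a given path.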

The merge driver can learn the pathname in which the merged result
will be stored via placeholder `%P`. The conflict labels to be used
for the common ancestor, local head and other head can be passed by
using `%S`, `%X` and `%Y` respectively.

`conflict-marker-size`
^^^^^^^^^^^^^^^^^^^^^^

This attribute controls the length of conflict markers left in
the work tree file during a conflicted merge. Only a positive
integer has a meaningful effect.

For example, this line in `.gitattributes` can be used to tell the merge
machinery to leave much longer (instead of the usual 7-character-long)
conflict markers when merging the file `Documentation/git-merge.adoc`
results in a conflict.

------------------------
Documentation/git-merge.adoc	conflict-marker-size=32
------------------------


Checking whitespace errors
~~~~~~~~~~~~~~~~~~~~~~~~~~

`whitespace`
^^^^^^^^^^^^

The `core.whitespace` configuration variable allows you to define what
'diff' and 'apply' should consider whitespace errors for all paths in
the project (See linkgit:git-config[1]). This attribute gives you finer
control per path.

Set::

	Notice all types of potential whitespace errors known to Git.
	The tab width is taken from the value of the `core.whitespace`
	configuration variable.

Unset::

	Do not notice anything as error.

Unspecified::

	Use the value of the `core.whitespace` configuration variable to
	decide what to notice as error.

String::

	Specify a comma separated list of common whitespace problems to
	notice in the same format as the `core.whitespace` configuration
	variable.
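
The four states above could be used together along these lines. The
patterns are illustrative only, and the error names are taken from the
`core.whitespace` description in linkgit:git-config[1]:

------------------------
# hypothetical patterns
*.py	whitespace=tab-in-indent,trailing-space
*.c	whitespace=indent-with-non-tab,trailing-space,space-before-tab
*.md	-whitespace
Makefile	whitespace
------------------------

Paths that match none of these lines leave the attribute unspecified
and therefore fall back to `core.whitespace`.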


Creating an archive
~~~~~~~~~~~~~~~~~~~

`export-ignore`
^^^^^^^^^^^^^^^

Files and directories with the attribute `export-ignore` won't be added to
archive files.

`export-subst`
^^^^^^^^^^^^^^

If the attribute `export-subst` is set for a file then Git will expand
several placeholders when adding this file to an archive. The
expansion depends on the availability of a commit ID, i.e., if
linkgit:git-archive[1] has been given a tree instead of a commit or a
tag then no replacement will be done. The placeholders are the same
as those for the option `--pretty=format:` of linkgit:git-log[1],
except that they need to be wrapped like this: `$Format:PLACEHOLDERS$`
in the file. E.g. the string `$Format:%H$` will be replaced by the
commit hash. However, only one `%(describe)` placeholder is expanded
per archive to avoid denial-of-service attacks.


Packing objects
~~~~~~~~~~~~~~~

`delta`
^^^^^^^

Delta compression will not be attempted for blobs for paths with the
attribute `delta` set to false.


Viewing files in GUI tools
~~~~~~~~~~~~~~~~~~~~~~~~~~

`encoding`
^^^^^^^^^^

The value of this attribute specifies the character encoding that should
be used by GUI tools (e.g. linkgit:gitk[1] and linkgit:git-gui[1]) to
display the contents of the relevant file. Note that due to performance
considerations linkgit:gitk[1] does not use this attribute unless you
manually enable per-file encodings in its options.

If this attribute is not set or has an invalid value, the value of the
`gui.encoding` configuration variable is used instead
(See linkgit:git-config[1]).


USING MACRO ATTRIBUTES
----------------------

You do not want any end-of-line conversions applied to, nor textual diffs
produced for, any binary file you track.
You would need to specify e.g.

------------
*.jpg -text -diff
------------

but that may become cumbersome, when you have many attributes. Using
macro attributes, you can define an attribute that, when set, also
sets or unsets a number of other attributes at the same time. The
system knows a built-in macro attribute, `binary`:

------------
*.jpg binary
------------

Setting the "binary" attribute also unsets the "text" and "diff"
attributes as above. Note that macro attributes can only be "Set",
though setting one might have the effect of setting or unsetting other
attributes or even returning other attributes to the "Unspecified"
state.


DEFINING MACRO ATTRIBUTES
-------------------------

Custom macro attributes can be defined only in top-level gitattributes
files (`$GIT_DIR/info/attributes`, the `.gitattributes` file at the
top level of the working tree, or the global or system-wide
gitattributes files), not in `.gitattributes` files in working tree
subdirectories. The built-in macro attribute "binary" is equivalent
to:

------------
[attr]binary -diff -merge -text
------------

NOTES
-----

Git does not follow symbolic links when accessing a `.gitattributes`
file in the working tree. This keeps behavior consistent when the file
is accessed from the index or a tree versus from the filesystem.

EXAMPLES
--------

If you have these three `gitattributes` files:

----------------------------------------------------------------
(in $GIT_DIR/info/attributes)

a*	foo !bar -baz

(in .gitattributes)
abc	foo bar baz

(in t/.gitattributes)
ab*	merge=filfre
abc	-foo -bar
*.c	frotz
----------------------------------------------------------------

the attributes given to path `t/abc` are computed as follows:

1. By examining `t/.gitattributes` (which is in the same
   directory as the path in question), Git finds that the first
   line matches. `merge` attribute is set. It also finds that
   the second line matches, and attributes `foo` and `bar`
   are unset.

2. Then it examines `.gitattributes` (which is in the parent
   directory), and finds that the first line matches, but
   `t/.gitattributes` file already decided how `merge`, `foo`
   and `bar` attributes should be given to this path, so it
   leaves `foo` and `bar` unset. Attribute `baz` is set.

3. Finally it examines `$GIT_DIR/info/attributes`. This file
   is used to override the in-tree settings. The first line is
   a match, and `foo` is set, `bar` is reverted to unspecified
   state, and `baz` is unset.

As the result, the attributes assignment to `t/abc` becomes:

----------------------------------------------------------------
foo	set to true
bar	unspecified
baz	set to false
merge	set to string value "filfre"
frotz	unspecified
----------------------------------------------------------------


SEE ALSO
--------
linkgit:git-check-attr[1].

GIT
---
Part of the linkgit:git[1] suite