If you’ve ever worked on a project that uses code generation, you know the frustration. You regenerate your API client because a single endpoint changed, and the resulting diff has dozens of lines that don’t matter – timestamps, version strings, generator metadata – burying the actual changes you care about. Reviewing that pull request means mentally filtering out the noise to find what’s really different.
What if Git could do that filtering for you?
It turns out Git has a little-known feature called textconv that does exactly this. It lets you run a command to transform a file’s content before Git computes the diff, so the noise never appears in the output. The underlying files remain untouched – you’re only changing what Git shows you when comparing versions.
The noisy diff
Let’s make this concrete. Suppose your team uses OpenAPI Generator to produce Java client code from an API specification. Every generated class includes an @Generated annotation stamped with the current timestamp:

    @javax.annotation.Generated(
        value = "org.openapitools.codegen.languages.JavaClientCodegen",
        date = "2026-05-20T14:32:07.123Z[Etc/UTC]")
    public class PetApi {

        @javax.annotation.Generated(
            value = "org.openapitools.codegen.languages.JavaClientCodegen",
            date = "2026-05-20T14:32:07.123Z[Etc/UTC]")
        public static class Builder {
            // ...
        }
    }

The next time someone regenerates the client – even without spec changes – the date field updates on every single annotation. A file with five or six nested classes produces five or six diff lines of pure noise per file, mixed in with any real changes. Multiply that across a generated client with dozens of files, and code review becomes an exercise in patience as you deal with diffs that are expected but ignorable.

    -@javax.annotation.Generated(
    -    value = "org.openapitools.codegen.languages.JavaClientCodegen",
    -    date = "2026-05-20T14:32:07.123Z[Etc/UTC]")
    +@javax.annotation.Generated(
    +    value = "org.openapitools.codegen.languages.JavaClientCodegen",
    +    date = "2026-05-28T09:15:32.441Z[Etc/UTC]")
     public class PetApi {

You don’t want to remove the annotations from the code (they’re useful metadata). You just don’t want to see them in your diffs.
The diff pipeline
Git’s diff pipeline is more flexible than most developers realize. When you run git diff, Git doesn’t necessarily compare raw file bytes. It supports pluggable diff drivers – named configurations that tell Git how to produce diffs for specific file types. Interestingly, this feature was originally designed for binary files, letting you see meaningful diffs of things like image metadata or word processor documents from the Git CLI. The same mechanism works equally well as a filter for noisy text files, which is exactly what you’re going to take advantage of here.
The textconv option within a diff driver specifies a command that Git runs on each version of a file before computing the diff. It captures and diffs the output of that command instead of the original file content. This is fundamentally different from filter drivers, which use a clean filter to transform content at staging time (changing what Git stores) and a smudge filter to transform content at checkout (changing what appears in your working tree). With textconv, Git never modifies the stored content or your working tree files – only the diff view changes.
Setting this up requires two pieces of configuration: defining a diff driver that specifies the transformation command, and then assigning that driver to the file patterns you care about.
Define the diff driver
You define a diff driver in your Git configuration. As with Git Large File Storage (LFS) and other drivers, that means the definition must live in your system or user (global) .gitconfig, or in your repository’s .git/config.
To strip the @Generated lines from the diff output, you can use sed to delete any line matching the annotation pattern. Here’s what the configuration looks like:
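A minimal sketch of such an entry, assuming the driver name nogen and the same sed expression as the git config command further down. The backslashes are doubled here because Git's config file syntax treats a single backslash as the start of an escape sequence, so `\\.` in the file reaches sed as `\.`:

```ini
[diff "nogen"]
    textconv = sed '/@javax\\.annotation\\.Generated(/,/)/d'
```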
The name nogen is arbitrary – pick something meaningful to your team. Every driver must have a unique name so that you can invoke it. The sed expression deletes any block of lines from @javax.annotation.Generated( through the first line that contains ). For those not familiar with sed, this syntax searches using a range pattern /start-pattern/,/end-pattern/. All lines in the range are deleted (d). Since the . means “any character”, best practice is to escape it (\.). This works reliably for OpenAPI Generator’s output format, but keep in mind that the end pattern matches the first line containing any ) – so if a field value inside the annotation contained ), the range would end too early.
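To see the range deletion on its own, outside of Git, the sed expression can be run against a small sample. This is only a sketch: the sample text mirrors the PetApi listing above, and the `comments` value in the second sample is a made-up input chosen to trigger the early-termination caveat.

```shell
# Sample mirroring the PetApi listing (illustrative content only).
sample='@javax.annotation.Generated(
    value = "org.openapitools.codegen.languages.JavaClientCodegen",
    date = "2026-05-20T14:32:07.123Z[Etc/UTC]")
public class PetApi {
    // ...
}'

# Delete each block from the annotation's opening line through the
# first following line that contains ")".
filtered=$(printf '%s\n' "$sample" | sed '/@javax\.annotation\.Generated(/,/)/d')
printf '%s\n' "$filtered"

# Caveat: a hypothetical field value containing ")" ends the range early,
# so the date line is left behind.
tricky='@javax.annotation.Generated(
    comments = "build (nightly)",
    date = "2026-05-20T14:32:07.123Z[Etc/UTC]")
public class PetApi {}'

leftover=$(printf '%s\n' "$tricky" | sed '/@javax\.annotation\.Generated(/,/)/d')
printf '%s\n' "$leftover"
```

The first command prints only the three lines of the class itself. The second still prints the `date = ...` line, because the range already closed on the `comments` line.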
With this configured, Git computes a diff for any file assigned to the nogen driver by writing each version to a temporary file and passing its path as an argument to the command.
You can also create this entry from the command line. Just remember to drop --global if you only want to scope this to the current repository (instead of the current user):

    git config --global diff.nogen.textconv "sed '/@javax\.annotation\.Generated(/,/)/d'"

If you prefer PowerShell, add the following to your .gitconfig directly – the nested quoting makes the git config command-line form impractical:
It’s important to know that the driver definition will always live outside of source control – .gitconfig and .git/config are not tracked by Git and cannot be committed as part of your repository. If you need the driver to apply consistently across your team, a global or system-level driver definition on each machine is your best option.
Assign the driver to files
Now you need to tell Git which files should use this driver. You do this through .gitattributes, a file that associates behaviors with files using path-based patterns. You can learn more about the options for configuring this file in my post Creating a .gitattributes Without Committing.
To use the driver, add a mapping like this to your .gitattributes:

    src/generated/**/*.java diff=nogen

This mapping assigns the nogen diff driver to all Java files under the src/generated/ directory. Those files will now use the nogen driver, while all of the other files will be unaffected. For those not yet comfortable with the syntax, ** matches text that includes directory separators, while * matches any text except directory separators.
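One way to confirm that a pattern matches the files you expect is git check-attr, which reports the attribute value Git resolves for a path. The sketch below uses a throwaway repository, and both file paths are hypothetical examples; the files do not need to exist, because only the paths are matched.

```shell
# Throwaway repository so the check has a .gitattributes to read.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'src/generated/**/*.java diff=nogen\n' > .gitattributes

# Ask Git which value of the "diff" attribute applies to each path.
attrs=$(git check-attr diff -- src/generated/api/PetApi.java src/main/App.java)
printf '%s\n' "$attrs"
```

The path under src/generated/ should be reported with the value nogen, while the other path should come back as unspecified.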
Seeing the results
With both pieces in place, running git diff on a regenerated file now skips the annotation noise entirely. The timestamp-only differences in the generated code no longer appear; only the actual API changes do. The annotations remain in the file exactly as before – they just don’t clutter your local diffs anymore.
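The whole setup can be tried end to end in a scratch repository. The following is a sketch under a few assumptions: the driver is configured locally rather than globally, `write_class` is a hypothetical helper that stands in for the code generator, and the method names are invented. The `--no-textconv` option of git diff turns the filter off, which makes the before-and-after comparison easy.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config diff.nogen.textconv "sed '/@javax\.annotation\.Generated(/,/)/d'"
printf 'src/generated/**/*.java diff=nogen\n' > .gitattributes
mkdir -p src/generated

# Hypothetical stand-in for the generator: writes a tiny class with the
# given timestamp ($1) and method name ($2).
write_class() {
  cat > src/generated/PetApi.java <<EOF
@javax.annotation.Generated(
    value = "org.openapitools.codegen.languages.JavaClientCodegen",
    date = "$1")
public class PetApi {
    void $2() {}
}
EOF
}

write_class "2026-05-20T14:32:07.123Z[Etc/UTC]" getPet
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Generate client"

# "Regenerate": a new timestamp plus one real change.
write_class "2026-05-28T09:15:32.441Z[Etc/UTC]" getPetById

filtered=$(git diff)              # textconv applied (the default for git diff)
raw=$(git diff --no-textconv)     # unfiltered, for comparison
printf '%s\n' "$filtered"
```

The filtered diff should show only the renamed method, while the raw diff also includes the changed date line.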
The benefits also apply to git log -p, git show, and any other command that uses Git’s diff machinery. Other tools, however, may behave differently.
Visual Studio Code
If you use VS Code, its built-in diff view shows changes differently depending on the state of the changes in Git:
- Unstaged changes: VS Code reads the modified file directly from disk – no Git diff involved. Since textconv is part of Git’s diff pipeline and not its file-reading pipeline, the filter never runs. You’ll see the raw, unfiltered diff.
- Staged changes: Both sides are stored Git objects, so VS Code fetches them using git diff. This goes through Git’s full diff machinery, and Git applies textconv to both stored file versions before computing the diff. The filtered result appears as expected.
The practical takeaway: textconv filtering works reliably from the command line or in VS Code once changes are staged.
Source control systems
Since the driver definition lives outside source control, tools like GitHub and Azure DevOps will still show the full diff when you’re viewing a pull request or comparing files in the web interface. The filtering only works in environments where you can configure how Git handles the files.
When to use this technique
The textconv approach works best when you have lines that change frequently but carry no meaningful information for local code reviews. Generated timestamps are the classic case, but other scenarios include:
- Build-inserted version strings or hash comments
- Auto-formatted headers that a tool rewrites on every run
- Machine-generated copyright year updates
This technique can also be used with AI tools. By pre-configuring the environment (for example, defining a copilot-setup-steps.yml for Copilot cloud agent), you can eliminate content from a diff that won’t be meaningful. This can save tokens and prevent the runner from seeing differences that are not actually relevant to the problem it’s trying to solve.
A small bit of Git configuration can save a surprising amount of time – and for AI tools, it also saves attention.
