How did `git pull` eat my homework?

I feel like a kid in the principal’s office explaining that the dog ate my homework the night before it was due, but I’m staring some crazy data loss bug in the face and I can’t figure out how it happened. I would like to know how git could eat my repository whole! I’ve put git through the wringer many times and it’s never blinked. I’ve used it to split a 20 Gig Subversion repo into 27 git repos and filter-branched the foo out of them to untangle the mess and it’s never lost a byte on me. The reflog is always there to fall back on. This time the carpet is gone!

From my perspective, all I did was run git pull, and it nuked my entire local repository. I don’t mean it “messed up the checked out version” or “the branch I was on” or anything like that. I mean the entire thing is gone.

Here is a screen-shot of my terminal at the incident:

[Image: incident screen shot]

Let me walk you through that. My command prompt includes data about the current git repo (using prezto’s vcs_info implementation) so you can see when the git repo disappeared. The first command is normal enough:

  » caleb » jaguar » ~/p/w/incil.info » ◼  zend ★ »
❯❯❯ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.

There you can see I was on the ‘zend’ branch, and checked out master. So far so good. You’ll see in the prompt before my next command that it successfully switched branches:

  » caleb » jaguar » ~/p/w/incil.info » ◼  master ★ »
❯❯❯ git pull
remote: Counting objects: 37, done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 37 (delta 25), reused 0 (delta 0)
Unpacking objects: 100% (37/37), done.
From gitlab.alerque.com:ipk/incil.info
 + 7412a21...eca4d26 master     -> origin/master  (forced update)
   f03fa5d..c8ea00b  devel      -> origin/devel
 + 2af282c...009b8ec verse-spinner -> origin/verse-spinner  (forced update)
First, rewinding head to replay your work on top of it...
>>> elapsed time 11s

And just like that it’s gone. The elapsed time marker outputs before the next prompt if more than 10 seconds have elapsed. Git did not give any output beyond the notice that it was rewinding to replay. No indication that it finished.

The next prompt includes no data about what branch we are on or the state of git.

Not noticing it had failed, I obliviously tried to run another git command, only to be told I wasn’t in a git repo. Note the PWD has not changed:

  » caleb » jaguar » ~/p/w/incil.info »
❯❯❯ git fetch --all
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

After this a look around showed that I was in a completely empty directory. Nothing. No ‘.git’ directory, nothing. Empty.

My local git is at version 2.0.2. Here are a couple tidbits from my git config that might be relevant to making out what happened:

[branch]
        autosetuprebase = always
        rebase = preserve
[pull]
        rebase = true
[rebase]
        autosquash = true
        autostash = true
[alias]
        co = checkout

For example I have git pull set to always do a rebase instead of a merge, so that part of the output above is normal.

I can recover the data. I don’t think there were any git objects other than some unimportant stashes that hadn’t been pushed to other repos, but I’d like to know what happened.

I have checked for:

  • Messages in dmesg or the systemd journal. Nothing even remotely relevant.
  • There is no indication of drive or file system failure (LVM + LUKS + EXT4 all look normal). There is nothing in lost+found.
  • I didn’t run anything else. There is nothing in the history I’m not showing above, and no other terminals were used during this time. There are no rm commands floating around that might have executed in the wrong CWD, etc.
  • Poking at another git repo in another directory shows no apparent abnormality executing git pulls.

What else should I be looking for here?

Asked By: Caleb

||

Looks like someone ran git push --force on this repo, and you pulled down those changes. Try cloning the repo fresh; that should get you back into a clean working state again.

Answered By: conorsch

With luck, you can fix this with the following command:

git reset --hard ORIG_HEAD  

Before potentially dangerous operations such as a merge, rebase or reset, git records the previous tip of HEAD in ORIG_HEAD. With it you can undo a merge or rebase.
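
As a rough illustration of that mechanism, here is a minimal sketch that builds a throwaway repository, moves the branch back with a reset, and then returns to the earlier commit through ORIG_HEAD. The function name orig_head_demo and the placeholder identity are made up for this example. ORIG_HEAD is stored inside the .git directory, so this only helps while that directory still exists.

```shell
#!/bin/sh
# Sketch: show that ORIG_HEAD remembers the tip HEAD had before a reset.
# Runs in a throwaway repository; the identity below is a placeholder.
orig_head_demo() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q . || exit 1
        for msg in first second; do
            git -c user.name=demo -c user.email=demo@example.com \
                commit -q --allow-empty -m "$msg" || exit 1
        done
        before=$(git rev-parse HEAD)
        # Move the branch back one commit; git saves the old tip in ORIG_HEAD.
        git reset -q --hard HEAD~1 || exit 1
        [ "$(git rev-parse ORIG_HEAD)" = "$before" ] || exit 1
        # Undo the reset by going back to the saved tip.
        git reset -q --hard ORIG_HEAD || exit 1
        [ "$(git rev-parse HEAD)" = "$before" ]
    )
    status=$?
    rm -rf -- "$repo"
    return "$status"
}
```

Calling orig_head_demo returns 0 when HEAD ends up back on the commit it pointed to before the reset.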

Git Manual: Undoing a Merge

Answered By: Routhinator

Possibly by failing to define the file path to be deleted.

Your case reminds me of one beautiful day when my homemade remove(path) method tried to remove the root folder, because the given parameter was an empty string, which the OS corrected (!) to the root folder.

This may be a similar git bug, along these lines:

  1. The rebase command wanted to delete a file with something like remove(project_folder + file_path) (pseudo code).
  2. Somehow file_path was empty at the time.
  3. The command evaluated to something like remove(project_folder).
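
The failure mode in the steps above, and one way to guard against it, can be sketched in shell. This is only an illustration of the general pattern, not git's actual code; safe_remove and its two parameters are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: remove a path relative to a project folder,
# refusing to run when either part is empty. Without the check, an
# empty file_path would turn the target into the project folder itself.
safe_remove() {
    project_folder=$1
    file_path=$2
    if [ -z "$project_folder" ] || [ -z "$file_path" ]; then
        echo "refusing to remove: empty path component" >&2
        return 1
    fi
    rm -rf -- "$project_folder/$file_path"
}
```

With this guard, safe_remove "$project_folder" "" fails with a message instead of deleting the whole folder.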

Answered By: maliayas

Yes, git ate my homework. All of it.

I made a dd image of this disk after the incident and messed around with it later. Reconstructing the series of events from system logs, I deduce what happened was something like this:

  1. A system update command (pacman -Syu) had been issued days before this incident.
  2. An extended network outage meant that it was left re-trying to download packages. Frustrated at the lack of internet, I’d put the system to sleep and gone to bed.
  3. Days later the system was woken up and it started finding and downloading packages again.
  4. Package download finished sometime just before I happened to be messing around with this repository.
  5. The system glibc installation got updated after the git checkout and before the git pull.
  6. The git binary got replaced after the git pull started and before it finished.
  7. And on the seventh day, git rested from all its labors. And deleted the world so everybody else had to rest too.

I don’t know exactly what race condition made this happen, but swapping out binaries in the middle of an operation is certainly neither nice nor a testable, repeatable condition. Usually a copy of a running binary is kept in memory, but git is weird, and I’m sure something about the way it re-spawns versions of itself led to this mess. Obviously it should have died rather than destroying everything, but that’s what happened.
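
For anyone trying to rebuild a similar timeline, here is a minimal sketch of one check: listing when a given package was upgraded according to the package manager's log. It assumes a pacman-style log whose entries contain "upgraded <package> (old -> new)" and that it lives at /var/log/pacman.log; both are assumptions about the system, and package_upgrades is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: list the "upgraded" entries for one package in a
# pacman-style log. The default log path is an assumption about the system.
package_upgrades() {
    pkg=$1
    log=${2:-/var/log/pacman.log}
    # Fixed-string match on "] upgraded <pkg> (" so that "git" does not
    # also match packages whose names merely start with "git".
    grep -F -- "] upgraded $pkg (" "$log"
}
```

Running package_upgrades git and package_upgrades glibc would print the timestamped upgrade lines, which can then be compared against the times of the git commands.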

Answered By: Caleb