

The ext3cow file system is a time-shifting file system implemented with copy-on-write (COW) and based on the very popular ext2/ext3 file system. Phew! Now what does all that mean? Basically, it's a functionally enhanced version of ext3 that allows one to take a snapshot of one's file system, with one-second granularity, freezing the way the file system looked at a given point in time. One may then travel back through time, if you will, and be presented with a read-only image of the file system. No special mounting, or '.' directories. Just type a file or directory name followed by the special '@' character and a time. For example, consider two cat commands: the first on foo.txt followed by '@' and a time, the second on plain foo.txt.

The second cat reads the current version of the file, while the first cat reads the file foo.txt as it appeared at Thu Jul 10 09:58:04 2003. The syntax for the time variable is the number of seconds elapsed since the Epoch (00:00:00 UTC, January 1, 1970). This may seem a little weird to use, but we believe any fancier date syntax should be parsed by a shell or shell macro and not by the file system. This is on our To Do list.
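Until a shell macro exists, the date parsing can be done by a small user-level helper. The sketch below is only an illustration: the function names to_epoch and versioned_name are hypothetical, and the UTC-4 offset is an assumption, since no time zone is given for the example timestamp above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the example timestamp is in a UTC-4 zone (e.g. US Eastern
# Daylight Time). The text above does not state a time zone.
ASSUMED_TZ = timezone(timedelta(hours=-4))


def to_epoch(text, tz=ASSUMED_TZ):
    """Parse a 'Thu Jul 10 09:58:04 2003' style string into Epoch seconds."""
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S %Y")
    return int(parsed.replace(tzinfo=tz).timestamp())


def versioned_name(name, epoch_seconds):
    """Build the 'name@seconds' form described above."""
    return f"{name}@{epoch_seconds}"


if __name__ == "__main__":
    seconds = to_epoch("Thu Jul 10 09:58:04 2003")
    print(versioned_name("foo.txt", seconds))
```

Under the UTC-4 assumption this prints foo.txt@1057845484, which is the kind of name one would hand to cat. A different time zone shifts the number by the corresponding offset.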

Of course, this works for more than just text files. Images, databases, and binaries: ext3cow does it all.

There are quite a few advantages to a system like this, including data availability and reliability, a consistent image for backup, a checkpoint for restoring the state of a file system, a source for data mining, and a way to provide tamper-resistant storage.

So, what are the trade-offs for such a system? For every version of a file that exists, an inode must exist to reference it. Therefore, there's a slight increase in metadata overhead, of about 5%. Of course, this percentage varies with snapshot frequency and the number of files modified between snapshots. If you never take a snapshot, then there's no increase in metadata. Further, no data is ever thrown away, so there's higher data block usage. Results on the average increase in data block usage have yet to be reported, but because of the copy-on-write policy, only the blocks that have changed between snapshots are written to disk. In this way, versions of the same file may share data blocks between snapshots, minimizing the data footprint.
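To make the block-sharing idea concrete, here is a toy in-memory model. It is not ext3cow's on-disk format or kernel code: the class ToyCowFile and all of its methods are hypothetical, and snapshots are looked up by exact time only, which is a simplification.

```python
class ToyCowFile:
    """Toy copy-on-write file: each version has its own block list (standing
    in for an inode), while unchanged data blocks are shared."""

    def __init__(self, data_blocks):
        self.store = {}        # block number -> bytes (the "disk")
        self.next_block = 0
        self.current = [self._alloc(d) for d in data_blocks]
        self.snapshots = {}    # epoch seconds -> frozen tuple of block numbers

    def _alloc(self, data):
        number = self.next_block
        self.store[number] = data
        self.next_block += 1
        return number

    def snapshot(self, epoch_seconds):
        # Only the list of block numbers is copied, never the data itself.
        self.snapshots[epoch_seconds] = tuple(self.current)

    def write_block(self, index, data):
        old = self.current[index]
        if any(old in blocks for blocks in self.snapshots.values()):
            # A snapshot still references the old block: write a new one.
            self.current[index] = self._alloc(data)
        else:
            # No snapshot holds this block, so overwrite it in place.
            self.store[old] = data

    def read(self, epoch_seconds=None):
        blocks = self.current if epoch_seconds is None else self.snapshots[epoch_seconds]
        return b"".join(self.store[n] for n in blocks)

    def blocks_used(self):
        return len(self.store)


if __name__ == "__main__":
    f = ToyCowFile([b"AAAA", b"BBBB", b"CCCC"])
    f.snapshot(100)
    f.write_block(1, b"bbbb")
    print(f.read(100), f.read(), f.blocks_used())
```

In this model a three-block file with one modified block after a snapshot occupies four blocks rather than six, and a file that is never snapshotted never grows beyond its original blocks.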

Other features include being able to change into a directory in the past and read its contents as they appeared at a point in time, making symbolic links to files in the past, and the ability to diff versions of a file.
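Because a past version is just another path name, diffing needs no special tool. As a rough sketch, the hypothetical helper diff_with_past below compares name@time against the current file using Python's standard difflib. It assumes the file is text and that the given time resolves to a readable version.

```python
import difflib
from pathlib import Path


def diff_with_past(path, epoch_seconds):
    """Return a unified diff between path@epoch_seconds and the current path."""
    past_name = f"{path}@{epoch_seconds}"
    old_lines = Path(past_name).read_text().splitlines(keepends=True)
    new_lines = Path(path).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=past_name, tofile=str(path)
        )
    )
```

Calling diff_with_past("foo.txt", seconds) returns an empty string when the file has not changed since that time.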

 


You may download ext3cow and its related tools from the download page.

 


Right now, ext3cow is the research of young Zachary Peterson and his advisor Dr. Randal Burns. Both are members of the Hopkins Storage Systems Lab (HSSL) at the Johns Hopkins University. This material is based upon work supported by the National Science Foundation under Grant No. 0238305 and by the Department of Energy. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Further development on the e2fsprogs tools was done by Joe Herring. Version listing in 0.1.4 was done by Matthiew Jonglez.

NEW! Fellow JHU student, Sandeep Ranade, has developed a Time-Traveling File Manager for intuitively, efficiently and easily navigating and interfacing with the past.

If you'd like to join in the development, report bugs, suggest new and better features, provide feedback, ask about the design, or just see how Zach's doing, join the ext3cow mailing list! Go to http://hssl.cs.jhu.edu/mailman/listinfo/ext3cow-devel or email ext3cow-devel (at) hssl.cs.jhu.edu.

Publications

 


The following is a list of open issues in the file system and peripheral projects that still need to be done.
  • Update ext3cow for the 2.6 kernel.
  • Support for memory-mapped files.
  • CVS-like tagging and branching features. (Or is this part of ttcsh?)

 


  • The ext2/3 file system.
    • Design and implementation of the second extended file system. Remy Card and Theodore Y. Ts'o and Stephen Tweedie. The 1994 Amsterdam Linux Conference, 1994.
    • Planned Extensions to the Linux Ext2/Ext3 Filesystem. Theodore Y. Ts'o and Stephen Tweedie. The 2002 USENIX Annual Technical Conference, Monterey, CA, June 2002.

  • CVFS.
    • Metadata Efficiency in a Comprehensive Versioning File System. Craig A. N. Soules, Garth R. Goodson, John D. Strunk, Gregory R. Ganger. 2nd USENIX Conference on File and Storage Technologies, San Francisco, CA, Mar 31 - Apr 2, 2003. Also available as CMU SCS Technical Report CMU-CS-02-145, May 2002.

  • Wayback
    • Wayback: A User-level Versioning File System for Linux. Brian Cornell, Peter A. Dinda and Fabian E. Bustamante. The 2004 USENIX Annual Technical Conference, FREENIX track, June, 2004.

  • The Elephant file system.
    • Deciding when to forget in the Elephant file system. Douglas S. Santry, Michael J. Feeley, Norman C. Hutchinson, Alistair C. Veitch, Ross W. Carton and Jacob Ofir. The 17th ACM Symposium on Operating Systems Principles (SOSP), December 1999.

  • SnapFS.

  • The Venti file system.
    • Venti: a new approach to archival storage. Sean Quinlan and Sean Dorward. First USENIX conference on File and Storage Technologies, Monterey, CA, January 2002.

 

 

 

Last updated: 5/17/05