52

Can anyone show me the correct way to compress and decompress tar.gz files in Java? I've been searching, but the most I can find is either zip or gzip (alone).

kdgwill
  • 2,081
  • 3
  • 27
  • 46
  • 4
    tgz files aren't anything special -- you un-gzip it first, then un-tar it. – Chris Eberle Aug 19 '11 at 22:47
  • related: [How to print the content of a tar.gz file with Java?](http://stackoverflow.com/questions/5094074/how-to-print-the-content-of-a-tar-gz-file-with-java) – David Cary Aug 24 '11 at 15:50
  • See also http://stackoverflow.com/questions/315618/how-do-i-extract-a-tar-file-in-java – Vadzim Oct 28 '16 at 16:19
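As the first comment above says, a .tar.gz is a tar archive wrapped in gzip. Here is a minimal JDK-only sketch of that two-step idea: `GZIPInputStream` strips the gzip layer, and a loop then walks the 512-byte tar headers to list the entry names. The names `TarGzLister` and `listEntries` are made up for illustration. The parser assumes plain ustar-style headers and ignores GNU long names, pax extended headers, the `prefix` field, base-256 sizes, and checksums, so it is a sketch of the format rather than a replacement for a real tar library.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

public class TarGzLister {
    // Tar headers and entry data are organised in 512-byte blocks.
    private static final int BLOCK = 512;

    /** Lists entry names in a .tar.gz stream: un-gzip first, then walk the tar headers. */
    public static List<String> listEntries(InputStream tarGz) throws IOException {
        List<String> names = new ArrayList<>();
        // Step 1: GZIPInputStream removes the gzip layer, leaving a plain tar stream.
        try (DataInputStream tar = new DataInputStream(new GZIPInputStream(tarGz))) {
            byte[] header = new byte[BLOCK];
            while (true) {
                try {
                    tar.readFully(header);
                } catch (EOFException e) {
                    break; // stream ended without an end-of-archive marker
                }
                if (header[0] == 0) {
                    break; // a zeroed header block marks the end of the archive
                }
                // Step 2: the name is in bytes 0..99, the size is octal text in bytes 124..135.
                names.add(field(header, 0, 100));
                String sizeText = field(header, 124, 12).trim();
                long size = sizeText.isEmpty() ? 0 : Long.parseLong(sizeText, 8);
                // Entry data is padded up to a whole number of blocks; skip over it.
                long toSkip = (size + BLOCK - 1) / BLOCK * BLOCK;
                while (toSkip > 0) {
                    int skipped = tar.skipBytes((int) Math.min(toSkip, Integer.MAX_VALUE));
                    if (skipped <= 0) {
                        throw new EOFException("Unexpected end of tar data");
                    }
                    toSkip -= skipped;
                }
            }
        }
        return names;
    }

    /** Reads a NUL-terminated text field from the header. */
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.UTF_8);
    }
}
```

Calling `TarGzLister.listEntries(new FileInputStream("archive.tar.gz"))` would return the entry names in archive order. Extracting the data instead of skipping it follows the same loop, but the libraries in the answers below handle the format variants this sketch leaves out.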

7 Answers

42

I've written a wrapper for commons-compress called jarchivelib that makes it easy to extract or compress from and into File objects.

Example code would look like this:

File archive = new File("/home/thrau/archive.tar.gz");
File destination = new File("/home/thrau/archive/");

Archiver archiver = ArchiverFactory.createArchiver("tar", "gz");
archiver.extract(archive, destination);
thrau
  • 2,610
  • 3
  • 23
  • 31
33

My favorite is plexus-archiver - see sources on GitHub.

Another option is Apache commons-compress - (see mvnrepository).

With plexus-archiver, the code for unarchiving looks like this:

final TarGZipUnArchiver ua = new TarGZipUnArchiver();
// Logging: as @Akom noted, logging is mandatory in newer versions, so you can configure it with code like this:
ConsoleLoggerManager manager = new ConsoleLoggerManager();
manager.initialize();
ua.enableLogging(manager.getLoggerForComponent("bla"));
// -- end of logging part
ua.setSourceFile(sourceFile);
destDir.mkdirs();
ua.setDestDirectory(destDir);
ua.extract();

Similar *Archiver classes exist for creating archives.

With Maven, you can use this dependency:

<dependency>
  <groupId>org.codehaus.plexus</groupId>
  <artifactId>plexus-archiver</artifactId>
  <version>2.2</version>
</dependency>
PHPirate
  • 5,958
  • 6
  • 46
  • 72
Petr Kozelka
  • 7,298
  • 1
  • 28
  • 41
17

To extract the contents of a .tar.gz archive, I have successfully used Apache commons-compress ('org.apache.commons:commons-compress:1.12'). Take a look at this example method:

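import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

// Assumed value; the original snippet did not show how BUFFER_SIZE was defined.
private static final int BUFFER_SIZE = 8192;

public void extractTarGZ(InputStream in) throws IOException {
    // Entries are written relative to the current working directory, using their names as-is.
    try (GzipCompressorInputStream gzipIn = new GzipCompressorInputStream(in);
         TarArchiveInputStream tarIn = new TarArchiveInputStream(gzipIn)) {
        TarArchiveEntry entry;

        while ((entry = (TarArchiveEntry) tarIn.getNextEntry()) != null) {
            // If the entry is a directory, create the directory.
            if (entry.isDirectory()) {
                File f = new File(entry.getName());
                boolean created = f.mkdir();
                if (!created) {
                    System.out.printf("Unable to create directory '%s', during extraction of archive contents.\n",
                            f.getAbsolutePath());
                }
            } else {
                // Otherwise copy the entry's bytes from the tar stream into a file.
                int count;
                byte[] data = new byte[BUFFER_SIZE];
                FileOutputStream fos = new FileOutputStream(entry.getName(), false);
                try (BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER_SIZE)) {
                    while ((count = tarIn.read(data, 0, BUFFER_SIZE)) != -1) {
                        dest.write(data, 0, count);
                    }
                }
            }
        }

        System.out.println("Untar completed successfully!");
    }
}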
RemusS
  • 1,225
  • 11
  • 9
  • 1
    Since you are using the try-with-resources syntax, you shouldn't need `dest.close();` and `tarIn.close();` – FGreg Mar 29 '17 at 20:57
  • 1
    Warning: This is unsafe due to ZipSlip, do not use this code in production software. Specifically f.mkdir() is not safe to call in the blind: https://snyk.io/research/zip-slip-vulnerability – sichinumi Mar 30 '22 at 22:23
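To illustrate the Zip Slip warning in the comment above: the risk is that an entry name such as `../../etc/cron.d/job` resolves to a path outside the intended output directory. A common mitigation, sketched here with JDK classes only, is to resolve each entry name against the target directory, normalize it, and reject anything that no longer starts with that directory. The names `SafeEntryPaths` and `resolveInside` are hypothetical helpers, not part of commons-compress, and the check does not cover symbolic links already present in the target directory.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SafeEntryPaths {
    /**
     * Resolves an archive entry name inside destDir and rejects names
     * (such as "../x" or absolute paths) that would escape it.
     */
    public static Path resolveInside(Path destDir, String entryName) throws IOException {
        Path base = destDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the containment check.
        Path resolved = base.resolve(entryName).normalize();
        // Path.startsWith compares whole path components, not raw string prefixes.
        if (!resolved.startsWith(base)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        Path dest = Paths.get("extracted");
        System.out.println(resolveInside(dest, "docs/readme.txt"));
        try {
            resolveInside(dest, "../../etc/passwd");
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In an extraction loop like the one in this answer, the idea would be to build each output file from `resolveInside(destDir, entry.getName()).toFile()` instead of `new File(entry.getName())`, for both the directory and the regular-file branches.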
7

In my experience, Apache Commons Compress is much more mature than Plexus Archiver, specifically because of issues like http://jira.codehaus.org/browse/PLXCOMP-131.

I believe Apache Commons Compress is more actively maintained as well.

Gili
  • 81,444
  • 90
  • 364
  • 657
  • Apache Compress cannot extract some tar.gz archives because of a lack of support. This bug has never been resolved : https://www.jfrog.com/jira/browse/HAP-651 – didil Oct 28 '16 at 13:42
  • 4
    @didile how do you expect this to get fixed if the bug was reported to jfrog instead of apache compress? – Gili Oct 28 '16 at 13:47
  • It has been also reported to apache issue tracker. – didil Oct 28 '16 at 13:58
  • 4
    @didile please provide a link. – Gili Oct 28 '16 at 14:19
  • 2
    @didile I don't see any bug reported to https://issues.apache.org/jira/browse/COMPRESS that would match HAP-651. It would be great if you could open one and attach a tar where Compress fails. – Stefan Bodewig Nov 22 '16 at 19:45
2

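If you are planning to compress/decompress on Linux, you can call the system's tar command to do that for you:

// Assumes tarGzPathLocation and target are path strings and that tar is on the PATH.
Files.createDirectories(Paths.get(target));
ProcessBuilder builder = new ProcessBuilder();
// Pass each argument separately so that paths containing spaces or shell metacharacters are not reinterpreted.
builder.command("tar", "-xzf", tarGzPathLocation, "-C", target);
builder.directory(new File("/tmp"));
// Forward tar's output to this process so a full pipe buffer cannot block it.
builder.inheritIO();
Process process = builder.start();
int exitCode = process.waitFor();
// assert is disabled by default at runtime, so check the exit code explicitly.
if (exitCode != 0) {
    throw new IOException("tar exited with code " + exitCode);
}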
Dherik
  • 15,440
  • 11
  • 108
  • 148
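The answer above shows only extraction. For the compression direction, a similar sketch can call `tar -czf`. The class `TarCommand` and its methods are hypothetical names for illustration. The sketch assumes a `tar` binary on the PATH, that `sourceDir` is not a filesystem root, and it passes each argument separately rather than going through `sh -c`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class TarCommand {
    /** Builds the argument list for: tar -czf ARCHIVE -C PARENT_OF_SOURCE SOURCE_NAME */
    public static List<String> createCommand(Path archive, Path sourceDir) {
        Path source = sourceDir.toAbsolutePath().normalize();
        return Arrays.asList(
                "tar", "-czf", archive.toAbsolutePath().toString(),
                // -C switches directory first, so entries are stored relative to the parent.
                "-C", source.getParent().toString(),
                source.getFileName().toString());
    }

    /** Runs tar and fails with an IOException if it exits with a non-zero code. */
    public static void compress(Path archive, Path sourceDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(createCommand(archive, sourceDir))
                .inheritIO() // forward tar's output so a full pipe buffer cannot block it
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tar exited with code " + exitCode);
        }
    }

    public static void main(String[] args) throws Exception {
        // Example paths only; adjust to your environment.
        compress(Paths.get("/tmp/project.tar.gz"), Paths.get("/home/user/project"));
    }
}
```

Keeping the command construction in its own method makes it easy to log or unit-test the exact arguments without spawning a process.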
0

With TrueVFS, extracting a tar.gz archive is a one-liner:

new TFile("archive.tar.gz").cp_rp(new File("dest/folder"));

But beware of dependency issues.

Vadzim
  • 23,055
  • 10
  • 129
  • 146
-3

Using GZIPInputStream works for me; see https://www.mkyong.com/java/how-to-decompress-file-from-gzip-file/

zhaoyou
  • 312
  • 4
  • 18
  • 6
    Am I missing something? This will decompress the gzip file and leave you with the tar file. – JohnC Feb 21 '18 at 20:38
  • 1
    @JohnC This is usually what happens, and likely the reason for the downvote since the answer does not solve the problem. – toolforger Nov 08 '19 at 15:39