Daily Archive for May 2nd, 2007

Compression in Java Implementation

Wanting to add on-the-fly decompression and reading of zip/jar files to DictionaryForMids, I experimented with the sources of Classpath (a free GNU replacement for Sun’s proprietary core Java class libraries) and jazzlib (a pure Java implementation of the java.util.zip library).

Even after I got them to compile, some files would not pass the preverifier (for unknown reasons, but I guess it is some missing links or the file size), so the result couldn’t be packaged for mobile phones. [update] I managed to fix this problem by just looking at the errors when building the sources.

Frustrated, I looked at some other alternatives.

TAR - A container file with no compression.
GZip - There are a few other implementations out there.
ZLib - I could use the files from JZlib, a re-implementation of zlib in pure Java.
BZip - I couldn’t find an implementation, although there seemed to have been one some time back.

Ideally, to get real compression, files would be tarred first, then “hard” compressed at the next layer. Someone did a comparison of the different ways of compressing here. My tests show TAR with JZlib giving the best compression, but the usual archivers (e.g. WinRAR, TUZip) do not open the result. Tar with Gzip gives a good balance between storage size and performance.
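To see why tarring first and compressing afterwards helps, here is a rough sketch that compares gzipping many small files one by one against gzipping them as one joined stream. It uses only java.util.zip from desktop Java (7 or later), and the "dictionary files" are made-up sample data generated in memory, so treat the numbers as an illustration rather than a benchmark. A real tar also adds a header per file, which this sketch leaves out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class SolidVsPerFile {
    // Returns the gzip-compressed size of the given bytes.
    static int gzipSize(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(input);
        }
        return sink.size();
    }

    // Made-up stand-in for one small dictionary file (hypothetical content).
    static byte[] sampleFile(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20; i++) {
            sb.append("word").append(n * 20 + i)
              .append("\ttranslation").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Total size when every file is compressed on its own.
    static int perFileTotal(int fileCount) throws IOException {
        int total = 0;
        for (int n = 0; n < fileCount; n++) {
            total += gzipSize(sampleFile(n));
        }
        return total;
    }

    // Size when all files are joined first and compressed once,
    // which is roughly what tar followed by gzip does.
    static int solidTotal(int fileCount) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (int n = 0; n < fileCount; n++) {
            byte[] file = sampleFile(n);
            joined.write(file, 0, file.length);
        }
        return gzipSize(joined.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        System.out.println("per file: " + perFileTotal(30) + " bytes");
        System.out.println("joined:   " + solidTotal(30) + " bytes");
    }
}
```

The joined stream comes out smaller because the compressor can reuse repeated strings across file boundaries and pays the gzip header and trailer only once.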

Take a look at my test code for compressing files.

import java.io.*;
import java.util.zip.*;

public class Zip {
   static final int BUFFER = 2048;
   public static void main (String argv[]) {
      try {
         BufferedInputStream origin = null;
         FileOutputStream dest = new
           FileOutputStream("C:\\Users\\Zz85\\Desktop\\DictionaryProject\\test.zip");
         ZipOutputStream out = new ZipOutputStream(new
           BufferedOutputStream(dest));
         //out.setMethod(ZipOutputStream.DEFLATED);
         byte data[] = new byte[BUFFER];
         // get a list of files from current directory
         File f = new File("C:\\Users\\Zz85\\Desktop\\DictionaryProject\\Dictionary\\.");
         String files[] = f.list();

         for (int i=0; i < files.length; i++) {
            System.out.println("Adding: "+files[i]);
            FileInputStream fi = new
              FileInputStream(f.getParent() + "\\" + files[i]);
            origin = new
              BufferedInputStream(fi, BUFFER);
            ZipEntry entry = new ZipEntry(files[i]);
            out.putNextEntry(entry);
            int count;
            while((count = origin.read(data, 0,
              BUFFER)) != -1) {
               out.write(data, 0, count);
            }
            origin.close();
         }
         out.close();
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

This is the classic way of compressing a folder to a zip file (without its subdirectories, though that is doable). The other methods are almost the same.
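For the subdirectories, a minimal sketch could recurse into each folder and build the entry names from the relative path. This assumes desktop Java 7 or later (it uses try-with-resources, so it will not compile for the phone as is), and the class and method names are made up for illustration. The output zip should live outside the folder being zipped, otherwise it would try to add itself.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipTree {
    static final int BUFFER = 2048;

    // Zips everything under root into zipFile, keeping relative paths.
    public static void zipDirectory(File root, File zipFile) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(zipFile)))) {
            addFolder(root, "", out);
        }
    }

    // prefix is the path of dir inside the archive ("" for the root folder).
    private static void addFolder(File dir, String prefix, ZipOutputStream out)
            throws IOException {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        byte[] data = new byte[BUFFER];
        for (File child : children) {
            String name = prefix + child.getName();
            if (child.isDirectory()) {
                // zip entry names use '/' regardless of the platform
                addFolder(child, name + "/", out);
            } else {
                out.putNextEntry(new ZipEntry(name));
                try (InputStream in = new BufferedInputStream(
                        new FileInputStream(child), BUFFER)) {
                    int count;
                    while ((count = in.read(data, 0, BUFFER)) != -1) {
                        out.write(data, 0, count);
                    }
                }
                out.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ZipTree <folder> <output.zip>");
            return;
        }
        zipDirectory(new File(args[0]), new File(args[1]));
    }
}
```

Empty folders are skipped in this sketch, since only files get an entry.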


import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

import com.ice.tar.TarEntry;
import com.ice.tar.TarOutputStream;
import com.jcraft.jzlib.JZlib;
import com.jcraft.jzlib.ZOutputStream;

public class PackDictTest {
	static final int BUFFER = 2048;
	   public static void main (String argv[]) {
	      try {
	         BufferedInputStream origin = null;
	         FileOutputStream dest = new
	           FileOutputStream("C:\\Users\\Zz85\\Desktop\\DictionaryProject\\test.tar.Z");
	         TarOutputStream out = new TarOutputStream(new
	        		 BufferedOutputStream(new ZOutputStream(dest,JZlib.Z_BEST_COMPRESSION)));
	        		//BufferedOutputStream(new GZIPOutputStream(dest)));
	           		//BufferedOutputStream(dest));

	         byte data[] = new byte[BUFFER];
	         // get a list of files from current directory
	         File f = new File("C:\\Users\\Zz85\\Desktop\\DictionaryProject\\Dictionary\\.");
	         String files[] = f.list();

	         for (int i=0; i < files.length; i++) {
	            System.out.println("Adding: "+f.getParent() + File.separatorChar +files[i]);
	            FileInputStream fi = new
	              FileInputStream(f.getParent() + File.separatorChar +  files[i]);
	            origin = new
	              BufferedInputStream(fi, BUFFER);

	            File toTar = new File (f.getParent() + File.separatorChar +files[i]);
	            TarEntry entry = new TarEntry(toTar);
	            entry.setName(toTar.getName());
	            //TarEntry entry = new TarEntry(files[i]);
	            //entry.setSize( toTar.length());
	            //entry.setModTime(toTar.lastModified());

	            out.putNextEntry(entry);
	            /*
	            while (true) {
	            	//System.out.println("Bad");
					int count = origin.read(data, 0, data.length);
					if (count <= 0)
						break;
					out.write(data, 0, count);
				}*/
	            /**/
	            int count;
	            while((count = origin.read(data, 0,
	              BUFFER)) != -1) {
	               out.write(data, 0, count);
	            }
	            out.closeEntry();
	            origin.close();
	         }
	         out.close();
	      } catch(Exception e) {
	         e.printStackTrace();
	      }
	   }

}

Notice how the line BufferedOutputStream(dest)); can be swapped with BufferedOutputStream(new GZIPOutputStream(dest))); or BufferedOutputStream(new ZOutputStream(dest,JZlib.Z_BEST_COMPRESSION))); to get a different compression easily, just by changing the wrapper class. (The GZIPOutputStream variant also needs import java.util.zip.GZIPOutputStream; at the top.)
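The same swap can be shown in isolation. The sketch below is an assumption-laden stand-in: it uses only java.util.zip from desktop Java 8 or later, so DeflaterOutputStream with Deflater.BEST_COMPRESSION plays the role of JZlib's ZOutputStream (both write zlib-format data), and the Wrapper interface and pack method are hypothetical names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

public class WrapperSwap {
    // Hypothetical helper: decides what sits between the writer and the destination.
    interface Wrapper {
        OutputStream wrap(OutputStream dest) throws IOException;
    }

    // No compression, like BufferedOutputStream(dest) in the listing above.
    static final Wrapper PLAIN = dest -> dest;
    // Gzip, like the GZIPOutputStream variant.
    static final Wrapper GZIP = GZIPOutputStream::new;
    // zlib at best compression, standing in for JZlib's ZOutputStream.
    // A long-running program should also call end() on the Deflater when done.
    static final Wrapper ZLIB = dest ->
            new DeflaterOutputStream(dest, new Deflater(Deflater.BEST_COMPRESSION));

    // Writes payload through the chosen wrapper and returns what reached the destination.
    static byte[] pack(byte[] payload, Wrapper wrapper) throws IOException {
        ByteArrayOutputStream dest = new ByteArrayOutputStream();
        try (OutputStream out = wrapper.wrap(dest)) {
            out.write(payload);
        }
        return dest.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("house\tHaus\n"); // made-up sample line
        }
        byte[] payload = sb.toString().getBytes(StandardCharsets.UTF_8);
        System.out.println("plain: " + pack(payload, PLAIN).length + " bytes");
        System.out.println("gzip:  " + pack(payload, GZIP).length + " bytes");
        System.out.println("zlib:  " + pack(payload, ZLIB).length + " bytes");
    }
}
```

The writing code never changes; only the wrapper handed to pack does, which is the same idea as swapping the commented lines in the tar listing.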

Next will be the issue of how to decompress them.
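As a starting point on the desktop side, here is a minimal sketch of reading a single entry out of a zip stream with java.util.zip.ZipInputStream. It assumes desktop Java 7 or later, not the phone, and the class and method names are made up for illustration. The mobile version would have to go through jazzlib or similar instead.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipRead {
    static final int BUFFER = 2048;

    // Returns the bytes of the named entry, or null if the archive has no such entry.
    static byte[] readEntry(InputStream archive, String entryName) throws IOException {
        try (ZipInputStream in = new ZipInputStream(new BufferedInputStream(archive))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue; // getNextEntry() skips the rest of this entry
                }
                ByteArrayOutputStream result = new ByteArrayOutputStream();
                byte[] data = new byte[BUFFER];
                int count;
                // read() returns -1 at the end of the current entry
                while ((count = in.read(data, 0, BUFFER)) != -1) {
                    result.write(data, 0, count);
                }
                return result.toByteArray();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ZipRead <archive.zip> <entry name>");
            return;
        }
        byte[] content = readEntry(new FileInputStream(args[0]), args[1]);
        System.out.println(content == null
                ? "entry not found"
                : "read " + content.length + " bytes");
    }
}
```

Because ZipInputStream walks the archive front to back, this works on any InputStream, which suits reading a dictionary file straight out of a jar resource.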