In-Memory Compression by LZMA in C#

I found the LZMA SDK to be a powerful compression library.
The SDK doesn’t include an in-memory compression sample, but one is easy to write.
I chose MemoryStream as the I/O type, and I plan to use this to compress small XML files.

* Note that the original SDK has hard-coded endianness assumptions. I will revisit this when I have access to a big-endian PowerBook.
* Remember that everything is held on the heap, so this is not suitable for large data (I would guess anything over 10MB).
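
One way to keep the 8-byte length field independent of the CPU's byte order is to write and read it one byte at a time with shifts instead of relying on BitConverter. The sketch below is only an illustration: SizeHeader and its methods are hypothetical names, not part of the LZMA SDK, and it covers only the length field, not any other endianness assumptions inside the SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the LZMA SDK.
public static class SizeHeader
{
    // Writes value as 8 bytes, least significant byte first,
    // regardless of the CPU's native byte order.
    public static void Write(Stream output, long value)
    {
        for (int i = 0; i < 8; i++)
            output.WriteByte((byte)(value >> (8 * i)));
    }

    // Reads the 8 bytes back in the same order.
    public static long Read(Stream input)
    {
        long value = 0;
        for (int i = 0; i < 8; i++)
        {
            int b = input.ReadByte();
            if (b < 0)
                throw new EndOfStreamException("Size header is shorter than 8 bytes.");
            value |= (long)(byte)b << (8 * i);
        }
        return value;
    }
}
```

Because the byte order is fixed by the shifts, the same bytes come out on little-endian and big-endian machines alike.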

A partial listing is below.
Leave a comment if you need the complete source.

using System;
using System.IO;
using SevenZip;
using SevenZip.Compression;

namespace SevenZip.Compression
{
  class LZMACoder : IDisposable
  {
    private bool isDisposed = false;

    //These settings are copied straight from the SDK.

    private static Int32 dictionary = 1 << 21; // dictionary size in bytes (2 MB)
    private static Int32 posStateBits = 2;     // pb
    private static Int32 litContextBits = 3;   // lc; 3 for normal files, 0 for 32-bit data
    private static Int32 litPosBits = 0;       // lp; 0 for normal files, 2 for 32-bit data
    private static Int32 algorithm = 2;
    private static Int32 numFastBytes = 128;
    private static bool eos = false;           // do not write an end-of-stream marker
    private static string mf = "bt4";          // match finder

    private static CoderPropID[] propIDs = 
    {
        CoderPropID.DictionarySize,
        CoderPropID.PosStateBits,  
        CoderPropID.LitContextBits,
        CoderPropID.LitPosBits,
        CoderPropID.Algorithm,
        CoderPropID.NumFastBytes,
        CoderPropID.MatchFinder,
        CoderPropID.EndMarker
    };
    private static object[] properties = 
    {
        (Int32)(dictionary),
        (Int32)(posStateBits),  
        (Int32)(litContextBits),
        (Int32)(litPosBits),
        (Int32)(algorithm),
        (Int32)(numFastBytes),
        mf,
        eos
    };

    public LZMACoder()
    {
      //Big-endian platforms are untested, so refuse to run on them.
      if (BitConverter.IsLittleEndian == false)
      {
        Dispose();
        throw new NotSupportedException("Big-endian platforms are not supported");
      }
    }

    public MemoryStream decompress(MemoryStream inStream)
    {
      return decompress(inStream, false);
    }

    public MemoryStream decompress(MemoryStream inStream, bool closeInStream)
    {
      inStream.Position = 0;
      MemoryStream outStream = new MemoryStream();

      byte[] properties = new byte[5];
      if (inStream.Read(properties, 0, 5) != 5)
        throw (new Exception("input .lzma is too short"));

      SevenZip.Compression.LZMA.Decoder decoder = new SevenZip.Compression.LZMA.Decoder();
      decoder.SetDecoderProperties(properties);

      long outSize = 0;

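      //The uncompressed size is stored as 8 bytes, least significant byte first.
      //Reading it with shifts does not depend on the CPU's byte order.
      for (int i = 0; i < 8; i++)
      {
        int v = inStream.ReadByte();
        if (v < 0)
          throw (new Exception("input .lzma is too short: missing size field"));

        outSize |= ((long)(byte)v) << (8 * i);
      }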

      long compressedSize = inStream.Length - inStream.Position;
      decoder.Code(inStream, outStream, compressedSize, outSize, null);

      if (closeInStream)
        inStream.Close();

      return outStream;
    }

    public MemoryStream compress(MemoryStream inStream)
    {
      return compress(inStream, false);
    }

    public MemoryStream compress(MemoryStream inStream, bool closeInStream)
    {
      inStream.Position = 0;
      Int64 fileSize = inStream.Length;
      MemoryStream outStream = new MemoryStream();

      SevenZip.Compression.LZMA.Encoder encoder = new SevenZip.Compression.LZMA.Encoder();
      encoder.SetCoderProperties(propIDs, properties);
      encoder.WriteCoderProperties(outStream);

      //Write the uncompressed size as 8 bytes, least significant byte first,
      //so the header does not depend on the CPU's byte order.
      for (int i = 0; i < 8; i++)
        outStream.WriteByte((byte)(fileSize >> (8 * i)));

      encoder.Code(inStream, outStream, -1, -1, null);

      if (closeInStream)
        inStream.Close();

      return outStream;
    }

    ~LZMACoder()
    {
      //The finalizer must not touch managed objects, so pass false.
      Dispose(false);
    }

    public void Dispose()
    {
      Dispose(true);
      GC.SuppressFinalize(this);
    }

    private void Dispose(bool disposing)
    {
      if (this.isDisposed == false)
      {
        if (disposing)
        {
          //No managed resources to release.
        }
      }
      this.isDisposed = true;
    }
  }
}
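
To make the layout of the output concrete: compress writes 5 property bytes followed by the 8-byte uncompressed size, so the first 13 bytes form a header. The sketch below decodes that header. LzmaHeader and Parse are hypothetical names used only for illustration, and the code assumes the standard LZMA properties encoding, where the first byte is (pb * 5 + lp) * 9 + lc and the next four bytes are the dictionary size, least significant byte first.

```csharp
using System;

// Hypothetical helper for illustration; not part of the LZMA SDK.
public static class LzmaHeader
{
    public const int Length = 13; // 5 property bytes + 8 size bytes

    public static void Parse(byte[] data, out int lc, out int lp, out int pb,
                             out uint dictionarySize, out long uncompressedSize)
    {
        if (data == null || data.Length < Length)
            throw new ArgumentException("Need at least 13 header bytes.");

        // Byte 0 packs lc, lp and pb as (pb * 5 + lp) * 9 + lc.
        int d = data[0];
        if (d >= 9 * 5 * 5)
            throw new ArgumentException("Invalid properties byte.");
        lc = d % 9;
        d /= 9;
        lp = d % 5;
        pb = d / 5;

        // Bytes 1..4: dictionary size, least significant byte first.
        dictionarySize = 0;
        for (int i = 0; i < 4; i++)
            dictionarySize |= (uint)data[1 + i] << (8 * i);

        // Bytes 5..12: uncompressed size, least significant byte first.
        uncompressedSize = 0;
        for (int i = 0; i < 8; i++)
            uncompressedSize |= (long)data[5 + i] << (8 * i);
    }
}
```

With the settings used in the class above (lc = 3, lp = 0, pb = 2), the first byte works out to (2 * 5 + 0) * 9 + 3 = 93, or 0x5D, and the dictionary size bytes are 00 00 20 00 for 1 << 21.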


4 Responses to “In-Memory Compression by LZMA in C#”

  1. karsten Says:

    Hi

    This looks very interesting. Could you email me the source with everything? Thanks

    • genki Says:

      Hi karsten,
      Thanks for the comment.
      Did you set up the SDK?

      Actually, the code is complete. You may have to change the namespace so that it matches the SDK.

      Once you have the SDK, you just need to add the code as a .cs file. Let me know how it goes. Hopefully, I will have some time to post an update.

  2. Suchit Sapate Says:

    Hi
    I want to do some work on efficient XML compression, so I would like your code to analyze XML compression methods and optimize it. If I get good results, I will mail you the full code. Please send your code with initial instructions.

  3. Carl Says:

    Hi,

    Good article. I tried compressing a string with the code above, but it keeps failing when I read the compressed stream into a string and then read that string back into a stream. I see that the size is not the same. Do you have a solution to this?

    Thank you
    Carl
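
One likely cause of the size mismatch Carl describes, assuming the compressed bytes were converted with a text encoding such as UTF-8, is that compressed output is arbitrary binary data and most text encodings cannot represent every byte sequence without changing it. A minimal sketch of a safer round trip uses Base64 instead. CompressedText is a hypothetical helper name, not part of the post's class or the SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the LZMA SDK.
public static class CompressedText
{
    // Turns the compressed bytes into a plain ASCII string.
    public static string ToText(MemoryStream compressed)
    {
        return Convert.ToBase64String(compressed.ToArray());
    }

    // Rebuilds a stream holding exactly the original bytes.
    public static MemoryStream FromText(string text)
    {
        return new MemoryStream(Convert.FromBase64String(text));
    }
}
```

The Base64 string is roughly a third larger than the compressed data, but it decodes back to the same bytes, so the stream returned by FromText can be passed to decompress.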

