I once had a requirement to compress 100,000 log files into one archive and still be able to extract a single file for processing. Zip would have made my life easier, since there are many resources on the internet on how to work with it, but it was simply not up to the challenge: it couldn’t take that many files (the classic Zip format tops out at 65,535 entries unless the Zip64 extensions are used). After some research I found that .TAR.GZ files (TAR files that were GZipped) seemed the best candidate among the many products available, mainly because the format is free. It also provides superb compression, and frankly I haven’t yet explored the limits of how many files it can pack.
The drawback is the limited documentation on how to handle these files in ASP.Net. If you’re willing to spend a couple of bucks then you can indeed have it easy. It is perhaps more economical to just buy a component than to spend hours working with a free one. But then, if someone tells you how to work with the free one, you wouldn’t have to buy, would you?
In this post, I’m going to talk about extracting .TAR.GZ files in ASP.Net. However, don’t ask me about the details of the code. Like I said, there’s not much documentation around. I came up with this code after hours of trial and error.
The critical component for this sample is SharpZipLib. Download the dll file here.
After downloading, copy ICSharpCode.SharpZipLib.dll from the net-20 folder and place it in your website’s Bin folder.
Import the namespaces System.IO, ICSharpCode.SharpZipLib.Tar and ICSharpCode.SharpZipLib.GZip in your code-behind or class file.
Imports ICSharpCode.SharpZipLib.Tar
Imports ICSharpCode.SharpZipLib.GZip
Imports System.IO
Or like this if you’re using inline code:
<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="ICSharpCode.SharpZipLib.Tar" %>
<%@ Import Namespace="ICSharpCode.SharpZipLib.GZip" %>
Then copy the method below:
Sub ExtractFile(ByVal Archive As String, ByVal Filename As String, ByVal DestinationFolder As String)
    ' Open the archive: GZip decompression first, then TAR parsing on top of it.
    ' The Using blocks close and dispose the streams even if an exception occurs.
    Using gZipStream As New GZipInputStream(File.OpenRead(Archive))
        Using tarStream As New TarInputStream(gZipStream)
            ' Walk the entries in order until the requested name is found.
            Dim entry As TarEntry = tarStream.GetNextEntry()
            Do While entry IsNot Nothing
                If entry.Name = Filename Then
                    ' Path.Combine adds the directory separator if it is missing.
                    Using fs As New FileStream(Path.Combine(DestinationFolder, Filename), FileMode.Create)
                        tarStream.CopyEntryContents(fs)
                    End Using
                    Exit Sub
                End If
                entry = tarStream.GetNextEntry()
            Loop
            ' No entry matched: nothing is written and the streams are closed.
        End Using
    End Using
End Sub
The method above is designed for general use. It extracts one file and saves it to disk; if the archive has no entry with that name, nothing is written. For example:
Dim sFilename As String = "pic1.jpg"
Dim sArchive As String = "D:\temp\pics.tar.gz"
Dim sDestinationFolder As String = "D:\"
ExtractFile(sArchive, sFilename, sDestinationFolder)
This will extract the file “pic1.jpg” from the archive “D:\temp\pics.tar.gz” and save it under the folder “D:\”. From there you can redirect to the file or process it further, depending on your requirements. The approach works well for archiving pictures, old content, or anything else you want to store compactly. You will, of course, have to maintain an index of the archive’s contents somewhere.
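One way to build that index is to walk the archive once and record every entry name. The sketch below is only an illustration under a few assumptions: the module name TarIndexSketch and the function ListEntries are made up for this example and are not part of SharpZipLib, and it assumes the same SharpZipLib dll and namespaces used above. It relies only on GetNextEntry, plus the Name and IsDirectory properties of TarEntry.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports ICSharpCode.SharpZipLib.GZip
Imports ICSharpCode.SharpZipLib.Tar

Public Module TarIndexSketch
    ' Hypothetical helper (not part of SharpZipLib): returns the name of every
    ' file entry in a .tar.gz archive, in the order the entries are stored.
    Function ListEntries(ByVal Archive As String) As List(Of String)
        Dim names As New List(Of String)
        Using gZipStream As New GZipInputStream(File.OpenRead(Archive))
            Using tarStream As New TarInputStream(gZipStream)
                Dim entry As TarEntry = tarStream.GetNextEntry()
                Do While entry IsNot Nothing
                    ' Skip directory entries; only files can be extracted later.
                    If Not entry.IsDirectory Then names.Add(entry.Name)
                    entry = tarStream.GetNextEntry()
                Loop
            End Using
        End Using
        Return names
    End Function
End Module
```

The returned names can then be saved to a text file or a database table, and any one of them can be passed as the Filename argument of ExtractFile. Note that entry names are stored exactly as they were written to the archive, including any folder prefix, so the name in the index has to match character for character.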
Don't miss this post on file compression using .Tar.Gz in ASP.Net