compare zip files ignoring datestamps

Completed. Posted Feb 22, 2009. Paid on delivery.

There's a Unix command 'cmp' which can be used to determine whether two files have identical contents: "cmp file1 file2". However, certain file types can't be usefully compared in this way because they contain datestamps which lead to false negatives. For instance, if you run the command "cl /c foo.c" twice in a row on Windows you will get two [login to view URL] files which are "functionally identical" but which will show as different because of differing datestamps.

There's an open source library called 'objcmp' which addresses this. It behaves much like cmp but is smarter in that it can optionally locate and ignore datestamps in a number of well-known file formats. However, one format which it does not understand is the zip file format. This project is to enhance objcmp to skip over datestamps in zip format files (which would include .zip, .jar, and anything else conforming to the zip standard). A copy of objcmp is attached with a custom 'testdata' directory added.

## Deliverables

PLEASE SEE ATTACHED ZIP FILE CONTAINING 'objcmp' PROGRAM

The enhancement involves two programming tasks. The first would be to identify a file as zip format. For this, looking at ftp://[login to view URL] would probably show the technique. There's already code in objcmp to recognize other files, such as Unix archive (.a) files, by their magic number, and this would just need an additional case to handle zip files. Identification should NOT be done by file extension but rather by recognizing the data format. The only tricky part is that the zipfile 'header' (the end-of-central-directory record) is actually at the end of the file, so you might have to seek backward from there to find it.
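One possible way to do the identification, as a minimal sketch rather than objcmp's actual code: scan backward from the end of the file for the end-of-central-directory signature (the bytes 'P', 'K', 0x05, 0x06). The function name `zip_find_eocd` is hypothetical, and the sketch assumes the whole file is already available as one in-memory buffer, which may not match how objcmp maps files.

```c
#include <stddef.h>
#include <string.h>

#define EOCD_MIN_SIZE    22      /* fixed part of the end-of-central-directory record */
#define EOCD_MAX_COMMENT 65535   /* the trailing archive comment has a 16-bit length */

/*
 * Hypothetical helper (not part of objcmp): look for the zip
 * end-of-central-directory record by scanning backward from the end
 * of the buffer. Returns 1 and stores the record's offset in *eocd_off
 * if found, 0 otherwise.
 */
int zip_find_eocd(const unsigned char *buf, size_t len, size_t *eocd_off)
{
    static const unsigned char sig[4] = { 'P', 'K', 0x05, 0x06 };
    size_t lowest, pos;

    if (len < EOCD_MIN_SIZE)
        return 0;

    /* The record can start no earlier than 22 + 65535 bytes before EOF. */
    lowest = (len > EOCD_MIN_SIZE + EOCD_MAX_COMMENT)
                 ? len - (EOCD_MIN_SIZE + EOCD_MAX_COMMENT)
                 : 0;

    for (pos = len - EOCD_MIN_SIZE; ; pos--) {
        if (memcmp(buf + pos, sig, 4) == 0) {
            /* Comment length is a little-endian 16-bit field at offset 20. */
            size_t comment_len = (size_t)buf[pos + 20] |
                                 ((size_t)buf[pos + 21] << 8);
            /* Accept only if the comment runs exactly to end of file. */
            if (pos + EOCD_MIN_SIZE + comment_len == len) {
                *eocd_off = pos;
                return 1;
            }
        }
        if (pos == lowest)
            break;
    }
    return 0;
}
```

Requiring the comment to end exactly at end of file is a design choice of this sketch: it reduces false matches on signature bytes that happen to occur inside data, at the cost of rejecting archives with trailing junk.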

The second would be zeroing out all datestamps. The infrastructure for doing this is present already; all that remains is determining the offsets to datestamps. We are not looking for any datestamps *within* the files, by the way, just datestamps inserted by the zip program as part of the archive format. Nor do we have to worry about encrypted zip files. Another thing to not worry about is file order. According to the zip format, files can be in any order. However, we can assume that a given zip program will use the same ordering algorithm each time.
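The offsets in question could be located roughly as follows. This is a sketch under assumptions, not the objcmp implementation: the name `zip_zero_datestamps` and the helpers `rd16`/`rd32` are hypothetical, the buffer is assumed to be a writable in-memory copy of the file, and the offset of the end-of-central-directory record is assumed to be known already. It walks the central directory and clears the 2-byte DOS time and 2-byte DOS date in each central directory entry (offsets 12 and 14) and in the matching local file header (offsets 10 and 12).

```c
#include <stddef.h>
#include <string.h>

/* Little-endian readers (hypothetical helpers). */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/*
 * Hypothetical helper (not part of objcmp): zero the DOS time/date
 * fields of every central directory entry and of the local file header
 * each entry points to. 'eocd' is the offset of the
 * end-of-central-directory record. Returns the number of entries
 * processed, or -1 if the archive looks malformed.
 */
int zip_zero_datestamps(unsigned char *buf, size_t len, size_t eocd)
{
    static const unsigned char cen_sig[4] = { 'P', 'K', 0x01, 0x02 };
    static const unsigned char loc_sig[4] = { 'P', 'K', 0x03, 0x04 };
    size_t count, pos, i;

    if (len < 22 || eocd > len - 22)
        return -1;

    count = rd16(buf + eocd + 10);   /* total number of entries */
    pos   = rd32(buf + eocd + 16);   /* start of central directory */

    for (i = 0; i < count; i++) {
        size_t loc;

        /* Central directory entry: 46 fixed bytes, then variable parts. */
        if (len < 46 || pos > len - 46 || memcmp(buf + pos, cen_sig, 4) != 0)
            return -1;
        memset(buf + pos + 12, 0, 4);   /* mod time + mod date */

        /* Matching local file header: 30 fixed bytes. */
        loc = rd32(buf + pos + 42);
        if (len < 30 || loc > len - 30 || memcmp(buf + loc, loc_sig, 4) != 0)
            return -1;
        memset(buf + loc + 10, 0, 4);   /* mod time + mod date */

        /* Skip file name, extra field, and file comment. */
        pos += 46 + rd16(buf + pos + 28) + rd16(buf + pos + 30)
                  + rd16(buf + pos + 32);
    }
    return (int)count;
}
```

Some zip programs also record timestamps in per-entry extra fields (for example the extended timestamp field with header ID 0x5455). This sketch leaves extra fields untouched, so archives carrying such fields would likely need additional handling before they compare equal.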

The goal is simply to be able to compare two regular zipfiles, made by the same program with the same options and containing the same data, and find them equal. The acceptance test will be the ability to make two zip files from the same inputs and have them compare as identical. E.g.

% date > foo

% date > bar

% zip [login to view URL] foo bar

% zip [login to view URL] foo bar

% objcmp [login to view URL] [login to view URL]

The platform infrastructure of opening and mapping the files is already present so this is not a Windows job or a Unix job; you can do it on any platform you like and it should work everywhere since the zip format is the same across platforms.

As far as I can tell the official zipfile format is at [login to view URL]. But this logic is also encoded in plenty of open source tools, for instance Info-ZIP. Java has classes for navigating zip and jar files and the Sun JDK is all open source so that's another place to look.

Bottom line, all of the required algorithms are documented and/or present in existing code. It's just a matter of researching and implementing them in the context of objcmp. By the way, within objcmp you should be able to ignore the unstamp* code - that's all for Windows exe format files. All the interesting code should be in objcmp.c and the doc in objcmp.pod.

C Programming, Engineering, Linux, Microsoft, MySQL, PHP, Software Architecture, Software Testing, UNIX, Windows Desktop

Project ID: #3662242

About the project

6 proposals, remote project, active Feb 24, 2009

Awarded to:

mmtt

See private message.

$24 USD in 14 days
(6 reviews)
1.7

6 freelancers are bidding an average of $68 for this job

gfreemann

See private message.

$170 USD in 14 days
(77 reviews)
6.0
anurag7vw

See private message.

$51 USD in 14 days
(71 reviews)
5.0
manjunathku

See private message.

$72.25 USD in 14 days
(0 reviews)
0.0
grg183

See private message.

$46.75 USD in 14 days
(0 reviews)
0.0
maitreyisharma

See private message.

$46.75 USD in 14 days
(1 review)
0.0