Saturday, November 22, 2014

A Peek Inside an XLSX Document

Want to see what's under the hood of an XLSX file? Probably not, but just for giggles, this is a little bit interesting. Did you know that Excel xlsx files are actually zip files? Change the file extension to .zip and [poof], you can view the underlying supporting documents. You'll notice the xlsx format is built on pretty straightforward xml files.
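If you'd rather not rename anything, the same peek can be sketched in a few lines of Python, since the standard zipfile module will open an xlsx directly. This is just a minimal sketch; the function name list_xlsx_parts is made up for illustration.

```python
import zipfile


def list_xlsx_parts(source):
    """Return the names of the files stored inside an xlsx archive.

    `source` can be a path to an .xlsx file or an open binary file object.
    (list_xlsx_parts is a hypothetical helper name, not part of any library.)
    """
    with zipfile.ZipFile(source) as archive:
        # namelist() returns every member path in the order it was stored
        return archive.namelist()
```

Calling list_xlsx_parts on one of your own workbooks should show entries such as [Content_Types].xml and a handful of files under the xl/ folder, though the exact list depends on the workbook.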

Make File Extensions Visible

If you don't have file extensions visible, set this folder option first.
Open your Control Panel, and select "Folder Options".
Then remove the checkmark for "Hide extensions for known file types".


Rename the xlsx file

Using Windows File Explorer, choose any xlsx file and rename the extension from xlsx to zip.



Naturally, say yes to the prompt.


Now open the .zip file, and view the contents.


Oh my.. Look at that xml file. 

Digging inside there, you'll notice it's straightforward xml with values, except that Excel uses an index lookup into the sharedStrings.xml file (located one level higher inside the zip file) for text cell values. So in sheet1.xml, a <v>0</v> inside a cell marked t="s" corresponds to the first value in the sharedStrings file, while numeric cells store their value directly. Yes, you can edit the files, save them back, rename the zip back to an xlsx file, and it still works.
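Here is a rough Python sketch of that lookup, using only the standard library. It assumes the sheet lives at xl/worksheets/sheet1.xml and the strings at xl/sharedStrings.xml, which is the usual layout but not guaranteed for every workbook. The function names are made up for illustration, and things like dates, formulas, and inline strings are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main spreadsheet xml parts
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_shared_strings(archive):
    """Return the shared strings as a list, in index order."""
    if "xl/sharedStrings.xml" not in archive.namelist():
        return []  # a workbook with no text cells may not have this part
    root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    # Each <si> is one entry; join its <t> pieces so rich text still works
    return [
        "".join(t.text or "" for t in si.findall(".//m:t", NS))
        for si in root.findall("m:si", NS)
    ]


def read_cells(source, sheet_path="xl/worksheets/sheet1.xml"):
    """Map cell references (like "A1") to their values as strings."""
    with zipfile.ZipFile(source) as archive:
        strings = read_shared_strings(archive)
        sheet = ET.fromstring(archive.read(sheet_path))
    cells = {}
    for cell in sheet.findall(".//m:c", NS):
        value = cell.find("m:v", NS)
        if value is None:
            continue  # empty cell, nothing stored
        if cell.get("t") == "s":
            # t="s" means <v> holds an index into sharedStrings.xml
            cells[cell.get("r")] = strings[int(value.text)]
        else:
            cells[cell.get("r")] = value.text
    return cells
```

Everything comes back as text here, so a numeric cell holding 42 shows up as the string "42".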

Programmatically, this makes for pretty easy interaction with Excel documents.
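As one example of that kind of interaction, the edit-and-save-back trick can be sketched in Python too. Zip files can't be edited in place with the standard zipfile module, so this sketch copies every part into a new archive and swaps in the one you changed. The name replace_part is hypothetical, and it assumes the new bytes are valid xml that Excel will accept.

```python
import zipfile


def replace_part(src, dst, part_name, new_bytes):
    """Copy the xlsx at `src` to `dst`, replacing one internal file.

    `src` and `dst` can be paths or binary file objects.
    (replace_part is a hypothetical helper name, not part of any library.)
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            # Keep every part as-is except the one being replaced
            if item.filename == part_name:
                data = new_bytes
            else:
                data = zin.read(item.filename)
            zout.writestr(item, data)
```

Writing to a new file rather than over the original also leaves you a backup if the edited xml turns out to be something Excel doesn't like.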

Anyways, enjoy.
-P
