Extracting Metada From Files
Penetration testers must be able to think outside of the box and to use whatever method is necessary in order to discover information about their targets.Malicious attackers will not stop in the conventional tactics and this should apply and to the penetration tester.Many organizations are uploading in their websites word documents and excel files without been aware that they expose sensitive information.This information is hidden in the metadata of the files.Also in application assessments (web or mobile) it is a good practice except of the common vulnerabilities to check and the metadata in order to see if this information can be used in a malicious way.In this article we will examine some of the tools that we can use for metadata extraction and what kind of information can unveil.
Exiftool
One of the tools that can extract Metadata information is the exiftool.This tool is found in Backtrack distribution and can extract information from various file types like DOC,XLS,PPT,PNG and JPEG.Typically the information that we would look for are:
- Title
- Subject
- Author
- Comments
- Software
- Company
- Manager
- Hyperlinks
- Current User
Below is the information that we have obtained from an image and the metadata from a doc file.



FOCA
FOCA is another great tool for analyzing metadata in documents.It is a GUI based tool which make the process a lot of easier.The only thing that we have to do is to specify the domain that we want to search for files and the file type (doc,xls,pdf) and FOCA will perform the job for us very easily.Below you can see a screenshot of the metadata that we have extracted from a doc file.As you can see we have obtained a username an internal path and the operating system that the file has created.

Conclusion
As we have seen in this article metadata can unveil important information which can be used in conjunction with other attacks.Companies should be aware about this exposure of information that exist in their documents and before they upload something on public domain must use the appropriate tools first in order to remove the metadata from their files and to mitigate the risk.