1.4 Website Footprinting

My summary of Module 02 (Footprinting) from the CEH courseware.

Website footprinting refers to monitoring and analyzing the target organization's website for information. An attacker can build a detailed map of a website's structure and architecture without triggering an IDS or raising any system administrator's suspicion. Attackers use sophisticated footprinting tools, the basic utilities that ship with the operating system (such as Telnet), or simply a web browser.

The Netcraft tool can gather website information such as IP address, registered name and address of the domain owner, domain name, host of the site, and OS details. However, the tool may not give all these details for every site. In such cases, the attacker can browse the target website.

Browsing the target website will typically provide the following information:

  • Software used and its version
  • Operating system used
  • Sub-directories and parameters
  • Filename, path, database field name, or query
  • Scripting platform
  • Contact details and CMS details

Use tools such as Burp Suite, OWASP ZAP (Zaproxy), Paros Proxy, Website Informer, and Firebug to view the response headers, which provide the following (a short request sketch follows this list):

  • Connection status and content-type
  • Accept-Ranges and Last-Modified information
  • X-Powered-By information
  • Web server in use and its version
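
These headers can also be pulled without a proxy. A minimal sketch, assuming the Python `requests` library is installed and using a placeholder target URL:

```python
import requests

# Hypothetical target URL for illustration only
url = "https://www.example.com/"

# Send a HEAD request so only the headers come back
response = requests.head(url, allow_redirects=True, timeout=10)

# Headers of interest during website footprinting
for name in ("Server", "X-Powered-By", "Content-Type",
             "Last-Modified", "Accept-Ranges", "Connection"):
    print(f"{name}: {response.headers.get(name, 'not disclosed')}")
```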

Website footprinting can also be performed by examining the HTML source code (for example, developer comments and hidden fields) and cookies, which often reveal the software running on the server and its behaviour.
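
A quick sketch of both checks, again assuming `requests` and a placeholder URL; it dumps HTML comments and cookie names, either of which may hint at the underlying technology:

```python
import re
import requests

# Hypothetical target URL for illustration only
url = "https://www.example.com/"
response = requests.get(url, timeout=10)

# HTML comments sometimes leak developer notes, internal paths, or version strings
for comment in re.findall(r"<!--(.*?)-->", response.text, re.DOTALL):
    print("comment:", comment.strip())

# Cookie names often hint at the server-side platform (e.g. PHPSESSID, JSESSIONID)
for cookie in response.cookies:
    print("cookie:", cookie.name, "=", cookie.value)
```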

Website Footprinting using Web Spiders

A web spider (also known as a web crawler or web robot) is a program or automated script that browses websites in a methodical manner to collect specific information such as employee names, email addresses, and so on. Attackers then use the collected information for further footprinting and social engineering attacks.

Remember
Web spidering may fail if the target website has a robots.txt file in its root directory listing the directories that must not be crawled (this only stops spiders that choose to honour it).

Web Spidering Tools

Web spidering tools can collect sensitive information from the target website.

  • Try searching on GitHub first!
  • like this awesome repo
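
If you would rather roll your own, a minimal crawler sketch (placeholder start URL, `requests` assumed) that harvests email addresses from a handful of same-domain pages could look like this:

```python
import re
from urllib.parse import urljoin, urlparse
import requests

# Hypothetical starting point for illustration only
start_url = "https://www.example.com/"
domain = urlparse(start_url).netloc

to_visit, visited, emails = [start_url], set(), set()

# Breadth-first crawl limited to the target domain and a handful of pages
while to_visit and len(visited) < 20:
    url = to_visit.pop(0)
    if url in visited:
        continue
    visited.add(url)
    try:
        page = requests.get(url, timeout=10).text
    except requests.RequestException:
        continue

    # Harvest email addresses from the page body
    emails.update(re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", page))

    # Queue same-domain links found in href attributes
    for link in re.findall(r'href=["\'](.*?)["\']', page):
        absolute = urljoin(url, link)
        if urlparse(absolute).netloc == domain:
            to_visit.append(absolute)

print("Collected addresses:", emails)
```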

Mirroring Entire Website

Website mirroring is the process of creating an exact replica or clone of the original website. Websites can be duplicated using mirroring tools such as HTTrack Website Copier and NCollector Studio. These tools download a website into a local directory, recursively rebuilding all directories and copying the HTML, images, Flash, videos, and other files from the web server. Website mirroring has the following benefits:

  • It is helpful for offline site browsing
  • It lets an attacker spend more time viewing and analyzing the website for vulnerabilities and loopholes
  • It helps in finding the directory structure and other valuable information from the mirrored copy without sending multiple requests to the web server

Website Mirroring Tools

Check GitHub for these as well.
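
As a command-line alternative, a quick local mirror can be produced by shelling out to GNU wget from Python. A sketch, assuming wget is installed and using a placeholder URL:

```python
import subprocess

# Hypothetical target URL for illustration only
url = "https://www.example.com/"

# --mirror: recursive download with timestamping
# --convert-links: rewrite links so the copy browses correctly offline
# --page-requisites: also fetch images, CSS, and scripts needed to render pages
# --no-parent: stay below the starting directory
subprocess.run([
    "wget", "--mirror", "--convert-links",
    "--page-requisites", "--no-parent",
    "--directory-prefix", "mirror", url,
], check=True)
```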

Extracting Website Information from https://archive.org

Source: https://archive.org

The Internet Archive's Wayback Machine stores archived versions of websites, allowing an attacker to gather information about an organization's web pages since their creation. Because https://archive.org keeps snapshots of pages from the time of their inception, an attacker can retrieve even information that has since been removed from the target website.
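
The Wayback Machine also exposes an availability API that returns the archived snapshot closest to a given date. A sketch, assuming `requests` and a placeholder domain:

```python
import requests

# Hypothetical target domain for illustration only
target = "www.example.com"

# Ask the Wayback Machine for the snapshot closest to the given timestamp
response = requests.get(
    "https://archive.org/wayback/available",
    params={"url": target, "timestamp": "20150101"},
    timeout=10,
)
snapshot = response.json().get("archived_snapshots", {}).get("closest")

if snapshot and snapshot.get("available"):
    print("Archived copy:", snapshot["url"], "captured at", snapshot["timestamp"])
else:
    print("No archived snapshot found for", target)
```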

Extracting Metadata of Public Documents

Useful information may reside on the target organization's website in the form of PDF documents, Microsoft Word files, and other formats. Valuable data, including metadata and hidden information, can be extracted from such documents. Analyzing this metadata can reveal details such as the document title, description, keywords, the creation/modification date and time of the content, and the usernames and e-mail addresses of employees of the target organization.

Metadata Extraction Tools

Metadata extraction tools automatically extract critical information, including the usernames of clients, operating systems (exploits are OS-specific), email addresses (possibly for social engineering), lists of servers, the software used (version and type), document creation/modification dates, the authors, and so on.

Very nice tools can be found on GitHub.
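
As a do-it-yourself example, PDF metadata can be read with the third-party `pypdf` library. A sketch, using a hypothetical downloaded document name:

```python
from pypdf import PdfReader  # third-party library; assumed installed

# Hypothetical document downloaded from the target website
reader = PdfReader("annual_report.pdf")
info = reader.metadata

if info is not None:
    # Fields such as the author name can leak usernames and software versions
    print("Title:   ", info.title)
    print("Author:  ", info.author)
    print("Creator: ", info.creator)    # authoring application
    print("Producer:", info.producer)   # PDF generator / version
    print("Created: ", info.get("/CreationDate"))
    print("Modified:", info.get("/ModDate"))
else:
    print("No metadata embedded in this document")
```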

Monitoring Web Pages for Updates and Changes

Web-page monitoring tools can detect any changes or updates on a particular website and notify interested users through email or SMS alerts.
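
The core idea can be sketched in a few lines: periodically hash the page body and alert when the digest changes (placeholder URL, `requests` assumed; a real tool would send email or SMS instead of printing):

```python
import hashlib
import time
import requests

# Hypothetical page to watch for changes
url = "https://www.example.com/news"

def page_hash(target):
    """Return a SHA-256 digest of the page body."""
    return hashlib.sha256(requests.get(target, timeout=10).content).hexdigest()

baseline = page_hash(url)
while True:
    time.sleep(3600)  # poll once an hour
    current = page_hash(url)
    if current != baseline:
        print("Change detected on", url)  # a real tool would email or SMS here
        baseline = current
```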