CHAPTER 2. Reconnaissance: Information Gathering for the Ethical Hacker

In this chapter you will

• Define active and passive footprinting
• Identify methods and procedures in information gathering
• Understand the use of social networking, search engines, and Google hacking in information gathering
• Understand the use of whois, ARIN, and nslookup in information gathering
• Describe the DNS record types

Footprinting

Gathering information about your intended target is more than just a beginning step in the overall attack; it’s an essential skill you’ll need to perfect as an ethical hacker. I believe what most people wonder about concerning this particular area of our career field comes down to two questions: what kind of information am I looking for, and how do I go about getting it? Both are excellent questions (if I do say so myself), and both will be answered in this section. As always, we’ll cover a few basics in the way of the definitions, terms, and knowledge you’ll need before we get into the hard stuff.

You were already introduced to the term reconnaissance in Chapter 1, so I won’t bore you with the definition again here. I do think it’s important, though, that you understand there may be a difference in definition between reconnaissance and footprinting, depending on which security professional you’re talking to. For many, recon is more of an overall, overarching term for gathering information on targets, whereas footprinting is more of an effort to map out, at a high level, what the landscape looks like. They are interchangeable terms in CEH parlance, but if you just remember that footprinting is part of reconnaissance, you’ll be fine.

During the footprinting stage, you’re looking for any information that might give you some insight into the target, no matter how big or small. And it doesn’t necessarily need to be technical in nature. Sure, things such as the high-level network architecture (what routers are they using, and what servers have they purchased?), the applications and websites (are they public-facing?), and the physical security measures (what type of entry control systems present the first barrier, and what routines do the employees seem to be doing daily?) in place are great to know, but you’ll probably be answering other questions first during this phase. Questions concerning the critical business functions, the key intellectual property, and the most sensitive information this company holds may very well be the most important hills to climb in order to recon your organization appropriately and diligently.

Of course, anything providing information on the employees themselves is always great to have because the employees represent a gigantic target for you later in the test. Although some of this data may be a little tricky to obtain, most of it is relatively easy to get and is right there in front of you, if you just open your virtual eyes.

As far as footprinting terminology and getting your feet wet here with EC-Council’s view of it all, most of it is fairly easy to remember. For example, while most footprinting can be passive in nature, takes advantage of freely available information, and is designed to be blind to your target, sometimes an overly security-conscious target organization may catch on to your efforts. If you prefer to stay in the virtual shadows (and because you’re reading this book I can safely assume that you do), your footprinting efforts may be designed in such a way as to obscure their source. If you’re really sneaky, you may even take the next step and create ways to have your efforts trace back to anyone and anywhere but you.

NOTE

Giving the appearance that someone else has done something illegal is, in itself, a crime. Even if it’s not criminal activity you’re blaming on someone else, the threat of prison and/or a civil liability lawsuit should be reason enough to think twice about this.
Anonymous footprinting, where you try to obscure the source of all this information gathering, may be a great way to work in the shadows, but pseudonymous footprinting is just downright naughty, making someone else take the blame for your actions. How dare you!

EXAM TIP
ECC describes four main focuses and benefits of footprinting for the ethical hacker:

  1. Know the security posture (footprinting helps make this clear).
  2. Reduce the focus area (network range, number of targets, and so on).
  3. Identify vulnerabilities (self-explanatory).
  4. Draw a network map.

Footprinting, like everything else in hacking, usually follows a fairly organized path to completion. You start with information you can gather from the “50,000-foot view”—using the target’s website and web resources to collect other information on the target—and then move to a more detailed view. The targets for gathering this type of information are numerous and can be easy or relatively difficult to crack open. You may use search engines and public-facing websites for general, easy-to-obtain information while simultaneously digging through DNS for detailed network-level knowledge. All of it is part of footprinting, and it’s all valuable; just like a detective in a crime novel, no piece of evidence should be overlooked, no matter how small or seemingly insignificant.

That said, it’s also important for you to remember what’s really important and what the end goal is. Milan Kundera famously wrote in The Unbearable Lightness of Being, “Seeing is limited by two borders: strong light, which blinds, and total darkness,” and it really applies here. In the real world, the only thing more frustrating to a pen tester than no data is too much data. When you’re on a pen test team and you have goals defined in advance, you’ll know what information you want, and you’ll engage your activities to go get it. In other words, you won’t (or shouldn’t) be gathering data just for the sake of collecting it; you should be focusing your efforts on the good stuff.

There are two main methods for gaining the information you’re looking for. Because you’ll definitely be asked about them repeatedly on the exam, I’m going to define active footprinting versus passive footprinting here and then spend further time breaking them down throughout the rest of this chapter. An active footprinting effort is one that requires the attacker to touch the device, network, or resource, whereas passive footprinting refers to measures to collect information from publicly accessible sources. For example, passive footprinting might be perusing websites or looking up public records, whereas running a scan against an IP you find in the network would be active footprinting. When it comes to the footprinting stage of hacking, the vast majority of your activity will be passive in nature.

As far as the exam is concerned, you’re considered passively footprinting when you’re online, checking on websites, and looking up DNS records, and you’re actively footprinting when you’re gathering social engineering information by talking to employees.

Lastly, I need to add a final note here on footprinting and your exam, because it needs to be said. Footprinting is of vital importance to your job, but for whatever reason ECC just doesn’t focus a lot of attention on it in the exam. It’s actually somewhat disconcerting that this is such a big part of the job yet just doesn’t get much of its due on the exam. Sure, you’ll see stuff about footprinting on the exam, and you’ll definitely need to know it (we are, after all, writing an all-inclusive book here), but it just doesn’t seem to be a big part of the exam. I’m not really sure why. The good news is, most of this stuff is easy to remember anyway, so let’s get on with it.

Passive Footprinting

Before starting this section, I got to wondering about why passive footprinting seems so confusing to most folks. During practice exams and whatnot in a class I recently sat through, there were a few questions missed by most folks concerning passive footprinting. It may have to do with the term passive (a quick “define passive” web search shows the term denotes inactivity, nonparticipation, and a downright refusal to react in the face of aggression). Or it may have to do with some folks just overthinking the question. I think it probably has more to do with people dragging common sense and real-world experience into the exam room with them, which is really difficult to let go of. In any case, let’s try to set the record straight by defining exactly what passive footprinting is and, ideally, what it is not.

Passive footprinting as defined by EC-Council has nothing to do with a lack of effort and even less to do with the manner in which you go about it (using a computer network or not). In fact, in many ways it takes a lot more effort to be an effective passive footprinter than an active one. Passive footprinting is all about the publicly accessible information you’re gathering and not so much about how you’re going about getting it. Methods include, but are not limited to, gathering of competitive intelligence, using search engines, perusing social media sites, participating in the ever-popular dumpster dive, gaining network ranges, and raiding DNS for information. As you can see, some of these methods can definitely ring bells for anyone paying attention and don’t seem very passive to common-sense-minded people anywhere, much less in our profession. But you’re going to have to get over that feeling rising up in you about passive versus active footprinting and just accept this for what it is—or be prepared to miss a few questions on the exam.

Passive information gathering definitely contains the pursuit and acquisition of competitive intelligence, and because it’s a direct objective within CEH and you’ll definitely see it on the exam, we’re going to spend a little time defining it here. Competitive intelligence refers to the information gathered by a business entity about its competitors’ customers, products, and marketing. Most of this information is readily available and can be acquired through different means. Not only is it legal for companies to pull and analyze this information, it’s expected behavior. You’re simply not doing your job in the business world if you’re not keeping up with what the competition is doing. Simultaneously, that same information is valuable to you as an ethical hacker, and there are more than a few methods to gain competitive intelligence.

The company’s own website is a great place to start. Think about it: what do people want on their company’s website? They want to provide as much information as possible to show potential customers what they have and what they can offer. Sometimes, though, this information becomes information overload. Just some of the open source information you can gather from almost any company on its site includes company history, directory listings, current and future plans, and technical information. Directory listings become useful in social engineering, and you’d probably be surprised how much technical information businesses will keep on their sites. Designed to put customers at ease, sometimes sites inadvertently give hackers a leg up by providing details on the technical capabilities and makeup of their network.
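To make the directory-listing point a bit more concrete, here is a minimal sketch of pulling e-mail addresses out of a page you have already saved locally. Everything in it is an assumption for illustration: the sample page, the names, and the example.com addresses are made up, and the pattern is deliberately simple rather than a complete address parser. Use something like this only on targets you are authorized to test.

```python
import re

# Deliberately simple pattern for illustration; real addresses can be more exotic.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return the unique e-mail addresses found in page_text, lowercased and sorted."""
    return sorted({match.lower() for match in EMAIL_PATTERN.findall(page_text)})


# Hypothetical staff directory snippet; not taken from any real company.
SAMPLE_PAGE = """
<li>Jane Doe, IT Director - <a href="mailto:Jane.Doe@example.com">Jane.Doe@example.com</a></li>
<li>John Roe, Help Desk - john.roe@example.com</li>
"""

if __name__ == "__main__":
    for address in extract_emails(SAMPLE_PAGE):
        print(address)
```

Names, titles, and address formats like these are exactly the kind of detail that feeds a later social engineering effort, which is why a chatty staff directory is worth noting during footprinting.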

Several websites make great sources for competitive intelligence. Information on company origins and how it developed over the years can be found in places like the EDGAR Database (www.sec.gov/edgar.shtml), Hoovers (www.hoovers.com), LexisNexis (www.lexisnexis.com) and Business Wire (www.businesswire.com). If you’re interested in company plans and financials, the following list provides some great resources:

SEC Info (www.secinfo.com)
Experian (www.experian.com)
Market Watch (www.marketwatch.com)
The Wall Street Transcript (www.twst.com)
Euromonitor (www.euromonitor.com)

Other goodies that may be of interest in competitive intelligence include the company’s online reputation (as well as the company’s efforts to control it) and the actual traffic statistics of the company’s web traffic (www.alexa.com is a great resource for this). Also, check out finance.google.com, which will show you company news releases on a timeline of its stock performance—in effect, showing you when key milestones occurred.

Active Footprinting

When it comes to active footprinting, per EC-Council, we’re really talking about social engineering, human interaction, and anything that requires the hacker to interact with the organization. In short, whereas passive measures take advantage of publicly available information that won’t (usually) ring any alarm bells, active footprinting involves exposing your information gathering to discovery. For example, you can scrub through DNS usually without anyone noticing a thing, but if you were to walk up to an employee and start asking them questions about the organization’s infrastructure, somebody is going to notice. I have an entire chapter dedicated to social engineering coming up (see Chapter 11), but will hit a few highlights here.

Social engineering has all sorts of definitions, but it basically comes down to convincing people to reveal sensitive information, sometimes without even realizing they’re doing it. There are millions of methods for doing this, and it can sometimes get really confusing. From the standpoint of active footprinting, the social engineering methods you should be concerned about involve human interaction. If you’re calling an employee or meeting an employee face to face for a conversation, you’re practicing active footprinting.

This may seem easy to understand, but it can get confusing in a hurry. For example, I just finished telling you social media is a great way to uncover information passively, but surely you’re aware you can use some of these social sites in an active manner. What if you openly use Facebook connections to query for information? Or what if you tweet a question to someone? Both of those examples could be considered active in nature, so be forewarned.
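If it helps to see the exam’s view of passive versus active laid out in one place, the small lookup below is one way to sketch it. The activity names are my own labels for the methods described in this chapter, not official EC-Council terms, and the gray-area cases (such as tweeting a question) are intentionally left out so they come back as unknown.

```python
# Exam-oriented buckets based on the chapter text; real engagements are fuzzier.
EXAM_VIEW = {
    "competitive intelligence": "passive",
    "search engines": "passive",
    "social media browsing": "passive",
    "dumpster diving": "passive",
    "network range lookup": "passive",
    "dns lookup": "passive",
    "calling an employee": "active",
    "face-to-face conversation": "active",
    "scanning an ip address": "active",
}


def classify(activity):
    """Return 'passive', 'active', or 'unknown' for a named activity."""
    return EXAM_VIEW.get(activity.strip().lower(), "unknown")


if __name__ == "__main__":
    for name in ("DNS lookup", "Calling an employee", "Tweeting a question"):
        print(name, "->", classify(name))
```

Anything that comes back as unknown is a judgment call, which is the point of the warning above: the same social site can be used passively or actively depending on whether you interact with someone.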

Footprinting Methods and Tools

In version 9 of the exam, ECC is putting a lot of focus on the tools themselves and not so much on the definitions and terms associated with them. This is really good news from one standpoint: those definitions and terms can get ridiculous, and memorizing the difference between one term and another doesn’t really do much in the way of demonstrating your ability as an actual ethical hacker. The bad news is, you have to know countless tools and methods just in case you see a specific question on the exam. And, yes, there are plenty of tools and techniques in footprinting for you to learn, both for your exam and your future in pen testing.

Search Engines

When I was a kid and someone asked me how to do something I’d never done, to define something I’d never heard of, or to comment on some historical happening I spaced out on during school, I had no recourse. Back then you simply had to say, “I don’t know.” If it was really important you went to the library and tried to find it in a book (GASP! The HORROR!). Today when I’m asked something, I do what everyone else does: I Google it. Just yesterday somebody asked me about the diet of sandhill cranes (they’re gigantic, beautiful birds, are always wandering through my backyard, and if I had to guess my first thought on their diet of choice would be small children and household pets). Twenty years ago I wouldn’t have had a clue what a sandhill crane was, much less what they ate. Today, given 5 minutes and a browser, I sound like an ornithologist, with a minor in sandhill crane foodstuffs.

Pen testing and hacking are no different. Want to learn how to use a tool? Go to YouTube and somebody has a video on it. Want to define the difference between BIA and MTD? Go to your favorite search engine and type it in. Need a good study guide for CEH? Type it in and—voilà—here you are….

Search engines can provide a treasure trove of information for footprinting and, if used properly, won’t alert anyone you’re looking at them. Mapping and location-specific information, including drive-by pictures of the company exterior and overhead shots, are so commonplace now people don’t think of them as footprinting opportunities. However, Google Earth, Google Maps, and Bing Maps can provide location information and, depending on when the pictures were taken, can show all sorts of potentially interesting intelligence. Even personal information, like residential addresses and phone numbers of employees, is oftentimes easy enough to find using sites such as LinkedIn.com and Pipl.com.

A really cool tool along these same lines is Netcraft (www.netcraft.com). Fire it up and take a look at all the goodies you can find. Restricted URLs, not intended for public disclosure, might just show up and provide some juicy tidbits. If they’re really sloppy (or sometimes even if they’re not), Netcraft output can show you the operating system (OS) on the box too.

Netcraft has a pretty cool toolbar add-on for Firefox and Chrome (http://toolbar.netcraft.com/).

Another absolute goldmine of information on a potential target is job boards. Go to CareerBuilder.com, Monster.com, Dice.com, or any of the multitude of others, and you can find almost everything you’d want to know about the company’s technical infrastructure. For example, a job listing that states “Candidate must be well versed in Windows 2008 R2, Microsoft SQL, and Veritas Backup services” isn’t representative of a network infrastructure made up of Linux servers. The technical job listings flat-out tell you what’s on the company’s network—and oftentimes what versions. Combine that with your astute knowledge of vulnerabilities and attack vectors, and you’re well on your way to a successful pen test!
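As a rough illustration of mining a job listing, the sketch below checks posting text against a watch list of technology names. The watch list is hypothetical and far too short for real use, and the sample text is simply the listing quoted above; treat it as a starting point, not a finished tool.

```python
import re

# Hypothetical watch list; extend it with whatever matters for your engagement.
TECH_KEYWORDS = [
    "Windows 2008 R2",
    "Microsoft SQL",
    "Veritas Backup",
    "Linux",
    "Apache",
    "Oracle",
]


def find_technologies(posting_text):
    """Return the watch-list entries mentioned in posting_text, in watch-list order."""
    found = []
    for keyword in TECH_KEYWORDS:
        # Word boundaries avoid matching a keyword inside a longer word.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, posting_text, re.IGNORECASE):
            found.append(keyword)
    return found


# The example listing quoted in the text above.
SAMPLE_POSTING = (
    "Candidate must be well versed in Windows 2008 R2, "
    "Microsoft SQL, and Veritas Backup services"
)

if __name__ == "__main__":
    print(find_technologies(SAMPLE_POSTING))
```

Run against a batch of postings from the same company, a list like this starts to look a lot like an inventory of what is on the network, versions included.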
