Archive for August, 2009

Setting Up A Security Information Management System-Part4

August 12th, 2009

In the last post I talked about how to set up a logging server that will accept remote log entries. In this installment I’ll talk about how to sort log entries into specific files.

Facility, severity and priority

Let’s talk about how logging servers figure out which file to store a log entry in when it gets received. Log messages contain two descriptive parameters, facility and severity. When these two parameters are combined, the value is referred to as the priority of the log message.

Facility

Facility defines the type of process that generated the log entry. For example all mail servers are expected to identify that their log entries are part of the “mail” facility. FTP processes should use the FTP facility, NTP processes should use the NTP facility, and so on. RFC 3164 defines the valid facilities, but here’s the list:

Numerical Code    Facility
0                 kernel messages (kern)
1                 user-level messages (user) – default if not specified
2                 mail system (mail)
3                 system daemons (daemon)
4                 security/authorization messages (auth)
5                 internal syslogd (syslog)
6                 line printer subsystem (lpr)
7                 network news subsystem (news)
8                 UUCP subsystem (uucp)
9                 clock daemon (cron)
10                security/authorization messages (authpriv)
11                FTP daemon (ftp)
12                NTP subsystem (ntp)
13                log audit
14                log alert
15                clock daemon
16                local use 0  (local0)
17                local use 1  (local1)
18                local use 2  (local2)
19                local use 3  (local3)
20                local use 4  (local4)
21                local use 5  (local5)
22                local use 6  (local6)
23                local use 7  (local7)

The “local use” facilities are similar to private addresses in the IP world. These facilities are not reserved, and are available for anyone to use as they see fit.

Facility problems

There are a couple of problems here. To start, where is the Web server facility? This list was generated back in 1987 before Web servers (or Gopher for that matter) existed. So some of the services we use today (VoIP, SQL, etc.) are missing. Also, some of the listed services typically go unused in a corporate environment. UUCP and Network News (NNTP) are excellent examples.

The lack of current services has caused many vendors to rely heavily on the local use facilities. This can cause potential conflicts when we get into sorting our log entries. For example Linux uses local use 7 to identify its boot time log entries. Apache also uses local use 7 for Web server errors. So down the road it may be difficult for us to sort Web errors and boot messages into different log files.

Another problem is that there is no verbose description about each of these facilities. This can make it a bit difficult for a programmer to identify which one to use. For example, let’s say we’ve written a program that authenticates a user for network access. Which facility should we use? Facility 4 and 10 seem the most likely, but their descriptions are identical. How do we choose? If our program runs as a background process should we actually choose facility 3 instead?

You get the idea. The list is not as clear-cut as it could be. It is not uncommon to see vendors use a different facility than you would expect. For example I’ve seen VPN vendors undecided as to the differences between facilities 4 and 10, so they simply send some percentage of log entries to each.

Severity

Severity defines the importance of the log entry. The same RFC 3164 defines the severity levels as:

Numerical Code    Severity
0                 Emergency: system is unusable (emerg)
1                 Alert: action must be taken immediately (alert)
2                 Critical: critical conditions (crit)
3                 Error: error conditions (error)
4                 Warning: warning conditions (warn)
5                 Notice: normal but significant condition (notice)
6                 Informational: informational messages (info)
7                 Debug: debug-level messages (debug)

Luckily the severity levels are far less vague than the facility descriptions. This means they are much less confusing to work with. The higher numbered severity levels tend to be very verbose. This means saying you want to send debug level messages to your logging server could easily flood the network. Use the higher numbered severity levels with caution.

Priority

When a log entry gets transmitted to a log server, the first value contained within it is the priority of the message. The priority is the facility and severity values combined per the following math formula:

( Facility x 8 ) + Severity = Priority

So let’s say our mail server needs to send a warning message. What would the priority be? The mail facility has a value of 2, while warnings have a severity of 4. So the math would be:

( 2 x 8 ) + 4 = 20

If a print server (facility 6) needed to send a log entry saying it is currently on fire (severity 0), the priority value in the message would be:

( 6 x 8 ) + 0 = 48

When a log entry gets transmitted, the priority value needs to be encapsulated in less than and greater than signs. So the priority value in the above mail server message would be “<20>” while the print server would use “<48>”. Again, this needs to be the first piece of information transmitted in the log message.
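If you want to double check the math, here is a minimal Python sketch of the formula. The function names priority and pri_header are just names I made up for illustration; they are not part of any Syslog library.

```python
def priority(facility: int, severity: int) -> int:
    """Combine a facility (0-23) and severity (0-7) into a Syslog priority."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def pri_header(facility: int, severity: int) -> str:
    """Wrap the priority in the angle brackets that lead off the message."""
    return "<{}>".format(priority(facility, severity))


print(pri_header(2, 4))  # mail server warning: <20>
print(pri_header(6, 0))  # print server emergency: <48>
```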

Sorting log entries

The priority value is used by logging servers to sort the incoming messages. For example if we wanted all mail messages to go to the same file, we would tell our logging server that all messages with a priority of 16 (2×8+0) through 23 (2×8+7) should go to the “maillog” file. Most logging servers (like rsyslog) will let you do this numerically or by using the short description names.
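Going the other direction is just integer division. The sketch below shows how a priority value could be split back into its facility and severity, and how to find the range of priorities that belong to one facility. Again, split_priority and facility_range are hypothetical helper names, not anything a real logging server exposes.

```python
def split_priority(pri: int) -> tuple:
    """Return (facility, severity) for a priority between 0 and 191."""
    if not 0 <= pri <= 191:
        raise ValueError("priority must be between 0 and 191")
    return divmod(pri, 8)


def facility_range(facility: int) -> tuple:
    """Return the lowest and highest priority values for one facility."""
    return facility * 8, facility * 8 + 7


print(split_priority(20))  # (2, 4) = mail, warning
print(facility_range(2))   # (16, 23) = every mail priority
```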

rsyslog.conf example

Here are two lines out of the rsyslog.conf file that ships with Fedora. Let’s talk about what they are actually doing:

authpriv.*                                                              /var/log/secure

*.info;mail.none;authpriv.none;cron.none                /var/log/messages

These lines define two of the rules for determining which log entries should go to which log files. The syntax for sorting is:

facility.severity

So the first line says all facility 10 (authpriv) log entries, regardless of severity (“*” is a wild card match) should be sent to the file /var/log/secure.

The second line is a bit more complex as it has multiple conditions separated by semi-colons. These conditions state:

  • *.info = All facilities, so long as the severity level is info or more severe (info through emerg)
  • mail.none = No mail facility log entries, regardless of severity
  • authpriv.none = No authpriv facility log entries, regardless of severity
  • cron.none = No cron facility log entries, regardless of severity

Or, to translate this to English, the line says “Send all messages with a severity of “info” or higher to /var/log/messages, except those that contain a facility of “mail”, “authpriv” or “cron”.”

So with these rules we can define any combination of facility and severity values and the log file we would like to direct it to. When you first set this up, stick with the defaults. As you start collecting log entries you can tweak the rules as you see fit.
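To make the selector logic a bit more concrete, here is a simplified Python sketch of how a rule could be evaluated. It assumes the traditional syslogd convention that a severity in a selector matches that level and anything more severe, and that later selectors on a line override earlier ones. It deliberately ignores comma separated facility lists, the “=” and “!” modifiers, and everything else rsyslog supports, so treat it as an illustration and not as how rsyslog is actually implemented. The name rule_matches is my own.

```python
# Severity keywords mapped to their numerical codes (lower = more severe).
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3, "error": 3,
              "warning": 4, "warn": 4, "notice": 5, "info": 6, "debug": 7}


def rule_matches(rule: str, facility: str, severity: str) -> bool:
    """Return True if a message with this facility/severity matches the rule."""
    msg_level = SEVERITIES[severity]
    matched = False
    for selector in rule.split(";"):
        fac, _, level = selector.strip().partition(".")
        if fac != "*" and fac != facility:
            continue  # this selector is about some other facility
        if level == "none":
            matched = False          # explicitly exclude this facility
        elif level == "*":
            matched = True           # any severity
        else:
            matched = msg_level <= SEVERITIES[level]  # that level or worse
    return matched


rule = "*.info;mail.none;authpriv.none;cron.none"
print(rule_matches(rule, "daemon", "warn"))  # True
print(rule_matches(rule, "mail", "crit"))    # False
print(rule_matches(rule, "kern", "debug"))   # False
```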

Bending the RFCs

In an ideal world, the RFCs would be a perfect fit for everyone’s needs. Unfortunately this is not always the case. A good example is the logging facilities. As mentioned we are missing facilities for modern day services, while at the same time we have facilities that we will never use. An obvious answer is to recycle the outdated facilities in order to support modern services.

For example, UUCP ( facility 8 ) is not even supported by modern operating systems. With this in mind, I like to use it as my Windows facility. That way I can sort all Windows log entries into their own file. For network hardware, I use the network news facility (facility 7). If you are unsure if a facility is currently in use, modify your logging server’s configuration file to send all log entries for that facility to a unique file:

ftp.*                                                                 /var/log/facility-test

If no entries arrive, you are in good shape. Just keep in mind that a legitimate service may use it at a later date. For example if three months from now someone sets up an FTP server, we may have problems if we are already using the FTP facility (facility 11). If you are unsure you can always stick with the local use facilities, as that is what they are intended for. Local use 0 and 7 seem to be the most heavily used, so avoid them when possible.

Other sorting options

While it’s not part of the RFC, some logging servers give you the ability to sort log entries based on patterns within the message. A good example is Syslog-NG. Syslog-NG will sort based on facility and severity, but you can also sort based on source IP, the application that generated the log entry, etc. This gives you far more flexible sorting options and it may be something to consider if facility/severity is not granular enough for your needs.

Exec Summary

In this installment we talked about how facility and severity are used to sort log entries. In my next post I’ll talk about how to get each of our systems to submit log messages to our centralized server.

Setting Up A Security Information Management System-Part3

August 11th, 2009

In the last post I covered some of the architecture concerns with rolling out a centralized security information system. In this post I’ll cover deploying a basic log server, and verifying that it is ready to accept log entries.

Selecting a logging server

The first thing we need to do is select a platform for our logging server. If we are simply setting up a test lab, Windows, UNIX or Linux will all make great choices. Choosing Windows might be helpful for a Windows administrator, as they will not have to climb the learning curve on a new operating system while attempting to test out logging. While Windows does not support Syslog out of the box, there are some excellent packages like Kiwi Syslog Server and WinSyslog that will add Syslog support. Both have evaluation versions and are relatively inexpensive to license.

If we are talking about setting up a production server however, we will want to stay away from Windows. Windows is notorious for having a horrible IP stack. In fact previous “patches” have crippled it even further in the interest of slowing worm propagation and increasing the speed of the GUI. While many of these limitations have been removed in Windows Server 2008 and Windows 7, IP performance is still sub-par when compared to a Linux or UNIX system deployed on identical hardware.

So that leaves Linux and UNIX as choices for a production system. Which to choose will depend on personal choice. Some like the stability of BSD while others like the flexibility of Linux. For the purpose of this document I’ll be working with a Fedora based Linux system. Installation and setup of the OS is relatively intuitive and straightforward.

Accepting remote logs

In order to accept log entries from remote systems, older versions of Fedora required you to initialize the Syslog daemon (syslogd) with the “-r” option. This was done by adding “-r” to the syslogd_options line of /etc/sysconfig/syslog file. Some versions of Linux still support legacy Syslog, and require you to add “-r” to the Syslog RC initialization file. Check the docs for your specific distribution.

New Fedora systems however support “Reliable Syslog” or rsyslog. Implementation is pretty similar to plain old Syslog, except rsyslog supports communications over TCP/514 as well as UDP/514. In the last post I described that running log entries over TCP can fix some of the reasons we lose log entries, but not all of them. If you want to play around with TCP support, go ahead and open both ports on the logging server.

To get rsyslog to accept remote log entries, we must edit the /etc/rsyslog.conf file. Towards the beginning of the file you should see the following:

# Provides UDP syslog reception
#$ModLoad imudp.so
#$UDPServerRun 514

# Provides TCP syslog reception
#$ModLoad imtcp.so
#$InputTCPServerRun 514

The “#” (pound) symbol at the beginning of the line tells the system not to process the rest of the line. We use this technique for commentary as well as “commenting out” commands we do not wish to have processed. By commenting out the ModLoad and port specification lines, we prevent rsyslog from opening a listening socket. This helps to keep the system in a more secure state.

Since we are setting up a centralized logging server, we will need to open those sockets to accept remote log entries. Modify the /etc/rsyslog.conf file to remove the appropriate pound symbols. The file should now look like this:

# Provides UDP syslog reception
$ModLoad imudp.so
$UDPServerRun 514

# Provides TCP syslog reception
$ModLoad imtcp.so
$InputTCPServerRun 514

If you know you will never use TCP, you can leave the last two lines commented out. Once complete save your changes and exit the file.

We now need to restart logging so our changes are implemented. This is done on Fedora by executing the following command:

service rsyslog restart

When you execute the command, you should see rsyslog stop and start with a status of “OK”. If the shutdown failed, it is because rsyslog is not being initialized at boot time. From the command line, execute the command “setup” and select “System services” from the main menu. When the services menu appears, scroll down the list till you find rsyslog. Check off the box to the left and then select “OK”. Quit the setup utility and rsyslog will now initialize whenever the system is booted.

Verifying the listening port

Next we need to ensure that our logging process is accepting remote log entries. From the command line, type “netstat -an | grep :514”. The output should look similar to the following:

[root@fubar ~]# netstat -an | grep :514
tcp     0      0 0.0.0.0:514                 0.0.0.0:*              LISTEN
tcp     0      0 :::514                      :::*                   LISTEN
udp     0      0 0.0.0.0:514                 0.0.0.0:*
udp     0      0 :::514                      :::*

The first line tells us that TCP/514 is listening via IPv4 on all network interfaces. Line two tells us the TCP port is also listening on any interface with an IPv6 address. Lines three and four are the same information, except for UDP. If any of the entries state “127.0.0.1:514” instead of “0.0.0.0:514”, then the port is only bound to the loopback interface. Only the local system will be able to reach it. This can happen with legacy Syslog systems if you forgot to run them with the “-r” switch.

You should now have a logging server that is capable of receiving inbound log entries. In the next post I’ll talk about how these log entries get sorted into specific files.
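If you would like to go one step further and push a test entry at the server from another host, something like the following Python sketch would do it. A few assumptions to call out: the function names are mine, the 192.0.2.10 address is only a placeholder for your logging server, and I am sending a bare “priority, tag, text” message and counting on the server to fill in the timestamp and hostname on receipt.

```python
import socket


def build_test_message(facility: int, severity: int, tag: str, text: str) -> bytes:
    """Build a minimal Syslog payload: <priority>tag: text."""
    pri = facility * 8 + severity
    return "<{}>{}: {}".format(pri, tag, text).encode("ascii", "replace")


def send_test_message(server: str, port: int = 514) -> None:
    """Send one local0.notice test entry to the logging server over UDP."""
    payload = build_test_message(16, 5, "simtest", "remote logging test")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))


print(build_test_message(16, 5, "simtest", "remote logging test"))
# send_test_message("192.0.2.10")  # placeholder, use your logging server's IP
```

Remember this is UDP, so the script will not tell you whether the entry arrived. You still need to check the log files on the server.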

Setting Up A Security Information Management System-Part2

August 10th, 2009

In my last post we discussed defining your goals for a Security Information Management (SIM) system. In this post we’ll talk about architecture concerns as well as capacity planning.

Network communications

The goal will be to have one or more SIM servers that will collect log entries from other systems. This will obviously have an impact on network utilization. How much of an impact will depend on the quantity and type of systems we collect log entries from.

UDP/514

Just about all systems support the original Syslog communication implementation which goes all the way back to the year 1988. The last description of this spec appeared in RFC 3164. While this RFC has been obsoleted by RFC 5424, RFC 3164 still represents the implementation supported by most vendors. Windows is a notable exception (proprietary, no Syslog support), but there is 3rd party software to rectify this.

Both RFCs specify the use of the UDP protocol when transmitting log entries. The well-known port to use is UDP/514. Where RFC 3164 and 5424 differ is in the format of the log message. I’ll dig into these differences in a later post.

The love/hate of using UDP

On the positive side, UDP is connectionless. This means that it generates less traffic than if we used TCP. Also, log transmissions are a one-way process. The host generating a log entry sends a packet to the logging server, but the logging server never replies. This means we can control traffic flow with static filtering rather than stateful filtering, which places less overhead on the traffic control device. Also, the UDP header is only 8 bytes, compared with 20 bytes or more for a TCP header, which means smaller transmission packets and thus less network overhead.

On the negative side, UDP is connectionless. ;) This means that it has minimal error reporting capability. For example if we transmit a log entry and the frame goes missing (say a collision or a firewall dropping the packet), UDP does not have the ability to detect that a retransmission is required. This means it’s possible for log entries to go missing if we overflow the network. Further, UDP has no flow control ability. If the SIM server recognizes it is reaching capacity it has no way to slow down the incoming transmission of log entries. The SIM server’s only option is to throw the packets away without processing them.

Needless to say, we need to ensure that we properly specify capacity. If the network or the SIM server becomes overloaded, we are going to lose log entries. Proper capacity planning starts with understanding the impact of logging on the network.

Network impact of logging

The maximum size of a UDP Syslog packet has different specifications in different RFCs. The outdated RFC 3164 defines the maximum message size as being 1,024 bytes. RFC 5426 takes a different approach: every IPv4 receiver must accept messages of at least 480 bytes and should accept messages of up to 2,048 bytes. If a vendor is still following the old spec, it’s possible they may still treat 1,024 bytes as the limit. It has been my experience however that most log entry packets range in size from 75 to 225 bytes, so the maximums are a non-issue.

Windows systems, firewalls and intrusion detection systems tend to generate the largest messages. Network hardware tends to generate the smallest messages. If we have a 100 Mb Ethernet network, the theoretical maximum would be somewhere around 50,000 to 130,000 frames per second. This assumes zero other traffic, which is rarely the case. For the purposes of capacity planning, assume you will be limited to 5,000 log entries per second. This number might even be less if you have a busy network. Taking some utilization measurements during the planning process is key.
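For those who want to see where a theoretical maximum comes from, here is a rough Python sketch. The overhead figures are my assumptions for standard Ethernet carrying IPv4 and UDP (no VLAN tags, no IP options), and the function name is made up for this example.

```python
ETHERNET_OVERHEAD = 38     # preamble 8 + header 14 + FCS 4 + inter-frame gap 12
IP_UDP_HEADERS = 28        # IPv4 header 20 + UDP header 8
MIN_ETHERNET_PAYLOAD = 46  # shorter payloads get padded up to this size


def max_frames_per_second(message_bytes: int,
                          link_bits_per_second: int = 100_000_000) -> float:
    """Theoretical frame rate for Syslog messages of one size on an idle link."""
    payload = max(message_bytes + IP_UDP_HEADERS, MIN_ETHERNET_PAYLOAD)
    bits_on_wire = (payload + ETHERNET_OVERHEAD) * 8
    return link_bits_per_second / bits_on_wire


for size in (75, 225):
    rate = round(max_frames_per_second(size))
    print(size, "byte messages:", rate, "frames per second")
```

With these assumptions the 75 to 225 byte messages I typically see work out to somewhere around 43,000 to 89,000 frames per second on an otherwise idle 100 Mb link, which is in the same ballpark as the range above. Real networks will land well below that.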

Syslog over TCP

As mentioned above, UDP introduces the problem that log entries can become lost without us even knowing it. There are ways to validate capacity, which I will cover in a later post. Some feel running Syslog over TCP can rectify this problem. TCP can be leveraged for its reliability to ensure our log entries are properly received.

Unfortunately TCP support for Syslog is nowhere near standardized. Some vendors support TCP by simply listening on TCP/514. RFC 3195 defines Reliable Syslog as using port TCP/601, but its adoption has been extremely limited. RFC 5425 defines the use of TLS to secure Syslog transmission. This RFC specifies the use of port TCP/6514. This is a brand new specification and I’m unaware of anyone supporting it just yet.

So support for TCP is all over the board. Further, TCP does not completely fix the problem. While TCP will give us flow control and reliability on the wire, it cannot make up for the fact that Syslog at the application layer does not acknowledge the receipt of log entries. This was by design as it reduces overhead. The problem is that even by using TCP we can still lose messages within the IP stack and never know it is occurring.

So if you want to try and transmit logs via TCP, it’s only going to work between a specific vendor’s client and server software. For example you may need to run Syslog-NG on both ends of the connection to leverage its support for TCP. This is not always practical, as you cannot run the software client on appliances like access points, switches, routers, etc.

Where to place the logging server

When deciding where to place the logging server, we have to keep both network capacity and security in mind. Take a look at Figure #1. This is an ideal situation where the logging server has been isolated to a dedicated network operations network. This isolates it from the other security zones and makes it much easier to leverage the firewall to restrict access to the logging server.

[Figure 1: sim-placement]

The drawing assumes we only need one logging server for our entire environment. What if we have 100,000 nodes to keep track of? Large networks may need to look at aggregating the data. For example if I have 10 field offices, I may need to have a logging server located at each of them collecting local log info. Each of these logging servers would then relay summary information back to the corporate office for network wide trend reports. This way we maintain a high level of visibility while reducing network load. I’ll cover some possible aggregation options in a later post.

How many systems can log to a single logging server?

There is no single answer to this question as each network is different. It is going to depend on how much capacity is available on your network and how many log entries each of these systems generate. For example I could probably point 50,000 switches at a single logging server, as switches tend to generate very few messages. Firewalls on the other hand are extremely chatty, so I might max out the network or SIM server with only 20-50 firewalls. So to answer the question we need to look at two metrics:

  • How much free capacity is there on the wire?
  • How many log entries will each host generate?

The second question is not as straightforward as it may seem. For example the average desktop may only generate 40-100 log entries per day. If we can push 1,000 log entries per second, that works out to about 86 million log entries per day, so the math says we should be able to point somewhere between 860,000 and 2 million desktops at a single logging server. The problem is about 80% of those messages are generated at initial boot time. If everyone typically powers on around 9:00 AM, the math changes to a more realistic 750 desktops (again, assuming we can push 1,000 log entries per second over the wire).

So we can’t just look at quantity of log entries. We need to take time of day into account as well. This will identify the actual number of log entries per second we can expect under worst case conditions. Worst case is the capacity level we need to plan for.
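Here is a small Python sketch of the difference between planning on the daily average and planning on the boot time burst. The one minute boot window is purely my assumption for illustration; measure your own environment to find the real number. The function names are hypothetical as well.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def desktops_by_daily_average(entries_per_second: float,
                              entries_per_desktop_per_day: float) -> float:
    """Desktops supported if log entries were spread evenly over the day."""
    return entries_per_second * SECONDS_PER_DAY / entries_per_desktop_per_day


def desktops_by_boot_burst(entries_per_second: float,
                           entries_per_desktop_per_day: float,
                           boot_fraction: float = 0.8,
                           boot_window_seconds: float = 60) -> float:
    """Desktops supported if most entries arrive in one shared boot window."""
    burst_entries = entries_per_desktop_per_day * boot_fraction
    return entries_per_second * boot_window_seconds / burst_entries


print(desktops_by_daily_average(1000, 100))  # evenly spread over 24 hours
print(desktops_by_boot_burst(1000, 100))     # everyone boots in the same minute
```

With 100 log entries per desktop per day and a capacity of 1,000 entries per second, the daily average suggests 864,000 desktops while the boot burst suggests 750.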

Deploy centralized logging in phases

If you have tens of thousands of systems to deal with, it is easy to get overwhelmed with the work involved with deploying centralized logging. Rolling out the solution in phases makes it easier to wrap your brain around the whole process.

First, start with a single logging server. You may not be able to cover your whole network, but we have to start somewhere. Large networks should consider a deployment at the corporate office first, moving out to field offices once the corporate system is fully vetted and functional.

You will also want to phase in which devices you are collecting information from. I usually go with the following order:

  1. Network intrusion detection systems
  2. Firewalls
  3. Network hardware (routers, switches, access points, print servers, etc.)
  4. Internet facing servers
  5. Internal servers
  6. Internal desktops

Obviously you can tweak this list to fit your needs. For example if you do not plan on collecting info from desktops, simply leave that step out. I like to start with network intrusion systems first as their log entries are well suited for vetting both daily reports and real time alerting. Once we have a handle on alerting and reporting, adding additional devices becomes far easier.

Exec Summary

In this post I covered all the things you need to consider when initially deploying a centralized logging solution. We covered how to predict the impact it will have on network utilization, how to calculate the number of hosts per logging server, and why it is important to deploy the solution in phases. In the next post we’ll start talking about configuring the centralized logging server. Specifically, we’ll look at how we are going to sort log entries.

Setting Up A Security Information Management (SIM) System – Part 1

August 8th, 2009

I get a lot of logging related questions. So much so that I decided to do a series on how to deploy log management. There are some excellent logging resources on the Internet, but they are fragmented in scope and/or vendor specific (usually written by the vendors). I wanted to create something vendor neutral that holds your hand through the entire process of deploying a log management solution.

Why should I deploy a security information management system?

Let’s be candid, deploying log management is hard and painful. This is the reason why so many administrators avoid it like the plague. It is difficult to deploy and a wild ride to administer over the long term. Weekly trips to the dentist would probably be more pleasurable.

With all that said, log management is probably the single most effective security solution you can deploy. You can’t drop it and forget it like a firewall, but log management can give you unrivaled visibility into the inner workings of your network. When it’s not providing insight into security events you might otherwise miss, it is doing double duty helping you troubleshoot communication and system issues. A logging system can be resource intensive, but it can also provide a very high rate of return.

Why do you want a SIM?

Before we begin, the first question you have to ask yourself is why do you want a SIM solution. Do you want to improve security or is there a compliance specification you need to adhere to? It might seem odd to want to distinguish between the two, but the requirements are drastically different. Standards are far easier (and cheaper) to meet than true security.

Standards such as PCI-DSS require you to log user, application and network activity. However they tend to be very vague in how that information gets processed. You can usually get away with dropping in a black box, generating some colorful management reports, and be considered “compliant”. It may not help you find that backdoored system that’s calling home, but you’ve met the standard.

Standards tend to focus on the lowest common denominator. They need to be applicable for a wide range of audiences, including businesses without a lot of resources. Rather than evaluating a specific organization’s risk and basing the requirements on that, we set the bar low so it is achievable by small and large organizations alike.

Also, to simplify the process, we tend to focus on checklists. Checklists are cool because they tell you exactly what needs to be done to be compliant. If an auditor can put a checkmark next to all the items, you pass the testing. The problem is checklists tend to focus on symptoms, not the actual problem.

I’ll give you a great example. I had a client bring in a Qualified Security Assessor to certify them for PCI-DSS. This was one of my clients running a strict implementation of application control, so they could show a year and a half history of zero Malware infections. While they certainly received Malware over that time, we could prove that there were zero instances of actual infection as every Malware attack was immediately contained and eliminated. Not many businesses can claim a year+ with zero Malware infections.

The auditor failed them. PCI-DSS requirement #5 states: “anti-virus software must be used on all systems commonly affected by Malware”. Since they ran application control, not anti-virus, they were deemed non-compliant. If requirement 5 had been written to identify an acceptable threshold for Malware containment, they certainly would have met the specification. However risk evaluation and metrics do not make for easy checklist items.

So if you want to deploy a SIM to actually augment security in your environment, it is going to take longer and require more work than simply meeting a specification.

Should you build your own SIM?

I’m a firm believer that anyone considering a SIM solution should start by building his or her own. While there are some decent commercial SIM solutions out there, they isolate you from the inner workings of the logging process. This can be a good thing in that it saves you time. The problem is you will not learn as much.

Also, log management deployment is a journey. You will find in the course of a rollout that your requirements may change. Information you initially thought was important, all of a sudden is not. Reports you didn’t even think of, all of a sudden jump to the top of the list. By building your own system you will have more flexibility to make changes on the fly. If you later decide you want a commercial solution, you are now better informed of your requirements and can do a better job evaluating a potential purchase. This is important, as many log solutions are expensive. You don’t want to drop a lot of money on a solution that will not meet your long-term needs.

I’ll give you a good example. Most of the sites I’ve worked with initially think failed logons are important and want to see the reports. It does not take them long to figure out seeing all failed logons is a complete waste of time as everyone fat fingers the keyboard on occasion. They then realize they want some thresholds around the data. For example they only want to see failed logons if three or more failures are seen in five seconds (indicating an automated attack). Or only show failed logons when multiple logon names are used from the same source IP (indicating a password guessing attack). So by dealing with some information overload, they become better skilled at defining exactly what they wish to see.
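To show what those two thresholds might look like in practice, here is a minimal Python sketch. It assumes the failed logon events have already been parsed out of the logs into (timestamp in seconds, source IP, logon name) tuples sorted by time. The function names and the default of two distinct names counting as “multiple” are my own choices, not something any particular SIM product does.

```python
from collections import defaultdict, deque


def burst_alerts(events, max_failures=3, window_seconds=5):
    """Flag a source IP each time it hits max_failures within the window."""
    recent = defaultdict(deque)  # source IP -> timestamps still in the window
    alerts = []
    for timestamp, source_ip, _name in events:
        window = recent[source_ip]
        window.append(timestamp)
        while timestamp - window[0] > window_seconds:
            window.popleft()     # forget failures older than the window
        if len(window) >= max_failures:
            alerts.append((timestamp, source_ip))
    return alerts


def guessing_sources(events, min_names=2):
    """List source IPs that failed logons under several different names."""
    names = defaultdict(set)
    for _timestamp, source_ip, name in events:
        names[source_ip].add(name)
    return sorted(ip for ip, seen in names.items() if len(seen) >= min_names)


events = [
    (0, "10.0.0.5", "root"), (1, "10.0.0.5", "admin"), (2, "10.0.0.5", "guest"),
    (100, "10.0.0.9", "bob"), (400, "10.0.0.9", "bob"),  # just a fat finger
]
print(burst_alerts(events))      # [(2, '10.0.0.5')]
print(guessing_sources(events))  # ['10.0.0.5']
```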

Summary

OK, so we’ve covered defining a focus (security vs. standards requirement) as well as the importance of initially building your own system. In the next installment I’ll get into architecture and capacity planning.

DLP FAQ

August 7th, 2009

I’ve had a few queries regarding the SANS Data Leak Prevention & Encryption Summit I’ll be keynoting next month. The questions have revolved around DLP in general, so I thought I would give a run down on the technology.

What is DLP?

DLP stands for “Data Leak Prevention” or “Data Loss Prevention”, depending on which vendor you are talking to. There are a few other names currently being bounced around (gotta love marketing people trying to make their stuff look newer and cooler ;) ), but they are effectively the same technology. DLP attempts to log, or possibly prohibit, the transfer of sensitive information from a secure location to an insecure location.

Sensitive information usually includes data like credit card numbers or social security numbers. Most will also give you the ability to define phrases or specific files as sensitive as well. Of course how much customization you get depends on the product, but these features are pretty standard. The big difference tends to be with the ease of policy creation. Some let you use a simple, natural language while others may require you to learn a Regex type of expression language to create policies and write filters.

Think of DLP devices as intrusion detection systems for specific keywords and you’ll get the idea. In fact some established NIDS and NIPS vendors are now touting their DLP capabilities as well. You also have a number of startups that are focused specifically on the DLP market.
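To give a feel for what a Regex style filter looks like, here is a minimal Python sketch that hunts for credit card numbers in a block of text. The pattern and the Luhn checksum test are my own illustration of the general idea. I am not claiming any particular DLP product works this way, and a real filter would need to handle far more formats.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(digits: str) -> bool:
    """Apply the Luhn checksum used by payment card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:       # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


print(find_card_numbers("card 4111 1111 1111 1111 on file"))
```

The checksum step is there to cut down on false positives, since plenty of harmless 16 digit numbers (order numbers, tracking numbers) pass through a network every day.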

How does DLP work?

Currently there are three different methods of DLP deployment:

  • On the wire
  • On the server
  • On the desktop

Some vendors support a single method of deployment while others support all three. There are strengths and weaknesses to each, which I will cover later in this FAQ.

How much does DLP cost?

Since it’s a new technology, prices are all over the board. A medium size company (50-500 nodes) can expect to pay anywhere from $30,000 to $200,000 US. These devices are by no means plug and play, so a portion of the cost includes configuring the device and customizing it for the specific environment. You should also expect a bit of lead-time in getting the device(s) deployed properly.

What are the problems with DLP?

Probably the biggest problem with DLP technology is that it can easily be defeated. It is really designed to prevent accidental data leakage, rather than a true attack. You should consider DLP an enhancement to your existing security posture, not a replacement for any previously deployed technology.

For example, deploying DLP on the wire is probably the fastest and most effective deployment. The problem is that it can easily be defeated by encryption. If I encrypt a sensitive file prior to transmission, or leverage a VPN technology (see items 5 and 4 in my Top 5 Firewall Threats posts), the network based DLP will be unable to see the passing information.

Some DLP devices can give you limited ability to work around the encryption problem. For example Fidelis will integrate with a number of proxy products to check passing HTTPS. You have to purchase a supported product, however, and configure it specifically to prevent end-to-end encryption of HTTPS (the proxy breaks the encrypted stream so the payload can be analyzed). Even then you’ve only solved the problem over HTTPS. Encrypted data through other ports will still be an issue. Or an attacker could encrypt the file locally and then transmit it via HTTPS, because all the proxy can strip away is the SSL encryption.

Deploying DLP on the desktop solves some of these problems, but not all of them. For example the desktop agents I’ve looked at do a pretty good job of preventing me from transferring a sensitive file via the Internet or to a local USB drive. If you run an agent based DLP, try this:

  1. Open a sensitive file
  2. Create a screen capture of sensitive info (CTRL-ALT-Print screen)
  3. Open Windows Paint and press CTRL-V
  4. Save the file as a GIF or JPG
  5. Copy to a USB drive or transfer via the Internet

If your results are similar to mine, you’ll find this very simple trick fools the agent into letting the data pass by. If you wanted to get really slick, you could add a bit of Steganography.

Exec Summary

DLP is a powerful technology that can help prevent the release of sensitive information. Currently it is better suited to preventing accidental data leakage than to stopping a determined attacker. If the release of sensitive data is a serious concern, you may need to rework your current architecture in order to close the holes DLP cannot defend.

If you can read this, you don’t work for SANS – part 2

August 5th, 2009

This issue appears to have been resolved. Kind of funny actually. I had been dealing with the Host Monster ticket system and it was taking 24 hours to get a reply. This morning I made a post to Matt Heaton’s blog (CEO of Blue Host) about the problem. It was resolved within hours and I’ve already received 3 follow ups from support.

Host Monster support states that the problem was that DShield put themselves (and I assume Cisco as well) on their own ban list. I spoke with Johannes at DShield. I’ve known him for 10 years and he’s a real straight shooter. He had no clue what they were talking about and had not heard of this problem from anyone else. Sounds a little funny to me, because if Host Monster were actually using DShield to generate a ban list, they would have known DShield were the good guys last Thursday when I first contacted support.

In any event it appears that all of the previously mentioned blocks have been cleared. Who says security and daytime soap operas have nothing in common? ;)

If you can read this, you don’t work for SANS

August 4th, 2009

Anyone that has trained with me can tell you I’m real big on being able to read your own detects. While we have plenty of security devices that try to accurately describe what they think they see on the wire, they are programmed by humans and humans make mistakes. Try and automate the process and the mistakes become compounded. Even Cisco has backed off a bit on their grandiose claims of what a self defending network is actually capable of. Nothing replaces having a skilled analyst reviewing the findings.

Good help is hard to find

Of course the keyword in that last comment is skilled. I’ve dealt with plenty of senior security folks that have never seen a decode of an IP packet, let alone been able to tell you what a legit IP session should look like. One of the problems is that when we need training we usually turn to the vendors. Vendors tend to focus on their pretty GUI, not what’s going on behind the scenes.

In a previous life I owned an ISP and had some very entertaining abuse reports submitted. One of my personal favorites was an admin reporting that one of my systems was “sending hostile ICMP packets” to one of his systems. When I reviewed my logs, I noted that one of my routers was in fact sending him ICMP host unreachable messages. This would happen every time his host probed the RPC port of an IP address that was not in use. I wrote back and explained that if his system would simply stop probing for non-existent systems, my router would stop telling him the host is off-line.

Another admin (at a rather large, well known company I might add) informed me that one of my systems was attacking him with Code Red via e-mail. If you remember Code Red, it only attacked IIS Web servers via HTTP. The “attacks” in question were messages from users subscribed to a mailing list. Folks were talking about how to write good intrusion signatures to properly catch Code Red. If that was not ironic enough, the payload of the decode he sent me as evidence explained that the attacks were only HTTP based. If that twist is still not enough to make you chuckle, he later admitted that he was one of the people subscribed to that list. :D

The more things change the more they stay the same

My hosting provider Host Monster (a subsidiary of Blue Host) has put a filter in place blocking all access to the SANS Institute mail server (SysAdmin, Audit, Network, Security Institute; provides computer security training), The Internet Storm Center (daily diary of Internet security threats) and DShield (an early warning system for Internet threats). I contacted support and they confirmed they are filtering these sites. I was unable to find out why beyond “due to suspicious activity”.

I know the folks that maintain the SANS and DShield servers. They are hard core security folks with a serious clue. When I first signed up with my hosting provider I was impressed with the knowledge level of their support personnel. Lately however, I’ve found them to be lacking in even the basics. While I’m left to guess as to what actually caused the ban, I’m inclined to think that someone at Host Monster (or possibly Blue Host) saw an alert but didn’t have the skills to figure out it’s a false positive.

Communication is a two way street

Blue Host claims 1.5 million hosted sites through all of their holdings. So they now have 1.5 million clients that can’t:

  • Receive real time blocking alerts of malicious IPs
  • Receive assessments on current Internet threats
  • Receive info on what’s going on in the security industry

So while attempting to protect themselves is a positive thing, the implementation has had a negative effect on the security of their clients.

How to verify a detect

So let’s pull something positive out of all of this and identify the proper procedure for verifying a security alert. We first need to start with good gear. Do not even consider an intrusion detection or prevention system that does not include:

  • Access to the signature language
  • Full decode of suspect packets

Without these features you are shooting in the dark.

Step 1: Understand the attack

When an alert gets triggered, make sure you understand the attack mechanism. What ports or services does it go after? Are there any known signatures? If you Google the attack’s name followed by the key words “false positive” and “spoofed”, does anything come up?

Step 2: Understand your intrusion system

No security product is perfect. They all have weaknesses or limitations. Does your intrusion system maintain state? If so, is it all the time or just some of the time? Does it properly validate CRC fields? How does it deal with fragmented traffic? Is it known to generate false positives? If so, are the false positives limited to only certain signatures or protocols, or is it all of the time?

Step 3: Sanity check the alert

Sometimes false positives can be weeded out from the limited amount of info presented in an alert. For example does the alert claim to have detected an HTTP attack coming from TCP/80 instead of going to it? If so, there is an obvious problem with the signature generating the alert.
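
The direction check above is easy to automate. Here is a minimal Python sketch. The Alert fields and the table of expected server ports are hypothetical and would need to match whatever your intrusion system actually reports.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    signature: str
    src_port: int
    dst_port: int

# Hypothetical mapping of a service to the port its server side listens on.
EXPECTED_SERVER_PORT = {"http": 80, "smtp": 25, "dns": 53}

def direction_problem(alert, service):
    """Return a description of the problem, or None if the direction looks sane."""
    expected = EXPECTED_SERVER_PORT[service]
    if alert.dst_port == expected:
        return None  # traffic is headed to the server port, as an attack would be
    if alert.src_port == expected:
        return f"{alert.signature}: traffic is coming from port {expected}, not going to it"
    return f"{alert.signature}: neither port is {expected}"

suspect = Alert(signature="HTTP directory traversal", src_port=80, dst_port=51234)
print(direction_problem(suspect, "http"))
```

A non-empty result does not prove a false positive, but it is a good reason to go read the signature before acting on the alert.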

Step 4: Check the signature

Some signatures are written very specifically so that there is little chance of a false positive. Some are more general, however, so it’s possible for false positives to fall out. Review the signature that generated the alert and make a judgement call. Does the signature check 3-4 different conditions or ten or more? Obviously the more parameters we are checking, the less likely we are to get a false positive.

Step 5: Check the decode

If you understand the attack pattern, you should already have an expectation of what will be in the attack decode. Does the packet match your expectations? I’ve seen plenty of false positives generated by people reading info on a Web site describing an HTTP based attack. These are easy to distinguish due to the extra HTML, proper agent and referrer fields, etc. In short, if the packet does not match a known decode of the real attack, figure out why.

Step 6: Research the source

I always take the time to make sure I understand who is sitting behind the source IP address. Sometimes this can go a long way towards identifying whether I can trust the alert. I’m reminded of a friend that banned a number of IP addresses his intrusion system had identified as hostile. Shortly after he started noticing that parts of the Internet were no longer reachable. Turns out someone spoofed a series of attacks from the IP addresses of the root name servers. Had he taken the time to look up the IP addresses first, he most certainly would not have blocked them.
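
One way to build this check into the process is to hold back any ban that touches a list of addresses you can never afford to block. The sketch below uses Python’s standard ipaddress module. The protected networks are placeholders from the documentation address ranges, not real root server addresses, so substitute your own list.

```python
import ipaddress

# Placeholder entries only. A real list would hold root name servers, your
# upstream DNS resolvers, business partners and so on.
NEVER_BLOCK = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.53/32"),
]

def safe_to_ban(candidate, protected=NEVER_BLOCK):
    """Return False if the candidate address falls inside a protected network."""
    address = ipaddress.ip_address(candidate)
    return not any(address in network for network in protected)

def review_ban_list(candidates):
    """Split candidates into those approved for blocking and those held for review."""
    approved, held = [], []
    for candidate in candidates:
        (approved if safe_to_ban(candidate) else held).append(candidate)
    return approved, held

approved, held = review_ban_list(["203.0.113.7", "192.0.2.10", "198.51.100.53"])
print("block:", approved)
print("needs a human:", held)
```

Anything that lands in the held list is exactly the kind of source address worth researching by hand before it goes anywhere near a block rule.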

Exec Summary

Blocking known hostile IP addresses can certainly be beneficial to security, but it must be implemented with caution. At the core of any network security system must be a knowledgeable security expert with good common sense. If that component is missing, the whole structure can fall apart like a house of cards.

Update: Since posting this I’ve found that Host Monster (Blue Host) is blocking access to one or more Cisco servers as well. Guess the list continues…

Top 5 Firewall Threats – Part 2

August 3rd, 2009

In the last post I started counting down the five greatest threats to perimeter security. In this post I’ll complete the list.

Firewall Threat #3: Outbound HTTP

The popularity of HTTP (TCP/80) has become both a blessing and a tragedy. Certainly the Internet would not be as popular as it is today without the World Wide Web. While HTTP has led to the greatest exchange of information in mankind’s history, our implementation of the service has caused it to become one of our greatest security problems on the Internet.

Why is it a threat?

The first issue is that TCP/80 access has become so commonplace that many firewall administrators have chosen to ignore it. An overwhelming majority of the networks I have audited permit outbound TCP/80 access but then never log its use. When I ask why the permitted traffic pattern is not being logged, the standard answer I receive is “it makes my firewall logs too big”. Hummm, didn’t realize the perimeter security mantra was “no fatties”. ;)

Permitted traffic is inherently a higher risk than denied traffic because it facilitates the exchange of information. That passing traffic could be a zombie calling home or an internal system leaking sensitive data. If we do not log the use of a permitted protocol, we are completely blind to its abuse. The problem “how do I process large log files?” is much easier to solve than “how do I spot evil traffic when I’m not bothering to look for it?”.

Since HTTP has become a “turn it on and forget it” service, vendors and attackers alike have started running everything through this port. The brainchild at Microsoft who thought tunneling RPC through DCOM through HTTP was a good idea obviously had zero concern for how we would actually secure the implementation. While IRC used to be the protocol of choice for call home Malware, it is now HTTP because attackers can usually count on that port being wide open and unlogged.

How to prevent it

All permitted traffic patterns need to be logged. This includes outbound HTTP traffic traveling from the internal network to the Internet. In a later post I’ll tackle the problem of processing firewall log files so it is relatively easy to pull out the interesting bits.

Firewall threat #2: Banner grabbing

Most Internet based servers will happily identify themselves to connecting clients. For example whenever you connect to this hosted server, your browser sees:

Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8k DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635

While this information can be useful for troubleshooting, it can also be extremely useful for someone wanting to attack the system.

Why is it a threat?

Consider the following analogy: You are playing Five Card Draw against a number of opponents. Your opponent’s cards are well hidden in their hands, while your cards are laid out on the table for all to see. What are your chances of walking away the big winner at the end of the night?

Displaying version banners to connecting clients puts you in a similar position. If an attacker can see what software you are running along with the specific version, they can immediately determine if you are vulnerable to any of the attacks in their arsenal. So by displaying a software banner you’ve effectively helped the attacker get it right on the first try.

Without the benefit of the banner, the attacker would be forced to try each of their attacks in order to see if they will work. If we are vulnerable, we’re still going to get whacked. If we’re not, we’ve just forced the attacker to start generating log entries that will clue us in that the source IP is hostile. In other words, we’ve called their bluff so we now get to see their losing cards. This gives us an audit history and time to respond accordingly.

How to prevent it

Change the banners on any Internet facing services. This includes Web, FTP, name servers, mail servers, or any other service that can be reached from the Internet. Do not forget to restart the service after you make the change.

How easy or hard this is depends on the vendor. For example to fix this problem with the Apache Web server we would simply edit the “httpd.conf” file and change the “ServerTokens” parameter to be “Prod”. With IIS however, we do not have this type of flexibility. Microsoft does not let you change the banner, to help ensure they can properly identify their current market share. Your only real option is to put a reverse proxy in front of the Web server and leverage the proxy to scrub the banner.
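
After making the change it is worth confirming that the banner really is clean. Here is a minimal Python sketch that pulls the Server header out of a raw HTTP response and flags anything that looks like a version number. The function names and the version pattern are assumptions, and the sample response is built from the banner shown earlier in this post.

```python
import re

# A slash followed by dotted digits, as in "Apache/2.2.11".
VERSION_PATTERN = re.compile(r"/\d+(?:\.\d+)*")

def server_header(raw_response):
    """Return the value of the Server header in raw HTTP response text, or None."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            return value.strip()
    return None

def leaks_version(banner):
    """Return True if the banner exists and contains something like a version number."""
    return bool(banner) and VERSION_PATTERN.search(banner) is not None

before = "HTTP/1.1 200 OK\r\nServer: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8k\r\n\r\n"
after = "HTTP/1.1 200 OK\r\nServer: Apache\r\n\r\n"

print(leaks_version(server_header(before)))
print(leaks_version(server_header(after)))
```

The same idea carries over to FTP and mail server greetings, although each protocol presents its banner in a different place.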

Can changing the banner cause problems?

Most vulnerability scanners are primarily banner grabbing devices. For example when you run a vulnerability scanner against your mail server, it does not try every attack pattern it’s been programmed to test. Rather, it will grab the server’s banner and check it against a built in database. If the reported software version has known vulnerabilities, they get printed to a report. If you have ever run a vulnerability scan which claims to check for thousands of known attacks, but your IDS barely notices the scan, this is why.

Now with that said, not all checks are banner based. For example reading the banner will not tell the vulnerability scanner whether your mail server can be used as a spam relay. The scanner has to specifically test for that condition. So some exploits do need to be tested directly. Simply reading the banner however can satisfy a majority of the verification testing.

So, changing the banners not only makes it more difficult for attackers to assess your vulnerabilities, but it makes it more difficult for you to do so as well. You may be forced to drop to the command line to verify the version of software you are running. Luckily most software supports a “-v” or “-V” option which will print out its version information. Sometimes a different switch value is used so we will need to do a bit of research in the application’s help files. For example to get version information for Sendmail we would type:

[root@fubar ~]# sendmail -d0.1

Version 8.14.3

Compiled with: DNSMAP HESIOD HES_GETMAILHOST LDAPMAP LOG MAP_REGEX

MATCHGECOS MILTER MIME7TO8 MIME8TO7 NAMED_BIND NETINET NETINET6

NETUNIX NEWDB NIS PIPELINING SASLv2 SCANF SOCKETMAP STARTTLS

TCPWRAPPERS USERDB USE_LDAP_INIT

Firewall threat #1: Non-signature Malware

I’ve written extensively on the problems with detecting Malware. Feel free to use the search option on this site to pull up earlier posts for more info.

The bottom line is we try very hard to solve this problem at the perimeter by leveraging Unified Threat Management (UTM), firewall plug-ins, anti-virus proxies, etc. These solutions will never be 100% effective. In fact, their effectiveness has been declining sharply over the last few years. If you truly want to get a handle on modern day Malware threats you have to look at an application control solution.

Exec Summary

So the top five firewall threats are:

  1. Non-signature Malware
  2. Leaking banner info
  3. Outbound HTTP
  4. Outbound SSH
  5. Commercial VPN services

Note that the last four of the five are outbound traffic patterns. While we tend to focus heavily on what is trying to get in to our network, we also tend to blindly trust the traffic leaving it. It’s this misplaced confidence that has led to each of these items making it to our list.

Top 5 Firewall Threats – Part 1

August 1st, 2009

I spend a bit of time in the field consulting. Over the last few years I’ve noticed a trend with new clients whereby I can usually (about 7 out of 10 times) identify “something” leaking out through the client’s perimeter that they were unaware of. I thought I would compile a list of the top 5 so folks could sanity check their own security.

Firewall Threat #5: Commercial VPN services

It amazes me how often employees will do an end run around perimeter security and set up their own VPN solution. I even see this on networks that already have an existing VPN solution in place. Could be the end user thinks the corporate solution is too restrictive. Could be they simply want to be able to extract info without going through the corporate content checking solution. Unfortunately there are a number of third party companies that are more than happy to sell VPN services that will circumvent the average firewall policy.

How does it work?

There are a number of VPN service vendors, but the most popular is GoToMyPC (now owned by Citrix), so I will talk about that service specifically. Implementation for the end user is pretty straightforward:

  • Create an account and pay the $20/month fee
  • Load agent software on their system at work
  • Go home and launch their Web browser

The user can then log on to their work system and control their desktop remotely. They can even transfer files by dragging them in or out of the session window.

Why are third party VPN services a problem?

Third party VPN services allow your end users to create an encrypted tunnel between their work system and systems on the Internet. If you have implemented some form of Data Leak Prevention (DLP) or content checking on the perimeter to control the flow of company private information, it will be defeated by a third party VPN. This means your end users could be transferring anything out of your network. Not only do you have no control of what info passes through the VPN tunnel, these services try very hard to look like normal traffic so you may not even know this is happening on your network.

Under the hood

Figure #1 shows how a third party VPN solution can work. Note that session establishment originates from the work system, not from a system on the Internet. So as far as the firewall is concerned this is an outbound session. Further, many of these services use TCP/443 to communicate and SSL to secure the session. So from the firewall this looks like normal HTTPS traffic. Most admins do not log outbound HTTPS, so there may not be any log entries to review.

[Figure #1: gotomypc]

How to prevent it

There are a number of possible solutions to curb this activity:

1) Review outbound firewall logs

Specifically, you want to look at outbound traffic taking place after hours. Filter out patching sites (Microsoft, Adobe, etc.), A/V signature updates, or any other expected pattern. Everything else should be scrutinized. Pay close attention to repeated connection attempts to the same host or subnet at a predictable interval (say every 20-60 seconds).
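
The predictable interval check lends itself to a quick script. Here is a minimal Python sketch, assuming you have already reduced the firewall log to (timestamp in seconds, source, destination) tuples and filtered out the expected patterns. The jitter tolerance and the minimum number of connections are arbitrary starting points, not tuned values.

```python
from collections import defaultdict
from statistics import mean, pstdev

def beacon_candidates(connections, min_events=5, min_interval=20.0,
                      max_interval=60.0, max_jitter=2.0):
    """Return ((src, dst), average gap) for pairs that connect at a steady interval."""
    by_pair = defaultdict(list)
    for timestamp, src, dst in connections:
        by_pair[(src, dst)].append(timestamp)

    suspects = []
    for pair, times in by_pair.items():
        if len(times) < min_events:
            continue
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        average = mean(gaps)
        # Steady means the gaps average inside the range and barely vary.
        if min_interval <= average <= max_interval and pstdev(gaps) <= max_jitter:
            suspects.append((pair, average))
    return suspects

# Hypothetical after hours log entries.
connections = [(t, "10.0.0.5", "203.0.113.9") for t in (0.0, 30.0, 60.0, 90.0, 120.0)]
connections += [(t, "10.0.0.8", "203.0.113.20") for t in (0.0, 5.0, 200.0, 210.0, 900.0)]

print(beacon_candidates(connections))
```

Only the first pair is reported, since the second connects at irregular intervals that look more like a person than an agent calling home.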

2) Desktop enforcement

Leverage application control or a similar technology to control which applications your users can install on their desktop. While this option may seem to be the most cumbersome, it’s also the only one that will work most consistently.

3) Block the IP addresses of known VPN services

While this option will work, it requires research to find out which IPs you should ban. Also, the list is going to change as new vendors come on the scene or IPs get moved around. For these reasons it’s my least favorite option.

Firewall Threat #4: Secure Shell (SSH)

It pains me to put SSH on this list, as over the years I’ve found it to be one of my most useful tools. Unfortunately SSH can be your worst nightmare when it’s in the hands of your end users.

How does it work?

Most people know you can use SSH as a secure replacement for Telnet. What is not as well known is that SSH can be used to tunnel TCP traffic. This can be implemented as either a forward tunnel, or a reverse tunnel.

SSH forward tunnels

When a user creates an SSH forward tunnel, the SSH client will open up whichever local TCP port is specified by the user. Further, the user can instruct the remote SSH server where to send any traffic sent to this local port.

An example is shown in Figure #2. Let’s say that all outbound Web traffic is sent through a content checking system. The firewall checks outbound HTTP to ensure that only appropriate sites are accessed and that certain keywords are not permitted through. Let’s further assume that we have an internal user who wishes to access inappropriate sites.

[Figure #2: ssh-forward]

Here’s what the end user would do. On their home system they would run a proxy server as well as an SSH server. From their office system they would run an SSH client in order to connect to their SSH server and request a forward tunnel. They would tell the SSH client to open a local port (say 8080) and tell the SSH server to forward all of this traffic to the local proxy server. Now all they need to do is tell their Web browser to use the proxy server located at Localhost, port 8080.

Voila! When they launch their Web browser it will connect to TCP/8080 on the local system. This traffic is then sent down the encrypted tunnel to the SSH server, and then on to the proxy. The proxy server will then connect to whatever site on the Internet the user specified. The firewall cannot content check the passing data or control which sites get accessed, because the datastream is encrypted. Our perimeter security has been circumvented.

By the way, the user does not have to limit access to only their personal system. It’s possible to tell the SSH client to service connections for other hosts on the wire as well. I remember an on-site I once performed where my porn signatures were triggering on the internal network but not at the perimeter. Turned out one of the engineers had done the above and opened up access to their buddies. So when another engineer connected to his system, the porn content was visible. When it left his system and went to the Internet, it was now encrypted and protected from detection.

SSH reverse tunnels

SSH reverse tunnels are similar to SSH forward tunnels, but are designed to handle data requests headed in the other direction. This is shown in Figure #3. Let’s say we have an internal server which is only accessible to internal employees. In other words, the firewall does not permit inbound connection requests from the Internet to be forwarded to the server. In fact there may not even be any one-to-one NAT or port forwarding in place for folks on the Internet to even try and access the server. Let’s further assume that we have an end user who wishes to expose this server to the Internet.

[Figure #3: ssh-reverse]

This time the user only needs the SSH server on their home system. From their office system, they launch an SSH session out to their server and request a reverse tunnel. With a reverse tunnel the user tells the SSH server to listen on some local TCP port (say TCP/80) and send any inbound connection requests to the SSH client. The user then tells the SSH client to forward these connection attempts to a specific port at a specific IP address (say TCP/80 on an internal HR Web server).

Once this connection is complete, anyone connecting to TCP/80 on the user’s home system will be forwarded to TCP/80 on the internal HR Web server. The firewall cannot control the session because it looks like outbound traffic and all the data is encrypted. In fact you can’t even detect this activity on the HR Web server as all connection requests will be logged as originating from the user’s internal desktop system (the one running the SSH client).
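
If you can audit desktops, both tunnel types are easy to recognize in a process listing or shell history, since the OpenSSH client uses -L for a forward tunnel and -R for a reverse tunnel. Here is a minimal Python sketch of that check. The host names and the proxy port (3128) are made up for illustration, and the function only handles the flag and its argument written as separate words.

```python
FORWARDING_FLAGS = {"-L": "forward tunnel", "-R": "reverse tunnel"}

def describe_tunnels(argv):
    """Return (tunnel type, port:host:port spec) for each forwarding flag in argv."""
    found = []
    for flag, spec in zip(argv, argv[1:]):
        if flag in FORWARDING_FLAGS:
            found.append((FORWARDING_FLAGS[flag], spec))
    return found

# The forward tunnel example: local port 8080 is sent to a proxy on the home system.
forward = ["ssh", "-N", "-L", "8080:localhost:3128", "user@home.example.net"]

# The reverse tunnel example: port 80 on the home system leads to the internal HR server.
reverse = ["ssh", "-N", "-R", "80:hr-web.internal.example:80", "user@home.example.net"]

print(describe_tunnels(forward))
print(describe_tunnels(reverse))
```

This only helps on systems you can inspect and with clients that take these flags on the command line, so treat it as a supplement to the network controls, not a replacement.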

How to prevent it

Take control of all outbound SSH activity. Block TCP/22 outbound and only permit it through when it’s verified the traffic meets corporate policy. SSH can be configured to listen on any TCP port, so we have to be able to spot non-standard port use as well. Leverage a NIDS or NIPS signature that checks the first three incoming TCP reply packets for an SSH server banner. This should be done on all TCP ports except 22. The banner will contain the string “SSH-1” or “SSH-2”. While it’s possible to hack the client and server software to change the banners, most end users do not have the necessary skills to modify the software and still have it remain functional.
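
The logic of that signature can be sketched in Python as follows. This is not the syntax of any particular NIDS. It assumes something upstream has already collected the payloads of the first reply packets for each outbound session, and the input format is hypothetical.

```python
SSH_BANNER_PREFIXES = (b"SSH-1", b"SSH-2")

def is_ssh_banner(payload):
    """Return True if a reply payload starts with an SSH server banner."""
    return payload.startswith(SSH_BANNER_PREFIXES)

def hidden_ssh_servers(replies, packets_to_check=3):
    """Return (server IP, server port) for SSH servers answering on ports other than 22.

    replies holds (server_ip, server_port, payloads) tuples, where payloads is the
    list of reply packet payloads in the order they arrived.
    """
    hits = []
    for server_ip, server_port, payloads in replies:
        if server_port == 22:
            continue  # TCP/22 is handled by the firewall policy instead
        if any(is_ssh_banner(p) for p in payloads[:packets_to_check]):
            hits.append((server_ip, server_port))
    return hits

# Hypothetical captured sessions.
replies = [
    ("203.0.113.9", 443, [b"", b"SSH-2.0-OpenSSH_5.2\r\n"]),
    ("203.0.113.10", 22, [b"SSH-2.0-OpenSSH_5.2\r\n"]),
    ("203.0.113.11", 80, [b"HTTP/1.1 200 OK\r\n"]),
]

print(hidden_ssh_servers(replies))
```

Only the server answering on TCP/443 is reported, which is exactly the case a port based firewall rule would miss.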

That’s enough for this post. In the next I’ll include the last three threats.
