Sometimes, resolving network issues just requires a bit of creativity. Years ago, I had an issue with a T1 connected to a Linux box through a very inexpensive ISP. Via the same ISP, I had another Linux box - with another T1 connected. To prevent confusion, we'll call these systems Florida and New York.
One morning, Florida complained that they "could not access the internet". This is a common complaint, usually remedied by turning on the computer. Being skeptical, I opened an SSH connection to their firewall and responded that the internet connection was fine. They said that no pages were loading, so I logged into another system on the network behind it, and indeed, internet access was degraded.
I noticed that the problem appeared to be limited to DNS. But after a few tests I learned something interesting - Florida could not contact New York at all. Everything else worked - just not the connections over the same ISP. But why? I immediately called the ISP to get to the bottom of this.
Unfortunately, the tech support there wasn't as great as Pantek, and I spent a lot of time sitting idle on the phone waiting for action. Since this was during the start of the business day - I couldn't possibly wait for them, so I opted to figure out my own resolution.
Since I'm in Cleveland, I can't just walk over there and re-configure the workstations (particularly since there were about 75 of them, not to mention that they would need to be restored after the change, so my work would effectively be doubled). Instead, I opted to use the resources I had on hand to solve the problem.
Prior to making changes, it was necessary to identify the real business problem - people couldn't access the internet or other web pages. Since the symptoms started with DNS failures I started there.
On the Florida firewall, I installed BIND and downloaded all zone files to it, so that it could act as the master for our domain. Then, once it was started, I used iptables to redirect outbound port 53 traffic to the local host, so that the router/firewall combo could respond to all DNS queries itself.
Once this was accomplished (and people were once again able to access their Gmail), it was time to restore access to our domain's email and web pages (since 97% of all business functions run through the domain website and mail servers).
Unfortunately, you can't just copy your mail server configuration over to a new system and magically have email arrive. So, I counted on a combination of two packages - rinetd and, of course, iptables. On a server in Cleveland, I installed rinetd to listen on the web and mail ports (including SSL) and to immediately forward those requests to the original servers. Then, I configured iptables on the Cleveland server to block access from every system except Florida.
rinetd is a great utility - it performs TCP-based port forwarding by accepting a connection, opening a new connection to the target, and then relaying everything received on either connection to the other. Once I confirmed that the Cleveland setup was working, I configured rinetd again, this time on the local firewall in Florida, forwarding all connections to the Cleveland system. Then, I tested by connecting to localhost on one of the aforementioned ports.
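To make the mechanism concrete, here is a minimal Python sketch of the same accept, connect, and relay idea. It is not rinetd's actual code and not what I ran that day - just an illustration. The function names (`pipe`, `handle`, `start_forwarder`) are made up for this example, and it skips the logging and access rules that a real forwarder would need.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reaches EOF or errors out."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Tell the other side that no more data is coming in this direction.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, target_host, target_port):
    """Open a second connection to the target and relay in both directions."""
    try:
        upstream = socket.create_connection((target_host, target_port))
    except OSError:
        client.close()
        return
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def start_forwarder(listen_host, listen_port, target_host, target_port):
    """Listen on one address and forward each connection to the target.

    Returns the listening socket so the caller can read the bound port
    or close it later.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((listen_host, listen_port))
    listener.listen()

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                break  # listener was closed
            threading.Thread(
                target=handle,
                args=(client, target_host, target_port),
                daemon=True,
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Chaining two of these (one on the Florida firewall pointing at Cleveland, one in Cleveland pointing at New York) gives the same detour described here, with each hop simply passing bytes along.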
What did this do? It essentially forwarded all connections arriving at the Florida firewall over to the Cleveland network, and from there on to the New York network. Responses went back over the same channel. Granted, it was super slow (as expected), but it was working. After I added a few iptables redirect rules to the nat table, people started to rejoice that they were able to access their content again.
But what about their original problem?
Turns out, when you issued a ping from Florida to New York, the New York side would see it and respond to it, but the response would never reach the system in Florida. Why not? The wonderful ISP in question had screwed up some of their BGP rules, so none of the return traffic made it back. For some reason, during the 6 hours I spent on the phone with them, they didn't believe me that the problem was related to BGP - but that's another complaint for another day.
So, what's the moral of this story? Sometimes, the answer to the business problem is not the technically accurate answer - and a little creativity is all that's needed to sit back at the end of the day with a smile on your face knowing you solved a problem that originally seemed outside of your control.