MPLS Lab 024 Troubleshoot Internet Access via Global Routing Table Scenario 2

Image requirements:
VIRL: IOSv 15.7
EVE-NG: Cisco vIOS Router vios-15.6
GNS3: vios-adventerprisek9-m.vmdk.SPA.156-2.T



Description:
This is the second troubleshooting scenario for Lab 021. The LAN networks of both CE sites are unable to access the Internet. The problem might be caused by recent changes made by the network technicians to increase the capacity of the connection to the ISP, but this is only a preliminary assumption; further investigation is required.



Topology:


Download Lab: VIRL | EVE-NG | GNS3




Scenario:
Users in the customer's networks 192.168.10.0/24 and 192.168.20.0/24 notified the network support team that they are unable to perform their jobs due to a lack of communication with Internet resources. As the lead MPLS expert, you have been assigned to troubleshoot this connectivity problem.



Instruction:
1. Download the lab, open it in the network simulation software of your choice, access both CE nodes, and verify the problem by using the "ping 8.8.8.8 source lo1" command (a sample verification is shown after this list).
2. Select a suitable troubleshooting approach to identify and fix the issues faster.
3. Apply the necessary configuration to the topology.
4. Verify that the applied settings resolved the problem.
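
A minimal verification sketch for step 1, assuming the CE LAN network is represented by a Loopback1 interface on each CE router; the source address shown is illustrative and may differ in your copy of the lab:

CE1# ping 8.8.8.8 source Loopback1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
Packet sent with a source address of 192.168.10.1
.....
Success rate is 0 percent (0/5)

Once the configuration is corrected in step 3, the same ping from both CE routers should report a success rate of 100 percent, which is the verification asked for in step 4.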



The solution to the problem:
If you are unable to fix the problem on your own, click the button below to learn the source of the trouble and how to correct it.
Solution



Summary:
Troubleshooting can be a daunting process, sometimes confusing because of the complexity and virtualization of the network infrastructure. When you approach a connectivity problem in a topology like the one in this lab, always think of the situation as end-to-end strings between source and destination points, each string representing a data path over which information flows. For data to be exchanged in this topology we need four data paths, because two sites must be virtually connected to the Internet destination and each of those connections has to be bidirectional. Once you have established those paths, ask yourself what can cause one, several, or all of them to break. To do that, imagine these virtual paths hanging above the underlying infrastructure that supports them and moves the data in the first place. With this picture in mind, correlate all the technologies used in the topology with the OSI model, skipping the presentation and session layers, and treat the virtual data paths as the application layer.
Enumerate the technologies used in this network infrastructure and, as you do so, correlate them with the layers below, starting from the very bottom, the physical layer. What can go wrong here? For instance, the routers themselves: are they powered up? The cables: are any of them damaged or loose in a way that prevents electrical or optical signals from being transmitted?
The next layer is the data link layer; all devices in this topology use Ethernet for Layer 2 frame exchange. What can go wrong here? For example, check for a duplex mismatch and for input and output errors; if a trunk link is used, make sure the encapsulation is not configured differently on each side of the link; and verify that MAC addresses are not duplicated.
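
A few generic IOS commands cover these Layer 2 checks; the interface name below is illustrative, and the trunk command applies to switches rather than the routers used in this lab:

show interfaces GigabitEthernet0/1     (duplex setting, input/output errors, CRC counters)
show interfaces trunk                  (trunk encapsulation and allowed VLANs, on switches)
show arp                               (IP-to-MAC bindings; look for unexpected or duplicate MAC addresses)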
Yes, finally you are at the network layer, the most important and interesting one for any network engineer. What can go wrong here? Actually a lot, and what is fascinating about the IP layer is that it is the layer where most of the virtualization occurs. What does that mean from a troubleshooting point of view? At first it is a good idea to verify that the routers' interfaces are configured with the correct IPv4 or IPv6 addresses, and then that communication works properly between two nodes on the same link, but at the end of the day you have to start thinking about the network layer outside of the box and apply a creative approach when it comes to IP connectivity. I will clarify my thoughts on the troubleshooting aspects further on, but for now I want to explain what I mean when I speak of virtualization at the network layer.
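
A minimal sketch of those basic network layer checks; the neighbor address below is an assumption used only for illustration:

show ip interface brief                (interface IPv4 addresses and up/up status)
show ipv6 interface brief              (the same check for IPv6, if it is used)
ping 10.0.12.2                         (reachability of the directly connected neighbor on the same link)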
Virtualization at the IP layer: what is it? As I understand it, and simply from my perspective, it is the ability of one IP protocol to use the properties of another protocol to carry its data; in other words, encapsulation that occurs within the same layer. You can also say that it is the process of placing one or more headers in front of the original header of the packet to ensure that the logical path the packet has to travel is guaranteed. For example, in the case of GRE, the original IP packet is placed behind a GRE header, and a new outer IP (delivery) header is then placed in front of the GRE header; this is how private networks are able to exchange data between them over a public infrastructure like the Internet. In the case of MPLS, it is simply a matter of labels being squeezed in between the data link header and the network header.
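
A minimal GRE tunnel sketch to illustrate the encapsulation order described above; the addresses and interface names are assumptions for illustration, not the configuration used in this lab:

interface Tunnel0
 ip address 172.16.0.1 255.255.255.252
 tunnel source Loopback0
 tunnel destination 2.2.2.2

Traffic routed into Tunnel0 keeps its original IP header, gains a GRE header in front of it, and is then wrapped in a new outer IP header built from the tunnel source and destination addresses.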
Back to troubleshooting the connectivity of the virtual path: how can you approach each of the networking technologies here? If you count them, there are about half a dozen, but in my experience the best thing to do is to stack them on top of each other in order of dependency; then what to do while troubleshooting becomes much clearer. What do I mean by the order of dependency? Simply that, for example, OSPF relies on successful IPv4 connectivity between nodes, and IPv4 relies on proper Layer 2 addressing, so if you have identical MAC addresses on both sides of a link, IPv4 will obviously fail, which in turn will affect the IGP. Now imagine that the path the GRE tunnel operates over lies on top of this routing infrastructure: as you can see, a simple Layer 2 addressing issue between nodes somewhere in the path can bring down communication between the networks on both sides of the GRE tunnel. Additionally, what if the requirements for the private traffic state that all communication between the LAN networks has to be encrypted? Then IPsec comes into the game of data transfer, and how do you place this encryption technology in the stacking model? Does IPsec go on top of GRE or the other way around? The answer is that it depends on how you configure IPsec. If you apply it under the tunnel interface, GRE gets encapsulated inside the IPsec header; but if you configure IPsec between the interfaces facing the LAN networks, the IPsec packets are encapsulated inside GRE. I call this systematic approach the troubleshooting dependency model.
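
A rough sketch of the first option, protecting the GRE tunnel itself; the transform set and profile names are assumptions, and the IKE policy and pre-shared keys are omitted for brevity:

crypto ipsec transform-set TS esp-aes esp-sha-hmac
!
crypto ipsec profile PROTECT-GRE
 set transform-set TS
!
interface Tunnel0
 tunnel protection ipsec profile PROTECT-GRE

With this placement the GRE packets are encapsulated inside IPsec. In the second option the LAN-to-LAN traffic would be encrypted before it enters the tunnel, so the IPsec packets ride inside the GRE encapsulation, and that layer consequently sits in a different place in the dependency model.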
Finally, let's build this kind of model for the topology I described in the section above. The most important thing to understand is that the layers of the troubleshooting dependency model are nameless and floating: you can have as many layers as you need, which depends on the number of networking technologies used in the topology, or even on your imagination. You can have sub-layers as well; for example, within the OSPF layer, authentication could be another layer on which the OSPF function of exchanging prefixes depends. The model can also delineate objects within a single OSI layer or even cross the boundaries into other OSI layers. Additionally, the uniqueness of this creative approach is that the model is not tied to a particular type or structure of networking technologies; it works with all types of routed and routing protocols, virtualization, cloud computing, and even network-based applications and SDN. It all depends on the abilities and expertise of the IT professionals involved with the particular subject matter. For example, a CCNP-level network engineer can build a model from the knowledge obtained through years of experience or from the particular curriculum used during the study process, but that does not necessarily mean the model will include information about network automation, whereas a network engineer with knowledge of Python and its libraries will include that kind of information in the troubleshooting dependency model.
To present the model to a person with less knowledge of a particular subject, it is a good idea to accompany it with a logical diagram that shows what the model tries to interpret. Finally, when creating the model, always put the end-to-end data path between the two communicating IP endpoints at the top layer.

Logical diagram:

Troubleshooting dependency model for the topology described in the summary text:
Layers       Technologies
Top Level    End-to-End Logical Data Path
Level 6      GRE tunnel configuration
Level 5      IPsec configuration (can be further divided into its own layers due to its complexity)
Level 4      Router-to-router IPv4 connectivity for the GRE tunnel (Loopback0 interfaces)
Level 3      IGP: OSPF
Level 2      IPv4 connectivity between nodes
Level 1      OSI L2 addressing: ARP/MAC


As you can see from the table above, the model starts at the OSI data link layer and then manifests almost entirely within the OSI network layer. Now, how can you use this simple two-column table to reduce the burden of the troubleshooting effort? Let's say that a user in the LAN subnet on one side of the GRE tunnel calls you and reports that she is not able to access resources on a particular server which, as you know, is logically located on the opposite side of the GRE tunnel. By looking at the table you can determine that the first thing on the list that assures end-to-end data exchange between the user and the server she is trying to access is the implementation of the GRE tunnel, so you SSH to the router and verify that the GRE tunnel is operational. This means that all underlying protocols are working properly and you do not need to spend your time troubleshooting those technologies. It also happens that you possess baseline documentation showing output from the show ip dhcp snooping command on the switch where the user is connected. Over the phone you instruct her to open a cmd terminal and obtain the MAC address of her desktop; after you confirm the MAC address, you determine the IP address of the desktop and attempt to ping it, but the ping fails. You notify the user that there is a problem related to her desktop and that you are now trying to solve it. You contact the network team responsible for the switches and find out that the team performed a network upgrade last night, which might be causing the problem; after the switching team finishes troubleshooting, you receive an email saying that the port where the user's desktop is connected had been accidentally shut down and that the problem has now been solved.
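
A short sketch of the checks described in this walkthrough; the interface number and the desktop address are assumptions for illustration:

show interfaces Tunnel0                (tunnel reported up/up, so the layers below it can be skipped)
show ip dhcp snooping binding          (MAC-to-IP-to-port bindings recorded on the access switch)
ping 192.168.10.55                     (reachability test toward the user's desktop, which fails here)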
In conclusion, by using the troubleshooting dependency model you have managed to save yourself time and, in addition, did not cause further incidental damage to the production network. By the way, the network team responsible for the switching infrastructure has its own troubleshooting model for the switching technologies.
