Packet Capture and Analysis on MEDINA, A Massively Distributed Network Data Caching Platform
Abstract
Traffic capture and analysis is key to many domains, including network management, security and network forensics. Traditionally, it is performed by a dedicated device that accesses traffic at a specific point within the network, either through a link tap or through a mirroring port on a node. This approach is problematic because the dedicated device must be equipped with substantial computation and storage resources to store and analyze packets. Alternatively, to achieve scalability, analysis can be performed by a cluster of hosts. However, such a cluster is normally located remotely from the observation point, so a large volume of captured traffic must be moved across the network. To address this problem, this paper presents an algorithm that distributes the task of capturing, processing and storing packets traversing a network across multiple packet forwarding nodes (e.g., IP routers). Essentially, our solution allows individual nodes on the path of a flow to operate on subsets of the packets of that flow in a completely distributed and decentralized manner. The algorithm ensures that each packet is processed by n nodes, where n can be set to 1 to minimize overhead or to a higher value to achieve redundancy. Nodes create a distributed index that enables efficient retrieval of the packets they store (e.g., for forensics applications).
Finally, the basic principles of the presented solution can also be applied, with minimal changes, to the distributed execution of generic tasks on data flowing through a network of nodes with processing and storage capabilities. This has applications in fields ranging from Fog Computing to microservice architectures and the Internet of Things.
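The abstract does not spell out how nodes decide which packets to handle. As a purely illustrative sketch, and not the algorithm of the paper, the property that each packet is processed by exactly n nodes on its path without any coordination can be obtained with rendezvous (highest-random-weight) hashing. The sketch assumes that every node knows the identifiers of the nodes on the path and can derive the same key from invariant packet fields. The names `score`, `responsible_nodes` and `should_process`, the node identifiers and the key format are all hypothetical.

```python
import hashlib


def score(node_id: str, packet_key: bytes) -> int:
    """Pseudo-random weight of a (node, packet) pair; identical on every node."""
    digest = hashlib.sha256(node_id.encode() + b"|" + packet_key).digest()
    return int.from_bytes(digest[:8], "big")


def responsible_nodes(path, packet_key: bytes, n: int = 1):
    """Return the n nodes on the path with the highest weight for this packet."""
    ranked = sorted(path, key=lambda node: score(node, packet_key), reverse=True)
    return ranked[:n]


def should_process(node_id: str, path, packet_key: bytes, n: int = 1) -> bool:
    """Local decision taken by one node; no messages are exchanged."""
    return node_id in responsible_nodes(path, packet_key, n)


if __name__ == "__main__":
    # Hypothetical path of four routers and a hypothetical packet key
    # built from invariant header fields.
    path = ["r1", "r2", "r3", "r4"]
    key = b"10.0.0.1:1234->10.0.0.2:80#42"
    for node in path:
        print(node, should_process(node, path, key, n=2))
```

Because every node evaluates the same deterministic ranking, exactly n nodes (or all of them, if the path is shorter than n) answer positively for any given packet. Setting n to 1 gives the minimum-overhead case mentioned above, while larger values give redundancy. A real deployment would also have to handle path changes and unequal node capacities, which this sketch ignores.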


