Raw cyber attack traffic can present more questions than answers to security analysts. With large-scale observables in particular, it is difficult to identify which packets are relevant and which attack behaviors are present. Many existing works in host or flow clustering attempt to group similar behaviors to expedite analysis; these works often frame the problem directly as offline unsupervised machine learning. This work proposes online processing that simultaneously models coordinating actors and segments the traffic relevant to a target of interest, all while the traffic is being received. The goal is not just to aggregate similar attack behaviors, but to provide situational awareness by grouping potential coordinators and isolating an attack area of interest around a particular target. The clustering problem is recast as a supervised learning problem: received traffic is classified to the most likely attack model, and new attack models are introduced iteratively to explain traffic that the existing models do not. A novel graphical prior probability, based on the macroscopic attack structure, is defined to improve classification. Malicious traffic captures provided by the Cooperative Association for Internet Data Analysis (CAIDA) are used to demonstrate the accuracy of the online model generation and segmentation.
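The online "classify to the most likely model, otherwise introduce a new model" loop described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. The `AttackModel` class, the `assign_online` function, the smoothing constants, and the new-model threshold are all hypothetical, and a simple size-proportional prior stands in for the graphical prior the thesis defines. Each event is assumed to be a (source, target) pair.

```python
import math
from collections import Counter

# Hypothetical constants, chosen only for illustration.
ALPHA = 0.1              # additive smoothing weight
VOCAB = 100              # assumed number of distinct hosts
NEW_MODEL_SCORE = -8.0   # log-score an existing model must beat


class AttackModel:
    """Hypothetical attack model: counts of the sources and targets it has seen."""

    def __init__(self):
        self.sources = Counter()
        self.targets = Counter()
        self.size = 0

    def log_likelihood(self, src, dst):
        # Smoothed frequency of the source and of the target under this model.
        denom = self.size + ALPHA * VOCAB
        p_src = (self.sources[src] + ALPHA) / denom
        p_dst = (self.targets[dst] + ALPHA) / denom
        return math.log(p_src) + math.log(p_dst)

    def update(self, src, dst):
        self.sources[src] += 1
        self.targets[dst] += 1
        self.size += 1


def assign_online(events):
    """Assign each (src, dst) event, as it arrives, to the best-scoring model.

    The score is log prior + log likelihood. The prior here is simply each
    model's share of the traffic seen so far (a stand-in, not the graphical
    prior from the thesis). If no model beats NEW_MODEL_SCORE, a new model
    is created to explain the event.
    """
    models, labels, total = [], [], 0
    for src, dst in events:
        best, best_score = None, NEW_MODEL_SCORE
        for i, model in enumerate(models):
            score = math.log(model.size / total) + model.log_likelihood(src, dst)
            if score > best_score:
                best, best_score = i, score
        if best is None:
            models.append(AttackModel())
            best = len(models) - 1
        models[best].update(src, dst)
        labels.append(best)
        total += 1
    return labels, models


events = [("a", "t1"), ("a", "t1"), ("b", "t2"), ("a", "t1"), ("b", "t2")]
labels, models = assign_online(events)
print(labels)  # [0, 0, 1, 0, 1]
```

In this toy run, the third event matches nothing seen so far, so a second model is created. Later events are then classified to whichever of the two models explains them best.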

Publication Date


Document Type


Student Type


Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)


Shanchieh Jay Yang

Advisor/Committee Member

Andres Kwasinski

Advisor/Committee Member

Amlan Ganguly


Physical copy available through RIT's Wallace Library at: HV6773.15.C97 S77 2013


RIT – Main Campus

Plan Codes