## Software Networking

Vol: 2016    Issue: 1

Published In:   January 2018

### Synflood Spoof Source DDOS Attack Defence Based on Packet ID Anomaly Detection – PIDAD

Article No: 12    Page: 213-228    doi: https://doi.org/10.13052/jsn2445-9739.2016.012

 1 2 3 4 5 6 7 8 9 10 11 12

Synflood Spoof Source DDOS Attack Defence Based on Packet ID Anomaly Detection – PIDAD

Received 2 April 2016; Accepted 13 May 2016;
Publication 3 June 2016

Tran Manh Thang and Van K. Nguyen

Dept of Software Engineering, School of Information Technology and Communication, Hanoi University of Science and Technology, No. 1 Dai Co Viet, Hai Ba Trung, Hanoi, Vietnam

E-mail: thang197@gmail.com; vannk@soict.hust.edu.vn

## Abstract

A distributed denial-of-service (DDoS) attack characterized by flooding SYN packets is one of the network attacks to make the information system unavail- able. This kind of attack becomes dangerous and more difficult to prevent and defense when attackers try to send flood SYN packets with spoof source, especially, there packets have information fields as the normal SYN packets. In this study, we propose a method called Packet Identification Anomaly Detection – PIDAD used to defense type of DDoS attack mentioned above. This method based on abnormal information of identification field in IP Header when observing the set of packets received in the victim system.

• DDoS attacks
• DBSCAN

## 1 Introduction

Distributed Denial-of-Service attack (DDoS) is a type of network attack that attackers make use of a large number of compromised computers (botnet) to attack, which makes the applications, service or an information system unavailable. This type of attacking has become more and more dangerous and harder to prevent when the number of computers connected to the Internet, the weakness of information security and malwares get increasingly going up. In fact, there have been many researches on how to prevent and defense against this kind of attack; however, it is shown that indeed there have been no effective measures of defense and resulting in serious impact on agencies, organizations, and business during the recent time.

DDoS attack comprises of various forms of attacks, wherein, TCP SYN Flooding [1] is one of the attacking types that is of most difficulty to be prevented and defensed. With this kind of attack, hackers control computers in botnet through controlling server (C&C Server) to send flooding spoof packets which have information fields like normal packets to target server. Therefore, servers attacked cannot differentiate which connection initiate packet is real.

One of the most specific, typical forms is TCP SYN Flood attack, amongst hackers aim at a handshaking process with 3 steps of initiating a connection TCP (TCP transport protocol of the Internet). During this process, hackers create flooding SYN packets sent to a victim server which incomplete the process of 3 step handshake. Thus, server has to spend resources for connec- tions which are incomplete due to hackers’ sending on purpose. This leads to server’s running out of resources, incapable of meeting the requests of connecting for real connections.

To detect SYN spoof packets, we cannot check each SYN packet sent to server but we need to find out the relationship between them.

Basing on the principle of Idle scan method [26], which is used to check an open port on server basing on the characteristic of gradually increasing information field of Packet Identification – PID in the beginning part of IP packet (IP Header) when a packet is sent out of a computer, we see that it is possible to find out SYN spoof packets by searching for packets with information field of gradually increasing PID with different IP address source (hereinafter referred to as different IP address).

A matter here is that one computer can utilize many applications at the same time, conducting different procedures simultaneously, thus, packets with the same goals will be ones with interrupted gradually increasing PID field. To illustrate more clearly, supposing that a computer sending with 3 progresses A, B and C is at the same time communicating with 3 other computers. Then, if we overlook the very computer, PID stream is created continuously, but if observing PID stream sent from the progress A (or B, or C) to some receiving computer, it is obviously to increase interruptedly.

To detect spoof packets used in DDoS attack, we study the method of grossing packets with different source addresses with continuously increasing PID value into the same Cluster using the algorithm DBSCAN [3]. As of each Cluster, we will define the expected value (Expected PID – EPID) of PID value each Cluster. Thence, any new packet send to the victim server will be considered as a spoof packet if PID value of new packet is equal EPID value in any Cluster.

In this paper, we will use some English specialized terms considered to be common in this area. In our opinion, using English terms with basic definition, common in international papers, will help specialized readers be easier to follow than translating into Vietnamese. We would like to state some English terms as below:

• DDoS: Distributed Denial-of-Service attack.
• SYN: packet in TCP protocol stack used initially to establish connection.
• PID: the value of Packet Identification field in IP Hearder.
• Cluster: cluster/the nearby object in the sample space.
• Training phase: computer training phase.
• Detection phase: detecting phase of spoofing packet.
• Core poin: point/object inside cluster.
• Border point: point/object on the border.

### 1.1 Field Overview

As the overview research of DDoS attack [16], the solutions of defense DDoS attack are applied in two main directions: deployment location and point in time that defense takes place.

Applying method basing on deployment location is divided into 2 classes of defense: the first class is defense solutions to DDoS attacks occurring at the Transport layer of OSI model downwards (network/transport-level DDoS flooding attacks). The second class is defense solutions at application layer of OSI model. With the defense method basing on point in time that defense takes place location is divided into 3 phases: pre-attack phase, attack phase, and post-attack phase.

For the solution of defense against DDoS attack at network/transport-level is divided into categories: source-based, destination-based, network-based, and hybrid and for the defense solutions against application level is divided into categories: destination-based, and hybrid based on their deployment location.

In this study, we focus on studying the method of defensing DDoS attack type SYN flood spoofing source address. This is the solution applied for defensing against DDoS attack at network/transport-level mentioned above. We will analyze in detail some related researches as following.

Figure 1 Categorize the method of defensing DDoS attacks.

Method of IP Traceback mechanisms [1718] is the method of saving information of the root from source to victim in the Packet Identification information field of IP packets corresponding to each source IP address. Basing on this information about this route to detect the spoof packets. Spoof packets are packets of route information mismatched its source IP stored beforehand. However, this method has a big limit which requires mediate router to support the mechanism of marking the route, thus it is difficult to carry out in reality.

Method of Management Information Base (MIB) [1921] is the method of combining information existing in each packet and linear information to detect DDoS attack. When DDoS attack occur, this method compares the information in each packet of ICMP, TCP, UDP with the respective information which was analyzed and stored when DDoS attack had not occurred to find out the packets with abnormal information. Wang et al. [27, 28] proposed a method for detecting SYN flood attacks at leaf routers that connect end hosts to the Internet. Based on the observation that SYN and FIN packets form pairs in normal network traffic, they proposed using a nonparametric CUSUM method to accumulate the pairs.

Method of Packet marking and filtering mechanisms includes the fol- lowing researches: History-based IP filtering [22], this method stores the information of source IP addresses which frequently connect to the system when DDoS attack has not occurred. When DDoS attack occurs, IP addresses existed in the list stored will be connected to the system; Method of Hop-count filtering [23], this method informs of source IP addresses as corresponding hop-count information in each packet which will be stored when DDoS attack has not occurred. When DDoS attack occurred, the packets with source addresses stored beforehand while information about hop-count different from information stored beforehand relating with that source IP addresses will be considered as spoof; Method of Path Identifier (Pi) [24], this method stores the values which are regarded as identifying the route of each packet when traveling from source to victim corresponding to each source IP address. The packets with the same route to the victim will share the same route information. This information will be used to filter all spoof packets of the same route with a spoof packet detected beforehand.

Method of Packet dropping based on the level of congestion wherein Packetscore [25] is a typical research for this method. Packetscore sets up priority level for each packet basing on the algorithm of Detecting- Differentiating-Discarding routers (3D-R) using Bayesian-theoretic metric. A barrier will be set up to filter the packets with low priority level when DDoS attack occurs and bring about stuck at Routers.

## 2 Theoretical Basis

### 2.1 Packet Identification (PID)

This is Identification field (16 bits) in IP Header [2]. This field is used to determine each packet sent by a computer. By default, the value of PID field will be increased by 1 unit when the computer sends out a packet. This characteristic has been used in the method of Idle Scan to check one port opened in server [26].

### 2.2 TCP Handshake and TCP SYN Flooding Attack

Transmission Control Protocol (TCP) [4] is protocol used popularly in the Internet to create a trustful connection oriented between any two computers in the Internet. Wherein, TCP handshake is a mechanism used to initiate a new TCP connection. TCP Handshakes can be described briefly as following: In the process of initiate new TCP connection, first, client sends a packet with SYN flag to server to require initiating new connection. After receiving the requirement, server will response a packet with SYN-ACK flags to inform the readiness for connection and require initiating new connection to client. The last step, client also responses a packet with ACK flag to server informing its readiness. Because this protocol has been developed from early time (when the modern Internet has not formed, the potential risks of Internet attacks have been unknown), it can be said that this connection principle is a dangerous weakness of information security that attackers can make use of it to perform different kinds of attacks.

The weakness here is: when receiving requirement of initiating new connection from client, immediately server reserve the necessary resources for new connection. Taking advantage of this loose, attacker will try to send overwhelm spoof packets and never complete TCP handshakes progress so that the server gradually depletes its resources.

Spoof packets are created by randomly spoofing source IP address and other information in each spoof packet is completely the same as normal packet. This leads to: server cannot differentiate which connection packet which packet is the real or spoof one, so server will spend resources for such unreal connections. As the number of connection spoof packets is flooded sending, server has no longer had resources to serve real connections.

### 2.3 DBSCAN Algorithm

DBSCAN (Density-based spatial clustering of applications with noise) [29] is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). DBSCAN is one of the most common clustering algorithms and also most cited in scientific literature.

Figure 2 TCP handshakes.

The main idea of the algorithm is basing on bringing up core point term (core point, lying inside one cluster) – which has the number of neighbors reached standard level in a given neighboring radius – and the forming of clusters as spreading type based on the definition of connection or reachable destiny. In each cluster formed as DBSCAN, point objects are categorized into 2 types: core point (as the real object inside cluster) and border point (object lying on the border of cluster). We will present briefly summary of basic definitions and operative mechanism of DBSCAN algorithm as following.

#### 2.3.1 Definitions

Neighboring radius area ε of object p, symbol N ε(p), is a collection of object q under condition that the distance between p and q, symbolized as dist(p,q), is smaller than given radius ε >0:

$Nε(p)={ q∈D|dist(p,q)≤ε }(1)$

An object can be considered to have thick or thin neighboring density through the size of this collection N ε(p) We also define the term reachable density- based (directly density-reachable) from p to q: p regarded as (ε,m)-reachable directly with q if p ∈ N ε(q) and |N ε(q)|≥ m(m usually symbolized as minPts in the completed documents on DBSCAN). Generally speaking, this concept shows that p can be reachable to q thanks to locating nearby and there are many mediate points within the neighborhood of q. The relationship directly density-reachable can be on one side or two sides (or symmetry). If it is two sides (p reachable to q and vice verse), both two points are remarked as core point, but if only p is reachable to q, q then is remarked as core andpis border point.

Figure 3 DBSCAN algorithm.

In overall, object p is considered (ε,m)-reachable with q if there is string p1,p2,…,pn (p1 =q,pn=p) existing while pi+1 directly density-reachable to pi: pi+1∈N ε(pi)và | N ε(pi)|≥ m.

Object p connected as the density, symbolized as (ε,m)-connected, wit object q if there is object o existing while both p and q are density-reachable, i.e.,(ε,m)-reachable, from o.

A cluster satisfies the standard of density (ε,m)-dense if all of its objects are (ε,m)-connected with one another and any object (ε,m)-reachable from some object of cluster is also considered to belong to cluster.

#### 2.3.2 Algorithm

DBSCAN uses two parameters set up before which are ε and m = minPts, from which searching detecting (actually building and forming) clusters under the condition of density (ε,m)-dense. Algorithm of DBSCAN begins from any object and then searches every object which is (ε,m)-reachable from object p. If p is core point, this task will generate a cluster satisfying (ε,m)-dense.If p is border object, it will be impossible to search an object density-reachable from p, then DBSCAN scan the next object in the database. Due to using the sharing parameter ε and m for all of the Cluster, DBSCAN can combine 2 clusters into 1 if these two clusters are nearby. The distance between 2 collections of S1 and S2, symbolized as dist (S1, S2) is the value of the smallest distance of any 2 objects p, q insides: dist (S1, S2)=min{dist(p,q) | p S1, q S2}. Two collections of objects in the same cluster can be separate if their distance between them increases.

Figure 4 Directly density-reachable Relationship.

PIDAD method we suggested is based on the main idea detecting strings of packets with their PID value continuously increasing but source addresses are random, without coincidence. As we mentioned above (Introduction part) the packets created and sent from the same computer will have PID value increase continuously as short section; these continuously increasing sections make up the sections without coincidence and somehow interrupted. The reasons as stated are that in the same computer there can be many progresses having transactions connected to many other computers in the Internet. However, in TCP SYN flooding attack, attackers are supposed to create spoof SYN packets with spoof source IP address selected randomly. Therefore, we suggest a method to detect and filter packets, which have PID value continuously increasing short sections with different source IP addresses (due to random generation).

To realize this idea, we recommend using DBSCAN algorithm: each PID string with increasing short sections of different source IP addresses will be collected as a cluster as DBSCAN. With each cluster, we will determine EPID value of each cluster, i.e., PID value is predicted to be sent continually. The process of forming clusters and find out EPID in our method is called Training phase.

To detect the next spoof packets sent to the system, we will check PID value of each packet sent to see if there is coincidence with EPID of any Cluster (list of EPID collected through training phase); in case of finding PID value coinciding with any EPID value, this packet will be the next spoof packet sent to the system. In case of unable to find PID value which coincide with any EPID value, the packet sent can be real packet or the first spoof packet of a new Cluster (unformed in training phase).

We also suggest a technique using another restricted condition of time to build a new Cluster. Beside the phenomenon of PID continuously increasing short sections as mentioned above, normally the spoof packets sent from a minion computer in DDoS attack will be rather nearby and regular in terms of time (as it is necessary to create a large number of spoof packet without attention to responses from victim server). Thus, we suggest another condition of forming Cluster: in the time period of TCC=2*TPC there must be at least 2 coming packets sent and having full conditions to create new Cluster with the packet needed to be checked, then the new Cluster will be set up with the respective EPID value. Wherein TPC is the maximum time period of each Cluster needed to add a new member. If during the time period TCC it is under conditions to set up new Cluster to the packet being checked, that packet will be considered to be real packet. The process of detecting next spoof packet sent to the system in our method called Detection phase.

As above, we have described the basic mechanism of the suggested algorithm; we would like to demonstrate each phase of the algorithm in detail as following.

To have enough input sample for DBSCAN algorithm in training phase, first PIDAD method needs to collect Npc initial packets with different source IP addresses sent to the system. After having Npc packets, PIDAD will apply DBSCAN algorithm to find out Clusters which have packets with continuously increasing PID. To describe the characteristic of continuously increasing by 1 unit of each Cluster, DBSCAN algorithm chooses parameter ε (Eps) = 1 (the distance between 2 points in one Cluster). To describe the characteristic of the packets sent from the same computer the value MinPts = 3 (this value has the meanings of at least 3 packets with continuously increasing PID creating one Cluster). In case the value MinPts = 2 the estimate PID value of this Cluster miscollected into other Cluster will be of high potential, which reduces the accuracy of the method.

Figure 5 PIDAD algorithm in training phase.

Due to characteristic of directly density-reachable relationship ((ε,m)- reachable) of DBSCAN algorithm, whenever Cluster has new member (a new packet satisfied the conditions and belonging to that Cluster) the position of core point will be updated into the position of that new member (border point).

PIDAD uses a multi-dimension matrix MC to be able to store information of Clusters including: value of Cluster ID, EPID and the time Cluster add the last member (TLC). Using MPC to store the information of each packet sent to the system including PID, Cluster ID and SP indicates the state of whether the packets checked become member of Cluster or not. Supposed Pi the packet i (0 < i < Npc) which is checked of all conditions to bring into a Cluster, TTC is the time to check packet Pi. Algorithm and algorithm flowchart in training phase are following.

After setting up Clusters and EPID correspond from NPC packets in training phase, PIDAD will transfer to detection phase, to detect the next spoof packets sent to the system. Each next packet sent to the system can be spoof packet or real packet. Spoof packets are packets with PID coincident with any EPID value in MC. The real packets are the packets satisfying both two conditions at the same time: no PID value coinciding with any EPID value and under conditions to create one Cluster in a period of time TCC (after the time period TCC the packets being checked cannot collect the next two packets to make a new Cluster). TCC value set up is smaller than TCP timeout to allow PIDAD to determine the real packets before occurring TCP retransmission at Client side.

To check if a packet has enough conditions to make a new Cluster or not, PIDAD method uses a temporary Cluster (CT). Each packet sent to the system under the state of being checked will be assigned to be core point of CT, after a time period TCC, when CT collects 2 border points CT will be updated into new Cluster, or else CT will be deleted. MTC is used to store the information of CT, the information needed to be stored of CT as well as information of Cluster was set up in training phase. Supposed NPT as the number of packets in each CT.

In detection phase, to meet the ability of detecting spoof packets when having changes of attacking flow and minimizing the number of Clusters needed to be handled, PIDAD method should have updating method, deleting and adding new Clusters. Wherein, a Cluster is updated EPID information when that Cluster add another new member. A Cluster will be deleted if after the time TCC, that Cluster does not have new member added. A Cluster will be new created when after the time TCC the packets being checked collect the next 2 packets with full conditions to set up a new Cluster.

Supposed Pi the packet number i being checked, C(p) is Cluster with p as core point. Algorithm in detection phase is described as the following figure:

Figure 6 PIDAD algorithm in detection phase.

## 4 Experiment Evaluation

The experimental results in this study, we take the sample of packets with real time in some real systems when meeting DDoS attack with the above type of attack. To evaluate the effectiveness of the method more accurately, we have collected connection initial packets when the system work at normal mode and mixed randomly with the sample packets into experimental data. Experimental data are stored as file PCAP. Wherein the packets at normal functional mode and the sample packets during the attack will be marked to have basis of calculating the efficiency of the algorithm.

To collect SYN packets in file PCAP, we use the software Wireshack [14] to filter SYN packets in PCAP. To have information on the arrival time of the packets, and PID information from the file SYN packets collected, we use the software Tshark [15] to filter two information fields time stemp and identification of each packet.

After having information of arrival time and PID of files of packets and programming the algorithm on the software Matlab, the experimental results show that our method can detect most of spoof packets sent to the system when meeting DDoS attack. The specific results are following.

## 5 Future Works

In this research, we have studied and suggested a new method to detect spoof packets used in DDoS attack. The results show that our method has high rate of detecting spoof packets. However, in this research we still have to conduct manually some steps of solution in some parts such as: collecting initial packets with different IP in NPC; the storing information of the packets was confirmed and connected initial successfully. To be able to complete and apply the method in reality, we anticipate to doing research on using Bloom filter algorithm suggested by Bloom [3] in the future study, this algorithm was used in some methods of preventing, defensing from DDoS attack [57]. Specifically, we will continue to study using Bloom filter algorithm to determine the first SYN packets sent to the system, to store information of connection initiated and store EPID value of each Cluster.

Table 1 Experimental results of PIDAD

 Npc/Tpc 0.1 ms 0.2 ms 0.3 ms 0.4 ms 10000 86.2% 92.1% 84.1% 76.1% 20000 87.1% 91.1% 8.1% 77.1% 30000 88.1% 89.1% 94.1% 79.1% 40000 85.1% 95.1% 88.1% 80.1%

## 6 Conclusion

PIDAD method allows detecting and filtering connection initiate spoof packets used in DDoS attack typed flooding connection spoof packets. This method has advantages of allowing detecting spoof packets sent to the system without requiring Client side to resend connection initiate packets the second time. Basing on experimental results, it shows that this method can detect most of the spoof packets in DDoS attack. However, in this research, we just focus on the method of detecting spoof packets without having recommendation of overall solution to completely apply in reality. In future studies, we need to do research using Bloom filter algorithm to complete and improve the troubleshooting efficiency of the proposed method.

## References

[1] CERT. TCP SYN Flooding and IP Spoofing Attacks. Advisory CA-96.21, September 1996.

[3] Ester, M., Kriegel, H. P., Sander, J., and Xu, X. (1996). “A density- based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, 226–231.

[4] Postel, J. (1981). Transmission Control Protocol: DARPA internet program protocol specification, RFC 793.

[5] Abdelsayed, S., Glimsholt, D., Leckie, C., Ryan, S., and Shami, S. (2003). “An efficient filter for denial-of-service bandwidth attacks,” in IEEE Global Telecommunications Conference (GLOBECOM’03), Vol. 3, 1353–1357.

[6] Snoeren, A. C. (2001). “Hash-based IP traceback,” in Proceedings of the ACM SIGCOMM Conference, (Boston, MA: ACM Press), 3–14.

[7] Yaar, A., Perrig, A., and Song, D. (2003). “Pi: a path identification mechanism of defend against DDoS attacks,” in IEEE Symposium on Security and Privacy, 93.

[8] Yaar, A., Perrig, A., and Song, D. (2003). StackPi: New Packet Marking and Filtering Mechanisms for DDoS and IP Spoofing Defense, CMU- CS-02-208.

[9] Chen, W., and Yeung, D. Y. (2006). “Defending against TCP SYN flood- ing attacks under different types of IP spoofing,” in Fifth International Conference on Networking (ICN).

[10] Changhua, S., Jindou, F., Lei, S., and Bin, L. (2007). “A novel router- based scheme to mitigate SYN flooding DDoS attacks,” in IEEE INFOCOM (Poster), Anchorage, Alaska, USA, 2007.

[11] Chan, E., Chan, H., Chan, K., Chan, V., Chanson, S., et al. (2004). “IDR: an intrusion detection router for defending against distributed denial-of-service(DDoS) attacks,” in Proceedings of the 7th Interna- tional Symposium on Parallel Architectures, Algorithms and Networks 2004(ISPAN’04), 581–586.

[12] Wang, H., Jin, C., and Shin, K. G. (2007). Defense against spoofed IP traffic using hop-count filtering. IEEE/ACM Trans. Netw., 15, 40–53.

[13] Peng, T., Leckie, C., and Ramamohanarao, K. (2003). “Protection from distributed denial of service attacks using history-based IP filtering,” in ICC’03. Vol. 1, 482–486.

[14] https://www.wireshark.org/

[15] https://www.wireshark.org/docs/man-pages/tshark.html

[16] Taghavi Zargar, S., and Tipper, D. (2013). A survey of defense mecha- nisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Survey Tutorials, 15, 2046–2069.

[17] John, A., and Sivakumar, T. (2009). DDoS: survey of traceback methods. Int. J. Recent Trends Eng.1.

[18] Joao, B., Cabrera, D., et al. (2001). “Proactive detection of distributed denial of service attacks using MIB traffic variables a feasibility study,” in Integrated Network Management Proceedings, Seattle, WA, 609–622.

[19] Jalili, R., and ImaniMehr, F. (2005). Detection of Distributed Denial of Service Attacks Using Statistical Pre-Prossesor and Unsupervised Neural Network, ISPEC, 192–203. Berlin: Springer-Verlag.

[20] Li, M., Liu, J., and Long, D. (2004). Probability Principle of Reliable Approach to detect signs of DDOS Flood Attacks, PDCAT, 596–599. Berlin: Springer-Verlag.

[21] Wang, H., Jin, C., and Shin, K. G. (2007). Defense Against Spoofed IP Traffic Using Hop-Count Filtering, IEEE/ACM Trans. Netw. 15, 40–53.

[22] Yaar, A., Perrig, A., and Song, D. (2003). “Pi: A Path identification mechanism to defend against DDoS attacks,” in IEEE Symposium on Security and Privacy, pp. 93.

[23] Kim, Y., Lau, W. C., Chuah, M. C., and Chao H. J. (2006). PacketScore: a statistics-based packet filtering scheme against distributed denial-of- service attacks. IEEE Trans. Depend. Secure Comput. 3, 141–155.

[24] https://en.wikipedia.org/wiki/Idle_scan

[25] Wang, H., Zhang, D., and Shin, K. G. (2002). “Detecting SYN flooding attacks,” in Proceedings of Annual Joint Conference of the IEEE Com- puter and Communications Societies(INFOCOM), Vol. 3, 1530–1539.

[26] Wang, H., Zhang, D., and Shin, K. G. (2004). Change point monitoring for the detection of dos attack. IEEE Trans. Depend. Secure Comput. 1, 193–208.

[27] Ester, M., Kriegel, H. P., Sander J., and Xu, X. (1996) “A density- based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining.