Modern aircraft are capable of recording hundreds of parameters during flight. This fact not only facilitates the investigation of an accident or a serious incident, but also provides the opportunity to use the recorded data to predict future aircraft behavior. It is believed that, by analyzing the recorded data, one can identify precursors to hazardous behavior and develop procedures to mitigate the problems before they actually occur. Because of the enormous amount of data collected during each flight, it becomes necessary to identify the segments of data that contain useful information. The objective is to distinguish between typical data points, that are present in the majority of flights, and unusual data points that can be only found in a few flights. The distinction between typical and unusual data points is achieved by using classification procedures. In this dissertation, the application of classification procedures to flight data is investigated. It is proposed to use a Bayesian classifier that tries to identify the flight from which a particular data point came. If the flight from which the data point came is identified with a high level of confidence, then the conclusion that the data point is unusual within the investigated flights can be made. The Bayesian classifier uses the overall and conditional probability density functions together with a priori probabilities to make a decision. Estimating probability density functions is a difficult task in multiple dimensions. Because many of the recorded signals (features) are redundant or highly correlated or are very similar in every flight, feature selection techniques are applied to identify those signals that contain the most discriminatory power. In the limited amount of data available to this research, twenty five features were identified as the set exhibiting the best discriminatory power. Additionally, the number of signals is reduced by applying feature generation techniques to similar signals. To make the approach applicable in practice, when many flights are considered, a very efficient and fast sequential data clustering algorithm is proposed. The order in which the samples are presented to the algorithm is fixed according to the probability density function value. Accuracy and reduction level are controlled using two scalar parameters: a distance threshold value and a maximum compactness factor. Ph. D.
The different versions of the original document can be found in:
Are you one of the authors of this document?