TN-GTN: fault diagnosis of aircraft wiring network over edge computing

Fault diagnosis of the aircraft wiring network plays an important role in the intelligent manufacture of aircraft. Many studies focus on feature-based machine learning methods. However, these methods are ill-suited to data on heterogeneous graphs. Because the valid feature information is scattered, these methods ignore the relevant information between test nodes, which leads to low fault diagnosis accuracy. Taking advantage of 5G technology, which makes it possible to process large-scale graph data remotely, this work proposes a fault diagnosis method named “topological network-graph transformer network (TN-GTN).” TN-GTN improves fault diagnosis accuracy through feature enhancement and classification based on the topological information of heterogeneous graphs. The graph network learns new graph structures by identifying useful meta-paths and multi-hop connections between unconnected nodes on the original graphs. An artificial neural network then classifies the feature-enhanced test nodes into the final labels. The experimental results show that TN-GTN reduces the dependence on domain knowledge and achieves accurate fault classification on the aircraft wiring network.


Related work
studied at the theoretical level the applicability of learning Markov logic network weights in a knowledge base when data are lacking. After weight learning, the first-order logical rules are mapped onto a Markov network, which uses soft logic as its logical component and the Markov network as its statistical model. The method can simulate human reasoning and can also use domain knowledge to assist reasoning. To ensure accurate reasoning, each type of fault must be defined and analyzed, but the reasoning rules are hard to formulate because they rely on a large amount of domain knowledge. Another method analyzes and classifies complex data based on data mining [14] and the naive Bayes method [15]. This method also requires an enormous basis of formulated rules. Such methods are therefore difficult to apply.
As research has progressed, attention has turned to graph data embedding and classification as a new research direction. Studies based on graph-structured data have emerged, including node classification [16], link prediction [17] and graph classification [18]. In recent years, because of the expressive power of graph structures, graph analysis with machine learning methods has received more and more attention. The graph neural network (GNN) has recently become a widely used graph analysis method due to its superior performance and interpretability [19]. Other representative methods include the graph convolutional network (GCN) [20], GraphSAGE [21], the graph attention network (GAT) [22,23] and their variants [24].
However, these GNN-based methods can only be applied to homogeneous graphs [25], which have a single node type and a single edge type. In the manufacturing process, the graph data often contain multiple node types and edge types. Furthermore, each node type may be associated with attributes in a different feature space. Such heterogeneous graphs are also called heterogeneous information networks (HINs) [26]. A common way to tackle the heterogeneity of graphs is to manually design meta-paths [27], composite relation schemes that are widely used to represent node relationships in heterogeneous graphs [28]. HAN [29] and Metapath2vec [30] use manually defined meta-paths to learn graph representations and structures. Generating meta-paths by hand requires enormous labor, and the scope of application is limited by the dependence on work experience and domain knowledge. To remove this dependence, a graph transformer network (GTN) is adopted to model the graph and obtain node embeddings [31]. GTN learns to transform a heterogeneous input graph into useful meta-path graphs for each task and learns node representations on these graphs. GTN can also aggregate the representations of meaningful neighbors of a node through a multi-channel mechanism.
In this paper, the topological network-graph transformer network (TN-GTN) is proposed to diagnose faults of the aircraft wiring network by data classification. TN-GTN can generate new meta-paths based on the network topology information of the test equipment. TN-GTN can also aggregate the representations of meaningful neighbor nodes on the heterogeneous graph. The trained GTN adds new feature information to the test nodes, which are then called feature-enhanced test nodes. TN-GTN uses the feature-enhanced test nodes as new input and classifies the final labels with an artificial neural network.
The contributions of this paper are summarized as follows: (1) The AWTE (Aircraft Wiring Test Equipment) network is established to analyze the associations between test nodes. In the network, the nodes include test nodes, plugs on the aircraft wiring network and test equipment, while the edges represent the associations between nodes. (2) TN-GTN is employed to learn new graph structures on the basis of predefined meta-paths. The new graph structures involve useful meta-paths and multi-hop connections between unconnected nodes on the original graph. TN-GTN can be used to aggregate the information of different nodes and obtain the feature-enhanced test nodes. (3) Evaluation results show that TN-GTN outperforms the ANN approach in fault diagnosis of the aircraft wiring network.

TN-GTN method
In this section, the basic definitions and concepts of the TN-GTN method are introduced. In addition, the application background of the algorithm is illustrated by constructing the AWTE network. Meta-path generation, heterogeneous convolution and the final fault classification through TN-GTN are described in detail.
Similarly, $e_{ij} \in E$ and $f_e(e_{ij}) \in T_e$. When $|T_e| = 1$ and $|T_v| = 1$, it becomes a homogeneous graph. In this paper, the set condition is $|T_e| > 2$ and $|T_v| > 2$.
According to the AWTE schema model, all nodes are simplified into three categories:
(1) Equipment: test devices on AWTE.
(2) Plug: Plug 2 connectors on the aircraft wiring network.
(3) Test Node: the junction between Plug 1 connectors on the test equipment and Plug 2 connectors on the aircraft wiring network.
The test equipment is a multi-level tree structure network, with many Plug 1 connectors at the end. Each Plug 1 contains several pins, and the test equipment forms a complete loop by docking the pins on the test equipment end with the pins on the aircraft Plug 2 at the test node (Fig. 3).
The heterogeneous graph can be represented by a set of adjacency matrices $\{A_k\}_{k=1}^{K}$, where $K = |T_e|$. $A_k \in \mathbb{R}^{n \times n}$ is an adjacency matrix where $A_k[i, j]$ is nonzero when there is a $k$-th type edge from $j$ to $i$. More compactly, it can be written as a tensor $\mathbb{A} \in \mathbb{R}^{n \times n \times K}$. A feature matrix $X \in \mathbb{R}^{n \times d}$ means that a $d$-dimensional input feature is given to each node.
The AWTE heterogeneous graph contains four kinds of edges:
(1) The yellow adjacency matrix $A_{NE}$ represents the predefined meta-path $N \xrightarrow{NE} E$, which describes the connection between test nodes and test equipment.
(2) The red adjacency matrix $A_{EN}$ represents the predefined meta-path $E \xrightarrow{EN} N$, which describes the connection between test equipment and test nodes.
(3) The green adjacency matrix $A_{PN}$ represents the predefined meta-path $P \xrightarrow{PN} N$, which describes the connection between plugs and test nodes.
(4) The blue adjacency matrix $A_{NP}$ represents the predefined meta-path $N \xrightarrow{NP} P$, which describes the connection between test nodes and plugs.
When there is an edge of type $k$ from $j$ to $i$, the element $A_k[i, j]$ is nonzero. For example, if there is an edge between the test equipment with serial number 1 and the test node with serial number 1 in $A_{NE}$, the element $E_1N_1$ is nonzero. Otherwise, $E_1N_1$ is zero (Fig. 4).
$E_iN_j$ indicates whether there is a connection between the $i$-th test equipment and the $j$-th test node. $P_mN_j$ indicates whether there is a connection between the $m$-th aircraft plug connector and the $j$-th test node.
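The adjacency-matrix representation can be illustrated with a small sketch. The four-node toy graph below (two test nodes, one plug, one equipment node) and all identifiers are illustrative assumptions, not data from the paper; only the convention that an entry at row i, column j is nonzero for an edge from j to i follows the text.

```python
# Toy AWTE graph: node names and edges are hypothetical examples.
NODES = ["N1", "N2", "P1", "E1"]  # two test nodes, one plug, one equipment node
INDEX = {name: i for i, name in enumerate(NODES)}

# Directed edges (source, target), grouped by the four edge types.
EDGES = {
    "NE": [("N1", "E1"), ("N2", "E1")],
    "EN": [("E1", "N1"), ("E1", "N2")],
    "PN": [("P1", "N1"), ("P1", "N2")],
    "NP": [("N1", "P1"), ("N2", "P1")],
}


def adjacency(edge_list, n):
    """Return an n x n matrix A with A[i][j] = 1 for an edge from j to i."""
    A = [[0] * n for _ in range(n)]
    for src, dst in edge_list:
        A[INDEX[dst]][INDEX[src]] = 1
    return A


n = len(NODES)
# One adjacency matrix per edge type; together they play the role of the tensor.
A = {edge_type: adjacency(edges, n) for edge_type, edges in EDGES.items()}
```

Here `A["NE"][INDEX["E1"]][INDEX["N1"]]` is 1 because the toy graph has an NE edge from N1 to E1, while the transposed entry stays 0.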

Classification of key nodes with enhanced feature
Classification based on a data-driven approach requires as much valid feature information as possible, so it is necessary to enhance the features of the classified objects. An effective way is to collect useful information from relevant nodes. The graph neural network can efficiently update the embedding of a node by aggregating information on the graph. GTN can transform the heterogeneous graph into a new graph structure through meta-paths, which contain information about nodes and edges. The information about center edges can aggregate the feature information of nodes along different paths, using meta-paths to enhance features on the basis of relevant nodes. Useful test paths include Test Node-Plug-Test Node (NPN) and Test Node-Plug-Equipment (NPE). Therefore, a new model is needed to learn node representations and graph structures on the heterogeneous graph by means of convolution. The model can generate a new graph structure with softly selected edge types (Fig. 5).
At the data collection stage, test nodes only contain the originally collected feature information, such as the resistance measurement value and the insulation characteristic test result. After feature enhancement, the TN-GTN model adds a further feature dimension to the test nodes, which characterizes the strength of the correlation between test nodes.

AWTE feature enhancement
From the point of view of original data collection, test nodes are independent of each other. However, the test nodes are merged into one large heterogeneous graph through the AWTE topological network and the aircraft wiring network. The correlation between test nodes can be obtained by GTN by learning the location information, the connector type information and the topology network information of the test equipment.
The associations between test nodes can be defined as strong association, weak association and non-association:
(1) If test nodes are mounted on the same type of connector, or even the same connector, and these test nodes meet the principle of proximity, the association between them is defined as strong association.
(2) If test nodes meet only one of the two principles in (1), the association between them is defined as weak association.
(3) If test nodes meet neither of the principles in (1), the association between them is defined as non-association.
Useful meta-paths are relevant to predefined meta-paths between target nodes (nodes with classifying labels). TN-GTN can discover new graph structures and new relevant meta-paths between all types of nodes.
TN-GTN judges whether test nodes N i and N j belong to the same type of connectors or even the same connector and whether they meet the principle of proximity through valid meta-paths such as NEN, NPN and NPEPN.
$N_i$ and $N_j$ represent different test nodes. The score $f(N_i, N_j)$ differs according to the strength of the correlation between the nodes. After feature enhancement, the score is added to the test nodes as an additional feature.
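The three association levels can be made concrete with a hand-written rule. This is only an illustration of the definitions: in TN-GTN the association is learned through meta-paths rather than coded by hand, and the Euclidean distance test and the `proximity_threshold` parameter below are assumptions introduced for this sketch. The numeric levels 1, 0.5 and 0 are the scores the paper assigns to strong, weak and non-association.

```python
import math


def association_score(same_connector_type, pos_i, pos_j, proximity_threshold=1.0):
    """Return 1.0 (strong), 0.5 (weak) or 0.0 (non-association).

    same_connector_type: whether both test nodes sit on the same connector type.
    pos_i, pos_j: 3-D coordinates of the two test nodes (hypothetical inputs).
    proximity_threshold: assumed distance below which nodes count as "near".
    """
    near = math.dist(pos_i, pos_j) <= proximity_threshold
    if same_connector_type and near:
        return 1.0  # both principles hold: strong association
    if same_connector_type or near:
        return 0.5  # exactly one principle holds: weak association
    return 0.0      # neither principle holds: non-association
```

For example, two nodes on the same connector type that are 0.5 units apart score 1.0 under the default threshold, while the same pair 5 units apart scores 0.5.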
The core idea of TN-GTN is message passing between plugs, which carry the feature information of different connector types, and test equipment, which carries location information. TN-GTN is an effective way to aggregate useful information to enhance the features of test nodes. According to troubleshooting experience, test nodes with strong association are prone to certain specific types of faults.

Meta-path on AWTE
A meta-path [32], denoted by $P$, is a path on the heterogeneous graph $G$ that is connected with heterogeneous edges, i.e., $V_1 \xrightarrow{t_1} V_2 \xrightarrow{t_2} \cdots \xrightarrow{t_l} V_{l+1}$, where $t_k \in T_e$ denotes the $k$-th edge type of the meta-path. The meta-path defines a composite relation $R = t_1 \circ t_2 \circ \cdots \circ t_l$ between nodes $V_1$ and $V_{l+1}$, where $R_1 \circ R_2$ denotes the composition of relations $R_1$ and $R_2$. Given the composite relation $R$ or the sequence of edge types $(t_1, t_2, \ldots, t_l)$, the adjacency matrix $A_P$ of the meta-path $P$ is obtained by the multiplication of adjacency matrices as
$$A_P = A_{t_l} \cdots A_{t_2} A_{t_1}.$$
The notion of meta-path subsumes multi-hop connections, and new graph structures in the AWTE framework are represented by adjacency matrices. For example, the meta-path Test Node-Plug-Equipment (NPE), which can be represented as $N \xrightarrow{NP} P \xrightarrow{PE} E$, generates a new adjacency matrix $A_{NPE}$ by the multiplication of $A_{NP}$ and $A_{PE}$.
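As a worked illustration of composing adjacency matrices, the sketch below builds the two-hop meta-path Test Node-Plug-Test Node (NPN) on a toy graph. The node indices and edges are assumptions made for the example. Because an entry at row i, column j marks an edge from j to i, the edge type that is traversed first appears as the rightmost factor of the product.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Hypothetical toy graph: indices 0-2 are test nodes N1-N3, 3-4 are plugs P1-P2.
n = 5
A_NP = [[0] * n for _ in range(n)]  # edges test node -> plug
A_PN = [[0] * n for _ in range(n)]  # edges plug -> test node
for node, plug in [(0, 3), (1, 3), (2, 4)]:  # N1, N2 on P1; N3 on P2
    A_NP[plug][node] = 1
    A_PN[node][plug] = 1

# NPN meta-path: follow an NP edge first, then a PN edge.
A_NPN = matmul(A_PN, A_NP)
```

In this toy graph `A_NPN[1][0]` is 1 because N1 and N2 share plug P1, whereas `A_NPN[2][0]` is 0 because N1 and N3 sit on different plugs.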

Meta-path generation and heterogeneous convolution
The Metapath2vec approach utilizes manually defined meta-paths [30]. Path learning based on HAN [29] likewise obtains graph representations along predefined meta-paths. However, generating meta-paths by hand takes enormous labor.
TN-GTN generates different homogeneous subgraphs from the original heterogeneous graph and uses softmax to softly select the subgraphs $Q_1$ and $Q_2$ in Fig. 6. The matrix multiplication of $Q_1$ and $Q_2$ is used to generate a new meta-path. The Graph Transformer Layer (GTL) is mainly used to generate multiple meta-paths.
The generation of a new meta-path graph in the Graph Transformer (GT) layer in Fig. 6 contains two components. First, the GT layer softly selects two graph structures $Q_1$ and $Q_2$ from the candidate adjacency matrices $\mathbb{A}$. Second, it learns a new graph structure by the matrix multiplication of $Q_1$ and $Q_2$.
The soft adjacency matrix selection is a weighted sum of candidate adjacency matrices obtained by multi-channel 1 × 1 convolution with nonnegative weights from softmax. The new meta-path graph computes the convex combination of adjacency matrices as $\sum_{t_l \in T_e} \alpha^{(l)}_{t_l} A_{t_l}$, where $T_e$ denotes the set of edge types, $\alpha^{(l)}_{t_l}$ is the attention score for edge type $t_l$ at the $l$-th GT layer, and $A_{t_l}$ denotes the adjacency matrix of edge type $t_l$.
TN-GTN can learn an arbitrary meta-path with respect to edge types and path length. The adjacency matrix of arbitrary length l meta-paths can be calculated by Eq. 7 (Fig. 6).
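The soft selection and multiplication steps of a GT layer can be sketched as follows. This is a minimal single-channel illustration under stated assumptions: the weight vectors `w1` and `w2` stand in for trainable parameters, no training loop is shown, and the function names are invented for this example rather than taken from the paper.

```python
import math


def softmax(weights):
    """Numerically stable softmax over a list of raw weights."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def soft_select(candidates, weights):
    """Convex combination of candidate adjacency matrices (one per edge type)."""
    alpha = softmax(weights)
    n = len(candidates[0])
    return [[sum(a * A[i][j] for a, A in zip(alpha, candidates)) for j in range(n)]
            for i in range(n)]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gt_layer(candidates, w1, w2):
    """Softly select Q1 and Q2, then multiply them into a meta-path graph."""
    Q1 = soft_select(candidates, w1)
    Q2 = soft_select(candidates, w2)
    return matmul(Q1, Q2)
```

With equal weights every edge type contributes equally to the selected graph; as one weight grows, the softmax approaches a hard choice of that edge type, so the product approaches the adjacency matrix of a specific two-hop meta-path.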
Graph transformer networks (GTNs) learn to generate a set of new meta-path adjacency matrices A (l) by GT layers and perform graph convolution as in GCNs on new graph structures.
Firstly, different datasets are constructed through multi-channel sampling. Secondly, different meta-paths are generated with GT layers, and the output features are concatenated as the input of a graph convolutional network (GCN) [33]. Finally, the GCN extracts the graph representation and learns useful representations for node classification in an end-to-end fashion.
Let $H^{(l)}$ be the feature representation of the $l$-th layer in GCN. The forward propagation becomes Eq. 8:
$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right),$$
where $\tilde{A} = A + I \in \mathbb{R}^{N \times N}$ is the adjacency matrix $A$ of the graph $G$ with added self-connections, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and $W^{(l)}$ is a trainable weight matrix. It can easily be observed that the convolution operation across the graph is determined by the given graph structure and is not learnable except for the node-wise linear transform $H^{(l)} W^{(l)}$. The convolution layer can therefore be interpreted as the composition of a fixed convolution followed by an activation function $\sigma$ on the graph after a node-wise linear transformation.
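A single GCN propagation step with self-connections and symmetric degree normalization can be sketched in plain Python. The ReLU default activation and the function name are assumptions for illustration; the paper does not state which activation the GCN uses.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gcn_layer(A, H, W, activation=lambda z: max(0.0, z)):
    """One propagation step: activation(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(A)
    # Add self-connections: A_tilde = A + I.
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    # Degrees are row sums of A_tilde (at least 1 for nonnegative A).
    deg = [sum(row) for row in A_tilde]
    # Symmetric normalization of every entry.
    A_hat = [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    # Node-wise linear transform first, then the fixed graph convolution.
    Z = matmul(A_hat, matmul(H, W))
    return [[activation(z) for z in row] for row in Z]
```

Only `W` would be trained in such a layer; the normalized adjacency is fixed by the graph, which is why learning the graph structure itself, as the GT layers do, adds flexibility.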

Fault classification of aircraft wiring network based on TN-GTN
After the features of the AWTE network are enhanced by the GTN, an ANN model is used as a supervised learning model for data classification. The input graph is regarded as a heterogeneous graph $G = (V, E)$, where $E$ contains four kinds of edges, $T_e = \{NE, EN, PN, NP\}$, and $V$ contains three types of nodes, $T_v = \{X_E, X_N, X_P\}$. The feature matrix $X \in \mathbb{R}^{n \times d}$ means that a $d$-dimensional input feature is given to each node $V_i$. TN-GTN generates new meta-paths on the basis of the predefined meta-paths. Finally, the feature-enhanced test nodes $X_N \in \mathbb{R}^{n \times (d+m)}$, which carry $m$ additional feature dimensions, are used as new inputs to the ANN to complete the fault classification task. The number of input layer nodes is $n$, which is also the number of test nodes. The number of nodes in the output layer is 5, which represents the five final classifications: "test result Success, Fault1, Fault2, Fault3 and Fault4" (Fig. 7).
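The feature enhancement step amounts to appending the learned feature columns to the originally collected ones, which turns an n × d matrix into an n × (d+m) matrix. A minimal sketch, assuming the enhanced features are already available as one row per test node (the function name and sample values are hypothetical):

```python
def enhance_features(X_N, enhanced):
    """Append m enhanced feature columns to the d original columns of each node."""
    if len(X_N) != len(enhanced):
        raise ValueError("one row of enhanced features is required per test node")
    return [list(original) + list(extra) for original, extra in zip(X_N, enhanced)]


# Two test nodes with d = 2 original features and m = 1 association score.
X_N = [[0.8, 1.0], [12.5, 0.0]]
scores = [[1.0], [0.5]]
X_enhanced = enhance_features(X_N, scores)
```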
Common faults were divided into four types according to work experience, as follows:
Fault1: Wrong connection between the test equipment and test nodes. The ZIF connector A1 is wrongly connected or not connected to the test nodes, causing the continuity test to fail. When this failure occurs, the continuity test of all test nodes on the ZIF connector fails. This type of fault often occurs when there is a strong correlation between test nodes. The fault type is influenced by the resistance value of the test node and the secondary continuity test result (Fig. 8).
Fault2: Wrong connection to a similar connector. There are many similar but differently defined connectors on the aircraft, such as A1 and A2. The continuity test fails when an AWTE connector is matched to the wrong aircraft plug connector. When this failure occurs, the continuity test of all test nodes on the same type of connector fails. This type of fault often occurs when there is a strong correlation between test nodes. The fault type is not influenced by the resistance value of the test node and the secondary continuity test result (Fig. 9).

Fig. 7 Classification on test nodes with enhanced information
Fault3: Wrong connections among the aircraft plug connectors. Adjacent pins in the same connector are misplaced. When this failure occurs, the adjacent test nodes on the connector cannot pass the continuity test. This type of fault often occurs when there is a strong correlation between test nodes. The fault type is influenced by the resistance value of the test node and the secondary continuity test result. The test nodes must have poor insulation performance (Fig. 10).
Fault4: Wrong connection in the aircraft wiring network. When a single wire has broken points or wear, other test nodes in the wiring network are not affected. This type of fault often occurs when there is a weak correlation or no correlation between test nodes. The fault type is only influenced by the resistance value of the test node and the secondary continuity test result (Fig. 11).

Experimental results and discussion
In this section, large-scale experiments are designed to evaluate the effectiveness of TN-GTN for fault diagnosis on the aircraft wiring network.
The task to diagnose aircraft wiring network is to classify the fault types and achieve accurate fault diagnosis according to the feature of test nodes. This paper lists four fault types.
Precision, recall and F1-score are used to evaluate the classification performance of fault diagnosis. Finally, an experiment is designed to compare the classification results of ANN and TN-GTN.

Dataset and feature extraction
Heterogeneous graph datasets with multiple types of nodes and edges were used to evaluate the performance of TN-GTN. The dataset contained three types of nodes (Test Node (N), Plug (P) and Equipment (E)), four types of edges (NP, PN, NE and EN) and five types of test results (Success, Fault1, Fault2, Fault3 and Fault4) as labels. The main task was to classify nodes. The feature data of the Test Node (N) were obtained from the actual test. A complete loop was formed by the aircraft wiring network and the test equipment. A total of 13,892 test results were obtained in one run through a 2-h continuity test and a 5-h insulation test. After the test, all test nodes were checked to confirm the results. The results comprised Success (11,913), Fault1 (1,285), Fault2 (224), Fault3 (347) and Fault4 (123). Each record included the resistance value obtained in the continuity test, the insulation characteristics of the test node and the result of the secondary continuity test. This type of experiment was performed five times.
The feature data of the Plug (P) were objective attributes that were not generated during the testing process. There were a total of 2016 plugs on the aircraft wiring network. The feature data of the Plug included the type of the plug and its three-dimensional position in Cartesian coordinates. The data were given as one-hot encoding representations of plots.
The feature data of the Equipment (E) were objective attributes that were not generated during the testing process. There were a total of 234 nodes on the AWTE network after simplification. The feature data of the Equipment (E) included the branch information, the type of test box and the manufacturing information of the test equipment. The data were given as one-hot encoding representations of plots (Table 3).
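One-hot encoding of a categorical attribute such as the plug type can be sketched as below. The category names are hypothetical placeholders, since the paper does not list the actual plug or test-box types.

```python
def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


# Hypothetical plug types; the real category list is not given in the paper.
PLUG_TYPES = ["type_A", "type_B", "type_C"]
encoded = one_hot("type_B", PLUG_TYPES)
```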
The topological network of the three node types was connected by four types of predefined meta-paths: $N \xrightarrow{NE} E$, $E \xrightarrow{EN} N$, $P \xrightarrow{PN} N$ and $N \xrightarrow{NP} P$. The trained GTN added new feature information to the test nodes, which were then called feature-enhanced test nodes. The enhanced feature is the test node score $f(N_i, N_j)$, which reflects the strength of the correlation between nodes: a score of 0 means non-association, 0.5 means weak association and 1 means strong association (see Section "AWTE feature enhancement", Eq. 2).
Table 4 summarizes, with domain knowledge, the predefined meta-paths and the meta-paths learnt by TN-GTN. The predefined meta-paths were obtained on the basis of the correlation between the test equipment and the aircraft wiring network.

Evaluation indicators
Precision, recall and F1-score were selected as evaluation indicators. TN-GTN greatly improved the fault classification accuracy on the aircraft wiring network, and these indicators show its performance in classifying test nodes on heterogeneous graph datasets. To evaluate the classification performance for each class separately, the multi-class task is reduced to a binary classification task that distinguishes the class itself from all other classes. Precision, recall and F1-score are widely used to evaluate the performance of classification algorithms, especially on datasets with an imbalanced class distribution.
The indicators are computed as $\mathrm{Precision} = \frac{TP}{TP + FP}$, $\mathrm{Recall} = \frac{TP}{TP + FN}$ and $F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$, where TP is the number of cases where the predicted type of fault is positive and the actual type of fault is positive; TN is the number of cases where the predicted type of fault is negative and the actual type of fault is negative; FP is the number of cases where the predicted type of fault is positive and the actual type of fault is negative; and FN is the number of cases where the predicted type of fault is negative and the actual type of fault is positive. Precision is the proportion of correctly predicted faults of a certain type among all predicted faults of that type; recall is the proportion of correctly predicted faults of a certain type among all actual faults of that type; and F1-score is the harmonic mean of precision and recall, an evaluation index that considers both.
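The one-vs-rest evaluation described above can be sketched as a small function that counts TP, FP and FN for a chosen class. Returning 0.0 when a denominator is zero is a convention assumed here, not something stated in the paper, and the sample labels are invented.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented labels for illustration only.
y_true = ["Fault1", "Fault1", "Success", "Fault2"]
y_pred = ["Fault1", "Success", "Fault1", "Fault2"]
fault1_scores = per_class_metrics(y_true, y_pred, "Fault1")
```

In this invented sample Fault1 has one true positive, one false positive and one false negative, so precision, recall and F1 are all 0.5.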

Discussion with evaluation of experimental results
The aircraft wiring network test dataset was divided into two parts: 20% of the 13,892 samples were used as the test data, and the remaining 80% were used as the training data. The training data were randomly divided into 4 groups to test ANN and TN-GTN. Table 5 lists the precision, recall and F1-score of the four types of fault classification under different training conditions.
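The split can be sketched as follows. The random seed, the rounding of the 20% test share down to an integer and the round-robin assignment of the shuffled training indices to groups are assumptions; the paper does not specify how the split was implemented.

```python
import random


def split_dataset(n_samples, test_fraction=0.2, n_groups=4, seed=0):
    """Shuffle sample indices, hold out a test set, split the rest into groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)  # rounded down (assumption)
    test, train = indices[:n_test], indices[n_test:]
    # Round-robin assignment gives groups whose sizes differ by at most one.
    groups = [train[g::n_groups] for g in range(n_groups)]
    return test, groups


test_idx, train_groups = split_dataset(13892)
```

With 13,892 samples this yields 2,778 test samples and four training groups of 2,778 or 2,779 samples each.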
The ANN model contained one hidden layer of 100 nodes, with tanh as the activation function in the first layer and sigmoid in the second layer. The number of nodes in the output layer was 5, which represented the five final classifications: "test result Success, Fault1, Fault2, Fault3 and Fault4." The learning rate of the ANN model was set to 0.4. The final stage of TN-GTN was also an ANN model with the same hidden layer parameters, so that the comparison experiments were consistent.
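The forward pass of such a network can be sketched in plain Python. The weight initialization range, the random seed, the example feature vector and the use of the feature dimension as the input size are assumptions for illustration; training with the 0.4 learning rate is not shown.

```python
import math
import random

LABELS = ["Success", "Fault1", "Fault2", "Fault3", "Fault4"]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (initialization is an assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, hidden_layer, output_layer):
    """tanh hidden layer followed by a sigmoid output layer."""
    W1, b1 = hidden_layer
    W2, b2 = output_layer
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
            for row, b in zip(W2, b2)]


rng = random.Random(0)
n_features = 4  # hypothetical size of one feature-enhanced input vector
hidden_layer = init_layer(n_features, 100, rng)
output_layer = init_layer(100, len(LABELS), rng)
scores = forward([0.2, 1.0, 0.0, 0.5], hidden_layer, output_layer)
predicted = LABELS[scores.index(max(scores))]
```

The predicted label is the class with the largest output score; with untrained random weights the prediction itself carries no meaning.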
The original data, such as resistance value of test nodes, continuity test results and leakage test results, were used for training ANN model. Graph data contained more information like topological network information, location, relevant information between test nodes and connector types. The graph data were used for training TN-GTN model.
When the training set was scarce, TN-GTN improved precision, recall and F1-score for all fault types compared with ANN. In particular, the classification of Fault2, Fault3 and Fault4, which relied on the enhanced features obtained from TN-GTN, improved significantly, with a precision improvement above 0.1. Classification performance on Fault1 was relatively good, and precision reached above 0.9. The features of Fault1 were more obvious, so high classification precision could be obtained with enough training data. The experiments showed that TN-GTN used more relevant feature information than ANN. It rarely misdiagnosed the current type of fault as another type, so the recall score was also relatively high. As the training data increased, the precision of ANN for Fault1 and Fault3 exceeded 0.9, the precision for Fault2 was basically unchanged, and the classification performance of TN-GTN for Fault2 improved significantly. ANN's judgment of Fault4 depended heavily on a large amount of training data: when the number of samples was sufficient (80% training), the accuracy reached 0.8. TN-GTN was more accurate for Fault4 classification regardless of sample sufficiency (Fig. 12).
The experimental results showed that both ANN and TN-GTN classified Fault1 well as the training data increased. The recall score of TN-GTN for Fault1 was relatively high, which showed that TN-GTN could effectively reduce the misjudgment of other types of faults as Fault1. The performance of TN-GTN on Fault2 was much better than that of ANN; especially when the training data were insufficient, TN-GTN could improve the classification accuracy by aggregating the information of surrounding test nodes. Wrong connections among the aircraft plug connectors were more likely to appear in test nodes with strong association. TN-GTN aggregated information from other node types in the heterogeneous graph, which made it more sensitive in identifying such faults. Compared with pseudo-faults such as Fault1 and Fault2, which appear on the AWTE side, real faults such as Fault3 and Fault4, which appear on the aircraft wiring network side, were the key points in troubleshooting. TN-GTN could transform the topological network information on the aircraft wiring network side through