Diagnostics plays a crucial role in the health assessment of satellites, spacecraft, and UAVs. Detecting, localizing, and identifying faults are essential to rescuing a malfunctioning agent’s mission. Furthermore, successful prognosis relies on awareness of a system’s faults and the availability of relevant information. As a result, fault diagnosis for these agents has garnered significant interest and spurred extensive research. The predominant strategies in the current literature for fault diagnosis in aerospace applications, with a specific focus on satellites and UAVs, are data-driven and model-based techniques. Although their fusion and alternative methods are occasionally mentioned in published works, these approaches have not received the same rigorous scrutiny as the two predominant strategies. This section lists and reviews the applications of data-driven, model-based, hybrid, and other methods to the diagnostics of single and multiple satellites, spacecraft, and UAVs in the current literature. To provide a detailed yet concise overview of each paper’s strengths and shortcomings,
Table 2 presents the advantages and disadvantages of each study on single-agent fault diagnostics.
Table 3 offers a similar analysis for a team of aerial and space agents. In order to provide a broad yet comprehensive overview of the innovations featured in the studies reviewed in this research, each study has been scrutinized and an extensive list of their characteristics is presented in
Table 5 and
Table 7 for single-agent and multiagent fault diagnosis, respectively.
Table 5 is structured to highlight the exact architectures, health monitoring modes, considered fault types and variables, and applications of each study on single-agent systems, thereby facilitating the process of deriving insights from each paper. In addition,
Table 5 serves as a basis for comparing the methodologies utilized in this field. Furthermore,
Table 7 offers an overview of the attributes in research related to fault diagnosis in multisatellite/UAV systems.
4.1. Data-Driven Methods
Data-driven methods have become one of the most prevalent approaches to diagnostics in satellites and UAVs, especially in recent years. The literature on diagnostics has used these methods extensively to capitalize on their ability to assess the health status of systems without rigid mathematical models. Deep learning approaches, together with other supervised and unsupervised approaches, constitute most of the body of work in this area. Their ability to accommodate applications where autonomous diagnosis is required or where manually defined features are not available has made data-driven techniques a popular choice for diagnostics. This section reviews how the existing literature has employed data-driven techniques for fault diagnostics.
Talebi et al. [
27] proposed a robust data-driven methodology for fault diagnosis of actuators and sensors onboard a broad range of nonlinear systems. Their methodology incorporates two recurrent neural network-based observers for Fault Detection, Isolation, and Identification (FDII) and operates based on the studied nonlinear system’s state-space model. More specifically, detection, isolation, and identification are carried out simultaneously once the diagnosis scheme is alerted: fault-sensitive thresholds indicate where the fault has formed, and its intensity is then obtained. The proposed methodology is also robust to unmodeled dynamics, unknown faults, noise, and uncertainties. The methodology was tested in a case study on magnetorquer actuators and magnetometer sensors in the Attitude Control System (ACS) of Low Earth Orbit (LEO) satellites, and the test results verified its suitability for real-life situations. While actuator fault diagnosis is of paramount significance for increasing the reliability of control systems, sensor FDII also affects an agent’s lifetime. Sensor faults could emerge due to external disturbances, aging, and power fluctuations [
6]. The work of Talebi et al. addressed the reliability requirements for actuators and sensors.
Rahimi et al. [
80] advanced the idea of utilizing data-driven approaches for Fault Detection and Isolation (FDI) using an ensemble of Adaboost decision tree, Adaboost Random Forest (RF), MultiLayer Perceptron (MLP), and K-Nearest Neighbors (KNN) ML algorithms applied to a four-wheel configuration of RWs. Principal Component Analysis (PCA) was also incorporated into the scheme to perform dimension reduction. Their method in [
80] fell short of the performance expected of an FDI system, with the highest classifier precision being 58.78%. Rahimi et al. claimed that using more complex datasets for running the algorithms and utilizing optimization methods for hyperparameter tuning could lead to more accurate results. Expanding the scheme to address different malfunction intensities, symmetrical RW arrangements, and computation overload would be the next step in this scheme’s progression.
Varvani Farahani et al. [
21] discussed data-driven fault isolation for locating multiple concurrently existing malfunctions in a four-CMG pyramid assembly. The proposed data-driven scheme consists of a feature extraction method, a feature reduction method, and a classification algorithm, which, through testing, are chosen to be Correlation Analysis, PCA, and an optimized Support Vector Machine (SVM), respectively. A sensitivity analysis was carried out to demonstrate the proposed method’s robustness against added noise signals, unavailable sensors, and skipped measurements. The results of the sensitivity analysis displayed suitable error margins for practical implementations. The simulations exhibit 100% and 99% accuracies for healthy and single-malfunctioning-CMG assemblies, respectively, and 97.8%, 91.4%, and 77% accuracies which correspond to two, three, and four in-phase malfunctions coexisting in a system. The problem of diagnosing simultaneous faults is best countered using data-driven methods. As Jiang and Khorasani have demonstrated in [
45], in systems where concurrent faults can occur, e.g., RW assemblies, model-based methods may struggle to diagnose all of the faults, depending on the system and on whether more than two faults are present.
Rahimi et al. [
32] explored detecting and isolating in-phase malfunctions using automated data-driven methods applied to three-orthogonal, standard four-wheel, and pyramid RW configurations. Their proposed approach preprocesses the time series input data by automatically extracting features from the temporal, spectral, and statistical domains and then passing them through a feature reduction stage. Subsequently, several classification methods were considered: Gradient Boosting, RF, Decision Tree, and MLP. Gradient Boosting and RF performed the best among the methods; however, both performed poorly under complex fault scenarios. The FDI methodology fared relatively better for the three-orthogonal case than for the standard four-wheel and pyramid arrangements, especially due to the symmetry in the pyramid assembly. Improving the accuracy of FDI for more complicated malfunction scenarios would be the next step in developing a more robust FDI scheme. Moreover, Rahimi et al. mentioned that for RW configurations in which each RW contributes torque about more than one axis (e.g., pyramid and standard four-wheel), fault complexities (including inception and duration) could have a detrimental effect on the FDI scheme’s accuracy.
Vaz Carneiro et al. [
81] formulated a data-driven approach to fault detection in RW assemblies employing supervised and unsupervised learning techniques, including Gaussian Mixture Model (GMM), SVM, Artificial Neural Networks, and Long Short-Term Memory (LSTM) Neural Networks. The learners receive a combination of RW attitude, temperature, control torques, and power consumption to detect the occurrence of faults. However, temperature proved to be the most sensitive variable to faults. The models are trained with the simulated datasets provided by the Basilisk simulation framework [
82], a novel approach to training data-driven models for satellite fault diagnosis applications. Producing and using large amounts of synthetic data could be conducive to achieving adeptly trained models due to not having a limitation on the amount of data produced and knowing the exact details of the occurring faults beforehand as opposed to using historical data for training models [
81]. Alternatively, to avoid overfitting, amply sized synthetic and historical datasets could also be created using generative models. Datasets acquired using generative AI could become a promising way to alleviate the problems associated with the training phase of data-driven methods.
Suo et al. [
83] developed a feature selection algorithm based on the fuzzy Bayesian risk theory along with a heuristic forward greedy algorithm for fault diagnosis of power systems of in-orbit satellites. The proposed algorithm works with an SVM for classifying faulty or nominal states. While the proposed feature selection method is observed to obtain more accurate results than other prevalent methods, it is computationally intensive, and overcoming this shortcoming could be a favorable approach to progressing this research.
Cui et al. [
17] provided a solution to overcome the problem of faulty dataset scarcity for training data-driven satellite fault detection models. The solution incorporates redistributing the healthy and faulty datasets by oversampling the faulty samples using Dynamic Time Warping (DTW). Their proposed fault detection scheme is then completed by employing the fast DTW method for assessing correlations between samples and a KNN model for detecting malfunctions. Cui et al. tested the methodology, and DTW oversampling significantly increased the fault detection accuracy. The proposed method by Cui et al. could also help mitigate the training difficulties of data-driven methods for applications where several distinct fault types could be present in a system. Data augmentation is a technique that has been widely employed in the field of ML to make classification models more robust to diverse test cases. Data augmentation is especially common in the field of computer vision. To demonstrate how data augmentation works, an example is provided in
Figure 7. Let us assume that we want to train a data-driven model to be able to discern dog images among other classes. If our dataset only consists of images like the original image in
Figure 7, our dataset would be scarce and the data-driven model would be prone to overfitting. For example, the data-driven model through the training process might come to the conclusion that dogs are only dogs if the ear-shaped features are towards the top of the image or the paw-shaped features are toward the bottom of the image. To us humans, this conclusion appears clearly erroneous; however, this must be clarified to data-driven models by curating a comprehensive dataset or using a more robust architecture. In this case, a myriad of image modifications are imposed on the original dataset to make the dataset more diverse as shown in
Figure 7. This process is referred to as data augmentation, and it is applicable to more than just images. For example, in the work of Cui et al. [
17], DTW oversampling was used, which in effect operates as a deterministic way of finding an in-between time series between two original time series to make the training dataset more diverse. Therefore, data augmentation is also a viable way of enhancing data-driven methods’ performance aside from feeding them synthetically generated datasets.
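The idea of generating an in-between time series from two originals can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact oversampling procedure of Cui et al. [17]: the function names `dtw_path` and `dtw_synthetic`, the absolute-difference local cost, and the blending `weight` are all hypothetical choices made for the example.

```python
def dtw_path(a, b):
    """Return (total_cost, path) aligning sequences a and b with classic DTW."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def dtw_synthetic(a, b, weight=0.5):
    """Blend a with the samples of b that DTW aligns to each index of a."""
    _, path = dtw_path(a, b)
    aligned = [[] for _ in a]
    for i, j in path:
        aligned[i].append(b[j])
    return [
        (1 - weight) * a[i] + weight * sum(vals) / len(vals)
        for i, vals in enumerate(aligned)
    ]


if __name__ == "__main__":
    fault_a = [0.0, 0.2, 1.0, 2.1, 1.0, 0.1]
    fault_b = [0.1, 1.1, 1.9, 0.9, 0.0, 0.0]
    # A new faulty sample lying "between" the two recorded ones.
    print(dtw_synthetic(fault_a, fault_b))
```

Because the alignment is computed before averaging, the synthetic series keeps the shape of the fault signature even when the two originals are shifted in time, which a plain point-by-point average would blur.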
Hedayati et al. [
84] proposed a data-generative AI model based on the Wasserstein Generative Adversarial Network (WGAN) architecture to balance scarce satellite RW datasets for use in training data-driven models. The utilized WGAN model comprises 1D convolutional layers within its generator and critic (discriminator). First, to train the WGAN model, multiple datasets for each fault type are curated. Each dataset contains very few data points compared with the nominal RW dataset to reflect a real-life scenario. Then, identical instances of the proposed WGAN model are trained on each faulty-class RW dataset. In this way, the overall training dataset becomes balanced. The final datasets augmented by the framework of Hedayati et al. were evaluated both quantitatively, using an LSTM architecture, and qualitatively. The LSTM model was trained separately on the WGAN-augmented, naturally balanced, imbalanced, artificially duplicated (inflated), and DTW-oversampled datasets and obtained accuracy rates of 84.17%, 93.25%, 48.35%, 97.51%, and 89.32%, respectively, when tested on a natural test dataset. The underperformance of the WGAN-augmented dataset compared with the other augmentation methods was attributed to the WGAN adding too much diversity to the faulty datasets, with some resultant classes resembling each other. Qualitatively, however, it was observed that the WGAN added immense diversity and variability to the base dataset while retaining its most defining patterns, whereas the other data augmentation methods added little to no diversity to the base dataset.
Pan et al. [
85] used a novel data-driven method to detect faults in satellite power systems. Their presented method consists of forming correlations between time series datasets of several sensor measurements employing association rules between sensor datasets and incorporating them into a Kernel Principal Component Analysis (KPCA) model for detection. Association rule learning is especially effective since the sensor relationships change when an anomaly occurs. The novelty of the study of Pan et al. warrants the need for it to be examined further, but scant work has been carried out on this approach and application.
Ganesan et al. [
28] presented a method employing a One Dimensional (1D)-CNN to address the fault detection problem in the power systems of satellites. The input data are preprocessed using the Stockwell transform before being fed into the CNN. When tested using the Advanced Diagnostics and Prognostics Testbed [
86], the univariate detection methodology achieved an accuracy of 96.7%. Furthermore, Ganesan et al. claimed that the methodology could be extended to multivariate detection, localization, and classification of malfunctions in the power systems.
Muthusamy and Kumar [
10] presented a data-driven methodology for FDI in CMGs. Their methodology relies on the following elements for fault diagnosis: a predictive data-driven model reducing the dependency of the method on a priori operational data by 93.75%, a Chebyshev neural network for detecting malfunctions in CMGs, and an optimization-based scheme using a genetic algorithm for locating faults. The input data are limited to attitude rate measurements to accommodate satellite measurement limitations, but more comprehensive input data could be incorporated into the method. Simulated results showed a 93.25% accuracy rate in isolating different types of faults. Muthusamy and Kumar’s proposed method is suitable for applications where sensor redundancy poses problems or when historical flight records are unavailable.
Nozari et al. [
24] proposed a mixed-learning approach to FDI of a tetrahedral RW assembly and performed a comparative study of the mixed and individual models. Nozari et al. incorporated the RF, SVM, partial least square, and Naïve Bayes (NB) algorithms into their mixed-learning strategy. The proposed model consists of training several local classifiers whose outputs are used to train a meta-level classifier. The proposed mixed strategy mostly outperformed its constituent learners under noisy, noiseless, and mixed circumstances. Additionally, training using mixed noisy and noiseless datasets yielded more accurate results. However, one of the main shortcomings of mixed-learning strategies is that some mediocre learners within the scheme could negatively affect the overall mixed-learning strategy’s performance [
32].
The research conducted by Abdelghafar et al. [
29] presented a data-driven predictive method employing an Extreme Learning Machine (ELM) for detecting faults by analyzing satellite telemetry data. The proposed methodology of Abdelghafar et al. first predicts the nominal operating data of the satellite. Then, it applies a static confidence interval to obtain a range containing the values corresponding to a normal operating state. ELMs are especially useful for time-sensitive applications such as satellite system fault diagnosis due to their learning speed and generalizability. They also require less expert involvement [
87]. The gray wolf optimization algorithm is also utilized to enhance the biases and input weights of the learner. Testing using the NASA shuttle valve benchmark dataset [
88] achieved 98.5% to 99.6% accuracy in detecting deviations from nominal operating states, outperforming both unoptimized ELM and SVM.
A fault detection method for satellite telemetry data is presented by Xie et al. [
33]. The detection method employs a graph neural network that considers the correlations between extracted features. Based on the cyclical operation schedule of satellites, malfunction thresholds are dynamically evaluated, and a fault is flagged whenever a threshold is exceeded. The proposed method demonstrated an accuracy of 98.30% for fault detection during testing on a satellite power system telemetry dataset. Unlike most ML methods, graph neural networks do not require the assumption of independent features, since the nodes in graph-like data are interconnected and depend on one another [
89].
Luo et al. [
34] developed a ResNeXt-based slice residual attention network for the health monitoring of a CMG onboard spacecraft. The proposed data-driven method utilizes random slicing and an attention mechanism stage. As part of the data preprocessing phase of the proposed method, the short-time Fourier transform (STFT) is employed to generate spectrogram images to boost the diagnosis capability. The model identifies the CMG’s fault states but falls short in accurately measuring each sub-component’s fault severity. Luo et al. expressed the prospects of converting the approach of distinct fault classification into a regression model for more rigorous fault diagnosis or using reinforcement learning to address more fault types.
The research carried out by Zhao et al. [
35] presented a CNN-based spacecraft and mobile robot CMG fault diagnosis scheme that comprises two attention-enhanced convolutional blocks. The input data are preprocessed into time–frequency spectrums using the STFT to enhance the fault diagnostic scheme. The scheme successfully classifies the overall CMG fault states; however, it fails to quantify each subcomponent’s state of failure accurately. A demonstration of time–frequency transformation and the final images that get fed into the data-driven models is provided in
Figure 8.
Liu et al. [
90] developed two data-driven models to identify fault scenarios in a CMG onboard spacecraft. Both models employ the K-means algorithm to carry out diagnostics. However, one of the models utilizes the PCA for feature extraction, whereas the other uses the t-distribution random neighborhood embedding method based on the t-distributed Stochastic Neighbor Embedding (t-SNE) technique. Simulation results indicate the latter model provides significantly better performance. Liu et al. also indicated three ways of approaching the fault diagnosis of CMGs, namely, from the perspective of the system, assembly, and component. Their investigation into CMGs was conducted from a component-centric standpoint, considering both the physical and digital variables associated with CMGs for diagnostic purposes.
Jado and Moncayo [
25] proposed a data-driven framework utilizing a multiple-model adaptive estimation (MMAE) scheme composed of multiple autoencoders to perform fault diagnosis on spacecraft. Each autoencoder is trained on one operational mode of the spacecraft, and a Bayesian probability framework is used to gauge the probability of each model correctly representing the real system. The proposed methodology was tested numerically and experimentally and was shown to diagnose faults resembling the training dataset in a timely manner. However, because each autoencoder is trained on a single operational mode, in-between modes and relatively unknown failures could cause uncertainties in the model’s performance, even though it is capable of detection, isolation, and identification to some extent. Their model could be bolstered by training more autoencoders to account for new failure modes.
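The Bayesian weighting of competing models can be illustrated with a short sketch. It assumes, purely for illustration, that each model’s reconstruction residual is scored with a zero-mean Gaussian likelihood of fixed standard deviation `sigma`; the function name `update_mode_probabilities` and the residual values are hypothetical and are not taken from [25].

```python
import math


def update_mode_probabilities(priors, residuals, sigma=1.0):
    """One Bayesian step: posterior_i is proportional to prior_i * N(residual_i; 0, sigma^2)."""
    likelihoods = [
        math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        for r in residuals
    ]
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every model explains the sample equally badly; keep the prior.
        return list(priors)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Three hypothetical modes: nominal, fault A, fault B.
    probs = [1 / 3, 1 / 3, 1 / 3]
    # Residual of each mode's model at successive time steps (made-up numbers).
    residual_history = [
        [0.9, 0.2, 1.5],
        [1.1, 0.1, 1.4],
        [1.0, 0.3, 1.6],
    ]
    for residuals in residual_history:
        probs = update_mode_probabilities(probs, residuals, sigma=0.5)
    print(probs)  # the mode with consistently small residuals dominates
```

The sketch also makes the limitation noted above visible: if the true condition lies between trained modes, no residual is small and the probabilities may remain spread across several models.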
Xiao and Yin [
37] introduced a data-driven approach for FDI in satellite thrusters. They transformed the input data from the thrusters into image representations and established distinct fault categories based on these images. Subsequently, a CNN was employed for binary image classification. Their testing results revealed accuracy rates ranging from 98.30% to 98.71%, varying according to the thrusters’ axis directions.
Sadhu et al. [
3] developed a data-driven approach to detect and identify malfunctions in UAVs using sensor data from the Inertial Measurement Unit (IMU). Their proposed model combines a convolutional and bidirectional LSTM autoencoder for fault detection and a fault identification system based on CNN and LSTM. The identification component is activated only upon detecting a malfunction. During testing with empirical data, the detection module achieved an accuracy rate exceeding 90%, while the identifier module reached 85%. Sadhu et al. suggested that future work should involve applying their techniques to diverse UAV fleets and evaluating their method’s performance in such scenarios.
An LSTM-based multivariate regressor using residual filtering is proposed by Wang et al. [
73] for fault detection and recovery of UAVs. The proposed data-driven model automatically derives spatial and temporal correlations as features and utilizes a filter to cancel out the effect of random noise. Fault detection is accomplished by applying thresholds to the estimated parameters, and recovery is achieved by using the model’s regressive nature to reconstruct the detected faulty data. The model’s accuracy rate in detecting faults was determined by tests to be 99% and 93% for the gyroscope sensor’s bias and drift faults, respectively, which outperforms the least-squares SVM and the LSTM model without residual filtering. Wang et al. also indicated that the possible future steps in developing this model are considering different fault magnitudes and flight modes’ interference and addressing the online monitoring requirements by applying the model to an embedded monitoring unit.
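The threshold-based detection and regression-based recovery described above can be sketched as follows. The moving-average filter here is only a stand-in assumption for the residual filter of Wang et al. [73], and `predicted` would in practice come from the trained regressor; all names and numbers are hypothetical.

```python
def moving_average(values, window):
    """Causal moving average; the first samples use a shorter window."""
    out = []
    running = 0.0
    for k, v in enumerate(values):
        running += v
        if k >= window:
            running -= values[k - window]
        out.append(running / min(k + 1, window))
    return out


def detect_faults(measured, predicted, threshold, window=3):
    """Flag samples whose filtered absolute residual exceeds the threshold."""
    residuals = [abs(m - p) for m, p in zip(measured, predicted)]
    smoothed = moving_average(residuals, window)
    return [r > threshold for r in smoothed]


def recover(measured, predicted, flags):
    """Replace flagged samples with the model's prediction."""
    return [p if f else m for m, p, f in zip(measured, predicted, flags)]


if __name__ == "__main__":
    predicted = [0.0] * 9
    # One isolated noise spike followed by a persistent bias fault.
    measured = [0.0, 0.0, 5.0, 0.0, 0.0, 3.0, 3.0, 3.0, 3.0]
    flags = detect_faults(measured, predicted, threshold=2.0, window=3)
    print(flags)
    print(recover(measured, predicted, flags))
```

Filtering the residual before thresholding trades a short detection delay for fewer false alarms from isolated noise spikes.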
Du et al. [
26] proposed a data-driven approach to the mechanical fault diagnosis of damaged and cracked UAV rotors. The proposed methodology employs a 1D CNN to estimate faults using the rotors’ surface vibration acceleration signal as input during hovering and rising. Du et al. found that the preprocessing of input data significantly affects the model’s performance in identifying small faults. Consequently, they used interval sampling instead of sequential sampling to reconstruct the vibration signals and obtain better results in diagnosing minor faults. With the proposed sampling method, the developed model’s accuracy during the hovering and rising phases reached about 100% and 98%, respectively, outperforming the results obtained with sequential sampling. The model’s accuracy was also insensitive to various loading conditions.
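The difference between the two sampling schemes can be sketched as below. This is one plausible reading of "interval sampling", in which each training segment is built from samples spaced evenly across the whole record; the exact scheme in [26] may differ, and both function names are hypothetical.

```python
def sequential_segments(signal, segment_len):
    """Cut the signal into consecutive, non-overlapping segments."""
    count = len(signal) // segment_len
    return [signal[k * segment_len:(k + 1) * segment_len] for k in range(count)]


def interval_segments(signal, segment_len):
    """Build each segment from every `count`-th sample, so it spans the whole record."""
    count = len(signal) // segment_len
    return [signal[k:count * segment_len:count] for k in range(count)]


if __name__ == "__main__":
    vibration = list(range(12))  # placeholder for an acceleration record
    print(sequential_segments(vibration, 4))
    print(interval_segments(vibration, 4))
```

Under this reading, both schemes yield the same number of equally long segments, but every interval-sampled segment covers the full duration of the record instead of a short contiguous window.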
Taimoor et al. [
91] proposed an adaptive radial basis function (ARBF) approach, based on Lyapunov function theory and the sliding-mode concept, to tune the weight parameters of neural network-based observers. The Neural Network (NN) observers are then utilized for fault estimation of sensors onboard quadrotor UAVs. The ARBF method was compared with the conventional radial basis function (CRBF), adaptive multilayer perceptron (AMLP), conventional multilayer perceptron (CMLP), and Extended State Observer (ESO) methods and was shown to be more accurate and efficient than all of them.
Park et al. [
22] developed a fault detection scheme based on unsupervised learning for diagnosing cyber-attacks and physical damage in the control systems of UAVs. This approach was motivated by the shortcomings of supervised learning-based methods, which require labeled data and fail to detect unexpected faults. The proposed model consists of a stacked autoencoder and uses thresholds for detecting faults. The input features comprise UAV coordinates, attitude, IMU and other sensor data, and control inputs to the actuators. The next steps in this research could be adding more features or incorporating sequential flight features, sequential autoencoders, or RNNs for enhanced detection performance. The presented model performed better in detecting Global Positioning System (GPS) spoofing attacks, denial-of-service (DoS) attacks, and rudder failure than in detecting elevator, aileron, and engine failures.
Li et al. [
23] developed a novel Siamese hybrid neural network (SHNN) scheme based on Few-Shot Learning (FSL) for fault diagnosis of fixed-wing UAVs. FSL is an ML approach specializing in learning from a limited dataset for mainly supervised learning-based applications and exhibits high generalizability [
92]. The FSL strategies’ advantages have been leveraged to compensate for the lack of sufficient faulty state data for training the data-driven models, considering that the available faulty data for fixed-wing UAVs are scarce. Li et al. carried out comprehensive experimental testing to validate the proposed scheme under the following conditions: different training data sizes, different flights, and unknown faults. The tests also demonstrate that the SHNN framework performs better than SVM and 1D CNN models for fewer training sample sizes. The future scope of this research includes utilizing attention mechanisms and alleviating the computational burden of the algorithm.
The work by Huang et al. [
31] proposes an ensemble data-driven model for actuator fault diagnosis of UAVs, specifically the ailerons. The proposed ensemble method uses a weighted combination of three different hybrid models. In this context, an ensemble method takes the final diagnosis outputs of different models and makes a weighted decision based on each individual model’s output, which is akin to a parallel structure of individual models, whereas a hybrid data-driven model combines different architectures within one holistic model that outputs a single diagnosis. Weighted ensemble architectures could be used to keep any single lopsided model from dominating and to increase the generalizability of the model. In a hybrid model, on the other hand, each segment is usually responsible for one duty, e.g., feature extraction. Huang et al. used an overarching ensemble of three hybrid models: (1) a CNN-bidirectional LSTM (CNN-BiLSTM), (2) a CNN-bidirectional Gated Recurrent Unit (CNN-BiGRU), and (3) a CNN-bidirectional Gated Recurrent Unit-LSTM (CNN-BiGRU-LSTM). The CNN that is shared across the three models is responsible for feature extraction, whereas the subsequent structures are responsible for further processing and diagnostics. A combination of random search and grid search is used to pinpoint the optimal weights of the ensemble method. When tested against different state-of-the-art models for this application, the proposed ensemble method mostly outperformed them at the cost of increased computational intensity.
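A weighted ensemble decision and a search for its weights can be sketched as follows. This is a simplified illustration that uses only a coarse grid search (the random-search stage mentioned above is omitted), and the class probabilities stand in for the outputs of the three hybrid models; all names and numbers are hypothetical and not taken from [31].

```python
from itertools import product


def ensemble_predict(prob_sets, weights):
    """Weighted average of per-model class probabilities for each sample."""
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    preds = []
    for s in range(n_samples):
        scores = [
            sum(w * probs[s][c] for w, probs in zip(weights, prob_sets))
            for c in range(n_classes)
        ]
        preds.append(scores.index(max(scores)))
    return preds


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def grid_search_weights(prob_sets, labels, steps=5):
    """Try every weight combination on a coarse grid whose entries sum to 1."""
    grid = [k / (steps - 1) for k in range(steps)]
    best_w, best_acc = None, -1.0
    for w in product(grid, repeat=len(prob_sets)):
        if abs(sum(w) - 1.0) > 1e-9:
            continue
        acc = accuracy(ensemble_predict(prob_sets, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


if __name__ == "__main__":
    # Validation-set probabilities (healthy, faulty) from three placeholder models.
    model_a = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    model_b = [[0.4, 0.6], [0.6, 0.4], [0.7, 0.3]]
    model_c = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
    labels = [0, 1, 0]
    print(grid_search_weights([model_a, model_b, model_c], labels))
```

Because each model contributes only through its weighted output, the individual models can be trained and evaluated in parallel, which matches the parallel structure described above.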
Li et al. [
75] proposed a transformer and local interpretable model-agnostic explanations (LIME) combination to perform fault diagnosis on a fixed-wing UAV’s elevator. The LIME strategy was used to increase the interpretability of the transformer model by locally approximating the output of the proposed transformer and examining the effect of different features in the final diagnosis. The LIME algorithm also gives insights about how to simplify the transformer further by specifying the features that do not contribute extensively to the diagnosis and could be dropped. Furthermore, Li et al. implemented a loss function to mitigate the problem of faulty categories’ imbalance. The proposed methodology displayed high accuracy in diagnosing faults and efficient convergences owing to the utilized loss function.
Huang and Ferguson [
93] conducted work on satellite RW fault detection by proposing a simple One-Class Linear Regression (OC-LR) algorithm. The proposed OC-LR framework trains only on nominal RW operational data, which are curated using a simple polynomial representation of the relationship between the RW motor current and its angular velocity and acceleration. This simplified approach to monitoring the RW health state enables the algorithm to perform without the need for a rigorous mathematical model. Then, three detection metrics are used as follows: if a faulty data point falls within one, two, and three standard deviations of the training dataset’s mean, there is a 68%, 95%, and 99.7% chance that it would be detected as a faulty data point, respectively. Through simulated and real-life tests, it was found that depending on how obvious the emerged fault is, any of the three detection metrics could surpass the other two, with their accuracy rates ranging from 76.7% to 99.1%. Huang and Ferguson’s proposed algorithm is relatively simple, readily applicable, and not demanding, especially when dynamic models and faulty operational datasets are not available. However, this approach carries a relatively higher risk of false positives and negatives, which could be catastrophic in a space system. Moreover, solely performing anomaly detection while only outputting the anomaly’s degree of deviation from nominal operation may not be enough for a fault-tolerant subsystem to save the whole satellite system, especially considering that different faults with different dynamics, which require dissimilar remedial measures, might exhibit the same levels of deviation. One possible avenue for enhancing this methodology is to implement a voting strategy between the different detection metrics.
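A one-class regression detector of this kind can be sketched as follows. The sketch assumes a first-order polynomial, current ≈ c0 + c1·ω + c2·α, fitted to nominal data only, and it flags a sample when its residual lies more than k standard deviations from the nominal residual mean. Both the polynomial form and this k-sigma rule are illustrative assumptions and may differ from the exact model and detection metrics of [93]; all function names are hypothetical.

```python
import statistics


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes A is square and non-singular.
    """
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_nominal(omega, alpha, current):
    """Least-squares fit of current ~ c0 + c1*omega + c2*alpha on nominal data."""
    rows = [[1.0, w, a] for w, a in zip(omega, alpha)]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * y for r, y in zip(rows, current)) for i in range(3)]
    coef = solve(AtA, Atb)
    residuals = [
        y - sum(c * x for c, x in zip(coef, r)) for r, y in zip(rows, current)
    ]
    return coef, statistics.mean(residuals), statistics.pstdev(residuals)


def is_anomalous(coef, mean, std, w, a, i, k=3.0):
    """Flag a sample whose residual is more than k nominal standard deviations off."""
    residual = i - (coef[0] + coef[1] * w + coef[2] * a)
    return abs(residual - mean) > k * std


if __name__ == "__main__":
    # Synthetic nominal data with a small deterministic ripple standing in for noise.
    omega = [float(n) for n in range(10)]
    alpha = [float(n % 3 - 1) for n in range(10)]
    current = [
        0.5 + 0.1 * w + 0.2 * a + 0.01 * (-1) ** n
        for n, (w, a) in enumerate(zip(omega, alpha))
    ]
    coef, mean, std = fit_nominal(omega, alpha, current)
    for k in (1.0, 2.0, 3.0):
        print(k, is_anomalous(coef, mean, std, 5.0, 0.0, 1.02, k=k))
```

Running the detector with several values of k on the same sample shows why the choice of metric matters: a tight band catches subtle faults but raises more false alarms, whereas a wide band does the opposite.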
Mousavi and Khorasani [
46] introduced a Dynamic Neural Network (DNN)-based system for FDI in RWs of satellite formation flying (SFF) missions with a decentralized architecture. DNNs are trained with extended backpropagation using input/output data from the attitude control subsystem to model the nonlinear dynamics of each spacecraft. A fine-tuned set of DNN parameters is used to minimize estimation errors and to meet performance criteria. The methodology effectively detects low-severity actuator faults by employing local and neighboring spacecraft-based fault detectors. In the study, multilayer DNNs are trained based on relative attitude measurements and used to represent SFF dynamics.
Using DNNs, Valdes and Khorasani [
63] developed an effective FDI system for Pulsed Plasma Thrusters (PPTs) used in the ACS of SFF missions. The proposed approach involves three FDI strategies: a basic FDI scheme, an advanced FDI scheme, and an integrated FDI scheme. The basic scheme accurately detects and isolates faults in PPT actuators but has low precision and high error rates. The advanced scheme analyzes relative attitude data from formation flying, resulting in excellent detection but lacking isolation capability. The integrated scheme achieves high accuracy and precision with minimal misclassification rates and also provides insights into thrust production levels during faults. Compared with traditional attitude control actuators, fault diagnosis for PPTs has been less explored because of challenges with force measurement and mathematical modeling.
4.2. Model-Based Methods
Model-based methodologies have been a staple of fault diagnostic strategies, specifically ACS diagnostics, for many years. Their capacity to carry out precise and timely FDII using high-fidelity models has made them reliable techniques for time-sensitive missions. In addition, the interpretability of the outputs provided by model-based methods is conducive to making appropriate judgments and taking remedial fault-tolerant measures. This section discusses their applicability in a broad spectrum of satellite and UAV diagnostic case studies.
Jiang and Khorasani [
45] utilized a second-order nonlinear sliding mode observer for FDII in a tetrahedron RW assembly. Jiang and Khorasani indicated that, because the RW assembly is over-actuated and contains redundant RWs, no more than two in-phase faults could be fully diagnosed, and postprocessing is needed to perform isolation and identification. However, even with two concurrently occurring faults, isolating their locations proved challenging since several combinations of faulty RWs could have the same effect on the generated residuals. Jiang and Khorasani worked around this problem by taking measures to distinguish between the different possibilities.
Chen and Liu [
41] utilized a two-stage EKF to estimate satellite actuator and sensor faults. The focal point of their study is estimating both multiplicative and additive faults while modeling actuator faults as multiplicative and additive and sensor faults as additive. Chen and Liu represented multiplicative faults as control effectiveness parameters and additive faults as added amplitudes. Their proposed model was also tested using the telemetry data of an on-orbit satellite, and the faults were successfully estimated.
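The difference between the two fault types can be illustrated with a small numerical sketch. It is not the two-stage EKF of Chen and Liu; it only assumes a scalar actuator whose delivered output is `effectiveness * u_cmd + bias` (the function and parameter names are hypothetical) and shows why commands at two different levels are enough to separate a multiplicative fault from an additive one in the noise-free case.

```python
def faulty_actuator(u_cmd, effectiveness, bias):
    """Delivered actuation under a multiplicative (effectiveness) and additive (bias) fault."""
    return effectiveness * u_cmd + bias


def faulty_sensor(y_true, bias):
    """Sensor reading under an additive fault only."""
    return y_true + bias


def identify_actuator_fault(u1, out1, u2, out2):
    """Recover (effectiveness, bias) from two noise-free samples at distinct commands."""
    if u1 == u2:
        raise ValueError("two distinct command levels are needed")
    effectiveness = (out2 - out1) / (u2 - u1)
    bias = out1 - effectiveness * u1
    return effectiveness, bias


# Hypothetical fault: 30% loss of effectiveness plus a constant offset of 0.05.
out_a = faulty_actuator(1.0, 0.7, 0.05)
out_b = faulty_actuator(2.0, 0.7, 0.05)
print(identify_actuator_fault(1.0, out_a, 2.0, out_b))
```

With a constant command the two parameters cannot be told apart, which is one reason estimators of this kind rely on sufficiently varied inputs.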
Another work addressing the problem of diagnosing multiplicative faults is the study by Shahriari-kahkeshi et al. [
11], which presents an adaptive model-based methodology applicable to Lipschitz nonlinear systems. The proposed methodology performs fault detection and identification simultaneously. It involves an adaptive state observer leveraging a robust adaptive law that is effective against unknown faults as well as measurement and modeling uncertainties. The scheme was tested on a single-link flexible joint robot arm and displayed rapid convergence and better accuracy rates than a similar existing fault estimation scheme.
To accommodate the adaptivity requirements in satellite ACS diagnosis applications, Rahimi et al. [
19] used Adaptive Unscented Kalman Filter (AUKF) techniques for fault detection of RWs and expanded upon them by combining them with the Particle Swarm Optimization (PSO) method. The PSO method is used to circumvent the complicated procedure of setting up the AUKF for parameter estimation and to make it more precise and efficient. The model-based methodology is based on a high-fidelity RW model developed by Bialke [
94]. It is indicated by Rahimi et al. that for more rigorous results, a more sophisticated implementation of the PSO and AI manipulation is required.
Furthermore, Rahimi et al. addressed the fault diagnosis of RW assemblies in [
95] by introducing a model-based hierarchical approach for FDII of RWs onboard satellites. Their three-step proposed methodology consists of (1) an AUKF for detection, (2) multiple UKFs along with Bayes’ probability theorem and probability distributions for isolation, and (3) dual-state and parameter estimation using UKFs for identification. The methodology’s accuracy rate in detecting, isolating, and identifying faults was determined by simulations to be 89.5% on average.
During their research on enhancing AUKFs for single RW usages, Rahimi et al. [
16] also presented the Covariance-based Adaptive Unscented Kalman Filter (CAUKF), a variant of the AUKF that aims to refine parameter estimation. The model-based method incorporates adaptive covariance matrices of states and parameters to be robust to sudden fluctuations in nonmeasurable parameters of the system. The proposed CAUKF generally takes more time to diagnose faults than the AUKF while displaying 4%, 14%, 42%, and 90% lower mean square errors for residuals of the RW model parameters under abrupt, transient, intermittent, and incipient malfunction scenarios, respectively.
Moreover, Rahimi et al. [
14] built upon their work in [
16] by proposing the Binary Grid Covariance Adaptive Unscented Kalman Filter (GAUKF) methodology and a two-step hierarchical method for FDII of CMGs. The first step employs the adaptive thresholding approach, and the second step encapsulates fault isolation and identification by introducing an adaptive covariance-based binary grid search procedure. The GAUKF method was developed to address the high computational cost of the CAUKF and its suboptimal operation when the control system is closed-loop. GAUKF's fault isolation and identification performance surpasses both CAUKF and UKF methodologies in terms of precision. Additionally, the average execution time for the GAUKF is shorter than that of CAUKF and UKF when utilized in a closed-loop control system. However, the symmetry of the pyramid configuration poses problems in isolating the faults in RWs across from each other. Another drawback is the inaccurate estimation of one of the four wheels' fault parameters due to the system's lower dynamical sensitivity to the fourth RW's output [
14,
95].
Later on, Rahimi [
96] proposed the Simplified Binary Grid Covariance Adaptive Unscented Kalman Filter (SGAUKF) methodology and further developed the isolation and identification modules presented in [
14,
16]. The SGAUKF method updates only the even or odd rows or columns of the posterior estimate covariance matrix's diagonal elements, cutting the order of computations to its square root. The method also features either an increase or no change, as the name binary implies. The Monte Carlo simulation results show that SGAUKF improves on the GAUKF isolation accuracy by 1% while speeding up the process by about 27%, making it an even more viable approach to online monitoring. The SGAUKF also suffers from imprecise identification performance for the fourth flywheel, which could be addressed in future research.
Nasrolahi and Abdollahi [
6] designed a nonlinear observer for fault detection of attitude and rate sensors onboard satellites and presented a fault recovery strategy. Fault detection is achieved using the satellite's measured angular velocities and attitude parameters. The attitude parameters are described using Modified Rodrigues Parameters (MRPs), which enables the fault detection of different attitude sensor types with different configurations using the same observer.
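As a brief illustration of why a common attitude parameterization is convenient, the sketch below converts a unit quaternion into MRPs and back using the standard relations. It assumes a scalar-first quaternion convention and that the attitude sensor reports a quaternion; both are assumptions made for the example and are not taken from Nasrolahi and Abdollahi's work.

```python
import math


def quaternion_to_mrp(q):
    """Convert a scalar-first quaternion (q0, q1, q2, q3) to an MRP vector."""
    norm = math.sqrt(sum(c * c for c in q))
    q0, q1, q2, q3 = (c / norm for c in q)
    if q0 < 0.0:
        # q and -q describe the same attitude; choose the one away from
        # the MRP singularity at q0 = -1.
        q0, q1, q2, q3 = -q0, -q1, -q2, -q3
    d = 1.0 + q0
    return (q1 / d, q2 / d, q3 / d)


def mrp_to_quaternion(s):
    """Convert an MRP vector back to a scalar-first unit quaternion."""
    s2 = sum(c * c for c in s)
    d = 1.0 + s2
    return ((1.0 - s2) / d, 2.0 * s[0] / d, 2.0 * s[1] / d, 2.0 * s[2] / d)


# A 90 degree rotation about the z axis.
half = math.radians(90.0) / 2.0
q_in = (math.cos(half), 0.0, 0.0, math.sin(half))
sigma = quaternion_to_mrp(q_in)
print(sigma)
print(mrp_to_quaternion(sigma))
```

For a rotation of angle φ about a unit axis, the MRP vector has magnitude tan(φ/4), so it stays well behaved for rotations short of a full turn.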
Carvalho et al. [
74] presented a fault detection filter that incorporates both (
) and (
) simultaneously for Markovian Jump Linear Systems (MJLSs). Carvalho et al.'s proposed approach aims to make the filter robust to disturbances, noise, and applied inputs and sensitive to malfunction signals. The filter is then applied to a CMG to demonstrate its performance. A brief description of MJLSs is included below. The state-space model of a discrete-time Markovian jump linear system is defined as follows [
97]:
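$$x_{k+1} = A_{\theta_k} x_k + B_{\theta_k} u_k, \qquad x_0 \sim D, \tag{4}$$
where $x_k$ denotes the system's state at discrete time step $k$, the initial state $x_0$ follows a distribution $D$, $u_k$ denotes the control command at time step $k$, and $\theta_k$ takes a value in the finite set $\mathcal{N} = \{1, \dots, N\}$ at each time step $k$, where $\{\theta_k\}$ forms a discrete-time Markov process on $\mathcal{N}$. The probability of transition between two states of $\theta_k$ is defined as
$$p_{ij} = \Pr(\theta_{k+1} = j \mid \theta_k = i), \tag{5}$$
where for each $i$ the sum of all $p_{ij}$ over $j$ amounts to one. The initial distribution of $\theta_0$ is also represented as follows:
$$\pi_i = \Pr(\theta_0 = i), \tag{6}$$
where the sum of all $\pi_i$ is one. From Equation (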
5), Markov processes are characterized by the property that the probability of transitioning from one state to another depends solely on the current state of the system, irrespective of how it arrived at that state. For demonstration,
Figure 9 depicts a Markov chain of a fault-prone system with a transition probability matrix of
.
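The Markov property described above can be made concrete with a short simulation. The two-mode chain below (nominal and faulty) and its transition probabilities are hypothetical values chosen for illustration; they are not the matrix associated with Figure 9.

```python
import random

# Hypothetical transition probability matrix: index 0 = nominal, 1 = faulty.
# Row i holds the probabilities of moving from mode i to each mode.
P = [[0.95, 0.05],
     [0.20, 0.80]]


def is_row_stochastic(P, tol=1e-9):
    """Check that every row is a valid probability distribution."""
    return all(min(row) >= 0.0 and abs(sum(row) - 1.0) <= tol for row in P)


def propagate(pi, P):
    """One-step update of the mode distribution: pi_next[j] = sum_i pi[i] * P[i][j]."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


def sample_path(P, mode0, steps, rng):
    """Draw a mode sequence; each step depends only on the current mode."""
    path = [mode0]
    for _ in range(steps):
        weights = P[path[-1]]
        path.append(rng.choices(range(len(P)), weights=weights)[0])
    return path


pi = [1.0, 0.0]  # start in the nominal mode with certainty
for _ in range(200):
    pi = propagate(pi, P)
print(pi)
print(sample_path(P, 0, 20, random.Random(1)))
```

Repeated propagation drives the mode distribution towards the chain's stationary distribution, which for these assumed values puts a long-run probability of 0.2 on the faulty mode.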
Iglesis et al. [
13] utilized the idea of jump Markov processes by presenting a Jump Markov Regularized Particle Filter (JMRPF) for nonlinear Inertial Navigation Sensor (INS) fault identification used in longitudinal control of a fixed-wing UAV. The jump Markov approach was introduced into a regularized PF to address the transitions of the system between faulty and nominal modes based on the Markov property. The proposed scheme of Iglesis et al. does not require predefined fault models to operate. Instead, estimation is carried out for additive abrupt and incipient faults with unexpected dynamics and amplitudes. A Kalman correction strategy is also incorporated into the proposed scheme, which distributes the particle states in the more probable state-space areas to enhance the state and malfunction estimation further. Under numerical simulations, JMRPF showed 77% less root-mean-square error in estimation than a regularized PF while exhibiting shorter convergence times and more robustness to faults with larger amplitudes.
Wang et al. [
99] developed a model-based method for actuator FDII of a Hex-rotor UAV. Faults are classified as total actuator failures and gain faults attributed to the actuator lift factor’s deviations from the nominal condition. The diagnosis model includes a set of EKF-based fault observers whose output is employed to achieve fault reconstruction. The proposed methodology of Wang et al. surpasses the accuracy of sliding mode observers in attitude angle tracking.
Maqsood et al. [
42] developed an enhanced high gain observer for FDII of angular rate sensors in quadrotor UAVs to further address UAV sensor health monitoring. The proposed model is tested under incipient, oscillatory, and intermittent fault conditions, and its results are compared with the integral chain differentiator and basic high-gain observer techniques. The accuracy rate of the proposed approach significantly surpasses that of traditional techniques while being less computationally intensive and providing fast diagnostics.
Gai et al. [
43] proposed a novel dynamic Event-Triggered Mechanism (ETM) for fault detection in UAV actuators to achieve minimal communication resource usage and remove correlations between generated residuals and dynamic event-triggered transmission errors while circumventing the Zeno phenomenon. The Zeno phenomenon refers to the existence of an infinite number of events in a finite time interval, which makes ETMs prone to errors. The dynamic ETM approach employed by Gai et al. is based on the
/
optimization problem, which is solved using the Riccati recursion method. Their fault detection scheme also utilized a new residual evaluation and thresholding method. The simulation results indicate that the proposed dynamic ETM detection scheme outperforms the static variant in terms of accuracy while requiring at least 13% less event transmission information. Gai et al. also expressed that the next steps in this research would be to evaluate the effect of the dynamic ETM on quantitative fault inspection and examine the effect of closed-loop control systems on fault detectability.
Gao et al. [
36] used a stochastic-model-based method for FDII of actuators onboard tilt-rotor UAVs. Their methodology consists of several EKF observers, each assigned to a single corresponding actuator. Moreover, it utilizes the MMAE method, which uses each observer's residuals and state error covariance matrix to obtain the fault conditional probabilities, which are then employed to achieve fault diagnosis. The proposed methodology of Gao et al. also enhances the efficiency of the MMAE method. It requires neither additional sensors to monitor actuator deflections nor changes to the flight controller.
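The step from residuals to fault conditional probabilities can be sketched with a generic multiple-model Bayesian update. This is a simplified scalar illustration under assumed Gaussian residuals, not Gao et al.'s implementation; the function names, the probability floor, and the numerical values are hypothetical.

```python
import math


def gaussian_likelihood(residual, variance):
    """Likelihood of a scalar residual under a zero-mean Gaussian."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def mmae_update(priors, residuals, variances, floor=1e-6):
    """Bayes update of the hypothesis probabilities from each filter's residual."""
    weighted = [p * gaussian_likelihood(r, s)
                for p, r, s in zip(priors, residuals, variances)]
    total = sum(weighted)
    posteriors = [w / total for w in weighted]
    # Keep every hypothesis slightly alive so it can recover later.
    posteriors = [max(p, floor) for p in posteriors]
    total = sum(posteriors)
    return [p / total for p in posteriors]


# Three hypotheses (for example, one filter per actuator fault), equal priors.
probs = [1.0 / 3.0] * 3
for _ in range(5):
    # The filter matching the true condition produces the smallest residual.
    probs = mmae_update(probs, residuals=[2.0, 0.1, 1.5], variances=[0.25, 0.25, 0.25])
print(probs)
```

The hypothesis whose filter explains the measurements best accumulates probability over successive updates, and the diagnosis is read off as the most probable hypothesis.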
Guzmán-Rabasa et al. [
38] designed an
observer for FDI of actuators onboard a quadrotor UAV where the rotational dynamics are modeled as a reduced quasi-linear parameter-varying system. The numerical test results demonstrate that the proposed approach performs at the same accuracy level as similar methods. However, it is claimed to be more efficient because it only accounts for the system's rotational dynamics. On the other hand, it cannot carry out fault diagnosis that considers translational dynamics. Next steps for this research could include validating the proposed approach with an experimental setup, developing a mixed
H/
scheme for fault diagnosis or using it with a fault-tolerant controller.
To also address surface icing, Rotondo et al. [
39] presented a model-based methodology for UAV icing and actuator fault diagnosis. Their approach is structured around combining the concepts of interval observers and Unknown Input Observers (UIOs) and exploiting their fusion to develop a linear parameter-varying observer that circumvents the issues surrounding model uncertainties and unknown inputs. Rotondo et al. also suggested the following additions to enhance the proposed methodology: alleviating its conservativeness, improving its performance, and generalizing it to more comprehensive operating conditions.
An innovative hierarchical approach for fault diagnosis in satellite components or subsystems is presented in [
66]. This method emphasizes dividing the complex system into smaller pieces for structured diagnostic reasoning based on rules specific to each element. These modules are interconnected through a Component Dependency Model (CDM) using Bayesian networks. A leader–follower SFF is used to demonstrate the method's functionality and its potential to enhance satellite health monitoring. Proposed extensions include dynamic Bayesian network-based CDMs and systematic generation of the CDMs, which are promising directions for enhancing fault diagnosis capabilities.
In [
52,
55], Azizi and Khorasani explored strategies for managing actuator malfunctions in a decentralized manner for SFF. This is facilitated through a cooperative framework comprising three layers. Initially, established recovery methodologies are employed in the bottom-layer fault recovery module based on assessments of fault severity. Nevertheless, inaccurate fault assessments can result in violations of the mission's error specifications. The higher-layer supervisor identifies faults and initiates the formation-layer fault recovery. This high-level supervision endeavors to compensate for the satellite that was only partially recovered at the lower level. In another paper, Tousi and Khorasani [
47] developed a hybrid method that combined FDI stages such as [
52,
55] for a team of cooperating UAVs. Their hybrid methodology incorporates both bottom-layer and top-layer FDI modules. Simulation results involving a team of five UAVs are provided, demonstrating the effectiveness of this approach. In [
48], Tousi et al. expanded the work in [
47] and used Discrete-Event Systems (DES) at the top level to diagnose faults, while traditional diagnostic methods are used at the lower level. A broader spectrum of faults can be detected and isolated using this method than has been reported in the literature.
In another study [
53], Azizi and Khorasani explored a novel distributed KF approach designed to estimate actuator faults for SFF. When addressing a complicated hierarchical system, they converted the representation of the system from an Overlapping Block-Diagonal State Space (OBDSS) to their innovative Constrained-State Block-Diagonal State Space (CSBDSS). The proposed approach can simplify the implementation of KFs in a distributed manner. During Kalman filtering iterations, the constrained-state condition must be met, equivalent to solving local constrained optimization cost functions. Various systems, including power systems and sensor networks, can be addressed with this technique.
A hierarchical fault estimation and accommodation approach is proposed in [
54]. To minimize the adverse effects of unmodeled dynamics, uncertainty, and disturbances within the SFF, Azizi and Khorasani seek to encourage cooperative interactions across levels and modules. Additionally, they highlighted a crucial finding: centralized estimation schemes exhibit notable drawbacks when faced with unmodeled dynamics, uncertainties, and disturbances. As a result, the cooperative estimation procedure proposed in [
54] proves to be a highly applicable method for overcoming the mentioned challenges.
An advanced approach to cooperative actuator fault estimation in deep space is presented in [
76] that integrates hybrid and switching techniques. Every operational mode represents a distinct cooperative estimation strategy and communication pattern between the localized filters responsible for detecting and estimating the SFF’s status. With the help of this approach, fault estimation accuracy is likely to be enhanced in a cooperative and dynamically changing space environment.
A cooperative fault accommodation challenge within SFF is explored in [
56]. It used absolute rather than relative measurements, like other studies by Khorasani’s team [
45,
52,
55,
57], making the approach applicable to a wide range of SFFs operating in planetary orbits. In this study, collaboration between controllers maximizes the efficiency of supporting a faulty satellite. Therefore, the scope of this study is expanded to include a broader spectrum of SFF based on the cooperative fault accommodation framework.
In another study [
65], Azizi and Khorasani introduced an innovative, collaborative framework designed to estimate the states of SFF. This framework employed the concept of sub-observers, each dedicated to estimating specific states based on provided input, output, and state information. To maintain estimation errors at a manageable level, they proposed a directed graph representing the interdependencies among sub-observers. By selecting the optimal path within this graph, a higher-level supervisor can choose and configure a set of sub-observers that ensures accurate estimation of all system states. In cases where unreliable information is introduced due to significant disturbances, noise, or actuator faults, specific sub-observers may lose their efficacy. In such scenarios, the supervisor dynamically adjusts the sub-observer set by selecting a new path within the sub-observers' digraph. This adaptive approach effectively manages and restricts the impact of these uncertainties, ensuring they only influence local estimates of states and faults. Consequently, the spread of uncertainties throughout the estimation process is minimized, preventing extensive performance degradation across the entire SFF [
65].
In [
64], a particular structure is described by Meskin and Khorasani. The SFF is considered to be a Multiple-Input, Multiple-Output (MIMO) system in this study. By employing this structural approach, the overall formation is optimally managed, and a viable stability analysis can be conducted. One problem, however, is that the MIMO structure is not resilient to local faults. As a result, it is imperative to develop FDI filters capable of detecting such localized faults. FDI filters operating in a local or decentralized capacity were devised and presented to mitigate this vulnerability.
Ghasemi and Khorasani proposed an innovative FDI strategy for the ACS in SFF [
77]. This method can use three different setups: decentralized, centralized, and semi-decentralized (distributed). With these three configurations, a fault diagnostic system can be created for multisatellite systems. In missions incorporating angular velocity sensors, the centralized FDI architecture is more effective than the other two FDI architectures; however, it also generates more false alarms. On the other hand, decentralized FDI architectures announce fault occurrences more frequently and produce fewer false alarms. In missions where the formation flight integrates attitude-measuring sensors, the centralized Fault Detection (FD) architecture performs better than the other two, resulting in fewer false alarms. Furthermore, the decentralized FD architecture communicates fault occurrences less frequently while producing fewer false alarms compared with the distributed FD architecture.
An
-based robust distributed observer to synchronize the orientations of several satellites is investigated in [
78]. A powerful distributed
is established at each satellite to detect intermittent faults. To calculate the observer gains, the authors designed two sets of Linear Matrix Inequality (LMI) criteria. Numerical simulations with multiple satellites show that the resilient
observer effectively estimates time-varying and intermittent faults when the satellites' communication layout is an undirected graph. Additionally, the designed observer is shown to be effective when several satellites are faulty at once.
In [
58], Shakouri and Assadian introduced a novel approach involving intersatellite measurements, considering relative positional data and orbital parameters. Their focus was on FDI within spacecraft rate gyroscopes. This innovative technique identifies a consistent motion characteristic, specifically restricting dynamic states during relative motion. Consequently, the primary satellite’s angular velocity vector aligns with a quadratic surface. This ascertained motion characteristic aids in diagnosing faults within the gyroscope system and providing a rough estimation of the relevant scale factor or bias for the rate gyroscopes associated with the primary satellite. The proposed approach eliminates the need for additional subsystems within the SFF framework.
A comprehensive approach for FDI within a specific class of nonlinear SFF systems is explored in [
67]. The methodology considers model uncertainties, input variations, and environmental disturbances throughout the diagnostic process. To address fault detection, a nonlinear observer is carefully engineered to minimize uncertainty within the robust
framework. The observer gain matrices are computed using an LMI formulation. Robust UIOs are developed to pinpoint the faulty actuator and facilitate fault isolation. This isolation is executed through a strategic implementation of generalized observer techniques. The novel observer architecture can concurrently estimate faults and states while effectively mitigating the impact of unknown input disturbances, model uncertainties, and external disturbances. This is achieved by employing Lipschitz formulations and the Linear Parameter Varying (LPV) method across all proposed observers, resulting in less conservative LMI conditions. Notably, this approach allows each satellite to diagnose its own faults and those of its adjacent satellites.
Gao and Wang [
59] introduced an approach to estimate faults and ensure fault tolerance in the control system for nonlinear SFF systems, explicitly addressing actuator faults. A decentralized UIO is designed to estimate the actuator fault factors. The estimated fault values obtained are then utilized to design a distributed fault-tolerant controller for SFF. Subsequently, an algorithm for fault-tolerant formation control is developed, using Adaptive Terminal Sliding Mode Control (ATSMC) techniques to improve the synchronization of each follower spacecraft with the leader spacecraft, even when faced with actuator faults.
In another study, Barzegar and Rahimi [
60] designed a robust UIO to estimate faults within clusters of small satellites. Operating within clusters, these satellites present heterogeneous dynamics characterized by significant diversity owing to environmental conditions and inherent nonlinearities. The study extends its focus beyond fault estimation, showing that the designed observer and controller logic not only estimate faults accurately but also produce synchronized behavior among satellites within each cluster. This synchronization manifests as a state of consensus, steering the satellites towards a collective achievement of the predefined objectives and trajectories designed for their mission. In a separate study, Barzegar and Rahimi [
61] investigated the problem of fault diagnosis within clusters of small satellites. These satellites operate under diverse environmental conditions, which makes it challenging to mitigate both disturbances and faults while maintaining the desired formation state among the satellites. The study aimed to alleviate the adverse effects of disturbances through an elaborate fault diagnosis approach, ultimately ensuring a consistent and reliable formation state among the small satellites, even in dynamically changing environments. To fortify the observer's resilience against external disturbances as outlined in Barzegar and Rahimi's work [
60,
61], they employed the
approach. In a separate study conducted by the same authors, a dissipativity-based UIO is implemented for the Lipschitz nonlinear multisatellite systems. This strategy is designed to mitigate the impact of disturbances across two scenarios: one where matching conditions are met and another where they are not [
62].
Negash et al. [
51] aimed to detect cyber-threats during UAV Formation Flight (UFF). Negash et al. have adopted a decentralized approach using UIO for FDI, which is particularly suited to the distinctive attributes of UFF and their associated control algorithms, which may involve unknown parameters. This approach facilitates the identification of system faults and extends its utility to detecting cyber intrusions and precisely locating compromised UAVs. The primary objective of this study is to introduce an algorithm capable of identifying compromised UAVs without compromising the overall performance of the formation. Numerical simulations conducted by the authors provide evidence that diagnosing cyber intrusions within a hexagonal UAV formation is highly effective [
51].
Meskin et al. [
49] proposed a unique solution to the problem of implementing UAVs in environments subject to significant disturbances. The FDI method was developed to integrate a continuous-time residual generator with a fault diagnosis system using a DES. Specifically, their hybrid FDI algorithm is applied to detect and isolate actuator faults in a quad-rotor network to demonstrate its efficacy.
Meskin and Khorasani [
50] investigated the design and analysis of FDI filters for actuator systems within a network of aerial and space unmanned vehicles. The study emphasizes the actuator fault patterns within the team of agents, characterizing it as an over-actuated system. An isolability index is introduced to evaluate fault patterns and design a novel set of structured residuals. This set is engineered to detect and isolate multiple faults selectively and accurately, particularly those with dependent fault patterns, as in over-actuated systems. Based on these concepts, their algorithm is implemented to tackle the actuator FDI challenge in unmanned vehicle networks operating under different architectures, including centralized, distributed, and decentralized. Additionally, in [
50], a comparative analysis was conducted, evaluating each architecture’s advantages and limitations.
Zaeri Amirani et al. [
68] focused on managing formations that change with time, especially when leaders move and followers lack information about their actions. The study also investigated altering the number of leaders and followers in the formation. Arrays of KFs were developed to handle noise reduction and integrate data using the state vector. A
-test was utilized for FDI in this study. The coefficients in the formation control logic were set in advance, considering changes in the model, eliminating the need for real-time adjustments. The formation control law utilized integral state feedback based on the relative position integral of the agents, allowing followers to track leaders without their input.