A Modified Nonparametric Message Passing Algorithm for Soft Iterative Channel Estimation

Based on the factor graph framework, we derived a Modified Nonparametric Message Passing Algorithm (MNMPA) for soft iterative channel estimation in a Low Density Parity-Check (LDPC) coded Bit-Interleaved Coded Modulation (BICM) system. The algorithm combines ideas from Particle Filtering (PF) with popular factor graph techniques. A Markov Chain Monte Carlo (MCMC) move step is added after typical sequential Important Sampling (SIS)-resampling to prevent particle impoverishment and to improve channel estimation precision. To reduce complexity, a new max-sum rule for updating particle based messages is reformulated and two proper update schedules are designed. Simulation results illustrate the effectiveness of MNMPA and its comparison with other sum-product algorithms in a Gaussian or non-Gaussian noise environment. We also studied the effect of the particle number, pilot symbol spacing and different schedules on BER performance.


INTRODUCTION
The two main Message Passing Algorithms (MPA), sum-product algorithm and max-product algorithm, are originally operated on Factor Graphs (FG) for iterative decoding of LDPC code (Wiberg, 1996).And the latter one is presented to further reduce complexity.In the past decade, this graphical approach has been introduced (Loeliger, 2004) and extended to solve other inference problems in communications, such as turbo equalization (Guo and Ping, 2008), multi-user detection (Bickson et al., 2008) and joint iterative channel estimation and decoding (Niu et al., 2005;Simoens and Moeneclaey, 2006;Zhu et al., 2009;Guo and Huang, 2010).FG and MPA have been proven to be powerful tools both for disclosing a unified framework behind existing iterative receiver algorithms and designing new low-complexity ones to achieve near optimal performance.
In a general continuous stochastic inference problem, the messages passed on FG are usually continuous functions making their analytical computation impossible or very complicated.Applying FG to the models with continuous variables remains a challenging task.In some special cases, all relations between variables are conditionally linear and all noises are Gaussian which results in a Gaussian or mixture Gaussian local functions.In these cases, MPA is reduced to an algorithm which only computes and updates the distributions' parameters, so we name it Parametric MPA (PMPA).The PMPA has been presented (Niu et al., 2005;Simoens and Moeneclaey, 2006;Zhu et al., 2009) and gives accurate iterative channel estimation by approximating the SISO or MIMO flat fading channel as the First-Order Autoregressive (AR (1)) process.Due to the forward and backward messages updating, this PMPA actually amounts to the famous Kalman smoothing algorithm.Another case is the Gaussian Message Passing (GMP) algorithm (Guo and Huang, 2010) which is better than unidirectional filtering for double selective channel.
However, in a practical scenario, the state-space model is usually non-Gaussian on account of the non-Gaussian measurement noise.The estimation accuracy of classical algorithms such as Kalman filter and extended Kalman filter decreases significantly and even appears to have a divergent phenomenon.As an alternative to the Gaussian approximation mentioned above, PF has advantages in inference in nonlinear and non Gaussian state-space models and can be interpreted as a message passing algorithm (Dauwels et al., 2006).These advantages encourage us to combine PF with MPA (Loeliger et al., 2007) to derive new efficient estimation and detection algorithms.Several methods have been proposed to represent distributions as a list of samples and update sample-based messages instead of parametric messages (Tienda-Luna et al., 2005;Guo and Hu, 2009;Duan et al., 2011).This algorithm is called Nonparametric Message Passing Algorithm (NMPA).A simplified NMPA (Tienda-Luna et al., 2005) for iterative decoding over channels with phase noise is derived with the mode or mean of particles at the expense of performance degrade at high SNR.A list of samples is utilized to accurately estimate Channel State Information (CSI) in un-coded systems (Guo and Hu, 2009) and coded systems (Duan et al., 2011), respectively.It is worth noting that these NMPAs only concentrates on the combination of typical PF and sumproduct algorithm and have not addressed the affections of particle impoverishment, message update rule and schedule on performance and complexity.In addition, the model mismatch between AR (1) model and the actual simulated channel have not been an area of focus.
In this study, we considered 16-QAM BICM-ID system with Modified Set Partitioning (MSP) labeling, LDPC code and non-Gaussian measurement noise and describe factor graph representation for system models.To avert particle impoverishment and further improve the precision of channel estimation of our previous NMPA (Duan et al., 2011), we proposed modified NMPA to incorporate MCMC re-sampling with typical PF and applying Metropolis Hastings (MH) algorithm to construct the transition kernel.And we reformulate particle based max-sum message update rule and design two types of selective message update schedules to spend less time on computation but also to unify the message update rule in max-sum type.Simulation results show that performance of the proposed algorithm is better than PMPA and other NMPAs.

INFERENCE IN FACTOR GRAPH BASED MODELS
System model: At the discrete transmitter, an information bit sequence b = {b j } is first encoded by a rate-R LDPC code and bit-wise interleaved.The interleaved bits are grouped into and mapped into complex data symbol x s = {x i } which is chosen from M-ary constellation χ, m = log 2 M. L pilot symbols x p are inserted into data symbols periodically and K is referred to as pilot symbol spacing.
Assuming that the correlated fading channel model is a zero mean, one unit variance complex Gaussian process and following Jakes' isotropic scattering, we know that the autocorrelation function of h i is expressed as: where, f d : The relative Doppler spread between transmitter and receiver T s : The symbol period The low-order auto-regressive model can capture most of the channel dynamics, so the fading channel is approximated with a first-order AR model (Wang and Chang, 1996).h i varies according to (1), where α = J 0 (2πf d T s ) and E (|h i | 2 ) = 1.The noise v (i) is a zero-mean complex Gaussian process with a covariance σ 2 = 1-|α| 2 and is statistically independent of h (i-1): At the receiver, the discrete-time baseband signal is written as: It is assumed that n i are independent of the symbol sequence and the independent and identically distributed (i.i.d.) random variables which follow two type distributions respectively.One is a complex Gaussian distribution: where σ 2 is the noise variance.The other is a two term Gaussian mixture distribution: where, 0<ε<1, k<1, the term N (0, σ 1 2 ) represents the nominal ambient noise and the term N (0, kσ 1 2 ) represents an impulsive component.As an approximation to the more fundamental Middleton Class A noise model (Middleton, 1999), it has been used extensively to model physical noise arising in radio and acoustic channels.

Factor graph based representation and inference:
The system is modeled by a Conditionally Gaussian Linear State-Space Model (CGLSM ) in (1) (2) and (3) and by a non-Gaussian model in (1) (2) and (4).The detailed dependence structure at the receiver has been presented in Fig. 1 following the factorization as in:

p p p I p p p h p h h p y x h
The function node T hi and T yi stand for function p (h i |h i-1 ) and p (y i |x p , h i ), respectively.T mi = I {x i = Q (d i )} is the constellation mapping.P denotes code pilot constraint.For any node, μ A→ B and L A→ B denote the message that flows from node A to node B in the probability domain and the log probability domain, respectively.
The upper part represents the Channel Estimator (CE) and the lower part represents the Iterative Decoder (ID) which covers the decoder, interleaver and demodulator.We have designed a factor graph based receiver (Duan et al., 2011) following the criterion: Check Node Variable Node ˆarg max ( ) The messages are computed by sum-product rule (Loeliger, 2004) with a particle list and are exchanged between the two parts based on turbo principle.CE outputs the channel message μ Tyi→xi downward to ID and ID updates the message μ xi→Tyi of symbols upward into CE.This turbo iteration is named outer iteration.The iteration in ID itself is named inner iteration.A basic schedule (Duan et al., 2011) has been designed to set one inner iteration in one outer iteration and obtain the marginal function p has been obtained by generic PF and incurporated with the sum-product operation to update channel message, W denotes the number of particles.Detailed steps can be drawn as follows: where k and i stand for particle index and time index, respectively o Compute the weights according to: o Normalize the weights o Resampling: Following the approach (Doucet et al., 2000), calculate the effective number of particles: and implement the resampling when N eff <0.75 W. o For pilot symbols, drawing particles from p (y i |x p , h i ) and seting the weight of each particle with equal values 1/W • Computing the data symbol's message which is a mixture gaussian distribution as: The indicator function I {statement} is 1 if the statement is true and 0 otherwise.And pilots' messages are multiplexed into data symbols'.
• Computing and passing the message upward from function node p (y i |x i , h i ) to variable node h i by: y where q l is symbol constellation point.This message is a mixture Gaussian pdf.• Updating messages forward and backward by using sum-product rule and outputing the messages downward as in (9-11).
When the channel at the receiver and actual channel are all modeled by AR (1) process (Zhu et al., 2009;Tienda-Luna et al., 2005;Guo and Hu, 2009;Duan et al., 2011), there is no modeling mismatch.Although the channel messages have been computed by particles in NMPA, PF still suffers from progressive impoverishment and degeneracy.Furthermore, when the actual channel is generated by Jakes' model, channel model mismatch will exist and the particle impoverishment will have more negative effect on channel estimation.A method to reduce degeneracy is increasing the number of particles, but it is impractical for high complexity.It is necessary to use a more efficient re-sampling method to solve this problem:

p h p h h p h h dh p h h p h h
For discrete variables in ID, the sum-product rule (Duan et al., 2011) has been applied to compute their messages.The max-sum algorithm (Henk, 2007) for higher order demodulator and the min-sum algorithm for LDPC decoder have been deduced by max-product rule (Loeliger, 2004) to decrease complexity.Therefore, we employ this type of rule instead of sum-product rule to update the log-domain messages as in (12): However, if the sum-product rule is employed in channel estimator, transforming message definition domain between ID and CE as in: exp( ) log( ) will require more computation.This is another complexity problem of soft iterative CE.

MODIFIED NONPARAMETRIC MESSAGE PASSING ALGORITHM IN CHANNEL ESTIMATION
To solve the two main problems of NMPA as mentioned above, we apply a MCMC move step in PF algorithm and derive max-sum message update rule to compute particle-based messages in CE.It is named a Modified NMPA (MNMPA).

MCMC PF:
Resample-Move algorithm (Gilks and Berzuini, 2001) is one of effective methods to rejuvenate degenerate samples by using MCMC in sequential state estimation problem.The particles have approximately followed the desired posterior distribution, so it does not require each MCMC chain initiated to have a burn-in period.The key study is to construct a transition kernel for which common choices involve the Gibbs sampler or Metropolis Hastings (MH) move.For the channel estimation problem, the MCMC with MH in PF moves the particles to new positions which tend to follow stationary distribution.In the end, the diversity of particles is increased and the depletion of the samples is avoided.The MCMC PF is carried out in the first step of CE: • SIS/resampling: ℎ �   are obtained by typical PF as mentioned above • MCMC move: MH algorithm is implemented on each particle: For k = 1 to W o Drawing proposal candidate particles: • For pilot symbols, drawing particles from p (y i |x p , h i ) and setting the weight of each particle with equal values 1/W Channel estimate accuracy is improved by this MCMC PF, thus the first problem is well solved.Meanwhile, the additional move step will increase computational cost since the particles in generic PF do not rely on jump moves.

Max-sum particle-based message update rule:
In other steps of CE, we define messages in log domain and derive a new message update rule for the estimator to update particle based message by an approximation of the Jacobian logarithm.It not only takes advantage of max and sum operation to minimize computation complexity caused by MH move step and domain transformations but also provides a kind of unified max-sum rule for all parts in iterative receiver.The details are presented as follows: • Updating message upward from function node p (y i |x i , h i ) to variable node h i in log domain.
Applying the Jacobian logarithm results in: 1 where, ( = ) log ( , ) ), 1 ( Leaving out logarithmic operations in core computation M, we get the input message of channel estimation: 1 ( ) max( , ) • Passing messages forward from T h(i+1) to h i+1 and backward from T hi to h i-1 by ( 16) and ( 17) in logdomain:

h h p h h dh p h h p h h e
where, Applying the same method as in ( 14) and ( 15), we obtain: The max operation and summation operation in (15-18) are intended to get a special summary of particle based messages instead of other type channel message.This kind of message process is referred to as a Max-Sum Rule with Particle (MSR-P).Considering the complexity of code-aided channel estimation, the MSR-P is implemented more efficiently by avoiding the exponentiation and logarithm when a larger number of particles are used: • Passing messages downward from function node p (y i |x i , h i ) to variable node x i with the MSR-P by: Since CE and ID both update the message in log domain, the transformations as in Eq. ( 13) are averted to reduce graphical receiver's complexity.Furthermore, the similarity of the proposed MSR-P to max-product rule in ID helps to unify the operation form and get harmonic cooperation in iterative receiver.
Selective message updating schedule: For the LDPC coded BICM system, we designed a basic schedule which includes six LDPC decoder iterations in inner iteration.In this schedule, the messages passed L dec (c i ) from the decoder to the de-modulator are code bit Log-Likelihood Ratios (LLRs).The computation of these messages will result in high complexity when higher order modulation is considered in BICM-ID.Since some LLRs become larger as outer iteration increases, they are assumed to have high reliabilities and the corresponding messages L dem (d ik ) need not be updated.It is possible to further reduce the BICM-ID complexity by designing a selective message updating schedule which can be viewed as a process dynamically adaptive to the current bit reliability.Other than previous applications of a similar idea in MIMO-IDMA (Novak et al., 2008), we use soft extrinsic information of every LDPC code bit LLR to evaluate the bit reliability instead of posteriori LLRs of m convo-lutional code bits.The whole schedule is completed in such way: • Initialization: The upward message L dec (c i ) out of decoder is set to be a uniform distribution.The apriori informations of data symbols are assumed to be equal and sent upward to CE. • Setting static or dynamic threshold: The static threshold is fixed to be a constant.The dynamic threshold is defined as L th = L 0 + L d *q, where L 0 and L d are constants and q is the current iteration.• Channel estimation using MNMPA: • De-multiplexing: The pilots' messages are demultiplexed from messages of x and the data symbols messages are passed downward to the demodulator.• Adaptively updating message in soft demodulator and decoder: If LLRs of c i do not exceed the threshold, the corresponding L dem (d ik ) out from demodulator are computed by employing max-sum rule.Otherwise, these messages are not updated and saved for next iteration.Next, L dec (c i ) are computed by min-sum algorithm after six iterations itself.
• Soft interleaver and modulating: The input messages L dec (c i ) are interleavered and modulated to data symbols messages.• Multiplexing: Pilots' messages are multiplexed into the data symbols messages and outputs are L xi→Tyi .• If the algorithm is not converged, return to step of Channel estimation using MNMPA.Otherwise return: Hard decisions of information bits.
This alternative scheduling of MNMPA can be viewed as a process dynamically adaptive to the current bit reliability.

SIMULATION SETTINGS AND RESULTS
In our simulation, correlated Rayleigh flat-fading channel is generated by Jakes' channel model.QPSK and 16QAM modulation are employed.The energy of pilot symbols is equal to the average energy of data symbols.The LDPC code under consideration is column regular with column weight 3 and block size 2448 at code rate 1/2.There is no channel inter-leaver in the transmitter since the LDPC code has a built-in inter-leaver.Each Min-sum LDPC decoding is set to six passes within every outer iteration.To achieve better decoder performance, the normalization factor of L ci->bj is set to be 0.75.
To start with, we compare our proposed MNMPA to the MPA (Guo and Huang, 2010) which can be viewed and named as the Kalman Smoothing algorithm, SPA (Guo and Hu, 2009) and Max-Sum Algorithm (MSA) for QPSK modulation over fast fading and slow fading channel.As the modification of SPA (Guo and Hu, 2009), MSA denotes the algorithm which only employs MSR-P instead of sum-product rule for particles message passing in CE and still uses SPA in ID.Two types of noise distributions, Additive White Gaussian Noise (AWGN) and Non-Gaussian noise, are considered in Fig. 2 and 3, respectively.The result with perfect Channel State Information (CSI) is considered as a benchmark.As shown in Fig. 2a, MNMPA has the best performance among all graph-based channel estimation algorithms.The MNMPA with 150 particles not only achieves 0.5 db gains compared to SPA with the same particle number but also gets better performance than SPA with 250 particles.The complexity of CE is decreased proportional to the numbers of particles.Therefore, a good tradeoff between the performance and complexity is obtained by the proposed MNMPA.MSA has the same performance as SPA and lower complexity due to MSR-P as analyzed in section of Max-Sum Particle-Based Message Update Rule.The cause of the inferior performance of Kalman Smoothing is that it is n not the optimal algorithm for CGLSM and the mixed Gaussian pdf as in ( 8) is simplified as a single Gaussian pdf by using soft decision of symbols to approximate message μ xi->Tyi .On the other hand, other NMPAs including SPA, MSA and MNMPA all compute this message by list of samples which can describe distributions closer to the true one when more particles are employed.Figure 2b shows the performance of MNMPA with different number of particles for faster fading channel.Contrast that to using 50 particles, MNMPA acquires 0.3-1.2 and 0.5-1.0db performance gains by using 250 and 150 particles, respectively.The more particles that are utilized, the more precise channel estimation will be.However, computation cost increases with this better performance.Meanwhile, there is little performance improvement when using more particles in low E b /N 0 .
With 0.5 db loss, the MNMPA with K = 13 approaches the performance with perfect CSI in Fig. 2c.Comparing the performance gap with that in Fig. 2a, we attributed the improvement to lower time diversity provided by the slower fading channel.Therefore, fewer particles are needed for tracking slower fading channel than that for faster fading channel.In addition, we also observed that the gap between SPA and MNMPA decreases to 0.2 db.In view of the pilot symbol spacing, a tradeoff exists between spectrum utilization efficiency and performance.The MNMPA with K = 13 achieves 0.5-1 db gain compared to that with K = 25.Since this gap widens to 1.5 db when other algorithms are performed, MNMPA is less sensitive to the variation of pilot spacing.
The Middleton Class A noise with ε = 0.3, k = 100 is considered in Fig. 3.For the non-Gaussian state-space model, Kalman Smoothing is not the optimal algorithm and is unable to catch the channel variation, while all NMPAs are robust to the noise and MNMPA is still the most effective.We conducted a simulation that compares MNMPA in three proposed adaptive schedules with a basic schedule for faster fading channel.The basic schedule with no threshold is denoted by noL.L s and L th denote the schedule with static and dynamic threshold, respectively.Figure 4 shows BER versus complexity measured by N r (q) = N 1 (q) / (2*N 2 ), where N 1 (q) is the cumulative number of messages updated in the q th iteration, N 2 is code bite number.The dynamic threshold is set at L th = L 0 + L d *q, L 0 = L d = 5.All three schedules have lower complexity than a basic schedule.BER decreases faster but smoothes-out at higher values when static threshold Ls is smaller, or vice versa.In contrast, the schedule with the dynamic threshold has intermediate behavior and offers a very favorable performance-complexity tradeoff.Achieving BER of 8.7•10 -5 with only N r = 1.8 saves 40% on computational amounts compared to the basic schedule with same BER and N r = 3.
Figure 5 illustrates the performance for 16QAM with MSP labeling in a Single-Input Single-Out (SISO) BICM system with AWGN.MSA approximates SPA and provides better performance with about 0.5 db than Kalman Smoothing.MNMPA get 0.5 db gains compared to MSA.Meanwhile, the higher order the modulation is, the more computation complexity reduction is made in MSA and MNMPA.

CONCLUSION
A Nonparametric technique has been successfully combined with the max-product algorithm to develop a new algorithm MNMPA for soft iterative channel estimation.It assumes no limitations on the type of relations between the variables and is not restricted to the Gaussian noise assumption.Compared to basic SIS re-sampling PF, the MCMC based PF algorithm can maintain the diversity of particles to avoid the sample impoverishment and can obtain more accurate channel estimation with a fewer number of sample particles.Simulation results show that it has not only better performance than Kalman smoothing algorithm but also is robust to Gaussian and non-Gaussian noise.Moreover, it is a new low-cost algorithm because of the unified max-sum message update operation in logprobability domain and well-designed selective message update schedule with static or dynamic threshold.
While this study focuses on the SISO BICM system, the extension of this method to Multi-Input Multi-Output (MIMO) BICM system is the next natural step.Furthermore, the potential property of the proposed NMPA will encourage us to carry out its general application into other iterative receiver design, e.g., equalization, multi-input multi-output detecting and decoding.Our proposed particle-based MPA is also applicable to many estimation problems which can be modeled in hidden Markov chain in communications, e.g., channel estimation, phase estimation.

Fig. 2 :
Fig. 2: BER performance comparison of MNMPA to MSA SPA and Kalman smoothing for system in AWGN with (a) 150 particles and (b) different number of particles for K = 9 and f d T s = 0.02; (c) 150 particles and different K for f d T s = 0.005.All performance shown in (c) are obtained after three outer iterations