WLAN fingerprinting based indoor positioning in the presence of dropped mixture data - Vu Trung Kien




Research Journal of Military Science and Technology, Special Issue, No. 57A, 11 - 2018
Electronics and Automation

WLAN FINGERPRINTING BASED INDOOR POSITIONING IN THE PRESENCE OF DROPPED MIXTURE DATA

Vu Trung Kien1,*, Hoang Manh Kha1, Le Hung Lan2

Abstract: In a Wireless Local Area Network (WLAN), unexpected equipment behavior and changes in the surrounding environment can cause dropping and multi-component problems in the observed data. Dropping refers to the fact that Received Signal Strength Indication (RSSI) measurements of Wi-Fi access points (APs) are occasionally unavailable, although their true value is clearly above the sensitivity limit of the Wi-Fi sensor on the portable device. The multi-component problem occurs when the measured data vary due to obstacles, the direction of the user, doors being open or closed, etc. Taking these problems into consideration, this paper proposes to model the RSSI distribution by the dropping Gaussian Mixture Model (GMM) and develops an extended version of the Expectation-Maximization (EM) algorithm to estimate the parameters of this model in the training phase of WLAN fingerprinting based indoor positioning systems (IPSs). Simulation results demonstrate the effectiveness of the proposed method.

Keywords: Indoor positioning, Fingerprinting, EM algorithm, Dropping, Gaussian mixture model.

1. INTRODUCTION

WLAN fingerprinting based IPSs: With the popularity of Wireless Local Area Networks (WLANs), Wi-Fi based indoor positioning techniques are widely used for indoor user localization. The most popular Wi-Fi positioning methods make use of the Received Signal Strength Indication (RSSI). Among those, the fingerprinting based method is the most suitable for complex indoor environments because a line of sight between transmitter and receiver is not required [1].
This method estimates the position of an object from training data collected at a set of reference points (RPs) with known locations. It consists of two phases: the training phase and the classification (online positioning) phase. In the training phase, training data (RSSI values) are collected at the RPs from the Wi-Fi access points (APs) and used to build the training database, which is often called the radio map. In the online positioning phase, the target's position is estimated by computing the similarity between the online observations and the radio map.

Within RSSI fingerprinting based positioning methods, there are two common approaches to estimating the user location: deterministic approaches [2, 7] and stochastic approaches [9-11]. The latter use a probabilistic model of the training data and compute either the likelihood of observing the online measurements given the position-dependent probability density function (PDF), or the posterior of being at a position given the online observations, to obtain the position estimate. The stochastic approaches appear to cope efficiently with the variations in the observed data in both the training and the classification procedure. In these approaches, the radio map stores statistical parameters of the RSSI distributions of all APs at all training positions instead of raw RSSI measurements [10, 11]. Therefore, the accuracy of the positioning results depends heavily on the accuracy of the estimated parameters.

Modeling RSSI distribution: There are two common categories of models for the distribution of RSSI values: parametric and non-parametric models. As reported in [2-4], systems employing parametric models outperform those employing non-parametric models. Most studies showed that the majority of RSSI histograms fit the Gaussian distribution very well if sufficient samples have been collected [2], [12-16].
Therefore, the Gaussian model is the most feasible parametric model for Wi-Fi RSSI data. In this work, the phenomena present in measured Wi-Fi RSSI data are studied.

In [3, 4, 17], the authors recognized the dropping problem in the observed data. The dropping problem refers to the fact that RSSI measurements occasionally return the sensitivity limit of the Wi-Fi sensor, although the portable device (e.g. a smartphone) used to measure the RSSI is close enough to the Wi-Fi AP. Dropping may have causes such as limitations of the Wi-Fi chipset driver (limited buffer sizes or time-outs) and the temporary switching-off of APs for energy-saving purposes. A single Gaussian distribution was chosen as the model for Wi-Fi RSSI data throughout [3, 4].

In [5, 6, 12], the multi-component problem was noticed. In [12], the authors showed that human behavior in the measurement environment (absence, sitting or standing still, moving randomly and moving specifically) led to bi-modal phenomena in the experimental data. In this case, a single Gaussian distribution is not appropriate for modeling the RSSI distribution. In [5, 6], the GMM was used to model the observed RSSI data, because changes in the surrounding environment, for example a door being closed or open or the direction of the user, obviously change the measured signal strength. However, the authors of [5, 6] did not consider the dropping problem.

Parameter estimation: In order to estimate the parameters of a probabilistic model in the presence of missing data, the EM algorithm [8] seems to be the most feasible estimator among the available approaches. This algorithm is an iterative method for finding maximum likelihood estimates of parameters in statistical models. Each iteration consists of two steps: the E-step and the M-step. In [3, 4], an EM algorithm was proposed to estimate the parameters of censored and dropped data, but the multi-component problem was not addressed.
In [5, 6, 8], the EM algorithm for the GMM can deal with the multi-component problem, but the dropping problem has not been solved.

Considering the multi-component and dropping problems present in collected Wi-Fi RSSI data, this paper proposes to model the Wi-Fi RSSI distribution by the dropping GMM and develops an extended version of the EM algorithm to estimate the parameters of this model. Moreover, the Maximum a Posteriori (MAP) method is extended to estimate the target's position in the online positioning phase when the online measurements suffer from both problems, namely multi-component and dropping.

This paper consists of four sections. Section 1 is the introduction. In Section 2, our proposal is presented. In Section 3, the effectiveness of the proposed approach in the WLAN fingerprinting based IPS (WF-IPS) is evaluated and compared to others. The paper is concluded in Section 4.

2. PROPOSED METHODS

2.1. Modeling RSSI distribution by the dropping GMM

Let $\vec{y} = [y_1, y_2, \dots, y_N]$, $y_n \in \mathbb{R}$, $n = 1, \dots, N$, be the set of unobservable, non-dropped data (complete data), where $N$ is the number of measurements and the $y_n$ are independent and identically distributed random variables. Let $\vec{d} = [d_1, \dots, d_N]$ be the set of hidden binary variables indicating whether an observation is dropped ($d_n = 1$) or not ($d_n = 0$); $c$ is the sensitivity limit of the Wi-Fi sensor on the mobile target. The observable data are possibly dropped data $\vec{x} = [x_1, \dots, x_N]$, where

$$x_n = \begin{cases} y_n, & \text{if } d_n = 0 \\ c, & \text{if } d_n = 1 \end{cases}$$

The measurement model is depicted in Figure 1.

Figure 1. The measurement model in case of the presence of dropped data. (Diagram: complete data following a single Gaussian distribution follow a Gaussian mixture distribution after a change of environment; dropping then maps $y_n$ to $x_n = y_n$ if $d_n = 0$ and to $x_n = c$ if $d_n = 1$.)

2.2. An extended EM algorithm for parameter estimation in the presence of dropped data

In a GMM, the likelihood function of $\vec{y}$ given $\vec{\Theta}$ is:

$$p(\vec{y} \mid \vec{\Theta}) = \prod_{n=1}^{N} \sum_{j=1}^{J} w_j \, p(y_n \mid \theta_j). \tag{1}$$

In Equation (1), $\vec{\Theta} = [w_1, \dots, w_J;\ \mu_1, \dots, \mu_J;\ \sigma_1, \dots, \sigma_J]$ is the set of parameters, $\theta_j = (\mu_j, \sigma_j)$ is the set of parameters of the $j$-th Gaussian component ($j = 1, \dots, J$), $J$ is the number of components, and the $w_j$ are positive mixing weights which sum up to 1. Let

$$\vec{\Delta} = \begin{bmatrix} \Delta_{11} & \cdots & \Delta_{1J} \\ \vdots & \ddots & \vdots \\ \Delta_{N1} & \cdots & \Delta_{NJ} \end{bmatrix}$$

be the set of latent variables, where

$$\Delta_{nj} = \begin{cases} 1, & \text{when } y_n \text{ belongs to the } j\text{-th component} \\ 0, & \text{otherwise.} \end{cases}$$

Due to the dropping problem, $\vec{d}$ is introduced into the likelihood computation and the complete data are now $(\vec{y}, \vec{\Delta}, \vec{d})$. The GMM is therefore modified to the dropping GMM, and Equation (1) becomes:

$$p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) = \prod_{n=1}^{N} \prod_{j=1}^{J} \left[ w_j \, p(y_n, d_n \mid \theta_j) \right]^{\Delta_{nj}}. \tag{2}$$

Hence, the log-likelihood is

$$\ln p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) = \sum_{n=1}^{N} \sum_{j=1}^{J} \Delta_{nj} \ln\left[ w_j \, p(y_n, d_n \mid \theta_j) \right]. \tag{3}$$

E-step: Since the hidden variables are not observable, instead of computing the log-likelihood directly, the expected value of the log-likelihood of the complete data $(\vec{y}, \vec{\Delta}, \vec{d})$ given the observations $\vec{x}$ and the previously estimated parameters is calculated:

$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = E\left[ \ln p(\vec{y}, \vec{\Delta}, \vec{d} \mid \vec{\Theta}) \,\middle|\, \vec{x}; \vec{\Theta}^{(k)} \right] = \sum_{n=1}^{N} \sum_{j=1}^{J} E\left[ \Delta_{nj} \left( \ln w_j + \ln p(y_n, d_n; \theta_j) \right) \,\middle|\, x_n; \vec{\Theta}^{(k)} \right]. \tag{4}$$

In Equation (4), $\vec{\Theta}^{(k)}$ denotes the current estimated parameters and $k$ is the iteration index. In the case of $d_n = 0$, Equation (4) can be calculated as follows:

$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)}) \left[ \ln w_j + \ln(1 - \psi) + \ln \mathcal{N}(x_n; \theta_j) \right]. \tag{5}$$

Throughout this paper, $\psi = P(d_n = 1)$ denotes the dropped rate, $\mathcal{N}(\cdot\,; \theta_j)$ is the Gaussian distribution parameterized by $\theta_j$, and

$$\gamma_{nj}(x_n; \vec{\Theta}^{(k)}) = \frac{w_j^{(k)} \, \mathcal{N}(x_n; \theta_j^{(k)})}{\sum_{j'=1}^{J} w_{j'}^{(k)} \, \mathcal{N}(x_n; \theta_{j'}^{(k)})}.$$

In the case of $d_n = 1$, Equation (4) can be calculated as follows:

$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} d_n \, w_j^{(k)} \left[ \ln w_j + \ln \psi \right]. \tag{6}$$

Combining Equations (5) and (6), Equation (4) ends up as:

$$Q(\vec{\Theta}, \vec{\Theta}^{(k)}) = \sum_{n=1}^{N} \sum_{j=1}^{J} \left\{ (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)}) \left[ \ln w_j + \ln(1 - \psi) + \ln \mathcal{N}(x_n; \theta_j) \right] + d_n \, w_j^{(k)} \left[ \ln w_j + \ln \psi \right] \right\}. \tag{7}$$

M-step: The parameter re-estimation formulae are obtained by computing the partial derivatives of Equation (7) with respect to $\mu_j$, $\sigma_j$, $w_j$ and $\psi$, and setting them to zero:

$$\mu_j^{(k+1)} = \frac{\sum_{n=1}^{N} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)})\, x_n}{\sum_{n=1}^{N} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)})}. \tag{8}$$

$$\left(\sigma_j^{(k+1)}\right)^2 = \frac{\sum_{n=1}^{N} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)})\, \left(x_n - \mu_j^{(k+1)}\right)^2}{\sum_{n=1}^{N} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)})}. \tag{9}$$

$$w_j^{(k+1)} = \frac{\sum_{n=1}^{N} (1 - d_n)\, \gamma_{nj}(x_n; \vec{\Theta}^{(k)}) + \sum_{n=1}^{N} w_j^{(k)} d_n}{N}. \tag{10}$$

$$\psi^{(k+1)} = \frac{\sum_{n=1}^{N} d_n}{N}. \tag{11}$$

The E-step and the M-step are executed alternately until the improvement of the log-likelihood is smaller than a threshold. After convergence, the estimated parameters are:

$$\mu_j^{(k+1)} \approx \mu_j^{(k)} := \hat{\mu}_j; \quad \sigma_j^{(k+1)} \approx \sigma_j^{(k)} := \hat{\sigma}_j; \quad w_j^{(k+1)} \approx w_j^{(k)} := \hat{w}_j; \quad \psi^{(k+1)} \approx \psi^{(k)} := \hat{\psi}. \tag{12}$$

According to Equations (8)-(11), both the observable data and the dropped data contribute to the estimates. On the other hand, when dropping does not occur ($d_n = 0$), those formulae reduce to the traditional EM algorithm for the GMM [5]; when $w_1 = 1$ and $[w_2, \dots, w_J] = [0, \dots, 0]$, those formulae reduce to the EM algorithm for a single Gaussian distribution in the presence of dropped data [4]. This means that our proposal can not only deal with the dropping and multi-component problems but also works well when the collected RSSI data are complete or when the RSSI distribution follows a single Gaussian distribution.

2.3. The online positioning/classification procedure

In this sub-section, the Maximum a Posteriori (MAP) method is utilized to perform the classification. First, the posterior is calculated as follows:

$$P(\ell_k \mid \vec{x}) = \frac{\prod_{i=1}^{N} p(x_i \mid \ell_k)\, P(\ell_k)}{\sum_{k'=1}^{K} \prod_{i=1}^{N} p(x_i \mid \ell_{k'})\, P(\ell_{k'})}. \tag{13}$$

In Equation (13), $K$ and $N$ are the total numbers of RPs and APs, respectively; $x_i$ is the online measurement from the $i$-th AP and $\vec{x}$ is the set of $x_i$ ($i = 1, \dots, N$). We assume that the RSSI measurements of different APs are independent and that the prior $P(\ell_k)$ is equal for all locations.

The likelihood $p(x_i \mid \ell_k)$ can be calculated as follows [3, 4]:

$$p(x_i \mid \ell_k) = \begin{cases} \hat{\psi}, & \text{if } x_i = c \\ (1 - \hat{\psi}) \sum_{j=1}^{J} \hat{w}_j \, \mathcal{N}(x_i; \hat{\theta}_j), & \text{if } x_i \neq c. \end{cases} \tag{14}$$

In Equation (14), $\hat{\psi}$, $\hat{w}_j$ and $\hat{\theta}_j = (\hat{\mu}_j, \hat{\sigma}_j)$ are the parameters estimated in the training phase at the $k$-th RP for the $i$-th AP.

The estimated position of the mobile object is obtained by:

$$\hat{\ell}(\vec{x}) = \frac{\sum_{\ell \in \mathcal{L}_{K_{nn}}} \ell \, P(\ell \mid \vec{x})}{\sum_{\ell \in \mathcal{L}_{K_{nn}}} P(\ell \mid \vec{x})}. \tag{15}$$

Here, $\mathcal{L}_{K_{nn}}$ is the set of the $K_{nn}$ nearest neighbors ($K_{nn} = [...]$), chosen among the reference positions by taking those with the largest posteriors.

3. SIMULATION RESULTS AND DISCUSSION

3.1. Parameter estimation

In this simulation, complete data $\vec{y}$ following the GMM were generated with a set of parameters (true parameters): $J = 2$; $[w_1, w_2] = [0.5, 0.5]$; $[\mu_1, \mu_2] = [-80, -90]$; $[\sigma_1, \sigma_2] = [...]$; $N = 1000$. The observable data $\vec{x}$ were obtained by dropping samples randomly with dropped rate $\psi$. Figure 2 illustrates the mean square error (MSE) of the parameters estimated by the proposed EM algorithm for the dropping GMM (solid line) and by the traditional EM algorithm for the GMM [5] (dashed line with "*" marker) when the dropped rate changes from 0% to 30%.

Figure 2. MSE after 1000 simulations of estimated parameters.

3.2. Positioning accuracy

In order to evaluate the effectiveness of the proposed approach in the WF-IPS, we generated a floor plan with 100 RPs (small red circles) and 10 APs (blue circles), as illustrated in Figure 3.

Figure 3. The computer-generated floor plan.

The experiment was set up as follows. In the training phase, 1000 measurements are collected for each RP from all APs. The measured data (RSSI values) were computed by the log-distance path loss model, adding a Gaussian term reflecting the fluctuation of the signal [5]. The data at 50% of the training positions (RPs), chosen randomly, follow single Gaussians; the rest follow GMMs with J = 2, 3, 4, 5 and 6 components, respectively (10% for each model). Dropping was also applied to the collected data at rates of 10%, 20% and 30%. The radio map was built by employing the equations developed in sub-section 2.2 (with J = 4, which produced the best results), the stochastic approaches proposed in [4, 5] and the deterministic approach introduced in [7].

For the online localization phase, 100 simulations were performed. In each simulation, one online measurement per position from all APs was generated in the same scenario as the training data. The MAP method proposed in sub-section 2.3 and the classification rules introduced in [4, 5, 7] were used for estimating the target's position.

Figures 4, 5 and 6 show the probability that the positioning error is lower than a certain distance. Each plot is computed by averaging the positioning results of the 100 simulations. It can be seen that the proposed method outperforms the others. When dropping occurs at rates of 10%, 20% and 30%, the mean distance error of our proposal is reduced by 9.18%, 6.71% and 18.46%, compared to [...].

In terms of Wi-Fi fingerprinting based indoor positioning in the presence of dropped mixture data, the approach of [4] is not able to solve the multi-component problem, the proposal in [5] does not consider the dropping problem, and the work in [7] cannot deal with either problem. The experiments with artificial data show that our proposal is able to cope with the phenomena present in measured Wi-Fi RSSI data.

Figure 4. Comparison of positioning results when the observable training and online data ratio were 90%.

Figure 5. Comparison of positioning results when the observable training and online data ratio were 80%.

Figure 6. Comparison of positioning results when the observable training and online data ratio were 70%.
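The parameter estimation experiment of sub-section 3.1 can be sketched in a few lines of Python using the re-estimation formulae (8)-(11). This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`simulate`, `em_dropping_gmm`) are hypothetical, the standard deviations [2, 2], the sensitivity limit c = -100 and the initial guesses are assumed values that this text does not give, a sample is treated as dropped exactly when it equals c, and Equation (9) is read as a variance update.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Gaussian density N(x; mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def simulate(n, w, mu, sigma, psi, c, rng):
    """Draw n samples from a GMM, replacing each by c with probability psi."""
    data = []
    for _ in range(n):
        if rng.random() < psi:
            data.append(c)  # dropped measurement (d_n = 1)
        else:
            j = rng.choices(range(len(w)), weights=w)[0]
            data.append(rng.gauss(mu[j], sigma[j]))
    return data


def em_dropping_gmm(x, c, w, mu, sigma, max_iter=500, tol=1e-9):
    """EM updates of Equations (8)-(11); w, mu, sigma are initial guesses."""
    w, mu, sigma = list(w), list(mu), list(sigma)
    observed = [xn for xn in x if xn != c]  # assumption: x_n == c means d_n = 1
    n_total, n_drop = len(x), len(x) - len(observed)
    if not observed:
        raise ValueError("all samples are dropped; Gaussian parameters are undefined")
    psi = n_drop / n_total  # Equation (11): independent of the iteration
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: responsibilities gamma_nj of the non-dropped samples,
        # plus the log-likelihood of the observed data for the stopping rule.
        gamma = []
        ll = n_drop * math.log(psi) if n_drop else 0.0
        for xn in observed:
            joint = [wj * normal_pdf(xn, mj, sj) for wj, mj, sj in zip(w, mu, sigma)]
            total = sum(joint)
            gamma.append([p / total for p in joint])
            ll += math.log((1.0 - psi) * total)
        # M-step: re-estimate the parameters of every component.
        new_w = []
        for j in range(len(w)):
            g = [row[j] for row in gamma]
            g_sum = sum(g)
            mu[j] = sum(gi * xn for gi, xn in zip(g, observed)) / g_sum  # Eq. (8)
            var = sum(gi * (xn - mu[j]) ** 2 for gi, xn in zip(g, observed)) / g_sum  # Eq. (9)
            sigma[j] = math.sqrt(max(var, 1e-12))
            new_w.append((g_sum + w[j] * n_drop) / n_total)  # Eq. (10)
        w = new_w
        if abs(ll - prev_ll) < tol:  # stop when the log-likelihood stalls
            break
        prev_ll = ll
    return w, mu, sigma, psi


# Assumed scenario: J = 2, equal weights, means -80 and -90, 20% dropping.
rng = random.Random(0)
x = simulate(1000, [0.5, 0.5], [-80.0, -90.0], [2.0, 2.0], psi=0.2, c=-100.0, rng=rng)
w_hat, mu_hat, sigma_hat, psi_hat = em_dropping_gmm(
    x, c=-100.0, w=[0.5, 0.5], mu=[-75.0, -95.0], sigma=[5.0, 5.0])
print("weights:", w_hat)
print("means:", mu_hat)
print("std devs:", sigma_hat)
print("dropped rate:", psi_hat)
```

In this sketch the dropped samples enter only through the weight update of Equation (10) and the dropped rate of Equation (11), while the means and variances are computed from the non-dropped samples alone, which mirrors the $(1 - d_n)$ factors in Equations (8) and (9).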
Further, Figure 7 validates that the dropping GMM is still appropriate for modeling data with a single-Gaussian histogram. In this simulation, the measured data at 100% of the positions follow single Gaussians and the dropped rate is 20%. Figure 8 shows that our proposal produces the same results as the standard GMM when the data are complete.

Figure 7. Comparison of positioning results when the observable training and online data ratio were 80%; data at 100% positions follow the single Gaussians.

Figure 8. Comparison of positioning results when the observable training and online data ratio were 100%; data at 50% positions follow the single Gaussians.

4. CONCLUSION

The operation states of the WLAN and the variations of the received signal strength in real indoor environments are responsible for the dropping and multi-component problems, which in turn strongly affect the accuracy of the WF-IPS. In this paper, novel approaches have been introduced to take into account the phenomena that these problems cause in collected Wi-Fi RSSI data. When part of the data follows the dropping GMM, the proposed EM algorithm reduces the error of the estimated parameters and, consequently, the positioning results can be improved considerably. It has to be noted that the proposed approach still works well when the measured data are complete (dropped rate of 0%) or follow single Gaussians. In future work, we plan to gather real data and evaluate the proposed method on the collected data.

REFERENCES

[1]. L. Mainetti, L. Patrono and I. Sergi, “A survey on indoor positioning systems,” in Proc. 22nd Int. Conf. on Software, Telecommunications and Computer Networks (SoftCOM), 2014.
[2]. K. Kaemarungsi and P. Krishnamurthy, “Modeling of indoor positioning systems based on location fingerprinting,” Proceedings of the INFOCOM, Hong Kong, March 2004.
[3]. K. Hoang and R. Haeb-Umbach, “Parameter Estimation and Classification of Censored Gaussian Data with Application to Wi-Fi Indoor Positioning,” Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, May 2013.
[4]. K. Hoang, J. Schmalenstroeer and R. Haeb-Umbach, “Aligning Training Models with Smartphone Properties in Wi-Fi Fingerprinting based Indoor Localization,” Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, April 2015.
[5]. M. Alfakih, M. Keche and H. Benoudnine, “Gaussian Mixture Modeling for Indoor Positioning WI-FI Systems,” 3rd Int. Conf. on Control, Engineering and Information Technology (CEIT), Tlemcen, Algeria, 2015.
[6]. A. Goswami, L. E. Ortiz and S. R. Das, “WiGEM: A Learning-Based Approach for Indoor Localization,” ACM CoNEXT, Tokyo, Japan, 2011.
[7]. Xuxing Ding, Bingbing Wang and Zaijian Wang, “Dynamic threshold location algorithm based on fingerprinting method,” Wiley ETRI Journal, DOI: 10.4218/etrij.2017-0155, 2018.
[8]. G. Lee and C. Scott, “EM algorithms for multivariate Gaussian mixture models with truncated and censored data,” Computational Statistics & Data Analysis, Vol. 56, no. 9, pp. 2816-2829, September 2012.
[9]. T. Roos, P. Myllymaki, H. Tirri, P. Misikangas and J. Sievanen, “A probabilistic approach to WLAN user location estimation,” International Journal of Wireless Information Networks, Vol. 9, no. 3, pp. 155-164, 2002.
[10]. M. Youssef and A. Agrawala, “The Horus WLAN location determination system,” in Proc. ACM MobiSys, 2005, pp. 205-218.
[11]. P. Mirowski, D. Milioris, P. Whiting and T. Kam Ho, “Probabilistic radio-frequency fingerprinting and localization on the run,” Bell Labs Technical Journal, Vol. 18, no. 4, pp. 111-133, 2014.
[12]. Jiayou Luo and Xingqun Zhan, “Characterization of Smart Phone Received Signal Strength Indication for WLAN Indoor Positioning Accuracy Improvement,” Journal of Networks, Vol. 9, No. 3, March 2014.
[13]. K. Kaemarungsi and P. Krishnamurthy, “Properties of Indoor Received Signal Strength for WLAN Location Fingerprinting,” in Proceedings of the 1st Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MOBIQUITOUS 2004), Boston, MA, USA, 22-26 August 2004.
[14]. M. Youssef and A. Agrawala, “The Horus WLAN location determination system,” in Proceedings of the 3rd International Conference on Mobile Systems, Applications, and Services, Seattle, WA, USA, 6-8 June 2005, pp. 205-218.
[15]. A. Haeberlen, E. Flannery, A. M. Ladd, A. Rudys, D. S. Wallach and L. E. Kavraki, “Practical Robust Localization over Large-Scale 802.11 Wireless Networks,” in Proceedings of the 10th Annual International Conference on Mobile Computing and Networking (MobiCom), Philadelphia, PA, USA, September 2004.
[16]. Chinyang Henry Tseng and Jing-Shyang Yen, “Enhanced Gaussian Mixture Model for Indoor Positioning Accuracy,” 2016 International Computer Symposium (ICS), pp. 462-466.
[17]. S. Beller, “Modelladaption zur Verbesserung von Fingerprinting basierter Indoor navigation,” Master Thesis approved by the University of Paderborn, Paderborn, July 2014.

SUMMARY

INDOOR POSITIONING USING THE “FINGERPRINTING” METHOD BASED ON THE WIRELESS LOCAL AREA NETWORK WHEN THE Wi-Fi SIGNAL IS OCCASIONALLY DROPPED AND VARIES IN STRENGTH

This paper addresses the dropping of Wi-Fi signals caused by devices in the wireless local area network (WLAN) that are inactive or faulty, and the variation of the received signal strength indication (RSSI) caused by changes in the propagation environment. These phenomena change the distribution of the data (signal strength values) collected from the Wi-Fi access points. In other words, Gaussian distributions cannot accurately describe the distribution of the data in these cases. Based on this observation, the authors propose to use a Gaussian mixture model (GMM) extended to the case of dropped data (dropping GMM) to describe the distribution of the data. In addition, an expectation-maximization (EM) algorithm is proposed to estimate the parameters of this model. Experimental results on simulated data show that, when the above phenomena occur, an indoor positioning system using the results of this paper achieves higher accuracy than other positioning systems.

Keywords: Indoor positioning, “Fingerprinting” method, Expectation-maximization algorithm, Signal dropping, Gaussian mixture model.

Received 2nd September 2018
Revised 20th October 2018
Accepted 1st November 2018

Author affiliations:
1 Hanoi University of Industry;
2 National Center for Technological Progress, Ministry of Science and Technology.
*Corresponding author: [email protected]; [email protected]
